[antlr-interest] Manipulating text in the lexer

Indhu Bharathi indhu.b at s7software.com
Thu Feb 26 10:09:17 PST 2009


>it's no longer possible to alter the content of a token away from what's on the input at all.

I'm not sure that's right. I still call token.setText(...) in my actions, and I'm using ANTLR 3.1.1.

Just a guess... Maybe you have to use TokenRewriteStream instead of the regular CommonTokenStream.

- Indhu


----- Original Message -----
From: Sam Barnett-Cormack <s.barnett-cormack at lancaster.ac.uk>
To: ANTLR Interest Mailing List <antlr-interest at antlr.org>
Sent: Thursday, February 26, 2009 10:09:15 PM GMT+0530 Asia/Calcutta
Subject: [antlr-interest] Manipulating text in the lexer

Hey again all,

(apologies if this has already been received - as far as I can tell it 
didn't get through the first time)

So, having returned to ANTLR (as previously mentioned), I've been trying
to do things that used to be possible, and appear no longer to be so.
http://www.antlr.org/blog/antlr3/lexical.tml suggests that it's no
longer possible to alter the content of a token away from what's on the
input at all. When crafting an ASN.1 grammar this is rather a pain - as well
as the obvious matter of wanting to be able to strip the '"' from each
end of a string literal, ASN.1 string literals have an odd requirement
on the handling of whitespace and newlines within them, hopefully
illustrated by these grammar fragments:

fragment
CSTRINGNL : WSNONL* NL WSNONL* {setText("");};

CSTRING : '"' ((CSTRINGNL)=> CSTRINGNL | '""' | ~'"')* '"';

WS : (WSNONL | NL) {$channel=HIDDEN;};

fragment
NL : ('\n' | '\r' | '\u000B' | '\f');

fragment
WSNONL : (' ' | '\t');

Ideally, I'd also want to turn the '""' that's found inside a string
literal into a single '"' before passing it on to the parser, as there's
no need whatsoever to hold onto that. However, it's a *requirement* to
discard newlines, along with any other whitespace immediately preceding
or succeeding each. It'd be really frustrating to have to change that at
a later stage in processing.
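
For reference, the normalization described above can be sketched in plain
Java, independent of where it ends up being called from (a lexer action or
a later pass). This is only a sketch under assumptions: the class and
method names (CStringNormalizer, normalize) are hypothetical, and the
whitespace rule is taken from the description in this message rather than
checked against the ASN.1 standard.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ANTLR: cleans up the raw text of a
// CSTRING token as described in this message.
public final class CStringNormalizer {
    // One newline character (LF, CR, VT, FF) together with any spaces or
    // tabs immediately before and after it.
    private static final Pattern NEWLINE_RUN =
            Pattern.compile("[ \\t]*[\\n\\r\\x0B\\f][ \\t]*");

    private CStringNormalizer() {}

    /** Expects the raw token text, including the surrounding quotes. */
    public static String normalize(String raw) {
        if (raw.length() < 2
                || raw.charAt(0) != '"'
                || raw.charAt(raw.length() - 1) != '"') {
            throw new IllegalArgumentException("not a quoted cstring: " + raw);
        }
        // Strip the '"' from each end.
        String body = raw.substring(1, raw.length() - 1);
        // Discard newlines along with adjacent spaces and tabs.
        body = NEWLINE_RUN.matcher(body).replaceAll("");
        // Turn each doubled '""' into a single '"'.
        return body.replace("\"\"", "\"");
    }
}
```

If setText(...) does still work in lexer actions, as suggested in the
reply above, an action on the CSTRING rule along the lines of
{setText(CStringNormalizer.normalize(getText()));} might be enough;
otherwise the same call could be made when the parser or tree walker
reads the token text.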

So, can anyone clarify this for me, or let me know of some sort of
workaround?

Thanks,

Sam Barnett-Cormack

