[antlr-interest] (follow up) setting, altering text in lexer rules
Martin Probst
mail at martin-probst.com
Mon Jun 12 13:59:49 PDT 2006
Hi,
> Ok, I propose that we take a big step back and say "you can set the
> text for the token manually". You get a setText() method and the
> auto mechanism will see your altered text if nonnull. If you want
> to build up a token piecemeal you must do so manually. So you'd do
> this:
>
> ESC : '\\' 'n' {setText("\n");} ;
>
> I still need to spend time inc/dec the rule level though so I know
> when to emit a token. It seems to cost a wee bit but that is ok I
> guess.
Are you 100% sure about this? I think the "!" operator is one of the
most important features of ANTLR's lexers. And there are cases where
it's not that easy to figure out the text: the user would have to
re-parse the text from $getText() to get to the result. That's almost
certainly more expensive. Is there absolutely no way of supporting
this in an "if you use it, you pay for it" way?
Did you try StringBuilder instead of StringBuffer? If you call
.setLength(0) once per token, the buffer itself really shouldn't cost
that much; what remains is the synchronization in StringBuffer, which
StringBuilder avoids.
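
To make the StringBuilder suggestion concrete, here is a minimal sketch of a per-lexer text buffer that is reset once per token. The class and method names (TokenTextBuffer, startToken, match, matchAndDiscard, lexQuoted) are made up for illustration and are not ANTLR's generated code; lexQuoted assumes the input is at least two characters long and starts and ends with a quote.

```java
// Hypothetical sketch, not ANTLR's actual lexer runtime.
public final class TokenTextBuffer {
    // One unsynchronized buffer reused for every token of this lexer instance.
    private final StringBuilder text = new StringBuilder();

    /** Reset at the start of each token; the allocated capacity is kept. */
    public void startToken() {
        text.setLength(0);
    }

    /** Match a character and record it in the token text. */
    public void match(char c) {
        text.append(c);
    }

    /** Match a character marked with '!': consume it without recording it. */
    public void matchAndDiscard(char c) {
        // Intentionally empty: the character never reaches the buffer.
    }

    /** Text accumulated for the current token. */
    public String tokenText() {
        return text.toString();
    }

    /**
     * Demo of a rule shaped like  STRING: '"'! CHARS '"'! ;
     * Assumes input.length() >= 2 with a quote at both ends.
     */
    public static String lexQuoted(TokenTextBuffer buf, String input) {
        buf.startToken();
        int last = input.length() - 1;
        buf.matchAndDiscard(input.charAt(0));
        for (int i = 1; i < last; i++) {
            buf.match(input.charAt(i));
        }
        buf.matchAndDiscard(input.charAt(last));
        return buf.tokenText();
    }
}
```

With this shape, the only per-token cost is the setLength(0) call plus one append per kept character, and rules that never inspect the text could skip the appends entirely.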
What about the optimization of truncating start and end characters
simply by using different offsets? I think this is the most common
use case, e.g.:
STRING: '\"'! CHARS '\"'!;
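
A rough sketch of what I mean by offsets, combined with the manual setText() from the proposal: the token only remembers where it starts and stops in the input, a leading or trailing "!" just moves those offsets, and a string is built only when someone asks for the text. OffsetToken, trimStart and trimEnd are hypothetical names for this example, not existing ANTLR API.

```java
// Hypothetical sketch, not ANTLR's actual Token implementation.
public final class OffsetToken {
    private final CharSequence input; // the lexer's input buffer
    private int start;                // inclusive offset into input
    private int stop;                 // exclusive offset into input
    private String override;          // non-null only after setText()

    public OffsetToken(CharSequence input, int start, int stop) {
        this.input = input;
        this.start = start;
        this.stop = stop;
    }

    /** Effect of '!' on leading characters: drop n characters from the front. */
    public void trimStart(int n) {
        start += n;
    }

    /** Effect of '!' on trailing characters: drop n characters from the end. */
    public void trimEnd(int n) {
        stop -= n;
    }

    /** Manual override, as in  ESC : '\\' 'n' {setText("\n");} ; */
    public void setText(String text) {
        override = text;
    }

    /** The override if one was set, otherwise the slice of the input. */
    public String getText() {
        if (override != null) {
            return override;
        }
        return input.subSequence(start, stop).toString();
    }
}
```

For the STRING rule above, the lexer would create the token over the whole match including both quotes and then call trimStart(1) and trimEnd(1). No characters are copied until getText() is called, so rules that never use "!" or setText() pay nothing extra.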
Martin