[antlr-interest] Modifying token text - is it possible?

Steve Bennett stevagewp at gmail.com
Tue Nov 20 21:29:03 PST 2007


On 11/21/07, Jim Idle <jimi at temporal-wave.com> wrote:
> How about:
>
> fragment
> TAG_GUTS: LETTER+
>                 { emit(); }
> ;
> HTML_TAG: '<' TAG_GUTS '>'
> ;

Hmm, that looks like it would pass <> up the tree? Let's say I want
the opposite, just the guts, not the < and >...


> You may need to set the token type for the emitted token.
>
> You can also do: xxx=FRAGRULE  and set the token text to $xxx.text

That doesn't seem to work for me:

TAG_GUTS: xxx='0'..'9'+
               { emit($xxx.text); };

gives:

[16:25:29] \tmp\antlrworks\tinyLexer.java:135: int cannot be dereferenced
[16:25:29]              emit(xxx.text);

Same without the $.

In any case, passing any arguments to emit() in a lexer rule seems to fail:

[16:26:37] \tmp\antlrworks\tinyLexer.java:135: cannot find symbol
[16:26:37] symbol  : method emit(java.lang.String)
[16:26:37] location: class tinyLexer
[16:26:37]              emit("blah");
[16:26:37]              ^
[16:26:37] 1 error
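
For reference, here is a minimal sketch of what the "xxx=FRAGRULE" suggestion above appears to mean, assuming ANTLR 3 with the Java target. The "int cannot be dereferenced" error suggests that a label on a character range is generated as a plain int, so it has no .text; a label on a fragment rule reference should be a Token instead. The grammar name TagSketch and the LETTER definition are made up for illustration, and setText() is used rather than emit() with an argument, since no emit(String) seems to exist:

```antlr
lexer grammar TagSketch;   // hypothetical grammar name

// Assumed definition; the thread never shows LETTER.
fragment LETTER : 'a'..'z' | 'A'..'Z' ;

// Fragment rule: matched only when another lexer rule refers to it.
fragment TAG_GUTS : LETTER+ ;

HTML_TAG
    : '<' guts=TAG_GUTS '>'
      { setText($guts.text); }   // replace the token text with the guts only
    ;
```

If this behaves as expected, the input <abc> should yield one HTML_TAG token whose text is abc, without the angle brackets.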

Is there really no documentation on this stuff at all? I seem to end
up relentlessly text-searching the book PDF for constructs like
"emit(", hoping to hit a fragment of code that looks like what I'm
trying to do. I would dearly love a list of constructs that ANTLR
recognises, one that would answer questions like "In a Lexer rule, you
can do X. In a Parser rule you cannot do X but you can do Y" etc. It's
very frustrating atm :(

Steve

