[antlr-interest] Can a lexer rule have user defined attributes?

Raphael Reitzig r_reitzi at cs.uni-kl.de
Fri Jul 25 08:00:20 PDT 2008


Hi!

One might expect ANTLR to recognize such use of a token rule ('wow, my
user wants to have extra data in his tokens!') and automatically
create a subclass of Token with the required attributes and methods.
Am I overlooking something that makes this task extraordinarily hard?

Regards

Raphael

"Mark Wright" <markwright at internode.on.net> wrote (Fri Jul 25 12:04:01 2008):

> On Fri, 25 Jul 2008 21:49:45 +1200
> Gavin Lambert <antlr at mirality.co.nz> wrote:
>
>> >- Why can lexer rules not have user defined attributes?
>>
>> Because there's nowhere to put them.  All lexer rules (top level
>> ones, at least) are required to return a Token, and (except in the
>> C target) the standard token doesn't have any "spare"
>> storage.  You can, however, subclass either Token or CommonToken
>> and add your own fields.  Then you can store whatever additional
>> data you want.
>>
>> However it's not quite as simple as the above; to get extra data
>> in there you'll need to override the emit method and pass it in
>> that way.  (Johannes recently posted an example of this for C#
>> just today, in fact.  I'm sure the Java version is similar.)
>
> Something like this:
>
> MyToken.java:
>
> import org.antlr.runtime.CommonToken;
> import org.antlr.runtime.CharStream;
>
> public class MyToken extends CommonToken {
>   // Extra per-token data; Symbol is a user-defined class, not an ANTLR runtime type.
>   protected Symbol symbol;
>
>   public MyToken(CharStream input, int type, int channel,
>                  int start, int stop) {
>     super(input, type, channel, start, stop);
>     symbol = (Symbol)null;
>   }
>
>   public MyToken(int type, String text) {
>     super(type, text);
>     symbol = (Symbol)null;
>   }
>
>   public final Symbol getSymbol() {
>     return symbol;
>   }
>
>   public void setSymbol(Symbol value) {
>     symbol = value;
>   }
> }
>
> ANTLR 3.1 beta grammar:
>
> @lexer::members {
>     // Build a MyToken instead of the default CommonToken for every emitted token.
>     public Token emit() {
>         Token t = new MyToken(input, state.type, state.channel,
>                               state.tokenStartCharIndex, getCharIndex()-1);
>         t.setLine(state.tokenStartLine);
>         t.setText(state.text);
>         t.setCharPositionInLine(state.tokenStartCharPositionInLine);
>         emit(t);
>         return t;
>     }
> }
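
For anyone who wants to try the pattern without the ANTLR runtime on the
classpath, here is a minimal, dependency-free sketch of the same idea:
subclass the token, then override the one method that constructs tokens.
BasicToken, BasicLexer, SymbolToken, SymbolLexer and the "sym:" payload are
all hypothetical stand-ins invented for this illustration; they are not
ANTLR classes, and the whitespace-splitting tokenize() is only there to
drive emit().

```java
import java.util.ArrayList;
import java.util.List;

public class EmitOverrideSketch {

    // Stand-in for a runtime's standard token: no spare storage for user data.
    static class BasicToken {
        final int type;
        final String text;

        BasicToken(int type, String text) {
            this.type = type;
            this.text = text;
        }
    }

    // Token subclass carrying one extra, user-defined attribute.
    static class SymbolToken extends BasicToken {
        private String symbol; // hypothetical payload, e.g. a symbol-table key

        SymbolToken(int type, String text) {
            super(type, text);
        }

        String getSymbol() { return symbol; }

        void setSymbol(String value) { symbol = value; }
    }

    // Stand-in lexer base class: emit() is the single place tokens are built.
    static class BasicLexer {
        protected final List<BasicToken> tokens = new ArrayList<>();

        protected BasicToken emit(int type, String text) {
            BasicToken t = new BasicToken(type, text);
            tokens.add(t);
            return t;
        }

        // Toy tokenizer: one token of type 1 per whitespace-separated word.
        List<BasicToken> tokenize(String input) {
            for (String word : input.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    emit(1, word);
                }
            }
            return tokens;
        }
    }

    // Overriding emit() is what gets the extra data into every token.
    static class SymbolLexer extends BasicLexer {
        @Override
        protected BasicToken emit(int type, String text) {
            SymbolToken t = new SymbolToken(type, text);
            t.setSymbol("sym:" + text);
            tokens.add(t);
            return t;
        }
    }

    public static void main(String[] args) {
        for (BasicToken t : new SymbolLexer().tokenize("foo bar")) {
            // Consumers cast back to the subclass to read the extra field.
            SymbolToken st = (SymbolToken) t;
            System.out.println(st.text + " -> " + st.getSymbol());
        }
    }
}
```

The consumer still receives the base token type and has to cast to reach
the extra field, which is the same trade-off as casting a Token to MyToken
in a parser action.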