[antlr-interest] Can a lexer rule have user defined attributes?

Mark Wright markwright at internode.on.net
Fri Jul 25 03:04:01 PDT 2008


On Fri, 25 Jul 2008 21:49:45 +1200
Gavin Lambert <antlr at mirality.co.nz> wrote:

> >- Why can lexer rules not have user defined attributes?
> 
> Because there's nowhere to put them.  All lexer rules (top level 
> ones, at least) are required to return a Token, and (except in the 
> C target) the standard token doesn't have any "spare" 
> storage.  You can, however, subclass either Token or CommonToken 
> and add your own fields.  Then you can store whatever additional 
> data you want.
> 
> However it's not quite as simple as the above; to get extra data 
> in there you'll need to override the emit method and pass it in 
> that way.  (Johannes recently posted an example of this for C# 
> just today, in fact.  I'm sure the Java version is similar.)

Something like this for the Java target:

MyToken.java:

import org.antlr.runtime.CommonToken;
import org.antlr.runtime.CharStream;

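// Token subclass that carries one extra user-defined field.
// Symbol is your own class (for example a symbol table entry).
public class MyToken extends CommonToken {
  protected Symbol symbol;  // stays null until something calls setSymbol()

  // Constructor used by the emit() override in the lexer.
  public MyToken(CharStream input, int type, int channel, int start, int stop) {
    super(input, type, channel, start, stop);
  }

  // Constructor for tokens built by hand from a type and text.
  public MyToken(int type, String text) {
    super(type, text);
  }

  public final Symbol getSymbol() {
    return symbol;
  }

  public void setSymbol(Symbol value) {
    symbol = value;
  }
}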

ANTLR 3.1 beta grammar:

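@lexer::members {
    // Create a MyToken instead of the default CommonToken, then fill in
    // line, text and column from the lexer state.
    @Override
    public Token emit() {
        Token t = new MyToken(input, state.type, state.channel,
                              state.tokenStartCharIndex, getCharIndex()-1);
        t.setLine(state.tokenStartLine);
        t.setText(state.text);
        t.setCharPositionInLine(state.tokenStartCharPositionInLine);
        emit(t);  // hand the finished token to the lexer
        return t;
    }
}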
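
If you want to try the pattern without the ANTLR runtime on the classpath,
here is a rough, self-contained sketch of the same idea. Everything in it is
a stand-in I made up for illustration (BaseToken, BaseLexer, SymbolToken,
SymbolLexer, and a String in place of Symbol); none of it is ANTLR API. It
only shows why overriding the one method that creates tokens is enough to
get your own token class, with its extra field, out of the lexer.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for a runtime token type (hypothetical, not ANTLR's CommonToken).
class BaseToken {
    final int type;
    final String text;
    BaseToken(int type, String text) { this.type = type; this.text = text; }
}

// Token subclass carrying one extra user-defined field.
class SymbolToken extends BaseToken {
    private String symbol;  // a String stands in for the Symbol class here
    SymbolToken(int type, String text) { super(type, text); }
    String getSymbol() { return symbol; }
    void setSymbol(String value) { symbol = value; }
}

// Stand-in lexer: every token is created through emit(), so a subclass
// can swap the token class by overriding that single method.
class BaseLexer {
    static final int ID = 1;
    static final int INT = 2;
    private final String input;
    BaseLexer(String input) { this.input = input; }

    protected BaseToken emit(int type, String text) {
        return new BaseToken(type, text);
    }

    // Splits on whitespace; words starting with a digit are INT, others ID.
    List<BaseToken> tokenize() {
        List<BaseToken> out = new ArrayList<>();
        for (String word : input.trim().split("\\s+")) {
            if (word.isEmpty()) continue;
            int type = Character.isDigit(word.charAt(0)) ? INT : ID;
            out.add(emit(type, word));
        }
        return out;
    }
}

// The override creates SymbolToken and fills in the extra field for IDs.
class SymbolLexer extends BaseLexer {
    SymbolLexer(String input) { super(input); }

    @Override
    protected BaseToken emit(int type, String text) {
        SymbolToken t = new SymbolToken(type, text);
        if (type == ID) t.setSymbol("sym:" + text);
        return t;
    }
}

class EmitOverrideDemo {
    public static void main(String[] args) {
        for (BaseToken t : new SymbolLexer("x 42 y").tokenize()) {
            SymbolToken st = (SymbolToken) t;  // safe: emit() only makes SymbolToken
            System.out.println(st.text + " -> " + st.getSymbol());
        }
    }
}
```

The consumer still has to cast back to the subclass to reach the extra
field, which is the same thing you end up doing with MyToken in parser
actions.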

