[antlr-interest] First and Last Token of a Rule

Zachary Palmer zep_antlr at bahj.com
Fri Jan 15 14:02:40 PST 2010


All,

I think this is a pretty simple operation, but I have no idea how to 
execute it.  Suppose I'm in some action code and have a reference to the 
parser.  Is there a way for me to obtain the most recently used token?  
How about the token that started the most recent grammar rule?

For instance, consider the following grammar (using a Java target language):

foo: 'a' bar* 'd' { doStuff(); };
bar: ('b' | 'c') { doStuff(); };

Let's assume we are feeding this grammar the string "abcd".  In that 
case, doStuff is called three times: once after the token 'b' is matched 
in the bar rule, once after the token 'c' is matched in the bar rule, 
and once after the tokens 'a' through 'd' are matched in the foo rule.  
I would like, from within the body of the doStuff method, to obtain the 
first and last token of each rule matched.  So, for instance, if my 
doStuff method looked like this:

void doStuff() {
    Token first = ...; // first token of the current rule
    Token last = ...; // token most recently used
    System.out.println(first.getText() + ", " + last.getText());
}

then the output of the above grammar, given the input "abcd", 
should be

b, b
c, c
a, d

This is, of course, a representative example; the real situation is a 
bit more complicated.  The catch is that I don't want to add any 
arguments to the doStuff method or do anything else that would require 
me to change every rule in this 3,000 line grammar.  Is there a way that 
I can get the first token of the current rule and the most recently used 
token without tweaking every single grammar rule?
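
To pin down the behavior being asked for, here is a minimal hand-written 
sketch in Java.  It is not ANTLR-generated code and does not use ANTLR's 
API.  Every name in it (FirstLastTokenSketch, ruleStarts, match, 
lookaheadIs, run) is made up for illustration, and each input character 
stands in for one token.  The idea it assumes is that the parser keeps a 
stack of rule start positions plus the current input position, so that 
doStuff can read both without taking any arguments.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class FirstLastTokenSketch {
    private final String input;   // each character stands in for one token
    private int pos = 0;          // index of the next unconsumed token
    // Start index of every rule currently being matched, innermost on top.
    private final Deque<Integer> ruleStarts = new ArrayDeque<>();
    private final List<String> log = new ArrayList<>();

    FirstLastTokenSketch(String input) { this.input = input; }

    // Consume one token, failing if it is not the expected one.
    private void match(char expected) {
        if (pos >= input.length() || input.charAt(pos) != expected) {
            throw new IllegalStateException(
                "expected '" + expected + "' at index " + pos);
        }
        pos++;
    }

    private boolean lookaheadIs(char c) {
        return pos < input.length() && input.charAt(pos) == c;
    }

    // The action takes no arguments; it reads shared parser state instead.
    private void doStuff() {
        char first = input.charAt(ruleStarts.peek()); // first token of the current rule
        char last = input.charAt(pos - 1);            // token most recently consumed
        log.add(first + ", " + last);
    }

    // foo: 'a' bar* 'd' { doStuff(); };
    void foo() {
        ruleStarts.push(pos);
        try {
            match('a');
            while (lookaheadIs('b') || lookaheadIs('c')) {
                bar();
            }
            match('d');
            doStuff();
        } finally {
            ruleStarts.pop();
        }
    }

    // bar: ('b' | 'c') { doStuff(); };
    void bar() {
        ruleStarts.push(pos);
        try {
            if (lookaheadIs('b')) match('b'); else match('c');
            doStuff();
        } finally {
            ruleStarts.pop();
        }
    }

    // Parse the input starting at rule foo and return one line per doStuff call.
    static List<String> run(String input) {
        FirstLastTokenSketch parser = new FirstLastTokenSketch(input);
        parser.foo();
        return parser.log;
    }

    public static void main(String[] args) {
        run("abcd").forEach(System.out::println);
    }
}
```

In this sketch the bookkeeping lives in the rule methods themselves, which 
is exactly the per-rule change I want to avoid in the real grammar; the 
question is whether the parser already tracks something equivalent that 
doStuff could read.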

Many thanks for reading!

Zachary Palmer

