[antlr-interest] First and Last Token of a Rule

Zachary Palmer zep_antlr at bahj.com
Fri Jan 15 16:09:32 PST 2010


Thanks for the reply.  :)   That's good to know.  Any idea about how to 
get the first token in a given rule?  With the information you've given 
me, I could always stick something in an @init and an @after in every 
rule, but I'd definitely like to avoid that.  I guess what I really 
want is an @allrulesinit and an @allrulesafter (to run before and 
after each rule's @init and @after, respectively), but it doesn't seem 
like those exist.
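
To make the idea concrete, here is a minimal sketch in plain Java of what 
such per-rule hooks would need to do.  It does not use the ANTLR runtime: 
Tok, RuleSpanTracker and SpanDemo are hypothetical stand-ins, and the 
rules foo: 'a' bar* 'd' and bar: ('b' | 'c') are simulated by hand on the 
input "abcd".  The tracker keeps a stack of rule-start tokens, pushed on 
rule entry and popped on rule exit:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Stand-in for a lexer token; this is NOT the ANTLR Token type.
final class Tok {
    final String text;
    Tok(String text) { this.text = text; }
}

// Hypothetical helper: remembers the first token of every active rule.
final class RuleSpanTracker {
    private final List<Tok> tokens;
    private int pos = 0;                                  // index of the next unconsumed token
    private final Deque<Tok> starts = new ArrayDeque<>(); // one entry per active rule

    RuleSpanTracker(List<Tok> tokens) { this.tokens = tokens; }

    Tok lookahead() { return tokens.get(pos); }      // plays the role of input.LT(1)
    Tok previous()  { return tokens.get(pos - 1); }  // plays the role of input.LT(-1)
    void consume()  { pos++; }

    void enterRule() { starts.push(lookahead()); }   // what an "@allrulesinit" would do
    void exitRule()  { starts.pop(); }               // what an "@allrulesafter" would do

    Tok first() { return starts.peek(); }            // first token of the current rule
    Tok last()  { return previous(); }               // token most recently used
}

final class SpanDemo {
    // Needs no arguments beyond the shared tracker, as in the original question.
    static String doStuff(RuleSpanTracker t) {
        return t.first().text + ", " + t.last().text;
    }

    // Hand-simulates foo and bar on the input "abcd".
    static List<String> run() {
        List<Tok> input = new ArrayList<>();
        for (String s : new String[] {"a", "b", "c", "d"}) {
            input.add(new Tok(s));
        }
        RuleSpanTracker t = new RuleSpanTracker(input);
        List<String> out = new ArrayList<>();

        t.enterRule();                               // enter foo
        t.consume();                                 // match 'a'
        while (!t.lookahead().text.equals("d")) {
            t.enterRule();                           // enter bar
            t.consume();                             // match 'b' or 'c'
            out.add(doStuff(t));
            t.exitRule();                            // leave bar
        }
        t.consume();                                 // match 'd'
        out.add(doStuff(t));
        t.exitRule();                                // leave foo
        return out;
    }

    public static void main(String[] args) {
        for (String line : run()) {
            System.out.println(line);
        }
    }
}
```

The stack is the important part: because rules nest, a single "current 
start token" field would be overwritten by bar before foo's action runs.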

In fact, I am building an AST.  The actions I mentioned previously are 
doing just that and every rule I have is of the (unfortunate) form:

foo returns [FooNode ret]
    :   bar ';'
        { $ret = factory.makeFooNode($bar.ret); }
    ;

Because I want node creation to be indirected through a factory (and 
because I want a heterogeneous AST), there doesn't seem to be any choice 
but to use this approach.  The people who wrote the ANTLR3 Java 1.5 
grammar I pulled from the ANTLR website seemed to agree; the OpenJDK 
project uses the same approach for their ANTLR parser.  I've gotten 
exactly the tree I needed (built to a different API than the Java 
Compiler API for purposes of my project) and now I want to tag those 
nodes with their start and end tokens.  I might actually have some luck 
with scopes; I should look into that.
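
Since every node already goes through the factory, one option is to let 
the factory do the tagging, so the grammar actions stay as they are.  The 
following is only a sketch under assumptions: SrcToken, AstNode, 
TaggingFactory and the two Supplier callbacks are hypothetical names, 
FooNode and BarNode are simplified stand-ins, and how the "start of the 
current rule" supplier gets its value is left open.  It relies on each 
action running at the end of its rule, so the most recently consumed token 
at creation time is the node's end token:

```java
import java.util.function.Supplier;

// Stand-in for a token type; a real parser would use its runtime's token class.
final class SrcToken {
    final String text;
    final int index;
    SrcToken(String text, int index) { this.text = text; this.index = index; }
}

// Base class of a heterogeneous AST; every node carries its token span.
abstract class AstNode {
    SrcToken start;
    SrcToken stop;
}

final class BarNode extends AstNode {
    final String name;
    BarNode(String name) { this.name = name; }
}

final class FooNode extends AstNode {
    final BarNode bar;
    FooNode(BarNode bar) { this.bar = bar; }
}

// Hypothetical factory that tags each node with its span as it is created.
final class TaggingFactory {
    private final Supplier<SrcToken> ruleStart;     // first token of the current rule
    private final Supplier<SrcToken> lastConsumed;  // token most recently used

    TaggingFactory(Supplier<SrcToken> ruleStart, Supplier<SrcToken> lastConsumed) {
        this.ruleStart = ruleStart;
        this.lastConsumed = lastConsumed;
    }

    // Shared by all make* methods, so no grammar action has to change.
    private <N extends AstNode> N tag(N node) {
        node.start = ruleStart.get();
        node.stop = lastConsumed.get();
        return node;
    }

    BarNode makeBarNode(String name) { return tag(new BarNode(name)); }
    FooNode makeFooNode(BarNode bar) { return tag(new FooNode(bar)); }
}
```

With this shape, a call such as factory.makeFooNode($bar.ret) needs no 
extra arguments; only the factory's constructor has to be wired to the 
parser state.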

Thanks again for the help!


> The upcoming token at any point is returned by input.LT(1), the previous token by input.LT(-1)
> So:
> foo
> @init {
>  Token sToken = input.LT(1);
> }
> : A bar* D { doStuff(sToken, input.LT(-1)); }
> ;
> And so on. Also look at things like $start depending on what the output is etc.
> However, you will be much better off building an AST and then walking the tree to do your actions.
> Jim
>> -----Original Message-----
>> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org] On Behalf Of Zachary Palmer
>> Sent: Friday, January 15, 2010 2:03 PM
>> To: antlr-interest at antlr.org
>> Subject: [antlr-interest] First and Last Token of a Rule
>> All,
>> I think this is a pretty simple operation, but I have no idea how to
>> execute it.  Suppose I'm in some action code and have a reference to the
>> parser.  Is there a way for me to obtain the most recently used token?
>> How about the token that started the most recent grammar rule?
>> For instance, consider the following grammar (using a Java target
>> language):
>> foo: 'a' bar* 'd' { doStuff(); };
>> bar: ('b' | 'c') { doStuff(); };
>> Let's assume we are feeding this grammar the string "abcd".  In that
>> case, doStuff is called three times: once after the token 'b' is matched
>> in the bar rule, once after the token 'c' is matched in the bar rule,
>> and once after the tokens 'a' through 'd' are matched in the foo rule.
>> I would like, from within the body of the doStuff method, to obtain the
>> first and last token of each rule matched.  So, for instance, if my
>> doStuff method looked like this:
>> void doStuff() {
>>     Token first = ...; // first token of the current rule
>>     Token last = ...; // token most recently used
>>     System.out.println(first.getText() + ", " + last.getText());
>> }
>> then the output of the above grammar when provided the input "abcd"
>> should be
>> b, b
>> c, c
>> a, d
>> This is, of course, a representative example; the real situation is a
>> bit more complicated.  The catch is that I don't want to add any
>> arguments to the doStuff method or do anything else that would require
>> me to change every rule in this 3,000 line grammar.  Is there a way that
>> I can get the first token of the current rule and the most recently used
>> token without tweaking every single grammar rule?
>> Many thanks for reading!
>> Zachary Palmer
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address