[antlr-interest] Need some help with AST creation

John B. Brodie jbb at acm.org
Fri Aug 6 11:14:25 PDT 2010


Greetings!

On Fri, 2010-08-06 at 16:41 +0100, Luis Pureza wrote:
> Hi,
> 
> I need some help from the ANTLR wizards :)

I am not sure I qualify as an ANTLR wizard, but I will try to answer your
question anyway....
> 
> I'm trying to match expressions with field accesses and array indexes.
> For example:
> 
> costumers.length
> costumers[0].address
> costumers[costumers.length - 1].orders[0].total
> 
> 
> The following rule seems to work:
> 
> fieldExpr      : atom ('.'^ identifier | ('['^ expr ']'!))*;
> 
> However, it creates trees with nodes annotated with '[', and I'd
> prefer to have a dummy token like INDEX. For example, costumers[0] now
> returns
> 
> ([ (ID costumers) (INT 0))
> 
> But I'd like it to return
> 
> (INDEX (ID costumers) (INT 0))
> 
> I tried to create the AST manually with -> ^(...), but I ended up
> nowhere. Maybe I should've tried to refactor the grammar, but that
> would make it a little less readable, so I didn't do it.
> How do you suggest I do this?
> 

This is discussed starting on page 174 of Dr. Parr's book The Definitive
ANTLR Reference [TDAR]. It may also be somewhere in the wiki -- haven't
looked for it there...

Bottom line: you are allowed to refer to the rule itself (here
$fieldExpr) inside a rewrite, where it stands for the tree the rule has
built so far.

But first you need to tell ANTLR about your imaginary tokens by
declaring them in a tokens section:

tokens { INDEX; DOT /* or whatever */; }

This belongs after the options{} block, if any, but before the first
rule (and, as far as I remember, before any @header or @members
actions).

and now your fieldExpr rule becomes:

fieldExpr : ( a=atom -> $a /*initializes $fieldExpr*/ )
    ( ( x='.' i=identifier -> ^(DOT[$x,"DOT"] $fieldExpr $i) )
    | ( x='[' e=expr ']' -> ^(INDEX[$x,"INDEX"] $fieldExpr $e) )
    )* ;

(Note that the specific meta-syntax above is from memory and may have
some slight errors, but you get the idea, I hope.)

(As I recall, you do need all of those parentheses; they are required
when mixing rewrites in amongst syntax specifications.)

(You can probably drop most of the labels (a=..., i=..., e=...), but
somehow to my poor brain it is clearer with them. The x= labels are
what let DOT[$x,"DOT"] and INDEX[$x,"INDEX"] copy line information from
the real token.)

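So in its potentially tersest form (I believe the leading atom still
needs its own rewrite so that $fieldExpr has something in it the first
time around; and without the x= labels the imaginary tokens carry only
their names, since a literal like '.' cannot be passed as the source
token):

fieldExpr :
    ( atom -> atom )
    ( ( '.' identifier -> ^(DOT $fieldExpr identifier) )
    | ( '[' expr ']' -> ^(INDEX $fieldExpr expr) )
    )* ;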
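
For what it's worth, the effect of $fieldExpr in those rewrites can be
mimicked by hand. The following is only a sketch in plain Java, not
ANTLR-generated code: the class FieldExprSketch, its Node type and the
parse helper are made up for illustration, leaves are printed as bare
text rather than as (ID ...) / (INT ...), and expr is simplified to a
nested fieldExpr, so no arithmetic. The local variable result plays the
role of $fieldExpr: each pass through the loop wraps the tree built so
far in a new DOT or INDEX root.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical hand-written stand-in for the fieldExpr rule (illustration only). */
final class FieldExprSketch {

    /** Minimal tree node, printed LISP-style, e.g. (INDEX costumers 0). */
    static final class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }

        @Override
        public String toString() {
            if (children.isEmpty()) return text;
            StringBuilder sb = new StringBuilder("(").append(text);
            for (Node c : children) sb.append(' ').append(c);
            return sb.append(')').toString();
        }
    }

    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    FieldExprSketch(String input) {
        // Toy lexer: identifiers, integers, and the three punctuation tokens.
        Matcher m = Pattern.compile("[A-Za-z_]\\w*|\\d+|[.\\[\\]]").matcher(input);
        while (m.find()) tokens.add(m.group());
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : "";
    }

    private String next() {
        return tokens.get(pos++);
    }

    private void expect(String wanted) {
        String got = next();
        if (!got.equals(wanted)) {
            throw new IllegalStateException("expected " + wanted + " but saw " + got);
        }
    }

    Node fieldExpr() {
        // ( atom -> atom ) : the first operand seeds the result tree.
        Node result = new Node(next());
        while (true) {
            if (peek().equals(".")) {
                next();
                // -> ^(DOT $fieldExpr identifier)
                result = new Node("DOT", result, new Node(next()));
            } else if (peek().equals("[")) {
                next();
                Node index = fieldExpr(); // assumption: stands in for expr
                expect("]");
                // -> ^(INDEX $fieldExpr expr)
                result = new Node("INDEX", result, index);
            } else {
                return result;
            }
        }
    }

    static String parse(String input) {
        return new FieldExprSketch(input).fieldExpr().toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("costumers[0].address"));
    }
}
```

With this, costumers[0].address comes out as
(DOT (INDEX costumers 0) address), i.e. the tree grows to the left,
which is what the labelled rule above is meant to produce.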


Hope this helps...
   -jbb





