[antlr-interest] Need some help with AST creation
John B. Brodie
jbb at acm.org
Fri Aug 6 11:14:25 PDT 2010
Greetings!
On Fri, 2010-08-06 at 16:41 +0100, Luis Pureza wrote:
> Hi,
>
> I need some help from the ANTLR wizards :)
I am not sure I qualify as an ANTLR wizard, but I'll try to answer your
question anyway.
>
> I'm trying to match expressions with field accesses and array indexes.
> For example:
>
> costumers.length
> costumers[0].address
> costumers[costumers.length - 1].orders[0].total
>
>
> The following rule seems to work:
>
> fieldExpr : atom ('.'^ identifier | ('['^ expr ']'!))*;
>
> However, it creates trees with nodes annotated with '[', and I'd
> prefer to have a dummy token like INDEX. For example, costumers[0] now
> returns
>
> ([ (ID costumers) (INT 0))
>
> But I'd like it to return
>
> (INDEX (ID costumers) (INT 0))
>
> I tried to create the AST manually with -> ^(...), but I ended up
> nowhere. Maybe I should've tried to refactor the grammar, but that
> would make it a little less readable, so I didn't do it.
> How do you suggest I do this?
>
This is discussed starting on page 174 of Dr. Parr's book The Definitive
ANTLR Reference [TDAR]. It may also be somewhere in the wiki -- haven't
looked for it there...
Bottom line: you are allowed to refer to the rule's own name in a rewrite
section. Inside a rewrite, $fieldExpr stands for the tree the rule has built
so far, so each pass through the loop can wrap the previous result under a
new root.
But first you need to tell ANTLR about your imaginary tokens by adding
    tokens { INDEX; DOT /* or whatever */; }
This belongs after the options{} block, if any, and before the first
rule (and, as far as I recall, before any @header or @members actions too).
Now your fieldExpr rule becomes:
    fieldExpr
        : ( a=atom -> $a /* initializes $fieldExpr */ )
          ( ( x='.' i=identifier -> ^(DOT[$x,"DOT"] $fieldExpr $i) )
          | ( x='[' e=expr ']'   -> ^(INDEX[$x,"INDEX"] $fieldExpr $e) )
          )*
        ;
(Note: the syntax above is from memory and may contain slight errors,
but you get the idea, I hope.)
(I think you do need all of those parentheses; as I recall, they are
required when mixing rewrites in among the syntax.)
(You can probably drop the labels (a=..., x=..., etc.), but to my poor
brain the rule is clearer with them.)
So, in its potentially tersest form:
    fieldExpr
        : ( atom -> atom )
          ( ( '.' identifier -> ^(DOT $fieldExpr identifier) )
          | ( '[' expr ']'   -> ^(INDEX $fieldExpr expr) )
          )*
        ;
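If it helps to see what the loop is doing, here is a rough hand-written Python sketch of the same idea. It is not ANTLR output and not how ANTLR implements rewrites; it only mimics the "wrap the tree built so far under a new root" step. The tokenizer, the Parser class, the show() helper and the minimal expr rule (subtraction only) are all assumptions made up for this illustration.

```python
import re

# Hypothetical tokenizer for this sketch: identifiers, integers, and the
# punctuation used in the thread ('.', '[', ']', '-').
TOKEN_RE = re.compile(r"\s*(?:(?P<ID>[A-Za-z_]\w*)|(?P<INT>\d+)|(?P<OP>[.\[\]-]))")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at offset {pos}")
        kind = m.lastgroup
        tokens.append((kind, m.group(kind)))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        # Text of the next token, or None at end of input.
        return self.tokens[self.pos][1] if self.pos < len(self.tokens) else None

    def next(self, kind=None, value=None):
        if self.pos >= len(self.tokens):
            raise SyntaxError("unexpected end of input")
        tok = self.tokens[self.pos]
        if (kind and tok[0] != kind) or (value and tok[1] != value):
            raise SyntaxError(f"unexpected token {tok[1]!r}")
        self.pos += 1
        return tok

    def expr(self):
        # Assumed minimal rule: expr : fieldExpr ('-'^ fieldExpr)*
        tree = self.field_expr()
        while self.peek() == "-":
            self.next()
            tree = ("-", tree, self.field_expr())
        return tree

    def field_expr(self):
        tree = self.atom()                      # like: a=atom -> $a
        while self.peek() in (".", "["):
            if self.next()[1] == ".":
                # like: -> ^(DOT $fieldExpr identifier)
                tree = ("DOT", tree, self.next("ID"))
            else:
                index = self.expr()
                self.next("OP", "]")
                # like: -> ^(INDEX $fieldExpr expr)
                tree = ("INDEX", tree, index)
        return tree

    def atom(self):
        tok = self.next()
        if tok[0] not in ("ID", "INT"):
            raise SyntaxError(f"unexpected token {tok[1]!r}")
        return tok


def parse(text):
    parser = Parser(text)
    tree = parser.expr()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("trailing input")
    return tree


def show(tree):
    # Leaves are (kind, text) pairs; inner nodes are (root, left, right).
    if len(tree) == 2:
        return f"({tree[0]} {tree[1]})"
    root, left, right = tree
    return f"({root} {show(left)} {show(right)})"


print(show(parse("costumers[0]")))  # (INDEX (ID costumers) (INT 0))
```

The local variable tree plays the role of $fieldExpr: it starts out as the atom, and every '.' or '[' suffix replaces it with a new node whose first child is the old value. That is why the chain nests to the left, with the last access ending up at the root.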
Hope this helps...
-jbb