[antlr-interest] Tree grammars & optional AST nodes
Terence Parr
parrt at cs.usfca.edu
Thu Oct 19 16:30:47 PDT 2006
On Oct 19, 2006, at 3:49 PM, Nicolas Rouquette wrote:
> When using ANTLR3 for tree grammars, I've noticed some annoying
> behavior.
>
> Suppose that we take a grammar like Java1.5:
>
> expression
> : conditionalExpression (assignmentOperator expression)?
> ;
>
> conditionalExpression
> : conditionalOrExpression ( '?' expression ':' expression )?
> ;
>
> ...
>
>
> As a tree grammar, it would make sense to write:
Hi :)
You mean a parser grammar that builds trees, right?
> expression
> : conditionalExpression
> | e1=conditionalExpression op=assignmentOperator e2=expression
> -> ^($op $e1 $e2)
> ;
Yes, but that is hard for ANTLRto deal with because conditional
expression is recursive and ANTLR cannot decide which alternative to
take. you could turn on the backtrack mode:
expression
options {backtrack=true;}
: conditionalExpression
| e1=conditionalExpression op=assignmentOperator e2=expression
-> ^($op $e1 $e2)
;
but that won't work either because it will always match the first
alternative--you would have to reverse the alternatives.
expression
options {backtrack=true;}
: e1=conditionalExpression op=assignmentOperator e2=expression
-> ^($op $e1 $e2)
| conditionalExpression
;
OR, do this:
expression
: conditionalExpression (assignmentOperator^^ expression)?
;
:)
Hard to beat tree operators for expression stuff.
Ter
>
> conditionalExpression
> : conditionalOrExpression
> | c=conditionalOrExpression '?' e1=expression ':' e2=expression
> -> ^(IF_THEN_ELSE $c $e1 $e2)
> ;
>
> ...
>
> I get a bunch of errors:
>
> internal eror:
> org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:778):
> could not even do k=1 for decision <N>
>
> for some value of <N>
>
> When I run Antlr with the -Xwatchconversion option,
> I see that the culprit is in 'expression' and 'conditionalExpression'.
>
> To get rid of these errors, I changed the grammar like this:
>
> expression
> : e1=conditionalExpression (op=assignmentOperator e2=expression)?
> -> ^(COND_EXPR $e1 $op? $e2?)
> ;
>
> conditionalExpression
> : c=conditionalOrExpression ('?' e1=expression ':' e2=expression)?
> -> ^(IF_THEN_ELSE $c $e1? $e2?)
> ;
>
> ...
>
> This would be OK except that when parsing the mother-of-all
> expressions, 42,
> we'd get something like this:
>
> COND_EXPR -> IF_THEN_ELSE -> .... -> (INTEGER 42)
>
> when instead I'd like to have just:
>
> (INTEGER 42)
>
> So here are my universal questions:
>
> a) is there a better way to write the tree grammar
> to avoid non-essential AST nodes?
> b) if not, can someone point me to an example of a "cleanup" tree
> phase
> that eliminate all but the essential AST nodes?
>
> -- Nicolas.
>
> P.S. Yes, Antlr3 is really cool; ANTLRWorks makes lazy look cool.
>
>
More information about the antlr-interest
mailing list