[antlr-interest] Tree grammars & optional AST nodes

Terence Parr parrt at cs.usfca.edu
Thu Oct 19 16:30:47 PDT 2006


On Oct 19, 2006, at 3:49 PM, Nicolas Rouquette wrote:

> When using ANTLR3 for tree grammars, I've noticed some annoying  
> behavior.
>
> Suppose that we take a grammar like Java1.5:
>
> expression
>  : conditionalExpression (assignmentOperator expression)?
>  ;
>
> conditionalExpression
>  : conditionalOrExpression ( '?' expression ':' expression )?
>  ;
>
> ...
>
>
> As a tree grammar, it would make sense to write:

Hi :)

You mean a parser grammar that builds trees, right?

> expression
>  : conditionalExpression
>  | e1=conditionalExpression op=assignmentOperator e2=expression
>    -> ^($op $e1 $e2)
>  ;

Yes, but that is hard for ANTLRto deal with because conditional  
expression is recursive and ANTLR cannot decide which alternative to  
take. you could turn on the backtrack mode:

expression
options {backtrack=true;}
  : conditionalExpression
  | e1=conditionalExpression op=assignmentOperator e2=expression
    -> ^($op $e1 $e2)
  ;

  but that won't work either because it will always match the first  
alternative--you would have to reverse the alternatives.

expression
options {backtrack=true;}
  : e1=conditionalExpression op=assignmentOperator e2=expression
    -> ^($op $e1 $e2)
  | conditionalExpression
  ;

OR, do this:

expression
  : conditionalExpression (assignmentOperator^^ expression)?
  ;

:)

Hard to beat tree operators for expression stuff.

Ter

>
> conditionalExpression
>  : conditionalOrExpression
>  | c=conditionalOrExpression '?' e1=expression ':' e2=expression
>    -> ^(IF_THEN_ELSE $c $e1 $e2)
>  ;
>
> ...
>
> I get a bunch of errors:
>
> internal eror:
> org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:778):
> could not even do k=1 for decision <N>
>
> for some value of <N>
>
> When I run Antlr with the -Xwatchconversion option,
> I see that the culprit is in 'expression' and 'conditionalExpression'.
>
> To get rid of these errors, I changed the grammar like this:
>
> expression
>  : e1=conditionalExpression (op=assignmentOperator e2=expression)?
>    -> ^(COND_EXPR $e1 $op? $e2?)
>  ;
>
> conditionalExpression
>  : c=conditionalOrExpression ('?' e1=expression ':' e2=expression)?
>    -> ^(IF_THEN_ELSE $c $e1? $e2?)
>  ;
>
> ...
>
> This would be OK except that when parsing the mother-of-all  
> expressions, 42,
> we'd get something like this:
>
> COND_EXPR -> IF_THEN_ELSE -> .... -> (INTEGER 42)
>
> when instead I'd like to have just:
>
> (INTEGER 42)
>
> So here are my universal questions:
>
> a) is there a better way to write the tree grammar
>    to avoid non-essential AST nodes?
> b) if not, can someone point me to an example of a "cleanup" tree  
> phase
>    that eliminate all but the essential AST nodes?
>
> -- Nicolas.
>
> P.S. Yes, Antlr3 is really cool; ANTLRWorks makes lazy look cool.
>
>



More information about the antlr-interest mailing list