[antlr-interest] start/stop tokens for subrules?

David Holroyd dave at badgers-in-foil.co.uk
Sun Sep 2 05:58:04 PDT 2007


On Thu, Aug 30, 2007 at 08:58:15PM +0000, David Holroyd wrote:
> Knowing the start/stop tokens for an AST node is very important to my
> application.
> 
> ANTLRv3 generated parsers don't currently seem to set start/stop tokens
> for the root AST nodes created by subrules.  Only the final result of the
> rule as a whole gets these values defined.  e.g. in,
> 
>   identifier
>    :  (  qualifiedIdent -> qualifiedIdent
>       )
>       (  options{greedy=true;}
>       :   d=DOT qualifiedIdent
>           -> ^(PROPERTY_OR_IDENTIFIER[$d] $identifier qualifiedIdent)
>       )*
>       -> ^(IDENTIFIER $identifier)
>     ;
> 
> the generated PROPERTY_OR_IDENTIFIER node which is created as the root
> of the subtree does not get the adaptor.setTokenBoundaries() love.
> 
> 
> Admittedly, even if the subrule's root *did* get these values defined,
> they would be 'wrong' for the tree I want to construct (I want the
> startToken to reflect the start of the first child, not LT(1) at the
> start of the subrule).
> 
> I therefore end up creating auxiliary rules like this:
> 
>   identifier 
>    :  (  qualifiedIdent -> qualifiedIdent
>       )
>       (  options{greedy=true;}
>       :  poi=propOrIdent[root_0, retval.start] -> $poi
>       )*
>       -> ^(IDENTIFIER $identifier)
>     ;
> 
>   propOrIdent[Tree identPrimary, Token startToken]
>    :  { retval.start = startToken; }
>       d=DOT propId=qualifiedIdent
>       -> ^(PROPERTY_OR_IDENTIFIER[$d] {$identPrimary} $propId)
>     ;

I've come up with a simpler scheme elsewhere in the grammar, where I'm
using tree operators rather than rewrites:

  // multiplication/division/modulo (level 2)
  multiplicativeExpression
    :  unaryExpression
       (  o=multiplicativeOperator^
          unaryExpression
          { demarcate($o.tree); }
       )*
    ;

The demarcate() function just sets the start/stop tokens for the given
node to be the start and stop tokens from the first and last child
nodes, respectively.  With my custom tree node class, it is written as,

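  // Give a root node the token range spanned by its children.
  // Assumes the node already has at least one child.
  private void demarcate(LinkedListTree parent) {
      parent.setStartToken(parent.getFirstChild().getStartToken());
      parent.setStopToken(parent.getLastChild().getStopToken());
  }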
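
For anyone reading along without the custom LinkedListTree class, the same
idea can be sketched standalone. This is only an illustration under stated
assumptions: Tok, Node and DemarcateSketch are hypothetical stand-ins invented
here, not ANTLR runtime classes and not the real LinkedListTree. Unlike the
version above, this one leaves the parent untouched when it has no children.

```java
// Hypothetical stand-in for a token; NOT an ANTLR runtime class.
final class Tok {
    final int index;    // position in the token stream
    final String text;
    Tok(int index, String text) { this.index = index; this.text = text; }
}

// Hypothetical stand-in for a tree node that records start/stop tokens.
final class Node {
    final String label;
    Tok start;
    Tok stop;
    final java.util.List<Node> children = new java.util.ArrayList<>();

    Node(String label) { this.label = label; }

    // A node built from a single token starts and stops at that token.
    static Node leaf(Tok t) {
        Node n = new Node(t.text);
        n.start = t;
        n.stop = t;
        return n;
    }

    Node add(Node child) { children.add(child); return this; }
}

final class DemarcateSketch {
    // Copy the first child's start token and the last child's stop token
    // onto the parent. A childless parent keeps its existing boundaries.
    static void demarcate(Node parent) {
        if (parent.children.isEmpty()) return;
        Node first = parent.children.get(0);
        Node last = parent.children.get(parent.children.size() - 1);
        if (first.start != null) parent.start = first.start;
        if (last.stop != null) parent.stop = last.stop;
    }

    public static void main(String[] args) {
        // Tokens for the input "a * b"; the operator becomes the root,
        // as it would with the ^ tree operator.
        Tok a = new Tok(0, "a");
        Tok star = new Tok(1, "*");
        Tok b = new Tok(2, "b");
        Node mul = Node.leaf(star).add(Node.leaf(a)).add(Node.leaf(b));

        demarcate(mul);
        System.out.println(mul.start.text + ".." + mul.stop.text); // prints a..b
    }
}
```

Before the call, the root built from "*" starts and stops at the operator
token itself; afterwards it spans from "a" to "b", which is the range wanted
for the whole expression.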




-- 
http://david.holroyd.me.uk/

