[antlr-interest] Grammar puzzle....

Thu Jul 12 06:13:14 PDT 2007

I am trying to parse the following input:
===========
3 & 4 + a is c + 4 & 3
===========

with the following grammar:
===========
grammar TreeTest;

options {output=AST;}
tokens
{
 IS='is';
 XAMLNS;
}

expression: logical ;

logical : compare (LOR^ | LAND^ compare)* ;

compare : (additive -> additive)
  (
   ( op=(LT | GT) s=additive -> ^($op $compare $s) )
  | is='is' i=additive -> ^(IS[$is] $compare $i)
  )?
  ;

additive: multiple ((PLUS^ | MINUS^) multiple)* ;

multiple: atom ((MULT^ | DIV^) atom)* ;

atom :  identifier | INT;

identifier 
 : ( xaml=ID COLON )? id0=ID
  ( DOT id+=ID )*
  -> ^( ID[$id0] ^( XAMLNS[$xaml] ) ^( ID[$id0] $id0 $id+ )  )
 ;

ID : 'a'..'z' + ;
INT : '0'..'9' +;
PLUS :  '+';
MINUS : '-';
MULT :  '*';
DIV : '/';
LAND : '&';
LOR : '|';
LT : '<';
GT : '>';
DOT : '.';
COMMA : ',';
COLON : ':';
WS : (' ' |'\n' |'\r' ) {$channel=HIDDEN;} ;
===========

Parsing stops just before 'is', i.e. only "3 & 4 + a" gets parsed.
I can't understand why.
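
To rule out the lexer, the token stream for this input can be approximated outside ANTLR. The following is only a rough Python sketch, not ANTLR's generated lexer: the `tokenize` helper and the `TOKEN_SPEC` table are hypothetical names, and it assumes the 'is' literal takes priority over ID for the exact text "is", while longer words such as "island" still lex as ID.

```python
import re

# Token patterns mirroring the lexer rules of the grammar above.
# Assumption: the 'is' literal is tried before ID, so it wins for the exact
# text "is"; the negative lookahead lets longer words fall through to ID.
TOKEN_SPEC = [
    ("IS", r"is(?![a-z])"),
    ("ID", r"[a-z]+"),
    ("INT", r"[0-9]+"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("MULT", r"\*"),
    ("DIV", r"/"),
    ("LAND", r"&"),
    ("LOR", r"\|"),
    ("LT", r"<"),
    ("GT", r">"),
    ("DOT", r"\."),
    ("COMMA", r","),
    ("COLON", r":"),
    # The grammar's WS rule matches one character at a time; matching a run
    # here is equivalent because whitespace is hidden from the parser anyway.
    ("WS", r"[ \n\r]+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (token_type, text) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize("3 & 4 + a is c + 4 & 3"):
        print(token)
```

If the real lexer behaves like this sketch, 'is' reaches the parser as an IS token rather than an ID, so the lexer alone would not explain why parsing stops there.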

What seems even more mysterious to me is that if I simplify my 'identifier' rule to this:
===========
identifier: ID;
===========

then I can parse the whole input.

For the life of me, I can't understand why the original form of the 'identifier' rule prevents 'is' from being parsed.

Any tips?