[antlr-interest] Problem parsing left-recursive expressions (AntLR 2.7.5)

Koehne Kai Kai.Koehne at student.hpi.uni-potsdam.de
Fri May 19 08:49:11 PDT 2006


Hi,
 
I have a rather nasty left-recursive structure to parse that I just cannot manage to handle correctly :-( Maybe I am just too blind to see it, and you can enlighten me :-)
 
Here is the grammar I started with:
 
  variable:
    NAME
  | variable LBRACK expression RBRACK
  | LBRACK variable FROM base (FOR count)? RBRACK
  ;
 
This is left-recursive, so I tried to refactor it a bit:
 
  variable:
    (LBRACK)* NAME
      ( 
           ( FROM base ( FOR count )? RBRACK )
         | ( LBRACK expression RBRACK )
      )*
  ;
 
This is not exactly the same: nothing ensures an equal number of LBRACK and RBRACK tokens. This is why I augmented it further:
 
  variable:
  {
    int openBrackets = 0;
  }
    (LBRACK {openBrackets++;})* NAME
      ( 
           { openBrackets > 0 }?
           ( FROM base ( FOR count )? RBRACK  { openBrackets--;} )
         | 
           ( LBRACK expression RBRACK )
      )*
  {
    if (openBrackets > 0)
      throw new SemanticException();
  }
  ;
 
This works most of the time, UNLESS the rule is used in a syntactic predicate: there the openBrackets actions are not executed, so the rule incorrectly matches input like "LBRACK NAME".
 
So my questions are:
  - Is there another clever way to refactor the rules?
  - If not, how can I ensure that the openBrackets checks are executed even in guessing mode? I already tried things like faking the "inputState.guessing" setting and embedding everything in syntactic predicates that are always true, but nothing worked.
  - I would even consider hand-coding the variable rule. Is this somehow possible?
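
To make the first and third questions concrete: the textbook way to remove the left recursion would be to keep the bracketed alternative recursive and turn only the trailing index into a loop, i.e. roughly "variable: primary (LBRACK expression RBRACK)*" with "primary: NAME | LBRACK variable FROM base (FOR count)? RBRACK". Below is a minimal hand-written sketch of that shape in plain Java. It is only an illustration under assumptions: it uses no ANTLR classes, the class and method names are made up, tokens are whitespace-separated strings, and since the real expression, base and count rules are not shown above, base and count are assumed to be plain numbers and expression a number or a variable.

```java
// Hypothetical stand-alone recognizer; it does not use any ANTLR classes.
final class VariableRecognizer {
    private final String[] tokens;
    private int pos = 0;

    private VariableRecognizer(String input) {
        // Assumption: tokens are separated by whitespace, e.g. "[ a from 1 ]".
        this.tokens = input.trim().split("\\s+");
    }

    /** Returns true if the whole input is exactly one variable. */
    static boolean accepts(String input) {
        VariableRecognizer r = new VariableRecognizer(input);
        try {
            r.variable();
            return r.pos == r.tokens.length;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    private String peek() {
        return pos < tokens.length ? tokens[pos] : "<EOF>";
    }

    private void match(String expected) {
        if (!peek().equals(expected)) {
            throw new IllegalStateException("expected " + expected + ", found " + peek());
        }
        pos++;
    }

    private static boolean isName(String t) {
        return t.matches("[A-Za-z_]\\w*") && !t.equals("from") && !t.equals("for");
    }

    // variable : primary ( LBRACK expression RBRACK )* ;
    private void variable() {
        primary();
        while (peek().equals("[")) {
            match("[");
            expression();
            match("]");
        }
    }

    // primary : NAME | LBRACK variable FROM base ( FOR count )? RBRACK ;
    private void primary() {
        if (peek().equals("[")) {
            match("[");
            variable();
            match("from");
            number();                      // base (assumed to be a number)
            if (peek().equals("for")) {
                match("for");
                number();                  // count (assumed to be a number)
            }
            match("]");
        } else if (isName(peek())) {
            pos++;                         // NAME
        } else {
            throw new IllegalStateException("expected NAME or [, found " + peek());
        }
    }

    // Stand-in for the real expression rule: a number or a variable.
    private void expression() {
        if (peek().matches("[0-9]+")) {
            pos++;
        } else {
            variable();
        }
    }

    private void number() {
        if (!peek().matches("[0-9]+")) {
            throw new IllegalStateException("expected number, found " + peek());
        }
        pos++;
    }
}
```

With this sketch, VariableRecognizer.accepts("[ a from 1 for 2 ] [ 3 ]") is accepted while the problematic "[ a" is rejected, because every LBRACK that opens a primary must be closed by the matching RBRACK inside the same call. The bracket balance comes from the recursion itself, so no counter is needed. Whether the same shape stays unambiguous inside the full grammar depends on what expression can start with, which is not shown here.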
 
Thanks in advance,
 
Kai Koehne

