[antlr-interest] Problem parsing left-recursive expressions (ANTLR 2.7.5)

Fri May 19 08:49:11 PDT 2006

Hi,

I have a rather nasty left-recursive structure to parse that I just cannot manage to handle correctly :-( Maybe I am just too blind to see it, and you can enlighten me :-)

Here is the grammar I started with:

  variable:
    NAME
  | variable LBRACK expression RBRACK
  | LBRACK variable FROM base (FOR count)? RBRACK
  ;

This is left-recursive, so I tried to refactor it a bit:

  variable:
    (LBRACK)* NAME
      ( 
           ( FROM base ( FOR count )? RBRACK )
         | ( LBRACK expression RBRACK )
      )*
  ;

This is not exactly the same: it does not ensure that there is an equal number of LBRACK and RBRACK tokens. This is why I further augmented it:

  variable:
  {
    int openBrackets = 0;
  }
    (LBRACK {openBrackets++;})* NAME
      ( 
           { openBrackets > 0 }?
           ( FROM base ( FOR count )? RBRACK  { openBrackets--;} )
         | 
           ( LBRACK expression RBRACK )
      )*
  {
    if (openBrackets > 0)
      throw new SemanticException();
  }
  ;

This works most of the time, unless the rule is used inside a syntactic predicate: in guessing mode none of the openBrackets actions are executed, so the rule incorrectly matches input such as "LBRACK NAME".

So my questions are:
  - Is there another clever way to refactor the rules?
  - If not, how can I ensure that the openBrackets checks are executed even in guessing mode? I already tried things like faking the "inputState.guessing" setting and embedding everything in syntactic predicates that are always true, but nothing worked.
  - I would even consider hand-coding the variable rule. Is this somehow possible?
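
To make the hand-coding idea concrete, here is a rough Java sketch of a standalone recursive-descent recognizer for the variable rule. It is only an illustration under several assumptions: the class VariableRecognizer and its methods are hypothetical names, tokens are represented as plain type-name strings, and expression, base and count are reduced to a single NUMBER token because the real rules are not shown here. It splits the rule into a non-left-recursive "primary" followed by a loop over subscripts, so the brackets are balanced by the recursion itself and no counter is needed. How such a method would be plugged into the parser generated by ANTLR is not shown.

```java
import java.util.List;

/** Hand-written recognizer for the `variable` rule (illustrative sketch only). */
final class VariableRecognizer {
    private final List<String> tokens; // token type names, e.g. "LBRACK", "NAME"
    private int pos = 0;               // index of the current lookahead token

    private VariableRecognizer(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Returns true if the whole token list forms exactly one `variable`. */
    static boolean matches(List<String> tokens) {
        VariableRecognizer r = new VariableRecognizer(tokens);
        return r.variable() && r.pos == tokens.size();
    }

    /** Current lookahead token type, or "EOF" at the end of input. */
    private String la() {
        return pos < tokens.size() ? tokens.get(pos) : "EOF";
    }

    /** Consumes the lookahead token if it has the given type. */
    private boolean match(String type) {
        if (!la().equals(type)) {
            return false;
        }
        pos++;
        return true;
    }

    // variable : primary ( LBRACK expression RBRACK )* ;
    private boolean variable() {
        if (!primary()) {
            return false;
        }
        while (la().equals("LBRACK")) {
            pos++; // consume LBRACK
            if (!expression() || !match("RBRACK")) {
                return false;
            }
        }
        return true;
    }

    // primary : NAME | LBRACK variable FROM base ( FOR count )? RBRACK ;
    private boolean primary() {
        if (match("NAME")) {
            return true;
        }
        if (!match("LBRACK")) {
            return false;
        }
        if (!variable() || !match("FROM") || !base()) {
            return false;
        }
        if (match("FOR") && !count()) {
            return false;
        }
        return match("RBRACK");
    }

    // Placeholders (assumption): the real rules are not part of this post.
    private boolean expression() { return match("NUMBER"); }
    private boolean base()       { return match("NUMBER"); }
    private boolean count()      { return match("NUMBER"); }
}
```

With these placeholder rules, VariableRecognizer.matches accepts the token sequence NAME LBRACK NUMBER RBRACK and rejects the problematic LBRACK NAME, because the opening bracket of a primary can only be closed by the matching RBRACK after FROM.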

Thanks in advance,

Kai Koehne