[antlr-interest] empty alternatives erroneously consume tokens

Alexey Demakov demakov at ispras.ru
Thu Mar 2 02:31:57 PST 2006


1. The empty alternative does not consume tokens, because only the match(...)
method consumes tokens. There are no match() calls in the code for the empty
alternative.

2. _tokenSet_68 contains all tokens that can follow the cont rule.
So, if this is the full grammar, nothing can follow cont and _tokenSet_68 is
empty. That causes
line 7:1: unexpected token: null

3. Try adding EOF at the end of the c_expression rule:
c_expression : (c_atom|c_bracket_exp|c_prefix_exp) cont EOF;
I'm sure it will solve the problem.
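
To make the three points above concrete, here is a minimal hand-written Java
sketch. It is not ANTLR output: the class EmptyAltSketch, the token type
numbers, the parses() helper and the simplified rule set (c_expression reduced
to "INT cont") are all assumptions made for illustration. The followOfCont set
plays the role of _tokenSet_68.

```java
import java.util.Collections;
import java.util.Set;

// Hand-written sketch, not ANTLR output. Token type numbers are made up.
class EmptyAltSketch {
    static final int EOF = 1, INT = 4, PLUS = 5, MINUS = 6;

    private final int[] tokens;
    private final Set<Integer> followOfCont; // plays the role of _tokenSet_68
    private int pos = 0;

    EmptyAltSketch(int[] tokens, Set<Integer> followOfCont) {
        this.tokens = tokens;
        this.followOfCont = followOfCont;
    }

    // Lookahead only inspects a token; past the end of input it reports EOF.
    int LA(int i) {
        int idx = pos + i - 1;
        return idx < tokens.length ? tokens[idx] : EOF;
    }

    // match() is the only place where the input position advances.
    void match(int type) {
        if (LA(1) != type) {
            throw new IllegalStateException("unexpected token: " + LA(1));
        }
        pos++;
    }

    // Simplified: c_expression : INT cont ;
    void cExpression() {
        match(INT);
        cont();
    }

    // cont : arithmetic_op INT cont | /* empty */ ;
    void cont() {
        if ((LA(1) == PLUS || LA(1) == MINUS) && LA(2) == INT) {
            match(LA(1));
            match(INT);
            cont();
        } else if (followOfCont.contains(LA(1))) {
            // Empty alternative: no match() call, so nothing is consumed.
        } else {
            // LA(1) is not in the follow set, so the rule is rejected.
            throw new IllegalStateException("unexpected token: " + LA(1));
        }
    }

    static boolean parses(int[] tokens, Set<Integer> follow) {
        try {
            new EmptyAltSketch(tokens, follow).cExpression();
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        int[] input = { INT, MINUS, INT };
        // Empty follow set: the empty alternative is rejected at end of input.
        System.out.println(parses(input, Collections.<Integer>emptySet()));
        // Follow set containing EOF: the same input is accepted.
        System.out.println(parses(input, Collections.singleton(EOF)));
    }
}
```

Run as written, this prints false and then true: the input is the same, only
the follow set differs. Note that in the posted grammar c_expression is also
referenced from c_bracket_exp and c_prefix_exp, so the EOF may need to live in
a separate start rule; the sketch sidesteps this because nothing else calls
its only expression rule.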

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com


----- Original Message ----- 
From: "Michael Brade" <brade at informatik.uni-muenchen.de>
To: <antlr-interest at antlr.org>
Sent: Thursday, March 02, 2006 12:19 PM
Subject: [antlr-interest] empty alternatives erroneously consume tokens

Hi,

I have a problem writing a grammar for parsing arithmetic expressions that
contain mixed prefix and infix notation. For instance, this expression
  +( 1-(3+4) +(2 3)-1 )
simply means
  1-(3+4)+2+3-1
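
To illustrate the notation, here is a small hand-written Java evaluator; it
is a sketch, not part of the ANTLR grammar. The class name
MixedNotationSketch, the eval() helper and the reading of a prefix minus as
"-(a b) = a-b" are assumptions; only the prefix plus is defined by the example
above. The speculative "try an operator, rewind on failure" step in cont()
stands in for the lookahead test described next.

```java
// Hand-written sketch: evaluates mixed prefix/infix expressions over integers.
class MixedNotationSketch {
    private final String src;
    private int pos = 0;

    MixedNotationSketch(String src) { this.src = src; }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw new IllegalStateException("expected '" + c + "' at " + pos);
        pos++;
    }

    private long atom() {
        int start = pos;
        while (pos < src.length() && Character.isDigit(src.charAt(pos))) pos++;
        return Long.parseLong(src.substring(start, pos));
    }

    // expression : operand ( op operand )* , with the loop guarded by a trial parse
    long expression() {
        return cont(operand());
    }

    // operand : INT | '(' expression ')' | op '(' expression expression ')'
    private long operand() {
        char c = peek();
        if (Character.isDigit(c)) return atom();
        if (c == '(') {
            pos++;
            long v = expression();
            expect(')');
            return v;
        }
        if (c == '+' || c == '-') {
            pos++;
            expect('(');
            long a = expression();
            long b = expression();
            expect(')');
            return c == '+' ? a + b : a - b; // prefix minus is an assumption
        }
        throw new IllegalStateException("unexpected input at " + pos);
    }

    // Tries "op operand"; if that fails, rewinds and leaves the operator alone.
    private long cont(long left) {
        while (true) {
            char op = peek();
            if (op != '+' && op != '-') return left;
            int mark = pos;
            try {
                pos++;
                long right = operand();
                left = op == '+' ? left + right : left - right;
            } catch (IllegalStateException e) {
                pos = mark; // the operator starts a prefix expression instead
                return left;
            }
        }
    }

    static long eval(String text) {
        MixedNotationSketch p = new MixedNotationSketch(text);
        long v = p.expression();
        if (p.peek() != '\0') throw new IllegalStateException("trailing input at " + p.pos);
        return v;
    }

    public static void main(String[] args) {
        System.out.println(eval("+( 1-(3+4) +(2 3)-1 )"));
        System.out.println(eval("1-(3+4)+2+3-1"));
    }
}
```

Both calls print -2, which matches the claim that the two expressions mean the
same thing.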

What I would need in ANTLR is a syntactic predicate inside a loop, like this:

  c_expression: c_atom ( (arithmetic_op c_atom) => arithmetic_op c_atom )* ;

to test the following tokens at each repetition and quit the loop as soon as
the predicate fails, but that doesn't work.

So I thought about creating a recursive rule "cont" with the predicate and an
empty alternative if the predicate fails. This is the complete little parser:

c_expression : (c_atom|c_bracket_exp|c_prefix_exp) cont ;

cont  : (arithmetic_op (c_atom|c_bracket_exp|c_prefix_exp))
            => arithmetic_op (c_atom|c_bracket_exp|c_prefix_exp) cont
      | { /* empty */ }
      ;

c_bracket_exp: BR_OPEN c_expression BR_CLOSE ;
c_prefix_exp : arithmetic_op BR_OPEN c_expression c_expression BR_CLOSE ;
c_atom       : INT ;

arithmetic_op: PLUS | MINUS ;

BR_OPEN: '(' ;
BR_CLOSE: ')' ;

However, instead of an empty "else" branch in cont(), the generated code
looks like this:

if ( synPredMatched221 ) {
    // here's the code for the predicate
}
else if ((_tokenSet_68.member(LA(1)))) {
    if ( inputState.guessing==0 ) {
        /* empty */
    }
}
else {
    throw new NoViableAltException(LT(1), getFilename());
}

So when parsing an expression I get
   ....
   < c_expression; LA(1)==)
  < c_prefix_exp; LA(1)==null
  > cont; LA(1)==null
line 7:1: unexpected token: null  // here the expression ends correctly
  < cont; LA(1)==null
 < c_expression; LA(1)==null

at the end of the parse since the "else if" wants another token.
If I remove the "else if ((_tokenSet_68...." stuff and change the code to

if ( synPredMatched221 ) {
    // here's the code for the predicate
}
else {
}

it works! So what can I do to make ANTLR generate the code I want to have?
I.e., an empty else branch?

Cheers,
-- 
Michael Brade;                 KDE Developer, Student of Computer Science
  |-mail: echo brade !#|tr -d "c oh"|s\e\d 's/e/\@/2;s/$/.org/;s/bra/k/2'
  °--web: http://www.kde.org/people/michaelb.html

KDE 4: Beyond Your Expectations



