[antlr-interest] A look-ahead/rewind problem

Mon Mar 23 05:09:18 PDT 2009

Hi all,

I'm writing a translator which aims to convert input delimited by tags

BLOCK <name>

in the following way:

BLOCK A      =>      start of block A
BLOCK B      =>       start of block B
BLOCK B      =>       end of block B
BLOCK A      =>      end of block A

The difficulty is caused by the fact that the opening tag (e.g. 'BLOCK A')
is exactly the same as the closing one ('BLOCK A'), so I need to track on my
own if it is an opening or a closing.

The algorithm is simple: if I encounter a block with the name that is
already known, then I have encountered a closing tag. If the name is
unknown, than I open a new block.

BLOCK A      =>      start of block A
BLOCK A      =>      end of block A
BLOCK A      =>      start of block A
BLOCK A      =>      end of block A

The problem is that I can make the decision only when I have collected the
entire name of a block. If I can see, that the name is already known (so it
is not an opening of a new block), then I need to unput the chars and go to
closing rule. Unfortunately there is nothing like the *unput()* function in
antlr, so I need to use syntactic predicate to make an unlimited look ahead
for the name.

I wrote grammar, but I get an error. Please help me understand why.

[12:48:09] Checking Grammar...
[12:48:09] warning(200): block.g:7:8: Decision can match input such as
"BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple alternatives:
1, 2
As a result, alternative(s) 2 were disabled for that input
[12:48:09] warning(200): block.g:8:19: Decision can match input such as
"BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple alternatives:
1, 2
As a result, alternative(s) 2 were disabled for that input
[12:48:09] error(201): block.g:8:19: The following alternatives can never be
matched: 2

This is the grammar that I use:

grammar block;
@members{
  Stack<String> stack = new Stack<String>();
}

data: blocks EOF;
blocks:block+;
block: block_open blocks? block_close;
block_open
    :  ((BLOCK_BOUNDARY
block_name)=>{stack.size()==0||!stack.peek().equals($block_name.text)}?) =>
    BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
{stack.push($block_name.text);System.out.println("start of block " +
$block_name.text);};
block_close:BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
{System.out.println("end of block " + $block_name.text);stack.pop();};
block_name:LETTER+ ALPHANUM*;

LETTER    :     ('a'..'z'|'A'..'Z');
ALPHANUM    :    (LETTER|'0'..'9');
BLOCK_BOUNDARY     :     'block';
NEWLINE :     ( CR )? LF | CR;
fragment CR :'\r';
fragment LF : '\n';
INSIGNIFICANT_CHAR:.;

It looks like antlr does not see that the block_open is guarder by syntactic
predicate, which is not always true (because
"stack.size()==0||!stack.peek().equals($block_name.text)"
condition can be false).

Thanks for any advice,
Lukasz
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090323/b2165d48/attachment.html