[antlr-interest] A look-ahead/rewind problem
Andreas Meyer
andreas.meyer at smartshift.de
Mon Mar 23 05:20:15 PDT 2009
Lukasz Guminski schrieb:
> Hi all,
>
> I'm writing a translator which aims to convert input delimited by tags
>
> BLOCK <name>
>
> in the following way:
>
> BLOCK A => start of block A
> BLOCK B => start of block B
> BLOCK B => end of block B
> BLOCK A => end of block A
>
> The difficulty is caused by the fact that the opening tag (e.g. 'BLOCK
> A') is exactly the same as the closing one ('BLOCK A'), so I need to
> track on my own if it is an opening or a closing.
>
> The algorithm is simple: if I encounter a block with the name that is
> already known, then I have encountered a closing tag. If the name is
> unknown, than I open a new block.
>
> BLOCK A => start of block A
> BLOCK A => end of block A
> BLOCK A => start of block A
> BLOCK A => end of block A
>
> The problem is that I can make the decision only when I have collected
> the entire name of a block. If I can see, that the name is already
> known (so it is not an opening of a new block), then I need to unput
> the chars and go to closing rule. Unfortunately there is nothing like
> the /unput()/ function in antlr, so I need to use syntactic predicate
> to make an unlimited look ahead for the name.
>
> I wrote grammar, but I get an error. Please help me understand why.
>
> [12:48:09] Checking Grammar...
> [12:48:09] warning(200): block.g:7:8: Decision can match input such as
> "BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple
> alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> [12:48:09] warning(200): block.g:8:19: Decision can match input such
> as "BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple
> alternatives: 1, 2
> As a result, alternative(s) 2 were disabled for that input
> [12:48:09] error(201): block.g:8:19: The following alternatives can
> never be matched: 2
>
> This is the grammar that I use:
>
> grammar block;
> @members{
> Stack<String> stack = new Stack<String>();
> }
>
> data: blocks EOF;
> blocks:block+;
> block: block_open blocks? block_close;
> block_open
> : ((BLOCK_BOUNDARY
> block_name)=>{stack.size()==0||!stack.peek().equals($block_name.text)}?)
> =>
> BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
> {stack.push($block_name.text);System.out.println("start of block " +
> $block_name.text);};
> block_close:BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
> {System.out.println("end of block " + $block_name.text);stack.pop();};
> block_name:LETTER+ ALPHANUM*;
>
> LETTER : ('a'..'z'|'A'..'Z');
> ALPHANUM : (LETTER|'0'..'9');
> BLOCK_BOUNDARY : 'block';
> NEWLINE : ( CR )? LF | CR;
> fragment CR :'\r';
> fragment LF : '\n';
> INSIGNIFICANT_CHAR:.;
>
> It looks like antlr does not see that the block_open is guarder by
> syntactic predicate, which is not always true (because
> "stack.size()==0||!stack.peek().equals($block_name.text)" condition
> can be false).
>
> Thanks for any advice,
> Lukasz
The problem is that semantic predicates like {...}? are only evaluated
when there is an ambiguity ... and maybe ANTLR thinks the ambiguity is
already resolved by your syntactic predicate. Actually, I am not sure
about the structure you gave, namely, a syntactic predicate inside a
syntactic predicate, maybe it also means "match the empty string after
seeing BLOCK_BOUNDARY, I dont know.
I would suggest you try without the syntactic predicate, turn your
semantic predicate into a gated semantic predicate {...}? => such that
it is forced to be evaluated:
block_open: {is_block_open()}?=> BLOCK_BOUNDARY
Btw, why do you have BLOCK_BOUNDARY twice in each rule? ACcording to
your example, you would only need, one, right?
Best,
Andreas Meyer
More information about the antlr-interest
mailing list