[antlr-interest] A look-ahead/rewind problem

Lukasz Guminski lguminski+antlr at gmail.com
Mon Mar 23 06:59:54 PDT 2009


Below my responses which mistakenly have been sent to Andreas only.

2009/3/23 Andreas Meyer <andreas.meyer at smartshift.de>

Lukasz Guminski schrieb:
> > Hi all,
> >
> > I'm writing a translator which aims to convert input delimited by tags
> >
> > BLOCK <name>
> >
> > in the following way:
> >
> > BLOCK A      =>      start of block A
> > BLOCK B      =>       start of block B
> > BLOCK B      =>       end of block B
> > BLOCK A      =>      end of block A
> >
> > The difficulty is caused by the fact that the opening tag (e.g. 'BLOCK
> > A') is exactly the same as the closing one ('BLOCK A'), so I need to
> > track on my own if it is an opening or a closing.
> >
> > The algorithm is simple: if I encounter a block with the name that is
> > already known, then I have encountered a closing tag. If the name is
> > unknown, than I open a new block.
> >
> > BLOCK A      =>      start of block A
> > BLOCK A      =>      end of block A
> > BLOCK A      =>      start of block A
> > BLOCK A      =>      end of block A
> >
> > The problem is that I can make the decision only when I have collected
> > the entire name of a block. If I can see, that the name is already
> > known (so it is not an opening of a new block), then I need to unput
> > the chars and go to closing rule. Unfortunately there is nothing like
> > the /unput()/ function in antlr, so I need to use syntactic predicate
> > to make an unlimited look ahead for the name.
> >
> > I wrote grammar, but I get an error. Please help me understand why.
> >
> > [12:48:09] Checking Grammar...
> > [12:48:09] warning(200): block.g:7:8: Decision can match input such as
> > "BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple
> > alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> > [12:48:09] warning(200): block.g:8:19: Decision can match input such
> > as "BLOCK_BOUNDARY LETTER ALPHANUM BLOCK_BOUNDARY" using multiple
> > alternatives: 1, 2
> > As a result, alternative(s) 2 were disabled for that input
> > [12:48:09] error(201): block.g:8:19: The following alternatives can
> > never be matched: 2
> >
> > This is the grammar that I use:
> >
> > grammar block;
> > @members{
> >   Stack<String> stack = new Stack<String>();
> > }
> >
> > data: blocks EOF;
> > blocks:block+;
> > block: block_open blocks? block_close;
> > block_open
> >     :  ((BLOCK_BOUNDARY
> > block_name)=>{stack.size()==0||!stack.peek().equals($block_name.text)}?)
> > =>
> >     BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
> > {stack.push($block_name.text);System.out.println("start of block " +
> > $block_name.text);};
> > block_close:BLOCK_BOUNDARY block_name BLOCK_BOUNDARY
> > {System.out.println("end of block " + $block_name.text);stack.pop();};
> > block_name:LETTER+ ALPHANUM*;
> >
> > LETTER    :     ('a'..'z'|'A'..'Z');
> > ALPHANUM    :    (LETTER|'0'..'9');
> > BLOCK_BOUNDARY     :     'block';
> > NEWLINE :     ( CR )? LF | CR;
> > fragment CR :'\r';
> > fragment LF : '\n';
> > INSIGNIFICANT_CHAR:.;
> >
> > It looks like antlr does not see that the block_open is guarder by
> > syntactic predicate, which is not always true (because
> > "stack.size()==0||!stack.peek().equals($block_name.text)" condition
> > can be false).
> >
> > Thanks for any advice,
> > Lukasz
>
>
> The problem is that semantic predicates like {...}? are only evaluated
> when there is an ambiguity ... and maybe ANTLR thinks the ambiguity is
> already resolved by your syntactic predicate. Actually, I am not sure
> about the structure you gave, namely, a syntactic predicate inside a
> syntactic predicate, maybe it also means "match the empty string after
> seeing BLOCK_BOUNDARY, I dont know.


The idea of mixing syntactic and semantic predicate comes from the example
posted on May 11, 2006 at http://www.antlr.org/blog/antlr3/lookahead.tml

*match alpha if not followed by beta:
a : ( alpha ((beta)=>{false}? | ) => alpha ;
 *


>
> I would suggest you try without the syntactic predicate, turn your
> semantic predicate into a gated semantic predicate {...}? => such that
> it is forced to be evaluated:
>
> block_open: {is_block_open()}?=> BLOCK_BOUNDARY


I cannot use a function without a parameter, because the decision bases on
the name of a block. So the function needs to be of a form:
*is_block_open(String
blockName)* function, and retrieving the parameter requires the parser to
make a look-ahead. That's what I need the syntactic predicate for.


>
>
> Btw, why do you have BLOCK_BOUNDARY twice in each rule? ACcording to
> your example, you would only need, one, right?
>

You're right. That's a typo. Thanks,
Lucas


>
> Best,
> Andreas Meyer
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe:
> http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090323/95f3fdc5/attachment.html 


More information about the antlr-interest mailing list