[antlr-interest] Very simple grammar confusing me

Wed Nov 10 02:01:20 PST 2010

Oliver Zeigermann wrote:
> Folks!
> 
> This is my grammar
> 
> ------------------
> SHRASS : '>>=' ;
> SEMI  : ';' ;
> GT : '>';
> 
> rule : (GT | SEMI | SHRASS)+ ;
> ------------------
> 
> I thought it should parse
> 
> >>;
> 
> into a token stream of
> 
> GT GT SEMI
> 
> but as I see both at runtime and in the mToken method, it tries
> to match the above input using SHRASS, which of course fails.
> 
> Any hints what I could do to work around that?

John gave you the hint...

To reduce the probability that this happens again: your basic
problem above is that the token SHRASS has a prefix, namely '>>',
that is not covered by any other token rule.

For ANTLR lexers, *all* possible prefixes of any token must be
matched by another token. Otherwise the lexer commits to the longer
token as soon as it has seen the prefix and fails when the rest
does not follow, as ANTLR lexers cannot backtrack.

The standard solution is to refactor the token grammar. If that
gets too difficult or unmaintainable, one can combine another lexer
generator with ANTLR stream and tree parser capabilities.
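
As a minimal sketch of such a refactoring for the grammar above
(untested here; the token name SHR is made up for illustration and
is not part of the original grammar), one could add a rule that
covers the prefix '>>':

------------------
SHRASS : '>>=' ;
SHR    : '>>' ;   // hypothetical token; covers the prefix '>>' of SHRASS
SEMI   : ';' ;
GT     : '>' ;

rule : (GT | SHR | SEMI | SHRASS)+ ;
------------------

With this change the input >>; should come out as SHR SEMI, not as
GT GT SEMI. Any parser rule that needs two separate '>' would then
have to accept SHR in that place as well.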

HTH,
	Joachim

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Joachim Schrod				Email: jschrod at acm.org
Roedermark, Germany