[antlr-interest] [newbie] Lexer Confusion
Johannes Luber
jaluber at gmx.de
Fri Jul 4 14:26:50 PDT 2008
UW Student wrote:
> Hello,
>
> I'm having some trouble understanding the behaviour of Antlr's lexer. I
> am quite new to Antlr (having previously focussed on JFlex) so please
> excuse me if this is a naive question.
>
> My grammar is as follows
>
> grammar Test;
>
> nonTerm : TERM1 TERM2;
>
> TERM1 : '..'+;
> TERM2 : '.';
>
> However, when I try to recognize the string '...' (without the quotes),
> AntlrWorks indicates a MismatchedTokenException. (Looking at the
> generated code, I believe this is because TERM1 is consuming the third
> DOT and then failing to find a fourth.) I do not understand why this is
> happening.
>
> The above example is a toy language that I created to try to isolate the
> problem I was having. My actual lexer looks more like this:
>
> TERM1 : (' ' | '...')+
> TERM2 : '.'
>
> And I would like ' .' to be lexed as [TERM1, TERM2].
>
> Any suggestions would be greatly appreciated.
>
> Thanks,
> Andrew
>
Once ANTLR's lexer has decided to match TERM1, it does not fall back to
TERM2 if TERM1 then fails to match. This is a limitation of the analysis
algorithm. To get the result you want, try something like:

grammar Test2;

tokens {
    TERM2;
}

nonTerm : TERM1 TERM2;

TERM1 : '.' ( ('.')=> '.' {$type = TERM2;} ) ;

Johannes