[antlr-interest] Lexer bug? (with test cases!)
Terence Parr
parrt at cs.usfca.edu
Wed Oct 24 02:57:44 PDT 2007
On Oct 24, 2007, at 12:13 PM, Loring Craymer wrote:
>>> lexer grammar test;
>>> NUMBER: ('0'..'9')+ ('.' ('0'..'9')+)?;
>>> OTHER: .;
>
> Take another look. The '.' in the posted grammar is the character
> '.', not a wildcard; there is no ambiguity, just an LL(2)
> decision. Unfortunately, the generated code makes an LL(1)
> decision and generates runtime errors as a result. This is not a
> backtracking problem; note the selected workaround--it avoids
> having an epsilon alternative, but depends on k>1.
Oh, sorry. You're talking about the (...)? subrule decision? Ah,
well, it's really the same issue. *Any* char can follow a token, so
the wildcard (the OTHER rule) can follow every decision. On seeing
'.', I can either enter the subrule and match the dot, or exit and
let the wildcard match it. That's ambiguous, so I resolve it at
LL(1). Lex does some backtracking to make this work more naturally.
ANTLR builds LL(*) recognizers, which are not tuned specifically for
building lexers the way lex is. Perhaps in the future there could be
a way for ANTLR to do this within the confines of LL(*).
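To make the difference concrete, here is a minimal hand-written sketch
in Python of the NUMBER rule with the optional-subrule decision made at
either one or two characters of lookahead. This is not ANTLR's
generated code; the function name match_number and the parameter k are
hypothetical, chosen only for illustration.

```python
DIGITS = "0123456789"


def match_number(text, pos, k):
    """Sketch of NUMBER: ('0'..'9')+ ('.' ('0'..'9')+)?

    k is the lookahead depth (1 or 2) used to decide whether to enter
    the optional ('.' digits) subrule. Returns the index just past the
    matched token, or raises ValueError on a mismatch.
    """
    start = pos
    while pos < len(text) and text[pos] in DIGITS:
        pos += 1
    if pos == start:
        raise ValueError(f"expected digit at index {pos}")

    # Peek at the next two characters without consuming them.
    la1 = text[pos] if pos < len(text) else ""
    la2 = text[pos + 1] if pos + 1 < len(text) else ""

    if k == 1:
        # LL(1): commit to the subrule as soon as a '.' is seen.
        enter = la1 == "."
    else:
        # LL(2): enter only if the '.' is followed by a digit.
        enter = la1 == "." and la2 != "" and la2 in DIGITS

    if enter:
        pos += 1  # consume '.'
        frac_start = pos
        while pos < len(text) and text[pos] in DIGITS:
            pos += 1
        if pos == frac_start:
            raise ValueError(f"expected digit after '.' at index {pos}")
    return pos
```

With input "12.x", the k=1 version commits on the '.' and then fails
because no digit follows, which is the kind of runtime error reported
above. The k=2 version stops after "12" and leaves the '.' for the
OTHER rule.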
Terence