[antlr-interest] Repeatedly parsing number literals
Gavin Lambert
antlr at mirality.co.nz
Sat Mar 28 22:53:29 PDT 2009
At 18:43 29/03/2009, Rick Mann wrote:
>Well, you would think that this is true, but it turns out not to
>be. I lifted those rules from Terence's Java grammar. Sure
>enough, it works as expected, to the degree that if a parser
>calls for a float literal, and I give it a literal that would
>match DecimalLiteral, it complains.
The parser has no influence over the lexer, so what the parser
calls for is irrelevant -- for a given bit of input text you will
always get the same token regardless of parser context.
In theory, if you throw the input "1234" at the rules you posted,
it should always end up being a DecimalLiteral (and you should
have gotten an "unreachable alternative" warning for
FloatLiteral). This is because when two rules can be statically
determined to match the same input then ANTLR will normally pick
the rule that was listed first. If you give it input such as
"1234.5" then you'll get a FloatLiteral; if you give it
"1234.abcd" or "1234..1238" then you'll get a runtime error within
the bowels of FloatLiteral, even if you have dot or double-dot
tokens available.
Performance will definitely be improved if you combine and
left-factor, though -- that'll reduce it to k=1 decisions, if done
right.
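
As a rough illustration of what "combine and left-factor" means, here is a hand-written sketch in plain Python (not an ANTLR grammar, and not what ANTLR would generate). It scans the shared run of digits once and only then decides between the two literal types. The function name tokenize and the token names Dot, DotDot and Identifier are hypothetical, made up for this example; only DecimalLiteral and FloatLiteral come from the discussion above. It also assumes a float needs at least one digit after the dot, which may not match the rules that were posted. Note that the dot check peeks one character past the dot, so that one decision uses two characters of lookahead.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (token type, matched text)


def is_digit(ch: str) -> bool:
    """ASCII digits only, to keep the sketch predictable."""
    return "0" <= ch <= "9"


def tokenize(text: str) -> List[Token]:
    """Sketch of a combined, left-factored number rule.

    The tokenizer sees only the input text; nothing about what a
    parser might expect next is available to it.
    """
    tokens: List[Token] = []
    i, n = 0, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif is_digit(ch):
            # Shared prefix: scan the integer digits exactly once.
            start = i
            while i < n and is_digit(text[i]):
                i += 1
            kind = "DecimalLiteral"
            # Commit to a float only when the dot is followed by a digit
            # (an assumption of this sketch), so that "1234..1238" and
            # "1234.abcd" leave the dot for a later token.
            if i + 1 < n and text[i] == "." and is_digit(text[i + 1]):
                i += 1
                while i < n and is_digit(text[i]):
                    i += 1
                kind = "FloatLiteral"
            tokens.append((kind, text[start:i]))
        elif text.startswith("..", i):
            tokens.append(("DotDot", ".."))
            i += 2
        elif ch == ".":
            tokens.append(("Dot", "."))
            i += 1
        elif ch.isalpha() or ch == "_":
            start = i
            while i < n and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(("Identifier", text[start:i]))
        else:
            raise ValueError(f"unexpected character {ch!r} at offset {i}")
    return tokens
```

With this shape, "1234" comes out as a single DecimalLiteral, "1234.5" as a single FloatLiteral, and "1234..1238" as DecimalLiteral, DotDot, DecimalLiteral rather than an error partway through a float.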