[antlr-interest] a simple (not for me :)) grammar problem
Gavin Lambert
antlr at mirality.co.nz
Sun Jan 6 23:57:43 PST 2008
At 16:26 7/01/2008, Mark Volkmann wrote:
>It should be easy, right? Ter already gave the hint that the
>problem is that it was greedily grabbing the DOT for FLOAT
>instead of leaving it for the separator between the number
>and the identifier. Piece of cake? Well I've tried several
>things I thought would work to no avail.
>Why in the world doesn't this work?
[...]
> backtrack = true; // I shouldn't need this, but I don't think it can hurt.
It's not going to help, either. "backtrack = true" has no effect
on the lexer.
>FLOAT: NUMBER DOT NUMBER;
>INTEGER: NUMBER;
>IDENTIFIER: LETTER+;
>DOT: '.';
>fragment NUMBER: DIGIT+;
>fragment LETTER: 'a' .. 'z';
>fragment DIGIT: '0' .. '9';
This has been discussed to death before. For reasons of
performance (and some other obscure thing, I think), when
processing a + loop ANTLR will use k=1 lookahead. Thus when faced
with the choice between FLOAT and INTEGER, it looks ahead to see
at least one DIGIT and then says "ok, that's a FLOAT". It doesn't
look past all the DIGITs to see whether there's a DOT or
not. (Ter has said he might look into improving this a bit in a
later version.)
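To make that concrete, here is a toy model in Python. It is only a sketch of the behaviour described above, not ANTLR's actual prediction code, and the function names (is_digit, predict_k1, predict_scanning) are made up for the illustration. The first predictor commits after a single character; the second scans past the whole run of digits before deciding.

```python
def is_digit(c):
    """DIGIT: '0' .. '9' from the grammar above."""
    return "0" <= c <= "9"


def predict_k1(text):
    """Toy model: pick a token type after looking at one character.

    Assumes text is non-empty. FLOAT is listed before INTEGER in the
    grammar, so this model picks FLOAT as soon as it sees a digit.
    """
    c = text[0]
    if is_digit(c):
        return "FLOAT"
    if "a" <= c <= "z":
        return "IDENTIFIER"
    if c == ".":
        return "DOT"
    raise ValueError("no token starts with %r" % c)


def predict_scanning(text):
    """Toy model: skip the whole digit run, then look for DOT DIGIT."""
    if not is_digit(text[0]):
        return predict_k1(text)
    i = 0
    while i < len(text) and is_digit(text[i]):
        i += 1
    has_fraction = (
        i + 1 < len(text) and text[i] == "." and is_digit(text[i + 1])
    )
    return "FLOAT" if has_fraction else "INTEGER"
```

For the input "42" the one-character model answers FLOAT and then has no DOT to match, while the scanning model answers INTEGER.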
Whenever there's a common prefix in your tokens, you will need to
combine the rules to remove the ambiguity:
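INTEGER
    : NUMBER
      ( /* nothing afterwards */
      | DOT NUMBER { $type = FLOAT; }
      )
    ;
// FLOAT no longer has a rule that matches on its own; keep the
// token type defined so the action above can refer to it.
fragment FLOAT: ;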
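The same idea, written as a hand-rolled tokenizer in Python, may make the control flow easier to see. This is a minimal sketch rather than what ANTLR generates: the tokenize function and the tuple output format are invented for the example, and the check that a digit follows the DOT before committing to FLOAT is an extra assumption that the rule as posted does not spell out.

```python
def is_digit(c):
    """DIGIT: '0' .. '9'."""
    return "0" <= c <= "9"


def is_letter(c):
    """LETTER: 'a' .. 'z'."""
    return "a" <= c <= "z"


def tokenize(text):
    """Return (type, lexeme) pairs using one merged number rule."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        start = i
        if is_digit(c):
            # Common prefix: NUMBER
            while i < n and is_digit(text[i]):
                i += 1
            kind = "INTEGER"
            # Optional tail: DOT NUMBER, which changes the type to FLOAT.
            # Assumption: only consume the DOT when a digit follows it.
            if i + 1 < n and text[i] == "." and is_digit(text[i + 1]):
                i += 1
                while i < n and is_digit(text[i]):
                    i += 1
                kind = "FLOAT"
            tokens.append((kind, text[start:i]))
        elif is_letter(c):
            while i < n and is_letter(text[i]):
                i += 1
            tokens.append(("IDENTIFIER", text[start:i]))
        elif c == ".":
            i += 1
            tokens.append(("DOT", c))
        else:
            raise ValueError("unexpected character %r at %d" % (c, i))
    return tokens
```

With this sketch, "12.5" comes out as a single FLOAT, while "12.abc" comes out as INTEGER, DOT, IDENTIFIER, which is the split the original question was after.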