[antlr-interest] Number tokenizer vs. number grammar
Gavin Lambert
antlr at mirality.co.nz
Sat Nov 15 22:30:35 PST 2008
At 09:50 16/11/2008, Todd O'Bryan wrote:
>Assume that both 2 * 3+2i and 2*3+2i should lex as NUMBER OP
>NUMBER. What does that determine about my possible approaches?
:-)
It implies that you're going to experience pain with "2+3+2i" (or
"2/3+2i", for that matter, given that you've already said that
this ought to be a single NUMBER). :)
If you can require that whitespace is significant (i.e. "2 / 3+2i"
is two NUMBERs and a division, but "2/3+2i" is a single NUMBER,
and "2 /3+2i" is simply illegal), then probably the simplest way
to deal with this (and avoid duplication) is to define NUMBER as
any sequence that starts with a digit (optionally preceded by a
'-' and/or a '.') and continues with any combination of digits
and operators, with no whitespace:
fragment DIGIT : '0'..'9';
NUMBER : '-'? '.'? DIGIT (DIGIT | '+' | '-' | '/' | '.' | 'i')* ;
This will of course be able to match invalid constructs as well,
but you can deal with that at the parser / tree parser / driver
code level (which permits better error messages anyway).
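As a rough sketch of what that driver-level check could look like, here is
a small Python model of the idea. Everything in it is an assumption made
for illustration: the regex is a hand transliteration of the NUMBER rule
above (ANTLR's generated lexer predicts rather than backtracks, so edge
cases may differ), and the "valid" shape (integer, decimal or fraction,
with an optional signed imaginary part) is only a guess at what you want
to accept. The names lex and check_number are hypothetical.

```python
import re

# Hand transliteration of the NUMBER rule into a regex (an assumption,
# not something ANTLR generates).
NUMBER = r"-?\.?[0-9][0-9+\-/.i]*"
TOKEN = re.compile(rf"\s*(?:(?P<NUMBER>{NUMBER})|(?P<OP>[-+*/]))")

# Hypothetical stricter shape applied after lexing: integer, decimal or
# fraction, optionally followed by a signed imaginary part.
REAL = r"(?:\d+/\d+|\d*\.\d+|\d+)"
VALID = re.compile(rf"-?{REAL}(?:[+-]{REAL}i)?|-?{REAL}i")


def lex(source):
    """Split source into (type, text) pairs, skipping whitespace."""
    tokens, pos = [], 0
    source = source.rstrip()
    while pos < len(source):
        match = TOKEN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected input at offset {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def check_number(text):
    """Return None if text is a well-formed number, else an error message."""
    if VALID.fullmatch(text):
        return None
    # More specific complaints first, then a generic fallback.
    if "i" in text[:-1]:
        return f"{text!r}: 'i' may only appear as the final character"
    if len(re.findall(r"[+-]", text[1:])) > 1:
        return f"{text!r}: expected at most one real and one imaginary part"
    return f"{text!r}: malformed number"


if __name__ == "__main__":
    for source in ("2 / 3+2i", "2/3+2i", "2+3+2i"):
        for kind, text in lex(source):
            problem = check_number(text) if kind == "NUMBER" else None
            print(kind, text, "ok" if problem is None else problem)
```

With whitespace, "2 / 3+2i" comes out as NUMBER OP NUMBER; without it,
"2/3+2i" is one NUMBER that passes the check, while "2+3+2i" is also one
NUMBER but gets rejected afterwards with a specific message rather than a
generic lexer error.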