[antlr-interest] Repeatedly parsing number literals
Gavin Lambert
antlr at mirality.co.nz
Sat Mar 28 22:36:31 PDT 2009
At 15:44 29/03/2009, Rick Mann wrote:
>DecimalLiteral
> : '0'..'9' '0'..'9'* { $value = };
>
>FloatingPointLiteral
> : ('0'..'9')+ '.' ('0'..'9')* Exponent?
> | ('0'..'9')+ Exponent
> | ('0'..'9')+
> ;
Note that these rules are lexically ambiguous -- the final alt of
FloatingPointLiteral is indistinguishable from DecimalLiteral, and
all of the alternatives share a common left prefix (a run of
digits). Input such as "123" matches both rules equally well, so
this is going to get you into trouble.

You should merge these two rules into a single lexer rule and
left-factor the common prefix away.
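
As a rough, untested sketch of what that single rule might look
like in ANTLR 3 syntax: the shared run of digits is matched once,
and the token type is changed only when a fraction or an exponent
follows. The Exponent rule was not included in the quoted grammar,
so its definition here is an assumption; the empty fragment rule is
likewise an assumption, used only to declare the
FloatingPointLiteral token type.

```
// Sketch only; ANTLR 3 syntax assumed.
DecimalLiteral
    : ('0'..'9')+
      ( '.' ('0'..'9')* Exponent? { $type = FloatingPointLiteral; }
      | Exponent                  { $type = FloatingPointLiteral; }
      )?
    ;

// Declares the token type; never matched on its own.
fragment FloatingPointLiteral : ;

// Assumed definition; the original Exponent rule was not quoted.
fragment Exponent
    : ('e'|'E') ('+'|'-')? ('0'..'9')+
    ;
```

With this shape, plain digits come out as DecimalLiteral and
anything with a '.' or an exponent comes out as
FloatingPointLiteral, without the lexer ever having to choose
between two rules that start the same way.
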
>And a number of parser rules that refer to them. Do I need
>to write actions like this:
>
>$value = Integer.parseInt($DecimalLiteral.text);
Yes. The only thing a lexer rule returns is the token itself.

Having said that, you *can* add custom data to a token (exactly
how you do that depends on your target language; Java requires
subclassing the token, for example), so it's not completely
impossible to deal with it at lexing time; but it's usually not
worth the hassle.
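
For the parser-side conversion, a minimal sketch (ANTLR 3 syntax,
Java target) could look like the following. The rule name
integerValue and its return declaration are hypothetical; only the
action body comes from the question. Other parser rules can then
refer to integerValue and read its value, rather than repeating the
conversion at every use of the token.

```
// Hypothetical parser rule wrapping the conversion in one place.
integerValue returns [int value]
    : DecimalLiteral
      { $value = Integer.parseInt($DecimalLiteral.text); }
    ;
```

Note that Integer.parseInt throws NumberFormatException when the
digits do not fit in an int, so very long literals would need
separate handling.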