[antlr-interest] Repeatedly parsing number literals
    Gavin Lambert (antlr at mirality.co.nz)
    Sat Mar 28 22:36:31 PDT 2009

At 15:44 29/03/2009, Rick Mann wrote:
 >DecimalLiteral
 >	: '0'..'9' '0'..'9'* { $value = };
 >
 >FloatingPointLiteral
 >	:	('0'..'9')+ '.' ('0'..'9')* Exponent?
 >	|	('0'..'9')+ Exponent
 >	|	('0'..'9')+
 >	;
Note that these rules are lexically ambiguous: the final alt of 
FloatingPointLiteral is indistinguishable from DecimalLiteral, and 
all of the alternatives share a common left prefix.  For input such 
as 123 both rules match, so the lexer has no principled way to pick 
between the two token types, and that is going to get you into 
trouble.
You should rewrite these two rules into a single lexer rule and 
left-factor the common prefix away.
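
As a rough sketch of what that left-factoring could look like in 
ANTLR 3 (untested): the rule name Number and the body of Exponent 
are my own assumptions, since the original post does not show 
Exponent, and I have chosen to treat a bare run of digits as a 
DecimalLiteral.  The digits are matched once, and the token type is 
decided by whatever follows them:

    // Token types set from the combined rule below; the parser
    // rules can keep referring to them by these names.
    tokens { DecimalLiteral; FloatingPointLiteral; }

    // Hypothetical combined rule: shared digit prefix matched once.
    Number
        :   ('0'..'9')+
            (   '.' ('0'..'9')* Exponent?  { $type = FloatingPointLiteral; }
            |   Exponent                   { $type = FloatingPointLiteral; }
            |                              { $type = DecimalLiteral; }
            )
        ;

    // Assumed definition; adjust to whatever your grammar uses.
    fragment
    Exponent
        :   ('e'|'E') ('+'|'-')? ('0'..'9')+
        ;

With something along these lines, 123 should come out as a 
DecimalLiteral, and 1.5 or 2e10 as a FloatingPointLiteral.
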
 >And a number of parser rules that refer to them. Do I need
 >to write actions like this:
 >
 >$value = Integer.parseInt($DecimalLiteral.text);
Yes.  The only return from a lexer rule is the token.
Having said that, you *can* add custom data to a token (exactly 
how you do that depends on your target language; Java requires 
subclassing the token, for example), so converting the value at 
lexing time is possible; but it's usually not worth the hassle.
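
If the concern is repeating that conversion in every parser rule, 
one way to keep it in a single place is to wrap the token in a small 
parser rule that returns the value, and refer to that rule instead 
of the token.  This is only a sketch for the Java target; the rule 
names integerValue and arrayDimension are made up for illustration:

    // Hypothetical wrapper: the only place the text is converted.
    integerValue returns [int value]
        :   DecimalLiteral
            { $value = Integer.parseInt($DecimalLiteral.text); }
        ;

    // Hypothetical caller: uses the returned value, no parsing here.
    arrayDimension returns [int size]
        :   '[' n=integerValue ']'  { $size = $n.value; }
        ;

Integer.parseInt throws NumberFormatException for values that do 
not fit in an int, so you may want to handle that in the wrapper 
rule as well.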