[antlr-interest] How to distinguish between integer and binary number?

Sat Dec 13 15:15:17 PST 2008

At 09:31 14/12/2008, Johannes Luber wrote:
 >Mario Prada schrieb:
 >> I need to distinguish between integer and binary numbers in
 >> my grammar.
[...]
 >You actually can't do that in the lexer. ANTLR does all of the
 >lexing up front, before the parser even sees the first token. Thus
 >all numbers fitting the binary pattern are BINARIOs, even if
 >they are supposed to be normal numbers. The solution is to make
 >one lexer rule for all kinds of numbers and to check only in
 >the parser whether a number contains only zeros and ones.
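
For reference, a minimal sketch of that single-rule approach, assuming
ANTLR 3 syntax and the Java target. The names NUMBER and binario are
hypothetical, and a validating predicate is only one way to do the check:

```antlr
// Lexer: one rule covers every digit string, so the lexer never has
// to guess between decimal and binary.
NUMBER : ('0'..'9')+ ;

// Parser: reference this rule only where the language requires a
// binary number. The validating semantic predicate fails the match
// if the text contains any digit other than 0 or 1.
binario
    : n=NUMBER { $n.text.matches("[01]+") }?
    ;
```

Contexts that accept any number would simply reference NUMBER directly.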

Another possibility is to replace all usages of "INT" in the 
parser with "integer", then define a new rule like so:

integer : INT | BINARIO;

This way the lexer will still generate distinct INT and BINARIO 
tokens, but if the parser context is expecting an integer, it will 
accept both.  (If you're building an AST, you will probably also 
want to add a rewrite to convert the BINARIO token to an INT token 
for the AST.)
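
A sketch of such a rewrite, assuming ANTLR 3 with output=AST set in the
grammar options. INT[$BINARIO] should create a node of type INT that
copies the text and position of the matched BINARIO token:

```antlr
integer
    : INT                        // passes through unchanged
    | BINARIO -> INT[$BINARIO]   // retyped as INT in the AST
    ;
```

Note that the node text is still the original string of binary digits,
so converting it to a value (if you need that) has to happen separately.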

As Sohail said, though, if you have the luxury of (re)defining the
language then you should consider adding some lexically obvious
prefix or suffix to distinguish binary constants from decimal
constants.  That should help to remove some of the confusion, both
for ANTLR and potentially for anyone reading the input file.
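
As an illustration only, here is what that might look like in ANTLR 3
syntax with a "0b" prefix. The prefix is an assumption borrowed from
other languages, and the INT rule shown is a guess at the existing one:

```antlr
// Binary constants carry an explicit prefix, e.g. 0b1011.
BINARIO : '0' ('b' | 'B') ('0' | '1')+ ;

// Plain digit strings are always decimal, e.g. 1011.
INT     : ('0'..'9')+ ;
```

With this, an input like 1011 can only be an INT and 0b1011 can only be
a BINARIO, so the lexer can decide without any parser context.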