[antlr-interest] how to parse a fraction

Gavin Lambert antlr at mirality.co.nz
Fri Sep 26 16:21:45 PDT 2008


At 06:25 27/09/2008, Sven Prevrhal wrote:
 >This above works. However, this below where I replaced the word
 >parser rule with a WORD lexer token
 >
 >integer	:	INT;
 >fraction:	INT SLASH INT;
 >float	:	INT DOT INT;
 >
 >WORD	:	~('\r' | '\n' | ' ' | '\t')+ ;
 >INT	:	'0'..'9'+;
 >SLASH 	:	'/';
 >DOT	:	'.';
 >WS 	:	(' ' | '\t')+ {$channel = HIDDEN;} ;
 >
 >does not work for fraction like 2/3 but it does work for 2 / 3
 >(with spaces). Is it that when I define a word through the lexer
 >it precedes the fraction definition and grabs 2/3 as a WORD?

Right.  When multiple lexer rules can possibly match a given 
input, ANTLR will usually choose the one that seems to match more 
of it -- and failing that, will match the first rule listed.

The quick fix would be to include '/' as one of the characters 
that can't appear within a WORD; you should probably also list the 
WORD rule last, after the more specific tokens, just in case.
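
To see why excluding '/' helps, here is a rough Python model of 
"longest match wins, ties go to the rule listed first". It is only 
a sketch of that rule of thumb, not ANTLR's actual prediction 
algorithm; the names tokenize, make_rules, WORD_ORIGINAL and 
WORD_FIXED are made up for the illustration, and it assumes WORD 
is already listed last:

```python
import re

# Hypothetical regex equivalents of the two WORD definitions.
WORD_ORIGINAL = r"[^\r\n \t]+"    # WORD : ~('\r' | '\n' | ' ' | '\t')+
WORD_FIXED = r"[^\r\n \t/]+"      # same, but '/' is excluded as well


def make_rules(word_pattern):
    """Token rules in priority order, with WORD listed last."""
    return [
        ("INT", r"[0-9]+"),
        ("SLASH", r"/"),
        ("DOT", r"\."),
        ("WS", r"[ \t]+"),
        ("WORD", word_pattern),
    ]


def tokenize(text, rules):
    """Simplified lexer: the longest match wins, ties go to the earlier rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater, so an earlier rule keeps a tie.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        if best_name != "WS":  # whitespace is hidden from the parser
            tokens.append((best_name, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("2/3", make_rules(WORD_ORIGINAL)))
print(tokenize("2 / 3", make_rules(WORD_ORIGINAL)))
print(tokenize("2/3", make_rules(WORD_FIXED)))
```

In this model the original WORD swallows "2/3" whole, because it 
matches three characters where INT matches only one. With spaces, 
or with '/' excluded from WORD, INT and WORD tie on "2" and the 
earlier rule (INT) wins, so the parser sees INT SLASH INT.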

 >Second question:
 >
 >If I define
 >
 >amount	:	integer
 >		| fraction
 >		| float
 >		;
 >
 >with the first code block, rule 'amount' works for fractions and
 >floats but not for integers (no viable alt exception)! What is
 >happening?

You have a common left prefix (the first token of all three alts 
is INT); this sort of thing can sometimes confuse ANTLR -- 
although it usually copes better when the common prefix is in the 
parser rather than in the lexer.

Try rewriting your rule like this, first of all, to make it test 
the least-consuming alternative last:

amount : fraction
        | float
        | integer
        ;

If that doesn't work, then try adding syntactic predicates to 
force it to use explicit lookahead:

amount : (fraction) => fraction
        | (float) => float
        | integer
        ;
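
Roughly speaking, the predicated form makes the parser try each 
alternative speculatively, in order, and commit to the first one 
that matches. A small Python sketch of that idea follows; it works 
on token type names only, predict_amount and ALTERNATIVES are 
hypothetical names, and it is not how ANTLR generates its decision 
code:

```python
# Alternatives of 'amount' in the order they are listed in the rule,
# each with the token types it expects.
ALTERNATIVES = [
    ("fraction", ["INT", "SLASH", "INT"]),
    ("float", ["INT", "DOT", "INT"]),
    ("integer", ["INT"]),
]


def predict_amount(token_types):
    """Return the first alternative whose tokens are a prefix of the input."""
    for name, expected in ALTERNATIVES:
        if token_types[:len(expected)] == expected:
            return name
    raise ValueError("no viable alternative")


print(predict_amount(["INT", "SLASH", "INT"]))
print(predict_amount(["INT", "DOT", "INT"]))
print(predict_amount(["INT"]))
```

The order matters in this model: if integer were tried first, its 
single INT would match the start of every input and the other two 
alternatives would never be chosen, which is why the 
least-consuming alternative goes last.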


