[antlr-interest] Fixed length tokens again

Micheal J open.zone at virgin.net
Thu Jul 5 10:11:32 PDT 2007


Hi,

> On 5/7/2007, at 8:47, Stefan Wohlgemuth wrote:
> 
> > Is there a way to define tokens which have a fixed length? Say I
> > would like to define a lexer rule N3 which defines a token of three
> > and another N4 of four numeric characters.
> >
> > Something like this:
> >
> > test: N3 N4;
> >
> > N3:  Digit Digit Digit;         // This in combination with N4 does not work
> > N4:  Digit Digit Digit Digit;
> >
> > fragment
> > Digit: '0'..'9';
> 
> Those rules are ambiguous... how does ANTLR know what to match for  
> input like "012345678912"? It could match either four N3 tokens, or  
> three N4 tokens.
> 
> One way to address the ambiguity might be to use syntactic predicates:
> 
> tokens {
>    N3;
>    N4;
> }
> 
> N : (Digit Digit Digit Digit)=> Digit Digit Digit Digit { $type = N4; }
>    | Digit Digit Digit { $type = N3; }
>    ;

Or, avoiding syntactic predicates, match the three common digits first and
switch the token type only if a fourth digit follows:

tokens {
	N4;
}
 
N3	: Digit Digit Digit ( Digit { $type = N4; } )? 
	;
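
To see what token stream this rule produces, here is a rough Python
simulation of it. It is only a sketch, not ANTLR itself: the names lex and
is_digit are made up for illustration, and it assumes the optional fourth
digit is matched greedily, which is the default for an optional subrule.

```python
DIGITS = "0123456789"


def is_digit(text, i):
    """True if position i exists in text and holds an ASCII digit."""
    return i < len(text) and text[i] in DIGITS


def lex(text):
    """Mimic: N3 : Digit Digit Digit ( Digit { $type = N4; } )? ;"""
    tokens = []
    pos = 0
    while pos < len(text):
        # The rule cannot match at all without three digits in a row.
        if not all(is_digit(text, pos + k) for k in range(3)):
            raise ValueError(f"no token matches at offset {pos}")
        length, token_type = 3, "N3"
        # Greedy optional part: take a fourth digit whenever one is there.
        if is_digit(text, pos + 3):
            length, token_type = 4, "N4"
        tokens.append((token_type, text[pos:pos + length]))
        pos += length
    return tokens


if __name__ == "__main__":
    print(lex("012345678912"))
    print(lex("1234567"))
```

Note that seven digits in a row come out as N4 followed by N3, so the
parser rule "test: N3 N4;" would still not match that input. The lexer
always prefers four digits when they are available; it does not know what
the parser expects next.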


Micheal

-----------------------
The best way to contact me is via the list/forum. My time is very limited.


