[antlr-interest] Fixed length tokens again

Wincent Colaiuta win at wincent.com
Thu Jul 5 00:33:01 PDT 2007


On 5 July 2007, at 8:47, Stefan Wohlgemuth wrote:

> Is there a way to define tokens which have a fixed length?
> Say I would like to define a lexer rule N3 which defines a token of
> three and another N4 of four numeric characters.
>
> Something like this:
>
> test: N3 N4;
>
> N3:  Digit Digit Digit;        // This in combination with N4 does not work
> N4:  Digit Digit Digit Digit;
>
> fragment
> Digit: '0'..'9';

Those rules are ambiguous: how does ANTLR know what to match for
input like "012345678912"? It could match either four N3 tokens or
three N4 tokens.
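
To make the ambiguity concrete, here is a rough Python sketch (not ANTLR code; the helper name segmentations is made up for illustration) that lists every way to cut a digit string into chunks of three or four characters. It assumes the input holds only digits and does not check for that.

```python
def segmentations(text, sizes=(3, 4)):
    """Return every way to split text into chunks whose lengths are in sizes."""
    if not text:
        return [[]]  # one way to split the empty string: no chunks at all
    results = []
    for size in sizes:
        if len(text) >= size:
            head = text[:size]
            # Keep this chunk only if the remainder can also be split fully.
            for rest in segmentations(text[size:], sizes):
                results.append([head] + rest)
    return results


splits = segmentations("012345678912")
for split in splits:
    print(split)
```

For the twelve-digit input it finds exactly the two readings mentioned above, and nothing in the original N3 and N4 rules says which one should win.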

One way to address the ambiguity might be to use syntactic predicates:

tokens {
   N3;
   N4;
}

N : (Digit Digit Digit Digit)=> Digit Digit Digit Digit { $type = N4; }
   | Digit Digit Digit { $type = N3; }
   ;
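
If it helps to see what that rule should do, here is a hand-written Python sketch of the intended behaviour. It is only an approximation of the idea (try four digits first, fall back to three), not what ANTLR generates; the names tokenize and digits_at are hypothetical.

```python
DIGITS = "0123456789"


def digits_at(text, pos, count):
    """True if text has exactly `count` ASCII digits starting at pos."""
    chunk = text[pos:pos + count]
    return len(chunk) == count and all(ch in DIGITS for ch in chunk)


def tokenize(text):
    """Split text into (type, value) pairs, preferring N4 over N3."""
    tokens = []
    pos = 0
    while pos < len(text):
        if digits_at(text, pos, 4):      # plays the role of the predicate
            tokens.append(("N4", text[pos:pos + 4]))
            pos += 4
        elif digits_at(text, pos, 3):    # fallback alternative
            tokens.append(("N3", text[pos:pos + 3]))
            pos += 3
        else:
            raise ValueError("no token matches at position %d" % pos)
    return tokens


print(tokenize("012345678912"))
print(tokenize("0123456"))
```

Note that in this sketch seven digits come out as N4 followed by N3, so the four-digit alternative always wins whenever four digits are available.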

Cheers,
Wincent


