[antlr-interest] Problem with lexer rule for an optional suffix

John B. Brodie jbb at acm.org
Sat Nov 14 08:32:56 PST 2009


Greetings!
On Sat, 2009-11-14 at 09:08 +0000, Scott Oakes wrote:
> Hoping for some newbie help on the following lexer.
> 
>   fragment DIGIT:      '0'..'9';
>   fragment LETTER: ('a'..'z'|'A'..'Z');
> 
>   ID:  (LETTER | '.')+ ('.' DIGIT+)?
>        | DIGIT+
>       ;
> 
> The idea is that ID is things like: "foo", "32", "bar.baz", or
> "foo.bar.32". However with input "foo.bar.32", I get two tokens,
> "foo.bar." and "32". How could I rewrite this so I get a single ID
> token, "foo.bar.32"?

The following almost works (tested with your samples):

ID : LETTER+ ( '.' LETTER+ )* ('.' DIGIT+)?
     | DIGIT+
     ;
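
As a quick sanity check outside ANTLR, the set of strings this rule accepts can be approximated with a regular expression. This is only a rough sketch in Python: the names ID_STRICT and matches_strict_id are made up for illustration, and re.fullmatch tests whole-string membership rather than reproducing how the ANTLR lexer makes its prediction decisions.

```python
import re

# Hypothetical regex transcription (an assumption, not ANTLR output) of:
#   ID : LETTER+ ( '.' LETTER+ )* ('.' DIGIT+)? | DIGIT+ ;
ID_STRICT = re.compile(r"[A-Za-z]+(?:\.[A-Za-z]+)*(?:\.[0-9]+)?|[0-9]+")

def matches_strict_id(text):
    """Return True if the whole string fits the strict ID pattern."""
    return ID_STRICT.fullmatch(text) is not None

# The four samples from the question, plus the dot-heavy inputs.
for sample in ["foo", "32", "bar.baz", "foo.bar.32", ".", "..32", "car..cod"]:
    print(repr(sample), matches_strict_id(sample))
```

Under this transcription the four samples from the question are accepted, while the inputs with a leading or doubled '.' are rejected.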

This won't match things like ".", "..32", "car..cod", or "...",
which your original rule had the potential to recognize. Did you mean for
those to be valid? If so, maybe:

ID : LETTER* ( '.' LETTER* )+ DIGIT*
     | DIGIT+
     ;
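
The same kind of membership check can be sketched for this looser rule. Again this is a Python approximation under assumptions: ID_LOOSE and matches_loose_id are hypothetical names, and the regex only models which whole strings fit the rule, not the ANTLR lexer's actual tokenization. Two things stand out in the transcription: the '(...)+' loop needs at least one '.', so a bare "foo" does not fit this alternative, and because DIGIT* is not preceded by a required '.', an input such as "foo.bar32" does fit.

```python
import re

# Hypothetical regex transcription (an assumption, not ANTLR output) of:
#   ID : LETTER* ( '.' LETTER* )+ DIGIT* | DIGIT+ ;
ID_LOOSE = re.compile(r"[A-Za-z]*(?:\.[A-Za-z]*)+[0-9]*|[0-9]+")

def matches_loose_id(text):
    """Return True if the whole string fits the loose ID pattern."""
    return ID_LOOSE.fullmatch(text) is not None

# Dot-heavy inputs from the reply, the question's samples, and two edge cases.
samples = [".", "..32", "car..cod", "...", "foo.bar.32", "bar.baz", "32",
           "foo", "foo.bar32"]
for sample in samples:
    print(repr(sample), matches_loose_id(sample))
```

If plain identifiers like "foo" must still be a single ID, the rule would need a further alternative or a different loop shape; which one is appropriate depends on what the grammar is meant to accept.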

Hope this helps
   -jbb
