[antlr-interest] detecting illegal input chars in the lexer

Jim Idle jimi at temporal-wave.com
Sat May 8 20:17:11 PDT 2010


The lexer should not enter an infinite loop, so your rules are probably broken somewhere (but maybe not). Check for rules that can match nothing at all, for example:  T : 'a'* ;
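To see why a rule that can match nothing is dangerous, here is a minimal sketch in Python (not ANTLR itself, and not how ANTLR is implemented internally). The rule table, the `tokenize` helper and the `max_steps` budget are all hypothetical names made up for the illustration. The point is only that a token rule which succeeds on zero characters lets the token loop "succeed" forever without consuming input.

```python
import re

# Hypothetical rule table: T mirrors  T : 'a'* ;  and can match the empty string.
RULES = [("T", re.compile(r"a*")), ("B", re.compile(r"b+"))]

def tokenize(text, max_steps=20):
    """Naive first-match token loop; returns (tokens, stalled)."""
    pos, tokens = 0, []
    for _ in range(max_steps):
        if pos >= len(text):
            return tokens, False          # reached end of input normally
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:                         # a* always matches, even with length 0
                tokens.append((name, m.group()))
                pos = m.end()             # no progress when the match is empty
                break
    return tokens, True                   # step budget used up: the loop stalled

tokens, stalled = tokenize("aab")
print(stalled, tokens[:3])
```

On the input "aab" the first token consumes "aa", and every later step emits an empty T token at the same position, so the 'b' is never reached. Without the step budget the loop would never end.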

However, your lexer should usually never fall into ANTLR's error trapping; program for all eventualities if you can. The final catch-all is a last rule in the file:

ANY : . { /* Call your error handler and talk about illegal characters */ $channel = HIDDEN; } ;
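The catch-all idea can be sketched outside ANTLR as well. The following is a rough Python illustration under stated assumptions: the token names INT, OP and WS, the `tokenize` helper and the error message format are invented for the example and are not part of any ANTLR API. ANY is listed last, so it only fires when no earlier rule matches; it reports the character and keeps it away from the parser.

```python
import re

# Hypothetical token rules; ANY comes last so it is only a fallback.
RULES = [
    ("INT", re.compile(r"[0-9]+")),
    ("OP", re.compile(r"[-+*/]")),
    ("WS", re.compile(r"[ \t\r\n]+")),
    ("ANY", re.compile(r".", re.DOTALL)),   # matches exactly one character
]

def tokenize(text):
    """Return (tokens, errors); illegal characters are reported, not emitted."""
    pos, tokens, errors = 0, [], []
    while pos < len(text):
        for name, pattern in RULES:
            m = pattern.match(text, pos)
            if m:
                break                       # ANY guarantees some rule matches
        if name == "ANY":
            ch = m.group()
            errors.append(f"illegal character {ch!r} (code {ord(ch)}) at offset {pos}")
        elif name != "WS":
            tokens.append((name, m.group()))
        pos = m.end()                       # every rule consumes at least one char
    return tokens, errors

# chr(150) stands in for the stray dash with code 150 from the question.
tokens, errors = tokenize("3 " + chr(150) + " 4")
print(tokens)
print(errors)
```

Because ANY always consumes exactly one character, the loop always makes progress, and the error list gives you one place to decide whether an illegal character should abort the parse or just be reported.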

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Bob Frankel
> Sent: Saturday, May 08, 2010 4:17 PM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] detecting illegal input chars in the lexer
> 
> what's the best way to detect illegal input chars in the lexer -- in
> my case, chars with a code > 127  [i just had my grammar enter an
> infinite loop on an arithmetic expression where the minus sign was
> really an en-dash with code == 150, but maybe that's another
> problem!!!]
> 
> presumably, some pattern that matches chars \u0080 -- \uFFFF and yields
> some distinguished token that causes the grammar to fail???
> 
> thanks in advance....
> 
> 




