[antlr-interest] Creating a lexer that returns a token for bad characters
Bryan H. Haber
bryan.haber at gmail.com
Sun Apr 27 11:51:50 PDT 2008
Hi,
I'm trying to create a lexer that will return a token for invalid
characters. For example, if you have this:
INT : 'int';
WHITESPACE : (' ')+;
and the input is 'int iint', I would want a token stream of INT('int'),
WHITESPACE(' ') and BAD('iint'). I just got the ANTLR book; is such a
thing possible? It looks like I would have to write a new nextToken()
method that records where the bad characters start and keeps consuming until
it hits a valid token. I would then roll back that valid token and create a
bad token for the part recorded. Is there a better way to do this? Any help
would be appreciated, thanks.
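
To make the nextToken() idea above concrete, here is a rough sketch in plain Python. It does not use the ANTLR runtime at all; RULES, match_at, tokenize and the resync_on parameter are hypothetical names made up for illustration. It records where a bad run starts, advances one character at a time, and stops as soon as a valid token would match, so nothing actually has to be rolled back. Note that under this literal rule 'iint' comes out as BAD('i') followed by INT('int'). Getting BAD('iint') needs an extra assumption, for example that only WHITESPACE may end a bad run, which is what resync_on is for.

```python
import re

# The two token rules from the grammar above, as regular expressions.
RULES = [
    ("INT", re.compile(r"int")),
    ("WHITESPACE", re.compile(r" +")),
]


def match_at(text, pos, allowed=None):
    """Return (type, lexeme) for the longest rule matching at pos, or None.

    If allowed is given, only rules whose names are in it are tried.
    """
    best = None
    for name, pattern in RULES:
        if allowed is not None and name not in allowed:
            continue
        m = pattern.match(text, pos)
        if m and m.end() > pos:
            if best is None or len(m.group()) > len(best[1]):
                best = (name, m.group())
    return best


def tokenize(text, resync_on=None):
    """Split text into (type, lexeme) pairs, emitting BAD for unmatched runs.

    resync_on (an assumption, not part of the original idea) restricts
    which token types are allowed to end a bad run; None means any rule.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        tok = match_at(text, pos)
        if tok is not None:
            tokens.append(tok)
            pos += len(tok[1])
            continue
        # No rule matches here: remember the start of the bad run and keep
        # consuming until a valid token would match, or the input ends.
        start = pos
        pos += 1
        while pos < len(text) and match_at(text, pos, resync_on) is None:
            pos += 1
        tokens.append(("BAD", text[start:pos]))
    return tokens
```

Calling tokenize("int iint") ends the bad run at the first point where any rule matches, while tokenize("int iint", resync_on={"WHITESPACE"}) keeps the whole word together as one BAD token, which is the stream asked for above.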
-- bryan