[antlr-interest] Lexer error handling

Tue May 15 19:09:21 PDT 2012

Don't try to use the lexer to 'parse' the token. Just accept any
character, including escape sequences, and when the closing quote is
seen, look through the characters you have collected and issue an error
message for any that are incorrect. That way you can accumulate the
errors but still make the token and get as far through your process as
possible. It does not really make sense to just error out in the middle
of the token: what do you do next, just stop? In general, always try to
accumulate as many errors as possible in one go, and don't make your
users re-run your process after every single error :)
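
As a rough illustration of the "check after the closing quote" idea, here
is a minimal sketch in C. It assumes the text between the quotes is
available as an array of 16-bit code units; the function name
check_string_body and the bad_offsets buffer are hypothetical and not part
of ANTLR. The validity ranges are taken from the LoopChar rule in the
original question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an ANTLR API: scan the body of an already
 * matched string token (the characters between the quotes) and record
 * every invalid character instead of stopping at the first one.
 *
 * Returns the total number of invalid characters found. Up to max_bad
 * of their offsets are stored in bad_offsets for later reporting.
 */
size_t check_string_body(const uint16_t *text, size_t len,
                         size_t *bad_offsets, size_t max_bad)
{
    size_t errors = 0;

    assert(text != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        uint16_t c = text[i];

        if (c == 0x005C) {
            /* Backslash: whatever follows is escaped, so skip it. */
            if (i + 1 < len)
                i++;
            continue;
        }

        /* Unescaped characters must lie in U+0020..U+007E. */
        if (c < 0x0020 || c > 0x007E) {
            if (errors < max_bad)
                bad_offsets[errors] = i;
            errors++;
        }
    }
    return errors;
}
```

The lexer rule itself would then accept any character up to the closing
quote, and the action at the end of the rule would pass the collected text
to a check like this one, emitting one message per recorded offset while
still producing the token.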

Jim

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of A Z
Sent: Tuesday, May 15, 2012 1:45 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] Lexer error handling

Hello all,

  The lexer rule below simply matches a quoted string while allowing
escaped characters such as \n. Any non-escaped control character is an
error, so I'd like to exit the rule when this happens, but this doesn't
seem possible unless I use recursive calls to LoopChar instead of *. I'm
wondering if there is a better way to handle this without recursion.

//Quoted string
STR           : '\u0022' LoopChar* '\u0022';

fragment LoopChar :
   '\u0000'..'\u001F'   {ctx->dirLexerError();} //Exit rule here
 | '\u0020'
 | '\u0021'
 | '\u0023'..'\u005B'
 | '\u005C' .
 | '\u005D'..'\u007E'
 | '\u007F'..'\uFFFF'   {ctx->dirLexerError();} //Exit rule here
 ;

Thanks
