[antlr-interest] custom error reporting
impulze at impulze.org
Thu Oct 25 03:27:16 PDT 2012
I have been working through the custom error reporting facilities of ANTLR,
in particular overriding nextToken, recoverFromMismatchedToken and displayRecognitionError,
but I cannot figure out how to achieve the following behaviour.
I have a lexer grammar that contains, among others, the following IDENTIFIER rule:
IDENTIFIER : ASCII_LETTER (ASCII_LETTER|'_'|DEC_DIGIT)*;
fragment DEC_DIGIT : '0'..'9';
fragment ASCII_LETTER : 'a'..'z'|'A'..'Z';
If the input stream contains something like "identifi€r", the string representations
of the resulting tokens are:
token: [Index: 0 (Start: 167592512-Stop: 167592519) ='identifi', type<13> Line: 1 LinePos:-1]
token: [Index: 1 (Start: 167592523-Stop: 167592523) ='r', type<13> Line: 1 LinePos:8]
Also the following error occurs during lexing:
[...] : lexer error 3 :
1:1: Tokens : ( OCT_LITERAL | HEX_LITERAL | DEC_LITERAL | FLOAT_LITERAL | STRING_LITERAL | IDENTIFIER | COMMENT | NEWLINE | WHITESPACE ); at offset 8, near '¬' :
I want to produce a more detailed error message, such as "invalid identifier 'identifi€r' due to failure at lexing '€'" or
something similar. Is this possible, and if so, where should I look?
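
To make the desired behaviour concrete, here is a minimal sketch in plain Python. It does not use ANTLR or any of its runtime hooks; it only mirrors the IDENTIFIER rule above with regular expressions. The names lex_identifiers and invalid_identifier_message are hypothetical helpers made up for this illustration, and treating a whole whitespace-delimited word as "the identifier" is an assumption about what the message should cover.

```python
import re

# Mirrors the grammar: IDENTIFIER : ASCII_LETTER (ASCII_LETTER|'_'|DEC_DIGIT)*
# The character classes are explicit ASCII ranges, so '€' never matches.
IDENT_START = re.compile(r"[A-Za-z]")
IDENT_PART = re.compile(r"[A-Za-z_0-9]")
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z_0-9]*")


def lex_identifiers(text):
    """Return every IDENTIFIER match, silently skipping anything else.

    This reproduces the split seen in the token dump: the offending
    character is dropped and lexing resumes after it.
    """
    return IDENTIFIER.findall(text)


def invalid_identifier_message(word):
    """Hypothetical helper: return a detailed message for the first
    character of `word` that the IDENTIFIER rule rejects, or None if
    the whole word is a valid identifier (or empty)."""
    for index, ch in enumerate(word):
        pattern = IDENT_START if index == 0 else IDENT_PART
        if not pattern.fullmatch(ch):
            return "invalid identifier '%s' due to failure at lexing '%s'" % (word, ch)
    return None


if __name__ == "__main__":
    print(lex_identifiers("identifi€r"))             # ['identifi', 'r']
    print(invalid_identifier_message("identifi€r"))
```

The first call shows the two tokens I currently get; the second prints the kind of message I would like the lexer's error reporting to produce instead.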
Thanks a lot in advance.
If you still don't like it, that's ok: that's why I'm boss. I simply know better than you do.
--- Linus Torvalds, comp.os.linux.advocacy, 1996/07/22