[antlr-interest] custom error reporting

Daniel Mierswa impulze at impulze.org
Thu Oct 25 03:27:16 PDT 2012


Hi there,

I have been working through the custom error reporting facilities of ANTLR,
in particular overriding nextToken, recoverFromMismatchedToken and displayRecognitionError,
but I cannot figure out how to achieve the following behaviour.

I have a lexer grammar that contains the following IDENTIFIER rule:

IDENTIFIER : ASCII_LETTER (ASCII_LETTER|'_'|DEC_DIGIT)*;
fragment DEC_DIGIT : '0'..'9';
fragment ASCII_LETTER : 'a'..'z'|'A'..'Z';

If the input stream contains something like "identifi€r", the string representations
of the resulting tokens are:

token: [Index: 0 (Start: 167592512-Stop: 167592519) ='identifi', type<13> Line: 1 LinePos:-1]
token: [Index: 1 (Start: 167592523-Stop: 167592523) ='r', type<13> Line: 1 LinePos:8]

The following error is also reported during lexing:
[...] : lexer error 3 :
	1:1: Tokens : ( OCT_LITERAL | HEX_LITERAL | DEC_LITERAL | FLOAT_LITERAL | STRING_LITERAL | IDENTIFIER | COMMENT | NEWLINE | WHITESPACE ); at offset 8, near '¬' :
	€r
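
The split into 'identifi' and 'r' is consistent with a lexer that, on hitting a
character no rule can match, reports an error, drops that single character and
resumes. A minimal standalone Python sketch of that recovery follows. It is not
ANTLR code: the function name, the one-character skip and the 0-based character
offsets are assumptions made purely for illustration.

```python
import re

# Same shape as the IDENTIFIER rule above: an ASCII letter, then letters, '_' or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z_0-9]*")


def lex_with_default_recovery(text):
    """Return (tokens, errors); each unmatched character is reported and skipped."""
    tokens, errors = [], []
    pos = 0
    while pos < len(text):
        match = IDENTIFIER.match(text, pos)
        if match:
            # Longest possible identifier starting at pos.
            tokens.append((match.start(), match.group()))
            pos = match.end()
        else:
            # No rule matches here: record the offending character and move on.
            errors.append((pos, text[pos]))
            pos += 1
    return tokens, errors


tokens, errors = lex_with_default_recovery("identifi€r")
print(tokens)
print(errors)
```

The sketch yields two identifier tokens and one error for the '€', which is the
same shape as the token dump and the lexer error shown above.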

I would like to produce a more detailed error message, such as
"invalid identifier 'identifi€r' due to failure at lexing '€'".
Is this possible, and if so, where should I look?
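
To make the desired behaviour concrete, here is a minimal standalone Python
sketch of such a message. It is not ANTLR code and does not use any ANTLR API:
the function name describe_invalid_identifier and the rule that a "word" ends at
the next whitespace character are assumptions made for illustration only.

```python
import re

# Same shape as the IDENTIFIER rule: an ASCII letter, then letters, '_' or digits.
IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z_0-9]*")


def describe_invalid_identifier(text, start=0):
    """Return an error message for the word beginning at `start`, or None."""
    # Assumption: the whole word runs up to the next whitespace character.
    end = start
    while end < len(text) and not text[end].isspace():
        end += 1
    word = text[start:end]

    match = IDENTIFIER.match(word)
    if match is None:
        # The word does not even begin like an identifier; out of scope here.
        return None
    if match.end() == len(word):
        # The whole word is a valid identifier, so there is nothing to report.
        return None

    bad_char = word[match.end()]  # first character the rule could not consume
    return f"invalid identifier '{word}' due to failure at lexing '{bad_char}'"


print(describe_invalid_identifier("identifi€r"))
```

The idea is to report the whole whitespace-delimited word together with the
first character that stopped the match, instead of emitting two separate
identifier tokens around the error.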

Thanks a lot in advance.

-- 
Mierswa, Daniel

If you still don't like it, that's ok: that's why I'm boss. I simply know better than you do.
               --- Linus Torvalds, comp.os.linux.advocacy, 1996/07/22

