[antlr-interest] Lexer recognition exceptions
Jim Idle
jimi at temporal-wave.com
Wed Jan 28 08:51:18 PST 2009
Bruno Marc-Aurele wrote:
> Hi,
>
> I am currently doing a project where the description of our software
> architecture is very important. Therefore, I have to understand the code that's
> generated by ANTLR properly (academic stuff... Need to provide documents and
> follow them...)
>
> I have seen that the generated parser catches RecognitionExceptions during
> parsing. How about the lexer? There is no catch block, so I understand that the
> exceptions are to be caught by us when we instantiate the lexer object. Am I
> right, or are there some tricky mechanics involved in the base classes that I am
> not aware of?
>
There is very little that a lexer can do when it encounters something
that it does not like. In fact, all it can really do is print a message
and consume the character it is looking at. Look at the nextToken()
method in Lexer.java and the generated mTokens() code: the exceptions
are caught there, in the base class, rather than by your own code. You
can override the error reporting if you like.
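
To make the "print a message and consume the character" behaviour concrete, here is a minimal, self-contained sketch. It does not use the ANTLR runtime and is not a copy of Lexer.java; the class SketchLexer, its fields, and its reportError(int, char) hook are hypothetical names invented for illustration. It only shows the shape of the loop: try to match, and if nothing matches, report, skip one character, and carry on.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a lexer; NOT the ANTLR runtime's actual code.
class SketchLexer {
    protected final String input;
    protected int pos = 0;
    // Collected messages, so a caller can inspect what was reported.
    final List<String> errors = new ArrayList<>();

    SketchLexer(String input) {
        this.input = input;
    }

    // Override point: a subclass can replace this to produce friendlier messages.
    protected void reportError(int index, char c) {
        errors.add("offset " + index + ": unrecognized character '" + c + "'");
    }

    // Returns the text of the next token, or null at end of input.
    String nextToken() {
        while (pos < input.length()) {
            char c = input.charAt(pos);
            if (Character.isWhitespace(c)) {
                pos++; // skip whitespace
                continue;
            }
            if (Character.isLetter(c)) {
                int start = pos;
                while (pos < input.length() && Character.isLetter(input.charAt(pos))) pos++;
                return input.substring(start, pos);
            }
            if (Character.isDigit(c)) {
                int start = pos;
                while (pos < input.length() && Character.isDigit(input.charAt(pos))) pos++;
                return input.substring(start, pos);
            }
            // Nothing matched: report, consume the offending character, try again.
            reportError(pos, c);
            pos++;
        }
        return null;
    }

    // Convenience helper: drain the input into a list of token texts.
    List<String> allTokens() {
        List<String> out = new ArrayList<>();
        for (String t = nextToken(); t != null; t = nextToken()) out.add(t);
        return out;
    }
}
```

With the input "ab # 12", this sketch yields the tokens "ab" and "12" and records a single error for the '#'. A subclass overriding reportError could turn that into whatever message suits your users.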

Really, though, you are supposed to code your lexer to handle errors. In
the simplest case, you create a catch-all rule, placed after your other
lexer rules, to pick out characters that you have not otherwise created
a match for:

    BAD : . { /* your error handler */ } ;

However, you also need to cater for things that can go wrong after
prediction has selected a particular rule and the match itself then
fails partway through: unterminated string literals, for instance.
Code as much of this as is useful and practical, then override the
error reporting mechanism to give your users an error that tells them
more than "Unrecognized character". It could also do custom recovery,
I suppose.
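
As a rough illustration of coding for an error inside a rule, the sketch below scans a double-quoted string and produces a specific message when the closing quote is missing. It is plain Java rather than an ANTLR grammar, and StringLiteralSketch, Result, and scanString are hypothetical names, not part of any ANTLR API. It assumes the literal may not span lines and does not handle escape sequences.

```java
// Hypothetical helper; illustrates the idea only, not ANTLR-generated code.
class StringLiteralSketch {
    static final class Result {
        final String text;  // the characters consumed
        final String error; // null when the literal was properly terminated
        final int next;     // offset at which scanning should resume

        Result(String text, String error, int next) {
            this.text = text;
            this.error = error;
            this.next = next;
        }
    }

    // Assumes input.charAt(start) is the opening double quote.
    static Result scanString(String input, int start) {
        int i = start + 1;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (c == '"') {
                // Closing quote found: a well-formed literal.
                return new Result(input.substring(start, i + 1), null, i + 1);
            }
            if (c == '\n' || c == '\r') {
                break; // end of line reached before the closing quote
            }
            i++;
        }
        // Ran out of line or input: report something more useful than
        // a generic "unrecognized character" and resume at the line end.
        return new Result(input.substring(start, i),
                "unterminated string literal starting at offset " + start, i);
    }
}
```

Resuming at the end of the line is only one possible recovery choice; it keeps a single missing quote from swallowing the rest of the input.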

For an example of coding for errors, also see:

http://antlr.org/wiki/display/ANTLR3/Lexer+grammar+for+floating+point%2C+dot%2C+range%2C+time+specs

Jim