[antlr-interest] Access to lexer warning/error messages after parsing

Tue Jul 1 10:02:42 PDT 2008

Jim,

As you point out, the default case is that people will want to process their
own error messages rather than have ANTLR send them to stderr. So doesn't it
make more sense to have ANTLR package them into a data structure (which has a
toString() method that ANTLR calls and sends to stderr)? The alternative is
that people have to either parse the error text or look at the ANTLR-generated
code to understand how to override the default behavior.
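
A minimal sketch of that idea in Java, to make the shape concrete. Every name
here (SyntaxProblem, ProblemReporter, report, problems) is hypothetical and
not part of the ANTLR runtime; the message format is only an assumption about
what a readable default might look like.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical structured error record; NOT an ANTLR class.
final class SyntaxProblem {
    final int line;
    final int column;
    final String message;

    SyntaxProblem(int line, int column, String message) {
        this.line = line;
        this.column = column;
        this.message = message;
    }

    // The default stderr text is just the object's string form.
    @Override
    public String toString() {
        return "line " + line + ":" + column + " " + message;
    }
}

// Hypothetical reporter: always keeps the object, optionally echoes it to stderr.
class ProblemReporter {
    private final List<SyntaxProblem> problems = new ArrayList<>();
    private final boolean echoToStderr;

    ProblemReporter(boolean echoToStderr) {
        this.echoToStderr = echoToStderr;
    }

    void report(SyntaxProblem problem) {
        problems.add(problem);
        if (echoToStderr) {
            System.err.println(problem); // println calls toString()
        }
    }

    // Read-only view, so callers can inspect fields instead of parsing text.
    List<SyntaxProblem> problems() {
        return Collections.unmodifiableList(problems);
    }
}
```

With something like this, a GUI application would construct the reporter with
echoToStderr set to false and read line, column and message directly, while a
command-line user would see the same stderr output as before.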

You mention reusing your error handling mechanism across "virtually all" your
grammars. I think that for almost all ANTLR users, the number of lexer/parsers
that they're going to write is exactly 1. Better to make it as easy as possible
to write that
first grammar and not assume that they're going to be creating more grammars
anyway. Part of making it easy is to make it possible to build a lexer/parser
as a "black box", without having to ever look at the ANTLR-generated code.

Andy

Jim Idle wrote:
> On Tue, 2008-07-01 at 08:54 +0200, Raphael Reitzig wrote:
>> Hi!
>>
>> I second that for I am about to write something quite similar. System.err
>> is no good in a user oriented GUI application.
>>
>> I can think of two possibilities to integrate such behaviour in ANTLR:
>> * grammar option like "warnMode", i.e. with values "console" and "collect".
>> I'd like to have _one_ exception thrown if any error occurred while
>> parsing.
>> * possibility to set output stream for error messages via grammar option:
>> @errors { System.err } (default)
>> Implementation of either should be no obstacle (*guess*).
> 
> In the case of lexers, it is best to build a lexer that almost cannot 
> throw errors as once you lex incorrectly then there isn't much you can 
> do. Having rules in the lexer that catch known common mistakes and/or 
> catch any character that makes no sense in your lexer means that your 
> whole solution will be more robust. For most lexers, just having:
> 
> BADCHAR: . {insert your error code};
> 
> as the last rule will improve things.
> 
> However, in the case of lexer, parser and tree parser it is trivial to 
> override the error output method and add your errors to collections/a 
> collection. As the standard error messages are usually of no use to a 
> real application (and they cannot be, there are too many things you 
> might wish to do on error), then you will almost certainly want to 
> implement your own error output anyway. Just add the message to a 
> collection. I do this with virtually every recognizer I write and it 
> takes less time than learning some new syntax and access methods for 
> ANTLR (which everyone will then complain about because they don't do 
> exactly what they had in mind). ;-)
> 
> So, the method that is called has all the information that you could 
> need, but YOU have to make it into a collection, format it in a way 
> that makes sense for your application, and present the errors to your 
> users. There is no generic solution that would provide much more than a 
> different set of questions than there is right now. Sure, the errors 
> could all be collected as objects that you then iterate, but then there 
> is more code for people to rip out when they don't want that!
> 
> Come on guys, the error messages are an afternoon's coding that you can 
> probably reuse on related projects (if they are living in the same 
> environment.) I last did this in C# and if it took an hour to get it all 
> together I would be surprised. You only need to learn the ANTLR bit once.
> 
> Jim
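
For readers following the thread, the override Jim describes can be sketched
roughly as below. This is an illustration under assumptions, not the ANTLR
runtime itself: StandInRecognizer is a hypothetical stand-in for a generated
recognizer's base class so that the example runs without ANTLR on the
classpath. As far as I know, the ANTLR 3 Java runtime exposes a comparable
hook as emitErrorMessage(String) on BaseRecognizer, but check the generated
code or runtime source for your version before relying on that.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a recognizer base class; NOT the ANTLR runtime.
class StandInRecognizer {
    // Default behaviour: write the formatted message to stderr.
    public void emitErrorMessage(String msg) {
        System.err.println(msg);
    }

    // Simulates the recognizer reporting an error during a parse.
    void simulateError(int line, int column, String what) {
        emitErrorMessage("line " + line + ":" + column + " " + what);
    }
}

// The override: add each message to a collection instead of printing it.
class CollectingRecognizer extends StandInRecognizer {
    private final List<String> errors = new ArrayList<>();

    @Override
    public void emitErrorMessage(String msg) {
        errors.add(msg);
    }

    List<String> getErrors() {
        return errors;
    }
}
```

After parsing, the application reads getErrors() and presents the messages
however it likes. Note that what is collected here is already-formatted text,
which is the limitation Andy raises above.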