[antlr-interest] Access to lexer warning/error messages after parsing

Tue Jul 1 10:54:08 PDT 2008

I agree that this should be easier to implement. My work-around is to
capture System.err in a buffer and, at the end of parsing, raise an
error if there is any text there. I know there are other ways of doing
this, but this way was simplest for me in that I didn't have to change
a single line of my grammar.
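
A minimal Java sketch of that work-around follows. It is an illustration only, not the Ruby module described in the next paragraph: the class name StderrCapture and the methods capture and runStrict are made up for this example, and it assumes the recognizer looks up System.err at the moment it reports an error, so that swapping the stream beforehand is enough.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

// Hypothetical helper: redirects System.err into a buffer while a task runs.
public class StderrCapture {

    /** Runs the task with System.err redirected and returns whatever was written. */
    public static String capture(Runnable task) {
        PrintStream original = System.err;
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        PrintStream replacement = new PrintStream(buffer, true);
        System.setErr(replacement);
        try {
            task.run();
        } finally {
            // Always restore the real stream, even if the task throws.
            replacement.flush();
            System.setErr(original);
        }
        return buffer.toString();
    }

    /** Runs the task and raises a single error if anything reached System.err. */
    public static void runStrict(Runnable task) {
        String errors = capture(task);
        if (!errors.trim().isEmpty()) {
            throw new IllegalStateException("Errors reported during parsing:\n" + errors);
        }
    }
}
```

The task passed in would wrap the call to the parser's start rule, e.g. StderrCapture.runStrict(() -> ...) on Java 8 or later. System.setErr is global to the JVM, so this sketch is only safe when nothing else writes to System.err concurrently.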

I have this encapsulated in a Ruby module (I use the Java target and
JRuby), so I don't have to think about it and can reuse the logic for
any grammar. This works fine for me now that I have it set up, but
writing to System.err by default seems to be less than ideal.

Also, Andy raises a very good point about many people only ever
creating a single ANTLR grammar. I think that's a good mindset to
approach the UI design with. (Of course, once they've created their
single ANTLR grammar and see how much fun it is, they'll be back for
more.)

2008/7/1 Andy Tripp <antlr at jazillian.com>:
> Jim,
>
> As you point out, the default case is that people will want to process
> their own error messages, rather than have ANTLR send them to stderr. So
> doesn't it make more sense to have ANTLR package them into a data structure
> (which has a toString() method which ANTLR calls and sends to stderr)? The
> alternative is that people have to either parse the error text or look at
> the ANTLR-generated code to understand how to override the default behavior.
>
> You mention reusing your error handling mechanism across "virtually all"
> your grammars. I think that for almost all ANTLR users, the number of
> lexer/parsers that they're going to write is exactly 1. Better to make it
> as easy as possible to write that first grammar and not assume that they're
> going to be creating more grammars anyway. Part of making it easy is to
> make it possible to build a lexer/parser as a "black box", without having
> to ever look at the ANTLR-generated code.
>
> Andy
>
>
> Jim Idle wrote:
>>
>> On Tue, 2008-07-01 at 08:54 +0200, Raphael Reitzig wrote:
>>>
>>> Hi!
>>>
>>> I second that for I am about to write something quite similar. System.err
>>> is no good in a user oriented GUI application.
>>>
>>> I can think of two possibilities to integrate such behaviour in ANTLR:
>>> * grammar option like "warnMode", i.e. with values "console" and
>>> "collect".
>>> I'd like to have _one_ exception thrown if any error occurred while
>>> parsing.
>>> * possibility to set output stream for error messages via grammar option:
>>> @errors { System.err } (default)
>>> Implementation of either should be no obstacle (*guess*).
>>
>> In the case of lexers, it is best to build a lexer that almost cannot
>> throw errors as once you lex incorrectly then there isn't much you can do.
>> Having rules in the lexer that catch known common mistakes and/or catch any
>> character that makes no sense in your lexer means that your whole solution
>> will be more robust. For most lexers, just having:
>>
>> BADCHAR: . {insert your error code};
>>
>> as the last rule will improve things.
>>
>> However, in the case of lexer, parser and tree parser it is trivial to
>> override the error output method and add your errors to collections/a
>> collection. As the standard error messages are usually of no use to a real
>> application (and they cannot be, there are too many things you might wish to
>> do on error), then you will almost certainly want to implement your own
>> error output anyway. Just add the message to a collection. I do this with
>> virtually every recognizer I write and it takes less time than learning some
>> new syntax and access methods for ANTLR (which everyone will then complain
>> about because they don't do exactly what they had in mind. ;-)
>>
>> So, the method that is called has all the information that you could need,
>> but YOU have to make it into a collection, format it in a way that makes
>> sense for your application, and present the errors to your users. There is
>> no generic solution that would provide much more than a different set of
>> questions than there is right now. Sure, the errors could all be collected
>> as objects that you then iterate, but then there is more code for people to
>> rip out when they don't want that!
>>
>> Come on guys, the error messages are an afternoon's coding that you can
>> probably reuse on related projects (if they are living in the same
>> environment.) I last did this in C# and if it took an hour to get it all
>> together I would be surprised. You only need to learn the ANTLR bit once.
>>
>> Jim
>
>