[antlr-interest] Advanced questions and proposals

Alexey Demakov demakov at ispras.ru
Thu Mar 10 06:21:32 PST 2005


Hi all,

I'm trying to improve my parser (and ANTLR skills :) ) and have the following questions:

1. How can I suppress the warning for an "optional" token in the lexer?

I have a lexer rule that uses another lexer to recognize a complex token:

CUSTOM_TOKEN : { ... }?
               {
                 selector.push( "custom" );
                 Token t = selector.nextToken();
                 t.setType( CUSTOM_TOKEN );
                 $setToken( t );
               }
             ;

ANTLR complains that this rule is optional (it can match "nothing").
Yes, it's a reasonable warning, but in other cases I can suppress similar warnings
using options ('warnWhenFollowAmbig' and 'generateAmbigWarnings').
How can I suppress this one?

2. Text of the EOF token

When ANTLR (or a generated parser) reports an unexpected end of file,
the error message looks like:

a.g:2:1: expecting ID, found 'null'

I'd prefer to see something more informative than 'null'.
I've found a workaround for that,
but I propose changing the default behaviour in future versions.
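
A minimal sketch of the kind of substitution I mean, assuming the EOF token's
text arrives as null (which the message above suggests). EofText and its
methods are hypothetical names for illustration, not ANTLR API:

```java
// Hypothetical helper, not part of ANTLR: formats the "found ..." part of a message.
class EofText {
    // Returns a readable description of the offending token's text.
    static String describeFound(String tokenText) {
        // Assumption: the EOF token carries no text, so it shows up here as null.
        return tokenText == null ? "end of file" : "'" + tokenText + "'";
    }

    // Builds a message in the same shape as the one quoted above.
    static String message(String expected, String tokenText) {
        return "expecting " + expected + ", found " + describeFound(tokenText);
    }

    public static void main(String[] args) {
        System.out.println(message("ID", null));
        System.out.println(message("ID", ";"));
    }
}
```

In a real parser this logic would have to be hooked into wherever the error
text is produced, for example an overridden reportError.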

3. Multiple error messages about an unexpected EOF in the parser

When EOF is found in some deeply nested parser rule, ANTLR generates a syntax error
message for each exception handler (rule) on the stack. I propose handling this case
separately and, for example, suppressing all error messages after the first one
once EOF is reached.
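
The proposed behaviour could be sketched roughly as follows. This is only an
illustration: EofAwareReporter is a hypothetical class, and the atEof flag
stands in for whatever check the parser would use to detect that the offending
token is EOF.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reporter, not ANTLR API: keeps the first EOF error and drops the rest.
class EofAwareReporter {
    private final List<String> reported = new ArrayList<String>();
    private boolean eofReported = false;

    // Records a message unless it is a repeated error at end of file.
    void report(String message, boolean atEof) {
        if (atEof) {
            if (eofReported) {
                return; // an EOF error was already shown; stay silent
            }
            eofReported = true;
        }
        reported.add(message);
    }

    // Messages that would actually be shown to the user.
    List<String> messages() {
        return reported;
    }

    public static void main(String[] args) {
        EofAwareReporter r = new EofAwareReporter();
        r.report("expecting ID, found end of file", true);
        r.report("expecting SEMI, found end of file", true);
        r.report("expecting RBRACE, found end of file", true);
        System.out.println(r.messages());
    }
}
```

Errors that occur before EOF are still reported one by one; only the cascade
from the enclosing rules is suppressed.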

4. Error handling - extend default error handler

When I specify my own error handler, ANTLR doesn't generate the default one.
But what if I handle additional exception types and want to use the default handler
for RecognitionException? I propose that 
options { defaultErrorHandler=true; }
turn on default error handler generation in this case.

5. How to use the follow set in user-defined exception handlers?

Another way to extend the default error handler is to reproduce it in the user error handler.
But the default error handler uses a follow set:

  catch (RecognitionException ex) {
   if (inputState.guessing==0) {
    reportError(ex);
    consume();
    consumeUntil(_tokenSet_4);
   } else {
     throw ex;
   }
  }

In this case it is _tokenSet_4. What should I write in my error handler to mimic
this behaviour? I can't use _tokenSet_4 because only ANTLR knows the set's number,
and it can change whenever the grammar changes.
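
The only fallback I can think of is to maintain the synchronization set by
hand, which is what I would like to avoid. A rough sketch of that idea,
written without any ANTLR classes: ManualResync, its token type numbers and
its methods are all hypothetical, and the hand-written set is only an
approximation of the follow set that ANTLR computes.

```java
import java.util.BitSet;

// Hypothetical stand-in for a parser's resynchronization step; not ANTLR API.
class ManualResync {
    // Token type numbers are made up for this sketch.
    static final int EOF = 1, ID = 4, SEMI = 5, RBRACE = 6;

    // Builds a synchronization set from explicitly listed token types.
    static BitSet setOf(int... types) {
        BitSet set = new BitSet();
        for (int t : types) {
            set.set(t);
        }
        return set;
    }

    // Skips tokens starting at pos until one whose type is in sync;
    // returns that token's index, or tokenTypes.length if none is found.
    static int consumeUntil(int[] tokenTypes, int pos, BitSet sync) {
        while (pos < tokenTypes.length && !sync.get(tokenTypes[pos])) {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        BitSet sync = setOf(SEMI, RBRACE, EOF);
        int[] input = { ID, ID, SEMI, ID, EOF };
        // Mimic consume() followed by consumeUntil(set): skip the bad token first.
        int next = consumeUntil(input, 1, sync);
        System.out.println("resume at index " + next);
    }
}
```

The drawback is obvious: the set has to be updated manually whenever the
grammar changes, so a way to refer to the generated follow set would still be
preferable.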

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com



