[antlr-interest] Reg Multi-line comments

Gokulakannan Somasundaram gokul007 at gmail.com
Thu Jul 16 12:34:06 PDT 2009


Actually there is a comment like this before the source code of the
function.

    /** A more general version of getRuleInvocationStack where you can
     *  pass in, for example, a RecognitionException to get it's rule
     *  stack trace.  This routine is shared with all recognizers, hence,
     *  static.
     *
     *  TODO: move to a utility class or something; weird having lexer call this
     */
    public static List getRuleInvocationStack(Throwable e,
                                              String recognizerClassName)


To me, this more or less suggests that the lexer should not call this
method. Looking at the source code, I suspect your approach may not work,
because the token being attempted never gets saved anywhere. But let me
check this.
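
To make the discussion concrete, here is a minimal sketch of how a rule
invocation stack can be recovered from an exception's stack trace. It is not
the ANTLR source: ToyLexer, ruleInvocationStack and rulePrefix are
hypothetical names made up for illustration, and I am assuming the real
method does some additional filtering of runtime frames.

```java
import java.util.ArrayList;
import java.util.List;

public class RuleStackSketch {

    /** Hypothetical stand-in for a generated lexer; not ANTLR code. */
    static class ToyLexer {
        void mTokens() { mML_COMMENT(); }

        void mML_COMMENT() {
            // Pretend EOF was hit before the closing "*/".
            throw new IllegalStateException("unterminated comment");
        }
    }

    /**
     * Collects the names of the methods of the given class that appear in
     * the stack trace of e, outermost call first. Assumption: this only
     * imitates the idea behind getRuleInvocationStack.
     */
    static List<String> ruleInvocationStack(Throwable e, String recognizerClassName) {
        List<String> rules = new ArrayList<>();
        StackTraceElement[] trace = e.getStackTrace();
        // The trace is innermost-first, so walk it backwards.
        for (int i = trace.length - 1; i >= 0; i--) {
            if (trace[i].getClassName().equals(recognizerClassName)) {
                rules.add(trace[i].getMethodName());
            }
        }
        return rules;
    }

    /** Builds the "in RULE: " prefix from position 1, dropping the leading "m". */
    static String rulePrefix(List<String> stack) {
        return stack.size() < 2 ? "" : "in " + stack.get(1).substring(1) + ": ";
    }

    public static void main(String[] args) {
        try {
            new ToyLexer().mTokens();
        } catch (IllegalStateException e) {
            List<String> stack = ruleInvocationStack(e, ToyLexer.class.getName());
            System.out.println(rulePrefix(stack) + e.getMessage());
        }
    }
}
```

The point of the sketch is that the rule names come from the Java call stack
captured when the exception was created, so nothing about the token itself
has to be saved for this part to work.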

Thanks,
Gokul.

On Thu, Jul 16, 2009 at 9:46 PM, David-Sarah Hopwood <
david-sarah at jacaranda.org> wrote:

> Gokulakannan Somasundaram wrote:
> > Hi,
> >    I am trying to filter out multi-line comments, for which I am using
> > the following token definition (provided on antlr.org)
> > ML_COMMENT
> >     :    '/*' ( options { greedy = false; } : .* ) '*/' { skip(); };
> >
> > But I intend to provide an informative error message when EOF occurs
> > without any '*/'.  Can someone help me with how to achieve this? I am
> > trying out a lot of things, but nothing seems to work and I seem to be
> > missing some basic fact/knowledge.
>
> This is a special case of the more general issue of knowing what rule(s)
> you are in when an error occurs. This information is available in the
> "rule invocation stack". Override the following methods in your lexer:
>
> @lexer::members {
>  public String getErrorMessage(RecognitionException e,
>                                String[] tokenNames) {
>    List stack = getRuleInvocationStack(e, this.getClass().getName());
>
>    // The top-level token rule is almost always at position 1 in the
>    // stack, after "mTokens".
>    // .substring(1) strips the "m" prepended to lexer rule names.
>    String rule = stack.size() < 2 ? "" :
>      "in " + stack.get(1).toString().substring(1) + ": ";
>
>    return rule + super.getErrorMessage(e, tokenNames);
>  }
>
>  public String getTokenErrorDisplay(Token t) {
>    return t.toString();
>  }
> }
>
> Now you will get error messages something like:
>
>  line 1:5 in ML_COMMENT: mismatched character '<EOF>' expecting '*'
>
> (I name the rules so that they are more human-friendly; alternatively
> you can map them to localised strings easily enough.)
>
> The same approach can be used in the parser, although in that case the
> rule that is most human-relevant is less likely to be at position 1 in
> the invocation stack. See section 10 of 'The Definitive ANTLR Reference'
> for more information, or for an alternative approach using 'paraphrases'.
>
> --
> David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com