[antlr-interest] Reg Multi-line comments

David-Sarah Hopwood david-sarah at jacaranda.org
Thu Jul 16 09:16:25 PDT 2009


Gokulakannan Somasundaram wrote:
> Hi,
>    I am trying to filter out multi-line comments, for which I am using the
> following token definition (provided on antlr.org):
> ML_COMMENT
>     :    '/*' ( options { greedy = false; } : .* ) '*/' { skip(); };
> 
> But I intend to provide an informative error message when EOF occurs without
> any '*/'.  Can someone help me on how to achieve this? I am trying out a lot
> of things, but nothing seems to work and I seem to be missing some basic
> fact/knowledge.

This is a special case of the more general issue of knowing what rule(s)
you are in when an error occurs. This information is available in the
"rule invocation stack". Override the following methods in your lexer:

@lexer::members {
  // Needs java.util.List, e.g. "import java.util.List;" in @lexer::header.
  public String getErrorMessage(RecognitionException e,
                                String[] tokenNames) {
    List stack = getRuleInvocationStack(e, this.getClass().getName());

    // The top-level token rule is almost always at position 1 in the
    // stack, after "mTokens".
    // .substring(1) strips the "m" prepended to lexer rule names.
    String rule = stack.size() < 2 ? "" :
      "in " + stack.get(1).toString().substring(1) + ": ";

    return rule + super.getErrorMessage(e, tokenNames);
  }

  public String getTokenErrorDisplay(Token t) {
    return t.toString();
  }
}

You will now get error messages like:

  line 1:5 in ML_COMMENT: mismatched character '<EOF>' expecting '*'

(I name the rules so that they are more human-friendly; alternatively
you can map them to localised strings easily enough.)
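
As a rough sketch of that mapping idea, the lookup can be kept in a small
helper outside the lexer so it is easy to test on its own. The class name
RuleLabels, the label table and its entries are all hypothetical; only the
position-1 lookup and the substring(1) step come from the override shown
earlier.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RuleLabels {
    // Hypothetical table mapping lexer rule names to human-friendly labels.
    // The entries are illustrative; a real grammar would list its own rules,
    // or load the labels from a localised resource bundle.
    private static final Map<String, String> LABELS = new HashMap<String, String>();
    static {
        LABELS.put("ML_COMMENT", "multi-line comment");
        LABELS.put("STRING", "string literal");
    }

    /** Builds the "in <rule>: " prefix from a rule invocation stack. */
    public static String prefixFor(List<?> stack) {
        if (stack.size() < 2) {
            return "";
        }
        // Position 1 is assumed to hold the token rule's method name,
        // e.g. "mML_COMMENT"; substring(1) strips the leading "m".
        String rule = stack.get(1).toString().substring(1);
        String label = LABELS.get(rule);
        // Fall back to the raw rule name when no label is registered.
        return "in " + (label != null ? label : rule) + ": ";
    }

    public static void main(String[] args) {
        List<String> stack = Arrays.asList("mTokens", "mML_COMMENT");
        System.out.println(prefixFor(stack) + "mismatched character '<EOF>' expecting '*'");
    }
}
```

Inside getErrorMessage, the stack returned by getRuleInvocationStack could
then be passed straight to a helper like this one.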

The same approach can be used in the parser, although in that case the
rule that is most human-relevant is less likely to be at position 1 in
the invocation stack. See chapter 10 of 'The Definitive ANTLR Reference'
for more information, or for an alternative approach using 'paraphrases'.
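
One possible way to handle the parser case is to walk the stack from the
innermost rule outwards and stop at the first rule that is on a list of
rules you consider meaningful to users. This is only a sketch: the class
name RelevantRule and the rule names in the set are hypothetical, and it
assumes the stack is ordered outermost first, as in the lexer case where
"mTokens" comes before the token rule.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class RelevantRule {
    // Hypothetical set of parser rules that make sense in a user-facing message.
    private static final Set<String> RELEVANT = new HashSet<String>(
        Arrays.asList("statement", "expression", "declaration"));

    /**
     * Returns the innermost rule on the stack that is in RELEVANT,
     * or null if no such rule is present.
     */
    public static String innermostRelevant(List<?> stack) {
        // Assumes the stack is ordered outermost first, so scan from the end.
        for (int i = stack.size() - 1; i >= 0; i--) {
            String rule = stack.get(i).toString();
            if (RELEVANT.contains(rule)) {
                return rule;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> stack = Arrays.asList("program", "statement", "expression", "atom");
        System.out.println("in " + innermostRelevant(stack) + ": ...");
    }
}
```

Parser rule methods are not prefixed with "m", so no substring(1) step is
needed here.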

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


