[antlr-interest] Missing characters in partial matches

Matt Palmer mattpalms at gmail.com
Fri Aug 22 17:20:48 PDT 2008


Hi,

I'm scratching my head about a problem with multi-line comments, where
characters that only partially matched the comment header are removed from
the character stream. I've boiled the problem down to the simple grammar
below:

grammar T;

all     :    ( Text | Lsqb | Comment )* ;

Comment :    '[!--'  (options {greedy=false;} : . )* '--]' ;
Lsqb    :    '[' ;
Text    :    ( ~Lsqb )+ ;

If this text is run through the ANTLRWorks debugger (both 1.1.7 and 1.2b5):

A test [!-- comment --] of text [!that looks like the start [!-of a
[!comment, but [isn't one.

then the parse tree displays this:

  root
    |
   all
    |
  A test *[!-- comment --]* of text  *hat looks like the start* *f a*  *omment*, but *[* isn't one.


The real comment itself matches fine, and the solitary square bracket is
also OK, but wherever a '[' starts a partial comment header ('[!' or '[!-'),
that prefix and the single character following it are stripped from the
text.  Note that this problem only surfaces if the comment header is longer
than two characters.  Can anyone shed some light on this behaviour?
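
To make the symptom concrete, here is a small Python sketch of what I
suspect might be happening. It is only a toy model, not ANTLR's generated
code: I am assuming the lexer commits to Comment as soon as it has seen '[!'
(two characters of lookahead), and that when the rest of the header then
fails to match, it throws away everything consumed so far plus the offending
character. The names tokenize, HEADER, FOOTER and the lookahead parameter
are my own inventions for illustration.

```python
HEADER = "[!--"
FOOTER = "--]"


def tokenize(text, lookahead=2):
    """Toy model of the suspected behaviour; NOT ANTLR's actual algorithm."""
    tokens = []
    i, n = 0, len(text)
    while i < n:
        if text.startswith(HEADER[:lookahead], i):
            # Assumption: the lexer commits to Comment after `lookahead` chars.
            j = 0
            while j < len(HEADER) and i + j < n and text[i + j] == HEADER[j]:
                j += 1
            if j < len(HEADER):
                # Assumption: on a mismatch, the matched prefix and the
                # offending character are discarded and no token is emitted.
                i += j + 1
                continue
            end = text.find(FOOTER, i + len(HEADER))
            if end == -1:
                # Unterminated comment: treat the rest of the input as one.
                tokens.append(("Comment", text[i:]))
                break
            tokens.append(("Comment", text[i:end + len(FOOTER)]))
            i = end + len(FOOTER)
        elif text[i] == "[":
            tokens.append(("Lsqb", "["))
            i += 1
        else:
            # Text: one or more characters that are not '['.
            j = i
            while j < n and text[j] != "[":
                j += 1
            tokens.append(("Text", text[i:j]))
            i = j
    return tokens


# The test input from above, with the line break as it appears in this mail.
sample = ("A test [!-- comment --] of text [!that looks like the start "
          "[!-of a\n[!comment, but [isn't one.")
print("".join(value for _, value in tokenize(sample)))
```

With lookahead=2 this loses '[!t', '[!-o' and '[!c', leaving the same 'hat
looks like the start', 'f a' and 'omment' fragments that the debugger shows.
With lookahead=4 the sketch loses nothing, which would fit with the problem
only appearing for headers longer than two characters, if the lookahead
really is two.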

Thanks,

MattP.