[antlr-interest] Comments, EOF, and Debugger

Junkman j at junkwallah.org
Tue Jun 1 12:34:01 PDT 2010

Disclaimer:  I'm a noob.  :)

Taking the newline out of the COMMENT rule seems to work, like this:

COMMENT : '#' (~( '\r' | '\n' ))* ;
NEWLINE : '\r'? '\n'
              {
                // kick it off to the hidden channel
                // $channel=HIDDEN;

                // or skip it altogether
                // skip();
              }
        ;


With that change, a comment on the last line that runs straight into EOF
presents no problem.

I've seen this pattern for comments in other examples.
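
For what it's worth, here is a minimal sketch of a complete ANTLR 3 grammar
built around that idea.  The grammar name (CommentDemo), the prog rule, and
the WS and WORD tokens are made up for illustration and are not from Nathan's
Bash grammar; sending COMMENT to the hidden channel is also my assumption,
carried over from his original rule.

```antlr
grammar CommentDemo;   // hypothetical name, for illustration only

// Hypothetical parser rule: any number of words, then end of input.
prog    : WORD* EOF ;

// The comment stops *before* the line terminator, so it also matches
// when the input ends right after the comment text.
COMMENT : '#' ~('\r' | '\n')* { $channel = HIDDEN; } ;

// The line terminator is its own token; hide it (or call skip() instead).
NEWLINE : '\r'? '\n' { $channel = HIDDEN; } ;

// Hypothetical helper tokens so the sketch is self-contained.
WS      : (' ' | '\t')+ { $channel = HIDDEN; } ;
WORD    : ('a'..'z' | 'A'..'Z')+ ;
```

Because COMMENT no longer needs a '\n' to finish, input such as "foo # bar"
with no trailing newline should lex as WORD, hidden WS, hidden COMMENT, EOF.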

I don't know how or why debuggerLexer changes the outcome, but I assume you
can always trace the generated lexers to see how the different outcomes
arise.

Nathan Eloe wrote:
> On Jun 1, 2010, at 1:33 PM, anteusz at freemail.hu wrote:
>> On 6/1/2010 3:33 PM, Nathan Eloe wrote:
>>> Hello all,
>>> I'm working on an AST parser for the Bash language and I've come across the following strange behavior:
>>> I'm trying to handle comments, so I used the comment token you get when you start a new grammar in ANTLRWorks.  It works.
>>> COMMENT
>>>     :   '#' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
>>>     ;
>>> The problem arises when the comment is the last thing in the input (i.e., no newline before EOF).  Removing the '\n' from the token causes it to freak out when I run the tests, but I can't get it to match comments at the end of the file.  Leaving that '\n' in lets the code compile, but I still can't match that last case.
>>> Here's where the interesting part happens.  When I run it through the debugger with the same test case that I use in gunit, the debugger allows the input and parses it correctly (meaning, it ignores it as it should) and correctly generates the expected AST.
>>> Does the debugger allow the code to be more robust in its decision-making abilities?  Or does it do something to the input to allow it to be matched to a token?
>>> Thanks for the help!
>>> Nathan

List: http://www.antlr.org/mailman/listinfo/antlr-interest