[antlr-interest] ANTLRv3.g problem

Tue Mar 11 10:59:25 PDT 2008

FYI, there seems to be a problem with the grammar ANTLRv3.g
(http://fisheye2.cenqua.com/browse/antlr-examples/java/ANTLR/ANTLRv3.g?r=4288)
on the following legal input (use ANTLRWorks to see the bogus
token created for the action):

grammar xx;
a : { bar("}"); } ;

I debugged the program, and the problem appears to be related
to string literals nested inside actions. The relevant rules are:

fragment
NESTED_ACTION :
 '{'
 ( options {greedy=false; k=3;}
 : NESTED_ACTION
 | SL_COMMENT
 | ML_COMMENT
 | ACTION_STRING_LITERAL
 | ACTION_CHAR_LITERAL
 | .
 )*
 '}'
 {$channel = DEFAULT_TOKEN_CHANNEL;}
   ;

fragment
ACTION_STRING_LITERAL
 : '"' (ACTION_ESC|~('\\'|'"'))+ '"'
 ;

I would expect the lexer, on seeing a double quote, to enter
the automaton that recognizes string literals, but it doesn't.
Instead, after the first double quote, it does an LA for
a "}", as if it wanted to match the DOT alternative.
Also, ACTION_STRING_LITERAL does not look entirely correct
because of the positive closure ("+"): it requires at least
one character between the quotes, which would seem to
preclude an empty string (e.g., "").
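
To make the expected behavior concrete, here is a minimal hand-written
sketch in Java of the scanning I would expect. It is not ANTLR-generated
code, and the names ActionScanner, scanAction and scanQuoted are made up
for illustration. It skips comment handling (SL_COMMENT, ML_COMMENT) and,
as a simplification, treats single-quoted literals the same way as
double-quoted ones. The point is only that a quote switches the scanner
into "string mode", so a "}" inside the string cannot close the action,
and that zero characters between the quotes is accepted.

```java
final class ActionScanner {

    /**
     * Scans a brace-delimited action beginning at s.charAt(start) == '{'.
     * Returns the index just past the matching '}', or -1 if the action
     * (or a literal inside it) is not terminated.
     */
    static int scanAction(String s, int start) {
        if (start >= s.length() || s.charAt(start) != '{') {
            return -1;
        }
        int i = start + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '}') {
                return i + 1;               // end of this action
            }
            if (c == '{') {
                i = scanAction(s, i);       // nested action
            } else if (c == '"' || c == '\'') {
                i = scanQuoted(s, i);       // literal: braces inside are ignored
            } else {
                i++;                        // any other character
            }
            if (i < 0) {
                return -1;                  // propagate "unterminated"
            }
        }
        return -1;                          // ran out of input before '}'
    }

    /**
     * Scans a quoted literal beginning at the quote character at 'start'.
     * Accepts zero or more characters (so "" is fine) and backslash escapes.
     * Returns the index just past the closing quote, or -1 if unterminated.
     */
    static int scanQuoted(String s, int start) {
        char quote = s.charAt(start);
        int i = start + 1;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\') {
                i += 2;                     // skip the escaped character
            } else if (c == quote) {
                return i + 1;
            } else {
                i++;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String action = "{ bar(\"}\"); }";
        // The whole action should be consumed as one unit.
        System.out.println(scanAction(action, 0) == action.length());
    }
}
```

In grammar terms, the second point corresponds to using "*" rather
than "+" for the closure inside ACTION_STRING_LITERAL.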

These rules differ slightly from those in "antlr.g" in
org/antlr/tool/, which is used by the ANTLR tool itself,
but not by much.  So, I'm wondering whether there is
a problem with the DFA construction.

I found this problem while using the
ANTLRv3.g grammar for a pretty printer, which is now
starting to work pretty well!

Say, is there a switch in ANTLR to dump all the states and
transitions so I can figure this out?

Ken Domino