[antlr-interest] ANTLRv3.g problem
Kenneth Domino
kenneth.domino at domemtech.com
Tue Mar 11 10:59:25 PDT 2008
FYI, there seems to be a problem with the grammar ANTLRv3.g
(http://fisheye2.cenqua.com/browse/antlr-examples/java/ANTLR/ANTLRv3.g?r=4288)
on the following legal input (use ANTLRWorks to see the bogus
token created for the action):
grammar xx;
a : { bar("}"); } ;
I debugged the program, and it appears that the problem
is related to string literals nested inside actions. The relevant rules are:
fragment
NESTED_ACTION :
    '{'
    ( options {greedy=false; k=3;}
    : NESTED_ACTION
    | SL_COMMENT
    | ML_COMMENT
    | ACTION_STRING_LITERAL
    | ACTION_CHAR_LITERAL
    | .
    )*
    '}'
    {$channel = DEFAULT_TOKEN_CHANNEL;}
    ;
fragment
ACTION_STRING_LITERAL
    : '"' (ACTION_ESC|~('\\'|'"'))+ '"'
    ;
It seems that the lexer should jump into an
automaton for recognizing strings once it finds a double
quote, but it doesn't.
Instead, after seeing the first double quote, it does a
lookahead (LA) for a "}", as if it wants to match the DOT alternative.
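To make the expected behavior concrete, here is a minimal hand-written sketch in Python of how an action scanner could treat quoted literals as opaque units while balancing braces. It is not the code ANTLR generates and not the grammar's actual DFA; the function name match_action is hypothetical, and comment handling (SL_COMMENT, ML_COMMENT) is left out for brevity.

```python
def match_action(text, start=0):
    """Return the index just past the '}' closing the action that opens at text[start].

    Hypothetical helper for illustration only; it ignores comments.
    """
    if text[start] != "{":
        raise ValueError("action must start with '{'")
    depth = 0
    i = start
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            # Skip a quoted literal as one unit, honoring backslash escapes,
            # so a brace inside the literal never affects the nesting depth.
            quote = ch
            i += 1
            while i < len(text) and text[i] != quote:
                i += 2 if text[i] == "\\" else 1
            i += 1  # step past the closing quote
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unterminated action")


source = 'a : { bar("}"); } ;'
begin = source.index("{")
end = match_action(source, begin)
action = source[begin:end]
print(action)
```

On the input from this report, the sketch returns the whole action, including the "}" inside the string, which is what one would expect the NESTED_ACTION token to cover.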
Also, it seems that ACTION_STRING_LITERAL
is not totally correct because of the positive closure ("+"),
which would seem to preclude an empty string (i.e., "").
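The effect of "+" versus "*" can be checked with an approximate regular-expression model of the rule. This is only a sketch using Python's re module: the ESC pattern (a backslash followed by any character) is an assumed simplification of ACTION_ESC, and the names plus_form and star_form are made up for the illustration.

```python
import re

ESC = r'\\.'       # assumed stand-in for ACTION_ESC: backslash plus one character
BODY = r'[^\\"]'   # mirrors ~('\\'|'"'): anything except backslash or double quote

# The rule as quoted above, with "+" (one or more body elements).
plus_form = re.compile('"(?:' + ESC + '|' + BODY + ')+"')
# The same rule with "*" (zero or more body elements).
star_form = re.compile('"(?:' + ESC + '|' + BODY + ')*"')

empty = '""'
plus_accepts_empty = plus_form.fullmatch(empty) is not None
star_accepts_empty = star_form.fullmatch(empty) is not None
print(plus_accepts_empty, star_accepts_empty)
```

Under this model the "+" form rejects the empty string literal while the "*" form accepts it; both accept non-empty strings such as "}".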
The rules differ slightly from
what is found in "antlr.g" in org/antlr/tool/, which is used
by the ANTLR tool, but not by much. So I'm wondering
if there is a problem with the DFA construction.
I found this problem while using the
ANTLRv3.g grammar for a pretty printer, which is now
starting to work pretty well!
Say, is there a switch in ANTLR to dump all the states and
transitions so I can figure this out?
Ken Domino