[antlr-interest] Ambiguous resolution with | and + operator

Kirby Bohling kirby.bohling at gmail.com
Fri Oct 2 11:49:02 PDT 2009


I'm working on parsing an existing markup language in which certain
characters act as delimiters only in special cases.  I've trimmed the
grammar down to the example below.  Different subsets of the markup are
special at different points: in this example ':' and '=' are both
special in some contexts and not in others.  Text and white space are
tokenized, but are rarely special.

grammar t;

options { output=AST; }

tokens { INLINE_TEXT; }

inline_text: ((safe_text) => safe_text)+ -> ^(INLINE_TEXT safe_text+);
safe_text: (TEXT|WHITESPACE);

inline_text_with_colon:
(inline_text_with_colon_rewrite_dummy)
-> ^(INLINE_TEXT inline_text_with_colon_rewrite_dummy);

inline_text_with_colon_bad:
((inline_text_with_colon_rewrite_dummy) =>
inline_text_with_colon_rewrite_dummy)
-> ^(INLINE_TEXT inline_text_with_colon_rewrite_dummy);

inline_text_with_colon_rewrite_dummy: (safe_text|COLON)+;
foo: (inline_text|EQUAL_SIGN)+;
bar:	(inline_text_with_colon|EQUAL_SIGN)+ ;
bar_bad:	(inline_text_with_colon_bad|EQUAL_SIGN)+;

TEXT:	'A'..'Z' | 'a'..'z';
WHITESPACE:	 ' ';
EQUAL_SIGN: '=';
COLON: ':';

Using ANTLR 3.1.3 with the above grammar, ANTLR reports that
"inline_text_with_colon_bad" has an ambiguous decision.  ANTLRWorks
1.3 clearly shows the issue, but I don't understand why it is
happening.  There are two cases: if a ':' is seen just prior to
safe_text, or just after exiting safe_text, it claims the decision is
ambiguous.  I thought that the greedy nature of the loop would keep it
consuming, so that "exiting" the loop would not be an option.  Clearly
the DFA does not agree with my interpretation.
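
To make concrete which alternatives I mean, here is a small Python
sketch.  It is purely illustrative and is not ANTLR output or ANTLR's
actual analysis; the function name "groupings" is made up for this
post.  It enumerates the ways a run of safe_text/COLON tokens could be
split across iterations of the outer (...)+ loop in bar_bad, where each
iteration is itself the (safe_text|COLON)+ loop of
inline_text_with_colon_rewrite_dummy.  The greedy reading I expected is
the single grouping that puts the whole run into one iteration.

```python
from itertools import combinations


def groupings(tokens):
    """Yield every way to split tokens into consecutive non-empty runs.

    Each run stands for one iteration of the outer (...)+ loop; the
    tokens inside a run are consumed by the inner (...)+ loop.
    """
    n = len(tokens)
    for k in range(n):  # k = number of cut points between tokens
        for cuts in combinations(range(1, n), k):
            bounds = (0, *cuts, n)
            yield [tokens[a:b] for a, b in zip(bounds, bounds[1:])]


run = ["a", ":", "b"]  # TEXT COLON TEXT
all_groupings = list(groupings(run))
greedy = all_groupings[0]  # zero cuts: the whole run in one iteration

for g in all_groupings:
    print(g)
```

Only the first grouping is the one I assumed the greedy loop would
pick; the others all correspond to leaving the inner loop early and
re-entering it from the outer loop.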

I just want to make sure that using a syntactic predicate was the
correct solution.  If you remove the predicate from inline_text, or
inline_text_with_colon, the grammars are ambiguous.  You can see that
inline_text_with_colon_bad is ambiguous, even though it differs from
inline_text_with_colon only in its predicate.

As a side note, is there any way to avoid needing the rewrite dummy?
Every time I try to rewrite/reparent something containing a
parenthesized group with an alternation (|), I get an error.  I have a
lot of dummy rules that serve no purpose other than allowing me to do
the rewrite.

Thanks in advance,
   Kirby

