[antlr-interest] Parsing this ambiguous grammar

Gerald Gutierrez gerald.gutierrez at gmail.com
Thu Jan 26 17:48:48 PST 2012


Hello all,

I'm attempting to use ANTLR to parse some text and I've come across an
ambiguity problem.

Essentially, I've got two tokens defined:

ID  :   ('a'..'z' | 'A'..'Z') ('0'..'9' | 'a'..'z' | 'A'..'Z' | ' ')*;

PITCH
    :   (('A'|'a') '#'?)
    |   (('B'|'b') '#'?)
    |   (('C'|'c') '#'?);

Obviously, a single letter such as "A" is ambiguous, since it matches
both ID and PITCH.

I further define:

note    :   PITCH;
name    :   ID;
main    :   name ':' note '\n'?;

One would think that, intuitively, since the main rule says that there
should be a "name" followed by a colon followed by a "note", any name
would match first and, after the colon, any "note" would match.

If I enter "A:A" as input to the parser, I always get an error. The
parser reports that it expected either PITCH or ID, depending on
whether ID or PITCH is defined first.
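
To illustrate what seems to be happening, here is a rough Python model.
It is not ANTLR's actual algorithm: the tokenize helper and the regexes
are my own stand-ins for the rules above, and it assumes the lexer picks
tokens without consulting the parser, preferring the longest match and
breaking ties by definition order.

```python
import re

# Regex stand-ins for the lexer rules (assumed equivalents, not ANTLR syntax).
ID = ("ID", re.compile(r"[A-Za-z][0-9A-Za-z ]*"))
PITCH = ("PITCH", re.compile(r"[AaBbCc]#?"))
COLON = ("COLON", re.compile(r":"))


def tokenize(text, rules):
    """Split text into (type, lexeme) pairs.

    Hypothetical model: the longest match wins, and a tie goes to the
    rule listed earlier. The parser rule `main` is never consulted.
    """
    tokens, pos = [], 0
    while pos < len(text):
        best = None
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Only a strictly longer match replaces the current best,
            # so an earlier rule keeps a tie.
            if m and (best is None or m.end() > best[1].end()):
                best = (name, m)
        if best is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append((best[0], best[1].group()))
        pos = best[1].end()
    return tokens


id_first = tokenize("A:A", [ID, PITCH, COLON])
pitch_first = tokenize("A:A", [PITCH, ID, COLON])
print(id_first)
print(pitch_first)
```

Under these assumptions both "A"s get the same token type whichever
rule comes first, so main never sees an ID followed by a PITCH, which
would be consistent with the errors I get.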

What is the right way to resolve this situation and get the parser to
behave as desired? I'm still trying to come to grips with predicates.
Would their use solve this problem?


Regards,
Gerald.

