[antlr-interest] Logic like ~ for parsing

Kirby Bohling kirby.bohling at gmail.com
Sun Mar 7 22:45:22 PST 2010


All,

   I have a grammar that is pretty far along.  But I really need a
rule like this:

dangling_match:
    (non_dangling_a | non_dangling_b) => // Failure to match; get back
                                         // up to the foo level somehow.
    | DANGLING_TOKEN
;

foo:
   (non_dangling_a | non_dangling_b | options_including_dangling_match)+
;

options_including_dangling_match:
   (option_1 | option_2| option_3| option_4| dangling_match)+;

So non_dangling_a, non_dangling_b, and dangling_match all start with
the token DANGLING_TOKEN.  I'd really like dangling_match to match only
in cases where I have a dangling token.  As a concrete example, in C, I
only want the dangling option to match if I have a stray '{', but if
the '{' looks like it's part of a well-formed statement, I don't want
dangling_match to match and consume that input.  I'd like to arrange
for the system to get back up to the "foo" rule, and have the input
consumed there.

If it were a lexer, I think I'd write the rule this way:

dangling_match:
    { input.LA(1) == DANGLING_TOKEN }?
    (~(non_dangling_a|non_dangling_b)) => DANGLING_TOKEN
;

I can't seem to find a way to accomplish this in the parser.  I've
tried marking and resetting the stream, and using dynamic scopes.  The
problem is that everything I've found that compiles will kick me into
an infinite loop.  If it looks like a non_dangling case (the input is
well formed), it won't consume the input, and it never leaves the
"options_including_dangling_match" rule to get back to the "foo" rule,
which would consume the non_dangling input.
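
To make the behavior I'm after concrete, here is a rough hand-written
sketch in Python.  It is not ANTLR output and not a proposed fix for the
grammar, just a model of the parse I want.  The token spellings ("{",
"a", "b", "}", "x", "y") and the names Parser, looks_non_dangling, and
parse are made up for illustration.  The idea is a negative lookahead:
try the non_dangling alternatives speculatively, rewind, and only let
dangling_match consume the token when that trial failed.  When the trial
succeeds, the inner loop exits without consuming anything, so control
returns to foo instead of spinning.

```python
DANGLING_TOKEN = "{"
# Hypothetical well-formed constructs; both start with DANGLING_TOKEN.
NON_DANGLING = {
    "non_dangling_a": ["{", "a", "}"],
    "non_dangling_b": ["{", "b", "}"],
}
OPTIONS = {"x", "y"}  # stand-ins for option_1 .. option_4


class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0
        self.trace = []  # names of the rules that consumed input, in order

    def la(self):
        """Return the next token without consuming it, or None at end of input."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def non_dangling(self):
        """Consume non_dangling_a or non_dangling_b if one is next."""
        for name, seq in NON_DANGLING.items():
            end = self.pos + len(seq)
            if self.tokens[self.pos:end] == seq:
                self.pos = end
                self.trace.append(name)
                return True
        return False

    def looks_non_dangling(self):
        """Speculative match: mark, try non_dangling, then rewind."""
        mark, saved = self.pos, len(self.trace)
        matched = self.non_dangling()
        self.pos = mark
        del self.trace[saved:]
        return matched

    def dangling_match(self):
        """Consume DANGLING_TOKEN only when it does not start a non_dangling."""
        if self.la() == DANGLING_TOKEN and not self.looks_non_dangling():
            self.pos += 1
            self.trace.append("dangling_match")
            return True
        return False

    def options_including_dangling_match(self):
        """Model of (option | dangling_match)+; True if anything was consumed."""
        start = self.pos
        while True:
            if self.la() in OPTIONS:
                self.pos += 1
                self.trace.append("option")
            elif not self.dangling_match():
                break  # nothing here for us; hand control back to foo
        return self.pos > start

    def foo(self):
        """Model of (non_dangling | options_including_dangling_match)+."""
        while self.pos < len(self.tokens):
            if not (self.non_dangling() or self.options_including_dangling_match()):
                raise SyntaxError(f"unexpected token {self.la()!r} at {self.pos}")
        return self.trace


def parse(tokens):
    return Parser(tokens).foo()
```

For the input x { { a } x { this gives the trace option, dangling_match,
non_dangling_a, option, dangling_match: the first and last '{' are
stray and get eaten by dangling_match, while the middle one is left
alone so that foo can match it as non_dangling_a.  What I can't work out
is how to express the "not looks_non_dangling" test in the parser
grammar itself.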

I've tried poking around in the FAQ, but I didn't see anything obvious.

Thanks in advance,
   Kirby

