[antlr-interest] Parsing with inverse matches

Sun Nov 22 14:39:43 PST 2009

On Sun, Nov 22, 2009 at 4:20 PM, Vipul Delwadia
<vipul.delwadia at gmail.com> wrote:
> Hi,
>
> Suppose I have a very simple grammar:
>
> line:   x;
>
> x       :       STRING+;
>
> fragment BACKSLASH
>        :       '\\';
>
> NOTA:   BACKSLASH A;
>
> A       :       'a';
>
> STRING
>        :       (~(A)|NOTA)+;
>
> Now I want x to be able to match any sequence which doesn't have "a"
> in it, including sequences which have "\a". This works for the most
> part except when I try and match just "\a", at which point I get a
> MismatchedTokenException (or sometimes a NoViableAltException). However,
> in the ANTLRWorks IDE if I parse it using the interpreter starting
> from the STRING rule, it seems to match it just fine.
>
> Any ideas?

If I understand correctly, the '\' is consumed by the ~(A) alternative
of STRING, which leaves the lexer at a bare 'a' that neither alternative
can match.  I'm not sure what the best fix is.  You can use semantic or
syntactic predicates to resolve this kind of ambiguity, though there
are subtle differences between the two that I don't fully understand.
The list archives have earlier discussions that are worth reviewing.

I think something like this would work, but I haven't actually tried it:

((NOTA) => (NOTA) | ~A)+
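
To illustrate the idea, here is a rough Python sketch.  It is not ANTLR
and does not reproduce how ANTLR actually predicts alternatives; it only
models my guess above.  The function name match_string and the flag
check_escape_first are made up for the illustration.  With the flag off,
the loop behaves like a scanner that takes the ~A alternative for the
'\'; with it on, it looks ahead for '\a' first, which is roughly what the
(NOTA) => predicate is meant to force.

```python
def match_string(text, check_escape_first):
    """Return how many leading characters a STRING-like loop consumes.

    Hypothetical model of the (~A | NOTA)+ loop, not ANTLR's algorithm.
    """
    i = 0
    while i < len(text):
        if check_escape_first and text.startswith("\\a", i):
            i += 2      # NOTA alternative: a backslash followed by 'a'
        elif text[i] != "a":
            i += 1      # ~A alternative: any single character except 'a'
        else:
            break       # bare 'a': neither alternative applies, loop exits
    return i


# Input is the two characters backslash, 'a'.
print(match_string("\\a", check_escape_first=False))  # 1: stuck before 'a'
print(match_string("\\a", check_escape_first=True))   # 2: whole input matched
```

If the real lexer behaves like the first call, the leftover 'a' would
explain the exception; the second call shows why checking for NOTA
before ~A should help.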

Kirby

>
> Cheers,
> Vipul