[antlr-interest] Understanding lookahead
Wincent Colaiuta
win at wincent.com
Wed Jun 6 10:46:56 PDT 2007
I'm trying to understand how ANTLR's lookahead mechanisms work using
this grammar:
grammar Simple;
FOO: BAR ':' BAZ {System.out.println("FOO");};
fragment BAR: 'bar' {System.out.println("BAR");};
fragment BAZ: 'baz' {System.out.println("BAZ");};
EVERYTHING_ELSE: . {System.out.println("EVERYTHING_ELSE");};
thing: .* EOF {System.out.println("done");};
I basically wanted to explore the way lookahead works and what ANTLR
does when its lookahead predictions fail. For example, given the
following inputs:
- "bar:baz": recognizes this as a FOO token
- "bar:ba": predicts FOO and complains about missing "z"
- "bar:b": predicts FOO and complains about missing "a"
- "bar:": predicts FOO and complains about missing "b"
- "bar": predicts FOO and complains about missing ":"
- "ba": predicts FOO and complains about missing "r"
- "b": accepts input as EVERYTHING_ELSE
- "...ba": accepts the periods as EVERYTHING_ELSE, then predicts FOO
and complains about missing "r"
This exercise was very helpful for me in seeing how ANTLR's lookahead
operates: basically, as soon as it has seen enough input to predict the
presence of a particular token ("ba" is enough in this case), it
assumes that it really is that token, scans ahead, and raises an
exception if its expectations aren't met.
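To make that predict-then-commit behaviour concrete, here is a small toy model in plain Java. It is only a sketch of what I observed, not the code ANTLR generates: the class name LookaheadSketch and the tokenize method are made up for illustration, the two-character prediction is hard-coded, and it simply stops at the first mismatch, whereas the real lexer reports the error and tries to carry on.

```java
import java.util.ArrayList;
import java.util.List;

public class LookaheadSketch {
    // Toy model of the observed behaviour; hypothetical, not ANTLR's generated lexer.
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        final String foo = "bar:baz";
        int pos = 0;
        while (pos < input.length()) {
            // Prediction step: two characters of lookahead ("ba") select FOO.
            if (input.startsWith("ba", pos)) {
                // Commit step: match the rest of FOO one character at a time.
                for (int i = 0; i < foo.length(); i++) {
                    int p = pos + i;
                    if (p >= input.length() || input.charAt(p) != foo.charAt(i)) {
                        // Assumption: stop here instead of recovering like ANTLR does.
                        out.add("error: expected '" + foo.charAt(i) + "' at " + p);
                        return out;
                    }
                }
                out.add("FOO");
                pos += foo.length();
            } else {
                // Anything not predicted as FOO is consumed as a single character.
                out.add("EVERYTHING_ELSE");
                pos++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"bar:baz", "bar:ba", "b", "...ba"}) {
            System.out.println("\"" + s + "\" -> " + tokenize(s));
        }
    }
}
```

The point of the sketch is that the decision between FOO and EVERYTHING_ELSE is made once, from the lookahead alone, and there is no backtracking to EVERYTHING_ELSE when the committed match later fails.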
So, one way to get this grammar to handle strings like "...ba"
without throwing exceptions is to use the filter=true option. I'm
curious to know, however, is there any other way?
Cheers,
Wincent