[antlr-interest] lexical nondeterminism
Bryan Ewbank
ewbank at gmail.com
Sat Feb 19 09:35:53 PST 2005
The problem is linear approximate lookahead:
Linear Approximate Lookahead
An approximation to full lookahead (that can be applied to both LL and
LR parsers) for k>1 that reduces the complexity of storing and testing
lookahead from O(n^k) to O(nk); exponential to linear reduction. When
linear approximate lookahead is insufficient (results in a
nondeterministic parser), you can use the approximate lookahead to
attenuate the cost of building the full decision.
-- p9, antlrman.pdf
This means that the lookahead sets are collaped in such a way that
ambiguity shows up. There's a few more paragraphs in the manual that
should help to explain it.
The easiest solution I've found (there are probably others) is to
accept an identifier ([a-z][a-z][a-z]), and than use a lookup table to
grab the ones I want. The other solution I've used is to use flex for
token generation.
On Sun, 20 Feb 2005 00:19:56 +0800, Rice Yeh <riceyeh at gmail.com> wrote:
> Hi,
> I have the following lexical rules where the longest string
> literial is 3 characters. and i have lookahead set to 5 (actually 3 is
> enough for this case), But why there is nondeterminism with message as
> follows:
>
> lexical nondeterminism between alts 1 and 2 of block upon
> k==1:'F','M','S' k==2:'A','E','O','U' k==3:'N','T'
> k==4:<end-of-token>,'-'
>
> MONTH_OF_YEAR:
> ("JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" | "JUL" | "AUG" |
> "SEP" | "OCT" | "NOV" | "DEC")
> ;
>
> DAY_OF_WEEK:
> ("SUN" | "MON" | "TUE" | "WEN" | "THU" | "FRI" | "SAT")
> ;
>
> VALUE
> : DAY_OF_WEEK | MONTH_OF_YEAR
> ;
>
> Regards,
> Rice
>
More information about the antlr-interest
mailing list