[antlr-interest] lexical nondeterminism

Bryan Ewbank ewbank at gmail.com
Sat Feb 19 09:35:53 PST 2005


The problem is linear approximate lookahead:

Linear Approximate Lookahead
An approximation to full lookahead (that can be applied to both LL and
LR parsers) for k>1 that reduces the complexity of storing and testing
lookahead from O(n^k) to O(nk); exponential to linear reduction. When
linear approximate lookahead is insufficient (results in a
nondeterministic parser), you can use the approximate lookahead to
attenuate the cost of building the full decision.
-- p9, antlrman.pdf

This means that the lookahead sets are collaped in such a way that
ambiguity shows up.  There's a few more paragraphs in the manual that
should help to explain it.

The easiest solution I've found (there are probably others) is to
accept an identifier ([a-z][a-z][a-z]), and than use a lookup table to
grab the ones I want.  The other solution I've used is to use flex for
token generation.

On Sun, 20 Feb 2005 00:19:56 +0800, Rice Yeh <riceyeh at gmail.com> wrote:
> Hi,
>   I have the following lexical rules where the longest string
> literial is 3 characters. and i have lookahead set to 5 (actually 3 is
> enough for this case), But why there is nondeterminism with message as
>  follows:
> 
> lexical nondeterminism between alts 1 and 2 of block upon
> k==1:'F','M','S' k==2:'A','E','O','U' k==3:'N','T'
> k==4:<end-of-token>,'-'
> 
> MONTH_OF_YEAR:
>         ("JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" | "JUL" | "AUG" |
> "SEP" | "OCT" | "NOV" | "DEC")
>         ;
> 
> DAY_OF_WEEK:
>         ("SUN" | "MON" | "TUE" | "WEN" | "THU" | "FRI" | "SAT")
>         ;
> 
> VALUE
>         :       DAY_OF_WEEK | MONTH_OF_YEAR
>         ;
> 
> Regards,
> Rice
>


More information about the antlr-interest mailing list