[antlr-interest] lookahead DFA too big?
Jim Idle
jimi at temporal-wave.com
Thu Mar 5 07:27:20 PST 2009
Andreas Meyer wrote:
> Maybe it's possible to partition the set of keywords, but that would be
> some effort: figuring out for 800 keywords, where they appear, what is
> the context they are used in etc. Note that the problem only appeared
> after switching to ANTLR 3.1, ANTLR 2.7 was fine with it and the
> generated parser works well.
>
Andreas,
What language are you trying to parse? I think that you have taken the
wrong approach perhaps but it is a little difficult to advise without
knowing what you are trying to parse. Basically, what Loring is saying
is that you need to step back a bit. Unfortunately, when there are this
many keywords and they can all be identifiers too, there will have to be
some classification, just as in English there are verbs, nouns, and so on.
Perhaps what you need is something like a set of explicit tokens for
hard and fast keywords, then check ID at various other points. A hybrid
approach in other words. Is the current grammar something you can email
off-line rather than to the whole group? I suspect that the original
parser for this language probably used a scannerless parser, or rather,
a hand crafted program that was neither scanner nor parser in a 'real'
sense.
> On some power-point presentation from Terence Parr, I have seen a slide
> mentioning huge generated DFAs. Is there some additional material
> available that documents some of the possible situations that make the
> DFAs explode?
>
NFA and DFA are wider concepts than ANTLR of course - you will find
plenty of literature on this. As to ANTLR itself, then Ter is always
willing to look at fixes/optimizations/etc but we need to provide some
sort of concrete example. In many cases it is a matter of design and not
bugs. When the DFA explodes because of dangling optional clauses in say
a statement, it is often because there is no definitive end to a
statement, such as Newline or ';' and so on.
With a look at your grammar, I can probably help you.
Jim
More information about the antlr-interest
mailing list