[antlr-interest] setting k Value Versus Predicates

Tue Feb 16 03:15:06 PST 2010

>
> There is an algorithmic improvement that can be made:  replace the DFA in b
> with a pattern matcher that tests the minimum number of tokens needed to
> make the decision; these tokens would be tested out of order--instead of 1 2
> 3 4 5, for example, the order might be 1 5 2 for the longest comparison.
> Since most decisions are LL(1), this would not have as much impact as you
> might think.
>

Hmmm.... I am finding it difficult to comprehend. My head is spinning
slightly :). I will try to read it again after sometime.

 ANTLRWorks doesn't support anything other than java for debugging. I just
tested it. I use the ANTLR grammar supplied with the examples to convert my
C grammar into Java grammar, for my debugging purposes.

Well, i had a small observation with the ANTLR generated code (and of-course
from the performance perspective). Forgive me, if i am not speaking in
parsing terminology. Consider a grammar like this

grammar Simple;
A     :     'a' | 'A';
B    :     'b' | 'B';
C    :     'c' | 'C';
rule1    :    A
        ( B A
          | A C
         )
         ;
rule2    :    A C B
        | rule1
        ;

Here, if my starting rule is rule2, then i find two DFAs. one for rule2 and
one for rule1 in order to resolve one ambiguity (  the ambiguity that might
occur with the first token ). I understand the idea is that any rule can be
made as a first rule. (If we seperate the DFA from the actual matching, then
each rule will have two functions and rule2 can decide to invoke the
matching function directly with the alternate prediction ) Imagine if the
tokens are embedded deep inside rules, then there are multiple DFAs, that
get evaluated for the same ambiguity of the first token. I just feel that
this is an area, where some optimization can be made. If this multiple DFA
evaluation for one ambiguity problem is resolved, i feel that LL(k) would be
more efficient ( of course from the performance angle :) ).

Some kind of path prediction for one rule from a different rule, if
incorporated, would reduce the un-necessary DFA executions during runtime.
Please consider this as a suggestion from a person, who has very less
experince in parser world.

Thanks,
Gokul.