[antlr-interest] setting k Value Versus Predicates

Tue Feb 16 02:06:46 PST 2010

Gokul--

I don't actually know that ANTLR generates slower code than flex/bison; I expect that it does, but that is just an educated guess.

Ter's current implementation does follow the logic you state, although I would recast it as
a)  Match tokens in order of appearance until a decision point is reached.
b)  At each decision point, use a deterministic finite automaton that matches tokens in order of appearance until there is only one correct alternative.
c.) Rewind; select the correct alternative
d.) Exit if the alternative is EOF; otherwise, goto a.

For LL(1) decisions, b) is replaced by a simple if statement.

There is an algorithmic improvement that can be made:  replace the DFA in b with a pattern matcher that tests the minimum number of tokens needed to make the decision; these tokens would be tested out of order--instead of 1 2 3 4 5, for example, the order might be 1 5 2 for the longest comparison.  Since most decisions are LL(1), this would not have as much impact as you might think.

One of the big strengths of ANTLR (dating back to PCCTS) is that it generates human readable code that can be used for debugging (although walking the DFAs can be painful).  My impression is that most of the targets support the ANTLRWorks debugger, and that the variation is just a matter of whether a given target developer has kept his target in sync with ANTLRWorks or not at any given time.  Java does have "most favored language" status, but that is only because the target developers have to follow the Java changes.  I don't use ANTLRWorks, so I can't say much more than this.

--Loring

>
>From: Gokulakannan Somasundaram <gokul007 at gmail.com>
>To: Loring Craymer <lgcraymer at yahoo.com>
>Sent: Mon, February 15, 2010 11:15:44 PM
>Subject: Re: [antlr-interest] setting k Value Versus Predicates
>
>Thanks Loring, i understand what you are saying. I understand that
>LL(1) parser produced by ANTLR is slightly slower than LALR(1) produced
>by flex/bison. I am just curious to know, what needs to be done at the
>LL(1) parser level for ANTLR to catch up with flex/Bison. Because i see
>the logic in ANTLR parsing is more like
>>a) get a cache of next n tokens
>>b) Based on switch case/predicates, follow the grammar.
>
>>What else can be done further to improve the performance? I am asking
>this only for understanding purposes and i get your point on
>concentrating on the big picture.
>
>>As far as the productivity part goes, i feel ANTLR helps a lot for
>anyone who creates the lexer/parser completely in Java. For a person,
>who chooses to take a different target language, he has to spend some
>effort in creating the Java target also, solely for the purpose of
>grammar debugging. Correct me, if i am wrong here.
>
>>Once again, thanks for the explanation
>
>>Thanks,
>>Gokul.
>
>P.S. : I don't know, whether you mistakenly avoided the group in the reply. I just maintained that.
>
>
>On Tue, Feb 16, 2010 at 11:30 AM, Loring Craymer <lgcraymer at yahoo.com> wrote:
>
>Gokul--
>>
>>You are not asking dumb questions, but you are asking the wrong questions.  I would expect a flex/bison generated lexer/parser to run (slightly) faster than its ANTLR counterpart, but that is because flex and bison were written to be open source versions of lex and yacc and to recognize .l and .y files (at least to start with).  That left the developers a small degree of freedom to add features, but the real opportunity was in the speed of generated lexers and parsers and developers have maniacally focused on performance enhancement.
>>
>>ANTLR, on the other hand, evolved to be a tool developers tool and has broader design considerations.  Performance is a consideration, but not the only one.  Once you have a lex/yacc front end, you have exhausted their
>> capabilities, and you still have to insert action code and debug the grammars.  Those steps are much easier with ANTLR, and you still have automated tree construction, tree grammars, and StringTemplate to help with the remaining phases of tool construction.
>>
>>The urge to premature optimization is a disease that needs to be carefully controlled.  If you focus on optimizing each piece of a system in isolation while ignoring the system as a whole, invariably system performance suffers.  Look at the big picture first, then focus on what you see to be essential details.
>>
>>--Loring
>>
>>
>