[antlr-interest] packrat vs. LL(*)

Mon Jun 8 12:43:37 PDT 2009

Steven Obua wrote:
>
> On Jun 8, 2009, at 9:05 PM, Jim Idle wrote:
>
>> Steven Obua wrote:
>>> Hi,
>>>
>>> I am currently building a parser with ANTLR; I am really happy about
>>> this great tool!
>>>
>>> I found that the most convenient settings for my purposes are
>>>
>>> backtrack=true
>>> memoize=true
>>> k=1
>>>
>>> when leaving out the "k=1" option, antlr cannot process my grammar, it
>>> runs into a timeout or something like that.
>>>
>>
>> You need to use the extended option -X conversiontimeout 10000
>
> Does not work, I tried that before using 'k=1', it still does not 
> terminate in reasonable time. I even give ANTLR at lot of RAM (12 GB), 
> still doesn't cope.
>
Then I suggest that your grammar needs some serious work.,
>
>>>
>> Well, first of all, global backtracking should only be used if you don't
>> care about error messages or performance. It is sort of a "get the job
>> done" option that is useful for prototyping, proof of concept, or one
>> off conversion programs (where you know the input is valid, you just
>> want to change it to something else).
>>
>
> That is not true.  
I assure you that it is.
> That is why I am referring to packrat parsing, which still guarantees 
> linear time parsing. 
And how are you guaranteeing linear time with global backtrack=true? All 
this really means is that it tries every alt you specify n the order you 
specify, which means that if your particular input is alt 12, it will 
try 11 others first. Now, that isn't quite true because ANTLR is a bit 
smarter than that, but it CAN be true. The parser time will be 
predictable and provable, but will generally spend a lot of time doing 
things it doesn't need to, and when there is a syntax error, it won't 
have much of a clue why.

What I said still applies - if your task doesn't involve a parser that 
will exist for 20 years, then who cares. If your task doesn't require 
good error messages and you don't care about resources (though I think 
you should if you will keep running the thing), then who cares.

But, if you turn on a timeout of 10000 and still cannot get through the 
grammar, then you need to consider fixing up your grammar.

> You can't get better complexity for a parser :-)  
I am not sure what you are trying to say here.
> About the constant factor slowdown I dont care so much, I just need to 
> know that the speed I currently see from my parser scales. 
Speed does not scale. Without more information, I can;t answer that 
question. Do you mean, will the parser rip through a 500K file in 
proportionate time to a 1K file, then that is one thing. If you mean 
will it work on a system when there are 3000 processes doing it 
simultaneously, then that is another (that probably doesn't involve Java 
;-).

> About memory usage I dont care so much either, my application is using 
> way more system resources after the parsing than the parser itself 
> ever will.
Hmmmm.

Jim