[antlr-interest] lookahead DFA too big? (resolved, I think)

Mon Mar 9 08:30:15 PDT 2009

I have to correct myself: I previously said that the generated DFAs were 
too big. Some were indeed exceeding the 64k limit, but I could fix that 
by splitting up some rules (as suggested). The class-size problem (there 
were 5000 lookahead DFAs with long initialization strings) I could also 
fix by just splitting up the grammar and importing the parts.

But, the actual surprising thing: the rule where I expected the 
biggest/showstopper DFA (the rule that includes all the 800 keywords) 
does not use anything complicated, just a handful of if-then-else. 
Without setting the -Xconversiontimeout to high values (20 seconds) 
ANTLR would timeout in exactly this rule, letting me think that indeed a 
huge prediction DFA is produced there. It seems that the DFA computation 
algorithm tries to determine if it needs a DFA in the first place, and 
it just takes a lot of time until it knows. Having ANTLR compute a 
result faster would be a nice-to-have, but at least it works now, 
despite the 800 keywords :-)

Anyway, I hope others can now avoid this pitfall.

Thank you for your patience and your suggestions!
Andreas

Terence Parr schrieb:
> The problem is that the DFA it creates is just too doggone big.  
> Coincidentally, I have a graduate student who plans on working on DFA  
> minimization; of course that won't help if the NFA to DFA conversion  
> times out because the minimization comes afterwards. If you get to the  
> code generation phase, however, the DFA minimization should help.
>
> Sorry I don't have a good answer for you right now. One thing that you  
> might try is separating the lexer into multiple pieces and then using  
> an import grammar statement. This forces the lexer to be broken up  
> into multiple DFA I believe, one per included lexer.
> Ter
>
> On Mar 6, 2009, at 11:38 AM, Andreas Meyer wrote:
>
>   
>> Terence Parr schrieb:
>>     
>>> Hi Andreas, I'm neck deep in another problem at the moment. Can you
>>> try increasing the timeout?
>>>
>>>  -Xconversiontimeout t  set NFA conversion timeout for each decision
>>>
>>> try 100000 or something liek that.
>>>       
>> Yes, I already did that. Sorry if my initial mail two days back was  
>> too
>> verbose, probably you did not see it :-) The result of running ANTLR  
>> was
>> a huge java file, with lots of definitions for follow-sets (many
>> thousands) and dfa-transitions. The follow-sets issue I resolved  
>> with a
>> perl script (by moving them into a set of static initializer  
>> functions),
>> but the dfa-transitions look a little tricky to move around with perl,
>> so javac still cannot compile the parser.
>>
>> If you think it's feasible to change the corresponding string  
>> templates
>> for a string template newbie (me), I would really appreciate if you
>> could tell me which parts I need to take care of.
>>
>>
>> Thanks a lot!
>> Andreas Meyer
>>
>> List: http://www.antlr.org/mailman/listinfo/antlr-interest
>> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>>     
>
>
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
>