[antlr-interest] MySQL's YACC grammar to ANTLR: could not even do k=1 for decision

Mark Wright markwright at internode.on.net
Wed Jun 30 04:48:39 PDT 2010


On Wed, Jun 30, 2010 at 07:20:11AM -0400, David Maier wrote:
> Hi,
> 
> 
> I am currently migrating the MySQL grammar from YACC to ANTLR. I am now hitting the problem to get a lot of error messages like:
> 
> " internal error: org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:1279): could not even do k=1 for decision <n>"
> 
> So I tried the following workarounds:
> 
> * Increased the timeout during the generation of the Parser's Java code: This did not solve the problem and ended up in out of memory errors

Hi David,

Often with larger grammars it is necessary to increse the timeout and
to increase the Java VM heap size.

I am not sure about the other problems.

Just to state the obvious though: since Yacc will happilly accept
left recursive grammars, while as ANTLR will not, it is necessary
to remove any left recursion from the grammar.  A good description
of how to do that is on p.129 of the book "Modern Compiler Design"
by Grune, et. al.  There is also a description in the dragon book,
I like the description in Modern Compiler Design better for this
task though.  Presumably there are other descriptions for
left recursion removal.

Regards, Mark

> * Did set k=1, which helps to avoid the error message but introduces limitations regarding the look ahead and so causes new problems
> 
> So I think the right solution will be to remove ambiguous rules. I already began to remove 'Epsilon' rules and I can imagine that there are other ambiguous rules in the grammar as well. The problem I have is to find them easily. So the error message above contains the decision number <n>. Is it easily possible to find the rule in my grammar's source code which is related to this decision number <n>? How to do that.
> 
> Another idea would be to implement an algorithm which starts at a single token definition and creates a tree by substituting the rules back. The idea is that a rule is ambiguous if it is part of 2 different paths in this tree. So for instance:
> 
> rule1: rule2 | rule3;
> rule2: TOKEN1;
> rule3: rule4 | rule5;  
> rule4: TOKEN1;
> rule5: TOKEN2;
> 
> would result in the following tree for TOKEN1:
> 
> TOKEN1 -> rule2 -> rule1;
>        -> rul4 -> rule3 -> rule1;
> 
> I am wondering if there is already such a tool available to find ambiguous rules, is there?
> 
> BTW: Future information can be found here: http://community.ingres.com/wiki/Ingres_Migration_Tool_Set_Idiom_MySQL
> 
> Advise would be highly appreciated! Thanks in advance for your help!
> 
> 
> Regards, David
> 
> 
> --
> David Maier
> Senior Software Engineer
> 
> Ingres Germany GmbH
> Weimarer Straße 1a
> D-98693 Ilmenau
> 
> PHONE:  +49.3677.6785-59
> FAX:    +49.3677.6785-23
> MAIL:   david.maier at ingres.com
> 
> This transmission is confidential and intended solely for the use of the recipient named above. It may contain confidential, proprietary, or legally privileged information. If you are not the intended recipient, you are hereby notified that any unauthorized review, use, disclosure or distribution is strictly prohibited. If you have received this transmission in error, please contact the sender by reply e-mail and delete the original transmission and all copies from your system.
> 
> 
> List: http://www.antlr.org/mailman/listinfo/antlr-interest
> Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> 


More information about the antlr-interest mailing list