[antlr-interest] MySQL's YACC grammar to ANTLR: could not even do k=1 for decision

Mark Wright markwright at internode.on.net
Wed Jun 30 05:41:37 PDT 2010


On Wed, Jun 30, 2010 at 08:01:10AM -0400, David Maier wrote:
> Hi Mark,
> 
> thanks for your quick reply. I think that I already removed the left
> recursive rules by replacing them with iterations (Kleene operator
> '*') if possible or by using ANTLRWork's refactoring. However, it
> may be that there are still some Multi Line Recursive rules
> left. Further details are available here:
> http://community.ingres.com/wiki/Ingres_Migration_Tool_Set_Idiom_MySQL#MySQL_grammar_from_YACC_to_ANTLR
  .
> 
> I also already increased the timeout to 2 minutes and assigned 1GB
> of RAM to the JVM in which the code generation happens. The result
> was an Out Of Memory message. So I am not sure but maybe there is as
> you pointed out still a Multi Line left recursion in my grammar
> which causes that the Look Ahead DFA creation comes never to an
> end. Could that be?
> 
> Regards, David

Hi David,

Having a quick look at the grammar, it looks like you are/were
having problems with the expressions.  The recipe to convert those
is desribed on p. 266 of The Definitive ANTLR Reference by
Terence Parr.  Basically it says you have to explicitly encode
the expression priorities by creating rules for them, as you
see in any ANTLR grammar that handles expressions.

And that yacc would instead deal with them by setting the
priority on the expression operators.

I am not really sure why it runs out of memory or why you get
the error:

" internal error: org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:1279): could not even do k=1 for decision <n>"

so I'm kind of guessing.

Thanks, Mark
 
> --
> David Maier
> Senior Software Engineer
> 
> Ingres Germany GmbH
> Weimarer Straße 1a
> D-98693 Ilmenau
> 
> PHONE:  +49.3677.6785-59
> FAX:    +49.3677.6785-23
> MAIL:   david.maier at ingres.com
> 
> This transmission is confidential and intended solely for the use of the recipient named above. It may contain confidential, proprietary, or legally privileged information. If you are not the intended recipient, you are hereby notified that any unauthorized review, use, disclosure or distribution is strictly prohibited. If you have received this transmission in error, please contact the sender by reply e-mail and delete the original transmission and all copies from your system.
> 
> 
> 
> -----Ursprüngliche Nachricht-----
> Von: antlr-interest-bounces at antlr.org im Auftrag von Mark Wright
> Gesendet: Mi 30.06.2010 13:48
> An: antlr-interest at antlr.org
> Betreff: Re: [antlr-interest] MySQL's YACC grammar to ANTLR: could not even do k=1 for decision
>  
> On Wed, Jun 30, 2010 at 07:20:11AM -0400, David Maier wrote:
> > Hi,
> > 
> > 
> > I am currently migrating the MySQL grammar from YACC to ANTLR. I am now hitting the problem to get a lot of error messages like:
> > 
> > " internal error: org.antlr.tool.Grammar.createLookaheadDFA(Grammar.java:1279): could not even do k=1 for decision <n>"
> > 
> > So I tried the following workarounds:
> > 
> > * Increased the timeout during the generation of the Parser's Java code: This did not solve the problem and ended up in out of memory errors
> 
> Hi David,
> 
> Often with larger grammars it is necessary to increse the timeout and
> to increase the Java VM heap size.
> 
> I am not sure about the other problems.
> 
> Just to state the obvious though: since Yacc will happilly accept
> left recursive grammars, while as ANTLR will not, it is necessary
> to remove any left recursion from the grammar.  A good description
> of how to do that is on p.129 of the book "Modern Compiler Design"
> by Grune, et. al.  There is also a description in the dragon book,
> I like the description in Modern Compiler Design better for this
> task though.  Presumably there are other descriptions for
> left recursion removal.
> 
> Regards, Mark
> 
> > * Did set k=1, which helps to avoid the error message but introduces limitations regarding the look ahead and so causes new problems
> > 
> > So I think the right solution will be to remove ambiguous rules. I already began to remove 'Epsilon' rules and I can imagine that there are other ambiguous rules in the grammar as well. The problem I have is to find them easily. So the error message above contains the decision number <n>. Is it easily possible to find the rule in my grammar's source code which is related to this decision number <n>? How to do that.
> > 
> > Another idea would be to implement an algorithm which starts at a single token definition and creates a tree by substituting the rules back. The idea is that a rule is ambiguous if it is part of 2 different paths in this tree. So for instance:
> > 
> > rule1: rule2 | rule3;
> > rule2: TOKEN1;
> > rule3: rule4 | rule5;  
> > rule4: TOKEN1;
> > rule5: TOKEN2;
> > 
> > would result in the following tree for TOKEN1:
> > 
> > TOKEN1 -> rule2 -> rule1;
> >        -> rul4 -> rule3 -> rule1;
> > 
> > I am wondering if there is already such a tool available to find ambiguous rules, is there?
> > 
> > BTW: Future information can be found here: http://community.ingres.com/wiki/Ingres_Migration_Tool_Set_Idiom_MySQL
> > 
> > Advise would be highly appreciated! Thanks in advance for your help!
> > 
> > 
> > Regards, David
> > 
> > 
> > --
> > David Maier
> > Senior Software Engineer
> > 
> > Ingres Germany GmbH
> > Weimarer Straße 1a
> > D-98693 Ilmenau
> > 
> > PHONE:  +49.3677.6785-59
> > FAX:    +49.3677.6785-23
> > MAIL:   david.maier at ingres.com
> > 
> > This transmission is confidential and intended solely for the use of the recipient named above. It may contain confidential, proprietary, or legally privileged information. If you are not the intended recipient, you are hereby notified that any unauthorized review, use, disclosure or distribution is strictly prohibited. If you have received this transmission in error, please contact the sender by reply e-mail and delete the original transmission and all copies from your system.
> > 
> > 
> > List: http://www.antlr.org/mailman/listinfo/antlr-interest
> > Unsubscribe: http://www.antlr.org/mailman/options/antlr-interest/your-email-address
> > 
> 


More information about the antlr-interest mailing list