[antlr-interest] Compiler-exploding grammar
Terence Parr
parrt at cs.usfca.edu
Wed Nov 28 17:03:07 PST 2007
Hi. Try 3.1 daily build (only java at moment). I think i fixed
something in that area.
Ter
On Nov 28, 2007, at 4:39 PM, Steve Bennett wrote:
> On 11/29/07, Andy Tripp <antlr at jazillian.com> wrote:
>> I don't follow. Are you saying that ANTLR generated a single line
>> that
>> was 14000 characters
>> wide, and javac choked on it? If so, that sounds like a javac bug
>> to me.
>
> No, it's more than a single line and apparently more than 14000
> characters wide, too. My editor shows about 30 lines in a row at least
> that long, but it truncates them at that 14384 characters.
>
> Looking into it a bit, it looks like some sort of combinatorial
> expansion gone wrong caused by the combination of two parser rules.
> Rule #1:
>
> close_bold_italics
> @after {text_bold=false; text_italics = false;}
> :
> {text_bold==true && text_italics==true}? => -> B_OFF I_OFF
> |{text_bold==false && text_italics==true}? => -> I_OFF
> |{text_bold==true && text_italics==false}? => -> B_OFF
>
> ;
>
> bold_and_italics:
> {textis("''") && text_italics}? => APOSTROPHES
> {text_italics=false;} -> ^(I_OFF)
> |{textis("''") && !text_italics}? => APOSTROPHES
> {text_italics=true;} -> ^(I_ON)
> |{textis("'''") && text_bold}? => APOSTROPHES
> {text_bold=false;} -> ^(B_OFF)
> |{textis("'''") && !text_bold}? => APOSTROPHES {text_bold=true;}
> -> ^(B_ON)
> |{textis("''''") && text_bold}? => APOSTROPHES
> {text_bold=false;} -> APOSTROPHE ^(B_OFF)
> |{textis("''''") && !text_bold}? => APOSTROPHES {text_bold=true;}
> -> APOSTROPHE ^(B_ON)
> |{textis("'''''") && text_bold && text_italics}? => APOSTROPHES
> {text_bold=false; text_italics=false; } -> ^(B_OFF) ^(I_OFF)
> |{textis("'''''") && text_bold && !text_italics}? => APOSTROPHES
> {text_bold=false; text_italics=true; } -> ^(B_OFF) ^(I_ON)
> |{textis("'''''") && !text_bold && text_italics}? => APOSTROPHES
> {text_bold=true; text_italics=false; } -> ^(B_ON) ^(I_OFF)
> |{textis("'''''") && !text_bold && !text_italics}? =>APOSTROPHES
> {text_bold=true; text_italics=true; } -> ^(B_ON) ^(I_ON)
> ;
>
> The resulting mess:
>
> (text_bold==false && text_italics==true&&textis("''''") && !text_bold)
> ||(text_bold==true && text_italics==false&&textis("'''") && text_bold)
> ||(text_bold==true && text_italics==false&&textis("'''''") &&
> text_bold && !text_italics)
> ||(text_bold==true && text_italics==true&&textis("''''") && !
> text_bold)
> ||(text_bold==true && text_italics==true&&textis("'''''") &&
> !text_bold && !text_italics)
> ...
>
> going through *all* the combinations, even the nonsensical ones:
> ...
> ||(text_bold==true && text_italics==true&&textis("'''") && !text_bold)
> ...
>
> Even so, I don't really get why it explodes quite so massively. I also
> can't really say why these two rules are even joined in this cartesian
> product like that. The only thing they really have in common is that
> they produce the same AST node type.
>
> Steve
More information about the antlr-interest
mailing list