[antlr-interest] bad generated code?

Monty Zukowski monty at codetransform.com
Fri Sep 30 19:26:53 PDT 2005


Tree parsers have to be k=1 because they are actually two dimensional  
and it gets too weird to allow k>1.  Syntactic predicates should  
work, though.

You might even try this with 2.7.4.  This seems pretty basic to get  
wrong.  I wonder if it is a newly introduced bug.

Monty


On Sep 30, 2005, at 6:59 PM, Christian Bird wrote:

> In case anyone cares, I tracked down the problem in the source.  Here
> is the lookahead code generator from JavaCodeGenerator.java:
>
>     private String lookaheadString(int k) {
>         if (grammar instanceof TreeWalkerGrammar) {
>             return "_t.getType()";
>         }
>         return "LA(" + k + ")";
>     }
>
> Note that if it's a tree walker, k isn't examined at all. My solution
> (though I'm not sure that it's always NullPointerException safe) is
> this:
>
>     private String lookaheadString(int k) {
>         if (grammar instanceof TreeWalkerGrammar) {
>             System.out.println("k is " + Integer.toString(k) );
>             String retStr = "_t";
>             while (k > 1) {
>                 retStr += ".getNextSibling()";
>                 k--;
>             }
>             retStr += ".getType()";
>             return retStr;
>         }
>         return "LA(" + k + ")";
>     }
>
> A quick recompile and that seems to do the trick.  It's possible that
> the check for getNextSibling could run off the end of the list and
> generate a NullPointerException, but in my grammar I know that the
> places where lookahead two is required, there's guaranteed to be a
> next sibling.
>
> Terence,
> Any chance that this or something similar and safer (I'm not too
> familiar with the codebase) could make it into 2.7.6?  I know that
> it's probably not often that a tree parser needs k > 1, but (at least
> in my case) it can occur.  Thanks.
>
> -- Chris
>
> On 9/30/05, Christian Bird <cabird at gmail.com> wrote:
>
>> That didn't seem to work either.  I tried using a syntactic predicate:
>>
>> name :
>>         (ID DOT) => complexName
>>         | (ID ~DOT) => identifier
>>         ;
>>
>> and adding a rule that changes the followset of name:
>>
>> aname :
>>         name SEMI;
>>
>> But the code still has issues:
>>
>> boolean synPredMatched98 = false;
>> if (((_t.getType()==ID) && (_t.getType()==SEMI||_t.getType()==ARROW))) {
>>         AST __t98 = _t;
>>         synPredMatched98 = true;
>>         inputState.guessing++;
>>         try {
>>                 {
>>                 AST tmp63_AST_in = (AST)_t;
>>                 match(_t,ID);
>>                 _t = _t.getNextSibling();
>>                 AST tmp64_AST_in = (AST)_t;
>>                 matchNot(_t,DOT);
>>                 _t = _t.getNextSibling();
>>                 }
>>         }
>>         catch (RecognitionException pe) {
>>                 synPredMatched98 = false;
>>         }
>>         _t = __t98;
>>         inputState.guessing--;
>> }
>> if ( synPredMatched98 ) {
>>         identifier(_t);
>>         _t = _retTree;
>> }
>> else {
>>         throw new NoViableAltException(_t);
>> }
>>
>>
>> Oh well...
>>
>> -- Chris
>>
>> On 9/30/05, Monty Zukowski <monty at codetransform.com> wrote:
>>
>>> I dunno.  Try putting parentheses around the two alternatives?
>>>
>>> Monty
>>>
>>> On Sep 30, 2005, at 5:48 PM, Christian Bird wrote:
>>>
>>>
>>>> Good suggestion, but unfortunately the code generated for name is
>>>> still the same.  I don't understand how antlr could ever generate code
>>>> that looks like:
>>>>
>>>> if ((_t.getType()==A) && (_t.getType()==B)) {}
>>>>
>>>> When A is not the same as B.  I'm guessing, however, that a treeparser
>>>> generator is more complicated to write and probably not as often
>>>> used/tested by antlr users as a normal parser generator (most people
>>>> I've talked to here at UC Davis only use it for parsers and lexers,
>>>> not AST traversals).
>>>>
>>>> Any other ideas?  I appreciate your taking a look at it.
>>>>
>>>> -- Chris
>>>>
>>>> On 9/30/05, Monty Zukowski <monty at codetransform.com> wrote:
>>>>
>>>>
>>>>>
>>>>>
>>>>> On Sep 30, 2005, at 4:57 PM, Christian Bird wrote:
>>>>>
>>>>>  zimport :
>>>>>     #("import"
>>>>>         (name ARROW complexNameList SEMI |
>>>>>         "all" identifier SEMI) )
>>>>>     ;
>>>>> It does seem like a code gen bug.  I would recommend breaking this
>>>>> up into another rule if you can:
>>>>>
>>>>> zimport: #("import" importSuffix) ;
>>>>> importSuffix: name ARROW complexNameList SEMI
>>>>>                        | "all" identifier SEMI
>>>>>                        ;
>>>>>
>>>>> See if that still triggers the problem.
>>>>>
>>>>> Monty
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Christian Bird
>>>> cabird at gmail.com
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>>
>> --
>> Christian Bird
>> cabird at gmail.com
>>
>>
>
>
> --
> Christian Bird
> cabird at gmail.com
>
>
>
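
On the NullPointerException worry raised about the patched lookaheadString
in the quoted message above: one possible way to make the emitted test safe
is to have the generator guard each sibling hop with a null check. The
sketch below is not ANTLR code and not a proposed patch. It is a standalone
illustration, and the class name LookaheadSketch and the helpers
siblingExpr and guardedTest are hypothetical names made up for it. It also
shows why the unpatched generator yields a contradictory test: with k
ignored, both lookahead depths expand to the same "_t.getType()" string.

```java
// Standalone sketch; none of these names exist in the ANTLR codebase.
class LookaheadSketch {

    // Behaviour described in the thread: for a tree walker, k is ignored,
    // so depth 1 and depth 2 both test the same node.
    static String lookaheadOriginal(boolean treeWalker, int k) {
        if (treeWalker) {
            return "_t.getType()";
        }
        return "LA(" + k + ")";
    }

    // Expression for the node that is k-1 siblings to the right of _t.
    static String siblingExpr(int k) {
        StringBuilder sb = new StringBuilder("_t");
        for (int i = 1; i < k; i++) {
            sb.append(".getNextSibling()");
        }
        return sb.toString();
    }

    // Hypothetical guarded test: every node on the sibling path is checked
    // for null before it is dereferenced, relying on && short-circuiting.
    static String guardedTest(int k, String tokenName) {
        StringBuilder sb = new StringBuilder("(");
        for (int i = 1; i <= k; i++) {
            sb.append(siblingExpr(i)).append("!=null && ");
        }
        sb.append(siblingExpr(k)).append(".getType()==").append(tokenName);
        sb.append(")");
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both depths collapse to the same expression in the original.
        System.out.println(lookaheadOriginal(true, 1));
        System.out.println(lookaheadOriginal(true, 2));
        // Guarded depth-2 test against a token named DOT.
        System.out.println(guardedTest(2, "DOT"));
    }
}
```

The guarded form costs a longer generated condition for every k > 1 test.
Whether the real generator needs these guards is an open assumption here,
since it depends on how the runtime represents the end of a sibling list.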


