[antlr-interest] bad generated code?

Christian Bird cabird at gmail.com
Fri Sep 30 18:59:30 PDT 2005


In case anyone cares, I tracked down the problem in the source.  Here
is the lookahead code generator from JavaCodeGenerator.java:

    private String lookaheadString(int k) {
        if (grammar instanceof TreeWalkerGrammar) {
            return "_t.getType()";
        }
        return "LA(" + k + ")";
    }

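Note that for a tree walker, k is never examined, so every lookahead
depth tests the same node.  That is how the generated code ends up with
conditions like (_t.getType()==A) && (_t.getType()==B).  My solution
(though I'm not sure that it's always safe from a NullPointerException)
is this: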

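    private String lookaheadString(int k) {
        if (grammar instanceof TreeWalkerGrammar) {
            // Debug trace of the requested lookahead depth.
            System.out.println("k is " + k);
            // Step over k-1 siblings so that depth k inspects the k-th node.
            // No null check: assumes each of those siblings exists.
            String retStr = "_t";
            while (k > 1) {
                retStr += ".getNextSibling()";
                k--;
            }
            retStr += ".getType()";
            return retStr;
        }
        // Ordinary parsers and lexers already take k into account.
        return "LA(" + k + ")";
    }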
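
For comparison, here is a standalone sketch of what the two versions
emit for a given k.  The class name LookaheadDemo and the boolean
treeWalker flag are hypothetical stand-ins for the
"grammar instanceof TreeWalkerGrammar" test; this is not code from the
ANTLR source, only an illustration of the strings involved.

```java
// Standalone sketch; LookaheadDemo and the treeWalker flag are hypothetical
// stand-ins for the `grammar instanceof TreeWalkerGrammar` check.
final class LookaheadDemo {
    // Behaviour of the original generator: k is ignored for tree walkers.
    static String original(boolean treeWalker, int k) {
        if (treeWalker) {
            return "_t.getType()";
        }
        return "LA(" + k + ")";
    }

    // Behaviour of the patched generator: step over k-1 siblings first.
    static String patched(boolean treeWalker, int k) {
        if (treeWalker) {
            StringBuilder sb = new StringBuilder("_t");
            for (int i = 1; i < k; i++) {
                sb.append(".getNextSibling()");
            }
            return sb.append(".getType()").toString();
        }
        return "LA(" + k + ")";
    }
}
```

With k = 2, original(true, 2) yields "_t.getType()" while
patched(true, 2) yields "_t.getNextSibling().getType()".  Both return
"LA(2)" when the flag is false.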

A quick recompile and that seems to do the trick.  It's possible that
the chained getNextSibling() calls could run off the end of the sibling
list and throw a NullPointerException, but in my grammar I know that
wherever a lookahead of two is required, there's guaranteed to be a
next sibling.
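
One way a safer variant might look is for the generated code to call a
small helper that walks the siblings and stops at null, rather than
chaining calls inline.  The sketch below is only an illustration under
stated assumptions: SiblingNode, SimpleNode, SafeLookahead and the
NO_NODE sentinel are hypothetical names, not part of ANTLR, and the
sketch does not model however the real runtime represents an empty
tree.

```java
// Hypothetical minimal stand-in for a sibling-linked AST node; only the
// two methods used in this thread are modelled.
interface SiblingNode {
    int getType();
    SiblingNode getNextSibling();
}

// Simple concrete node, used only to exercise the helper.
final class SimpleNode implements SiblingNode {
    private final int type;
    private SiblingNode next;

    SimpleNode(int type) { this.type = type; }

    void setNext(SiblingNode n) { next = n; }

    public int getType() { return type; }

    public SiblingNode getNextSibling() { return next; }
}

final class SafeLookahead {
    // Assumed sentinel for "no node at that depth"; it must not collide
    // with any real token type in the grammar.
    static final int NO_NODE = -1;

    // Returns the type of the k-th node starting at t (k = 1 is t itself),
    // or NO_NODE if the sibling list ends before depth k.
    static int typeAt(SiblingNode t, int k) {
        SiblingNode cur = t;
        for (int i = 1; i < k && cur != null; i++) {
            cur = cur.getNextSibling();
        }
        return cur == null ? NO_NODE : cur.getType();
    }
}
```

The generator would then emit something like SafeLookahead.typeAt(_t, 2)
in place of _t.getNextSibling().getType(), so a missing sibling makes
the comparison fail instead of throwing.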

Terence,
Any chance that this or something similar and safer (I'm not too
familiar with the codebase) could make it into 2.7.6?  I know that
it's probably not often that a tree parser needs k > 1, but (at least
in my case) it can occur.  Thanks.

-- Chris

On 9/30/05, Christian Bird <cabird at gmail.com> wrote:
> That didn't seem to work either.  I tried using a syntactic predicate:
>
> name :
>         (ID DOT) => complexName
>         | (ID ~DOT) => identifier
>         ;
>
> and adding a rule that changes the followset of name:
>
> aname :
>         name SEMI;
>
> But the code still has issues:
>
> boolean synPredMatched98 = false;
> if (((_t.getType()==ID) && (_t.getType()==SEMI||_t.getType()==ARROW))) {
>         AST __t98 = _t;
>         synPredMatched98 = true;
>         inputState.guessing++;
>         try {
>                 {
>                 AST tmp63_AST_in = (AST)_t;
>                 match(_t,ID);
>                 _t = _t.getNextSibling();
>                 AST tmp64_AST_in = (AST)_t;
>                 matchNot(_t,DOT);
>                 _t = _t.getNextSibling();
>                 }
>         }
>         catch (RecognitionException pe) {
>                 synPredMatched98 = false;
>         }
>         _t = __t98;
>         inputState.guessing--;
> }
> if ( synPredMatched98 ) {
>         identifier(_t);
>         _t = _retTree;
> }
> else {
>         throw new NoViableAltException(_t);
> }
>
>
> Oh well...
>
> -- Chris
>
> On 9/30/05, Monty Zukowski <monty at codetransform.com> wrote:
> > I dunno.  Try putting parentheses around the two alternatives?
> >
> > Monty
> >
> > On Sep 30, 2005, at 5:48 PM, Christian Bird wrote:
> >
> > > Good suggestion, but unfortunately the code generated for name is
> > > still the same.  I don't understand how antlr could ever generate code
> > > that looks like:
> > >
> > > if ((_t.getType()==A) && (_t.getType()==B)) {}
> > >
> > > When A is not the same as B.  I'm guessing, however, that a tree parser
> > > generator is more complicated to write and probably not as often
> > > used/tested by antlr users as a normal parser generator (most people
> > > I've talked to here at UC Davis only use it for parsers and lexers,
> > > not AST traversals).
> > >
> > > Any other ideas?  I appreciate your taking a look at it.
> > >
> > > -- Chris
> > >
> > > On 9/30/05, Monty Zukowski <monty at codetransform.com> wrote:
> > >
> > >>
> > >>
> > >> On Sep 30, 2005, at 4:57 PM, Christian Bird wrote:
> > >>
> > >>  zimport :
> > >>     #("import"
> > >>         (name ARROW complexNameList SEMI |
> > >>         "all" identifier SEMI) )
> > >>     ;
> > >> It does seem like a code gen bug.  I would recommend breaking this
> > >> up into
> > >> another rule if you can:
> > >>
> > >> zimport: #("import" importSuffix) ;
> > >> importSuffix: name ARROW complexNameList SEMI
> > >>                        | "all" identifier SEMI
> > >>                        ;
> > >>
> > >> See if that still triggers the problem.
> > >>
> > >> Monty
> > >>
> > >
> > >
> > > --
> > > Christian Bird
> > > cabird at gmail.com
> > >
> > >
> > >
> >
> >
>
>
> --
> Christian Bird
> cabird at gmail.com
>


--
Christian Bird
cabird at gmail.com

