[antlr-interest] Strange code generation for Fortran

Wed Jul 27 05:56:52 PDT 2005

Hi

In the optlabel rule, when the type of the next token, LA(1), is LABEL,
we build an AST node and match the LABEL token; the rule is then done.
If LA(1) is not in the follow set of optlabel (EOS, LITERAL_*, etc.),
we already have a syntax error and can throw an exception.
If LA(1) is in the follow set of optlabel, we don't match the next token
at all; we simply return from optlabel.
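
To make the three cases concrete, here is a minimal sketch of the same decision written by hand. The Tok enum, the FOLLOW set and the decide method are names I made up for illustration; they are not part of the generated parser, and the real follow set contains many more LITERAL_* tokens.

```java
import java.util.EnumSet;
import java.util.Set;

public class OptLabelSketch {
    // Hypothetical token types; OTHER stands for any token outside the follow set.
    public enum Tok { LABEL, EOS, LITERAL_entry, LITERAL_end, OTHER }

    // Hypothetical, heavily shortened follow set of optlabel.
    static final Set<Tok> FOLLOW =
            EnumSet.of(Tok.EOS, Tok.LITERAL_entry, Tok.LITERAL_end);

    // Describes what an optional rule like (LABEL)? does for a given LA(1).
    public static String decide(Tok la1) {
        if (la1 == Tok.LABEL) {
            return "match LABEL";              // optional element is present
        }
        if (FOLLOW.contains(la1)) {
            return "return without consuming"; // element absent, input still valid
        }
        return "syntax error";                 // nothing legal can come next
    }

    public static void main(String[] args) {
        for (Tok t : Tok.values()) {
            System.out.println(t + " -> " + decide(t));
        }
    }
}
```

So the LITERAL_* cases in the switch do not match anything; they only say that leaving the rule is legal at that point.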

About the token buffer: I suppose you missed that the tokens in the array need
not start at index 0; they start at an index stored in another field (I don't
remember its exact name). That is why, in the debugger, LT(1) does not appear
to be the array[0] element. :)
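
A small sketch of what I mean, assuming the queue is a circular buffer with a start offset. This is not the ANTLR source; the class and the names (TokenQueueSketch, offset, rawSlot) are invented for the example.

```java
public class TokenQueueSketch {
    private final String[] buffer;
    private int offset = 0; // physical index of the first unconsumed token
    private int count = 0;  // number of unconsumed tokens in the queue

    public TokenQueueSketch(int capacity) {
        buffer = new String[capacity];
    }

    // Store a token after the last unconsumed one, wrapping around the array end.
    public void append(String tok) {
        if (count == buffer.length) throw new IllegalStateException("queue full");
        buffer[(offset + count) % buffer.length] = tok;
        count++;
    }

    // Drop the first unconsumed token by advancing the offset.
    public void consume() {
        if (count == 0) throw new IllegalStateException("queue empty");
        offset = (offset + 1) % buffer.length;
        count--;
    }

    // Logical lookahead, 1-based: LT(1) is the next unconsumed token.
    public String LT(int i) {
        if (i < 1 || i > count) throw new IndexOutOfBoundsException("LT(" + i + ")");
        return buffer[(offset + i - 1) % buffer.length];
    }

    // Physical slot, which is what a debugger shows for the raw array.
    public String rawSlot(int idx) {
        return buffer[idx];
    }

    public static void main(String[] args) {
        TokenQueueSketch q = new TokenQueueSketch(4);
        q.append("a"); q.append("b"); q.append("c"); q.append("d");
        q.consume(); q.consume();     // "a" and "b" are consumed
        q.append("e"); q.append("f"); // these wrap into slots 0 and 1
        System.out.println("LT(1)    = " + q.LT(1));
        System.out.println("array[0] = " + q.rawSlot(0));
    }
}
```

With a layout like this the raw array can look out of order in the debugger while the logical sequence read through LT(i) is still correct.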

Regards,
Alexey

-----
Alexey Demakov
TreeDL: Tree Description Language: http://treedl.sourceforge.net
RedVerst Group: http://www.unitesk.com

----- Original Message ----- 
From: "Olivier Dragon" <dragonoe at mcmaster.ca>
To: "ANTLR Interest" <antlr-interest at antlr.org>
Sent: Wednesday, July 27, 2005 4:32 PM
Subject: [antlr-interest] Strange code generation for Fortran

Hi,

I'm in the process of translating a PCCTS Fortran grammar to ANTLR that
was given to me by Terence.

I'm attempting to fix the few nondeterminism issues left. However, it
appears as though the code generated from the new grammar is wrong (at
least very weird).

I used a main program and tried to debug through the generated Java
code. Doing so I ran into the following problems: in the subprogramBody
method, the optlabel call bunks on the first assignment statement after
the data declarations (which are all recognized properly).

At the same time, taking a look through the TokenQueue data structure of
the input field I found that at the moment the parser hits this same
assignment statement, the TokenBuffer's token queue becomes completely
screwed up. Prior to that the queue contains the expected tokens but on
that statement the order of the tokens changes into something wrong. I
don't understand this one at all because when I run the lexer alone and
print the output of l.nextToken() everything is fine.

Here is the optlabel rule (where LABEL is a lexer rule to match integers
in columns <= 6) (whole grammar can be found attached below):

optlabel: (LABEL)?

And this is generated as the following code:
------------------------------------
public final void optlabel() throws RecognitionException, TokenStreamException {

    returnAST = null;
    ASTPair currentAST = new ASTPair();
    AST optlabel_AST = null;

    {
    switch ( LA(1)) {
    case LABEL:
    {
        AST tmp126_AST = null;
        tmp126_AST = astFactory.create(LT(1));
        astFactory.addASTChild(currentAST, tmp126_AST);
        match(LABEL);
        break;
    }
    case EOS:
    case LITERAL_entry:
    case LITERAL_end:
    case LITERAL_dir:
    ... (bunch of LITERAL_ cases) ...
    {
        break;
    }
    default:
    {
        throw new NoViableAltException(LT(1), getFilename());
    }
    }
    }
    optlabel_AST = (AST)currentAST.root;
    returnAST = optlabel_AST;
}
----------------------------------------------

This is very strange. LABEL is *optional*. I don't want an exception to
get thrown when there's no viable alternative to LABEL. I especially
don't want any LITERAL_* matching. I tried to modify the code basically
removing all the cases including the default, and the code would exit
the subprogramBody loop prematurely (again, on the first assignment
statement).

If you can please enlighten me on this I would be forever owing!

-Olivier

-- 
          __-/|    ? ?     |\-__
     __--/  /  \   (^^)   /  \  \--__
  _-/   /   /  /\ / ( )  /\  \   \   \-_
 /  /   /  /  /  (   ^^ ~  \  \  \   \  \
 / Oli Dragon    ( dragonoe at mcmaster.ca \
/  B.Eng. Sfwr   (     )    \    \  \    \
/  /  /    /__--_ (   ) __--__\    \  \  \
|  /  /  _/        \_ \_       \_  \  \  |
 \/  / _/            \_ \_       \_ \  \/
  \_/ /                -\_\        \ \_/
    \/                    )         \/
                        *~
        ___--<***************>--___
       [http://dragon.homelinux.org]
        ~~~--<***************>--~~~