[antlr-interest] antlrworks confused by imaginary tokens?

Michael Roberts mike at 7f.com
Thu Mar 15 21:03:44 PDT 2012


I've been happily hacking on my little grammar using antlrworks.
 Everything was going swimmingly until I introduced a section of imaginary
tokens for use in rewrite rules.  For some reason, antlr/antlrworks really
wanted this section of imaginary tokens at the start of the file, directly
behind the options section.  It didn't seem to like it in other places, and
would not recognize the imaginary tokens otherwise.

However, oddly, it didn't like it if I defined my regular tokens inside the
tokens sections and refused to recognize them, flagging mismatched token
exceptions all over the place.  So, accepting defeat, I moved these
non-imaginary tokens back to the end of the file, where they'd previously
been living.  No missing tokens, everything generates fine now.

However, when I attempt to debug my parser, the generated test code
references the first non-imaginary token it finds as the top level
construct, in my case CLOSE_PAREN, and not my top-level compilationUnit
production (which is ahead of it in the file).  Thus:

public class __Test__ {

    public static void main(String args[]) throws Exception {
        JLG2Lexer lex = new JLG2Lexer(new
ANTLRFileStream("C:\\src\\Core\\src\\org\\veve\\reflect\\interpreter\\output\\__Test___input.txt",
"UTF8"));
        CommonTokenStream tokens = new CommonTokenStream(lex);

        JLG2Parser g = new JLG2Parser(tokens, 49100, null);
        try {
            g.CLOSE_PAREN();   // <-- BAD, was expecting to see
compilationUnit here ...
        } catch (RecognitionException e) {
            e.printStackTrace();
        }
    }
}

So, my main question is ..  why doesn't this form of token definition
(below) work:


tokens
{

// Imaginary tokens for AST rewrite ops
IDENTIFIER_PATH;
INVOCATION;
STATEMENT_BLOCK;
AMPERSAND_INVOCATION;
INVOCATION_STAT;
OBJECT;
ARRAY;
ELEMENT_STAT;
MEMBERS;
PAIR;
PAIR_LIST;
METHOD_INVOCATION;
NEW_COMMAND;
STRING;
NUMBER;
ARRAY;
BOOLEAN;
NULL;
PATH;

// Real, defined tokens
CLOSE_PAREN : ')';
AMPERSAND : '@';
WS       :           (' '|'\t'|'\f'|'\n'|'\r')+{ skip(); };
COLON : ':';
EQUALS : '=';
INJECT : '<-';
COMMA : ',';
SLASH : '/';
OPEN_PAREN :    '(' ;
OPEN_BRACE   : '{';
CLOSE_BRACE
:   '}';
DOT
: '.';
SEMI_COLON
: ';';
BLOCK :   '|' ;
}

is the token section just for imaginary tokens, then, and, if not how do I
define regular tokens in it .. and, in essence, what could I possibly be
doing to so confuse the test jig generator code so that it's generating
something silly?

MR


More information about the antlr-interest mailing list