[antlr-interest] Token position 0, -1

Fri Jul 6 04:23:09 PDT 2012

On 06.07.2012 at 11:47:27, Ale Strooisma wrote:
> Hello,
> 
> if I try to get the position of a token in the file with getLine and 
> getCharPositionInLine, many of my tokens give the coördinates 0, -1, 
> which obviously is wrong. Why is this the case, and how can I fix it?

Hi,

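please be aware that imaginary tokens you create in rewrite rules carry no 
location information by default (and no text either). A token whose location 
was never set reports line 0 and character position -1, which is most likely 
where your 0, -1 comes from.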

So I think you should first check whether all the tokens generated by the 
lexer contain the expected location information. That also shows you which 
tokens came from the lexer and which were created by a rewrite rule in the 
parser.

In rewrite rules you can attach the location information of a real lexer token
to a generated imaginary token with syntax like this (see the section 
"Deriving Imaginary Nodes from Real Tokens" in the ANTLR book):

compoundStatement
	:  lc='{' statement*  '}'
		->  ^(SLIST[$lc]  statement*)
	;
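
To make the difference concrete, here is a small stand-in model in plain Java. 
It is only a sketch and does not use the ANTLR runtime: the class Tok, the 
helpers imaginary() and derivedFrom(), and the token type numbers are all 
made up for illustration. It assumes that deriving a node from a real token 
copies that token's location and changes only the type.

```java
/** Stand-in model, not ANTLR code: shows why a bare imaginary token reports 0, -1. */
class ImaginaryTokenSketch {

    /** Hypothetical minimal token; it holds only the fields discussed here. */
    static final class Tok {
        final int type;
        final int line;     // stays 0 when no location was set
        final int charPos;  // stays -1 when no location was set

        Tok(int type, int line, int charPos) {
            this.type = type;
            this.line = line;
            this.charPos = charPos;
        }

        /** Formats the location as "line,charPos". */
        String describe() {
            return line + "," + charPos;
        }
    }

    /** Models ^(SLIST ...): a node built from a bare type has no location. */
    static Tok imaginary(int type) {
        return new Tok(type, 0, -1);
    }

    /** Models ^(SLIST[$lc] ...): keep the real token's location, change the type. */
    static Tok derivedFrom(int type, Tok real) {
        return new Tok(type, real.line, real.charPos);
    }

    public static void main(String[] args) {
        final int LCURLY = 4;               // made-up token type numbers
        final int SLIST = 5;
        Tok lc = new Tok(LCURLY, 3, 12);    // a "real" token as the lexer would emit it

        System.out.println("bare:    " + imaginary(SLIST).describe());       // bare:    0,-1
        System.out.println("derived: " + derivedFrom(SLIST, lc).describe()); // derived: 3,12
    }
}
```

In terms of the rule above, SLIST[$lc] corresponds to the "derived" case, so 
the SLIST node reports the position of the opening brace.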

Overall, my answer involves some guesswork because your question is a bit 
unspecific (lexer, parser, or combined grammar? What exactly are you 
doing?).

Hope that helps,
	Stefan

Some example code to dump the lexer tokens:

    /** Dump all tokens the <lexer> provides, in a readable form, to the
     *  location that <outFileName> points to (stdout if it is null).
     */
    private static void generateTokenOutput(Lexer lexer, String outFileName) {
        PrintWriter tknOut;
        boolean     useOutFile;
        Token token;

        System.err.print("==== TKN to ");
        if (null == outFileName) {
            useOutFile = false;
            System.err.println("stdout");
            tknOut = new PrintWriter(System.out);
        }
        else {
            useOutFile = true;
            /* SNIP-SNAP  .... */
        }
        // Proceed with the real output
        while (null != (token = lexer.nextToken())) {
            int tokenType = token.getType();
            if (tokenType == Token.EOF) break;  // Token.EOF is -1
            tknOut.printf("%4d,%3d, tkn:%4d  name:%-8s\t '%s'%n",
                token.getLine(), token.getCharPositionInLine(),
                tokenType, MyOwnParser.tokenNames[tokenType],
                token.getText());
        }
        // Need to close or flush the Writer to get the output
        if (useOutFile) tknOut.close();
        else tknOut.flush();
    }

Call like this ...
{
        MyOwnLexer lexer = new MyOwnLexer(input);

        // Dump all tokens?
        generateTokenOutput(lexer, null);
}