[antlr-interest] Issue with antlr 2.7.5rc2

Ric Klaren ric.klaren at gmail.com
Thu Feb 3 05:24:01 PST 2005


On Thu, 3 Feb 2005 00:03:52 -0600, Tushar Jog <tusharjog at gmail.com> wrote:
> attached you can find the tinyc cpp antlr example, modified by
> me to illustrate the bug I saw.

It looks like you expect a bit too much of antlr's inheritance system.
The changes you want to make require a change to the lexer (for the new
"for" token) and a change to the parser (for the new "for" rule). As far
as I can see you never derive from the original lexer, so the "for"
token gets 'lost': the parser silently adds a value to the enum :( but
the lexer's literals table is never updated.

So to get things to work you can use a cclexer.g like this:

---snip----
options {  language="Cpp"; mangleLiteralPrefix = "TK_"; }

class TinyCCLexer extends TinyCLexer;
options {
        k=2;
        charVocabulary = '\3'..'\377';
        importVocab = TinyC;
        exportVocab = TinyCC;   // the lexer is now the 'boss' of the vocab
}

tokens {
        "for";
}
// include one rule else antlr will barf... :/
STAR:   '*'    ;
---snip----

This gets you a lexer that recognizes the "for" token, as you can see
from the generated initLiterals:

void TinyCCLexer::initLiterals()
{
        literals["else"] = 7;
        literals["if"] = 6;
        literals["int"] = 4;
        literals["for"] = 27;
        literals["char"] = 5;
        literals["while"] = 8;
}

The parser (tinycc.g) should then work like this:

options { mangleLiteralPrefix = "TK_"; language="Cpp"; }

class TinyCCParser extends TinyCParser;
options {
    importVocab = TinyCC;  // import from the modified lexer
}

statement
	:	(declaration) => declaration
	|	expr SEMI
	|	TK_if LPAREN expr RPAREN statement
		( TK_else statement )?
	|	"while" LPAREN expr RPAREN statement
	|	"for" LPAREN RPAREN statement
	|	block
	;

> Do you think that this is a bug, or am I doing something wrong in
> my grammar ?

So in the end you made a slight mistake, although it's more or less
due to some antlr problems as well. Antlr silently accepts any new
tokens referenced in the parser, which is really annoying behaviour at
times (a warning here would probably have hinted at the problem).
Another issue is that antlr is not very consistent in its behaviour
when you combine lexer and parser in one file; it kinda obfuscates how
the token definitions flow. Things are more consistent if you have
lexer/parser/treeparser in separate files, but then you have to get the
import/exportVocabs right and build in the right order. And the
inheritance system blows; for 3.0 some tool support is planned/was
talked about.

I built the stuff using:

java -cp antlr.jar antlr.Tool lexer.g
java -cp antlr.jar antlr.Tool -glib lexer.g cclexer.g
java -cp antlr.jar antlr.Tool -glib tinyc.g tinycc.g

Then glue the modified lexer and parser (TinyCCLexer and TinyCCParser)
together.

Cheers,

Ric

