[antlr-interest] Strange ANTLR behavior when using heterogeneous ASTs

Ric Klaren klaren at cs.utwente.nl
Mon Apr 26 07:10:48 PDT 2004


On Mon, Apr 26, 2004 at 08:44:14PM +0700, Andrey R. Urazov wrote:
> I found it very strange that defining my own AST types for tokens in
> the tokens section is NOT by itself an instruction to generate code
> that registers the custom ASTs' factory methods in the
> `initializeASTFactory' member function of the generated parser class.
> The factory method gets registered only when the token is used somewhere
> in the grammar itself, not when it appears only in a grammar action.

Looks like an oversight in the code that generates the initialization
code. The heterogeneous AST support has never been without headaches.

> For example, for the following grammar file:
>
> ------------------------------------------
>
> options { language = Cpp; }
>
> class TestParser extends Parser;
>
> options { buildAST = true; }
>
> tokens {
>     ACTION_TOKEN<AST=MyAST>;
>     GRAMMAR_TOKEN<AST=MyAST>;
> }
>
> start!
> :
>     GRAMMAR_TOKEN
>     {
>         #start = #[ACTION_TOKEN];
>     }
> ;
>
> ------------------------------------------
>
> ANTLR generates the following:
>
> ------------------------------------------
>
> ...
>
> void TestParser::initializeASTFactory( ANTLR_USE_NAMESPACE(antlr)ASTFactory& factory )
> {
> 	factory.registerFactory(5, "MyAST", MyAST::factory);
> 	factory.setMaxNodeType(5);
> }
> const char* TestParser::tokenNames[] = {
> 	"<0>",
> 	"EOF",
> 	"<2>",
> 	"NULL_TREE_LOOKAHEAD",
> 	"ACTION_TOKEN",
> 	"GRAMMAR_TOKEN",
> 	0
> };

Wish everyone supplied problem examples as concise as this ;)
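
To make the symptom concrete, here is a small self-contained toy model
of a type-indexed node factory. Everything in it (Node, MyNode,
ToyFactory, generatedInit) is a hypothetical stand-in and NOT ANTLR's
real classes; it only assumes that an unregistered token type falls back
to a default node. It mirrors the generated code above, where only type 5
(GRAMMAR_TOKEN) is registered and type 4 (ACTION_TOKEN) is not.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Toy model only: these types are hypothetical stand-ins, NOT ANTLR's classes.
struct Node {
    virtual ~Node() = default;
    virtual std::string kind() const { return "default"; }
};

struct MyNode : Node {
    std::string kind() const override { return "MyAST"; }
    static Node* factory() { return new MyNode; }
};

class ToyFactory {
public:
    using Creator = Node* (*)();

    // Associate a creator function with a token type number.
    void registerFactory(int type, Creator creator) {
        assert(type >= 0);
        if (static_cast<std::size_t>(type) >= creators_.size())
            creators_.resize(type + 1, nullptr);
        creators_[type] = creator;
    }

    // Build a node for the type; unregistered types get the default node.
    std::unique_ptr<Node> create(int type) const {
        const bool known = type >= 0 &&
            static_cast<std::size_t>(type) < creators_.size() && creators_[type];
        return std::unique_ptr<Node>(known ? creators_[type]() : new Node);
    }

private:
    std::vector<Creator> creators_;
};

// Token numbers taken from the tokenNames table in the generated code.
enum TokenTypes { ACTION_TOKEN = 4, GRAMMAR_TOKEN = 5 };

// Mimics the generated initializer: only GRAMMAR_TOKEN gets registered.
void generatedInit(ToyFactory& f) {
    f.registerFactory(GRAMMAR_TOKEN, &MyNode::factory);
}
```

In this model, after generatedInit(f) a call to f.create(ACTION_TOKEN)
yields the default node rather than a MyNode, until the caller adds
f.registerFactory(ACTION_TOKEN, &MyNode::factory) by hand. In the real
parser the analogous workaround would presumably be an extra call shaped
like the generated one, with 4 in place of 5, made after
initializeASTFactory has run; that is an assumption, not something
verified against the ANTLR sources.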

> I don't know whether this is a bug or done intentionally, but to me it
> seems very strange. Maybe there was some motivation to do this in order
> to, for some reason, demarcate usual and imaginary tokens. But I don't
> understand this. To me, it's natural to want to fix AST types for tokens
> --- no matter whether they are real or imaginary --- at once and then
> operate on them without explicit AST type specification.

Well, this stuff is among the biggest kludges in the code generators.
The heterogeneous stuff is a bit of a hack on top of antlr and was not
really designed from the ground up. I still advise people against using it
unless they're prepared to have the occasional headache and are not afraid
to use snapshots with fixes ;)

I'll have a look to see whether I can fix it easily. If the fix looks
safe I'll make it now; otherwise it will be fixed in a snapshot after the
2.7.4 release.

Cheers,

Ric
--
-----+++++*****************************************************+++++++++-------
    ---- Ric Klaren ----- j.klaren at utwente.nl ----- +31 53 4893755  ----
-----+++++*****************************************************+++++++++-------
  "You can't expect to wield supreme executive power just because some
   watery tot throws a sword at you!"
  --- Monty Python and the Holy Grail



 