[antlr-interest] island grammars and tree parsers

Mike J. Bell ckimyt at gmail.com
Thu May 28 11:33:22 PDT 2009


So I've got this weird DSL that ends up having three grammars, a main
"control" grammar and then two island grammars it starts up when it needs
to.  All of them produce ASTs.  I need these island grammars because the
lexing rules are drastically different between them (irreconcilable
differences).

I figured I was going along a great path by removing all tokens from my AST
productions; i.e. operator tokens like '+' and '*' get changed into shared
synthetic tokens such as:

expression_add:  expression_mult  (expression_add_op^ expression_mult)*;
expression_add_op: '+' -> OPERATOR_ADD;

expression_mult: expression_whatever (expression_mult_op^
expression_whatever)*;
expression_add_op: '*' -> OPERATOR_MULT;

and so on and so forth.  This way I should be able to just use the same
tokenVocab in the tree parsers, and write one for the entire language even
though the lexing/parsing stage uses three different grammars.

Obviously I'm doing something wrong, because as soon as I get to the
synthetic token that starts the major subtree of one of the island grammars
(like OPERATOR_EXPRESSION...) I get this error:

TreeParser.g: node from line 246:28 mismatched input '...' expecting
SOME_KEYWORD

The input is the *entire* glob of characters that the island grammar parsed
(not reproduced here for brevity), and SOME_KEYWORD shouldn't be expected in
this location at all...it's part of a different production.

When I don't run the tree parser in the scheme of things, it appears that
all three island grammars are working properly and parse my input files just
fine.

Does this have something to do with the
CommonTreeNodeStream.setTokenStream()?  I set it initially to the top
controller lexer token stream when I created it, but am I wrong about this?
Does it need to be set to a different token stream when the lexers hand off
control to each other?

If so, how the heck can I manage that at the appropriate times?

Maybe there's something else I'm missing?

Thanks everybody!  =)

P.S.  Is there a simple way to turn on debugging for token or tree node
streams?  I can't seem to get my weird case to run in ANTLRWorks, so I'm
running standalone, and I don't have the nice parse trees or ASTs or
event/stack viewers...so it would be really neat to be able to generate
messages when things are consumed from the streams.

-- 
Mike J. Bell on gmail
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://www.antlr.org/pipermail/antlr-interest/attachments/20090528/f23d2e2e/attachment.html 


More information about the antlr-interest mailing list