[antlr-interest] Tree parser action only gets text for first token in rule

Darien A Hager darien at technofovea.com
Tue Oct 6 22:54:01 PDT 2009


My tree-parser is finding the right structure, but I'm getting 
unexpected results where I want the text (in a rule) of an entire 
subtree, but instead get the text of the very first token. When I had 
this as a unified grammar, it seemed to work OK. Here's the grammar in 
my tree parser that doesn't seem to be working.

______________
keyval[Node selfNode]      :    ^(KV ^(KEY k=text) ^(VALUE v=text))
   {
       selfNode.addAttribute($k.text,$v.text);
   }
   ;

text
   :    QSTRING|(CHAR+)      ;
______________

In my custom Node class, instead of getting an attribute of of k1 = v1, 
only the very first CHAR token comes through in the rule, yielding k=v, 
even though it seems the entire thing was matched. So in the real world, 
the pairing Classname=SomeClass becomes C=S. I'm not sure if I'm missing 
some magic in how I invoke things in the Java runtime, or whether it's a 
grammar issue.

Here's more details. The test case input data, a key/value pair:
____________________________________________
k1    v1
____________________________________________

This is lexed and turned into an AST. Here's the stream of token names 
and text values the tree-parser gets as its' input, with the text value 
of each character:
____________________________________________
KV
DOWN
KEY
DOWN
CHAR    k
CHAR    1
UP
VALUE
DOWN
CHAR    v
CHAR    1
UP
UP
____________________________________________

I've tried breaking it off into it's own rule and using $rulename.text, 
all sorts of things. Lastly, here's the example Java runtime set-up in 
case I'm using the wrong subclass somewhere:

_____________________________________________
       InputStream is = 
this.getClass().getResourceAsStream(TEST_DATA_FILE);
       // Do lexing
       CharStream cs = new ANTLRInputStream(is);
       BaseValveFormatLexer lexer = new BaseValveFormatLexer(cs);
       //printTokens(lexer);

       // Build AST of this "keyval" rule
       CommonTokenStream tokens = new CommonTokenStream(); 
//TokenRewriteStream?
       tokens.setTokenSource(lexer);
       BaseValveFormat parser = new BaseValveFormat(tokens);
       BaseValveFormat.keyval_return ret = parser.keyval();

       // Set up for sending to tree parser:
       CommonTree tree = (CommonTree) ret.getTree();
       //printTree(tree,0);
       CommonTreeNodeStream ctns = new CommonTreeNodeStream(tree);
       ctns.setTokenStream(parser.getTokenStream());
       //printTreeNodes(ctns);

       // Start traversal of the keyval AST section, modifying our 
"Node" element
       BaseValveFormatWalker walker = new BaseValveFormatWalker(ctns);
       Node n = new Node();
       walker.keyval(n);

       System.out.println(n.getAttributes()); // Print the wrong thing. :(
______________________________________________

Thanks in advance,
--Darien A. Hager.





More information about the antlr-interest mailing list