[antlr-interest] Converting my Java code to work as a C Target.

Wed Apr 25 23:15:12 PDT 2012

I'm a total ANTLR noob, but I put together a working language by grabbing
random Java code samples I found.  It's pretty simple: I have a Node base
class that all the nodes inherit from.  This base class has an "eval"
method, and there are specialized subclasses for the different node types.

My old grammar has rules like:

multiplicativeExpression returns [Node node]:
    receiver=moduleVariable
      operation=(MUL|DIV|MOD|MINUS) arguments=multiplicativeExpression
    { $node = new MathNode($operation.text, $receiver.node, $arguments.node); }
  ;

And my MathNode stores the parameters passed into the constructor:

public MathNode(String operation, Node receiver, List<Node> arguments)

I call my eval() method with the current context and it handles all of the
internal work.

At this point I want to rewrite things so I can target C as my
implementation language.  I worked through a few of the samples and read as
much as I could, but I'm not quite sure how to connect what I'm doing now
with the C structures.  I read that my code should "subclass" the data that
comes from the C runtime, but I'm not sure what that means from a C
perspective.
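
For illustration, one common reading of "subclass" in plain C is struct
embedding: the "base" struct is the first member of the "derived" struct, so
a pointer to one can be converted to a pointer to the other.  The sketch
below only shows that idiom.  Context, NumberNode, number_init and eval_node
are hypothetical names made up for the example, and nothing here is taken
from the ANTLR C runtime.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical evaluation context; stands in for whatever eval() needs. */
typedef struct Context {
    int evals;                      /* counts how many nodes were evaluated */
} Context;

/* "Base class": every node type starts with this header. */
typedef struct Node Node;
struct Node {
    int (*eval)(Node *self, Context *ctx);   /* plays the role of a virtual method */
};

/* "Subclass": the base is the FIRST member, so a NumberNode* and a
   pointer to its base member refer to the same address. */
typedef struct NumberNode {
    Node base;
    int  value;
} NumberNode;

static int number_eval(Node *self, Context *ctx)
{
    NumberNode *n = (NumberNode *)self;   /* downcast is valid because base is first */
    ctx->evals++;
    return n->value;
}

static void number_init(NumberNode *n, int value)
{
    n->base.eval = number_eval;           /* install the "method" */
    n->value = value;
}

/* Callers only ever need the base pointer. */
static int eval_node(Node *node, Context *ctx)
{
    assert(node != NULL && node->eval != NULL);
    return node->eval(node, ctx);
}
```

With this layout, code that holds only a Node pointer can evaluate any node
type without knowing which concrete struct is behind it.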

I defined a "struct MathNode" that pretty much mirrors the Java class,
complete with a function pointer for eval.  In main() I create a lexer,
parser, etc., and I think I should be able to do something like this:

struct MathNode *math;
math = psr->multiplicativeExpression(psr);
math->eval(global_context);

My grammar has these options:

language = C;
ASTLabelType = pANTLR3_BASE_TREE;
output = AST;

I kept the same syntax with my rules: "multiplicativeExpression returns
[Node node]".  The tricky part is what code the rule needs in order to
create a new MathNode and initialize it.  It seems like my code has a lot of
info that it can use, but I'm not quite sure what the "connective glue"
should look like.  I have the same three variables: operation, receiver, and
arguments.  Operation is pretty straightforward: I can use getText (or maybe
the token constant stored in ANTLR3_COMMON_TOKEN.type).

I'm not sure how to manage the two nodes.  The way I set up the grammar
seems to have given me several options, but I think it's the same as the
Java code: $node = sexy_node_making_function($operation->type,
$receiver->node, $arguments->node).  My node builder could allocate a
MathNode structure, fill in the details, and return it for storage in the
rule's return structure.
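
To make the node-builder idea concrete, here is a minimal sketch of such a
function in plain C.  Everything in it is an assumption for illustration:
the OP_* constants stand in for whatever token type constants the generated
lexer provides, and make_math_node, make_number_node, NumberNode and Context
are hypothetical names.  How the grammar action spells the rule attributes
for the C target is not shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Context { int evals; } Context;   /* hypothetical context */

typedef struct Node Node;
struct Node { int (*eval)(Node *self, Context *ctx); };

/* Hypothetical stand-ins for the generated token type constants. */
enum { OP_MUL = 1, OP_DIV, OP_MOD, OP_MINUS };

typedef struct NumberNode { Node base; int value; } NumberNode;

typedef struct MathNode {
    Node  base;
    int   operation;    /* token type of the operator */
    Node *receiver;     /* left operand */
    Node *arguments;    /* right operand */
} MathNode;

static int number_eval(Node *self, Context *ctx)
{
    ctx->evals++;
    return ((NumberNode *)self)->value;
}

static int math_eval(Node *self, Context *ctx)
{
    MathNode *m = (MathNode *)self;
    assert(m->receiver != NULL && m->arguments != NULL);
    int lhs = m->receiver->eval(m->receiver, ctx);
    int rhs = m->arguments->eval(m->arguments, ctx);
    ctx->evals++;
    switch (m->operation) {
    case OP_MUL:   return lhs * rhs;
    case OP_DIV:   return rhs != 0 ? lhs / rhs : 0;  /* sketch: no real error handling */
    case OP_MOD:   return rhs != 0 ? lhs % rhs : 0;
    case OP_MINUS: return lhs - rhs;
    default:       return 0;
    }
}

/* Leaf builder, only needed so the example has operands to work with. */
Node *make_number_node(int value)
{
    NumberNode *n = malloc(sizeof *n);
    if (n == NULL) return NULL;
    n->base.eval = number_eval;
    n->value = value;
    return &n->base;
}

/* Allocates a MathNode, fills in the details, returns it as a Node*. */
Node *make_math_node(int operation, Node *receiver, Node *arguments)
{
    if (receiver == NULL || arguments == NULL) return NULL;
    MathNode *m = malloc(sizeof *m);
    if (m == NULL) return NULL;
    m->base.eval = math_eval;
    m->operation = operation;
    m->receiver  = receiver;
    m->arguments = arguments;
    return &m->base;
}
```

Passing the token type rather than the token text keeps string handling out
of the evaluator; the switch in math_eval then replaces the string
comparison the Java version presumably does on the operation name.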

It seems like an odd way to store things.  Am I missing something?  The
return value has start/stop tokens (not sure what to do with those) and a
tree, in addition to my node information.

Should I somehow store my node data in the ANTLR3_BASE_TREE.u variable?
Would my grammar still need the "returns [Node node]" bit, or will the
parser still return the tree?

Since there are several levels of structures, how do I call eval() on the
node?
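
To spell out the "several levels" in code, the sketch below walks from a
rule's return value down to eval() along both routes mentioned above: the
extra node field from "returns [Node node]", and a user pointer on the tree
node.  StandInReturn and StandInTree are hypothetical stand-ins written for
this example.  They only mimic the shape described in this post (start/stop,
tree, node, and a "u" pointer) and are not the real ANTLR runtime types.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Context { int evals; } Context;   /* hypothetical context */

typedef struct Node Node;
struct Node { int (*eval)(Node *self, Context *ctx); };

typedef struct ConstNode { Node base; int value; } ConstNode;

static int const_eval(Node *self, Context *ctx)
{
    ctx->evals++;
    return ((ConstNode *)self)->value;
}

/* Stand-in for a tree node that carries a user-owned pointer. */
typedef struct StandInTree {
    void *u;                 /* user data: here, a Node* */
} StandInTree;

/* Stand-in for the struct a rule with a return value hands back. */
typedef struct StandInReturn {
    void        *start;      /* first token of the match (unused here) */
    void        *stop;       /* last token of the match (unused here) */
    StandInTree *tree;       /* AST built for the rule */
    Node        *node;       /* the field declared in "returns [Node node]" */
} StandInReturn;

/* Route 1: use the node field of the return struct directly. */
static int eval_from_return(const StandInReturn *ret, Context *ctx)
{
    assert(ret != NULL && ret->node != NULL);
    return ret->node->eval(ret->node, ctx);
}

/* Route 2: recover the node from the tree's user pointer. */
static int eval_from_tree(const StandInTree *tree, Context *ctx)
{
    assert(tree != NULL && tree->u != NULL);
    Node *n = (Node *)tree->u;
    return n->eval(n, ctx);
}
```

Either way the call has the same shape: get a Node pointer out of whatever
wrapper holds it, then call node->eval(node, ctx).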

Thanks for any input!

Lee