[antlr-interest] some questions about the "antlr-way" of doing things

Thu Jun 9 05:57:43 PDT 2005

Hi,

I'm working on a translator from a custom language to C and VHDL. I'm a
parsing newbie and have some questions about the "antlr-way" of solving things.

The parser generates an AST which can then be used to check the validity of 
the input. One of the things that need to be checked is whether there are 
'undefined symbols'. My custom language does not have global symbols so I can 
do this check on the subtree rooted at a function node. Currently I've
implemented this check in Python with a heterogeneous AST. This way I can ask
each function node whether it passes the check. However, after reading about
Terence's dislike of heterogeneous trees I started wondering if there are 
better ways of doing this.

What is the "antlr-way" (or maybe "Terence-way") of doing this check on the 
AST? Is it using a Walker or is it coding the check in Python with 
homogeneous trees? 

I have a rough idea about how to write a walker for this. The information 
about which variables are defined is localized in a variable definition node 
(which is a child of the function node) but the check needs to be performed 
at the function node level. In order to bring this information together I can 
imagine using stacks and pushing variables on the stack in a variable 
definition node and checking this stack at the function node. Is this the way 
to do it?
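
To make the idea concrete, here is a rough Python sketch of what I have in
mind. It is not ANTLR code: the Node class and the node type names (FUNCTION,
VAR_DEF, ASSIGN, IDENT) are made up for illustration, and a real walker would
of course be generated from a tree grammar instead.

```python
class Node:
    """Hypothetical homogeneous tree node: a type, optional text, children."""
    def __init__(self, type, text=None, children=None):
        self.type = type
        self.text = text
        self.children = children or []


def undefined_symbols(function_node):
    """Return the identifiers used in a function subtree but never defined."""
    defined = []   # stack of names pushed by VAR_DEF nodes
    used = []      # identifiers referenced anywhere in the function

    def walk(node):
        if node.type == "VAR_DEF":
            defined.append(node.text)
        elif node.type == "IDENT":
            used.append(node.text)
        for child in node.children:
            walk(child)

    walk(function_node)
    # The check itself happens here, at the function-node level.
    return [name for name in used if name not in defined]


# Example tree for a function that defines 'a' and then uses 'a' and 'b'.
func = Node("FUNCTION", "f", [
    Node("VAR_DEF", "a"),
    Node("ASSIGN", children=[Node("IDENT", "a"), Node("IDENT", "b")]),
])
print(undefined_symbols(func))  # prints ['b']
```

Because the comparison is done only after the whole subtree has been visited,
this version does not care whether a definition comes before or after its use.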

When using walkers, I imagine that you would write a separate walker for each
pass (undefined symbol check, type check, etc.). Is it possible to define
multiple walkers? A drawback of this approach would be the duplication of the
walker grammar in every pass.
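
In plain Python I could imagine avoiding that duplication by keeping a single
generic traversal and plugging each pass in as a callback. The sketch below
only illustrates that idea; the tuple-based tree layout (type, text, children)
and the two pass classes are hypothetical and have nothing to do with ANTLR's
own tree walkers.

```python
def walk(node, passes):
    """Pre-order traversal that hands every node to every pass."""
    for check in passes:
        check(node)
    for child in node[2]:
        walk(child, passes)


class CountIdentifiers:
    """Toy pass: counts IDENT nodes."""
    def __init__(self):
        self.count = 0

    def __call__(self, node):
        if node[0] == "IDENT":
            self.count += 1


class CollectDefinitions:
    """Toy pass: records the names introduced by VAR_DEF nodes."""
    def __init__(self):
        self.names = []

    def __call__(self, node):
        if node[0] == "VAR_DEF":
            self.names.append(node[1])


# Each node is a (type, text, children) tuple.
tree = ("FUNCTION", "f", [
    ("VAR_DEF", "a", []),
    ("ASSIGN", None, [("IDENT", "a", []), ("IDENT", "b", [])]),
])

idents = CountIdentifiers()
defs = CollectDefinitions()
walk(tree, [idents, defs])
print(idents.count, defs.names)  # prints 2 ['a']
```

Here the traversal logic exists only once, and each pass keeps its own state.
Whether something similar is possible with generated walkers is exactly what
I would like to know.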

Another question is about the functions getLine() and getColumn(). When I find 
an unknown symbol I would like to print the line and column numbers of the 
corresponding identifier. I tried to do this but both functions always return 
0. What is the reason for this?

Thanks in advance,

Klaas