[antlr-interest] Comments and questions on a recent project

Mon Aug 26 08:01:29 PDT 2002

>As part of the parsing stage, I tried to do my best to construct a
>succinct AST.  I eliminated lots of punctuation and extraneous
>keywords from AST and I tried to introduce some high-level "imaginary"
>tokens to represent the important nodes in my AST.  I suppose I could
>have used the same token types that the keywords/punctuation generated
>but I guess I feel it is slightly cleaner to create new ones
>specifically for "rule nodes".  I had a few difficulties with this.
>The first was that there are some funny constructs in the language
>where some qualifiers appear in some rules and some appear above them.
>Simple example:
>
>stored_definition
>  : ("final")? class_definition
>  ;
> 
>class_definition
>  : ("encapsulated")? ("partial")? restricted_class IDENT ...
>      { ## = #([DEFINITION, "DEFINITION"], ##); }
>  ;

Why not:

stored_definition
   : ("final")? class_definition
       { ## = #([DEFINITION, "DEFINITION"], ##); }
   ;
class_definition
  : ("encapsulated")? ("partial")? restricted_class IDENT ...
  ;

>The problem here is that while the "encapsulated" and "partial"
>qualifiers will appear within the "DEFINITION" node in my AST, it is
>kind of hard to get the "final" one in there since it comes "from
>above".  I realize I could pass it down, but that introduced more
>target-language-specific modifications to the grammar.  I could also
>have added it to the "class_definition" AST after parsing the
>"stored_definition", but I couldn't see a way to do that without
>writing target-language-specific code.  Did I miss any built-in tree
>construction capabilities that would have allowed me to easily do this?
>  
>Another issue with my "imaginary nodes" comes on the tree parser side.
>I tried to create a nice clean AST format on the parser side since this
>allowed for a fairly simple tree parser grammar.  One concern I have
>(again, I'm not a parser person so I'm probably missing something) is
>that I have different applications for my tree parser and I'd like to
>embellish the AST in different ways depending on my application.  For
>example, in some cases I might be interested in resolving the fully
>qualified names of all my data elements.  So I'd like to associate such
>names with the instance names in my AST (which are not fully
>qualified).  What is the best way to do this?  I thought about using
>heterogeneous AST nodes, but that would make the problem quite
>complicated, since the AST nodes would potentially be in conflict from
>one application to another.  Using homogeneous AST nodes, I could
>certainly add sub-nodes with the information I want but there are two
>issues I'm concerned about:
>

Typically I build tables to handle this sort of thing.  Just about anything
that can be in an AST can be in a table with the node as a key and then a
value of whatever.
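
A minimal sketch of that table idea in Java, to make it concrete.  The
names here (SideTableDemo, Node, Annotations) are made up for
illustration and are not ANTLR classes; Node stands in for whatever AST
node type the parser builds.  The assumption is that each application
keeps its own table, so one application's annotations never collide
with another's and the tree itself is left untouched.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class SideTableDemo {
    /** Hypothetical stand-in for an AST node (not an ANTLR class). */
    static final class Node {
        final String text;
        Node(String text) { this.text = text; }
    }

    /** One table per application; the AST itself is never modified. */
    static final class Annotations<V> {
        // Keyed by node identity, so two distinct nodes with the same
        // text keep separate entries.
        private final Map<Node, V> table = new IdentityHashMap<>();

        void put(Node node, V value) { table.put(node, value); }

        // Returns null if the node was never annotated.
        V get(Node node) { return table.get(node); }
    }

    public static void main(String[] args) {
        Node first = new Node("x");   // an instance name in one scope
        Node second = new Node("x");  // same text, different node
        Annotations<String> qualifiedNames = new Annotations<>();
        qualifiedNames.put(first, "Outer.Inner.x");
        qualifiedNames.put(second, "Other.x");
        System.out.println(qualifiedNames.get(first));
        System.out.println(qualifiedNames.get(second));
    }
}
```

The identity-keyed map is a design choice for this sketch: it avoids
depending on how the node class defines equals/hashCode, which matters
when many nodes carry the same text.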

I'll comment upon the rest, hopefully at lunchtime.

Monty
