[antlr-interest] Representations of AST

Andy Tripp antlr at jazillian.com
Thu Apr 2 11:02:42 PDT 2009


Alexander,

The crux of what I say here:
http://www.jazillian.com/articles/treewalkers.html
is that as the amount of logic needed in your treewalker grows,
the ANTLR treewalker doesn't really help. You start off with
a few simple actions triggered at various points of treewalking,
but then it grows into a large chunk of code where it doesn't really
help to have that code triggered at certain points in a treewalk.
At that point you suspect it would be simpler to have your code just
accept an AST as an argument and do its own walking, and to throw out
the ANTLR treewalker entirely.
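
To make "accept an AST as an argument and do its own walking" concrete,
here is a minimal sketch in plain Java. The Node class, its fields, and
the collect() method are hypothetical names made up for illustration;
they are not ANTLR's tree API, and a real walker would work against
whatever tree type your parser produces.

```java
import java.util.ArrayList;
import java.util.List;

public class HandWalker {
    // Hypothetical minimal AST node (an assumption for this sketch,
    // not ANTLR's own tree type).
    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text, Node... kids) {
            this.type = type;
            this.text = text;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Takes the AST as an argument and walks it itself: collects the text
    // of every node of the given type, depth-first, left to right.
    static List<String> collect(Node root, String type) {
        List<String> out = new ArrayList<>();
        walk(root, type, out);
        return out;
    }

    private static void walk(Node n, String type, List<String> out) {
        if (n.type.equals(type)) {
            out.add(n.text);
        }
        for (Node c : n.children) {
            walk(c, type, out);
        }
    }

    public static void main(String[] args) {
        Node ast = new Node("SELECT", "select",
            new Node("COLUMN", "name"),
            new Node("COLUMN", "age"),
            new Node("FROM", "from", new Node("TABLE", "people")));
        System.out.println(collect(ast, "COLUMN")); // prints [name, age]
    }
}
```

The point of the sketch is only that the traversal and the logic live in
ordinary Java methods, so the logic can grow without being tied to
specific points in a generated treewalk.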

I don't have any good answers on how to "encapsulate my semantic
representation code" better. I've found that when my AST isn't quite
the shape that I want, I have lots of trouble getting ANTLR to create the
AST that I want. But maybe that's just me.

As for your semantic model that you produce from an AST, all I can say
is that I'm now trying to do simple code instrumentation into C code,
and I'm now on my fourth redesign of my model. Just to figure out
a variable's type with all the typedefs, structs, arrays, pointers, etc.
is really hard. Given a declaration "MYTYPE **v[1][2];" and a reference
"*(a.f().v[3] + n)", what type is the reference? I could spend the
rest of my life staring at C ASTs.
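
For what it's worth, here is a very rough sketch of the easy part of that
question, in plain Java. Everything in it (CType, Named, Pointer, Array,
decay, deref, index, plusInt, describe) is a hypothetical name invented
for illustration. It assumes the "a.f()." member lookup has already been
resolved to the declaration of v, it treats MYTYPE as an opaque name
rather than expanding the typedef, and it ignores structs, qualifiers and
function types, which is exactly where the real difficulty lives.

```java
public class CTypeSketch {
    // Hypothetical, heavily simplified model of C types: a named base
    // type, a pointer to a type, or an array of a type.
    static abstract class CType {}

    static final class Named extends CType {
        final String name;
        Named(String name) { this.name = name; }
    }

    static final class Pointer extends CType {
        final CType target;
        Pointer(CType target) { this.target = target; }
    }

    static final class Array extends CType {
        final CType element;
        final int length;
        Array(CType element, int length) {
            this.element = element;
            this.length = length;
        }
    }

    // Array-to-pointer decay: "array of T" becomes "pointer to T".
    static CType decay(CType t) {
        return (t instanceof Array) ? new Pointer(((Array) t).element) : t;
    }

    // Type of *e (the operand decays first if it is an array).
    static CType deref(CType t) {
        CType d = decay(t);
        if (d instanceof Pointer) {
            return ((Pointer) d).target;
        }
        throw new IllegalArgumentException("cannot dereference " + describe(t));
    }

    // Type of e[i], which C defines as *(e + i).
    static CType index(CType t) {
        return deref(t);
    }

    // Type of e + n for an integer n: the decayed pointer type is kept.
    static CType plusInt(CType t) {
        return decay(t);
    }

    static String describe(CType t) {
        if (t instanceof Named) {
            return ((Named) t).name;
        }
        if (t instanceof Pointer) {
            return "pointer to " + describe(((Pointer) t).target);
        }
        Array a = (Array) t;
        return "array[" + a.length + "] of " + describe(a.element);
    }

    public static void main(String[] args) {
        // MYTYPE **v[1][2];
        CType v = new Array(
            new Array(new Pointer(new Pointer(new Named("MYTYPE"))), 2), 1);
        // *(v[3] + n), with the member lookup assumed already done
        CType ref = deref(plusInt(index(v)));
        System.out.println(describe(v));
        // prints array[1] of array[2] of pointer to pointer to MYTYPE
        System.out.println(describe(ref));
        // prints pointer to pointer to MYTYPE
    }
}
```

Under those assumptions the reference comes out as "MYTYPE **". The
sketch only computes types, so it does not notice that index 3 is out of
bounds for an array of length 1, and the answer changes again if MYTYPE
is itself a typedef for a pointer or an array.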

So I feel your pain.
I was also shocked to find that the SQL standard was about 1000 pages,
and the language approaches C++ in complexity. Someone needs to
do for SQL (and C++) what XML did for SGML: strip out the 80% that's
cruft.

I know Alexandre Porcelli was also working on an SQL grammar.

Andy



Alexander Brown wrote:
> Hi,
>  
> Perhaps this will sound like a rather stupid question, but I am 
> wondering if there is a better way to approach the problem I am trying 
> to solve.
>  
> I am interested in parsing SQL.  I have developed a grammar based on the 
> (overly complex) SQL2003 specification for my corpus (something like 
> 1GB+) of SQL statements. I've also built a treewalker that walks my AST.  
>  
> My application is currently converting my AST into a Java-based semantic 
> object model that, for all intents and purposes, reflects the structure 
> of the AST on a 1:1 basis.  For my application, I need an object model 
> based representation of SQL.
>  
> Building the object model and matching stringtemplate library has been
> extremely time consuming: there are something like 1000 rules in the
> SQL2003 spec and I have also built composite grammars that handle a
> superset of the spec such as DB specific constructs (old-school Oracle
> outer join syntax, for example) and procedural wrappers like PLSQL. My
> treewalker has thus become intermingled with vast amounts of Java that
> builds my semantic model and my Java object model has, of course, a
> large number of classes. I am beginning to think that I have done this
> wrong.
>  
> After the horse has bolted, I am wondering: was there a better way to
> approach this?  I am particularly keen to encapsulate my semantic 
> representation code and embed little or no Java in my TreeWalker (even 
> if the 1:1 mapping remains).  I think I have missed a step somewhere.
>  
> Thanks for your input.
> 
> Regards,
>  
> Alex
> 
> 


