[antlr-interest] Re: Dilemma

antlrlist antlrlist at yahoo.com
Thu Jun 12 02:03:57 PDT 2003


Hello Jorge,

I agree completely with Monty in this matter. You'll probably have to 
use your own AST implementation for this.
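
As a rough illustration of the kind of node I have in mind, here is a 
minimal stand-alone C++ sketch. It is only an assumption about what you 
need: the names SourceNode and regionText are made up for this example, 
and a real ANTLR C++ node would derive from the runtime's AST classes 
rather than being a plain struct like this one.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical node type for illustration only. It is NOT derived from
// the ANTLR runtime classes; it just shows that a node can carry more
// than a single text string.
struct SourceNode {
    std::string kind;                      // e.g. "library_clause"
    std::vector<std::string> sourceLines;  // original source, line by line
    std::vector<std::string> identifiers;  // names needed by the tree builder
    std::vector<std::unique_ptr<SourceNode>> children;

    explicit SourceNode(std::string k) : kind(std::move(k)) {}

    // Takes ownership of the child and returns a reference to it.
    SourceNode& addChild(std::unique_ptr<SourceNode> child) {
        children.push_back(std::move(child));
        return *children.back();
    }
};

// Rebuilds the text of a whole region from a subtree (pre-order walk):
// the node's own lines first, then the lines of each child in order.
std::string regionText(const SourceNode& node) {
    std::string out;
    for (const auto& line : node.sourceLines) {
        out += line;
        out += '\n';
    }
    for (const auto& child : node.children) {
        out += regionText(*child);
    }
    return out;
}
```

With something like this, a context_clause node could hold one child 
per library_clause, each keeping both its source line and its list of 
identifiers.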

If you allow me, I also have a suggestion about grammar complexity. I 
think your writing style might be giving you more trouble than 
benefit. The rules you wrote are equivalent to:

context_item
  :  library_clause
  |  use_clause
  ;

library_clause
  :  "library" ln1:IDENTIFIER (COMMA IDENTIFIER)* SEMICOLON
  ;

Writing rules like these two would reduce the number of rules in 
your grammar, making it easier to understand, maintain and 
debug.
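
To show why the flattened form is convenient, here is a hand-written 
C++ sketch of what the single library_clause rule does. It is not code 
generated by ANTLR, the function names are hypothetical, and the 
IDENTIFIER matching is a simplification (the keyword is matched 
case-sensitively here, unlike in VHDL). The point is that all the 
identifiers are collected in one place instead of being passed back 
through several levels of sub rules.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Advances pos past any whitespace.
void skipSpaces(const std::string& s, std::size_t& pos) {
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos]))) {
        ++pos;
    }
}

// Simplified IDENTIFIER: a letter followed by letters, digits or '_'.
bool readIdentifier(const std::string& s, std::size_t& pos, std::string& out) {
    skipSpaces(s, pos);
    if (pos >= s.size() || !std::isalpha(static_cast<unsigned char>(s[pos]))) {
        return false;
    }
    const std::size_t start = pos;
    while (pos < s.size() &&
           (std::isalnum(static_cast<unsigned char>(s[pos])) || s[pos] == '_')) {
        ++pos;
    }
    out = s.substr(start, pos - start);
    return true;
}

// Consumes one expected punctuation character (COMMA or SEMICOLON).
bool readChar(const std::string& s, std::size_t& pos, char expected) {
    skipSpaces(s, pos);
    if (pos < s.size() && s[pos] == expected) {
        ++pos;
        return true;
    }
    return false;
}

// library_clause : "library" IDENTIFIER (COMMA IDENTIFIER)* SEMICOLON ;
// On success, names holds every identifier of the clause. On failure,
// names may contain the identifiers read before the error.
bool parseLibraryClause(const std::string& text, std::vector<std::string>& names) {
    std::size_t pos = 0;
    std::string word;
    if (!readIdentifier(text, pos, word) || word != "library") return false;
    if (!readIdentifier(text, pos, word)) return false;
    names.push_back(word);
    while (readChar(text, pos, ',')) {
        if (!readIdentifier(text, pos, word)) return false;
        names.push_back(word);
    }
    return readChar(text, pos, ';');
}
```

In the grammar itself the same effect would come from an action 
attached to each IDENTIFIER reference inside the one rule.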

Is this form of writing rules suitable for your needs?

Enrique





--- In antlr-interest at yahoogroups.com, "Jorge Scandaliaris" 
<j_scandaliaris at y...> wrote:
> Thanks for the advice, Monty. I think I'll use a custom AST then. I've
> been following the examples and now have a good idea of how to do it
> (C++ output).
> 
> It's about VHDL, and as I said before the idea is to represent the
> source code (a digital hardware design) in a hierarchical way. The
> tool down the line will modify it, so some basic info must be stored
> even though we are not making a VHDL compiler.
> One of the simplest cases I came across so far (this one is not
> difficult to solve):
> 
> In this case I am interested only in the source text of the context
> clause, which has to be kept as a list or vector of strings (the tree
> has a method that takes the source code line by line). In more
> complex cases the call hierarchy would be deeper and, for example, I
> would have to keep the source code as in this case but also
> individual items (going back to the example, I might also want all
> the IDENTIFIERs associated with logical_name_list, because they would
> be needed to invoke a tree construction method).
> I hope this gives an idea of what I have to do. I don't post more
> complex examples because it's already a pain for me to follow them in
> the grammar, even though I built the creature.
> 
> context_clause
> 	:  (context_item)*
> 	;
> 
> context_item
> 	:  library_clause
> 	|  use_clause
> 	;
> 
> library_clause
> 	:  "library" logical_name_list SEMICOLON
> 	;
> 
> logical_name_list
> 	:  ln1:logical_name (COMMA logical_name)*
> 	;
> 
> logical_name
> 	:  IDENTIFIER
> 	;
> 
> Would you still go for the custom AST option?
> 
> Jorge
> 
> > -----Original Message-----
> > From: mzukowski at y... [mailto:mzukowski at y...]
> > Sent: Tuesday, June 10, 2003 17:38
> > To: antlr-interest at yahoogroups.com
> > Subject: RE: [antlr-interest] Dilemma
> > 
> > Number 2 is probably the most powerful.  Note that you should be
> > able to reconstruct a whole region of text given a tree.  You aren't
> > limited to only putting text into AST nodes, you can put anything
> > you want in by making your own node subclass.  See the gcc parser
> > for an example of this.
> > http://www.codetransform.com/gcc.html.
> > 
> > Concrete examples would help me think this through.
> > 
> > Monty
> > 
> > -----Original Message-----
> > From: j_scandaliaris at y... [mailto:j_scandaliaris at y...]
> > Sent: Tuesday, June 10, 2003 4:15 AM
> > To: antlr-interest at yahoogroups.com
> > Subject: [antlr-interest] Dilemma
> > 
> > 
> > Hi all,
> > 	I have an antlr recognizer already built. Now I am adding actions
> > to produce the output, which is the generation of a tree (not an AST;
> > it is a custom tree structure representing the hierarchy of the
> > source code). I decided to build the tree manually, mostly because I
> > don't have control over it, only access to the methods for creating
> > it. From the input source code the important thing is to recognize
> > some key structures which are the inputs to the tree construction
> > methods, mainly names, numbers and portions of the source code (in
> > the form of strings).
> > 
> > 	The dilemma I face is to find the simplest way to do this
> > (performance is not critical); the grammar already has some 2000
> > lines and over 200 rules. The basic problem is that the grammar
> > decomposes a rule (for which I have some tree construction method)
> > into several levels of calls to other sub rules. Each sub rule will
> > match some text (a name, a list of names, or more complex
> > structures) and then I have to return this info back to the main
> > rule in some way. This is the key point. I have devised so far
> > several ways of doing this (in each case with some limitations):
> > 
> > 	1. Use rule return values. -> Cumbersome when the sub rule is n
> > levels below. Only one return value is possible; when multiple
> > things need to be returned they have to be grouped, and this has to
> > be done manually all the way up.
> > 	2. Use buildAST=true and take advantage of the nodes' text as a
> > sort of antlr-managed string-type return value. -> Limited to
> > strings, still a bit cumbersome when the sub rule is deep down in
> > the call hierarchy, adds a lot of overhead.
> > 	3. Create some data members in the parser class or some automatic
> > objects within the class' rules, and communicate through them. ->
> > Difficult to follow, I think it might be prone to hiding bugs.
> > 
> > 	Those are my ideas so far. Right now I might go for a combination
> > of 2 and 3, but I am looking forward to learning from other people's
> > experience and knowledge. Any help is appreciated,
> > 
> > Regards,
> > 
> > Jorge
> > 
> > 
> > 
> > 
> > Your use of Yahoo! Groups is subject to
> > http://docs.yahoo.com/info/terms/


 
