[antlr-interest] Storing ambiguity in the tree

Thu Nov 25 05:36:41 PST 2004

Thanks for the suggestions,

so if I understand correctly, you propose parsing the source file and storing the 
statement part of the code as a flat tree. Then, in a second pass (using a 
TreeParser), the statements are actually parsed, using the symbol table built during 
the first pass.
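
To make sure we mean the same thing, here is a rough Python sketch of the two-pass idea (not ANTLR code). Everything in it is an assumption for illustration: the toy language (statements up to "end", followed by "var NAME" declarations) and the names first_pass and second_pass are hypothetical.

```python
def first_pass(tokens):
    """Keep the statement tokens as a flat list and collect the declared names."""
    end = tokens.index("end")            # statements run up to the first "end"
    flat_statements = tokens[:end]       # stored unparsed, like children of one root node
    declared = set()
    rest = tokens[end + 1:]
    for keyword, name in zip(rest[0::2], rest[1::2]):
        if keyword != "var":
            raise SyntaxError(f"expected 'var', got {keyword!r}")
        declared.add(name)
    return flat_statements, declared


def second_pass(flat_statements, declared):
    """Classify each held token now that the symbol table is complete."""
    tree = []
    for tok in flat_statements:
        if tok in declared:
            tree.append(("VAR", tok))
        elif tok.isdigit():
            tree.append(("NUM", tok))
        else:
            tree.append(("OP", tok))
    return tree


tokens = "x = y + 1 end var x var y".split()
flat, symbols = first_pass(tokens)
tree = second_pass(flat, symbols)
```

The second pass here only classifies tokens; a real second pass would run the full statement grammar over the flat children.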

I have two problems with this:

-) First, a practical problem: I can't get the TreeParser to build trees 
properly :(
It won't allow the "^" operator, and actions like "{ ## = ([VIRT, "VIRTUAL"], ##); }" 
cause "invalid AST node type" crashes. If I'm just being a complete newbie here, please 
help me out! :)

-) Second, a more theoretical problem: it all seems a little... awkward. The 
grammar gets spread over two parsers (with some parts copied into both), and the source
is walked twice.
This seems especially convoluted compared to the (IMO) much more intuitive idea 
of keeping the statements "on hold" until the accompanying declarations have been parsed:

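// Pseudocode: setTokenBuffer and mytokenbuffer are not existing ANTLR members.
program: stats[true] decs                         // hold the statements, parse the declarations
         { this.setTokenBuffer = mytokenbuffer; } // replay the held tokens
         stats[false]                             // now parse the statements for real
         { this.setTokenBuffer = <the lexer>; }   // switch back to the lexer
       ;

stats[boolean buffering]
  : { buffering }?        // semantic predicate: take this alternative only while buffering
    ( t:~("end")          // any token except "end"; t labels the matched token
      { mytokenbuffer.Add(t); }
    )*
  | <actual statement grammar>
  ;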

decs: <declaration grammar>
    ;

While this looks more elegant to me, ANTLR doesn't seem to support it well: 
I can't find an easy way to switch the token buffer back and forth (like a stack), and 
I have a bad feeling about the ramifications this will have for lookahead and 
guessing.
Any input would be greatly appreciated :)
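
To show what I mean by switching the token buffer "like a stack", here is a minimal Python sketch (not ANTLR code). The class StackedTokenSource and its methods push and next_token are hypothetical names, and the toy input (statements, "end", then "var NAME" declarations) is an assumption.

```python
class StackedTokenSource:
    """Serve tokens from the most recently pushed source first."""

    def __init__(self, lexer_tokens):
        self._stack = [iter(lexer_tokens)]

    def push(self, buffered_tokens):
        self._stack.append(iter(buffered_tokens))

    def next_token(self):
        while self._stack:
            try:
                return next(self._stack[-1])
            except StopIteration:
                self._stack.pop()    # exhausted: fall back to the source below
        return None                  # end of input


source = StackedTokenSource("x = y end var x var y".split())

# Phase 1: hold the statement tokens back instead of parsing them.
held = []
tok = source.next_token()
while tok is not None and tok != "end":
    held.append(tok)
    tok = source.next_token()

# Phase 2: consume the declarations straight from the lexer.
declared = set()
tok = source.next_token()
while tok == "var":
    declared.add(source.next_token())
    tok = source.next_token()

# Phase 3: replay the held statements on top of the exhausted lexer stream.
source.push(held)
replayed = []
tok = source.next_token()
while tok is not None:
    replayed.append(tok)
    tok = source.next_token()
```

This sketch reads one token at a time and has no lookahead at all, which is exactly the part I'm unsure about in a real parser.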

Regards, Harald Maassen

	-----Original Message----- 
	From: Bryan Ewbank [mailto:ewbank at synopsys.com] 
	Sent: Wed 11/24/2004 5:50 PM 
	To: antlr-interest at yahoogroups.com 
	Cc: 
	Subject: RE: [antlr-interest] Storing ambiguity in the tree

	Alexey, others,

	I'm doing the same thing - for me, it was forced because the new language
	requires support for user-defined infix operators (with differing precedence
	and arity - ugh).  That means I have a pass that does precedence after the
	pass that does type analysis.  On the up side, it greatly simplifies the
	parser grammar:

	        EXPR : ( operator | terminal )* { ## = #( [EXPR,"EXPR"], ##); }
	                // EXPR node is to provide a hook for precedence pass.
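
A separate precedence pass over such a flat EXPR node could look roughly like the Python sketch below. It uses precedence climbing, which is a swapped-in technique and not necessarily what the pass described above does. It handles binary, left-associative operators only; the function name build_tree and the operator table user_ops are hypothetical.

```python
def build_tree(flat, precedence):
    """Precedence climbing over a flat [operand, op, operand, ...] list."""
    pos = 0

    def parse(min_prec):
        nonlocal pos
        left = flat[pos]
        pos += 1
        while pos < len(flat) and precedence[flat[pos]] >= min_prec:
            op = flat[pos]
            pos += 1
            right = parse(precedence[op] + 1)   # +1 makes operators left-associative
            left = (op, left, right)
        return left

    return parse(0)


# The table could be filled in from user-defined operator declarations.
user_ops = {"+": 1, "-": 1, "*": 2}
tree = build_tree(["a", "+", "b", "*", "c"], user_ops)
chain = build_tree(["a", "-", "b", "-", "c"], user_ops)
```

Because the table is plain data, it can be built by an earlier pass before any expression is structured.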

	To take this further, I'm also recognizing if/else as separate entities,
	then rejecting (as a semantic error) any else-statement without a leading
	if-statement.
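
The if/else check can be sketched in Python as a scan over statement kinds. This is only an illustration under the assumption that "leading" means "immediately preceding"; the function name check_else_placement and the statement labels are hypothetical.

```python
def check_else_placement(statements):
    """Return the indices of else-statements not directly preceded by an if-statement."""
    errors = []
    previous = None
    for index, kind in enumerate(statements):
        if kind == "else" and previous != "if":
            errors.append(index)
        previous = kind
    return errors


ok = check_else_placement(["if", "else", "assign"])
bad = check_else_placement(["assign", "else", "if", "else", "else"])
```

Each reported index would be turned into a semantic error message rather than a syntax error.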

	I'm also working on a paper on this, titled something like "precedence ain't
	parsing", that will hopefully end up on antlr.org someday.

	- Bryan Ewbank
	"The best tool for requirements analysis and design is a crayon"

	Alexey Demakov said:
	> My idea is to build some "flat" tree for ambiguous portions
	> of input, i.e. all input tokens should be children of one root node.
	>
	> So in the second pass we can run the parser on the list of children
	> and build a subtree. It's only an idea; I can't give you more details
	> on the implementation now.
