[antlr-interest] Storing ambiguity in the tree
Maassen, H.A.M.
H.A.M.Maassen at student.tue.nl
Thu Nov 25 05:36:41 PST 2004
Thanks for the suggestions,
so if I understand correctly, you propose parsing the source file and storing the
statements-part of the code as a flat tree. Then, in a second pass (using a
TreeParser), do the actual parsing of the statements (using a symbol table from the
first pass).
I have two problems with this:
-) First off is a practical problem: I can't get the treeparser to decently build
trees :(
It won't allow the "^" operator, and commands like "{ ## = ([VIRT, "VIRTUAL"], ##); }"
cause invalid AST node type crashes. If I'm being a complete newbie here please
help me out! :)
-) Second is a more theoretical problem in that it all seems a little.. awkward. The
grammar gets spread over two parsers (and some parts copied in both), and the source
is walked twice.
This seems especially convoluted when compared to the (IMO) much more intuitive idea
of keeping statements "on hold" until the accompanying declarations are parsed:
program: stats[true] decs
{ this.setTokenBuffer = mytokenbuffer; }
stats[false]
{ this.setTokenBuffer = <the lexer>; }
;
stats[boolean buffering]
: { buffering }?
( ~("end")
{ mytokenbuffer.Add(t); }
)*
| <actual statement grammar>
;
decs: <declaration grammar>
;
While this looks more elegant to me, it doesn't seem ANTLR is very supportive of it -
I can't find an easy way to switch the tokenbuffer back and forth (like a stack) and
I have a bad feeling about the ramifications this will have on lookahead and
guessing.
Any input would be greatly appreciated :)
Regards, Harald Maassen
-----Original Message-----
From: Bryan Ewbank [mailto:ewbank at synopsys.com]
Sent: Wed 11/24/2004 5:50 PM
To: antlr-interest at yahoogroups.com
Cc:
Subject: RE: [antlr-interest] Storing ambiguity in the tree
Alexey, others,
I'm doing the same thing - for me, it was forced because the new language
requires support for user-defined infix operators (with differing precedence
and arity - ugh). That means I have a pass that does precedence after the
pass that does type analysis. On the up side, it greatly simplifies the
parser grammar:
EXPR : ( operator | terminal )* { ## = #( [EXPR,"EXPR"], ##); }
// EXPR node is to provide a hook for precedence pass.
To take this further, I'm also recognizing if/else as separate entities,
then rejecting (as a semantic error) any else-statement without a leading
if-statement.
I'm working also on a paper on this, titled something like "precedence ain't
parsing", that will hopefully end up on antlr.org someday.
- Bryan Ewbank
"The best tool for requirements analysis and design is a crayon"
Alexey Demakov said:
> My idea is to build some "flat" tree for ambiguous portions
> of input, i.e. all input tokens should be children of one root node.
>
> So at the second pass we can run parser on list of children
> and build subtree. It's only idea, I can't get you more details
> on implementation now.
Yahoo! Groups Links
Yahoo! Groups Links
<*> To visit your group on the web, go to:
http://groups.yahoo.com/group/antlr-interest/
<*> To unsubscribe from this group, send an email to:
antlr-interest-unsubscribe at yahoogroups.com
<*> Your use of Yahoo! Groups is subject to:
http://docs.yahoo.com/info/terms/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/ms-tnef
Size: 7994 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20041125/da147fbb/attachment.bin
More information about the antlr-interest
mailing list