[antlr-interest] Grammar hints?

James Ladd james_ladd at hotmail.com
Sun Dec 25 13:58:56 PST 2011


I googled "ANTLR grammar hints" but I can't find a description of how to
hint to ANTLR which path to take. How do I do this?

> From: antlr-interest-request at antlr.org
> Subject: antlr-interest Digest, Vol 85, Issue 20
> To: antlr-interest at antlr.org
> Date: Sun, 25 Dec 2011 12:00:01 -0800
> 
> Today's Topics:
> 
>    1. Re: De-emphasizing tree grammars? (Terence Parr)
>    2. Re: Composite Grammars (Benjamin S Wolf)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Sat, 24 Dec 2011 12:11:13 -0800
> From: Terence Parr <parrt at cs.usfca.edu>
> Subject: Re: [antlr-interest] De-emphasizing tree grammars?
> To: antlr-interest Interest <antlr-interest at antlr.org>
> Cc: "George S. Cowan" <cowang at comcast.net>,	Jason Osgood
> 	<jason at jasonosgood.com>
> Message-ID: <D977C814-4754-48F7-BEB1-88DD9798D137 at cs.usfca.edu>
> Content-Type: text/plain;	charset=windows-1252
> 
> Hi gang! Thanks to George, Gavin, Kyle, Jason, et al for bringing up this topic. First, let me point out some blog entries that I have that describe the new parser listener stuff:
> 
> http://www.antlr.org/wiki/display/~admin/2011/09/08/Sample+v4+generated+visitor
> 
> http://www.antlr.org/wiki/display/~admin/2011/09/05/Auto+tree+construction+and+visitors
> 
> I recently had the opportunity to examine some software that made extensive use of visitors over a bytecode stream to not only collect information but to translate into another form. I decided to experiment with ANTLR v4's implementation. I was able to collapse all of my tree grammars into a single tree grammar that triggered listener events like SAX. (I did not alter the fact that my parser built an AST, not a parse tree.) What I ended up with is a tree grammar that sent high-level events like "found rule definition", "found token reference", and so on. It became extremely easy to, say, make another pass over the tree to grab information. As I looked at the event listener mechanism, I realized that: *a parse tree would give me the exact same thing without a tree grammar and the parse tree can be automatically generated.* My bias towards compiler style AST expression trees may have blinded me to a simple truth. um... for 20 years.
> 
> With a single decision, I had stripped away 2 large pieces of work: AST specification and tree grammar specification. The only question is, is it useful? Well, first, why do we build trees at all? The answer is we sometimes need to process information in a non-sequential manner and sometimes we need to make multiple passes over the tree. For example, we might want to go find all symbol definitions and then process all symbol references. Neither requirement says we have to have any particular kind of tree.
> 
> As Gavin points out, getting error nodes into the AST to represent error recovery token consumption is not well done in v3. In v4, it doesn't bother since it puts all of that error information in the parse tree.
> 
> I will also point out that it's really hard to get the original input sequence back from an AST, particularly if you have hidden tokens. Parse trees in contrast make this very easy. Parse trees are just much more natural for use with IDEs.
> 
> Gavin asks about a type safe syntax tree. I believe v4 will provide this because there is a node type for each rule in the grammar, or optionally each alternative in the grammar. The listener interface generates enter and exit rule events for each type. For example,
> 
> public interface TListener extends ParseTreeListener {
>     void enterRule(TParser.ifstatContext ctx);
>     void exitRule(TParser.ifstatContext ctx);
>     ...
> }
> 
> The context object coincidentally is also where I store all parameters, locals, return values, labels, etc. That means that listener methods have access to the complete context of the rule invocation. (In ifstatContext, you'll see the usual double dispatch methods that trigger the appropriate listener event.)
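
A minimal, self-contained sketch of the enter/exit listener idea described above, written in plain Java rather than against the ANTLR runtime. RuleContext, Listener, and walk are hypothetical stand-ins chosen for illustration; they are not the v4 API, whose generated names may differ.

```java
import java.util.ArrayList;
import java.util.List;

public class ListenerSketch {
    // Hypothetical stand-in for a rule context: one node per rule invocation.
    static class RuleContext {
        final String ruleName;
        final List<RuleContext> children = new ArrayList<>();

        RuleContext(String ruleName, RuleContext... kids) {
            this.ruleName = ruleName;
            for (RuleContext kid : kids) {
                children.add(kid);
            }
        }
    }

    // Hypothetical listener: it only reacts to events and never walks children.
    interface Listener {
        void enterRule(RuleContext ctx);
        void exitRule(RuleContext ctx);
    }

    // The walker owns the traversal and fires SAX-like events in document order.
    static void walk(Listener listener, RuleContext node) {
        listener.enterRule(node);
        for (RuleContext child : node.children) {
            walk(listener, child);
        }
        listener.exitRule(node);
    }

    // Builds a tiny hand-made tree and records the events the walker fires.
    static List<String> collectEvents() {
        RuleContext tree = new RuleContext("prog",
            new RuleContext("ifstat",
                new RuleContext("expr"),
                new RuleContext("stat")));
        List<String> events = new ArrayList<>();
        walk(new Listener() {
            public void enterRule(RuleContext ctx) { events.add("enter " + ctx.ruleName); }
            public void exitRule(RuleContext ctx) { events.add("exit " + ctx.ruleName); }
        }, tree);
        return events;
    }

    public static void main(String[] args) {
        collectEvents().forEach(System.out::println);
    }
}
```

The tree here is built by hand only to keep the sketch standalone; in the scheme Ter describes, the parser would produce it.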
> 
> If you don't want to use the listener interface, you have the entire parse tree so you can treat it like a DOM thingie if you want; e.g., you can build your own visitors.
> 
> You can turn this feature on without regenerating anything. Just turn on a runtime flag and ANTLR will stitch the rule invocation contexts together to form a parse tree.
> 
> Nothing is lost: tokens consumed or missing during the parse appear in the parse tree. For example, here are 2 parse trees associated with extra tokens and missing tokens:
> 
> 
> 
> 
> These were generated by calling inspect on the root of the parse tree--a GUI pops up; some sample code:
> 
> ParserRuleContext tree = parser.prog();
> tree.save(parser, "/tmp/t.ps"); // Generate postscript:
> tree.inspect(parser); // or view in dialog box
> 
> @Jason: yep, I am basically following the approach you have. You no longer have to put actions in the grammar, because the listener methods have access to all labels and other attributes of each rule invocation. If you take a look at this new mechanism, I think you'll agree that it gives you the super simplicity of the SAX listener you want.
> 
> I like my listener event mechanism because the listener methods do not have to include the boilerplate code to visit the children. All you do is respond to the event. Listener methods don't have a return value because any values needed by processing further up the tree can simply be read from the rule return values, which are also stored in the context object.
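
A small sketch of the point above about listener methods having no return value: each result is stored in the node's context, and the parent's event handler reads it from there. ExprContext, ExprListener, and the value field are hypothetical names for illustration, not generated ANTLR code.

```java
import java.util.ArrayList;
import java.util.List;

public class ContextValueSketch {
    // Hypothetical context node for an expression rule.
    static class ExprContext {
        final String op;      // "+" or "*"; null marks a literal
        final int literal;
        final List<ExprContext> children = new ArrayList<>();
        int value;            // plays the role of a rule return value kept in the context

        ExprContext(int literal) {
            this.op = null;
            this.literal = literal;
        }

        ExprContext(String op, ExprContext left, ExprContext right) {
            this.op = op;
            this.literal = 0;
            children.add(left);
            children.add(right);
        }
    }

    // Listener methods return void; results travel through the context objects.
    interface ExprListener {
        void exitExpr(ExprContext ctx);
    }

    // The walker visits children first, then fires the exit event for the node.
    static void walk(ExprListener listener, ExprContext node) {
        for (ExprContext child : node.children) {
            walk(listener, child);
        }
        listener.exitExpr(node);
    }

    static int evaluate(ExprContext root) {
        walk(ctx -> {
            if (ctx.op == null) {
                ctx.value = ctx.literal;
            } else {
                int left = ctx.children.get(0).value;   // already computed by the child's event
                int right = ctx.children.get(1).value;
                ctx.value = ctx.op.equals("+") ? left + right : left * right;
            }
        }, root);
        return root.value;
    }

    // Hand-built tree for 2 + 3 * 4.
    static ExprContext sample() {
        return new ExprContext("+",
            new ExprContext(2),
            new ExprContext("*", new ExprContext(3), new ExprContext(4)));
    }

    public static void main(String[] args) {
        System.out.println(evaluate(sample()));
    }
}
```

The listener body contains no traversal code at all; the ordering guarantee (children before parent on exit) is what makes reading the children's stored values safe.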
> 
> As Kyle points out, a big benefit of this automatic parse tree construction and listener event mechanism is that it renders grammars 100% reusable and retargetable to any target programming language. (Sam Harwell has convinced me to include things like skip and setting channels in the lexer with special syntax rather than actions... again we get retargeting).
> 
> Concerning the neutral imperative language, which we discussed before, I love the idea but I'm not sure how much this helps us. I think that the biggest problem in creating a target is not the code generation templates, which are much improved in v4, but rather the largish library. Of course, if we strip out all of the AST stuff and the tree grammar stuff, it's actually pretty simple ;)
> 
> Oh, let me also mention that I have implemented a twist on Jim Idle's magic sync function to really improve error correction. In a nutshell, it tries extremely hard to stay within the current rule and recover in line instead of punting and consuming until it sees a token in the follow set.
> 
> Gavin says:
> 
> > I was more thinking along the lines of I wish ANTLR would be able to
> > build the tree for me, but out of typesafe node classes, and without
> > the throwing-away-bits-of-the-tree behaviour that caused me so many
> > problems. But perhaps a SAX-style API would just be a simpler, more
> > robust solution.
> 
> Ask and ye shall receive. What I have built is exactly what you asked for. Type safe, automatically constructed, DOM or SAX model.
> 
> Sorry for the stream of consciousness... just core dumping so I can get back to work ;) I apologize for my extreme absence on the mailing lists... last semester kicked my ass and I'm now trying to catch up on research.
> 
> Ter
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Sat, 24 Dec 2011 17:44:12 -0800
> From: Benjamin S Wolf <jokeserver at gmail.com>
> Subject: Re: [antlr-interest] Composite Grammars
> To: antlr-interest at antlr.org
> Message-ID:
> 	<CAN51Nt7zQTP9QCN6GptjWtO1QZp5fX8Ej2B3jExQCuN7=g9zFg at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
> 
> Actually, there are still issues here. Namely GLexer is trying to use
> both A and B directly as delegates, but never initializes the A
> delegate for G_B_A. G_B does, which leads me to believe that this can
> be solved in the constructor by adding "gA = gB.gA" in the Java case,
> "self.gA = self.gB.gA" for Python, "ctx->gA = ctx->gB->gA" for C, etc.
> But then again G_B is delegating to G_B_A; why then does GLexer want
> to delegate directly to G_B_A?
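
A sketch of the constructor workaround proposed above ("gA = gB.gA"), using plain Java classes rather than generated ANTLR 3 lexers. LexerA, LexerB, and RootLexer are hypothetical stand-ins for G_B_A, G_B, and GLexer; the real generated constructors take more arguments and may be shaped differently.

```java
public class DelegateSketch {
    // Stand-in for G_B_A, the innermost delegate.
    static class LexerA {
        String name() { return "A"; }
    }

    // Stand-in for G_B: it creates and owns its A delegate.
    static class LexerB {
        final LexerA gA;

        LexerB() {
            this.gA = new LexerA();
        }
    }

    // Stand-in for GLexer: it wants to reach both B and A directly.
    static class RootLexer {
        final LexerB gB;
        final LexerA gA;

        RootLexer() {
            this.gB = new LexerB();
            // The proposed fix: reuse the delegate that B already initialized
            // instead of leaving this field unset.
            this.gA = this.gB.gA;
        }
    }

    // True when the root's A delegate is set and is the same object B uses.
    static boolean sharesDelegate() {
        RootLexer g = new RootLexer();
        return g.gA != null && g.gA == g.gB.gA;
    }

    public static void main(String[] args) {
        System.out.println(sharesDelegate());
    }
}
```

This only illustrates the field wiring; it does not answer the open question of why the root lexer delegates directly to A in the first place.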
> 
> (Attached GLexer.java and the full grammar in G.zip.)
> 
> On Fri, Dec 23, 2011 at 9:49 PM, Benjamin S Wolf <jokeserver at gmail.com> wrote:
> > I've gotten some very strange errors while trying to make a composite
> > grammar, and I think I've figured out why and/or a way around it. I'm
> > posting this because the error messages were not that helpful on their
> > own, and I had to fool around for a while with a minimal test case
> > until I found a way out of the errors.
> >
> > I have a composite grammar G, which imports two disjoint lexer
> > grammars A and B, and a parser grammar C (which only requires the
> > tokens from A). Using antlr3.4 on G with varying subsequent changes
> > gives one of the following sets of errors, regardless of output option
> > or language.
> >
> > 1. G has no rules.
> >
> > 2. parser rule ... not allowed in lexer, lexer rule ... not allowed in
> > parser, etc.
> >
> > 3. java.lang.ClassCastException: org.antlr.runtime.tree.CommonTree
> > cannot be cast to org.antlr.tool.GrammarAST.
> >
> > The short answer (before I go into details below) is that a) G needs a
> > parser rule, not just lexer rules, and b) G should only import one
> > lexer grammar, and the others should be imported by that one.
> > Strangely, b) does not apply to parser grammars, as I added a second
> > parser grammar D (dependent on both A and B) to test, and G is fine*
> > either way.
> >
> > The long story: When I encountered (1), I added a dummy lexer rule
> > "COMMA : ',' ;". This cured G's lack of rules but now antlr3.4 was
> > giving me (2), where it seemed that antlr3 thought I was putting all
> > of A's lexer rules in C and all of C's parser rules in A (and B,
> > etc.). Since I had no rules dependent on B, I removed it from being
> > imported. With G importing only A and C, I was now getting (3). I
> > added the rule "comma : COMMA ;" to G and now antlr3 completed
> > successfully (and still did when I folded these two rules together
> > into "comma : ',' ;"). So I added B back to the import list from G,
> > and it gave me (2) again. But removing B from G's import list and
> > making A import it made it work fine.
> >
> > So antlr3 successfully produces a recognizer for G when G imports A,
> > C, and D, where A imports B, or when G imports B, C, and D, and B
> > imports A**.
> >
> > I am not sure of the root reason behind the inability of the top level
> > of a composite grammar to import two lexer grammars (whether a design
> > decision or bug, e.g.) as none of the documentation I could find on
> > composite grammars indicates either that this is the case or should be
> > otherwise. I would have liked a better error message in place of (2),
> > at least for the case where G had a lexer rule but not a parser rule,
> > because it would have saved a little bit of stumbling around.
> >
> > *By "fine" I mean antlr3 finishes successfully. But if G doesn't
> > import B, then the generated lexer can't produce tokens defined in B
> > and so the rules in D can't be reached.
> >
> > **Unless you're like me, and have an unfortunately large lexer grammar
> > B, which causes antlr3 to run out of stack space if G imports A
> > imports B but not if G imports B imports A.
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: GLexer.java
> Type: application/octet-stream
> Size: 3721 bytes
> Desc: not available
> Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20111224/3b83b18c/attachment-0001.obj 
> -------------- next part --------------
> A non-text attachment was scrubbed...
> Name: G.zip
> Type: application/zip
> Size: 916 bytes
> Desc: not available
> Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20111224/3b83b18c/attachment-0001.zip 
> 
> ------------------------------
> 
> _______________________________________________
> antlr-interest mailing list
> antlr-interest at antlr.org
> http://www.antlr.org/mailman/listinfo/antlr-interest
> 
> End of antlr-interest Digest, Vol 85, Issue 20
> **********************************************
 		 	   		  

