[antlr-interest] Re: antlr-interest Digest, Vol 3, Issue 26
何超
hc7750 at 163.com
Sat Feb 19 19:43:17 PST 2005
Though these warnings exist, the lexer still works correctly.
> Today's Topics:
>
> 1. ANTLR Debugging & XML Work (Scott Stanchfield)
> 2. lexical nondeterminism (Rice Yeh)
> 3. Re: lexical nondeterminism (Bryan Ewbank)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Fri, 18 Feb 2005 19:06:00 -0500
> From: "Scott Stanchfield" <scott at javadude.com>
> Subject: [antlr-interest] ANTLR Debugging & XML Work
> To: <antlr-interest at antlr.org>
> Message-ID: <20050218235422.A6F3D290413 at new.knowspam.net>
> Content-Type: text/plain; charset="us-ascii"
>
> Hey all!
>
> LOOKING FOR SOME GUINEA PIGS...
> SUMMARY
> =======
> If you have an ANTLR Java parser, AND use Eclipse 3.0.x or 3.1Mx, AND would
> like to help me ensure I didn't break ANTLR with some debugging and XML
> changes, please email me: scott at javadude.com
>
>
> THE DETAILS
> ===========
>
> I'm back from my long absence of ANTLR-related fun! I convinced some folks
> at work to try ANTLR for XML parsing which started a whole "let's make
> things easier" binge for me...
>
> (Note that my email address has changed several times since I wrote
> ParseView. I saw in the mailing list that some people have had trouble
> reaching me. Use scott at javadude.com...)
>
>
> I've done some major tinkering with ANTLR and the ANTLR-Eclipse plugin to do
> the following:
>
> Debugging-Related Stuff
> =======================
> I fixed ParseView and added JSR-45 debugging support...
>
> * Made ParseView work again and integrated with the ANTLR distro
> * Simplified the debugging API for ParseView (reducing some ANTLR debug
> code)
> * Added JSR-45 support to ANTLR so you can walk through the grammar in the
> eclipse debugger
> (see http://javadude.com/misc/antlr-debug/antlr-debug.html for a quick
> demo)
> * Added a JSR-45 SMAP installer to the ANTLR-eclipse plugin
> (this is based off some code from Apache Tomcat, so if this is released as
> part of Torsten's ANTLR-Eclipse plugin we need to figure out if there are
> licensing issues... otherwise I can make it a separate eclipse plugin w/
> an Apache license, but it's cleaner to integrate with ANTLR-Eclipse and
> avoid having the user add Smap support to the project as well as ANTLR
> support)
>
>
>
> XML-Related Stuff
> =================
> I started using XPA, but I really wanted something that did normal XML
> validation against a DTD or schema. (With XPA, I'd basically have to emulate
> the same validation in a grammar.) I also wanted some simpler syntax in the
> ANTLR grammar.
>
> So, I created an XMLTokenStream that accepts a SAX parser and runs it to
> fill a buffer with tokens, which an ANTLR parser then pulls from. (The
> buffer has high and low water marks so it doesn't suck the whole file into
> memory at once.)
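
A rough sketch of the buffering idea described above, written in Python rather than Java for brevity. It is not Scott's XMLTokenStream: the class name XmlTokenStream, the token tuples, and the low_water / high_water / chunk_size parameters are all assumptions made for illustration. It feeds a SAX parser in small chunks and only refills the token buffer when it drops below the low-water mark.

```python
import xml.sax
from collections import deque


class _TokenCollector(xml.sax.ContentHandler):
    """SAX handler that turns parse events into (kind, text, attrs) tuples."""

    def __init__(self, buffer):
        super().__init__()
        self._buffer = buffer

    def startElement(self, name, attrs):
        self._buffer.append(("START", name, dict(attrs.items())))

    def endElement(self, name):
        self._buffer.append(("END", name, None))

    def characters(self, content):
        if content.strip():  # skip whitespace-only text
            self._buffer.append(("TEXT", content, None))


class XmlTokenStream:
    """Hypothetical pull-style token stream fed by an incremental SAX parser."""

    def __init__(self, source, low_water=2, high_water=4, chunk_size=16):
        self._source = source          # file-like object with read(n)
        self._low_water = low_water    # refill when fewer tokens are buffered
        self._high_water = high_water  # stop refilling at this many tokens
        self._chunk_size = chunk_size
        self._buffer = deque()
        self._done = False
        self._parser = xml.sax.make_parser()
        self._parser.setContentHandler(_TokenCollector(self._buffer))

    def _refill(self):
        # The high-water mark is a soft limit: one chunk may yield several tokens.
        while not self._done and len(self._buffer) < self._high_water:
            chunk = self._source.read(self._chunk_size)
            if chunk:
                self._parser.feed(chunk)
            else:
                self._parser.close()  # flush the parser at end of input
                self._done = True

    def next_token(self):
        if len(self._buffer) < self._low_water:
            self._refill()
        return self._buffer.popleft() if self._buffer else ("EOF", None, None)
```

A parser would call next_token() repeatedly until it sees the EOF tuple. A real implementation would also need error handling and namespace support, which this sketch leaves out.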
>
> * Created an XMLTokenStream class that takes a SAX parser (configured
> however you like) as a scanner, creates tokens for the start/end tags and
> content, and can be attached to a parser
>
> * Added an xmlTag rule option
> webApp options {xmlTag="web-app";}
> This automatically generates refs to the start and end tag for web-app, as
> well as setting up attribute reference support
> * Added "@name" support to reference attributes in code
> You can write code like
> { createPerson(@name, @address, @phone); }
> in a rule that has an xmlTag option. The attributes will be extracted from
> that tag
>
> * Added xmlNamespaceMapping options to the grammar to allow prefix use for
> tags ref'd in rules
>
> I've still got some more work to do to make the JSR-45 support follow rule
> references, but it's looking pretty good so far.
>
>
> REQUEST FOR TESTERS!
> ====================
> All of this required some work on the ANTLR parser and the java code
> generator. I'd like to get the warm fuzzies that I didn't blow anything up,
> so...
>
> To anyone who has a working parser AND uses Eclipse 3.0.x or 3.1Mx, if
> you're willing to run a quick test to see if things behave the same, please
> email me: scott at javadude.com. (Note that you'll need to answer a challenge
> at knowspam.net for me to get the email.)
>
> I will send a zip containing the modified eclipse plugins for ANTLR.
>
>
> MY TO DO LIST...
> ================
> I've still got some work to do, so release of all of this is probably a few
> weeks to a month away...
>
> * Add rule-ref support in SMAPs for grammar debugging. (I've started with
> it, but I'm having some silly little "off-by-one" problems in the mapping...
> Grrr... Arg...)
>
> * Add a token buffer "spy" to ParseView
>
> * Add checking for invalid breakpoint locations (right now you can set
> breakpoints anywhere in a grammar but they're ignored if they're outside
> actions)
>
> * If I have time, convert ParseView to an Eclipse plugin with some
> nicer-looking views
>
> * Work with Terence and Torsten to get all of this in the official ANTLR and
> ANTLR-Eclipse releases
>
> * Figure out who else is working on ANTLR (I hear there's a 2.8.0 in dev?)
> and merge their changes in
>
> I'd like to eventually create a new ANTLR editor for Eclipse based on the
> Structured Source Editor (SSE) that comes with the WebTools project from
> Eclipse.org. That would support all sorts of wonderful Java development
> features inside the action code, like code completion and so forth...
>
>
> Whew! That was a lot to say...
>
> Later,
> -- Scott
>
>
>
>
>
>
> ------------------------------
>
> Message: 2
> Date: Sun, 20 Feb 2005 00:19:56 +0800
> From: Rice Yeh <riceyeh at gmail.com>
> Subject: [antlr-interest] lexical nondeterminism
> To: antlr-interest at antlr.org
> Message-ID: <47f71d9405021908197ddc07d3 at mail.gmail.com>
> Content-Type: text/plain; charset=US-ASCII
>
> Hi,
> I have the following lexical rules, where the longest string literal
> is 3 characters, and I have lookahead set to 5 (actually 3 is enough
> for this case). Why is there still nondeterminism, with the following
> message?
>
> lexical nondeterminism between alts 1 and 2 of block upon
> k==1:'F','M','S' k==2:'A','E','O','U' k==3:'N','T'
> k==4:<end-of-token>,'-'
>
>
> MONTH_OF_YEAR:
> ("JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" | "JUL" | "AUG" |
> "SEP" | "OCT" | "NOV" | "DEC")
> ;
>
> DAY_OF_WEEK:
> ("SUN" | "MON" | "TUE" | "WEN" | "THU" | "FRI" | "SAT")
> ;
>
> VALUE
> : DAY_OF_WEEK | MONTH_OF_YEAR
> ;
>
> Regards,
> Rice
>
>
> ------------------------------
>
> Message: 3
> Date: Sat, 19 Feb 2005 12:35:53 -0500
> From: Bryan Ewbank <ewbank at gmail.com>
> Subject: Re: [antlr-interest] lexical nondeterminism
> To: ANTLR Interest <antlr-interest at antlr.org>
> Message-ID: <dd3a065f0502190935278ff5fc at mail.gmail.com>
> Content-Type: text/plain; charset=US-ASCII
>
> The problem is linear approximate lookahead:
>
> Linear Approximate Lookahead
> An approximation to full lookahead (that can be applied to both LL and
> LR parsers) for k>1 that reduces the complexity of storing and testing
> lookahead from O(n^k) to O(nk); exponential to linear reduction. When
> linear approximate lookahead is insufficient (results in a
> nondeterministic parser), you can use the approximate lookahead to
> attenuate the cost of building the full decision.
> -- p9, antlrman.pdf
>
> This means that the lookahead sets are collapsed in such a way that
> ambiguity shows up. There are a few more paragraphs in the manual that
> should help to explain it.
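
The collapsing can be illustrated with a small Python sketch. This is a simplified model, not ANTLR's actual analysis: it reduces each alternative to one character set per lookahead depth and reports a conflict when the sets overlap at every depth. The function names are made up for the example, and the literals are copied from the grammar as posted (including "WEN").

```python
MONTH_OF_YEAR = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                 "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
DAY_OF_WEEK = ["SUN", "MON", "TUE", "WEN", "THU", "FRI", "SAT"]  # "WEN" as posted


def depth_sets(literals, k):
    """Collapse the literals into one character set per lookahead depth."""
    return [{s[i] for s in literals if i < len(s)} for i in range(k)]


def approximate_conflict(alt1, alt2, k):
    """Per-depth overlaps, or None if some depth separates the alternatives."""
    overlaps = [a & b for a, b in zip(depth_sets(alt1, k), depth_sets(alt2, k))]
    return overlaps if all(overlaps) else None


def full_conflict(alt1, alt2):
    """Full lookahead compares whole character sequences, not per-depth sets."""
    return set(alt1) & set(alt2)


if __name__ == "__main__":
    for depth, chars in enumerate(
            approximate_conflict(DAY_OF_WEEK, MONTH_OF_YEAR, 3), start=1):
        print(f"k=={depth}: {sorted(chars)}")
    print("full lookahead conflict:", full_conflict(DAY_OF_WEEK, MONTH_OF_YEAR))
```

The per-depth overlaps come out as F/M/S, A/E/O/U and N/T, matching the first three depths of the warning, while the full comparison finds no literal common to both rules. In this model a string such as "MAN" passes the per-depth test for both alternatives even though neither rule matches it, which is why the warning appears although the lexer behaves correctly on valid input.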
>
> The easiest solution I've found (there are probably others) is to
> accept an identifier ([a-z][a-z][a-z]), and then use a lookup table to
> grab the ones I want. The other solution I've used is to use flex for
> token generation.
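
The lookup-table approach might look roughly like this in Python. It is only a sketch of the idea, assuming the lexer has already matched a generic three-letter word; the classify function and the IDENT fallback name are hypothetical, and the table entries are copied from the grammar as posted.

```python
MONTH_OF_YEAR = frozenset(["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                           "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"])
DAY_OF_WEEK = frozenset(["SUN", "MON", "TUE", "WEN", "THU", "FRI", "SAT"])  # as posted


def classify(word):
    """Map an already-matched three-letter word to a token type name."""
    if len(word) != 3 or not word.isalpha():
        raise ValueError(f"not a three-letter word: {word!r}")
    key = word.upper()
    if key in MONTH_OF_YEAR:
        return "MONTH_OF_YEAR"
    if key in DAY_OF_WEEK:
        return "DAY_OF_WEEK"
    return "IDENT"  # anything else stays a plain identifier
```

Because the lexer rule only has to recognize "three letters", it needs a single character of lookahead, and the month/day distinction moves into an exact table lookup where no approximation is involved.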
>
>
>