[antlr-interest] Re: antlr-interest Digest, Vol 3, Issue 26

Sat Feb 19 19:43:17 PST 2005

Though these warnings exist, the lexer still works well.


> Today's Topics:
> 
>    1. ANTLR Debugging & XML Work (Scott Stanchfield)
>    2. lexical nondeterminism (Rice Yeh)
>    3. Re: lexical nondeterminism (Bryan Ewbank)
> 
> 
> ----------------------------------------------------------------------
> 
> Message: 1
> Date: Fri, 18 Feb 2005 19:06:00 -0500
> From: "Scott Stanchfield" <scott at javadude.com>
> Subject: [antlr-interest] ANTLR Debugging & XML Work
> To: <antlr-interest at antlr.org>
> Message-ID: <20050218235422.A6F3D290413 at new.knowspam.net>
> Content-Type: text/plain;	charset="us-ascii"
> 
> Hey all!
> 
> LOOKING FOR SOME GUINEA PIGS...
> SUMMARY
> =======
> If you have an ANTLR Java parser, AND use Eclipse 3.0.x or 3.1Mx, AND would
> like to help me ensure I didn't break ANTLR with some debugging and XML
> changes, please email me: scott at javadude.com
> 
> 
> THE DETAILS
> ===========
> 
> I'm back from my long absence of ANTLR-related fun! I convinced some folks
> at work to try ANTLR for XML parsing which started a whole "let's make
> things easier" binge for me...
> 
> (Note that my email address has changed several times since I wrote
> ParseView. I saw in the mailing list that some people have had trouble
> reaching me. Use scott at javadude.com...)
> 
> 
> I've done some major tinkering with ANTLR and the ANTLR-Eclipse plugin to do
> the following:
> 
> Debugging-Related Stuff
> =======================
> I fixed ParseView and added JSR-45 debugging support...
> 
> * Made ParseView work again and integrated it with the ANTLR distro
> * Simplified the debugging API for ParseView (reducing some ANTLR debug
>   code)
> * Added JSR-45 support to ANTLR so you can walk through the grammar in the
>   Eclipse debugger
>   (see http://javadude.com/misc/antlr-debug/antlr-debug.html for a quick
>   demo)
> * Added a JSR-45 SMAP installer to the ANTLR-Eclipse plugin
>   (this is based on some code from Apache Tomcat, so if this is released as
>   part of Torsten's ANTLR-Eclipse plugin we need to figure out if there are
>   licensing issues... otherwise I can make it a separate Eclipse plugin w/
>   an Apache license, but it's cleaner to integrate with ANTLR-Eclipse and
>   avoid having the user add SMAP support to the project as well as ANTLR
>   support)
> 
> 
> 
> XML-Related Stuff
> =================
> I started using XPA, but I really wanted something that did normal XML
> validation against a DTD or schema. (With XPA, I'd basically have to emulate
> the same validation in a grammar.) I also wanted some simpler syntax in the
> ANTLR grammar.
> 
> So, I created an XMLTokenStream that accepts a SAX parser and runs it to
> fill a buffer with tokens, which an ANTLR parser then pulls from. (The
> buffer has high and low water marks so it doesn't suck the whole file into
> memory at once.)
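
The push-to-pull idea described above can be sketched roughly as follows. This is a minimal illustration and not the XMLTokenStream class from the post: the class name SaxTokenSketch, the "START:"/"END:"/"TEXT:" string tokens, and the use of a single bounded queue in place of separate high and low water marks are all assumptions made for the example.

```java
import java.io.StringReader;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: a SAX parser feeding a bounded token buffer. */
public class SaxTokenSketch {
    public static final String EOF = "<EOF>";

    // Bounded buffer: the SAX thread blocks while it is full, so the
    // whole document is never turned into tokens in memory at once.
    private final BlockingQueue<String> buffer;

    public SaxTokenSketch(String xml, int capacity) {
        buffer = new ArrayBlockingQueue<>(capacity);
        Thread producer = new Thread(() -> {
            try {
                SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new Handler());
            } catch (Exception e) {
                put("ERROR:" + e.getMessage());
            } finally {
                put(EOF); // always signal end of input to the consumer
            }
        });
        producer.setDaemon(true);
        producer.start();
    }

    /** Pull side: what a parser would call to get its next token. */
    public String nextToken() {
        try {
            return buffer.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private void put(String token) {
        try {
            buffer.put(token);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    /** Push side: turns SAX callbacks into tokens. */
    private class Handler extends DefaultHandler {
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            put("START:" + qName);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            put("END:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                put("TEXT:" + text);
            }
        }
    }
}
```

A consumer would call nextToken() in a loop until it sees SaxTokenSketch.EOF. A real token stream would hand out typed token objects rather than strings; the strings here only keep the sketch short.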
> 
> * Created an XMLTokenStream class that takes a SAX parser (configured
> however you like) as a scanner, creates tokens for the start/end tags and
> content, and can be attached to a parser
> 
> * Added an xmlTag rule option
>     webApp options {xmlTag="web-app";}
>   This automatically generates refs to the start and end tag for web-app, as
>   well as setting up attribute reference support
> * Added "@name" support to reference attributes in code
>   You can write code like
>      {  createPerson(@name, @address, @phone); }
>   in a rule that has an xmlTag option. The attributes will be extracted from
>   that tag
> 
> * Added xmlNamespaceMapping options to the grammar to allow prefix use for
> tags ref'd in rules
> 
> I've still got some more work to do to make the JSR-45 support follow rule
> references, but it's looking pretty good so far.
> 
> 
> REQUEST FOR TESTERS!
> ====================
> All of this required some work on the ANTLR parser and the java code
> generator. I'd like to get the warm fuzzies that I didn't blow anything up,
> so...
> 
> To anyone who has a working parser AND uses Eclipse 3.0.x or 3.1Mx, if
> you're willing to run a quick test to see if things behave the same, please
> email me: scott at javadude.com. (Note that you'll need to answer a challenge
> at knowspam.net for me to get the email.)
> 
> I will send a zip containing the modified eclipse plugins for ANTLR.
> 
> 
> MY TO DO LIST...
> ================
> I've still got some work to do, so release of all of this is probably a few
> weeks to a month away...
> 
> * Add rule-ref support in SMAPs for grammar debugging. (I've started with
> it, but I'm having some silly little "off-by-one" problems in the mapping...
> Grrr... Arg...)
> 
> * Add a token buffer "spy" to ParseView
> 
> * Add checking for invalid breakpoint locations (right now you can set
> breakpoints anywhere in a grammar but they're ignored if they're outside
> actions)
> 
> * If I have time, convert ParseView to an Eclipse plugin with some
> nicer-looking views
> 
> * Work with Terence and Torsten to get all of this in the official ANTLR and
> ANTLR-Eclipse releases
> 
> * Figure out who else is working on ANTLR (I hear there's a 2.8.0 in dev?)
> and merge their changes in
> 
> I'd like to eventually create a new ANTLR editor for Eclipse based on the
> Structured Source Editor (SSE) that comes with the WebTools project from
> Eclipse.org. That would support all sorts of wonderful Java development
> features inside the action code, like code completion and so forth...
> 
> 
> Whew! That was a lot to say...
> 
> Later,
> -- Scott
> 
> 
> 
> 
> 
> 
> ------------------------------
> 
> Message: 2
> Date: Sun, 20 Feb 2005 00:19:56 +0800
> From: Rice Yeh <riceyeh at gmail.com>
> Subject: [antlr-interest] lexical nondeterminism
> To: antlr-interest at antlr.org
> Message-ID: <47f71d9405021908197ddc07d3 at mail.gmail.com>
> Content-Type: text/plain; charset=US-ASCII
> 
> Hi,
>   I have the following lexical rules, where the longest string literal is
> 3 characters, and I have lookahead set to 5 (actually 3 is enough for this
> case). Why is there nondeterminism, with the following message?
> 
> lexical nondeterminism between alts 1 and 2 of block upon
> k==1:'F','M','S' k==2:'A','E','O','U' k==3:'N','T'
> k==4:<end-of-token>,'-'
> 
> 
> MONTH_OF_YEAR: 
> 	("JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" | "JUL" | "AUG" |
> "SEP" | "OCT" | "NOV" | "DEC")
> 	;
> 
> DAY_OF_WEEK:
> 	("SUN" | "MON" | "TUE" | "WEN" | "THU" | "FRI" | "SAT")
> 	;
> 
> VALUE	
> 	:	DAY_OF_WEEK | MONTH_OF_YEAR
> 	;
> 
> Regards,
> Rice
> 
> 
> ------------------------------
> 
> Message: 3
> Date: Sat, 19 Feb 2005 12:35:53 -0500
> From: Bryan Ewbank <ewbank at gmail.com>
> Subject: Re: [antlr-interest] lexical nondeterminism
> To: ANTLR Interest <antlr-interest at antlr.org>
> Message-ID: <dd3a065f0502190935278ff5fc at mail.gmail.com>
> Content-Type: text/plain; charset=US-ASCII
> 
> The problem is linear approximate lookahead:
> 
> Linear Approximate Lookahead
> An approximation to full lookahead (that can be applied to both LL and
> LR parsers) for k>1 that reduces the complexity of storing and testing
> lookahead from O(n^k) to O(nk); exponential to linear reduction. When
> linear approximate lookahead is insufficient (results in a
> nondeterministic parser), you can use the approximate lookahead to
> attenuate the cost of building the full decision.
> -- p9, antlrman.pdf
> 
> This means that the lookahead sets are collapsed in such a way that
> ambiguity shows up.  There are a few more paragraphs in the manual that
> should help to explain it.
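
To make the collapsing concrete, here is a small sketch that models it on the two literal lists from the question. It is a simplified model of the idea, not ANTLR's own analysis code, and the class and method names are made up for the example. For each depth it collects the characters each rule can see at that position, then intersects the two sets. A full-lookahead comparison of whole strings is shown alongside for contrast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of per-depth (linear approximate) lookahead sets. */
public class ApproxLookaheadDemo {
    public static final String[] MONTHS = {
        "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
        "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"
    };
    // "WEN" is spelled as in the posted grammar.
    public static final String[] DAYS = {
        "SUN", "MON", "TUE", "WEN", "THU", "FRI", "SAT"
    };

    /** Characters that any literal can have at the given 0-based depth. */
    public static Set<Character> charsAt(String[] literals, int depth) {
        Set<Character> chars = new TreeSet<>();
        for (String literal : literals) {
            chars.add(literal.charAt(depth));
        }
        return chars;
    }

    /** Approximate view: intersect the per-depth sets independently. */
    public static List<Set<Character>> approximateOverlap(String[] a, String[] b, int k) {
        List<Set<Character>> overlap = new ArrayList<>();
        for (int depth = 0; depth < k; depth++) {
            Set<Character> common = charsAt(a, depth);
            common.retainAll(charsAt(b, depth));
            overlap.add(common);
        }
        return overlap;
    }

    /** Full view: whole character sequences that both rules accept. */
    public static Set<String> fullOverlap(String[] a, String[] b) {
        Set<String> common = new TreeSet<>(Arrays.asList(a));
        common.retainAll(Arrays.asList(b));
        return common;
    }

    public static void main(String[] args) {
        // prints [[F, M, S], [A, E, O, U], [N, T]]
        System.out.println(approximateOverlap(MONTHS, DAYS, 3));
        // prints []
        System.out.println(fullOverlap(MONTHS, DAYS));
    }
}
```

The per-depth intersections match the k==1, k==2 and k==3 sets in the warning, even though no three-letter string belongs to both rules. A sequence such as "SAT" looks like a possible month in the approximate view because S, A and T each appear in some month at the right position (SEP, JAN, OCT).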
> 
> The easiest solution I've found (there are probably others) is to
> accept an identifier ([a-z][a-z][a-z]), and then use a lookup table to
> grab the ones I want.  The other solution I've used is to use flex for
> token generation.
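
The lookup-table approach could look roughly like this on the Java side. It is only a sketch: the class name, the Kind enum, and the case normalization are assumptions for illustration, and the lexer is assumed to have already matched a generic three-letter word.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical lookup table applied after the lexer matches a generic word. */
public class DateWordLookup {
    public enum Kind { MONTH_OF_YEAR, DAY_OF_WEEK, UNKNOWN }

    private static final Map<String, Kind> TABLE = new HashMap<>();

    static {
        for (String month : new String[] {
                "JAN", "FEB", "MAR", "APR", "MAY", "JUN",
                "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"}) {
            TABLE.put(month, Kind.MONTH_OF_YEAR);
        }
        // "WEN" is spelled as in the posted grammar.
        for (String day : new String[] {
                "SUN", "MON", "TUE", "WEN", "THU", "FRI", "SAT"}) {
            TABLE.put(day, Kind.DAY_OF_WEEK);
        }
    }

    /** Classify an already-matched word; unknown words stay UNKNOWN. */
    public static Kind classify(String word) {
        // Upper-casing is an assumption so that "mar" and "MAR" agree.
        return TABLE.getOrDefault(word.toUpperCase(Locale.ROOT), Kind.UNKNOWN);
    }

    public static void main(String[] args) {
        System.out.println(classify("MAR")); // MONTH_OF_YEAR
        System.out.println(classify("SAT")); // DAY_OF_WEEK
        System.out.println(classify("MAN")); // UNKNOWN
    }
}
```

Because the lexer now has a single rule for all three-letter words, there is no decision between alternatives left for the lookahead analysis to warn about. The distinction between months and days moves into the table.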
> 
> On Sun, 20 Feb 2005 00:19:56 +0800, Rice Yeh <riceyeh at gmail.com> wrote:
> > Hi,
> >   I have the following lexical rules, where the longest string literal is
> > 3 characters, and I have lookahead set to 5 (actually 3 is enough for this
> > case). Why is there nondeterminism, with the following message?
> > 
> > lexical nondeterminism between alts 1 and 2 of block upon
> > k==1:'F','M','S' k==2:'A','E','O','U' k==3:'N','T'
> > k==4:<end-of-token>,'-'
> > 
> > MONTH_OF_YEAR:
> >         ("JAN" | "FEB" | "MAR" | "APR" | "MAY" | "JUN" | "JUL" | "AUG" |
> > "SEP" | "OCT" | "NOV" | "DEC")
> >         ;
> > 
> > DAY_OF_WEEK:
> >         ("SUN" | "MON" | "TUE" | "WEN" | "THU" | "FRI" | "SAT")
> >         ;
> > 
> > VALUE
> >         :       DAY_OF_WEEK | MONTH_OF_YEAR
> >         ;
> > 
> > Regards,
> > Rice
> >
> 
> 