[antlr-interest] Rookie attempt at ANTLR 3 (Using ANTLRWORKS second correction attempt)

Sun Oct 29 02:18:12 PST 2006

Hello All:

Thanks for this hint, Jim, and for the rapid and detailed reply.  I'm sorry 
for the delay in responding.

Regards:

Bill

>From: "Jim Idle" <jimi at intersystems.com>
>To: "Foolish Ewe" <foolishewe at hotmail.com>,<antlr-interest at antlr.org>
>Subject: RE: Rookie attempt at ANTLR 3 (Using ANTLRWORKS second correction 
>attempt)
>Date: Thu, 26 Oct 2006 17:35:37 -0400
>
>First, ALPHANUMSTRING can end up matching nothing: it uses a * rather than 
>a +, so it does not force any character to be there. I think that is 
>probably your start rule issue.
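
A minimal sketch of the difference, assuming ALPHANUMSTRING is built from
letters and digits. The original TestGrammar.g is not reproduced in this
thread, so the character ranges below are a guess:

```antlr
// Hypothetical reconstruction of the rule; only the closing operator matters.
// With '*' the subrule may match zero characters, so the token can be empty:
//     ALPHANUMSTRING : ('a'..'z' | 'A'..'Z' | '0'..'9')* ;
// With '+' at least one character must be present:
ALPHANUMSTRING : ('a'..'z' | 'A'..'Z' | '0'..'9')+ ;
```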
>
>The next issue is that all your rules are the same thing. Your lexer only 
>recognizes ALPHANUMSTRING, so every rule is just str=ALPHANUMSTRING.
>
>Next, it is difficult to see exactly what your start rule is trying to 
>achieve, but I guess you are trying to get it to follow multiple lines and 
>end when you see end. I think that you can throw away the newline tokens 
>unless they end up being significant as you expand the grammar to cover the 
>whole language, which is certainly possible. But you need to formulate this 
>such that there is a rule that can match a valid construct, then use a 
>higher rule to say how this repeats. Try thinking out in words how you can 
>describe it (there you go Anthony ;-), such as: a line of code is one 
>statement followed by any number of additional statements separated by a 
>semi-colon, then a NEWLINE; a statement block is any number of statements, 
>including zero, surrounded by {}; etc. Once you can describe it to yourself 
>in English, writing the corresponding rules is much easier.
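
The two English descriptions above translate almost word for word into
parser rules. A rough sketch, assuming ANTLR 3 syntax; the grammar name
LineSketch, the ID token, and the placeholder statement rule are invented
for illustration and are not part of the original grammar:

```antlr
grammar LineSketch;   // hypothetical name

// "A line of code is one statement followed by any number of additional
// statements separated by a semi-colon, then a NEWLINE."
codeLine
	: statement (';' statement)* NEWLINE
	;

// "A statement block is any number of statements, including zero,
// surrounded by {}."
block
	: '{' statement* '}'
	;

// Placeholder: a real grammar would list the language's statements here.
statement
	: ID
	;

ID	: ('a'..'z')+ ;

// NEWLINE stays on the default channel because codeLine depends on it.
NEWLINE	: ('\r' '\n'? | '\n') ;

// Same hidden-channel idiom as the grammar quoted in this message; later
// ANTLR 3 releases spell it $channel=HIDDEN;
WS	: (' ' | '\t')+ {channel=99;} ;
```

Neither codeLine nor block is referenced by another rule, so either can
serve as a start rule.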
>
>However, I am afraid to say that I don't think that this approach is at all 
>correct; basically you are just telling the lexer to tokenize everything 
>that isn't whitespace into one kind of token, then trying to do all the 
>tokenizing in the parser, and not actually doing any parsing. You would be 
>better off, dare I say it, hand crafting such a beast ;-).
>
>All is not lost however, as ANTLR3 can handle your language I believe (but 
>then I believe it can be made to handle anything).
>
>I think that what you should do is lex the keywords, and provide a lexer 
>rule, say IDORSTRING, that matches anything that isn't a keyword. Then in 
>the parser, at the points where you know you can have a non-delimited 
>string, match any possible token that can be a string (with suitable 
>predicated rules to avoid ambiguities where necessary) and interpret it as 
>a non-delimited string. Difficulties arise when an undelimited string is 
>optional and you have to look ahead and use predicates and so on, but 
>that's what ANTLR is good at.
>
>Next, if your keywords can be abbreviated (P, PR, PRI, PRIN, PRINT), then 
>code the keyword accordingly, and distinguish it as a string back in the 
>parser:
>
>PRINT: 'P' ( 'R' ( 'I' ( 'N' ( 'T')? )? )? )? ;
>
>Be careful about ambiguities here. Basically ANTLR will match the first 
>sequence listed (but you may end up with warnings and so on - you will need 
>to experiment).
>
>So that you have an example of all this, I took the liberty of making 
>something close to your sample that produces a tree. That is what you want 
>to do here: get your grammar/parser to produce an unambiguous and correct 
>tree, then write your action code to do whatever it is you want to do with 
>it in the tree parser. You might try to expand this (tested with 
>ANTLRWorks 1.0b5):
>
>grammar TestMe;
>
>options
>{
>	output=AST;
>}
>
>tokens
>{
>	STRING;
>	CODEBLOCK;
>	CODELINE;
>	MONTH;
>}
>
>codeBlock
>	: (c+= codelines)+
>	  END
>
>	  -> ^(CODEBLOCK $c+)
>	;
>
>codelines
>	: m=month		-> ^(CODELINE ^(MONTH $m))
>	| PRINT s=string	-> ^(CODELINE ^(PRINT $s))
>	;
>
>string
>	: i=IDORSTRING			     	-> ^(STRING[$i.text] )
>	| (keyword_strings)=> k=keyword_strings -> ^(STRING[$k.text] )
>	;
>
>keyword_strings
>	: month
>	| PRINT
>	| END
>	;
>
>month	: JAN | FEB | MAR | APR | MAY | JUN | JUL | AUG | SEP | OCT | NOV | DEC ;
>
>JAN	:	'jan' ;
>FEB	:	'feb' ;
>MAR	:	'mar' ;
>APR	:	'apr' ;
>MAY	:	'may' ;
>JUN	:	'jun' ;
>JUL	:	'jul' ;
>AUG	:	'aug' ;
>SEP	:	'sep' ;
>OCT	:	'oct' ;
>NOV	:	'nov' ;
>DEC	:	'dec' ;
>
>END	:	'e' 'n' 'd'
>	;
>
>PRINT	:	'p' ( 'r' ( 'i' ( 'n' ( 't' )? )? )? )? ;
>
>IDORSTRING
>	: (ALPHA | DIGIT)+
>	;
>
>fragment DIGIT
>	:	('0'..'9')
>	;
>
>fragment ALPHA
>	:	('a'..'z')
>	;
>
>WS	: (' ' | '\t')+ {channel=99;}
>	;
>
>NEWLINE	: ('\r' '\n'? | '\n') { channel=99;}
>	;
>
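
The tree parser mentioned above is not shown in the message. A rough sketch
of what one might look like for the tree this grammar builds, assuming
ANTLR 3 tree grammar syntax and the default Java target. The name
TestMeWalker and the println actions are illustrative only, and this has
not been run against ANTLRWorks 1.0b5:

```antlr
tree grammar TestMeWalker;   // hypothetical name

options
{
	tokenVocab=TestMe;        // reuse the token types of the TestMe grammar
	ASTLabelType=CommonTree;  // labels refer to CommonTree nodes
}

// Matches ^(CODEBLOCK $c+) as built by the codeBlock rule.
codeBlock
	: ^(CODEBLOCK codeline+)
	;

codeline
	// ^(CODELINE ^(MONTH <month token>)); the wildcard accepts any month node.
	: ^(CODELINE ^(MONTH .))
	  { System.out.println("month line"); }
	// ^(CODELINE ^(PRINT STRING)); the STRING node carries the string text.
	| ^(CODELINE ^(PRINT s=STRING))
	  { System.out.println("print: " + $s.text); }
	;
```

Keeping the actions here rather than in the parser means the parser only
has to get the tree right.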
>-----Original Message-----
>From: Foolish Ewe [mailto:foolishewe at hotmail.com]
>Sent: Thursday, October 26, 2006 11:43 AM
>To: Jim Idle; antlr-interest at antlr.org
>Subject: Rookie attempt at ANTLR 3 (Using ANTLRWORKS second correction 
>attempt)
>
>Hello All:
>
>I had a catastrophe while editing my previous attempt at a correction, so
>now I'm really groveling; please forgive me if you get a redundant reply.
>I'm using ANTLR3 via ANTLRWorks (which seems very nice so far) under
>Windows XP, in case you are wondering. There should be an ANTLR3 grammar
>attached to this message as a MIME attachment.
>
>When I try to compile TestGrammar.g (the attached file), I get the
>following errors in the console tab in the bottom subwindow.  Although the
>prior posting omitted the grammar (just as well, since I got to correct the
>Java code in the @members section), there really was some code generating
>that message.
>
>[14:40:33] grammar TestGrammar: no start rule (no rule can obviously be followed by EOF)
>[14:40:33] [Long path omitted]TestGrammar.g:44:3: The following alternatives are unreachable: 3
>
>Note that I'm trying this approach because I've got a strange language that
>I'm trying to scan which has "undelimited" strings (for historical reasons;
>this wasn't my doing), so I sometimes would like to suppress keyword
>recognition.  If I could scan the language properly, I think the parsing
>itself might not be too bad.
>
>If I comment out the first and second alternatives (so that startRule->end
>NEWLINE), then ANTLR will generate source, but I then get (what seems to
>be) a Java code generation error.
>
>[13:06:08] [Long Path Snipped]\TestGrammar.java:78: illegal start of expression
>[13:06:08]         void endtoken = null;
>[13:06:08]         ^
>[13:06:08] 1 error
>
>Once again, sorry about cluttering up the mailing list with the prior
>malformed message; I hope this one is well formed.
>
>Thanks:
>
>Bill M.
>
> >From: "Jim Idle" <jimi at intersystems.com>
> >To: "Foolish Ewe" <foolishewe at hotmail.com>,<antlr-interest at antlr.org>
> >Subject: Re: [antlr-interest] Rookie attempt at ANTLR 3 (using
> >the current ANTLRWorks under Window XP)
> >Date: Wed, 25 Oct 2006 18:24:46 -0400
> >
> >Bill,
> >
> >Unless you have missed some of the grammar out from this post, it looks
> >to me like you don't actually have any rules in the grammar, only some
> >member functions? I would think that you do really have some rules
> >but just have not posted them? ;-)
> >
> >If I take out the java code from your post, we are left with:
> >
> >// Test hoisting and use of predicates to allow us to use "undelimited strings"
> >grammar TestGrammar;
> >
> >// I'm not using tokens in this language yet.
> >//tokens = { }
> >
> >
> >If this is really your grammar, then I would think it is pretty obvious
> >;-), that there is no rule for ANTLR to look for EOF in.
> >
> >Jim
> >
> >-----Original Message-----
> >From: antlr-interest-bounces at antlr.org
> >[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Foolish Ewe
> >Sent: Wednesday, October 25, 2006 1:30 PM
> >To: antlr-interest at antlr.org
> >Subject: [antlr-interest] Rookie attempt at ANTLR 3 (using the
> >current ANTLRWorks under Window XP)
> >
> >Hi Folks:
> >
> >I'm trying ANTLR 3 today, using ANTLRWorks (so far it seems like Bovet
> >and Parr have some really neat stuff in there).
> >
> >I'm trying to compile the attached grammar in the tool and am getting a
> >message:
> >
> >Cannot generate the grammar because grammar TestGrammar : no start rule
> >(no rule can obviously be followed by EOF).
> >
> >This will probably out me to my coauthors and students, but I'm not a
> >big fan of the words obviously/easily or their variants :-).
> >
> >What does this message mean, and how can I better convey to ANTLR that
> >startRule is the start rule?
> >
> >Thanks:
> >
> >Bill M.