[antlr-interest] A newbie question and is this mailing list a black hole for

Foolish Ewe foolishewe at hotmail.com
Tue Oct 24 06:45:30 PDT 2006


Hi Loring:

Thanks for the encouragement. However, I don't think this particular
discovery will get me the Nobel prize (or the Turing award, if you prefer) :-).
At least I haven't belabored a FAQ point that had a lot of recent mailing
list activity.

Regarding the mailing list archives, if I could lobby for a feature, I would
like to see a search option on the archives page (it could dummy up, say, a
Google search or some such trick).

I tried this approach out of my own ignorance (I may be a slow learner,
but I'm learning :-) ). To go back to the original problem that motivated
the approach: I wanted the parser to communicate a boolean value to the
scanner to control the scanning.  While it may be possible to push the
scanner's job back into the parser, I was initially inclined to have the
scanner do it.  I'd like to revisit that option now.

In the Java source for Main (snipped off the end of the e-mail exchanges),
a parser object of class P is instantiated and passed a lexer object of
class L, as follows (actual source code shown).

	L lexer = new L(new DataInputStream(System.in));
	P parser = new P(lexer);

In my original approach I used the ANTLR source to add a public boolean
member, say "recognizeKeyWords", to class L, which in turn uses
recognizeKeyWords as a predicate to decide when to prefer strings to
keywords.  I got that working correctly in the lexer, but was never able to
control it from the parser.  Now let's look at the constructor for class P
used here.  Normally when I see such a constructor, I expect the class to
keep a reference/handle to the object passed in.

1) Are my assumptions wrong? Is using the parser to control the lexer a bad
   idea? Is it really the right thing to push the scanner's work back onto
   the parser?
2) If I'm right, what is the name of the handle to the lexer (of type L) in
   the parser (of type P)?
3) If I'm right, what is the syntax for referencing lexer.recognizeKeyWords
   in the parser?
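
The pattern being asked about can be sketched in plain Java. This is only an
illustration under stated assumptions: SketchLexer and SketchParser are
hypothetical stand-ins written for this example, not the classes ANTLR
generates for L and P, and nothing here shows how the generated parser
actually stores its token source. The sketch only shows a parser keeping the
handle it was constructed with and flipping a public flag on it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for lexer class L; not ANTLR-generated code. */
class SketchLexer {
    /** Mirrors the public flag described above. */
    public boolean recognizeKeyWords = true;

    private final String[] words;
    private int pos = 0;

    SketchLexer(String input) {
        String trimmed = input.trim();
        this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Returns "KEYWORD:text" or "STRING:text", or null at end of input. */
    String nextToken() {
        if (pos >= words.length) {
            return null;
        }
        String text = words[pos++];
        // The flag acts as the predicate: keywords are only seen while it is true.
        boolean isKeyword = recognizeKeyWords && text.equalsIgnoreCase("getstring");
        return (isKeyword ? "KEYWORD:" : "STRING:") + text;
    }
}

/** Hypothetical stand-in for parser class P. */
class SketchParser {
    private final SketchLexer lexer; // the handle kept from the constructor

    SketchParser(SketchLexer lexer) {
        this.lexer = lexer;
    }

    /** Reads all tokens; the word after a keyword is always lexed as a string. */
    List<String> parse() {
        List<String> seen = new ArrayList<>();
        String tok;
        while ((tok = lexer.nextToken()) != null) {
            seen.add(tok);
            if (tok.startsWith("KEYWORD:")) {
                lexer.recognizeKeyWords = false; // next word is data, not a keyword
                String arg = lexer.nextToken();
                lexer.recognizeKeyWords = true;  // restore keyword recognition
                if (arg != null) {
                    seen.add(arg);
                }
            }
        }
        return seen;
    }
}

class LexerHandleSketch {
    public static void main(String[] args) {
        SketchLexer lexer = new SketchLexer("getstring getstring jan");
        SketchParser parser = new SketchParser(lexer);
        // prints [KEYWORD:getstring, STRING:getstring, STRING:jan]
        System.out.println(parser.parse());
    }
}
```

One caveat with this design: the sketch pulls exactly one token at a time.
A parser that buffers k tokens of lookahead (k = 2 in the grammar below) may
already have asked the lexer for the next token before the action that flips
the flag runs, in which case the flag would change too late to affect it.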

Regards:

Bill M.


>From: Loring Craymer <lgcraymer at yahoo.com>
>To: Foolish Ewe <foolishewe at hotmail.com>, dave at badgers-in-foil.co.uk,  
>antlr-interest at antlr.org
>Subject: Re: [antlr-interest] A newbie question and is this mailing list a black hole for
>Date: Mon, 23 Oct 2006 19:47:02 -0700 (PDT)
>
>Bill--
>
>Congratulations!  You have discovered the lack of
>semantic predicate hoisting in ANTLR 2!  Not many do
>that: apart from those of us who sorely missed this
>feature in going from PCCTS (ANTLR 1) to ANTLR 2,
>yours is the first post on the subject in the past six
>years.  One of the pluses of ANTLR 3 is that it is
>bringing back predicate hoisting.
>
>What happens in your grammar is that the predicate in
>getString (and other such rules) is not part of the
>lookahead decision in the calling rule.  startRule
>sees getString and looks for any ALPHANUMSTRING; the
>predicate is only triggered within getString.  If you
>change the
>getstring:getString to
>{kwPrefixMatch(LT(1).getText(), "getstring", 4)}?
>		getstring:ALPHANUMSTRING
>(that is, don't bury it in a subrule), the error
>reported for startRule will disappear.  Alternatively,
>you can manually hoist the predicate and do
>{kwPrefixMatch(LT(1).getText(), "getstring", 4)}?
>         getstring:getString
>
>with the same result.
>
>--Loring
>
>--- Foolish Ewe <foolishewe at hotmail.com> wrote:
>
> > Hi Dave and ANTLR list members:
> >
> > Some early experiences show that I may either be executing this method
> > wrong, or that there may be some limitations in the approach.
> >
> > Attached is a small sample attempt at doing the sort of stuff Dave seems
> > to be hinting at.  I've noticed that I'm getting nondeterminism messages
> > in the parser for both startRule and month, probably due to the fact that
> > all keywords are scanned in as ALPHANUMSTRING tokens, which doesn't give
> > much distinguishing structure at the leaf nodes of the parse tree.
> > Is my solution prone to this?
> >
> > The grammar also accepts language constructs which I don't think it
> > should accept, but I haven't tried too hard to shake out bugs from it at
> > this point.  What should the parser be doing if the keyword does NOT
> > match the expected string (e.g. do we make it throw an exception, and if
> > so, what exception is a good choice)?
> >
> > Thanks for the help, I'm just trying to do this the
> > smart way.
> > A revised ANTLR file and Java file are below.
> >
> > Regards:
> >
> > Bill M.
> >
> > *****************Begin ANTLR Source*********************************
> > //My play area for diagnosing strange ANTLR symptoms
> > //Version History: 1.0 WAM created
> >
> >
> > // WAM - Need to add some boilerplate to let Antlr generated files know
> > // that they are part of the ZTestParser package
> > header{
> > 	package testing;
> > }
> >
> > class P extends Parser;
> >
> > // Parser options
> > options{
> > 	k = 2; // Token stream lookahead, remember ANTLR uses LL(k) parsing
> > }
> > {
> > 	private boolean recognizeKeyWords = true;
> >
> > 	// Checks whether str is a leading part of pattern that is at least
> > 	// minlength characters long (case-insensitive).
> > 	// Note: references the private recognizeKeyWords member.
> > 	private boolean kwPrefixMatch(	String str,
> > 									String pattern,
> > 									int minlength)
> > 	{
> > 		boolean result;
> > 		if (!recognizeKeyWords){
> > 			result = false; // don't bother to do additional tests at this point
> > 		} else if (str.length() > pattern.length()){
> > 			result = false; // the string is longer than the pattern, so it cannot match
> > 		} else if (str.length() < minlength){
> > 			result = false; // the string is too short to match the minimum pattern length
> > 		} else {
> > 			String strval = str.toLowerCase(); // compare case-insensitively
> > 			result = pattern.startsWith(strval); // str must abbreviate pattern
> > 		}
> > 		return result;
> > 	}
> >
> > }
> >
> > // Antlr requires Terminals have names beginning with uppercase letters,
> > // Nonterminals should use lowercase I guess
> > startRule
> > 	:
> > 		// the actual prefix tokens are different in practice
> > 		getstring:getString
> > 		// I would like to do something like the following actions, where lexer
> > 		// is a type L object used in lexing.
> > 		// This is not the right syntax for this, but it shows the general idea
> > 		// {this.lexer.recognizeKeyWords = false;}
> > 		strval:ALPHANUMSTRING
> > 		// {this.lexer.recognizeKeyWords = true;}
> > 		nl1:NEWLINE sr1:startRule // breaks if the user types in "dun\n" where \n is newline
> > 	|
> > 		monthval:month nl2:NEWLINE sr2:startRule
> > 	|
> > 		// added for testing, but won't work for my requirements.
> > 		toggleval:toggle nl3:NEWLINE sr3:startRule
> > 	|
> > 		endval:end nl4:NEWLINE
> > 	;
> >
> > month
> > 	:
> > 		(jan | feb) // | mar | apr | may | jun | jul | aug | sep | oct | nov | dec)
> > 	;
> >
> > jan
> > 	:
> > 		{kwPrefixMatch(LT(1).getText(), "jan", 3)}?
> > 		ALPHANUMSTRING
> > 	;
> >
> > feb
> > 	:
> > 		{kwPrefixMatch(LT(1).getText(), "feb", 3)}?
> > 		ALPHANUMSTRING
> > 	;
> >
> >
> > getString
> > 	:
> > 		{kwPrefixMatch(LT(1).getText(), "getstring", 4)}?
> > 		ALPHANUMSTRING
> > 	;
> >
> > toggle
> > 	:
> > 		{kwPrefixMatch(LT(1).getText(), "toggle", 3)}?
> > 		ALPHANUMSTRING
> > 	;
> >
> > end
> > 	:
> > 		{kwPrefixMatch(LT(1).getText(), "end", 3)}?
> > 		ALPHANUMSTRING
> > 	;
> >
> > class L extends Lexer;
> >
> > // Lexer options
> > options{
> > 	k=3; // lookahead (need 2 for new line, 3 should be enough for months)
> > 	charVocabulary='\u0000'..'\u007F'; // Only support 7-bit ASCII characters, don't try fancy unicode stuff
> > 	// case sensitivity turned off
> > 	caseSensitiveLiterals=false;
> > 	caseSensitive=false;
> > }
> >
> >
> > NEWLINE
> >     :   '\r' '\n'    {newline();}        // DOS
> >     |   '\r'         {newline();}        // Macintosh
> >     |   '\n'         {newline();}        // UNIX
> >     ;
> >
> >
> > WHITESPACE :   ' '  {$setType(Token.SKIP);} // space character
> >              | '\t' {System.out.println("Found a tab"); tab(); $setType(Token.SKIP);};
> >
> > protected ALPHANUMERIC: ('a'..'z') | ('0'..'9');
> >
> > ALPHANUMSTRING: (ALPHANUMERIC)+;
> > ************************Begin Java Source*************************************
> > package testing;
> > import java.io.*;
> >
> > public class Main {
> >
> >
>=== message truncated ===
