[antlr-interest] simple parser lookahead problem

Anthony Youngman Anthony.Youngman at ECA-International.com
Thu May 13 04:08:49 PDT 2004


Thanks Monty.

I've rewritten the treeparser and it now works :-)

// treeparsers can only look one token ahead (k=1). So because the first
// two options both start with "PRINT expr", we need to left-factor: no
// two alternatives in a choice may begin with the same token.
printst
	:	#(PRINT
			(expr
				(
					COLON { st.write("PRINT");}
					| /* empty */ { st.write("PRTLN");}
				)
			| /* empty */ { st.write("CRLF");}
		));
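
For anyone reading along, here is roughly the decision that left-factored
rule makes, written out by hand in plain Java. It is only a sketch: the
Node class and the printOpcode method are made-up names for illustration,
not what antlr.Tool generates, and it assumes the expr subtree is always
the first child of PRINT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PrintDecision {
    // Hypothetical stand-in for an AST node: a token type name plus children.
    static final class Node {
        final String type;
        final List<Node> children;

        Node(String type, Node... children) {
            this.type = type;
            this.children = new ArrayList<>(Arrays.asList(children));
        }
    }

    // Chooses the opcode for a PRINT subtree, inspecting one child at a time,
    // the way a k=1 tree walker has to.
    static String printOpcode(Node print) {
        if (!print.type.equals("PRINT")) {
            throw new IllegalArgumentException("expected a PRINT root");
        }
        List<Node> kids = print.children;
        int pos = 0;

        // First decision: is there an expr at all?
        if (pos == kids.size()) {
            return "CRLF";              // bare PRINT
        }
        pos++;                          // match expr once, for both remaining cases

        // Second decision, made only after expr: COLON or nothing?
        if (pos == kids.size()) {
            return "PRTLN";             // PRINT expr
        }
        if (kids.get(pos).type.equals("COLON") && pos + 1 == kids.size()) {
            return "PRINT";             // PRINT expr :
        }
        throw new IllegalStateException("unexpected child after expr");
    }

    public static void main(String[] args) {
        Node expr = new Node("STRING");
        System.out.println(printOpcode(new Node("PRINT")));
        System.out.println(printOpcode(new Node("PRINT", expr)));
        System.out.println(printOpcode(new Node("PRINT", expr, new Node("COLON"))));
    }
}
```

The point is that expr gets matched exactly once, and the COLON-or-nothing
choice only happens afterwards, so no decision ever needs more than one
node of lookahead.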

Bit complicated, eh :-)

But it works, and I've fed it " PRINT '(wol)' : ': ' : " and it handled
it no problem. All I need to do now is add a label :-)

By the way, as you can see, I've started documenting the grammar
heavily, and I'll email it to you and Ter (presumably I should email Ric
and any others too) so you can see all the "non-obvious" problems that a
"dive in head first" newbie might hit ... :-) all of which probably look
like "stating the bleeding obvious" to somebody who's been playing with
it for a while.

(And no, even when complete it's unlikely to be quite as nasty as AREV
BASIC. Prime INFOBASIC left most of the most user-vicious quirks out :-)

Cheers,
Wol

-----Original Message-----
From: Monty Zukowski [mailto:monty at codetransform.com] 
Sent: 12 May 2004 16:41
To: antlr-interest at yahoogroups.com
Cc: Monty Zukowski
Subject: Re: [antlr-interest] simple parser lookahead problem

On May 12, 2004, at 7:46 AM, Anthony Youngman wrote:

> I've got the following code in my parser ...
>
> ----------------------
> // Can I distinguish between COLON and COLONPRINT here, I need to look
> // ahead but not eat a SEMI or nl. It'll work if I can get catexpr to
> // take priority.
>
> printst : ( PRINT^ (expr (COLON)? )? );
>
> catexpr : pmexpr ( COLON^ pmexpr)* ;
> --------------------
>
> plus a bit more code that effectively says
>
> expr : catexpr ;
>
> How do I resolve the ambiguity by doing a lookahead in printst - I
> effectively want to look for an "end of statement" marker eg a newline
> or semicolon.

Syntactic predicates are for doing lookahead, but you need one in the
rule that decides whether to call printst vs. catexpr.
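
To make the lookahead idea concrete, here is a small hand-written Java
sketch of the test such a predicate would have to perform: peek past a
COLON and see whether an end-of-statement token follows. This is not ANTLR
code. The class and method names are invented, tokens are plain strings,
and SEMI and NL are assumed token names for the semicolon and newline.

```java
import java.util.Arrays;
import java.util.List;

class ColonLookahead {
    // Returns true when the COLON at index i is followed directly by an
    // end-of-statement token (or by the end of input), i.e. it is the
    // "no newline" colon rather than a concatenation operator.
    static boolean isTrailingColon(List<String> tokens, int i) {
        if (!tokens.get(i).equals("COLON")) {
            throw new IllegalArgumentException("token at " + i + " is not COLON");
        }
        if (i + 1 >= tokens.size()) {
            return true;                // nothing left: treat as end of statement
        }
        String next = tokens.get(i + 1);
        return next.equals("SEMI") || next.equals("NL");
    }

    public static void main(String[] args) {
        // Token stream for:  PRINT A : B :  followed by a newline
        List<String> tokens = Arrays.asList("PRINT", "ID", "COLON", "ID", "COLON", "NL");
        System.out.println(isTrailingColon(tokens, 2));
        System.out.println(isTrailingColon(tokens, 4));
    }
}
```

The check only peeks at the following token and consumes nothing, which
is the "look ahead but not eat a SEMI or nl" behaviour asked for.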

> Unfortunately, ":" has three different meanings, as
> exemplified in this simple line of code ...
>
> XXX: PRINT A : B :
>
> Where the first colon says "this is a label" (I haven't even touched
> this yet!),
Yeah, see my parser filter article--I was trying to solve that for your
grammar!


> the second says "concatenate the variables A and B
> together", and the third says "don't print a newline at the end".  
> Yeuch!
> At the moment I'm disambiguating in the lexer, but I don't think that's
> a good idea ... it'll be messy :-( but I really haven't got to grips
> with predicates, which I think is what I need ...
>
> Further on, I have a hiccup with my treeparser ...
>
> printst
> 	: #(PRINT expr COLON) { st.write("PRINT");}
> 	| #(PRINT expr) { st.write("PRTLN");}
> //	| #(PRINT) { st.write("CRLF");}
> 	;
>

#() expects a root and at least one child.  What you want is simply  
PRINT.

Note that tree parsers only have k=1 lookahead.  Which means you will  
need something like this:

printst
	: #(PRINT
		(
		  expr COLON { st.write("PRINT");}
		| expr { st.write("PRTLN");}
		| /*empty*/ { st.write("CRLF");}
		)
	)
	;

Except, of course, that expr is ambiguous too.  You could use a syntactic
predicate here, but better would be to alter the tree:

printst : ( PRINT^ (expr (COLON {##.setType(PRINT_WITH_COLON);})? )? );

Then the tree parser is
printst
	: #(PRINT
		(
		  expr { st.write("PRTLN");}
		| /*empty*/ { st.write("CRLF");}
		)
	)
	| #(PRINT_WITH_COLON expr COLON  { st.write("PRINT");})
	;
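
As a rough illustration of why retyping the root helps, here is a plain
Java sketch. It is not generated ANTLR code: the Node class and the
buildPrint and walk methods are hypothetical names, and buildPrint only
imitates what the ##.setType(PRINT_WITH_COLON) action does in the parser.

```java
import java.util.ArrayList;
import java.util.List;

class RetypeRoot {
    // Hypothetical AST node with a mutable type, so the root can be retyped.
    static final class Node {
        String type;
        final List<Node> children = new ArrayList<>();

        Node(String type) {
            this.type = type;
        }
    }

    // Parser side: build the PRINT tree and retype the root as soon as the
    // trailing COLON is seen.
    static Node buildPrint(Node expr, boolean trailingColon) {
        Node root = new Node("PRINT");
        if (expr != null) {
            root.children.add(expr);
            if (trailingColon) {
                root.children.add(new Node("COLON"));
                root.type = "PRINT_WITH_COLON";
            }
        }
        return root;
    }

    // Tree-walker side: the root type alone settles the colon question,
    // so one node of lookahead is enough.
    static String walk(Node root) {
        switch (root.type) {
            case "PRINT_WITH_COLON":
                return "PRINT";
            case "PRINT":
                return root.children.isEmpty() ? "CRLF" : "PRTLN";
            default:
                throw new IllegalArgumentException("not a print statement: " + root.type);
        }
    }

    public static void main(String[] args) {
        Node expr = new Node("STRING");
        System.out.println(walk(buildPrint(expr, true)));
        System.out.println(walk(buildPrint(expr, false)));
        System.out.println(walk(buildPrint(null, false)));
    }
}
```

The parser has already seen the trailing colon, so it records that fact in
the root and the walker never has to look past expr to find it.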

Monty

> antlr.Tool does not like the commented-out line - I'm guessing it's
> incredibly simple, but it's objecting to PRINT :-(
>
> Cheers,
> Wol
>
Monty Zukowski

ANTLR & Java Consultant -- http://www.codetransform.com
ANSI C/GCC transformation toolkit --  
http://www.codetransform.com/gcc.html
Embrace the Decay -- http://www.codetransform.com/EmbraceDecay.html

Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/