[antlr-interest] Lexer not putting colon back

mzukowski at yci.com mzukowski at yci.com
Fri Nov 15 07:52:30 PST 2002


Do you have k=2 in your lexer's options section?  You need k to be at
least 2 so the lexer will look far enough ahead to see the '=' after the ':'.
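
A minimal sketch of what that looks like in an ANTLR 2 grammar file.  The
class name MyLexer is a placeholder (the original post does not show the
lexer's header), and the rules themselves stay as posted:

	class MyLexer extends Lexer;	// hypothetical name
	options {
		k = 2;	// two characters of lexer lookahead
	}

	// ... tokens section and rules unchanged ...
	protected QName		: NCName (':' NCName)?  ;

With only one character of lookahead, the optional (':' NCName)? subrule
is entered as soon as a ':' is seen.  With two, the decision can also
take the character after the ':' into account, so "x:=" should leave the
':' for ASSIGN.  This is untested against the grammar above, so check the
generated lexer or the trace output to confirm.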

Monty

-----Original Message-----
From: Paul J. Lucas [mailto:dude at darkfigure.org]
Sent: Thursday, November 14, 2002 9:14 PM
To: antlr-interest at yahoogroups.com
Subject: [antlr-interest] Lexer not putting colon back


	Assume I want to parse a statement of the form:

		let $x := $y

	or:

		LET DOLLAR QNAME ASSIGN DOLLAR QNAME

	where the lexer is defined as:

		tokens { LET; QNAME; }

		protected Digit		: '0'..'9' ;
		protected Letter	: 'A'..'Z' | 'a'..'z' | '_' ;
		protected NCName	: Letter (NCNameChar)* ;
		protected NCNameChar	: Letter | Digit | '.' | '-' ;
		protected QName		: NCName (':' NCName)?  ;
		protected WhiteSpace	: ' ' | '\t' | '\r' | '\n' ;

		ASSIGN	: ":=" ;
		DOLLAR	: '$' ;
		EQUAL	: '=' ;
		S	: (WhiteSpace)+ { $setType( Token.SKIP ); } ;

		Keywords
			: "let"     { $setType( LET ); }
			| QName     { $setType( QNAME ); }
			;

	This works fine as given above.  But if I remove the whitespace
	after the $x like:

		let $x:= $y

	Then it gets it wrong.  An excerpt of the trace output is:

		 > lexer mKeywords; c==x
		  > lexer mQName; c==x
		   > lexer mNCName; c==x
		    > lexer mLetter; c==x
		    < lexer mLetter; c==:
		   < lexer mNCName; c==:
		   > lexer mNCName; c===
		    > lexer mLetter; c===
		    < lexer mLetter; c===
		   < lexer mNCName; c===
		  < lexer mQName; c===
		 < lexer mKeywords; c===
		  < varRef;  > lexer mEQUAL; c===
		 < lexer mEQUAL; c==1
		LA(1)===
		 < startRule; LA(1)===
		exception: line 1:8: unexpected char: '='

	When it encounters the ':', it tries to make it part of a
	QName, e.g., "x:z"; but since the next character is an '=', it
	can't do that.  What it SHOULD do is put the ':' back, return
	'x' as the QNAME, then pick up with ':' as part of ":=".  But
	it doesn't.  Why not?  And how can I fix this so that it
	correctly returns the right tokens regardless of whether
	whitespace is there?

	- Paul


 

Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/ 


 
