[antlr-interest] more lexical determinism

Terence Parr parrt at jguru.com
Wed Dec 5 11:57:39 PST 2001


On Wednesday, December 5, 2001, at 11:35  AM, howardckatz wrote:

> I can see why I'm getting a lexical nondeterminism error in the
> following, since the lexer has no way of knowing whether "ABCDE" for
> example is a Word or an Identifier, but I can't see how to resolve
> the ambiguity, using a predicate or otherwise. What's the easiest
> way to do this?
>
> Thanks,
> Howard
>
>
> class TestParser extends Parser;
>
> message:	(pair)+;
>
> pair:		Identifier COLON Word;
>
> class TestLexer extends Lexer;
>
> Identifier:	( Letter | '_' ) (Letter | Digit)*;
>
> Word:		(Letter)*;

First, Word should be (Letter)+ rather than (Letter)*, because a token 
that can match nothing makes no sense.

As for distinguishing between the two kinds of words/ids, you could do 
the following in one rule (assume Word unless you see _ or digit):

Word:	( Letter | '_' {$setType(Identifier);} )
	( Letter | Digit {$setType(Identifier);} )*;

Don't forget that you'll need to define Identifier somehow (by an 
explicit reference in the grammar or in the tokens{} section).
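
As a rough sketch of how the pieces could fit together, the lexer might 
look something like the following. This assumes ANTLR 2.x syntax and 
assumes you choose to declare Identifier in the lexer's tokens{} section 
(that placement is just one option, not the only way to define it):

```
class TestLexer extends Lexer;

// Assumption: declare Identifier here so that $setType(Identifier) has a
// token type to refer to, now that no lexer rule is named Identifier.
tokens {
	Identifier;
}

// The token starts out as Word and is retyped to Identifier as soon as
// a leading '_' or any digit is matched.
Word
	:	( Letter | '_' { $setType(Identifier); } )
		( Letter | Digit { $setType(Identifier); } )*
	;

protected
Letter:	'a' .. 'z' | 'A' .. 'Z';

protected
Digit:	'0' .. '9';
```

With a rule like this, "ABCDE" would come out as a Word, while "_abc" or 
"abc1" would come out as an Identifier. One consequence worth checking: 
an all-letter name in front of the colon is now a Word, so the parser 
rule pair may need to accept either token type in its first position.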

Ter

>
> protected
> Letter: 	'a' .. 'z' | 'A' .. 'Z';
>
> protected
> Digit:		'0' .. '9';
>
> COLON:		':';
>
> WS :		( ' ' | '\t' | "\r\n" | '\n' | '\r' )
>    		{ $setType(Token.SKIP); };
>
>
>
>
>
> Your use of Yahoo! Groups is subject to 
> http://docs.yahoo.com/info/terms/
>
>
--
Chief Scientist & Co-founder, http://www.jguru.com
Creator, ANTLR Parser Generator: http://www.antlr.org


 
