[antlr-interest] AntLR Bug solved

Michael Labhard ince at pacifier.com
Tue Feb 5 01:53:21 PST 2002


All:

	The following lexer file compiles without warnings and distinguishes a 
single END_CHAR from an END_CHAR followed by further END_CHARs or 
GRAPHIC_TOKEN_CHARs.

class L extends Lexer;
options { k=2; }

NAME_TOKEN
  : GRAPHIC_TOKEN
  ;

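END_TOKEN
    : END_CHAR
      // The subrule is optional so that a lone END_CHAR still matches
      // END_TOKEN; a longer run is retyped as GRAPHIC_TOKEN.
      ( (END_CHAR | GRAPHIC_TOKEN_CHAR)+
        {$setType(GRAPHIC_TOKEN);}
      )?
    ;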

protected 
GRAPHIC_TOKEN
    :
    GRAPHIC_TOKEN_CHAR 
    (END_CHAR | GRAPHIC_TOKEN_CHAR)* 
    ;

protected 
GRAPHIC_TOKEN_CHAR
  : GRAPHIC_CHAR | BACKSLASH_CHAR
  ;

protected 
GRAPHIC_CHAR
  : ( options {greedy=true;} : 
    '#'|'$'|'&'|'*'|'+'|'/'
    |':'|'<'|'='|'>'|'?'|'@'|'^'|'~'
    );

protected 
END_CHAR: '.' ;

protected 
BACKSLASH_CHAR: '\\' ;

	From this experience I have come to understand "left-factoring" to mean, 
in practice, moving all cases that share the same leading part of a token into 
a single rule.  That was the trick here.  The rule END_TOKEN now handles every 
case in which END_CHAR is the first character.  Previously the GRAPHIC_TOKEN 
rule was also trying to handle that case, so two rules competed for the same 
leading character.  
Thanks to all for your encouragement.  It was very instructive.
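
	For readers who want to see the left-factored decision outside of ANTLR, 
here is a minimal hand-written sketch in Python.  It is not ANTLR output and 
not part of the grammar; the function names (next_token, is_continuation, and 
so on) are made up for illustration.  It assumes, as described above, that a 
lone END_CHAR is reported as END_TOKEN and that an END_CHAR followed by more 
characters is retyped as GRAPHIC_TOKEN.  The point is that exactly one branch 
owns every input that starts with END_CHAR.

```python
# Hypothetical sketch of the left-factored lexer decision; names are
# illustrative and do not come from ANTLR or from the grammar's output.
GRAPHIC_CHARS = set("#$&*+/:<=>?@^~")
END_CHAR = "."
BACKSLASH_CHAR = "\\"


def is_graphic_token_char(ch):
    """GRAPHIC_TOKEN_CHAR : GRAPHIC_CHAR | BACKSLASH_CHAR"""
    return ch in GRAPHIC_CHARS or ch == BACKSLASH_CHAR


def is_continuation(ch):
    """(END_CHAR | GRAPHIC_TOKEN_CHAR)"""
    return ch == END_CHAR or is_graphic_token_char(ch)


def scan_continuation(text, start):
    """Consume zero or more continuation characters; return the end index."""
    end = start
    while end < len(text) and is_continuation(text[end]):
        end += 1
    return end


def next_token(text, pos=0):
    """Return (token_type, lexeme) for the token starting at text[pos]."""
    if pos >= len(text):
        return ("EOF", "")
    first = text[pos]

    if first == END_CHAR:
        # Single owner of every input beginning with END_CHAR.
        end = scan_continuation(text, pos + 1)
        if end == pos + 1:
            return ("END_TOKEN", text[pos:end])      # lone '.'
        return ("GRAPHIC_TOKEN", text[pos:end])      # '.' plus more, retyped

    if is_graphic_token_char(first):
        # NAME_TOKEN : GRAPHIC_TOKEN, which never starts with END_CHAR.
        end = scan_continuation(text, pos + 1)
        return ("NAME_TOKEN", text[pos:end])

    raise ValueError("unexpected character %r at position %d" % (first, pos))
```

	Calling next_token(".") takes the first branch and stops after one 
character, while next_token(".=") takes the same branch and keeps consuming.  
Input that starts with a graphic character never reaches that branch, which 
is the separation the grammar achieves by left-factoring.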

-- Michael


 
