[antlr-interest] Lexer not pulling in fragments?

Joseph Klumpp jklumpp0 at gmail.com
Thu Apr 2 07:07:12 PDT 2009


I'm trying to create tokens for the include guards of C header files
(with the lexer option filter=true), e.g. '#define __hello_h_' =>
<GUARD, #define __hello_h_>. I have the following rules defined:

GUARD	:	'#' LETTER+ WS+ IDPART '_';
ID	:	IDPART;

WS	: 	(' ' | '\n')+	{$channel = HIDDEN;};

fragment
IDPART	:	LETTER ( LETTER | DIGIT )*;

fragment
LETTER
	:	'$'
	|	'\u0041'..'\u005a'
	|	'\u0061'..'\u007a'
	|	'_'
	;
	
fragment
DIGIT	: 	'0'..'9';

With these rules, GUARD never appears in the token stream. If I inline
IDPART and change the rule to:
GUARD	:	'#' LETTER+ WS+ LETTER (LETTER | DIGIT)* '_';
the rule lexes correctly. I have two questions:
1. Why does GUARD fail to match when it references the IDPART fragment?
2. Is there a way to set the text of the GUARD token to just the
IDPART portion of the lexeme?
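
To make question 2 concrete, here is a rough sketch of the desired result, written in Python rather than ANTLR. It only illustrates the output being asked for; it is not how the ANTLR lexer works. The names LETTER, DIGIT and GUARD mirror the rules above, while guard_tokens is a hypothetical helper. Note that a regex engine backtracks, so this pattern matches even though IDPART could swallow the trailing '_'; it says nothing about why the generated lexer behaves differently.

```python
import re

# Character classes mirroring the LETTER and DIGIT fragments above.
LETTER = r"[A-Za-z_$]"
DIGIT = r"[0-9]"

# '#' LETTER+ WS+ IDPART '_', with the IDPART portion captured as group 1.
GUARD = re.compile(rf"#{LETTER}+[ \n]+({LETTER}(?:{LETTER}|{DIGIT})*)_")


def guard_tokens(source):
    """Return (type, text) pairs for each guard found, skipping all other input.

    Skipping unmatched input is a loose stand-in for filter=true.
    """
    return [("GUARD", m.group(1)) for m in GUARD.finditer(source)]


print(guard_tokens("#define __hello_h_\nint x;"))
```

For the example input, the token text would be '__hello_h', i.e. the lexeme without the leading '#define ' and without the final '_' that the GUARD rule matches separately.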

Thanks in advance,
JK

