[antlr-interest] Date matching instead of dot pattern

Ben Dotte ben.dotte at gmail.com
Mon Dec 28 14:50:27 PST 2009


Hi,

I'm trying to troubleshoot why an input is being matched by a lexer rule
instead of by a dot pattern in the parser, and I could use some help. The
grammar is being used to interpret user-entered searches, and the idea
is that a search surrounded by double quotes should be interpreted
as-is. The dot pattern I'm using has worked for everything I have come
across so far, until someone pointed this search out to me:

"3/4 Abstract w/Talent"

The AST I get back for this is a " node with children (Abstract w /
Talent), as if the "3/4" part had never been entered. If I remove my
DATE lexer rule and the associated parser rules, it works fine.

Here is a snippet of the parser rules:

negationSearch
	:	('-'^)? (quotedSearch | dateRangeSearch | comparisonSearch | idSearch | wildcardSearch | term)
	;
	
wildcardSearch
	:	TEXT_WITH_WILDCARD	-> ^(WILDCARD TEXT_WITH_WILDCARD)
	;
	
idSearch
	:	'#'^ TEXT
	;
	
comparisonSearch
	:	'>'^ TEXT
	|	'<'^ TEXT
	;

quotedSearch
	:	// within double quotes, output whitespace to default channel (don't ignore whitespace, in other words)
		{ ((SwitchingCommonTokenStream)input).setTokenTypeChannel(WHITESPACE, Token.DEFAULT_CHANNEL); }
		'"'^
		.+ // non-greedy by default
		{ ((SwitchingCommonTokenStream)input).setTokenTypeChannel(WHITESPACE, Token.HIDDEN_CHANNEL); }
		'"'!
	;
	
dateRangeSearch
	:	'[' DATE TO DATE ']'	-> ^(DATE_BETWEEN DATE+)
	|	'[' AFTER DATE ']'	-> ^(DATE_AFTER DATE)
	|	'[' BEFORE DATE ']'	-> ^(DATE_BEFORE DATE)
	;
	
subSearch
	:	'('! orSearch ')'!
	;
	
term	:	SEPARATOR* (t=anyText	-> $t)
		(SEPARATOR t2=anyText	-> ^(AND $term $t2))*
		SEPARATOR*
	;
	
anyText	:	(TO | AFTER | BEFORE | DATE | TEXT)
	;


The related lexer rules look like this:

fragment NUM
	:	('0'..'9') ;
DATE	:	('0'..'1')? NUM '/' ('0'..'3')? NUM '/' NUM NUM NUM NUM ;
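
To make the overlap concrete, here is a rough sketch that transcribes the DATE rule into a java.util.regex pattern and checks the problem input against it. The class and method names (DateRuleSketch, isDate, isDatePrefix) are made up for illustration, and the regex is only a stand-in for the rule. It does not reproduce how the ANTLR lexer predicts rules or recovers from errors. It only shows that "3/4" is not a complete DATE but is a valid start of one.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DateRuleSketch {
    // Assumed regex transcription of the lexer rule:
    //   ('0'..'1')? NUM '/' ('0'..'3')? NUM '/' NUM NUM NUM NUM
    public static final Pattern DATE =
        Pattern.compile("[01]?[0-9]/[0-3]?[0-9]/[0-9]{4}");

    /** True if the whole string is a complete DATE. */
    public static boolean isDate(String s) {
        return DATE.matcher(s).matches();
    }

    /** True if s is not a DATE itself but more input could still complete one. */
    public static boolean isDatePrefix(String s) {
        Matcher m = DATE.matcher(s);
        // hitEnd() reports whether the match attempt ran out of input.
        return !m.matches() && m.hitEnd();
    }

    public static void main(String[] args) {
        System.out.println(isDate("12/28/2009"));     // true
        System.out.println(isDate("3/4"));            // false
        System.out.println(isDatePrefix("3/4"));      // true
        System.out.println(isDatePrefix("Abstract")); // false
    }
}
```

So the start of the quoted search looks like the beginning of a DATE even though it never becomes one, which may be related to why removing the DATE rule makes the problem go away.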


I would expect the dot in quotedSearch to match to "3/4", rather than
the DATE lexer rule matching to it, since I am already inside the
double quotes. Is there something I might be able to do to fix this?
(I'm using antlr 3.1.2.)

Thanks,
Ben

