[antlr-interest] * (zero or more) not matching greedily

Jim Idle jimi at temporal-wave.com
Fri Apr 10 11:10:12 PDT 2009


Petteri Räty wrote:
> The relevant stuff from a grammar:
>
> category:	(alphanum|'+'|'_'|DOT) ( alphanum|'+'|'_'|DOT|'-')* {
> System.out.println($category.text); };
>
> alphanum:	LOWER|UPPER|DIGIT;
>
> DOT 	: '.';
>
> DIGIT	:	'0'..'9';
> LOWER	:	'a'..'z';
> UPPER	:	'A'..'Z';
>
> Why does it only take the first character for category? Isn't * supposed
> to be greedy? I also tried adding options {greedy=true;} to the subrule
> but it doesn't make a difference.
>
> betelgeuse at pena ~/python/depend $ ATOM="app-foo" make
> java -cp <snip long cp> Main app-foo
> a
>
> Regards,
Read the 5 minute getting started material. Your lexer rules for alphanum
are in total conflict with the others, which should be fragments if you
really want this. However, I think you are also confusing lexer rules with
parser rules. Your lexer rule should do the composite matching unless there
is some really good reason not to.
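
As a rough sketch of what that could look like in ANTLR 3 syntax: the whole category is matched by a single lexer rule, and the character classes become fragments so they never surface as tokens of their own. The grammar name Category and the token name CATEGORY are made up for illustration, and this assumes no other lexer rule in the full grammar overlaps with CATEGORY.

```antlr
grammar Category;

// Parser rule: the lexer has already assembled the whole category.
category
    :   CATEGORY { System.out.println($CATEGORY.text); }
    ;

// Lexer rule doing the composite matching; '-' is not allowed first.
CATEGORY
    :   (ALPHANUM | '+' | '_' | '.') (ALPHANUM | '+' | '_' | '.' | '-')*
    ;

// Fragments are only helpers for other lexer rules and never
// become tokens themselves.
fragment ALPHANUM : LOWER | UPPER | DIGIT ;
fragment DIGIT    : '0'..'9' ;
fragment LOWER    : 'a'..'z' ;
fragment UPPER    : 'A'..'Z' ;
```

With something along these lines, an input such as app-foo should reach the parser as one CATEGORY token rather than as a series of single-character tokens.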

Jim

