[antlr-interest] * (zero or more) not matching greedily
Petteri Räty
betelgeuse at gentoo.org
Fri Apr 10 11:28:18 PDT 2009
Jim Idle wrote:
> Petteri Räty wrote:
>> The relevant stuff from a grammar:
>>
>> category: (alphanum|'+'|'_'|DOT) ( alphanum|'+'|'_'|DOT|'-')* {
>> System.out.println($category.text); };
>>
>> alphanum: LOWER|UPPER|DIGIT;
>>
>> DOT : '.';
>>
>> DIGIT : '0'..'9';
>> LOWER : 'a'..'z';
>> UPPER : 'A'..'Z';
>>
>> Why does it only take the first character for category? Isn't * supposed
>> to be greedy? I also tried adding options {greedy=true;} to the subrule
>> but it doesn't make a difference.
>>
>> betelgeuse at pena ~/python/depend $ ATOM="app-foo" make
>> java -cp <snip long cp> Main app-foo
>> a
>>
>> Regards,
> Read the 5 minute getting started stuff. Your lexer rules for alphanum
> are in total conflict with the others, which should be fragments if you
> really want this. However, you are also confusing lexer rules with parser
> rules, I think. Your lexer rule should do the composite matching unless
> there is some really good reason not to.
>
> Jim
>
The reason it's done this way is not apparent from these fragments, but
elsewhere I need to distinguish between lowercase and uppercase
characters, so this way seemed easiest. There is a place where I need to
match only lowercase characters, so that rule must come first, but then
it also matches in places where I want alphanums. That's why alphanum is
a parser rule. How do the lexer rules conflict? I have read the tutorial
but can't understand what you mean.
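
To be concrete about what I expected from the category rule quoted above, here is a minimal hand-written Java sketch of a greedy match over the same character sets. It is not ANTLR-generated code and says nothing about why the parser stops after the first character. The class name CategorySketch and the helper methods are hypothetical, made up for illustration only.

```java
public class CategorySketch {
    // Hypothetical helper mirroring alphanum: LOWER | UPPER | DIGIT.
    static boolean isAlphanum(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9');
    }

    // First element of category: alphanum | '+' | '_' | DOT.
    static boolean isStart(char c) {
        return isAlphanum(c) || c == '+' || c == '_' || c == '.';
    }

    // Elements of the (...)* loop: the start set plus '-'.
    static boolean isRest(char c) {
        return isStart(c) || c == '-';
    }

    // Returns the text a greedy match would cover, or null if the
    // first character cannot start a category.
    static String matchCategory(String input) {
        if (input.isEmpty() || !isStart(input.charAt(0))) {
            return null;
        }
        int end = 1;
        // Greedy loop: keep consuming while the next character fits.
        while (end < input.length() && isRest(input.charAt(end))) {
            end++;
        }
        return input.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(matchCategory("app-foo")); // prints app-foo
    }
}
```

With the input from my make run, this prints the whole atom (app-foo) rather than just the first character, which is the result I was expecting from $category.text.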
Regards,
Petteri