[antlr-interest] fragment and option greedy=false

Ranco Marcus ranco.marcus at epirion.nl
Wed Nov 12 00:05:18 PST 2008


Thanks for pointing me in the right direction. I had assumed that
syntactic predicates could only be used to resolve ambiguity in a
grammar. I'm now trying to use one in the following rule for a long
string, which starts and ends with three double quotes.

The first option gives me the error "The following alternatives can
never be matched: 2", but why? Which alternative does the 2 refer to?

LONGSTRING
	: '"""' ( options {greedy=false;} : ( '"' | '""' )? ( ~('"') ) => LCHARACTER )* '"""'
	;

The second option is still greedy and won't work either (the grammar
passes the check, but the generated code is greedy).

LONGSTRING
	: '"""' ( ( '"' | '""' )? ( ~('"') ) => LCHARACTER  )* '"""'
	;
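
For comparison, the language the LONGSTRING rule is meant to accept can
be sketched outside ANTLR. The Python snippet below is only an
illustration and says nothing about the code ANTLR generates. It assumes
that LCHARACTER means "any single character other than a double quote"
(the fragment is not shown in this thread), and the names LONGSTRING_RE
and match_longstring are hypothetical. One or two quotes are accepted
inside the string only when a non-quote character follows them, so the
first run of three quotes ends the token.

```python
import re

# Assumption: LCHARACTER stands for any single character other than '"'.
# Body: zero or more groups of (at most two quotes, then one non-quote).
LONGSTRING_RE = re.compile(r'"""(?:"{0,2}[^"])*"""')


def match_longstring(text):
    """Return the long string at the start of text, or None if there is none."""
    m = LONGSTRING_RE.match(text)
    return m.group(0) if m else None


print(match_longstring('"""a "quoted" word""" rest'))
print(match_longstring('"""first""" and """second"""'))
print(match_longstring('"""unterminated'))
```

Because every quote consumed inside the loop must be followed by a
non-quote, the loop cannot run past the closing delimiter even though
the quantifier itself is greedy.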

Any input?


-----Original Message-----
From: Sam Harwell [mailto:sharwell at pixelminegames.com] 
Sent: Tuesday, November 11, 2008 6:48 PM
To: Ranco Marcus; antlr-interest at antlr.org
Subject: RE: [antlr-interest] fragment and option greedy=false

COMMENT
    : '/*'
      (   (~'*' | '*' ~'/') => VALID_CHAR
      )*
      '*/'
    ;

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Ranco Marcus
Sent: Tuesday, November 11, 2008 11:47 AM
To: antlr-interest at antlr.org
Subject: [antlr-interest] fragment and option greedy=false

Hello all,

I would like to know if there are any limitations on the use of
fragments in non-greedy subrules. In the example below, I would like to
create a lexer rule that matches a comment, i.e. one that starts with
/*, has zero or more characters from the given fragment and ends with
*/.

grammar MulticharComment;

sentence 
	: COMMENT '.'
	;

COMMENT
//	: '/*'  ( options {greedy=false;} : VALID_CHARS )* '*/'              // option 1
	: '/*'  ( options {greedy=false;} : ('a'..'z' | '*' | '/') )* '*/'   // option 2
	;

fragment
VALID_CHARS
	: 'a'..'z' | '*' | '/'
	;
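
The behaviour intended for the COMMENT rule can be sketched outside
ANTLR. The Python snippet below is an illustration only, not what ANTLR
generates. It assumes that a non-greedy loop should stop at the first
"*/", and the names COMMENT_RE and match_comment are hypothetical. The
lazy quantifier *? plays the role of greedy=false here.

```python
import re

# Body characters mirror VALID_CHARS: 'a'..'z', '*' and '/'.
# The lazy quantifier *? leaves the loop at the first '*/' it can reach.
COMMENT_RE = re.compile(r'/\*[a-z*/]*?\*/')


def match_comment(text):
    """Return the comment at the start of text, or None if there is none."""
    m = COMMENT_RE.match(text)
    return m.group(0) if m else None


print(match_comment('/*abba*/.'))
print(match_comment('/*a*/b*/'))
```

The second call shows why greediness matters: a greedy loop over the
same character set would run on to the last "*/" in the input.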

If I use the line with the fragment (option 1), I get the following
error (in ANTLRWorks 1.2.1, ANTLR v3.1.1) which I didn't expect. 

Input: /*abba*/.
Error: problem matching token at 1:9 NoViableAltException('.'@[()*
loopback of 8:10: ( options {greedy=false; } : VALID_CHARS )*])

Where does '.'@[()* come from and what is meant by 'the loopback'?

In this particular example, I could inline the pattern itself (option 2)
and accept a little redundancy. However, if the pattern is more complex
(e.g. consists of Unicode character ranges, escapes, multiple subrules
etc.), duplicating it doesn't feel right.


It would be great if any of you could clarify this matter. Thanks in
advance.

Kind regards, Ranco Marcus
