[antlr-interest] trying to understand greedy option

xdecoret xdecoret at free.fr
Tue Aug 3 15:48:04 PDT 2004


This post follows my earlier one on non-determinism. It seems I
can silence the warning with options {greedy=true;}, but then I
run into another problem. Here is a simple grammar:

header {
}		
options {
    language="Cpp";
    genHashLines = false;
}

//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%        PARSER                              %%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{
}
class barparser extends Parser;
options {
    k=5;
    buildAST = false;
    defaultErrorHandler=false;
}
{
}
parseFile
    : (field COMA)* field (COMA)?
    ;
field
    : id EQUAL fieldValue
    ;
fieldValue 
    : (fieldValuePart PLUS)* fieldValuePart 
    ;
fieldValuePart 
    : STRING
    | NAME
    ;
id
    : NAME
    ;
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%        LEXER                               %%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%    
//%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

{
}
class barlexer extends Lexer;
options {
    k=3;
    defaultErrorHandler=false;
    caseSensitive=false;
    charVocabulary='\3'..'\377';
}
{
}
PLUS : '+'
    ;
COMA : ','
    ;
EQUAL : '='
    ;
NAME
    : ('a'..'z'|'0'..'9'|'_'|'-'|'\''|':'|'.')+ 
    ;
protected
ESC
    : '\\' ~('\n')
    ;
protected
STRING_INTERNAL
    : ( ('\\' ~('\n'))=> ESC
        | ( '\r' { newline(); }
            | '\n' { newline(); }
            | '\\' '\n'   { newline(); }
            )
        | ~( '"' | '\r' | '\n' | '\\' )
        )*
    ;
STRING: '"' t:STRING_INTERNAL '"'
        {
            $setText(t->getText());
        }
    ;
// The "\r\n" alternative below handles DOS line endings
WS
    : ( ' ' | '\t' | ('\n'| "\r\n") { newline(); })
        {
            $setType(ANTLR_USE_NAMESPACE(antlr)Token::SKIP);
        }
	;

Running ANTLR on it, I get a non-determinism warning, which I can
silence with:

fieldValue 
    : (options {greedy=true;} : fieldValuePart PLUS)* fieldValuePart 
    ;


With that change, I can parse the following file:

value = toto,
value = "toto",
value = "toto" + tata + "titi" + tutu,
value = lastone

But I cannot parse the same input if I remove the last line, so that
the input ends with the comma after tutu.

Any explanations to help me understand?
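
One possible reading, offered as an assumption rather than a confirmed
diagnosis: the failure may come from the (field COMA)* loop in parseFile
rather than from fieldValue. The sketch below is a hand-written Python
simulation, not ANTLR output. It assumes the loop is entered whenever each
of the next k=5 tokens falls in a per-depth lookahead set, and that the
loop wins when both the loop and the exit branch match. The names Parser,
LOOP_SETS, accepts and parse_file_alt are hypothetical and exist only for
this illustration.

```python
import re

# Simplified token patterns named after the lexer rules in the grammar above.
TOKEN_RE = re.compile(
    r"""\s*(?:(?P<STRING>"[^"]*")|(?P<NAME>[a-z0-9_\-':.]+)"""
    r"""|(?P<PLUS>\+)|(?P<COMA>,)|(?P<EQUAL>=))""",
    re.IGNORECASE,
)

def tokenize(text):
    """Return the list of token type names, terminated by EOF."""
    text = text.strip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError("cannot tokenize at offset %d" % pos)
        tokens.append(m.lastgroup)
        pos = m.end()
    tokens.append("EOF")
    return tokens

# ASSUMPTION: one lookahead set per depth for entering (field COMA)*, k=5.
LOOP_SETS = [{"NAME"}, {"EQUAL"}, {"STRING", "NAME"},
             {"PLUS", "COMA"}, {"STRING", "NAME"}]

class ParseError(Exception):
    pass

class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def la(self, i):
        # Lookahead token i (1-based); EOF repeats past the end.
        return self.tokens[min(self.pos + i - 1, len(self.tokens) - 1)]

    def match(self, expected):
        if self.la(1) != expected:
            raise ParseError("expected %s, found %s" % (expected, self.la(1)))
        self.pos += 1

    def field_value_part(self):
        if self.la(1) not in ("STRING", "NAME"):
            raise ParseError("expected STRING or NAME, found %s" % self.la(1))
        self.pos += 1

    def field_value(self):
        # (fieldValuePart PLUS)* fieldValuePart
        while self.la(1) in ("STRING", "NAME") and self.la(2) == "PLUS":
            self.field_value_part()
            self.match("PLUS")
        self.field_value_part()

    def field(self):
        self.match("NAME")
        self.match("EQUAL")
        self.field_value()

    def parse_file(self):
        # (field COMA)* field (COMA)?  with the loop preferred when it matches
        while all(self.la(d + 1) in s for d, s in enumerate(LOOP_SETS)):
            self.field()
            self.match("COMA")
        self.field()
        if self.la(1) == "COMA":
            self.match("COMA")

    def parse_file_alt(self):
        # Hypothetical rewrite: field (COMA field)* (COMA)?
        self.field()
        while self.la(1) == "COMA" and self.la(2) == "NAME":
            self.match("COMA")
            self.field()
        if self.la(1) == "COMA":
            self.match("COMA")

def accepts(text, rule="parse_file"):
    parser = Parser(tokenize(text))
    try:
        getattr(parser, rule)()
        return True
    except ParseError:
        return False

FULL = '''value = toto,
value = "toto",
value = "toto" + tata + "titi" + tutu,
value = lastone
'''
TRUNCATED = FULL.rsplit("value = lastone", 1)[0]

print(accepts(FULL))                         # True
print(accepts(TRUNCATED))                    # False
print(accepts(TRUNCATED, "parse_file_alt"))  # True
```

Under these assumptions, the first five tokens of the long third field
(NAME EQUAL STRING PLUS NAME) look the same whether or not another field
follows, so the loop consumes that field and its comma, and the mandatory
trailing field then meets end of file. A short final field such as
"value = toto," is still accepted, because end of file is visible within
five tokens. If this reading is right, a rule shaped like
field (COMA field)* (COMA)? might avoid the problem, since the loop
decision would then need only two tokens; I have not verified this against
ANTLR itself.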



 