[antlr-interest] Lex Matching Issues

Cid Dennis cid at kynetx.com
Mon Jul 19 08:52:58 PDT 2010


I am new to ANTLR and have written a grammar, but I have run into a strange issue. Because of the structure of the language I am parsing, a variable name can be spelled the same as a reserved word, but only inside a sub-rule that does not itself use that reserved word.

In the example below, "ruleset" needs to be seen by the parser in two different ways: first as the 'ruleset' keyword token, and second as a variable token (ID in the grammar below). The problem is that when the parser reaches the second "ruleset", it treats the token as the 'ruleset' keyword rather than as a variable, so it throws a MismatchedTokenException.

How can I make this kind of parsing work? One workaround I came up with was to replace the 'ruleset' literal in the grammar with the variable token (ID), but then it is no longer easy to see what the language looks like from reading the grammar.
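
A minimal sketch of that workaround, assuming the Java target and assuming no other rule still mentions the 'ruleset' literal, so that the lexer emits ID for it. The semantic predicate is an illustrative addition, not part of the grammar in this post, and it is untested; it only checks that the first ID is spelled "ruleset":

```antlr
// Workaround sketch: match the keyword as a plain ID and check its text.
// Assumes the Java target; the predicate is illustrative and untested.
ruleset :
	{input.LT(1).getText().equals("ruleset")}? ID ID '{' rule* '}'
	;
```

The drawback is the one already mentioned: the keyword is now hidden inside an action instead of appearing as a literal in the rule.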

In the end I do not care which token type is assigned (the variable token or 'ruleset'), as long as the parser does the right thing and can parse the "assignment" rule when "ruleset" appears on the left-hand side of the assignment.
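
One way this could be expressed, sketched under the assumption that a parser rule may list keyword literals as alternatives next to ID. The rule name `id` is hypothetical and does not appear in the grammar in this post, and the sketch is untested:

```antlr
// Hypothetical helper rule: accept a real ID or any keyword wherever a
// variable name is allowed. Each keyword still lexes as its own token.
id  :	ID
	| 'ruleset' | 'rule' | 'is' | 'active' | 'inactive' | 'test'
	;

assignment :
	id '=' STRING ';'
	;
```

The cost of this approach would be that every new keyword has to be added to the `id` rule by hand.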


Simple Example Input:

ruleset joe {
	rule myrulename is active {
		ruleset = "test";
	}	
}

Simple Grammar:

grammar test;
options {
  output=AST;
}

ID  :	('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'_')*
    ;

COMMENT
    :   '//' ~('\n'|'\r')* '\r'? '\n' {$channel=HIDDEN;}
    |   '/*' ( options {greedy=false;} : . )* '*/' {$channel=HIDDEN;}
    ;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;

STRING
    :  '"' ( ESC_SEQ | ~('\\'|'"') )* '"'
    ;

fragment
EXPONENT : ('e'|'E') ('+'|'-')? ('0'..'9')+ ;

fragment
HEX_DIGIT : ('0'..'9'|'a'..'f'|'A'..'F') ;

fragment
ESC_SEQ
    :   '\\' ('b'|'t'|'n'|'f'|'r'|'\"'|'\''|'\\')
    |   UNICODE_ESC
    |   OCTAL_ESC
    ;

fragment
OCTAL_ESC
    :   '\\' ('0'..'3') ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7') ('0'..'7')
    |   '\\' ('0'..'7')
    ;

fragment
UNICODE_ESC
    :   '\\' 'u' HEX_DIGIT HEX_DIGIT HEX_DIGIT HEX_DIGIT
    ;
    
    
ruleset :	
	'ruleset' ID '{' rule* '}'
	;
	
rule 	:
	'rule' ID 'is' ('active'|'inactive'|'test') '{' assignment* '}'
	;


assignment :  
	ID '=' STRING ';'
	;

	

Thanks for the help

------------------------------------------
Cid Dennis






