[antlr-interest] Re: Help with grammer for IRC TEXT

afleance afleance at yahoo.com
Sun Mar 16 18:20:17 PST 2003


My main problem turned out to be that I thought the protected keyword
applied to every rule below it; I didn't realize I needed to add the
protected keyword to each rule individually. Once I did that, the lexer
worked much better.

However, I still get syntax errors thrown for input such as "http: "
or "http:\\" (note \\ instead of //). I want the lexer to return any
unmatched text as an UNMATCHED_TEXT token instead, but I can't figure
out how to do that.
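
As an illustration of the desired fallback behaviour only (Python rather than ANTLR; TOKEN_PATTERNS and tokenize are hypothetical names, and the patterns merely approximate the rules in the grammar below), a scanner that never raises on unmatched input could be sketched as:

```python
import re

# Hypothetical token table; the patterns only approximate the grammar's rules.
TOKEN_PATTERNS = [
    ("URL", re.compile(r"http://[A-Za-z0-9$\-_@.&+!*\"'(),=;/#?\\:%]+")),
    ("IRC_WORD", re.compile(r"[A-Za-z0-9_]+")),
    ("WS", re.compile(r" |\t|\r\n|\n")),
]


def tokenize(text):
    """Return (type, text) pairs; unmatched characters become tokens too."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_PATTERNS:
            m = pattern.match(text, pos)
            if m:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # Fallback: emit the single offending character and carry on.
            tokens.append(("UNMATCHED_TEXT", text[pos]))
            pos += 1
    return tokens
```

With this sketch, "http: " comes back as an IRC_WORD, an UNMATCHED_TEXT for the colon, and a WS token, rather than as an error.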

Also, my INT, FLOAT, and FLOAT_OR_INT rules aren't working properly
with an optional minus sign. I am trying to detect INT (e.g. 5 or -5)
and FLOAT (e.g. 5.5 or -5.5).
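
To make the intended classification concrete, here is a minimal sketch in Python (not ANTLR). INT_RE, FLOAT_RE, and classify are hypothetical names used only for illustration, and the sketch assumes a whole candidate token is already isolated:

```python
import re

# Hypothetical patterns mirroring the intent of the INT and FLOAT rules:
# an optional leading minus sign, then digits; FLOAT also needs ".digits".
INT_RE = re.compile(r"-?[0-9]+")
FLOAT_RE = re.compile(r"-?[0-9]+\.[0-9]+")


def classify(text):
    """Return "FLOAT", "INT", or None for a whole candidate token."""
    # Try FLOAT first, in the same order as the FLOAT_OR_INT alternatives.
    if FLOAT_RE.fullmatch(text):
        return "FLOAT"
    if INT_RE.fullmatch(text):
        return "INT"
    return None
```

Trying the longer FLOAT form first matters: otherwise the "5" in "5.5" would be accepted as an INT and the ".5" left over.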

----

class IRCLexer extends Lexer;

options {
	k=4;
	filter=false;
	/* all 8 bit chars */
	charVocabulary = '\u0000'..'\u00FF';
}

URL     : HTTP ( LETTER | DIGIT | URL_SPECIAL_CHAR )+
	;
IRC_BOLD    : '\002' /* CTRL-B*/
	;
IRC_PLAIN   :  '\u000f' /*CTRL-O*/
	;
IRC_UNDERLINE : '\u0015' /*CTRL-U*/
	  ;
IRC_REVERSE : '\u0016' /*CTRL-R*/
	;

/* <CTRL-K>[FG[,BG]] where FG=00..16 and BG=00..16, e.g. <CTRL-K>04,01 */

IRC_COLOR   : '\u0003' /*CTRL-K*/  (i:INT_2SD)? (',' (j:INT_2SD))?
	{
	  if (i != null && j != null) {
	    setText(i.getText()+","+j.getText());
	  } else if (i != null) {
	    setText(i.getText());
	  } else {
	    setText("");
	  }
	}
	;

IRC_WORD : ( LETTER | DIGIT | '_' )+
        ;

FLOAT_OR_INT : ( INT '.' ) => FLOAT 
	     { 
	     $setType(FLOAT); 
	     }
	     | ( INT )
	     { 
	     $setType(INT); 
	     }
	;

WS  :   (   ' '
        |   '\t'
        |   '\r' '\n' { newline(); }
        |   '\n'      { newline(); }
        )
        {	
/*      I want to return WS as tokens
	$setType(Token.SKIP);
*/
	} 
    ;

/* Catch all, pass through everything not matched above ?? */
UNMATCHED_TEXT : . 
	;

/********************************
 ** PROTECTED RULES 
 ********************************/
protected
FLOAT : INT '.' UNSIGNED_INT
        ;

protected
INT : ( ('-')? UNSIGNED_INT )
        ;

protected
HTTP    : "http://"
	;

protected
URL_SPECIAL_CHAR : ('$' | '-' | '_' | '@' | '.' | '&' | '+' |
		 '!' | '*' | '"' | '\'' | '(' | ')' | ',' |
		 '=' | ';' | '/' | '#' | '?' | '\\' | ':' | '%' )
	  ;

protected
LETTER : ('a'..'z'|'A'..'Z')
	;

protected
UNSIGNED_INT : (DIGIT)+
        ;

protected
DIGIT : ('0'..'9')
        ;

/* Special rule to match either 1 or 2 digit integers
   used by IRC_COLOR above */
protected
INT_2SD : (DIGIT)(DIGIT)?
        ;




 
