[antlr-interest] Re: Help with grammar for IRC TEXT
afleance
afleance at yahoo.com
Sun Mar 16 18:20:17 PST 2003
I found my main problem: I thought the protected keyword applied
to all rules below it, and didn't realize I needed to add the
protected keyword to each rule. Once I did that, the lexer worked
much better.
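To illustrate the point about protected, here is a minimal sketch using two of the helper rules from the grammar in this post. The keyword is a per-rule modifier, so it has to be repeated in front of every rule that should only be called from other lexer rules.

```
/* protected applies only to the rule that immediately follows it */
protected
LETTER : ('a'..'z'|'A'..'Z')
;
/* without its own protected, DIGIT would be returned as a token */
protected
DIGIT : ('0'..'9')
;
```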
However, I still get syntax errors thrown for input like "http: "
or "http:\\" (note \\ instead of //). I want the lexer to simply
return any unmatched text as an UNMATCHED_TEXT token (the
catch-all rule below), but I can't figure out how to do that.
Also, my rules for INT, FLOAT, and FLOAT_OR_INT aren't working
properly with an optional minus sign. I am trying to detect
INT, e.g. 5 or -5, and FLOAT, e.g. 5.5 or -5.5.
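What I mean by an optional minus sign, as a sketch in ANTLR 2 syntax. This shows the intent only and is not a confirmed fix: the INT rule as posted in the grammar requires the '-', and I am assuming the usual ( ... )? optional subrule is the right way to express it.

```
/* Intended shape: the leading '-' is optional, so both 5 and -5 match */
protected
INT : ( '-' )? UNSIGNED_INT
;
/* FLOAT builds on INT, so 5.5 and -5.5 should then match as well */
protected
FLOAT : INT '.' UNSIGNED_INT
;
```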
----
class IRCLexer extends Lexer;
options {
k=4;
filter=false;
/* all 8 bit chars */
charVocabulary = '\u0000'..'\u00FF';
}
URL : HTTP ( LETTER | DIGIT | URL_SPECIAL_CHAR )+
;
IRC_BOLD : '\002' /* CTRL-B*/
;
IRC_PLAIN : '\u000f' /*CTRL-O*/
;
IRC_UNDERLINE : '\u0015' /*CTRL-U*/
;
IRC_REVERSE : '\u0016' /*CTRL-R*/
;
/* <CTRL-K>[FG[,BG]] where FG=00..16 and BG=00..16, e.g. <CTRL-K>04,01
*/
IRC_COLOR : '\u0003' /*CTRL-K*/ (i:INT_2SD)? (',' (j:INT_2SD))?
{
if (i != null && j != null) {
setText(i.getText()+","+j.getText());
} else if (i != null) {
setText(i.getText());
} else {
setText("");
}
}
;
IRC_WORD : ( LETTER | DIGIT | '_' )+
;
FLOAT_OR_INT : ( INT '.' ) => FLOAT
{
$setType(FLOAT);
}
| ( INT )
{
$setType(INT);
}
;
WS : ( ' '
| '\t'
| '\r' '\n' { newline(); }
| '\n' { newline(); }
)
{
/* I want to return WS as tokens
$setType(Token.SKIP);
*/
}
;
/* Catch all, pass through everything not matched above ?? */
UNMATCHED_TEXT : .
;
/********************************
** PROTECTED RULES
********************************/
protected
FLOAT : INT '.' UNSIGNED_INT
;
protected
INT : ( '-' UNSIGNED_INT )
;
protected
HTTP : "http://"
;
protected
URL_SPECIAL_CHAR : ('$' | '-' | '_' | '@' | '.' | '&' | '+' |
'!' | '*' | '"' | '\'' | '(' | ')' | ',' |
'=' | ';' | '/' | '#' | '?' | '\\' | ':' | '%' )
;
protected
LETTER : ('a'..'z'|'A'..'Z')
;
protected
UNSIGNED_INT : (DIGIT)+
;
protected
DIGIT : ('0'..'9')
;
/* Special rule to match either 1 or 2 digit integers
used by IRC_COLOR above */
protected
INT_2SD : (DIGIT)(DIGIT)?
;