[antlr-interest] C# lexer and unicode

Rodrigo B. de Oliveira rbo at acm.org
Sat Jan 31 04:41:59 PST 2004


Anyway, that's how it works for me because I use
System.IO.File.OpenText(string) to get the StreamReader,
and that method assumes UTF-8. I'd bet you could create
a StreamReader with the proper encoding to make it work...
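
A minimal sketch of that idea, untested against an actual ANTLR lexer:
open the file through a StreamReader constructed with an explicit
Encoding rather than through File.OpenText. The class name
EncodedInput, the helper Open, and the lexer name MyLexer in the
comment are hypothetical. Encoding.Unicode (UTF-16) is only a stand-in
here; for a GB2312 file one would presumably pass
Encoding.GetEncoding("gb2312"), which on newer .NET runtimes may first
require registering the code-pages encoding provider.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodedInput
{
    // Opens a text file with an explicit encoding instead of the
    // UTF-8 default that File.OpenText uses.
    public static StreamReader Open(string path, Encoding encoding)
    {
        return new StreamReader(path, encoding);
    }

    public static void Main()
    {
        string path = Path.GetTempFileName();
        // Sample input containing non-Latin identifier characters.
        string text = "\u57FA\u91D1\u4EE3\u7801 VARCHAR(6) NOT NULL";
        Encoding encoding = Encoding.Unicode; // stand-in for the file's real encoding

        // Write the sample in the chosen encoding, then read it back.
        File.WriteAllText(path, text, encoding);
        using (StreamReader reader = Open(path, encoding))
        {
            // Assumption: the generated lexer accepts a TextReader,
            // e.g. new MyLexer(reader). Here we only check the decoding.
            Console.WriteLine(reader.ReadToEnd() == text);
        }
        File.Delete(path);
    }
}
```

The reader has to use the encoding the file was actually saved in;
otherwise the characters are already wrong before the lexer sees them.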

[]s,
Rodrigo

----- Original Message ----- 
From: "Rodrigo B. de Oliveira" <rbo at acm.org>
To: <antlr-interest at yahoogroups.com>
Sent: Saturday, January 31, 2004 10:36 AM
Subject: Re: [antlr-interest] C# lexer and unicode


> They work OK for me (for Latin characters such as çãéõü), but the
> input files must be UTF-8 encoded.
>
> Best wishes,
> Rodrigo
>
> ----- Original Message ----- 
> From: "maaxxxcal" <maaxxxcal at yahoo.com>
> To: <antlr-interest at yahoogroups.com>
> Sent: Saturday, January 31, 2004 2:17 AM
> Subject: [antlr-interest] C# lexer and unicode
>
>
> I would like to know if ANTLR's C# parser generator supports Unicode.
> I have an input that contains some Chinese/Japanese identifiers and
> they are not being lexed properly. They are simply skipped from
> the stream and never even show up in the lexer's nextToken() method.
>
> I wonder if this is because there is something wrong in my lexer or
> just because it's not yet fully supported.
>
> I have:
>
>   charVocabulary = '\u0000'..'\ufffe';
>
> Here's my whitespace rule:
>
> // Whitespace -- ignored
> WS      : ( options { generateAmbigWarnings = false; }
>   : ' ' // blank
>   | '\t' // tab
>   | "\r\n"    {newline();} // Windows
>   | ('\r'|'\n') {newline();} // Unix or Mac
>   | '\f'      // form feed
>   | ('\0'..'\10'|'\16'..'\37')  // control characters
>   ) {$setType(Token.SKIP);}
> ;
>
> Here's my rule for identifiers:
>
> IDENT
> options {testLiterals=true;
>          paraphrase="an identifier";}
> : ('\u0080'..'\ufffe'|'a'..'z'|'_')
> ('\u0080'..'\ufffe'|'a'..'z'|'_'|'$'|'0'..'9')*
> ;
>
> And here's the string I'm trying to parse:
>
> 基金代码 VARCHAR(6) NOT NULL
>
>
>
>
>
> Yahoo! Groups Links
>
> To visit your group on the web, go to:
>  http://groups.yahoo.com/group/antlr-interest/
>
> To unsubscribe from this group, send an email to:
>  antlr-interest-unsubscribe at yahoogroups.com
>
> Your use of Yahoo! Groups is subject to:
>  http://docs.yahoo.com/info/terms/


 




