[antlr-interest] C# and Unicode problems

tdjastrzebski tdjastrzebski at yahoo.com
Thu Jul 31 17:49:41 PDT 2003


Hi everybody,
Even with the charVocabulary option set to '\u0000'..'\uFFFE',
non-ASCII characters either disappear from the token text or are not
recognized at all when parsing strings such as 'pożółćgęśląjaźń'
(beginning and ending with single quotes). Am I missing something?
Do I have to create the antlr.Lexer or pass it an input stream in
any particular way?

Regards,
Tom Jastrzebski

sample grammar:

options {
	language = "CSharp";
}

class TestParser extends Parser;

options {
	k = 2;
}

statement
	: StringLiteral EOF
	;

class TestLexer extends Lexer;

options {
	k = 2;
	charVocabulary = '\u0000'..'\uFFFE';
}

StringLiteral
	: '\'' (~'\'')* '\''
	;
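
One possible explanation, offered only as a sketch: if the lexer is fed a
raw byte stream, each byte may be treated as one character, so a multi-byte
encoded letter never reaches the lexer as a single Unicode character. The
standalone C# below uses only System.IO and System.Text to contrast
byte-wise reading with decoding through a StreamReader. The names
EncodingSketch, ReadByteWise and ReadDecoded are hypothetical, and the
closing comment assumes the generated TestLexer accepts a TextReader, which
should be checked against the ANTLR runtime version in use.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingSketch
{
    // Reads the stream one byte at a time and widens each byte to a char,
    // mimicking what a byte-oriented input buffer would hand to a lexer.
    public static string ReadByteWise(Stream s)
    {
        var sb = new StringBuilder();
        int b;
        while ((b = s.ReadByte()) != -1)
        {
            sb.Append((char)b);
        }
        return sb.ToString();
    }

    // Decodes the stream through a StreamReader with an explicit encoding,
    // so each multi-byte sequence becomes one Unicode character.
    public static string ReadDecoded(Stream s, Encoding enc)
    {
        using (var reader = new StreamReader(s, enc))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // 'pożółć' written with escapes so the source file encoding does not matter.
        string input = "'po\u017C\u00F3\u0142\u0107'";
        byte[] bytes = Encoding.UTF8.GetBytes(input);

        string byteWise = ReadByteWise(new MemoryStream(bytes));
        string decoded = ReadDecoded(new MemoryStream(bytes), Encoding.UTF8);

        Console.WriteLine(byteWise == input); // False: one char per byte
        Console.WriteLine(decoded == input);  // True: decoded correctly

        // Assumption: with ANTLR, the analogous step would be to hand the
        // lexer a TextReader, e.g. new TestLexer(new StreamReader(stream, enc)),
        // rather than the raw Stream.
    }
}
```

The encoding passed to the reader has to match how the input was actually
saved; UTF-8 is used here purely for illustration.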


 
