[antlr-interest] accentuated chars in lexer makes gcj failed
Mathieu Clabaut
mathieu.clabaut at gmail.com
Mon Feb 7 07:44:25 PST 2005
Hello,
the following lexer rule:
CHAR : ('a'..'z'|'A'..'Z'|'_'| '-'
| 'é' | 'è' | 'ê' | 'ë'
| 'á' | 'à' | 'â' | 'ä'
| 'ú' | 'ù' | 'û' | 'ü' | 'î' | 'ï'
| 'ô' | 'ö' );
gets translated into the following piece of generated code:
    case '-':
    {
        match('-');
        break;
    }
    case '\u00e9':
    {
        match('é');
        break;
    }
This works well when compiling to Java bytecode with javac, but gcj
complains about the accented character 'é':
GraphesLexer.java:671: erreur: unrecognized character in input stream.
If I replace 'é' by '\u00e9', it works like a charm.
Is there any reason why 'é' is used instead of '\u00e9'?
(It is perhaps a bug in gcj, but the difference between the case
label and the match() parameter looks strange to the newbie I am.)
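
In case it helps anyone hitting the same error, here is a minimal sketch of how the manual replacement could be automated on the generated source before handing it to gcj. The class name AsciiEscaper and the method escapeNonAscii are hypothetical names chosen for this illustration and are not part of ANTLR or gcj. It assumes the generated file has already been read into a String using the correct encoding.

```java
public class AsciiEscaper {

    // Replace every char above 0x7F with the equivalent six-character
    // backslash-u escape, so the resulting source is pure ASCII.
    // (Hypothetical helper, not an ANTLR or gcj API.)
    static String escapeNonAscii(String source) {
        StringBuilder out = new StringBuilder(source.length());
        for (int i = 0; i < source.length(); i++) {
            char c = source.charAt(i);
            if (c > 0x7F) {
                // "\\u" is a literal backslash followed by 'u';
                // %04x pads the code unit to four hex digits.
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The accented char is written as an escape here so that this
        // file itself stays ASCII-only.
        String generated = "match('\u00e9');";
        System.out.println(escapeNonAscii(generated));
    }
}
```

Since the Java compiler treats such an escape exactly like the character it stands for, the escaped lexer should behave the same as the original one; only the encoding the compiler has to guess at goes away.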
-mat
--
________________http://www.gnu.org/philosophy/no-word-attachments.fr.html
Mathieu CLABAUT mailto:mathieu.clabaut at free.fr
F2F5 442F F2AC E1D5 9D31 3EFC 842A BC4A 123B 9A65