[antlr-interest] UTF-8, charVocabulary in options in 3.3

Matej Mailing mailing at tam.si
Fri Jun 29 03:26:01 PDT 2012


Hi,

I am new to ANTLR and have already run into an issue. My input file
contains some UTF-8 characters (such as U+0161 -
http://www.fileformat.info/info/unicode/char/161/index.htm) and I am
using ANTLRFileStream(inputfile, "UTF-8") to read it, so the input is
decoded as UTF-8, as it should be. However, when I define the rule
"RES      : '\u0161' ;"

it never matches. Instead I get the message
"input1 line 1:0 no viable alternative at character 'š'".
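
To rule out a decoding problem on the input side, the code points the
JVM produces for the file's bytes can be inspected directly. The
following is only a minimal sketch: the class name Utf8Check and the
helper codePoints are hypothetical, and it uses only the Java standard
library (Java 8 or later), not ANTLR. It decodes the two bytes that
encode U+0161 in UTF-8, first with the right charset and then with
ISO-8859-1 for comparison.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.stream.Collectors;

public class Utf8Check {
    // Hypothetical helper: decode raw bytes with the given charset and
    // return the resulting Unicode code points.
    static List<Integer> codePoints(byte[] raw, Charset charset) {
        String text = new String(raw, charset);
        return text.codePoints().boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 0xC5 0xA1 is the UTF-8 encoding of U+0161.
        byte[] raw = {(byte) 0xC5, (byte) 0xA1};

        // Decoded as UTF-8, the bytes yield one code point.
        for (int cp : codePoints(raw, StandardCharsets.UTF_8)) {
            System.out.printf("UTF-8:      U+%04X%n", cp);
        }

        // Decoded with the wrong charset, each byte becomes its own character.
        for (int cp : codePoints(raw, StandardCharsets.ISO_8859_1)) {
            System.out.printf("ISO-8859-1: U+%04X%n", cp);
        }
    }
}
```

If the same check on the real file's bytes (for example read with
java.nio.file.Files.readAllBytes) shows U+0161 where expected, the
input decoding is probably not the cause.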

When I add the following segment to the grammar file:

"options
{
           charVocabulary='\u0000'..'\uFFFE';
}"

I get an error:
"internal error:  : java.lang.Error: Error parsing grammar.g: '\uFFFE'
not expected ';'"
...
error(100): grammar.g:5:24: syntax error: antlr: grammar.g:5:24:
expecting SEMI, found '..'
error(133): grammar.g:3:1: illegal option charVocabulary"

I have been googling around for quite some time, and none of the
solutions I found seems to work. What am I doing wrong?

Thanks a lot in advance for any ideas.

BR,
Matej

