[antlr-interest] charVocabulary having no effect
Colm McHugh
colmmagoo at yahoo.com
Mon Dec 13 11:43:40 PST 2004
Hi Andre,
My understanding (and experience) is that you are
going to get a lexer exception ("bad character" or
whatever) for any character that is not explicitly
used to define a token in your lexer (try defining the
lower-case letter range of ID as 'a'..'y', and you
should get an exception if you enter a 'z').
The charVocabulary only comes into play if you define
a token as _not_ being a certain character or
characters; it then determines the set of characters
the token can match.
The classic case is a STRING token, the text of which
is often defined as "anything except the quote
character". What this really means is 'any
charVocabulary character except a quote'. If you
didn't specify a charVocabulary set, then your
charVocabulary would be the set of characters
explicitly used to define the tokens in your lexer.
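To make the STRING case concrete, here is a minimal
ANTLR 2 sketch. The lexer name DemoLexer and the exact
rule body are assumptions for illustration, not taken
from your grammar:

```antlr
class DemoLexer extends Lexer;
options {
    // Assumed vocabulary: the same range used in the question.
    charVocabulary = '\3'..'\377';
}

// ~'"' means "any charVocabulary character except a quote",
// so the string body accepts everything in '\3'..'\377' but '"'.
STRING
    : '"' (~'"')* '"'
    ;
```

As far as I can tell, the vocabulary only widens what
~'"' matches inside STRING. A character such as '='
that appears outside any token would still raise the
lexer exception, because no rule starts with it.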
Hope this helps,
Colm.
>
>
> I'm struggling a bit with charVocabulary. After getting lots of
> strange "unexpected character" errors I figured that this was a
> rather important option. I therefore added
>
> charVocabulary = '\3'..'\377';
>
> to my Lexer options.
>
> But I'm still getting unexpected char errors. I have a fairly simple
> grammar with a non-greedy rule to match the contents of a specific
> portion. When the lexer encounters the char '=' in this portion it
> stops, saying "unexpected character". If I then add this:
>
> POINTLESS : '=' ;
>
> The error goes away, but then it stops on some other char. This
> continues until I've added all the chars not listed in some rule in
> the lexer. So to be sure, it seems I will have to explicitly list
> *all* the ASCII characters.
>
> Grepping through the generated code I could not find a single
> reference to "charVocabulary" or "vocabulary". Is this option broken?
>
> I'm using Antlr 2.7.4 on Linux (Mandrake 10.0) with Java 1.4.2.
>
> The lexer definition from the grammar file:
>
> class QuerySchemaLexer extends Lexer;
> options {
>     charVocabulary = '\3'..'\377';
>     caseSensitiveLiterals = false;
> }
>
> RPAREN : ')';
> LPAREN : '(';
> COLON  : ':';
> SEMI   : ';';
> COMMA  : ',';
>
> ID
>     options {
>         testLiterals = true;
>         paraphrase = "an identifier";
>     }
>     : ('a'..'z'|'A'..'Z'|'_')
>       ('a'..'z'|'A'..'Z'|'_'|'0'..'9')*
>     ;
>
> WS  : ( ' '
>       | '\t'
>       | '\r' '\n' { newline(); }
>       | '\n'      { newline(); }
>       )
>       {$setType(Token.SKIP);} // ignore this token
>     ;