[antlr-interest] character ranges in parser rules

Andy Tripp antlr at jazillian.com
Thu Aug 2 13:33:34 PDT 2007


I had this parser rule:

labelStatement:
    (LETTER (LETTER| '_'| '0'..'9')+ ':')
    ;

and, of course, this lexer rule elsewhere in the grammar:

INT:
    ('0'..'9')+
    ;

...so I had inadvertently put a character range in a parser rule rather
than a lexer rule. That caused '0' and '9' to be treated as individual
tokens, which is not at all what I wanted.

Is it invalid to put a character range in a parser rule like this? An
error message when compiling the grammar with ANTLR would have been nice.

Then I changed it to:

labelStatement:
    (LETTER (LETTER| '_'|  ('0'..'9'))+ ':')
    ;

...and then ANTLR produces parser code with a matchRange() call in it,
but matchRange() is in the Lexer base class, not Parser, so the
generated code is wrong. Again, a "you can't put ranges in a parser
rule" message would have been nice.

Finally, I changed it to:

labelStatement:
    (LETTER (LETTER| '_'| INT)+ ':')
    ;

And all is good :)
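
For anyone hitting the same thing, here is a minimal sketch of the
split that ended up working: the parser rule refers only to token
types, and the character range stays on the lexer side. It assumes
ANTLR 3 syntax for the `fragment` keyword, and DIGIT is a hypothetical
helper name that is not in my grammar. LETTER is defined elsewhere and
not shown here.

```
// Parser rule: only token references and literals, no character ranges.
labelStatement
    : LETTER (LETTER | '_' | INT)+ ':'
    ;

// Lexer rule: one INT token covers a whole run of digits.
INT
    : DIGIT+
    ;

// Hypothetical helper (assumes ANTLR 3 syntax): a fragment rule is only
// used by other lexer rules and is never emitted as a token by itself.
fragment DIGIT
    : '0'..'9'
    ;
```

This is just an illustration of keeping ranges in lexer rules, not the
only way to write it. The original INT rule with the range written
inline works equally well.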