[antlr-interest] Understanding Lexer rules

Tue Feb 19 12:17:39 PST 2008

I think I am misunderstanding how lexer rules work in ANTLR.  Suppose
I create a simple rule to capture whole numbers:

UINT: ('0'..'9')+;

I find that I can no longer match a number any other way in a rule,
because the lexer has already turned every run of digits into a UINT
token.  This causes problems when I want to match, say, a single digit
somewhere in a parser rule.

The same problem comes up if I add a lexer rule such as:
ALPHA: ('a'..'z'|'A'..'Z')+;

This means any run of letters is consumed as an ALPHA token before the
parser can do anything with it.  That is a problem when a parser rule
wants, say, just 'T' rather than ALPHA.
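
To make the behaviour concrete, here is a minimal sketch in Python (not
ANTLR) of what I believe is happening.  It assumes the lexer splits the
whole input into tokens with no knowledge of what any parser rule
expects.  The tokenize function, the TOKEN_SPEC table and the WS rule
are hypothetical names for illustration only, not part of ANTLR.

```python
import re

# Hypothetical stand-ins for the UINT and ALPHA lexer rules above.
# WS is an assumed extra rule so that the sample input can contain spaces.
TOKEN_SPEC = [
    ("UINT", r"[0-9]+"),
    ("ALPHA", r"[a-zA-Z]+"),
    ("WS", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Split text into (type, text) pairs without consulting any parser rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "WS":  # drop whitespace, keep everything else
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


# A lone 'T' and a lone '7' still come out as ALPHA and UINT.
print(tokenize("T 7 42"))
# prints [('ALPHA', 'T'), ('UINT', '7'), ('UINT', '42')]
```

In this model a parser rule asking for the literal 'T' or for a single
digit never sees one, because the only token types the lexer produces
are ALPHA and UINT.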

Am I using the lexer improperly?  Is there a better way to use these
rules?

Thanks.

---
Shawn Poulson
spoulson at explodingcoder.com