[antlr-interest] Understanding Lexer rules

Tue Feb 19 12:44:07 PST 2008

Oops, meant to send this to the list, since any confirmation or correction
is appreciated. Re-sending:

_______
I was thinking about this earlier today... I probably have some incorrect
assumptions in here, but my current theories:

1) It helps to consider lexer rules (token definitions) to be separate from
the parser rules, even though they're in the same file.
2) Unlike parser rules, the order of appearance matters. (The auto-named
tokens generated by literals in parser rules are appended.)
3) The lexer seeks to match the first viable token.
4) Order your tokens from the most specific and complex to the least
specific (most generic).
5) Ensure that any lexer rules which exist only for convenience (and are not
fully valid first-class tokens on their own) are marked with the "fragment"
keyword.

Bear in mind, these are things I haven't rigorously tested and I'm not an
ANTLR guru.

So, for example, I'd put NUMERIC (the specific case) before ALPHANUMERIC in
the lexer rules.
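
To make that concrete, here is a rough, untested ANTLR 3 sketch of points 2, 4
and 5. The grammar name, the parser rule "value", and the helper rules DIGIT
and LETTER are names I made up for illustration. How the generated lexer
resolves the overlap between the two tokens is still my assumption, not
something I have verified.

```antlr
grammar Example;

// Parser rule: where it sits relative to the lexer rules should not matter.
value : NUMERIC | ALPHANUMERIC ;

// Lexer rules: the more specific token is listed first, on the assumption
// that an input such as "123", which both rules can match, goes to NUMERIC.
NUMERIC      : DIGIT+ ;
ALPHANUMERIC : (LETTER | DIGIT)+ ;

// Convenience rules marked 'fragment' are only usable from other lexer
// rules; they are never emitted as tokens on their own.
fragment DIGIT  : '0'..'9' ;
fragment LETTER : 'a'..'z' | 'A'..'Z' ;
```

If the theory holds, swapping the order of NUMERIC and ALPHANUMERIC should
make NUMERIC unreachable for pure-digit input, which would be an easy way to
test it.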

-- 
Darien Hager
Developer
Etelos, Inc.
darien at etelos.com

http://www.etelos.com
"Revolutionizing the way applications are developed, distributed and
consumed."
