[antlr-interest] lexer: compound keywords with a twist

Mon Aug 20 01:48:28 PDT 2007

Unfortunately I do not fully follow.  I'm struggling a bit because my
knowledge of ANTLR is still very basic, but I *may* have an idea of what
you mean.

I really want the lexer to do the compound keyword recognition so the
parser only has to worry about single tokens.  Basically, the lexer
works harder so the parser doesn't need to.  I decided on this because I
wanted the parser and the lexer to be more reusable for another project,
keeping keyword recognition in the lexer.

I would actually like to reuse the lexer for a syntax-aware editor.

I will look into this as you have suggested.

Thank you,

W.

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Susan Jolly
Sent: Sunday, August 19, 2007 11:43 PM
To: antlr-interest at antlr.org
Subject: [antlr-interest] lexer: compound keywords with a twist

This might be a case where you want to take advantage of the ability to
emit more than one token per lexer rule as explained in the ANTLR book
starting on page 95.

If you use a lexer rule similar to 

KEYWORD : ('a'..'z'| 'A'..'Z'| ' '| '$')+;

it will get all of your "compound" keywords plus, of course, other
sequences.

Then you include your own emit() method in your lexer that emits this
token if the token text actually is a keyword.  If not, you use a custom
"mini-lexer" to rescan the token text and emit the correct sequence of
tokens. Of course, you wouldn't want to do this unless the "mini-lexer"
is very simple.

I had a situation where the "mini-lexer" simply had to emit each
character as a separate token so this strategy worked really well.
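
The decision described above (emit the whole match if it is a keyword,
otherwise rescan it) can be sketched in plain Java without any ANTLR
dependency.  This is only an illustration under stated assumptions: the
class name, the keyword table, and the "TYPE:text" string representation
of a token are all hypothetical, and the fallback is the
one-token-per-character "mini-lexer" from my own case, not necessarily
what your grammar needs.  In a real lexer this logic would sit inside
your custom emit() method and would queue real token objects instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CompoundKeywordRescan {
    // Hypothetical keyword table; the real list depends on the language being lexed.
    static final Set<String> KEYWORDS =
            new HashSet<>(Arrays.asList("END IF", "GO TO", "SELECT"));

    // Given the text matched by the greedy KEYWORD rule, return the tokens
    // that should be emitted, written here as simple "TYPE:text" strings.
    static List<String> tokensFor(String text) {
        List<String> out = new ArrayList<>();
        if (KEYWORDS.contains(text)) {
            // The whole match is a (possibly compound) keyword: one token.
            out.add("KEYWORD:" + text);
        } else {
            // "Mini-lexer": rescan the text, one token per character.
            for (char c : text.toCharArray()) {
                out.add("CHAR:" + c);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokensFor("END IF")); // prints [KEYWORD:END IF]
        System.out.println(tokensFor("a$b"));    // prints [CHAR:a, CHAR:$, CHAR:b]
    }
}
```

Returning a list keeps the keyword test separate from whatever queueing
the lexer uses to hand several tokens to the parser, so the same check
could be reused by an editor that only needs the token boundaries.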