[antlr-interest] lexer: compound keywords with a twist
Gavin Lambert
antlr at mirality.co.nz
Mon Aug 20 04:02:51 PDT 2007
At 12:52 20/08/2007, Edwards, Waverly wrote:
>I'm a first time ANTLR user and I have some questions that I need
>some assistance with.
>I am replicating an existing procedural BASIC dialect language
>compiler. I actually have
>multiple issues to overcome but this is the first one. The
>language has *hundreds* of keywords.
>Many of the keywords are actually compound keywords:
>
>Edit = numericVar
>Edit Field
>Edit Field Close
>Edit Menu
>Edit Text
>Compile Long If
For that case, my first cut attempt would be something along these
lines (not sure if it'll compile without warnings, but I think
it's close):
EDIT_FIELD
    : 'Edit'
      ( WS
        ( 'Field'
          ( WS 'Close' { $type = EDIT_FIELD_CLOSE; }
          | /* nothing -- EDIT_FIELD */
          )
        | 'Menu' { $type = EDIT_MENU; }
        | 'Text' { $type = EDIT_TEXT; }
        )
      | /* nothing */ { $type = IDENTIFIER; }
      )
    ;
(Where WS is defined to exclude newlines, unless your language
supports these multi-word keywords being broken across lines too.)
This is basically the "how you'd parse it by eye" approach.
(Though it'll be more complicated if you want to be
case-insensitive as well...)
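To make the "parse it by eye" idea concrete outside of ANTLR, here is a
minimal hand-written sketch in Python. It is not ANTLR output and may not
match how a generated lexer behaves. The keyword table, the names KEYWORDS
and match_keyword, and the identifier pattern are all assumptions made for
illustration. It reads up to as many blank-separated words as the longest
keyword has, then falls back to shorter matches, comparing in lowercase to
get case-insensitivity.

```python
import re

# Hypothetical keyword table: tuple of lowercase words -> token type name.
KEYWORDS = {
    ("edit", "field"): "EDIT_FIELD",
    ("edit", "field", "close"): "EDIT_FIELD_CLOSE",
    ("edit", "menu"): "EDIT_MENU",
    ("edit", "text"): "EDIT_TEXT",
}
MAX_WORDS = max(len(key) for key in KEYWORDS)

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumed identifier shape
SPACES = re.compile(r"[ \t]+")                # deliberately excludes newlines


def match_keyword(text, pos=0):
    """Return (token_type, end_offset) for the longest match at pos.

    Returns None if no word starts at pos.
    """
    first = WORD.match(text, pos)
    if first is None:
        return None
    words = [first.group().lower()]
    ends = [first.end()]
    cur = first.end()
    # Look ahead over further words separated by blanks on the same line.
    while len(words) < MAX_WORDS:
        gap = SPACES.match(text, cur)
        if gap is None:
            break
        nxt = WORD.match(text, gap.end())
        if nxt is None:
            break
        words.append(nxt.group().lower())
        ends.append(nxt.end())
        cur = nxt.end()
    # Prefer the longest compound keyword, then shorter ones.
    for n in range(len(words), 1, -1):
        kind = KEYWORDS.get(tuple(words[:n]))
        if kind is not None:
            return kind, ends[n - 1]
    # No compound keyword: only the first word is consumed.
    return "IDENTIFIER", ends[0]
```

With this table, match_keyword("Edit Field Close") returns
("EDIT_FIELD_CLOSE", 16), while match_keyword("Edit = numericVar") returns
("IDENTIFIER", 4) and leaves the rest of the line for later tokens.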
The last case I'm a little unsure about. It's easy enough to
handle 'Compile' as identifier vs. 'Compile Long' as keyword, but
treating 'Compile Long If' as a keyword and 'Compile Long Foo' as
three identifiers would be tricky, and would probably require
emitting multiple tokens from a single lexer rule. (It becomes
easier again if you can treat some of these cases as illegal.)
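As a rough illustration of what "emitting multiple tokens from a single
lexer rule" could look like, here is a hand-written Python sketch that
keeps a queue of pending tokens. The class name Lexer, the helper tokens,
and the token type names are hypothetical, and the input is assumed to
consist only of words and blanks. When the keyword fails to complete, the
words already matched are queued as separate identifiers and the word that
broke the match is un-read.

```python
import re
from collections import deque

WORD = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")  # assumed identifier shape
SPACES = re.compile(r"[ \t]+")


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        self.queue = deque()  # tokens produced but not yet handed out

    def _word(self):
        """Skip blanks, then consume and return one word, or None."""
        gap = SPACES.match(self.text, self.pos)
        if gap:
            self.pos = gap.end()
        m = WORD.match(self.text, self.pos)
        if m is None:
            return None
        self.pos = m.end()
        return m.group()

    def _compile_rule(self, first):
        """Rule entered after reading 'Compile'; may queue several tokens."""
        consumed = [first]
        for expected in ("long", "if"):
            mark = self.pos
            word = self._word()
            if word is None or word.lower() != expected:
                self.pos = mark  # un-read the word that broke the keyword
                break
            consumed.append(word)
        else:
            # Every part matched: one compound keyword token.
            self.queue.append(("COMPILE_LONG_IF", " ".join(consumed)))
            return
        # Keyword did not complete: each consumed word is an identifier.
        self.queue.extend(("IDENTIFIER", w) for w in consumed)

    def next_token(self):
        if not self.queue:
            word = self._word()
            if word is None:
                return ("EOF", "")
            if word.lower() == "compile":
                self._compile_rule(word)
            else:
                self.queue.append(("IDENTIFIER", word))
        return self.queue.popleft()


def tokens(text):
    """Collect all tokens up to (not including) EOF."""
    lexer, out = Lexer(text), []
    while True:
        tok = lexer.next_token()
        if tok[0] == "EOF":
            return out
        out.append(tok)
```

Here tokens("Compile Long If") yields a single COMPILE_LONG_IF token,
whereas tokens("Compile Long Foo") yields three IDENTIFIER tokens, two of
them queued by the same rule invocation.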
>2. Is it possible to deal with variable length keywords at the
>lexer level?
>
>stringVar = Edit$( vNumParam )
>Edit$( vNumParam ) = stringVar
Possibly, but that seems more like a job for the parser. At the
parser level you can examine the surrounding context and then emit
an EditStatement or EditFunction into the AST.
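A small Python sketch of that parser-level decision follows. It assumes a
deliberately tiny grammar (one assignment per line, with each side either a
plain name or Edit$( name )), and the node names Assign and Var, along with
all function names, are hypothetical. The lexer emits the same Edit$ token
in both cases; only the side of the "=" it appears on decides which node is
built.

```python
import re

TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*\$?|[()=])")


def tokenize(line):
    """Split a line into names (optionally ending in '$'), '(', ')', '='."""
    line = line.rstrip()
    toks, pos = [], 0
    while pos < len(line):
        m = TOKEN.match(line, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at column {pos}")
        toks.append(m.group(1))
        pos = m.end()
    return toks


def at(toks, i):
    """Token at index i, or '' past the end of the line."""
    return toks[i] if i < len(toks) else ""


def is_name(tok):
    return tok[:1].isalpha() or tok[:1] == "_"


def parse_operand(toks, i):
    """Parse NAME or Edit$ '(' NAME ')'; return (node, next_index)."""
    tok = at(toks, i)
    if tok.lower() == "edit$":
        arg = at(toks, i + 2)
        if at(toks, i + 1) != "(" or not is_name(arg) or at(toks, i + 3) != ")":
            raise SyntaxError("expected Edit$( name )")
        return ("edit", arg), i + 4
    if not is_name(tok):
        raise SyntaxError("expected a name")
    return ("var", tok), i + 1


def parse_line(line):
    """Build an AST node; context decides statement vs. function."""
    toks = tokenize(line)
    left, i = parse_operand(toks, 0)
    if at(toks, i) != "=":
        raise SyntaxError("expected '='")
    right, j = parse_operand(toks, i + 1)
    if j != len(toks):
        raise SyntaxError("unexpected trailing tokens")
    # Right-hand side: Edit$ is being read, so it is a function call.
    if right[0] == "edit":
        value = ("EditFunction", ("Var", right[1]))
    else:
        value = ("Var", right[1])
    # Left-hand side: Edit$ is being assigned to, so it is a statement.
    if left[0] == "edit":
        return ("EditStatement", ("Var", left[1]), value)
    return ("Assign", ("Var", left[1]), value)
```

For the two lines quoted above, parse_line produces an Assign node wrapping
an EditFunction for the first and an EditStatement node for the second.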