[antlr-interest] lexer: compound keywords with a twist

Sam Ellis sam.ellis at arm.com
Mon Aug 20 00:14:46 PDT 2007


On 20/8/07 01:52, "Edwards, Waverly" <Waverly.Edwards at genesys.com> wrote:
> 2.  Is it possible to deal with variable length keywords at the lexer
> level.
> 
> stringVar = Edit$( vNumParam )
> Edit$( vNumParam ) = stringVar

In my grammar, I solve this with a semantic predicate in the parser that
calls a method which knows how to identify my keywords. For example, my
isKeyword() method takes an argument describing the expected keyword (text
to the left of '*' is the minimum stem that must be matched, and the text
after it is optional):

statement
    :    {isKeyword("enum*eration", input.LT(1))}? keyword EQUALS value
    ;
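
As an illustration only, here is a minimal Java sketch of what a helper like
isKeyword() might look like. It is not the actual method from the grammar
above. To stay self-contained it takes the token text as a String (in the
parser you would presumably pass something like the text of input.LT(1)),
and it assumes case-insensitive matching, which the original description
does not state. The class name KeywordMatcher is hypothetical.

```java
import java.util.Locale;

// Hypothetical helper: matches abbreviated keywords described by patterns
// such as "enum*eration" (minimum stem before '*', optional tail after it).
final class KeywordMatcher {
    private KeywordMatcher() {}

    // Returns true if text is the stem plus any leading part of the tail.
    // Assumption: comparison is case-insensitive.
    static boolean isKeyword(String pattern, String text) {
        if (pattern == null || text == null) {
            return false;
        }
        String p = pattern.toLowerCase(Locale.ROOT);
        int star = p.indexOf('*');
        // Without a '*', the whole pattern is the mandatory stem.
        String stem = star < 0 ? p : p.substring(0, star);
        String full = star < 0 ? p : stem + p.substring(star + 1);
        String candidate = text.toLowerCase(Locale.ROOT);
        // The candidate must cover the whole stem and be a prefix of the
        // fully spelled keyword.
        return candidate.length() >= stem.length() && full.startsWith(candidate);
    }
}
```

With this sketch, "enum", "enumer" and "enumeration" all match the pattern
"enum*eration", while "enu" (shorter than the stem) and "enumx" (not a
prefix of the full keyword) do not.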

To do this in the lexer instead, you could either explicitly code all the
combinations:

ENUM: 'enum' ('e' ('r' ('a' ('t' ('i' ('o' ('n')? )? )? )? )? )? )? ;

Or, as the previous poster suggested, define a generic lexer rule that
matches any keyword, with an action that performs the disambiguation and
returns the correct token type.
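
The disambiguation step of that second approach could be sketched in Java
roughly as follows. This is an assumption-laden illustration, not code from
the previous poster: the class KeywordTable, the token type constants and
the second pattern "str*ucture" are all hypothetical, and matching is
assumed to be case-insensitive. The idea is that the generic rule matches
any identifier-like text, and the action asks a table which token type that
text stands for.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup used from a generic lexer rule's action.
final class KeywordTable {
    // Hypothetical token type numbers; a generated lexer defines its own.
    static final int IDENTIFIER = 4;
    static final int ENUM = 5;
    static final int STRUCTURE = 6;

    // Pattern -> token type; '*' separates the minimum stem from the tail.
    private static final Map<String, Integer> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("enum*eration", ENUM);
        PATTERNS.put("str*ucture", STRUCTURE); // hypothetical second keyword
    }

    private KeywordTable() {}

    // Returns the keyword token type for text, or IDENTIFIER if none matches.
    static int tokenTypeFor(String text) {
        String candidate = text.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Integer> e : PATTERNS.entrySet()) {
            if (matches(e.getKey(), candidate)) {
                return e.getValue();
            }
        }
        return IDENTIFIER;
    }

    // True if candidate covers the stem and is a prefix of the full keyword.
    private static boolean matches(String pattern, String candidate) {
        int star = pattern.indexOf('*');
        String stem = star < 0 ? pattern : pattern.substring(0, star);
        String full = star < 0 ? pattern : stem + pattern.substring(star + 1);
        return candidate.length() >= stem.length() && full.startsWith(candidate);
    }
}
```

In the lexer, the generic rule's action would then set the token type from
the value returned by tokenTypeFor() for the matched text. Patterns are
checked in insertion order, so overlapping stems would need to be ordered
with care.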

I chose to do the checking in the parser because my language allows keywords
to be used as variable names, so I need further context to disambiguate
between the two usages.


-- 
Sam Ellis, RVDK Team Leader,
DevSys Product Engineering Group,          Tel: +44 (0) 1223 400516
ARM Ltd., 110 Fulbourn Road,               Fax: +44 (0) 1223 400887
Cambridge, CB1 9NJ                         mailto:Sam.Ellis at arm.com


