[antlr-interest] A proposal for keywords

Loring Craymer craymer at warpiv.com
Wed May 24 13:00:59 PDT 2006



> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-bounces at antlr.org]
> On Behalf Of Terence Parr
> Sent: Wednesday, May 24, 2006 11:21 AM
> To: antlr-interest List
> Subject: Re: [antlr-interest] A proposal for keywords
> 
> 
> On May 23, 2006, at 11:09 AM, Loring Craymer wrote:
> 
> > I was concerned that this might not work with the LL(*) DFAs of
> > ANTLR 3, until I realized that the predicate hoisting mechanism
> > provides almost all of the support needed.  (Some sort of type
> > patching table may also be required; "if" might be matched by a
> > state that allowed either LITERAL_if or TEXT as the type for that
> > token; for a first implementation, the type patching may not be
> > necessary since tree walkers could also do dynamic lookup when
> > matching literals.  Patching, though, seems preferable over the
> > longer term.)
> 
> Hi...seems like we can just let preds handle it.  There is a cool

Mechanistically, yes.  However, the preds should be automatically generated,
not specified by the user.  I would like to set a flag and have parser
literals cause the parser to match and retype tokens instead of having the
lexer do the work of literal lookup.

> example in the examples-v3 dir about turning enum on / off, but this
> concept needs a different example.  I think this will work:
> 
> stat : if ... | ID '=' expr | ... ;
> 
> if : {input.LT(1).getText().equals("if")}? ID ;
> 
> but, that's expensive doing the compare all the time.  Another way is

Actually, it just "feels" expensive.  Since you don't hash every ID, it's
faster than doing the "testLiterals" thing--you only do a compare when you
need to.  You can also optimize to not do more than one string compare per
literal by keeping a record of the matched type (LITERAL_if in this case).
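
A minimal sketch of that optimization, assuming a mutable token with a type field. The class name, the token type constants, and the compare counter are hypothetical and exist only for illustration; this is not ANTLR runtime code. It covers one reading of the idea: once a token has been matched and retyped, later checks test the type and skip the string compare.

```java
class LazyKeywordMatch {
    // Hypothetical token types, for illustration only.
    static final int ID = 1;
    static final int LITERAL_if = 2;

    static final class Token {
        int type;            // mutable so the parser can retype the token
        final String text;
        Token(int type, String text) { this.type = type; this.text = text; }
    }

    // Counts string compares so the saving is visible.
    static int stringCompares = 0;

    // True if t is (or can be retyped as) the given literal.
    static boolean matchesLiteral(Token t, String literal, int literalType) {
        if (t.type == literalType) return true;   // already retyped: no compare
        if (t.type != ID) return false;           // not an identifier: no compare
        stringCompares++;
        if (t.text.equals(literal)) {
            t.type = literalType;                 // record the matched type
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Token t = new Token(ID, "if");
        for (int i = 0; i < 3; i++) {
            matchesLiteral(t, "if", LITERAL_if);
        }
        System.out.println("string compares: " + stringCompares);
    }
}
```

Note that a failed compare is not cached here, so an ID that is not "if" is compared again on each later check against that literal.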

> to simply list the keywords:
> 
> id : ID | 'if' | 'then' | 'begin' | ... ;
> 
> then use like
> 
> stat : 'if' ... | id '=' expr | ... ;
> 
> Would that do what you want?

Not quite--that handles option 1 (identify keywords in lexer).  I'm thinking
that the matching machinery for your "if" example above can be done "under
the hood" so that instead of

> if : {input.LT(1).getText().equals("if")}? ID ;

You do

if : "if" ;

with an option flag set and the _parser_ automagically matches a token with
text "if" as LITERAL_if.
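
To make the intent concrete, here is a hand-written sketch of what the parser could do under the hood. Everything in it is an assumption for illustration (the class, the token types, the toy rule stat : "if" ID | ID '=' ID ;); it is not generated ANTLR code. The lexer is assumed to emit only ID for words, and the parser retypes a token only where the grammar asks for the literal.

```java
import java.util.Arrays;
import java.util.List;

class ParserSideKeywords {
    // Hypothetical token types, for illustration only.
    static final int EOF = -1, ID = 1, ASSIGN = 2, LITERAL_if = 3;

    static final class Token {
        int type;
        final String text;
        Token(int type, String text) { this.type = type; this.text = text; }
    }

    private final List<Token> tokens;
    private int p = 0;

    ParserSideKeywords(List<Token> tokens) { this.tokens = tokens; }

    // k-th lookahead token, or an EOF token past the end of input.
    Token LT(int k) {
        int i = p + k - 1;
        return i < tokens.size() ? tokens.get(i) : new Token(EOF, "<EOF>");
    }

    boolean match(int type) {
        if (LT(1).type == type) { p++; return true; }
        return false;
    }

    // Stands in for the check implied by a rule such as  if : "if" ;
    // Matches by type, or by text when the token is still an ID, then retypes.
    boolean matchLiteral(String text, int literalType) {
        Token t = LT(1);
        if (t.type == literalType || (t.type == ID && t.text.equals(text))) {
            t.type = literalType;
            p++;
            return true;
        }
        return false;
    }

    // Toy rule:  stat : "if" ID | ID '=' ID ;
    String stat() {
        if (matchLiteral("if", LITERAL_if)) {
            if (!match(ID)) throw new IllegalStateException("expected ID after if");
            return "if";
        }
        if (match(ID) && match(ASSIGN) && match(ID)) return "assign";
        throw new IllegalStateException("no viable alternative");
    }

    public static void main(String[] args) {
        List<Token> input = Arrays.asList(
            new Token(ID, "x"), new Token(ASSIGN, "="), new Token(ID, "if"));
        System.out.println(new ParserSideKeywords(input).stat());
    }
}
```

In the main method the word "if" appears on the right-hand side of an assignment, where the grammar never asks for the literal, so the token stays an ID.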

Shmuel's desire--that a token must already be of a given type (TEXT, for
example) before it can be coerced--could also be satisfied, but it would add
to the implementation cost.  You would have to run an interpreted version of
the to-be-generated lexer over the keyword strings to identify each keyword's
default type.  I would think that overkill.
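
For what it is worth, a rough sketch of that extra step, with a toy classifier standing in for the interpreted lexer. The class, the token types, and the regular expressions are all hypothetical; a real version would have to interpret the actual lexer rules, which is where the cost comes from.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KeywordDefaultTypes {
    // Hypothetical token types, for illustration only.
    static final int TEXT = 1, NUMBER = 2, OTHER = 3;

    // Toy stand-in for running the lexer over a single string.
    static int lexType(String s) {
        if (s.matches("[A-Za-z_][A-Za-z_0-9]*")) return TEXT;
        if (s.matches("[0-9]+")) return NUMBER;
        return OTHER;
    }

    // Run the "lexer" over every keyword once and record its default type.
    static Map<String, Integer> defaultTypes(List<String> keywords) {
        Map<String, Integer> defaults = new LinkedHashMap<>();
        for (String k : keywords) {
            defaults.put(k, lexType(k));
        }
        return defaults;
    }

    // Coercion is allowed only if the token already has the keyword's default type.
    static boolean mayCoerce(Map<String, Integer> defaults, String text, int tokenType) {
        Integer d = defaults.get(text);
        return d != null && d == tokenType;
    }

    public static void main(String[] args) {
        Map<String, Integer> defaults = defaultTypes(Arrays.asList("if", "then", "begin"));
        System.out.println(mayCoerce(defaults, "if", TEXT));
        System.out.println(mayCoerce(defaults, "if", NUMBER));
    }
}
```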

> 
> Ter


