[antlr-interest] Tokenising for context specific reserved words

Thu Jul 17 17:22:55 PDT 2008

On Thu, 2008-07-17 at 16:49 -0700, Loring Craymer wrote:
> That is one solution; however, semantic predicates--
> { input.LT(1).getText().equals("foo") }? ID --are much to be preferred
> when there are lots of potential keywords and cost less in terms of
> performance since they avoid the id method call for the general case.
> (Or should cost less:  ANTLR 3 currently does not reduce the generated
> conditionals.)

Personally, I find that construct almost unreadable. It also involves
invoking LT() and getText(), which means creating the string out of the
input stream, and then a string comparison, which is another method call
in itself. I can't see how three method calls will cost less than looking
for a token value. Java doesn't seem to do a great job of optimizing
conditionals, but I should think it can do better than two method calls,
a string constructed via substring, and a string comparison. I would also
expect the DFA to be faster than that construct.
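
To make the comparison concrete, here is a rough Java sketch of the two
checks. It is not ANTLR runtime code: the Tok class, the ID and KW_FOO
constants, and both method names are hypothetical stand-ins, and Tok only
assumes that getText() builds its string with substring, as described
above. It shows the work each check does and makes no measurement of which
is faster.

```java
// Hypothetical stand-ins for illustration only; none of this is ANTLR API.
final class KeywordCheckSketch {
    static final int ID = 1;      // assumed token type for plain identifiers
    static final int KW_FOO = 2;  // assumed token type for the keyword "foo"

    /** Minimal token: a type plus offsets into the shared input text. */
    static final class Tok {
        final int type;
        final String input;
        final int start;
        final int stop; // exclusive end offset

        Tok(int type, String input, int start, int stop) {
            this.type = type;
            this.input = input;
            this.start = start;
            this.stop = stop;
        }

        // Assumption: the token text is materialised on demand via substring.
        String getText() {
            return input.substring(start, stop);
        }
    }

    // Predicate style: the lexer emits ID, the parser compares the text.
    // Work per check: getText() (a substring) plus equals().
    static boolean matchesByText(Tok t, String keyword) {
        return t.type == ID && t.getText().equals(keyword);
    }

    // Token-type style: the lexer has already classified the keyword,
    // so the parser only compares two ints.
    static boolean matchesByType(Tok t, int keywordType) {
        return t.type == keywordType;
    }

    public static void main(String[] args) {
        String input = "foo bar";
        Tok fooAsId = new Tok(ID, input, 0, 3);
        Tok fooAsKeyword = new Tok(KW_FOO, input, 0, 3);
        Tok barAsId = new Tok(ID, input, 4, 7);

        System.out.println(matchesByText(fooAsId, "foo"));
        System.out.println(matchesByText(barAsId, "foo"));
        System.out.println(matchesByType(fooAsKeyword, KW_FOO));
        System.out.println(matchesByType(barAsId, KW_FOO));
    }
}
```

Which version wins in practice depends on the target and the runtime, so
treat this only as an illustration of where the method calls come from.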

My preference is based on observed performance in C, I admit, where the
keywords rule performs much better (though I might go recheck that to
make sure ;-). Maybe the opposite is indeed true for Java.

Jim