[antlr-interest] Antlr3.4 Python bugs, workarounds

Benjamin S Wolf jokeserver at gmail.com
Fri Oct 7 14:17:39 PDT 2011


I've discovered two bugs while using the Python target to generate lexers.

The first is that when k is supplied, the generated code for special
state transitions is invalid: the "elif" keyword is split across lines
as "el\nif", so the Python interpreter raises a syntax error upon
reading "el". I've been working around this in vim by running the
following command on the generated file:

:%s/\( \+\)el\n\1/\1el/

(This replaces n spaces, "el", a newline, and the same n spaces with
n spaces followed by "el", so the "if" that comes next completes the
keyword.)
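
For anyone who would rather script the fix than run it in vim, here is a
rough Python equivalent of the substitution. It is only a sketch: the
function name join_split_elif is made up for illustration, and the
pattern is slightly stricter than the vim one (it anchors at the start
of a line and requires "if" to follow the indentation).

```python
import re

# Leading spaces, "el", a newline, the same leading spaces, then "if".
# The lookahead leaves "if" in place so it completes the keyword.
SPLIT_ELIF = re.compile(r"^( +)el\n\1(?=if\b)", re.MULTILINE)


def join_split_elif(source):
    """Rejoin "el" + newline + "if" into "elif", keeping the indentation."""
    return SPLIT_ELIF.sub(r"\1el", source)


# Small made-up sample of the broken output, not real generated code.
broken = (
    "    if s == 0:\n"
    "        pass\n"
    "    el\n"
    "    if s == 1:\n"
    "        pass\n"
)
repaired = join_split_elif(broken)
```

To apply it to a generated lexer, read the file into a string, pass it
through join_split_elif, and write the result back.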

The second is in the Python antlr3 library. Calling getTokens() on a
CommonTokenStream will return all but the last token. This is because
the slice notation [start:stop] is inclusive on the left and exclusive
on the right, but stop is set to len(self.tokens) - 1.
http://www.antlr.org/api/Python/antlr3_8py-source.html#l01733
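
The off-by-one is easy to see with a plain list standing in for the
token buffer (the token names below are invented for illustration):

```python
# Stand-in for the stream's token buffer; the real one holds token objects.
tokens = ["ID", "EQ", "INT", "EOF"]

# The value stop ends up with when no stop is given.
stop = len(tokens) - 1

# The right bound of a slice is exclusive, so the last element is dropped.
buggy = tokens[0:stop]

# Using len(tokens) as the right bound keeps every element.
fixed = tokens[0:len(tokens)]
```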

This can be fixed by finding the following lines in getTokens() (in
antlr3/streams.py):

if stop is None or stop >= len(self.tokens):
   stop = len(self.tokens) - 1

and changing them to

if stop is None or stop > len(self.tokens):
   stop = len(self.tokens)

or it can be worked around by using the tokens attribute directly.
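
A minimal sketch of that workaround follows. The helper name all_tokens
and the stand-in classes are hypothetical and exist only so the snippet
runs without the antlr3 library; the only things assumed about a real
stream are that it has a tokens list and that each token has a type
attribute. With a real CommonTokenStream the buffer may be empty until
the stream has been read, so treat this as an illustration rather than
a drop-in replacement.

```python
def all_tokens(stream, types=None):
    """Return every buffered token, optionally filtered by token type."""
    toks = list(stream.tokens)  # copy, so callers cannot mutate the buffer
    if types is None:
        return toks
    if isinstance(types, int):
        types = set([types])  # allow a single type as well as a collection
    return [t for t in toks if t.type in types]


class FakeToken(object):
    """Stand-in for a token object (hypothetical, for illustration)."""
    def __init__(self, type):
        self.type = type


class FakeStream(object):
    """Stand-in for a token stream that exposes a tokens list."""
    def __init__(self, tokens):
        self.tokens = tokens


stream = FakeStream([FakeToken(4), FakeToken(5), FakeToken(4), FakeToken(-1)])
everything = all_tokens(stream)
only_fours = all_tokens(stream, types=4)
```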

--Ben

