[antlr-interest] Antlr3.4 Python bugs, workarounds

Benjamin Niemann pink at odahoda.de
Tue Dec 27 17:08:38 PST 2011


Hi Benjamin,

On Tue, Dec 27, 2011 at 2:20 AM, Benjamin S Wolf <jokeserver at gmail.com> wrote:
> On Fri, Oct 28, 2011 at 8:56 AM, Benjamin Niemann <pink at odahoda.de> wrote:
>> On Fri, Oct 7, 2011 at 11:17 PM, Benjamin S Wolf <jokeserver at gmail.com> wrote:
>>> I've discovered two bugs in working with the Python target to generate lexers.
>>>
>>> The first is that when supplying k, the generated code for special
>>> state transitions is invalid: the "elif" keyword is split across lines
>>> as "el\nif", so the Python interpreter crashes upon reading "el".
>>
>> That's been reported before, but I have problems reproducing it. Are
>> you using antlr-3.4-complete.jar (I can only see this problem with
>> that build) or did you build it yourself from antlr-3.4.tar.gz (or
>> something else completely)?
>>
>
> Hi Benjamin,
>
> I was meddling around with the stg templates for Python while trying to
> fix some other bugs I reported in another thread, and after updating
> the files in antlr-3.4-complete.jar this problem went away.
>
> I narrowed down the diff and discovered that the stg templates in the
> original jar all had DOS line endings (that is, \r\n instead of just
> \n), and that removing all the carriage returns in
> org/antlr/codegen/templates/Python/Python.stg solved the issue of the
> elif being split across a newline.
>
> That certainly explains why it only showed in antlr-3.4-complete.jar,
> since the templates included with antlr-3.4.tar.gz did not have the
> carriage returns. :)

Good catch, thanks a lot for figuring that out.
That seems like a bug in StringTemplate to me - I thought it was
smarter about dealing with line endings.
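
For anyone who wants to apply the same workaround without unpacking
the jar by hand, here is a minimal sketch. It assumes the jar is a
plain zip archive and that only entries ending in ".stg" need fixing;
the function name and the output file name are made up for
illustration and are not part of ANTLR.

```python
import zipfile


def strip_cr_from_templates(src_jar, dst_jar, suffix=".stg"):
    """Copy src_jar to dst_jar, converting CRLF to LF in template entries.

    Only entries whose name ends with `suffix` are touched; everything
    else (class files, manifests, ...) is copied byte for byte.
    Returns the names of the entries that were changed.
    """
    changed = []
    with zipfile.ZipFile(src_jar) as src, \
            zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename.endswith(suffix) and b"\r\n" in data:
                data = data.replace(b"\r\n", b"\n")
                changed.append(info.filename)
            # Reusing the original ZipInfo keeps the entry name and timestamp.
            dst.writestr(info, data)
    return changed
```

Calling strip_cr_from_templates("antlr-3.4-complete.jar",
"antlr-3.4-fixed.jar") would write a second jar next to the original
and return the list of templates that had DOS line endings, so the
original file stays untouched for comparison.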

Ter:
Was the jar built on a Windows box? I assume Perforce adds the CRLFs
when checking out the files under Windows - the files are stored as
"text", i.e. line endings are converted to the native system.
Unless ST can be taught to deal with that properly, we could store the
templates as binary in the repository - but that could be messy when
someone actually wants to edit them under Windows, and it's hard to
notice when CRs creep back in.
Or avoid building jars on Windows ;)
This probably affects other targets as well, but those are likely
less picky about some extra whitespace here and there. It could still
lead to some obscure bugs, though.
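
One way to notice CRs creeping back in would be a small check run
before the jar is built. The following is only a sketch: the function
name is hypothetical, and it assumes the templates live as "*.stg"
files somewhere under a single directory tree.

```python
from pathlib import Path


def find_templates_with_cr(root, pattern="*.stg"):
    """Return the sorted paths under `root` matching `pattern` that contain a CR byte."""
    return sorted(
        path for path in Path(root).rglob(pattern)
        if b"\r" in path.read_bytes()
    )
```

A build script could call this on the template directory and fail
when the returned list is not empty.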

-Ben

