[antlr-interest] Changing the Lexer based on parsing the first part of a file for Python 2.6

Jim Idle jimi at temporal-wave.com
Tue Feb 14 12:56:33 PST 2012


You should only change the lexing behavior within the lexer itself. Give
the lexer a member variable that starts out false, then have the lexer look
for that statement and set the flag when it sees it. You will not be able
to get the parser to tell the lexer about this, as generally it will be too
late: the lexer has usually tokenized past that point before the parser
acts on the statement.
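
To make the idea concrete, here is a minimal hand-rolled sketch in plain
Python, not ANTLR-generated code. The class name SketchLexer, the token
kinds STR and UNICODE, and the simplified token pattern are all assumptions
made up for illustration. The point is only that the lexer spots the
directive and flips its own flag before it reaches any string literal.

```python
import re

# Hypothetical, heavily simplified token pattern (assumption for this sketch).
TOKEN_RE = re.compile(r'''
    (?P<STRING>[uU]?"[^"\n]*")    # double-quoted literal, optional u prefix
  | (?P<NAME>[A-Za-z_]\w*)
  | (?P<OP>[=.,()])
  | (?P<NEWLINE>\n)
  | (?P<WS>[ \t]+)
''', re.VERBOSE)


class SketchLexer(object):
    def __init__(self, text):
        self.text = text
        self.unicode_literals = False   # the member variable, initially false
        self._names = []                # NAME tokens seen on the current line

    def tokens(self):
        pos = 0
        while pos < len(self.text):
            m = TOKEN_RE.match(self.text, pos)
            if m is None:
                raise SyntaxError("unexpected character at offset %d" % pos)
            pos = m.end()
            kind, value = m.lastgroup, m.group()
            if kind == "WS":
                continue
            if kind == "NAME":
                self._names.append(value)
                # The lexer itself recognizes the directive and sets its state.
                if (self._names[:3] == ["from", "__future__", "import"]
                        and value == "unicode_literals"):
                    self.unicode_literals = True
            elif kind == "NEWLINE":
                self._names = []
            elif kind == "STRING":
                # An explicit u prefix, or the flag, makes the literal unicode.
                if value[0] in "uU" or self.unicode_literals:
                    kind = "UNICODE"
                else:
                    kind = "STR"
            yield kind, value
```

In an ANTLR3 grammar the equivalent flag would presumably live in the
lexer's members section and be tested in the string rule, but the exact
wiring depends on your grammar.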

Another way might be that the lexer can do the most expansive thing (for
instance add two possibilities to a token), and the parser can choose the
one that makes sense after it hits that statement.
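
A rough sketch of this second approach, again in plain Python rather than
ANTLR, with made-up names (Token, lex_string, resolve) that are assumptions
for illustration only. The lexer records every reading a literal could have
and the parser picks one once it knows whether the __future__ statement was
present.

```python
from collections import namedtuple

# Hypothetical token shape: one STRING token carrying all candidate readings.
Token = namedtuple("Token", "kind text candidates")


def lex_string(literal):
    """Lexer side: do not decide yet, just record the possible readings."""
    if literal[:1] in ("u", "U"):
        candidates = ("UNICODE",)          # explicit prefix, only one reading
    else:
        candidates = ("STR", "UNICODE")    # depends on the directive
    return Token("STRING", literal, candidates)


def resolve(token, unicode_literals):
    """Parser side: choose the reading after the __future__ imports are seen."""
    if len(token.candidates) == 1:
        return token.candidates[0]
    return "UNICODE" if unicode_literals else "STR"
```

For example, resolve(lex_string('"bar"'), True) picks the unicode reading,
while the same token resolved with False stays a plain string.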

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of fwierzbicki at gmail.com
> Sent: Tuesday, February 14, 2012 11:03 AM
> To: antlr-interest at antlr.org
> Subject: [antlr-interest] Changing the Lexer based on parsing the first
> part of a file for Python 2.6
>
> Hi all,
>
> Python 2.6 has syntax to change lexing behavior. Specifically:
>
> from __future__ import unicode_literals
>
> If this statement is present the lexing of strings changes. Without
> this directive,
>
> foo = "bar"
>
> assigns foo a String value. With the __future__ statement, foo gets a
> unicode value. Also the __future__ statement causes
>
> foo = u"bar"
>
> to be an illegal statement. Essentially this allows you to write a 2.x
> program that will look more like a Python 3 program.
>
> So my question - what is a reasonable way to get my ANTLR3 grammar to
> signal the lexer to change? Though it seems ugly, my first thought is
> to pass a reference to the lexer to the parser and just set a boolean
> on the lexer so it has the correct behavior from then on. The reason
> that this *may* work is that Python only allows "from __future__"
> statements at the very top of the file and so no string/unicode/etc
> tokens are possible until after all "from __future__" statements have
> occurred. Will I get into trouble with cached lexing that has already
> happened? Or is there a better way to do this sort of thing?
>
> Kind regards,
>
> -Frank Wierzbicki
>

