[antlr-interest] parsing python

Terence Parr parrt at cs.usfca.edu
Sat Feb 28 11:28:44 PST 2004


On Feb 27, 2004, at 11:16 PM, Kaleb Pederson wrote:

> On Friday 27 February 2004 5:03 pm, Terence Parr wrote:
>> Well, I've embarked on my parser for python.  First task was to
>> autoconvert the distribution grammar to ANTLR format.  Then I jumped
>> on the lexical issues.  Eewwwwwwwwwww!  Should never watch sausage
>> being made or python being lexically analyzed.  I think humans can
>> get pretty used to this weird indentation thing as it's nice
>> visually.  Getting the lexer to handle the weird structure and the
>> fun exceptions to the rules is no picnic.
>
> Lol.  Weird is all a matter of definition.
>
>> Anyway, I have something that looks like a parser with a
>> PythonTokenStream that does the right INDENT/DEDENT imaginary token
>> generation, but I have to add in a few random "allow trailing commas
>> for no reason" (COMMA)? subrules.  Then I will attempt a symbol table
>> manager so I can learn the semantics of symbol table lookup.
>
> Hmm.  IIRC, the python grammar is very well defined, so shouldn't you
> just be able to take what's there and make it LL(k)? (hopefully
> without too much work?)

It's LL(1) 'cept for the optional commas and semis, which require k=2.

> As far as the commas and Python, I presume you are talking about tuple
> packing/unpacking? Maybe on the print statement (indicating no
> newline)?

Stuff like that and

a=1; b=2;

and

foo(1,3,)

that's just wrong in my book, but I'm a language implementor not 
designer ;)
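
To make the k=2 point concrete, here is a minimal hand-written sketch in Python of a rule shaped like '(' [arg (',' arg)* [',']] ')'.  It is not ANTLR output and not the real Python grammar; the function name parse_args and the one-string-per-token input are assumptions made for illustration.  The point is only that, with the rule written this way, seeing ',' is not enough to decide whether to loop again or to exit, so the loop test peeks at LA(2).  The optional trailing semicolon in a statement list has the same shape.

```python
def parse_args(tokens):
    """Parse '(' [arg (',' arg)* [',']] ')' from a list of token strings.

    Hypothetical helper for illustration; returns the argument tokens.
    """
    pos = 0

    def la(k):
        # k-th token of lookahead (1-based), or None past the end of input.
        i = pos + k - 1
        return tokens[i] if i < len(tokens) else None

    def match(expected):
        nonlocal pos
        if la(1) != expected:
            raise SyntaxError(f"expected {expected!r}, got {la(1)!r}")
        pos += 1

    def arg():
        nonlocal pos
        tok = la(1)
        if tok is None or tok in ("(", ")", ","):
            raise SyntaxError(f"expected argument, got {tok!r}")
        pos += 1
        return tok

    args = []
    match("(")
    if la(1) != ")":
        args.append(arg())
        # A ',' may introduce another argument or be the optional trailing
        # comma.  One token cannot tell them apart; LA(2) can.
        while la(1) == "," and la(2) != ")":
            match(",")
            args.append(arg())
        if la(1) == ",":
            match(",")  # the trailing comma, as in foo(1,3,)
    match(")")
    return args


print(parse_args(list("(1,3,)")))
```

Left-factoring the rule would bring the decision back to one token of lookahead, at the cost of drifting away from the grammar as distributed.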

>> I'll post something when it's ready for public consumption. :)
>
> One of the reasons that I chose antlr is that I have heard that Python
> support is underway?  How is it coming?  Is there anything that I can
> do to help?

Well, a few people are playing around.  We have a python generator that 
works with jython, but nobody has translated the libraries to python 
yet.  I'm madly learning python as fast as I can ;)

> I'll gladly test it out.  I tried out several different Python parsers

Should be something in the file sharing area you can try out.

> (LR/LALR) for some work projects, but the debugging information they
> provided made them nearly useless so I ended up writing them by hand.
> I recently ran across a few more, but was hoping I could wait for
> antlr to pick up Python support.
>
>> Nice language, python....just a little kinky lexically. ;)
>
> Oh yeah.  I still have to get used to the lexer parsing structure and
> how it stands apart from all the other ones that I have seen up to
> this point.  I suppose then that it wasn't the grammar that was
> giving you problems....

Nope...grammar is autotranslated from distribution.  The lexer is 
totally nasty with context-sensitive whitespace etc...  For example, 
NEWLINE must be sent to the parser as it is the statement separator, 
but inside (...), [...], {...} whitespace, indentation, and newlines 
are totally ignored and not sent to the parser.
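
A rough sketch of that behavior in Python, for anyone who wants to see the shape of it.  This is not the PythonTokenStream mentioned above; the function name python_tokens and the WORD/NEWLINE/INDENT/DEDENT vocabulary are assumptions made for illustration.  It also assumes space-only indentation and ignores comments, string literals, backslash continuations, and inconsistent dedents, all of which a real lexer has to handle.

```python
def python_tokens(source):
    """Yield (type, text) pairs for a tiny, hypothetical token vocabulary."""
    indents = [0]  # stack of indentation widths currently open
    depth = 0      # nesting depth of (), [] and {}
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines produce no tokens at all
        if depth == 0:
            # Indentation only matters at the start of a logical line.
            width = len(line) - len(line.lstrip(" "))
            if width > indents[-1]:
                indents.append(width)
                yield ("INDENT", "")
            while width < indents[-1]:
                indents.pop()
                yield ("DEDENT", "")
        for word in stripped.split():
            yield ("WORD", word)
            depth += sum(word.count(c) for c in "([{")
            depth -= sum(word.count(c) for c in ")]}")
        if depth == 0:
            # Inside brackets the line break is swallowed, not sent on.
            yield ("NEWLINE", "")
    while len(indents) > 1:
        indents.pop()
        yield ("DEDENT", "")


src = "if x:\n    y = foo(1,\n  2)\nz\n"
for tok in python_tokens(src):
    print(tok)
```

In the sample input the call to foo spans two physical lines and the second one is indented less than its block, yet no DEDENT or NEWLINE is produced until the closing parenthesis, because the bracket depth is nonzero the whole time.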

> Thanks for the feedback and the tool.  I really like how I can look at
> LA(k) and FIRST(set).  They make it really easy to handle things that
> were previously much more difficult.

:)

Ter
--
Professor Comp. Sci., University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com
Cofounder, http://www.knowspam.net enjoy email again!
Cofounder, http://www.peerscope.com pure link sharing





 