[antlr-interest] parsing python

Kaleb Pederson kibab at icehouse.net
Fri Feb 27 23:16:26 PST 2004


On Friday 27 February 2004 5:03 pm, Terence Parr wrote:
> Well, I've embarked on my parser for python.  First task was to
> autoconvert the distribution grammar to ANTLR format.  Then I jumped on
> the lexical issues.  Eewwwwwwwwwww!  Should never watch sausage being
> made or python being lexically analyzed.  I think Humans can get pretty
> used to this weird indentation thing as it's nice visually.  Getting
> the lexer to handle the weird structure and the fun exceptions to the
> rules, is no picnic.

Lol.  Weird is all a matter of definition.

> Anyway, I have something that looks like a parser with a
> PythonTokenStream that does the right INDENT/DEDENT imaginary token
> generation, but I have to add in a few random "allow trailing commas
> for no reason" (COMMA)? subrules.  Then I will attempt a symbol table
> manager so I can learn the semantics of symbol table lookup.

Hmm.  IIRC, the Python grammar is very well defined, so shouldn't you be
able to take what's there and make it LL(k), hopefully without too much
work?

I'm actually working on a project for school that will handle the
LR(1)->LL(1) conversion (where possible).  It is built with ANTLR and
currently accepts a small subset of the ANTLR grammar syntax.  I'm hoping
to turn it into a more general-purpose grammar analysis tool.
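
One step such a conversion typically needs is removing left recursion,
which LR grammars use freely and LL parsers cannot handle.  The sketch
below is only an illustration of that textbook step, not the school
project's code: the dict-of-tuples grammar representation, the function
name, and the "_tail" suffix are all assumptions made for the example,
and it handles immediate left recursion only.

```python
# Hypothetical representation: a grammar is a dict mapping a nonterminal
# to a list of alternatives; each alternative is a tuple of symbols and
# the empty tuple () stands for the empty production.

def remove_immediate_left_recursion(grammar, nt):
    """Rewrite  A -> A a | b  as  A -> b A_tail,  A_tail -> a A_tail | empty."""
    # Alternatives that start with nt itself, with that leading nt dropped.
    recursive = [alt[1:] for alt in grammar[nt] if alt and alt[0] == nt]
    # Alternatives that do not start with nt.
    others = [alt for alt in grammar[nt] if not alt or alt[0] != nt]
    result = dict(grammar)
    if not recursive:
        return result  # nothing to rewrite
    tail = nt + "_tail"  # assumed not to clash with an existing rule name
    result[nt] = [alt + (tail,) for alt in others]
    result[tail] = [alt + (tail,) for alt in recursive] + [()]
    return result


# expr -> expr '+' term | term
grammar = {"expr": [("expr", "+", "term"), ("term",)]}
converted = remove_immediate_left_recursion(grammar, "expr")

for name, alternatives in converted.items():
    for alt in alternatives:
        print(name, "->", " ".join(alt) or "(empty)")
```

Indirect left recursion and left factoring would need further passes,
which is part of why the conversion only works "where possible".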

As for the trailing commas in Python, I presume you are talking about
tuple packing/unpacking?  Or maybe the print statement, where a trailing
comma indicates no newline?
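
For reference, here are a few places where Python accepts a trailing
comma, checked with the standard library's ast module.  This is a small
sketch rather than a complete list, and the helper name parses is made
up for the example.  The print-statement case is Python 2 syntax, so it
is mentioned only in a comment and not checked here.

```python
import ast

# Each snippet ends a comma-separated list with a trailing comma.
snippets = [
    "t = 1, 2, 3,",     # tuple packing
    "t = 1,",           # one-element tuple: here the comma is required
    "a, b, = 1, 2",     # tuple unpacking target
    "f(1, 2,)",         # call arguments
    "x = [1, 2,]",      # list display
    "d = {'k': 1,}",    # dict display
]
# Python 2 also allowed `print "text",` to suppress the newline; that
# statement form no longer exists in Python 3.


def parses(source):
    """Return True if source is syntactically valid for this interpreter."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


results = {source: parses(source) for source in snippets}
for source, ok in results.items():
    print(f"{source!r:20} {'ok' if ok else 'syntax error'}")

# The trailing comma alone is what makes a one-element tuple.
single = ast.literal_eval("1,")
```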

> I'll post something when it's ready for public consumption. :)

One of the reasons I chose ANTLR is that I have heard Python support is
underway.  How is it coming?  Is there anything I can do to help?  I'll
gladly test it out.  I tried out several different Python parser
generators (LR/LALR) for some work projects, but the debugging information
they provided made them nearly useless, so I ended up writing the parsers
by hand.  I recently ran across a few more, but was hoping I could wait
for ANTLR to pick up Python support.

> Nice language, python....just a little kinky lexically. ;)

Oh yeah.  I still have to get used to the lexing structure and how it
differs from all the others I have seen so far.  I suppose, then, that it
wasn't the grammar that was giving you problems....
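
To make the INDENT/DEDENT idea concrete, here is a minimal sketch of how
a token stream can turn leading whitespace into imaginary tokens with a
stack of indentation columns.  It is not the PythonTokenStream mentioned
above; the function name and the LINE/NEWLINE token kinds are invented
for the example, and it deliberately ignores the awkward cases (tabs,
backslash continuations, open brackets, multi-line strings).

```python
def indent_tokens(source):
    """Yield (kind, text) pairs, inserting INDENT/DEDENT from leading spaces."""
    stack = [0]  # indentation columns of the blocks currently open
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank and comment-only lines do not affect indentation
        width = len(line) - len(line.lstrip(" "))
        if width > stack[-1]:
            stack.append(width)          # deeper than before: open a block
            yield ("INDENT", "")
        while width < stack[-1]:
            stack.pop()                  # shallower: close one block per level
            yield ("DEDENT", "")
        if width != stack[-1]:
            raise IndentationError("dedent does not match any outer level")
        yield ("LINE", stripped)         # stand-in for the line's real tokens
        yield ("NEWLINE", "")
    while len(stack) > 1:                # close blocks still open at end of input
        stack.pop()
        yield ("DEDENT", "")


sample = "if x:\n    y = 1\n    if z:\n        w = 2\nv = 3\n"
kinds = [kind for kind, _ in indent_tokens(sample)]
print(kinds)
```

A parser reading this stream can then treat INDENT and DEDENT like
ordinary block delimiters, so the grammar itself stays context-free.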

Thanks for the feedback and the tool.  I really like how I can look at LA(k) 
and FIRST(set).  They make it really easy to handle things that were 
previously much more difficult.
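
For anyone who has not met FIRST sets before, the sketch below computes
them with the usual textbook fixed-point iteration.  It is not how ANTLR
computes its lookahead internally; the grammar representation, the
EPSILON marker, and the function name are assumptions made for the
example.

```python
EPSILON = ""  # marker meaning "this nonterminal can derive the empty string"


def first_sets(grammar):
    """Compute FIRST for every nonterminal by iterating to a fixed point."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for nt, alternatives in grammar.items():
            for alt in alternatives:
                before = len(first[nt])
                nullable = True  # stays True if every symbol can vanish
                for symbol in alt:
                    if symbol in grammar:  # nonterminal
                        first[nt] |= first[symbol] - {EPSILON}
                        if EPSILON not in first[symbol]:
                            nullable = False
                            break
                    else:                  # terminal
                        first[nt].add(symbol)
                        nullable = False
                        break
                if nullable:
                    first[nt].add(EPSILON)
                if len(first[nt]) != before:
                    changed = True
    return first


# Alternatives are tuples of symbols; () is the empty production.
grammar = {
    "expr": [("term", "expr_tail")],
    "expr_tail": [("+", "term", "expr_tail"), ()],
    "term": [("NUMBER",), ("(", "expr", ")")],
}
first = first_sets(grammar)
for nt in grammar:
    print(nt, sorted(first[nt]))
```

With one token of lookahead, a parser can pick an alternative whenever
the FIRST sets of the competing alternatives do not overlap.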

--Kaleb


 