[antlr-interest] Python 2.3.3 grammar posted

Kaleb Pederson kibab at icehouse.net
Sat Feb 28 21:40:45 PST 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Saturday 28 February 2004 5:43 pm, Terence Parr wrote:
> Howdy,
>
> Loring and I have finished a first attempt at a python grammar. It
> seems to grok all of jython 2.1's Lib directory (40k lines of Python).
> The whitespace/comments/newlines/indentation thing is a nightmare to
> implement, but I think it's right. Very close to Python 2.3.3
> distribution grammar (minus the wackiness in the lexer).

Nice!  Thanks.

> http://www.antlr.org/grammar/1078018002577/python.tar.gz
>
> I'm happy to receive and incorporate fixes.

Ok. I ran it on about 50-100k of Python that a partner and I wrote on a 
commercial project.  The following is the only thing that cropped up:

- ------
$ cat test.py
#!/usr/bin/python
# -*- coding: utf-8 -*-
# see http://www.python.org/peps/pep-0263.html
test = "Gabrielle Carré"
print test
- ------
$ python test.py
Gabrielle Carré

$ java Python test.py
Exception in thread "main" line 5:23: expecting '"', found '?'
        at PythonLexer.nextToken(PythonLexer.java:352)
        at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
        at PythonTokenStream.nextToken(PythonTokenStream.java:123)
        at antlr.TokenBuffer.fill(TokenBuffer.java:69)
        at antlr.TokenBuffer.LA(TokenBuffer.java:80)
        at antlr.LLkParser.LA(LLkParser.java:52)
        at PythonParser.test(PythonParser.java:763)
        at PythonParser.testlist(PythonParser.java:408)
        at PythonParser.expr_stmt(PythonParser.java:970)
        at PythonParser.small_stmt(PythonParser.java:877)
        at PythonParser.simple_stmt(PythonParser.java:146)
        at PythonParser.stmt(PythonParser.java:335)
        at PythonParser.file_input(PythonParser.java:282)
        at Python.main(Python.java:44)

I also ran the following script in bash to look for some other things (after 
adding a System.out.println to print out the filename:

$ find /usr/lib/python2.3 -name "*.py" -print | xargs -n 1 -i java Python {} 
>> parsed_python.log 2>&1 >> parsed_python.log

It looks like it found lack of support for the same thing as above, complex 
numbers, and a couple other things.  I'll include the rest of it below (I'm 
not sure yahoogroups will let me attach) and I'll try to attach it as well.

Looks *really* good for a first try!

- --Kaleb

Reading /usr/lib/python2.3/test/test_binop.py
line 253:31: expecting COLON, found 'j'
line 253:65: unexpected token: :
line 254:1: expecting DEDENT, found ''
line 267:1: expecting EOF, found ''
Reading /usr/lib/python2.3/test/test_unary.py
Reading /usr/lib/python2.3/test/pickletester.py
line 344:27: unexpected token: j
Reading /usr/lib/python2.3/test/test_al.py
Exception in thread "main" line 75:20: expecting '"', found '|'
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/test/test_pep263.py
Exception in thread "main" line 2:9: unexpected char: '"'
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
[snip]
Reading /usr/lib/python2.3/test/test_compare.py
line 31:30: unexpected token: j
line 31:78: unexpected token: ]
Reading /usr/lib/python2.3/test/test_compile.py
line 108:40: unexpected token: j
Reading /usr/lib/python2.3/test/test_complex.py
line 79:50: unexpected token: j
[snip]
Reading /usr/lib/python2.3/test/test_csv.py
Exception in thread "main" line 691:58: expecting '"', found '?'
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/test/test_format.py
line 51:28: expecting RPAREN, found 'e'
line 52:28: expecting RPAREN, found 'e'
line 53:28: expecting RPAREN, found 'e'
Reading /usr/lib/python2.3/test/re_tests.py
line 85:7: unexpected token: e14
Reading /usr/lib/python2.3/test/test_sax.py
Exception in thread "main" line 72:25: expecting '"', found '?'
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/test/test_descr.py
line 198:27: expecting COLON, found 'j'
line 198:44: unexpected token: :
line 199:1: expecting DEDENT, found ''
line 214:1: expecting EOF, found ''
Reading /usr/lib/python2.3/test/test_timeout.py
line 57:61: unexpected token: j
Reading /usr/lib/python2.3/test/test_coercion.py
line 69:31: unexpected token: j
line 70:48: unexpected token: ]
Reading /usr/lib/python2.3/test/test_pprint.py
line 89:34: expecting RPAREN, found 'j'
line 94:23: expecting COLON, found ')'
line 94:25: expecting DEDENT, found '
'
line 103:1: expecting EOF, found ''
Reading /usr/lib/python2.3/shlex.py
Exception in thread "main" line 40:32: unexpected char: '''
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/tarfile.py
Exception in thread "main" line 6:31: unexpected char: 0x?D
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/encodings/punycode.py
Exception in thread "main" line 4:21: expecting '"', found ' '
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
	at antlr.TokenBuffer.fill(TokenBuffer.java:69)
	at antlr.TokenBuffer.LA(TokenBuffer.java:80)
	at antlr.LLkParser.LA(LLkParser.java:52)
	at PythonParser.file_input(PythonParser.java:241)
	at Python.main(Python.java:44)
Reading /usr/lib/python2.3/encodings/string_escape.py
Exception in thread "main" line 5:21: expecting '"', found ' '
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
	at antlr.TokenBuffer.fill(TokenBuffer.java:69)
	at antlr.TokenBuffer.LA(TokenBuffer.java:80)
	at antlr.LLkParser.LA(LLkParser.java:52)
	at PythonParser.file_input(PythonParser.java:241)
	at Python.main(Python.java:44)
Reading /usr/lib/python2.3/getopt.py
Exception in thread "main" line 23:9: unexpected char: 0x?D
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:141)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
[snip]
Reading /usr/lib/python2.3/site-packages/drv_libxml2.py
Exception in thread "main" line 36:19: expecting '"', found '?'
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
[snip]
Reading /usr/lib/python2.3/pydoc.py
Exception in thread "main" line 37:10: expecting '"', found ' '
	at PythonLexer.nextToken(PythonLexer.java:352)
[snip]
Reading /usr/lib/python2.3/plat-linux2/IN.py
line 432:10: unexpected token: =
line 439:11: unexpected token: =
line 563:1: unexpected token: def
line 569:1: unexpected token: def
line 577:1: unexpected token: def
line 597:1: unexpected token: def
line 617:1: unexpected token: null
Reading /usr/lib/python2.3/plat-linux2/TYPES.py
line 116:23: unexpected token: =
line 138:1: unexpected token: def
line 146:1: unexpected token: def
Reading /usr/lib/python2.3/heapq.py
Exception in thread "main" line 37:19: expecting '"', found 'a'
	at PythonLexer.nextToken(PythonLexer.java:352)
	at 
PythonTokenStream.insertImaginaryIndentDedentTokens(PythonTokenStream.java:131)
	at PythonTokenStream.nextToken(PythonTokenStream.java:123)
	at antlr.TokenBuffer.fill(TokenBuffer.java:69)
	at antlr.TokenBuffer.LA(TokenBuffer.java:80)
	at antlr.LLkParser.LA(LLkParser.java:52)
	at PythonParser.test(PythonParser.java:763)
	at PythonParser.testlist(PythonParser.java:408)
	at PythonParser.expr_stmt(PythonParser.java:970)
	at PythonParser.small_stmt(PythonParser.java:877)
	at PythonParser.simple_stmt(PythonParser.java:146)
	at PythonParser.stmt(PythonParser.java:335)
	at PythonParser.file_input(PythonParser.java:282)
	at Python.main(Python.java:44)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFAQXtkeAVt8Tl/2kURAhTTAJ0VFxvUdw8sE2Z+zgwXXbF2NMk5VQCfZal9
MOYkXBxJuj0vMUnkAEmIguI=
=6teE
-----END PGP SIGNATURE-----


 
Yahoo! Groups Links

<*> To visit your group on the web, go to:
     http://groups.yahoo.com/group/antlr-interest/

<*> To unsubscribe from this group, send an email to:
     antlr-interest-unsubscribe at yahoogroups.com

<*> Your use of Yahoo! Groups is subject to:
     http://docs.yahoo.com/info/terms/
 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: parsed_python.log
Type: text/x-log
Size: 16461 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20040228/1e935185/parsed_python.bin


More information about the antlr-interest mailing list