[antlr-interest] Using the python output with ANTLR3

Thu Feb 19 11:15:21 PST 2009

Hi,

On Thu, Feb 19, 2009 at 4:10 PM, Dahlhoff Achim (CM-AI/PJ-VW37)
<Achim.Dahlhoff at de.bosch.com> wrote:
> Hi all.
>
> I just signed up on this list after contacting Mr. Niemann about the antlr Python binding. He suggested checking here for a little help.
>
> To transform a large bunch of Java interface files to classes and stubs for a (broad) Java-to-C interface, I would like to generate all of the wrapper code and tables needed for JNI using a code generator written in Python (I have more experience in Python and think it is better suited for creating text output).
>
> When I launch ANTLR 3.1.1 with the Python output option on the Java 1.5 grammar file (from the antlr website), I am experiencing some problems.
>
> First of all, antlr outputs an internal error and some warnings. Then, the generated python files contain a few lines of code, which are more java than python and won't compile without some manual fixing. And when I'm done with that, the generated parser won't work... :-(
>
> Am I missing something here?
>
>
>
> The output of antlr when taking the Java 1.5 grammar and just adding the 'language=Python' option:
>
>
> C:\MIB\ANTLR>java -classpath ./antlrworks-1.2.1.jar org.antlr.Tool grammar\Java.g
>
> ANTLR Parser Generator  Version 3.1.1
> error(10):  internal error: eval tree parse error : <AST>:0:0: unexpected AST node:
>
> org.antlr.stringtemplate.language.ActionEvaluator.expr(Unknown Source)
> org.antlr.stringtemplate.language.ActionEvaluator.action(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.evaluateExpression(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.handleExprOptions(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.write(Unknown Source)
> org.antlr.stringtemplate.StringTemplate.write(Unknown Source)
> org.antlr.stringtemplate.language.ConditionalExpr.writeSubTemplate(Unknown Source)
> org.antlr.stringtemplate.language.ConditionalExpr.write(Unknown Source)
> org.antlr.stringtemplate.StringTemplate.write(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.write(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.writeAttribute(Unknown Source)
> org.antlr.stringtemplate.language.ActionEvaluator.action(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.write(Unknown Source)
> org.antlr.stringtemplate.StringTemplate.write(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.write(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.writeAttribute(Unknown Source)
> org.antlr.stringtemplate.language.ActionEvaluator.action(Unknown Source)
> org.antlr.stringtemplate.language.ASTExpr.write(Unknown Source)
> org.antlr.stringtemplate.StringTemplate.write(Unknown Source)
> org.antlr.codegen.CodeGenerator.write(CodeGenerator.java:1275)
> org.antlr.codegen.Target.genRecognizerFile(Target.java:94)
> org.antlr.codegen.CodeGenerator.genRecognizer(CodeGenerator.java:460)
> org.antlr.Tool.generateRecognizer(Tool.java:420)
> org.antlr.Tool.process(Tool.java:287)
> org.antlr.Tool.main(Tool.java:72)

That looks like a known problem. I don't know what the exact issue was
(something with the way antlrworks was packaged, I think), but using
the antlr.jar instead of antlrworks.jar (plus the stringtemplate and
antlr2 jars, which are bundled with antlrworks) should fix it.
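
Since you are scripting in Python anyway, here is a rough sketch of how
the call could be assembled. The jar file names below are placeholders I
made up for illustration; substitute whatever the files in your download
are actually called. Only org.antlr.Tool and the grammar path come from
your mail.

```python
import os

# Placeholder jar names (assumptions): replace them with the real file
# names of the ANTLR 3, StringTemplate and ANTLR 2 jars you have.
JARS = ["antlr.jar", "stringtemplate.jar", "antlr2.jar"]


def antlr_command(grammar, jars=JARS, sep=os.pathsep):
    """Return the argument list for running org.antlr.Tool on a grammar."""
    classpath = sep.join(jars)
    return ["java", "-classpath", classpath, "org.antlr.Tool", grammar]


cmd = antlr_command(os.path.join("grammar", "Java.g"))
# To actually run the tool: import subprocess; subprocess.call(cmd)
```

The point is only that antlrworks.jar no longer appears on the classpath.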

> Then, the python interpreter stumbles over this:
>
>   if not ((t1.getLine() == t2.getLine() && .....
>
> (python requires 'and', not '&&'.)
>
> or this:
>
>   protected boolean enumIsKeyword = true;
>
> that looks like java. Python would need something like    enumIsKeyword=True
> Finally this:
>
>   if (!enumIsKeyword) _type=Identifier;
>
> Here, the ! operator instead of 'not' is used, and the colon is missing.
> It should be:     if not enumIsKeyword: _type=Identifier

As Sam already wrote, the action code in the grammar has to be written
in Python (or whatever the target language is); ANTLR copies it into the
output verbatim. There is a Python version of Java.g in the example
package.
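
To illustrate, this is roughly what the actions you quoted would look
like once written in Python. It is a standalone sketch, not the code from
the example package: the class, the token type numbers and the
other_condition parameter are stand-ins I invented so that the snippet
runs on its own.

```python
# Stand-in token type numbers; the generated lexer defines its own.
ENUM = 4
Identifier = 5


class EnumAwareLexer(object):
    # Java action: protected boolean enumIsKeyword = true;
    enumIsKeyword = True

    def enum_token_type(self):
        _type = ENUM
        # Java action: if (!enumIsKeyword) _type=Identifier;
        if not self.enumIsKeyword:
            _type = Identifier
        return _type


def differ(t1, t2, other_condition):
    # Java uses '&&'; Python spells it 'and'. other_condition stands in
    # for the rest of the original expression, which is not shown here.
    return not (t1.getLine() == t2.getLine() and other_condition)
```

In a grammar for the Python target these statements would go directly
into the @members and rule actions.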

> When I fix these little bugs manually in the antlr output and try to feed it with a sample java file, the parser fails:
>
>  Traceback (most recent call last):
>   File "testprog.py", line 59, in ?
>     main(sys.argv[1:])
>   File "testprog.py", line 35, in main
>     parser.compilationUnit()
>   File "grammar\JavaParser.py", line 361, in compilationUnit
>     alt8 = self.dfa8.predict(self.input)
>   File "C:\Dahlhoff\MIB\ANTLRtest\antlr3\dfa.py", line 92, in predict
>     c = input.LA(1)
>   File "C:\Dahlhoff\MIB\ANTLRtest\antlr3\streams.py", line 859, in LA
>     return self.LT(i).type
>   File "C:\Dahlhoff\MIB\ANTLRtest\antlr3\streams.py", line 801, in LT
>     self.fillBuffer()
>   File "C:\Dahlhoff\MIB\ANTLRtest\antlr3\streams.py", line 669, in fillBuffer
>     t = self.tokenSource.nextToken()
>   File "C:\Dahlhoff\MIB\ANTLRtest\antlr3\recognizers.py", line 1133, in nextToken
>     self._state.tokenStartCharIndex = self.input.index()
>  TypeError: find/rfind/index/rindex() takes at least 1 argument (0 given)

That looks like the input of the lexer is a string. The input for the
lexer has to be one of the CharStream classes.
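
The TypeError comes from str.index(), which wants an argument, whereas
the lexer expects a stream object whose index() returns the current
position. A small self-contained sketch of the difference follows;
MiniCharStream is a toy class made up for this mail, not part of the
runtime.

```python
class MiniCharStream(object):
    """Toy stand-in for a CharStream: text plus a current position."""

    def __init__(self, data):
        self.data = data
        self.p = 0

    def index(self):
        # A stream's index() takes no argument and returns the position.
        return self.p

    def consume(self):
        self.p += 1


text = "class A {}"

# Passing the bare string fails the same way as in the traceback:
try:
    text.index()
    error = None
except TypeError as exc:
    error = exc

# A stream object answers index() without arguments:
stream = MiniCharStream(text)
start = stream.index()

# With the real runtime the fix would look something like this
# (JavaLexer is assumed to be the name of your generated lexer class):
#   lexer = JavaLexer(antlr3.ANTLRStringStream(text))
```

For input from a file, antlr3.ANTLRFileStream takes the file name
instead of the text.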

-Ben