[antlr-interest] File extensions for ANTLR 3

Kay Roepke kroepke at classdump.org
Thu Nov 30 15:02:43 PST 2006


On 30. Nov 2006, at 23:18 , Terence Parr wrote:

> In general I support this proposal.  The most common case is  
> plain .g for a combined parser and lexer so nothing will change.  I  
> like the idea that, from the extension, a build tool or IDE knows  
> the Java class that will come out.

For the Xcode thingy I actually wrote a filter lexer that inspects the  
file content to figure out what the output files will be. I even  
check for lexer rules explicitly, just to be sure. It's a pain.
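
To give an idea of what that content sniffing involves, here is a much
simplified sketch. It is not the filter lexer from the Xcode plugin: it
swaps in a plain regular expression that only looks at the grammar header
("grammar T;", "lexer grammar T;", "parser grammar T;", "tree grammar T;")
and does not check for lexer rules at all. The class and method names are
made up for illustration, and a header-like line inside a block comment
would fool it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GrammarKind {
    // Matches an ANTLR 3 grammar header: an optional "lexer", "parser" or
    // "tree" keyword, then "grammar", the grammar name, and a semicolon.
    private static final Pattern HEADER = Pattern.compile(
            "^\\s*(?:(lexer|parser|tree)\\s+)?grammar\\s+(\\w+)\\s*;",
            Pattern.MULTILINE);

    // Hypothetical helper, not part of ANTLR or the Xcode plugin.
    // Returns "lexer", "parser", "tree", "combined", or "unknown".
    static String detect(String grammarText) {
        Matcher m = HEADER.matcher(grammarText);
        if (!m.find()) {
            return "unknown";
        }
        String keyword = m.group(1);
        // No keyword in front of "grammar" means a combined grammar.
        return keyword == null ? "combined" : keyword;
    }

    public static void main(String[] args) {
        System.out.println(detect("tree grammar DefineVariables;\nprog : stat+ ;"));
        System.out.println(detect("grammar T;\nr : ID ;\nID : 'a'..'z'+ ;"));
    }
}
```

With extensions that encode the grammar type, a build tool could skip
this step entirely and decide from the file name alone.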

> T.gl -> TLexer.java
> T.gp -> TParser.java
> T.gtp -> TTreeParser.java
> T.g -> TLexer.java, TParser.java
>
> Kunle also has a proposal that  tree parsers are different enough  
> that we should translate literally
>
> T.gtp -> T.java
>
> His rationale is that normal grammars are named for the language  
> and, hence, adding Lexer or Parser suffix makes sense, but tree  
> parsers are usually named properly and adding the TreeParser suffix  
> does not.  I noticed this a number of times myself.  For example,
>
> DefineVariables.g -> DefineVariablesTreeParser.java  (or .gtp?)
> SemanticAnalyzer.g -> SemanticAnalyzerTreeParser.java

Yes, that has bugged me, too.

> weird, right? Because tree grammars are named for their functions,  
> we should probably leave that alone.  With the addition of  
> Prashant's proposal (a similar one was proposed by others) then we will  
> know precisely what Java file will come out given the extension.
>
> So, the proposal is that ANTLR specifically  
> enforce .g, .gl, .gp, .gtp as well as the name of the file being  
> the name of the grammar inside.  Further, ANTLR would deviate from  
> regular grammars for tree grammars so that no suffix is added to  
> the generated code.

Sounds pretty good to me. This also comes in time for the book ;)
I give my vote to:
T.g	- combined, (TLexer|TParser).targetexts
T.gl	- lexer, TLexer.targetexts
T.gp	- parser, TParser.targetexts
T.gtp	- tree parser, T.targetexts
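
The table above can be sketched as a small lookup that a build tool or
IDE might use. This is only an illustration of the proposed convention,
not anything ANTLR ships: the class name, the method name and the
targetExt parameter (e.g. "java") are assumptions of mine.

```java
import java.util.List;

final class GrammarOutputs {
    // Hypothetical helper: given a grammar file name and the target's source
    // extension, returns the generated file names under the proposed scheme.
    static List<String> outputsFor(String fileName, String targetExt) {
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0) {
            throw new IllegalArgumentException("no extension: " + fileName);
        }
        String name = fileName.substring(0, dot);
        String ext = fileName.substring(dot + 1);
        switch (ext) {
            case "g":   // combined grammar: lexer and parser
                return List.of(name + "Lexer." + targetExt,
                               name + "Parser." + targetExt);
            case "gl":  // lexer grammar
                return List.of(name + "Lexer." + targetExt);
            case "gp":  // parser grammar
                return List.of(name + "Parser." + targetExt);
            case "gtp": // tree parser: no suffix added to the name
                return List.of(name + "." + targetExt);
            default:
                throw new IllegalArgumentException("unknown extension: " + ext);
        }
    }

    public static void main(String[] args) {
        String[] files = {"T.g", "T.gl", "T.gp", "DefineVariables.gtp"};
        for (String f : files) {
            System.out.println(f + " -> " + outputsFor(f, "java"));
        }
    }
}
```

Targets that emit more than one file per recognizer would need a list of
extensions instead of a single targetExt, which this sketch leaves out.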

But I wonder if ANTLR should really enforce this. After all, a  
compiler shouldn't care what the file is named, right? In ANTLR's  
case I see file extensions more like a recommendation to make life  
easier for GUIs, build systems, etc. The only thing we have to change  
is the class/filename of tree parsers. Everything else will already  
be working. Maybe change the extensions of the intermediate lexer  
grammar file of combined grammars, and that's it. Document the "best  
practice file extensions" and be done with it.

> How will this fly in other target languages?

I don't see how this will be any different to the current situation.  
After all, the targets aren't involved in generating the filenames.

What I sometimes longed for is to be able to generate more than one  
pair of files (or one file as in Java), esp. for rule return classes,  
but that discussion really doesn't belong here ;) Besides, that would  
be difficult to implement in a uniform manner for all targets, so  
I'll just deal with the humongous output files. My ideal for this  
would be too Objective-C specific anyway.

cheers,

-k

-- 
Kay Röpke
http://classdump.org/





