[antlr-interest] ANTLRv3 comments/suggestions

Terence Parr parrt at cs.usfca.edu
Wed Jul 25 14:41:11 PDT 2007


On Jul 12, 2007, at 2:09 PM, Andy Tripp wrote:

> Hi all,
> I've just started using ANTLRv3 and I have a few comments,
> suggestions, enhancement requests, and bug reports.
> Sorry to lump them all together like this - I'm lazy.
>
> And, of course, let me just say v3 is amazing!

Great! :)

> What impresses me most is that it seems like software tools in  
> general, and especially tools like ANTLR,
> never seem to get easier to use as they get older. New releases  
> invariably add power and get harder to use,
> not easier. Not so with v3: LL(*) really does a great job of  
> drastically reducing the classic "ambiguity headache".
> I'm really enjoying the new rewrite rules for creating ASTs.

Awesome...yeah, i love those things. ;)

> A really great job by Terence!

thanks!

> 1) When a rule alternative is the rule itself, I get a runtime
> StackOverflowError; this should be caught earlier:
> A: A | B;

yup.

http://www.antlr.org:8888/browse/ANTLR-108
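
One way such a rule could be flagged before runtime is a static check for immediate left recursion. The following is only a minimal sketch under stated assumptions: the class `LeftRecursionCheck` and the map-based grammar representation are hypothetical and are not how ANTLR represents or analyzes grammars.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of ANTLR. */
public class LeftRecursionCheck {
    /**
     * Each rule name maps to its alternatives; each alternative is a list of
     * symbols. A rule is reported if any alternative starts with the rule itself.
     * Only immediate left recursion is detected (not A -> B -> A chains).
     */
    static List<String> immediatelyLeftRecursive(Map<String, List<List<String>>> rules) {
        List<String> bad = new ArrayList<String>();
        for (Map.Entry<String, List<List<String>>> e : rules.entrySet()) {
            for (List<String> alt : e.getValue()) {
                if (!alt.isEmpty() && alt.get(0).equals(e.getKey())) {
                    bad.add(e.getKey());
                    break; // one offending alternative is enough
                }
            }
        }
        return bad;
    }
}
```

For the grammar "A: A | B;" this check would report A, since its first alternative begins with A.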

> 2) ANTLR always returns 0, even when an error occurred. It should
> return non-zero on error.

yup. got it:

http://www.antlr.org:8888/browse/ANTLR-43

> 3) When I list a parser rule after a lexer rule, ANTLR doesn't seem
> to find the parser rule.
> This was hard to track down because I accidentally started a parser  
> rule with uppercase (making it a lexer rule),
> and then (I think) all parser rules after that were not found. If  
> all parser rules must come first, enforce that
> and make sure no lexer rules come after any parser rules.

Hmm...i get no issue with:

grammar T;
a : B ;
B : 'b' ;
c : 'c' ;

> Also on this issue, the book only mentions once, in passing, that
> lexer rules start with uppercase, and doesn't mention
> that parser rules start with lowercase. I would emphasize this
> issue more.
>
> 4) When I define an imaginary token called "EOF", it conflicts with  
> the ANTLR-internal one with the same name,
> and I get a NPE at runtime.

Added bug.

> 5) I have a lot of suggested improvements for CommonTree, but of  
> course I'll just extend it for myself.
> You may want to consider adding the following:
> * Add a getChildren() method - makes it easier to iterate,  
> especially with the Java1.5 foreach construct.

added to BaseTree

	/** Get the children internal List; note that if you directly mess with
	 *  the list, do so at your own risk.
	 */
	public List getChildren() {
		return children;
	}
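
A sketch of how a caller might iterate with the Java 1.5 foreach construct. The `Node` class below is a hypothetical stand-in, not the ANTLR runtime class, and it uses generics, which the runtime itself does not. It assumes, as in the reply further down, that a leaf keeps its children list null, so the caller checks for null before looping.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a tree node; not the ANTLR runtime class. */
public class ChildrenDemo {
    static class Node {
        final String text;
        List<Node> children; // stays null for leaves to avoid an empty list per node

        Node(String text) { this.text = text; }

        void addChild(Node c) {
            if (children == null) children = new ArrayList<Node>(); // allocate lazily
            children.add(c);
        }

        /** Returns the internal list, or null if this node is a leaf. */
        List<Node> getChildren() { return children; }
    }

    /** Joins the text of the direct children with single spaces. */
    static String joinChildText(Node parent) {
        List<Node> kids = parent.getChildren();
        if (kids == null) return ""; // leaf: nothing to iterate
        StringBuilder sb = new StringBuilder();
        for (Node k : kids) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(k.text);
        }
        return sb.toString();
    }
}
```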


> * why not initialize children to an empty list, rather than null  
> and having all that null checking code?

a waste of an array for every leaf node.

> * use generics - children should be a List of CommonTrees.

can't use 1.5 for runtime yet.

> * I've written a toStringPrettyTree() method that prints out trees  
> nicely indented, rather than that
> ugly LISP-ish syntax of toStringTree().

want to donate?  Send in via feedback page. :)
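
For illustration, an indented printer along these lines could look like the sketch below. This is not Andy's implementation, which is not shown in the thread; the `Node` class and both method bodies are hypothetical, and the code uses Java 1.5 features (generics, varargs, StringBuilder) that the runtime could not rely on at the time.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; not the ANTLR runtime and not the poster's code. */
public class PrettyTreeDemo {
    static class Node {
        final String text;
        final List<Node> children = new ArrayList<Node>();

        Node(String text, Node... kids) {
            this.text = text;
            for (Node k : kids) children.add(k);
        }
    }

    /** LISP-style form, e.g. (+ 1 (* 2 3)). */
    static String toStringTree(Node t) {
        if (t.children.isEmpty()) return t.text;
        StringBuilder sb = new StringBuilder("(").append(t.text);
        for (Node c : t.children) sb.append(' ').append(toStringTree(c));
        return sb.append(')').toString();
    }

    /** One node per line, indented two spaces per level of depth. */
    static String toStringPrettyTree(Node t) {
        StringBuilder sb = new StringBuilder();
        pretty(t, 0, sb);
        return sb.toString();
    }

    private static void pretty(Node t, int depth, StringBuilder sb) {
        for (int i = 0; i < depth; i++) sb.append("  ");
        sb.append(t.text).append('\n');
        for (Node c : t.children) pretty(c, depth + 1, sb);
    }
}
```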

> * Use StringBuilder rather than StringBuffer, probably everywhere  
> in ANTLR and in the generated code.

That's 1.5.

> 6) Why do I have to both specify ASTTokenType and also do the  
> setTreeAdaptor() thing? Can't
> ANTLR call setTreeAdaptor() on its own whenever I specify an  
> ASTTokenType?

ASTLabelType is for generating casts in generated code.  it's not a  
runtime thing.
>
> 7) CommonTree.getText() shouldn't call toString() because a
> subclass may override toString() and
> call getText() in it (causing an infinite recursion).

yep, i fixed that.
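
The recursion described in item 7 can be reproduced with the sketch below. All class names are hypothetical stand-ins, and the "safe" variant is only one plausible shape for a fix; the actual change made to the runtime is not shown in this thread.

```java
/** Hypothetical stand-ins; not the ANTLR runtime classes. */
public class GetTextDemo {
    /** Problem shape: getText() delegates to toString(). */
    static class RiskyNode {
        String tokenText = "x";
        String getText() { return toString(); }
        public String toString() { return tokenText; }
    }

    /** A subclass that calls getText() from toString() now recurses forever. */
    static class RiskySub extends RiskyNode {
        public String toString() { return "node:" + getText(); }
    }

    /** One possible fix: getText() reads the underlying text directly. */
    static class SafeNode {
        String tokenText = "x";
        String getText() { return tokenText; }
        public String toString() { return getText(); }
    }

    static class SafeSub extends SafeNode {
        public String toString() { return "node:" + getText(); }
    }

    /** Returns true if calling toString() on the object overflows the stack. */
    static boolean overflows(Object o) {
        try {
            o.toString();
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }
}
```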

>
> 8) In the generated code, print out the TokenTypes ordered by value:
>    public static final int SR_ASSIGN=130;
>    public static final int COMMA=78;
>    public static final int STATIC_BLOCK=49;
>    public static final int MINUS=86;
>    public static final int FORMAL_PARAMETERS=39;
>    public static final int EXPANSION_CHOICES=23;
>    public static final int HexDigit=122;
>    public static final int REGEX_PRODUCTION=32;
>    public static final int MORE=12;
>    public static final int FIELD_DECLARATION=53;
>    public static final int REGEX_ID=35;
> ....
>
> 9) When I accidentally put a '$' where it doesn't belong in a rule
> parameter:
> myrule[$param]
>  : ....
>
> ...I get error "atribute param is not a token, ...", which is fine,  
> but the line and column are both zero in the error.
>
> I hope this list is useful!

Very!  Thanks, Andy.
Ter

