[antlr-interest] ANTLRv3 comments/suggestions

Andy Tripp antlr at jazillian.com
Thu Jul 12 14:09:04 PDT 2007


Hi all,
I've just started using ANTLRv3 and I have a few 
comments/suggestions/enhancement request/bug reports.
Sorry to lump them all together like this - I'm lazy.

And, of course, let me just say v3 is amazing!
What impresses me most is that it seems like software tools in general, 
and especially tools like ANTLR,
never seem to get easier to use as they get older. New releases 
invariably add power and get harder to use,
not easier. Not so with v3: LL(*) really does a great job of drastically 
reducing the classic "ambiguity headache".
I'm really enjoying the new rewrite rules for creating ASTs.

A really great job by Terence!

1) When a rule alternative is the rule itself, I get a runtime 
StackOverFlowError, should be caught earlier:
A: A | B;

2) ANTLR always returns 0, even when an error occured. Should return 
non-zero on error

3) When I list a parser rule after a lexer rule,ANTLR doesn't seem to 
find the parser rule.
This was hard to track down because I accidentally started a parser rule 
with uppercase (making it a lexer rule),
and then (I think) all parser rules after that were not found. If all 
parser rules must come first, enforce that
and make sure no lexer rules come after any parser rules.

Also on this issue, the book only mentions once, in passing, that lexer 
rules are uppercase, an doesn't mention
that parser rules start with uppercase. I would emphasize this issue more.

4) When I define an imaginary token called "EOF", it conflicts with the 
ANTLR-internal one with the same name,
and I get a NPE at runtime.

5) I have a lot of suggested improvements for CommonTree, but of course 
I'll just extend it for myself.
You may want to consider adding the following:
* Add a getChildren() method - makes it easier to iterate, especially 
with the Java1.5 foreach construct.
* why not initialize children to an empty list, rather than null and 
having all that null checking code?
* use generics - children should be a List of CommonTrees.
* I've written a toStringPrettyTree() method that prints out trees 
nicely indented, rather than that
ugly LISP-ish syntax of toStringTree().
* Use StringBuilder rather than StringBuffer, probably everywhere in 
ANTLR and in the generated code.

6) Why do I have to both specify ASTTokenType and also do the 
setTreeAdaptor() thing? Can't
ANTLR call setTreeAdaptor() on its own whenever I specify an ASTTokenType?

7) CommonTree.getText() shouldn't call toString() because a subclass may 
override toString() and
call getText() in it (causing a infinite recursion).

8) In the generated code, print out the TokenTypes ordered by value:
    public static final int SR_ASSIGN=130;
    public static final int COMMA=78;
    public static final int STATIC_BLOCK=49;
    public static final int MINUS=86;
    public static final int FORMAL_PARAMETERS=39;
    public static final int EXPANSION_CHOICES=23;
    public static final int HexDigit=122;
    public static final int REGEX_PRODUCTION=32;
    public static final int MORE=12;
    public static final int FIELD_DECLARATION=53;
    public static final int REGEX_ID=35;
....

9) When I accidentally put a '$' where it doesn't belong it a rule 
parameter:
myrule[$param]
  : ....

...I get error "atribute param is not a token, ...", which is fine, but 
the line and column are both zero in the error.

I hope this list is useful!
Andy


More information about the antlr-interest mailing list