[antlr-interest] A postmortem of my use of antler
Gerald B. Rosenberg
gbr at newtechlaw.com
Mon Mar 10 18:52:06 PDT 2008
At 03:08 PM 3/10/2008, Benjamin Shropshire wrote:
>"how is antlr /expected/ to be used?" is the big one. That one
>should be on page one. I have yet to find a direct answer anywhere.
Very difficult to formulate a one page/one paragraph answer. After
all, Antlr is a DSL tailored for a large, variable problem
space. The Antlr language operators are rather basic in nature -- to
the extent there are coding patterns, they are quite general at best.
The step-function learning problem is largely (I believe) one of
learning how to design a recognition specification subject to the
available mechanics of the Antlr language operators. And, design in
Antlr is largely a function of the language to be recognized. If you
look at the archived grammars, what at first appear to be wildly
varying styles are more a consequence of design choices tailored to
the intended function of the grammars.
That said, there are a number of caveats* that could be shared with
those new to Antlr**:
1) if you need better documentation, write it as you learn***, get
TDAR, or both.
2) Antlr lexers implement an LL(1) conversion of input symbols,
typically atomic characters, into tokens.
3) Antlr parsers implement an LL(*) conversion of tokens to subrules,
and then to actions or an AST.
4) The parser calls the lexer, but the lexer, on first call, will run
to EOF. After that, the lexer is just an in-memory sequential token
repository for the parser.
5) Fragments are essentially macro expansions, and are only visible
in the lexer.
6) Don't expect Antlr to resolve ambiguities automagically -- code
what you mean.
7) Even where Antlr offers some incidental recognition behavior, use
it cautiously -- overreliance will result in inexplicable robustness
and maintenance problems.
8) Left factor: understand it and use it.
9) Use groups (parenthesis-delimited subrules) to make clear what
your code means.
10) Use predicates to make clear what your code means.
11) Unless performance is absolutely among your top 6 design
requirements, don't worry about using predicates.
12) Don't try to do too much in the lexer. The parser is more powerful anyway.
13) Don't try to do too much in the parser. Use tree-walkers to evolve an AST.
14) Don't try to dump complex blocks from a parser or AST. Use
StringTemplate to unparse.
15) Actions (brace-delimited statements) can be inserted almost anywhere.
* these caveats have caveats
** top 15 that I wish I had had
*** the Antlr Wiki supports personal pages (FWIW, my notes are public)
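To make caveats 4 and 8 a little more concrete, here is a rough
hand-written sketch in Python. It is not what Antlr generates, only an
illustration of the same shape: the lexer runs over the whole input
first, the parser then walks the buffered token list, and the rule
"stat" is written in left-factored form. The toy language, the Token
class, and all function and rule names are my own assumptions for the
example.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Hypothetical token set for a toy language: "x = 1;" or "f(y);"
TOKEN_SPEC = [
    ("ID", r"[A-Za-z_]\w*"),
    ("INT", r"\d+"),
    ("OP", r"[=();]"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pat})" for kind, pat in TOKEN_SPEC))


def lex(text: str) -> List[Token]:
    """Tokenize the entire input up front (caveat 4), dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        if m.lastgroup != "WS":
            tokens.append(Token(m.lastgroup, m.group()))
        pos = m.end()
    tokens.append(Token("EOF", ""))
    return tokens


class Parser:
    """Walks an in-memory token list; it never touches the raw characters."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Token:
        return self.tokens[self.pos]

    def match(self, kind: str, text: str = None) -> Token:
        tok = self.peek()
        if tok.kind != kind or (text is not None and tok.text != text):
            raise SyntaxError(f"expected {text or kind}, got {tok.text!r}")
        self.pos += 1
        return tok

    def stat(self):
        # Left-factored form (caveat 8) of the two alternatives
        #   ID '=' expr ';'   and   ID '(' expr ')' ';'
        # The shared ID prefix is matched once, then one token decides.
        name = self.match("ID").text
        if self.peek().text == "=":
            self.match("OP", "=")
            node = ("assign", name, self.expr())
        else:
            self.match("OP", "(")
            node = ("call", name, self.expr())
            self.match("OP", ")")
        self.match("OP", ";")
        return node

    def expr(self):
        tok = self.peek()
        if tok.kind in ("ID", "INT"):
            self.pos += 1
            return (tok.kind.lower(), tok.text)
        raise SyntaxError(f"expected ID or INT, got {tok.text!r}")
```

Usage would look like Parser(lex("x = 1;")).stat(). Because the common
ID prefix is consumed before the branch, the decision needs only one
token of lookahead, and the intent of the rule is visible at a glance.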