[antlr-interest] postmortem

Wed Mar 12 12:21:07 PDT 2008

Jim Idle wrote:
> The wiki is open to anyone to add documentation at any level they like 

To address that exact point:

One (possible) problem with this: many people, by the time they have the 
experience to be confident in their understanding of a system, have 
lost sight of the things that are hard for newbies. I know /I/ am guilty 
of this practically every time I open my mouth. I know this is a hard 
problem. I'm currently helping develop a system that is so far off the 
beaten track that I struggle to describe it to people who haven't been 
working on it (and there are 3 people in that category). My number one 
fear is that /we/ will not be able to write documentation that is in the 
least bit tractable for new users, because we have no idea what the major 
sticking points will be.

I would update the wiki with the info I would like to see in it, but I 
am still not sure my understanding is correct. I'm not suggesting a 
development effort, but has anyone seen a system for "tagging" info in a 
wiki with "confidence" ratings? ("this /seems/ to work" -> "I think this 
is correct but I'm not sure" -> "I wrote the blinking thing so this is 
RIGHT") Barring something like that, I'd be worried that I'm adding 
wrong information and misleading other users.

Not addressing Jim's point, but offering something (besides criticism) 
to y'all:

If someone who knows thinks this is correct, could you add it to the 
wiki in some suitably prominent place?

""

ANTLR is a tool for generating a number of different kinds of language 
recognizer/translator programs. It is typically used to process input in 
three stages. The first stage takes a character stream as input, breaks 
it up into tokens (the smallest grouping a language cares about) and 
generates a token stream as output. The second stage takes this token 
stream and finds its structure. In this stage, the sequence of tokens is 
converted into a condensed, normalized form, the Abstract Syntax Tree 
(AST), by extracting parts of the input pattern and applying rewrite 
rules. The third stage (or stages in some cases) takes this AST and uses a 
tree parser to process it to do whatever the user needs. This stage 
often takes the form of multiple tree parsers that are run in sequence 
to generate the needed results. The first two stages are typically 
defined together in a combined grammar with an output type of "AST". 
They should generally have no more action code attached to them 
than is needed to properly parse the input. The bulk of the logic should 
be put in the third stage(s).

""
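
To make the three stages concrete, here is a minimal hand-written sketch 
in Python. It does not use ANTLR or any code ANTLR generates; it only 
mimics the shape of the pipeline for a toy arithmetic language. The 
names tokenize, parse, evaluate and to_prefix, the tuple-based AST, and 
the toy language itself are all assumptions made up for illustration.

```python
import re

# Stage 1 (lexer): character stream in, token stream out.
# Tokens are (kind, text) pairs; this layout is an assumption of the sketch.
TOKEN_RE = re.compile(r"(\d+)|(\s+)|(.)")

def tokenize(text):
    tokens = []
    for number, space, other in TOKEN_RE.findall(text):
        if number:
            tokens.append(("INT", number))
        elif space:
            continue  # whitespace is dropped, it carries no meaning here
        elif other in "+*()":
            tokens.append((other, other))
        else:
            raise SyntaxError(f"unexpected character {other!r}")
    return tokens

# Stage 2 (parser): token stream in, AST out.
# Grammar:  expr : term ('+' term)* ;  term : atom ('*' atom)* ;
#           atom : INT | '(' expr ')' ;
# Parentheses do not appear in the tree: the AST is the condensed form.
def parse(tokens):
    pos = 0

    def peek():
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def take(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()}")
        pos += 1
        return tokens[pos - 1][1]

    def expr():
        node = term()
        while peek() == "+":
            take("+")
            node = ("+", node, term())
        return node

    def term():
        node = atom()
        while peek() == "*":
            take("*")
            node = ("*", node, atom())
        return node

    def atom():
        if peek() == "(":
            take("(")
            node = expr()
            take(")")
            return node
        return ("INT", int(take("INT")))

    tree = expr()
    if peek() != "EOF":
        raise SyntaxError(f"unexpected token {peek()}")
    return tree

# Stage 3 (tree walkers): all the real logic lives here, and several
# independent passes can be run over the same AST.
def evaluate(node):
    if node[0] == "INT":
        return node[1]
    left, right = evaluate(node[1]), evaluate(node[2])
    return left + right if node[0] == "+" else left * right

def to_prefix(node):
    if node[0] == "INT":
        return str(node[1])
    return f"({node[0]} {to_prefix(node[1])} {to_prefix(node[2])})"

ast = parse(tokenize("2 * (3 + 4)"))
print(ast)
print(evaluate(ast))
print(to_prefix(ast))
```

Note that tokenize and parse contain nothing beyond what is needed to 
recognize the input, while evaluate and to_prefix are two separate 
passes over the same tree, which is the division of labour the 
paragraph above recommends.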

The quoted paragraph above would have been really handy to have found 
while I was still deciding what tool to use. One point about the wording: 
I was careful to pepper it with terms that the newbie should know before 
going on. That way, if they don't know them, they at least have a text 
string to search for (feel free to sub in better terms). In my 
experience, the hardest thing about jumping into something totally new 
is figuring out what questions I should be asking. Once I have that, I 
can usually make good headway on my own.

If it's off base, go ahead and correct it; I have no vanity with respect 
to that. (You need not even cite me as the author <grin/>)