[antlr-interest] Documenting grammars
Sam Barnett-Cormack
s.barnett-cormack at lancaster.ac.uk
Thu Mar 26 07:07:04 PDT 2009
Okay, this weekend I plan to start work on a suitable
language-and-paradigm-independent documentation system, with pluggable
inputs and outputs. I've not decided what technologies to use initially,
although you can bet that any input and output bits I write will use
ANTLR and/or ST, as appropriate. To help ensure agnosticism, I'll keep
three languages/paradigms in mind:
* Grammars (per ANTLR)
* Templates (per ST)
* Object-oriented languages generally (which will probably cover
procedural stuff as well, really)
I'll start with the "middle" part - working out an API for receiving and
outputting language information.
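To make the "middle" part concrete, here is a rough sketch of what a
language-agnostic model with pluggable inputs and outputs might look like.
Nothing here is decided: DocElement, InputPlugin, OutputPlugin and
PlainTextOutput are all hypothetical names, and Python is used purely for
illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class DocElement:
    """One documented thing: a grammar rule, a template, a class, etc."""
    kind: str   # free-form term supplied by the input plugin, e.g. "rule"
    name: str
    doc: str = ""
    children: List[DocElement] = field(default_factory=list)


class InputPlugin(Protocol):
    """Hypothetical input side: turns source text into a DocElement tree."""
    def read(self, source: str) -> DocElement: ...


class OutputPlugin(Protocol):
    """Hypothetical output side: renders a DocElement tree as text."""
    def write(self, root: DocElement) -> str: ...


class PlainTextOutput:
    """Toy output plugin: renders the tree as an indented outline."""

    def write(self, root: DocElement, depth: int = 0) -> str:
        # Drop trailing whitespace so elements without a doc string end in ":".
        lines = ["  " * depth + f"{root.kind} {root.name}: {root.doc}".rstrip()]
        for child in root.children:
            lines.append(self.write(child, depth + 1))
        return "\n".join(lines)
```

Because "kind" is just a string, a grammar front end could emit "rule" and
"token" while an OO front end emits "class" and "method", without the
middle layer knowing either vocabulary.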
If there's interest in this, I can either keep the list updated, or just
those who are interested. If I get it off the ground and ANTLR folks
don't want it in the ANTLR stable, I'll sort out other hosting and
mailing list stuff.
Sam
Sam Barnett-Cormack wrote:
> Dennis Brothers wrote:
>> On Mar 23, 2009, at 2:23 PM, Sam Barnett-Cormack wrote:
>>
>>> Sam Harwell wrote:
>>>> Why not create our own format that properly describes grammars?
>>>>
>>>> We could group them by the Tokens file they reference to cover
>>>> lexer/parser/tree parser combinations. Documentation could include
>>>> formatted/highlighted rule text, comments, DFA statistics, and
>>>> thumbnails of the various rule diagrams linked to full-size versions.
>>>>
>>>> Sam (Harwell)
>>> The way I see it, right now, there are two options to get the minimum
>>> I'd like to see:
>>>
>>> 1) Scratch-written system to document anything, flexibly enough to
>>> allow various different terminologies (appropriate to OOP, grammars,
>>> whatever)
>>>
>>> 2) Scratch-written grammar documenting system, allowing focus on good
>>> documentation of grammars and including the stuff that Sam H talks
>>> about (configurably, of course).
>>>
>>> Option (2) is less work, while option (1) is more use to the wider
>>> community *iff* it's done well. Frankly, I'm leaning towards (2) now
>>> (despite some quickly-scratched-out design for (1), which I could
>>> always use to do that option later myself if I really want to). (2)
>>> doesn't overlap other existing systems.
>>>
>>> If there's enthusiasm for this, I'll whip up a quick outline-design,
>>> and anyone who wants to help can help me nail it down to something
>>> specific, and possibly help actually write it ;) Areas that'd speed me
>>> up most would be writing output engines and bringing extra (more
>>> experienced?) minds to the parsing. Design usually produces better
>>> results from multiple minds, too.
>>>
>>> Sam (Barnett-Cormack)
>> Since ANTLR uses ANTLR to parse itself, couldn't the ANTLR grammar be
>> modified or extended to recognize and emit doc comments and the
>> elements they refer to? Seems like this would be quite a leg up on
>> Option 2.
>>
>> I don't recall whether the ANTLR grammar is public (and I know ANTLR 3
>> was originally implemented using ANTLR 2) - what's the current state
>> of this?
>
> Well, the ANTLRV3-in-ANTLRV3 is at http://www.antlr.org/grammar/ANTLR/
>
> However, it'll need some tweaking - currently, doc-comments are
> discarded except on the grammar itself. Makes life awkward, as it'd be
> annoying to have to re-tweak the grammar if/when it changes in future.
> However, it won't be hard overall. Tweak the parser grammar a bit, write
> a new (most likely filter) tree grammar to extract what's needed... a
> bit harder if we want to be able to reconstruct the rules themselves, to
> put verbatim into the documentation, but still not too bad, really.
>
> I'd prefer it if the retaining of rule doc-comments was done in the main
> codebase, of course - as well as adding doc-comments to entries in the
> tokens section. Ter et al - any chance of this happening if there *is*
> an effort to come up with a documenting tool?
>
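As a rough illustration of what "retaining rule doc-comments" would need to
produce, here is a small sketch that pairs each /** ... */ comment with the
rule that follows it. It is a regex stand-in, not the parser-plus-tree-grammar
approach described above, and it is not how ANTLR itself handles comments;
the function name extract_rule_docs is hypothetical. It will be fooled by
"/**" appearing inside string literals or actions, and it ignores rules that
have arguments or return clauses.

```python
import re

# A doc-comment, then optional "fragment", then a rule name followed by ':'.
# The tempered dot (?:(?!\*/).)* keeps the comment from running past its own
# closing "*/", so the grammar-level comment is not glued onto a later rule.
DOC_RULE = re.compile(
    r"/\*\*(?P<doc>(?:(?!\*/).)*)\*/\s*"
    r"(?:fragment\s+)?"
    r"(?P<name>[A-Za-z_]\w*)\s*:",
    re.DOTALL,
)


def extract_rule_docs(grammar_text: str) -> dict:
    """Map rule name -> cleaned doc-comment text for documented rules."""
    docs = {}
    for match in DOC_RULE.finditer(grammar_text):
        # Strip the leading '*' decoration and whitespace from each line.
        cleaned = " ".join(
            line.strip(" \t*") for line in match.group("doc").splitlines()
        ).strip()
        docs[match.group("name")] = cleaned
    return docs
```

The comment on the grammar declaration itself is skipped here only because
"grammar Name;" is not followed by a colon; a real tool would want to keep it
as the top-level description.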