[antlr-interest] Documenting grammars

Sam Barnett-Cormack s.barnett-cormack at lancaster.ac.uk
Thu Mar 26 07:07:04 PDT 2009


Okay, this weekend I plan to start work on a suitable
language-and-paradigm-independent documentation system, with pluggable
inputs and outputs. I've not decided what technologies to use initially,
although you can bet that any input and output bits I write will use
ANTLR and/or ST, appropriately. I've decided to use three
languages/paradigms to keep in mind to help ensure agnosticism:

 * Grammars (per ANTLR)
 * Templates (per ST)
 * Object-Oriented languages generally (which will probably cover
procedural stuff as well, really)

I'll start with the "middle" part - working out an API for receiving and
outputting language information.
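
As a rough illustration of what such a "middle" layer could look like,
here is a minimal sketch. Nothing in it comes from an existing design:
the names DocElement, InputPlugin, OutputPlugin and PlainTextOutput are
hypothetical, and Python is used only because it is compact. The idea is
that input plugins build a neutral tree of documented elements and output
plugins render it, with the "kind" label left free-form so that grammars,
templates and classes can each use their own terminology.

```python
from dataclasses import dataclass, field
from typing import List, Protocol


@dataclass
class DocElement:
    """One documented thing: a grammar rule, a template, a class, ..."""
    kind: str   # free-form label chosen by the input plugin (hypothetical)
    name: str
    doc: str = ""
    children: List["DocElement"] = field(default_factory=list)


class InputPlugin(Protocol):
    """Anything that can turn source text into a DocElement tree."""
    def read(self, source: str) -> DocElement: ...


class OutputPlugin(Protocol):
    """Anything that can render a DocElement tree as text."""
    def write(self, root: DocElement) -> str: ...


class PlainTextOutput:
    """Toy output plugin: renders the tree as indented plain text."""

    def write(self, root: DocElement, depth: int = 0) -> str:
        line = "  " * depth + f"{root.kind} {root.name}: {root.doc}"
        rest = [self.write(child, depth + 1) for child in root.children]
        return "\n".join([line] + rest)


# Usage: a tree an input plugin might produce for a tiny grammar.
grammar = DocElement("grammar", "Expr", "Arithmetic expressions", [
    DocElement("rule", "expr", "A sum of terms"),
])
print(PlainTextOutput().write(grammar))
```

Keeping "kind" as a plain string rather than a fixed enumeration is one
possible way to stay paradigm-agnostic; a real design might well want
something stricter.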

If there's interest in this, I can either keep the list updated, or just
those who are interested. If I get it off the ground and ANTLR folks
don't want it in the ANTLR stable, I'll sort out other hosting and
mailing list stuff.

Sam

Sam Barnett-Cormack wrote:
> Dennis Brothers wrote:
>> On Mar 23, 2009, at 2:23 PM, Sam Barnett-Cormack wrote:
>>
>>> Sam Harwell wrote:
>>>> Why not create our own format that properly describes grammars?
>>>>
>>>> We could group them by the Tokens file they reference to cover
>>>> lexer/parser/tree parser combinations. Documentation could include
>>>> formatted/highlighted rule text, comments, DFA statistics, and
>>>> thumbnails of the various rule diagrams linked to full-size versions.
>>>>
>>>> Sam (Harwell)
>>> The way I see it, right now, there's two options to get the minimum I'd
>>> like to see:
>>>
>>> 1) Scratch-written system to document anything, flexibly enough to allow
>>> various different terminologies (appropriate to OOP, grammars, whatever)
>>>
>>> 2) Scratch-written grammar documenting system, allowing focus on good
>>> documentation of grammars and including the stuff that Sam H talks about
>>> (configurably, of course).
>>>
>>> Option (2) is less work, while option (1) is more use to the wider
>>> community *iff* it's done well. Frankly, I'm leaning towards (2) now
>>> (despite some quickly-scratched-out design for (1), which I could always
>>> use to do that option later myself if I really want to). (2) doesn't
>>> overlap other existing systems.
>>>
>>> If there's enthusiasm for this, I'll whip up a quick outline-design, and
>>> anyone who wants to help can help me nail it down to something specific,
>>> and possibly help actually write it ;) areas that'd speed me up most
>>> would be writing output engines and bringing extra (more experienced?)
>>> minds to the parsing. Design usually produces better results from
>>> multiple minds, too.
>>>
>>> Sam (Barnett-Cormack)
>> Since ANTLR uses ANTLR to parse itself, couldn't the ANTLR grammar be  
>> modified or extended to recognize and emit doc comments and the  
>> elements they refer to?  Seems like this would be quite a leg up on  
>> Option 2.
>>
>> I don't recall whether the ANTLR grammar is public (and I know ANTLR 3  
>> was originally implemented using ANTLR 2) - what's the current state  
>> of this?
> 
> Well, the ANTLRV3-in-ANTLRV3 is at http://www.antlr.org/grammar/ANTLR/
> 
> However, it'll need some tweaking - currently, doc-comments are 
> discarded except on the grammar itself. Makes life awkward, as it'd be 
> annoying to have to re-tweak the grammar if/when it changes in future. 
> However, it won't be hard overall. Tweak the parser grammar a bit, write 
> a new (most likely filter) tree grammar to extract what's needed... a 
> bit harder if we want to be able to reconstruct the rules themselves, to 
> put verbatim into the documentation, but still not too bad, really.
> 
> I'd prefer it if the retaining of rule doc-comments was done in the main 
> codebase, of course - as well as adding doc-comments to entries in the 
> tokens section. Ter et al - any chance of this happening if there *is* 
> an effort to come up with a documenting tool?
> 
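
On the quoted point about extracting rule doc-comments: the message
suggests tweaking the ANTLR parser grammar and writing a tree grammar.
The sketch below is NOT that approach. It is a much cruder stand-in that
scans the raw grammar text with a regular expression, just to show the
kind of (rule name, doc-comment) pairs a documenting tool would want.
The function name extract_rule_docs and the sample grammar are
hypothetical, and the regex ignores cases a real parser would handle
(comments inside actions or string literals, rule modifiers other than
"fragment", and so on).

```python
import re

# Hypothetical helper, not part of ANTLR: pairs each /** ... */ comment
# with the rule name that immediately follows it.
DOC_RULE = re.compile(
    r"/\*\*(?P<doc>(?:(?!\*/).)*)\*/\s*"  # doc-comment body, never crossing "*/"
    r"(?:fragment\s+)?"                    # optional lexer-rule modifier
    r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)"    # rule name
    r"\s*:",                               # colon that opens the rule body
    re.DOTALL,
)


def extract_rule_docs(grammar_text):
    """Return {rule_name: doc_text} for rules preceded by a doc-comment."""
    docs = {}
    for m in DOC_RULE.finditer(grammar_text):
        # Drop the leading "*" decoration and surrounding whitespace per line.
        lines = [ln.strip().lstrip("*").strip()
                 for ln in m.group("doc").splitlines()]
        docs[m.group("name")] = " ".join(ln for ln in lines if ln)
    return docs


SAMPLE = """
/** Grammar-level comment, not attached to any rule. */
grammar Expr;

/** A sum of one or more terms. */
expr : term ('+' term)* ;

/**
 * A single integer literal.
 */
term : INT ;

INT : '0'..'9'+ ;
"""

print(extract_rule_docs(SAMPLE))
```

The grammar-level comment is skipped because it is followed by
"grammar Expr;" rather than a rule name and a colon, and INT is absent
from the result because it has no doc-comment.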


