[antlr-interest] Documenting grammars

Sam Barnett-Cormack s.barnett-cormack at lancaster.ac.uk
Mon Mar 23 15:33:52 PDT 2009


Dennis Brothers wrote:
> On Mar 23, 2009, at 2:23 PM, Sam Barnett-Cormack wrote:
> 
>> Sam Harwell wrote:
>>> Why not create our own format that properly describes grammars?
>>>
>>> We could group them by the Tokens file they reference to cover
>>> lexer/parser/tree parser combinations. Documentation could include
>>> formatted/highlighted rule text, comments, DFA statistics, and
>>> thumbnails of the various rule diagrams linked to full-size versions.
>>>
>>> Sam (Harwell)
>> The way I see it, right now, there are two options to get the minimum
>> I'd like to see:
>>
>> 1) Scratch-written system to document anything, flexibly enough to
>> allow various different terminologies (appropriate to OOP, grammars,
>> whatever)
>>
>> 2) Scratch-written grammar documenting system, allowing focus on good
>> documentation of grammars and including the stuff that Sam H talks
>> about (configurably, of course).
>>
>> Option (2) is less work, while option (1) is more use to the wider
>> community *iff* it's done well. Frankly, I'm leaning towards (2) now
>> (despite some quickly-scratched-out design for (1), which I could
>> always use to do that option later myself if I really want to). (2)
>> doesn't overlap other existing systems.
>>
>> If there's enthusiasm for this, I'll whip up a quick outline-design,
>> and anyone who wants to help can help me nail it down to something
>> specific, and possibly help actually write it ;) The areas that'd speed
>> me up most would be writing output engines and bringing extra (more
>> experienced?) minds to the parsing. Design usually produces better
>> results from multiple minds, too.
>>
>> Sam (Barnett-Cormack)
> 
> Since ANTLR uses ANTLR to parse itself, couldn't the ANTLR grammar be  
> modified or extended to recognize and emit doc comments and the  
> elements they refer to?  Seems like this would be quite a leg up on  
> Option 2.
> 
> I don't recall whether the ANTLR grammar is public (and I know ANTLR 3  
> was originally implemented using ANTLR 2) - what's the current state  
> of this?

Well, the ANTLR v3 grammar written in ANTLR v3 is at
http://www.antlr.org/grammar/ANTLR/

However, it'll need some tweaking: currently, doc-comments are
discarded except on the grammar itself. That makes life awkward, as it'd
be annoying to have to re-tweak the grammar if/when it changes in future.
Still, it won't be hard overall. Tweak the parser grammar a bit, write
a new (most likely filter) tree grammar to extract what's needed... It's
a bit harder if we want to be able to reconstruct the rules themselves,
to put verbatim into the documentation, but still not too bad, really.
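
To give a rough idea of the extraction step, here is a minimal sketch in
Python. It is not the tree-grammar approach and it is not part of ANTLR;
it is just a regex approximation that pairs each /** ... */ comment with
the rule name that follows it. The names extract_rule_docs and
RULE_WITH_DOC and the sample grammar are made up for illustration, and it
only handles plain "name :" rule headers (no arguments, return values or
rule options).

```python
import re

# Hypothetical pattern: a /** ... */ comment immediately followed by a rule
# header of the form "name :". The doc body is not allowed to run past the
# first "*/", so a comment that precedes something other than a rule (for
# example the "grammar Foo;" line) is simply skipped.
RULE_WITH_DOC = re.compile(
    r"/\*\*(?P<doc>(?:(?!\*/).)*)\*/\s*"       # doc comment body
    r"(?:fragment\s+)?"                        # optional lexer modifier
    r"(?P<name>[A-Za-z_][A-Za-z0-9_]*)\s*:",   # rule name, then ':'
    re.DOTALL,
)


def extract_rule_docs(grammar_text):
    """Return a dict mapping rule name to its cleaned-up doc comment."""
    docs = {}
    for match in RULE_WITH_DOC.finditer(grammar_text):
        # Drop the leading '*' decoration that multi-line comments often use.
        lines = [
            line.strip().lstrip("*").strip()
            for line in match.group("doc").splitlines()
        ]
        docs[match.group("name")] = " ".join(line for line in lines if line)
    return docs


# Made-up sample grammar, only for demonstration.
SAMPLE = """
/** Grammar-level comment. */
grammar Expr;

/** Top-level rule: a single sum. */
prog : sum EOF ;

/**
 * A sum of one or more INT tokens.
 */
sum : INT ('+' INT)* ;

/** Integer literal. */
INT : '0'..'9'+ ;

WS : ' ' ;
"""

if __name__ == "__main__":
    for name, doc in extract_rule_docs(SAMPLE).items():
        print(name, "-", doc)
```

A real tool would want to work from the parser's own tree rather than
from regexes, precisely so that it doesn't trip over rule arguments,
actions or comments embedded in string literals.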

I'd prefer it if the retaining of rule doc-comments was done in the main 
codebase, of course - as well as adding doc-comments to entries in the 
tokens section. Ter et al - any chance of this happening if there *is* 
an effort to come up with a documenting tool?

-- 
Sam Barnett-Cormack

