[antlr-interest] Recover grammar file from generated code?

David-Sarah Hopwood david-sarah at jacaranda.org
Fri Sep 25 11:00:40 PDT 2009


Hoang Phung wrote:
> Hi all,
> 
> In my project, I inherited the code (Parser and Lexer) generated with 
> ANTLR by someone else but I couldn't retrieve the original grammar file 
> from him. Can someone tell me if there is a way to recover the grammar 
> file from the generated code and embedded comments? Thanks in advance.

The generated code does include all of the information needed, although
it's likely to be quite tedious to recover it manually (and not worth
automating just for one grammar).

Basically, each generated method of the lexer or parser corresponds to
a rule, and each method includes comments that give the original text
of the rule. If there are no actions or predicates, then recovery is
mostly a matter of deleting everything except the comments and then
cleaning up the syntax.
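
As a rough illustration of that comment-extraction step, here is a
minimal Python sketch. It assumes the Java target, with each rule method
preceded by a "// $ANTLR start" marker and then a comment of the form
"// File.g:line:col: rule text". That layout, the sample input (Expr.g,
expr, term) and the name recover_rules are assumptions made for the
example, so check them against your own generated files first.

```python
import re

# Hypothetical excerpt of generated parser code. The comment layout is an
# assumption and may differ between ANTLR versions and targets.
GENERATED = '''\
    // $ANTLR start "expr"
    // Expr.g:12:1: expr : term ( '+' term )* ;
    public final void expr() throws RecognitionException {
        try {
            // Expr.g:12:6: ( term ( '+' term )* )
            // Expr.g:12:8: term ( '+' term )*
        }
    }
    // $ANTLR end "expr"

    // $ANTLR start "term"
    // Expr.g:14:1: term : INT ;
    public final void term() throws RecognitionException {
    }
    // $ANTLR end "term"
'''

# Marker comment that (by assumption) precedes each rule method.
START = re.compile(r'^\s*//\s*\$ANTLR start\s+"?(\w+)"?\s*$')
# Comment of the form "// File.g:line:col: <rule text>".
RULE = re.compile(r'^\s*//\s*\S+:\d+:\d+:\s*(.*)$')


def recover_rules(source):
    """Return the rule text from the comment that follows each start marker."""
    rules = []
    lines = source.splitlines()
    for i, line in enumerate(lines[:-1]):
        if START.match(line):
            m = RULE.match(lines[i + 1])
            if m:
                rules.append(m.group(1).strip())
    return rules


RULES = recover_rules(GENERATED)
print("\n".join(RULES))
```

Only the comment directly after each start marker is kept; the nested
comments inside the method body describe sub-alternatives of the same
rule and would otherwise show up as duplicates.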

If there are actions, you'll need to be able to distinguish them from
generated code. Depending on the formatting of the original grammar
file, you may be able to tell which lines are actions because they have
different indentation.

If there are predicates (shown by "{...}" in the comments), you'll have
to extract the condition from the generated code. Look for 'if'
statements of the form:

    if ( !( predicate ) ) {
        throw new FailedPredicateException(...);
    }
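
The predicate search can be sketched the same way. This minimal Python
example assumes the guard and the throw sit on consecutive lines, as in
the form shown above. The sample input, the name recover_predicates and
the choice to print each condition in "{...}?" predicate syntax are
assumptions for illustration. Generated code that puts other statements
between the 'if' and the 'throw' would need a looser match.

```python
import re

# Hypothetical excerpt of generated code. Only the first and third 'if'
# are predicate checks; the second is ordinary code that must be skipped.
SAMPLE = '''\
        if ( !( allowAssert ) ) {
            throw new FailedPredicateException(input, "stmt", " allowAssert ");
        }
        if ( !( done ) ) {
            count++;
        }
        if ( !( input.LT(1).getText().length() > 3 ) ) {
            throw new FailedPredicateException(input, "name", " input.LT(1).getText().length() > 3 ");
        }
'''

# Matches "if ( !( <condition> ) ) {" and captures the condition. The
# greedy group keeps any parentheses nested inside the condition.
GUARD = re.compile(r'^\s*if\s*\(\s*!\s*\((.*)\)\s*\)\s*\{\s*$')


def recover_predicates(source):
    """Return each guarded condition, rewritten as a grammar predicate."""
    predicates = []
    lines = source.splitlines()
    for i, line in enumerate(lines[:-1]):
        m = GUARD.match(line)
        # Accept the guard only if the next line throws the exception.
        if m and "FailedPredicateException" in lines[i + 1]:
            predicates.append("{" + m.group(1).strip() + "}?")
    return predicates


PREDICATES = recover_predicates(SAMPLE)
print("\n".join(PREDICATES))
```

Each recovered condition still has to be put back by hand at the
position marked by "{...}" in the corresponding rule comment.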

There may also be code that was included from a @lexer::members or
@parser::members section. It will appear just after the list of
"public static final int" token definitions at the start of the class,
possibly also with different indentation.

-- 
David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com


