[antlr-interest] Multiple-target parsers, and extending without overriding
Jim Idle
jimi at temporal-wave.com
Tue Jan 4 12:59:06 PST 2011
> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of Geoffrey Romer
> Sent: Tuesday, January 04, 2011 12:33 PM
> To: antlr-interest
> Cc: theov at google.com
> Subject: [antlr-interest] Multiple-target parsers, and extending
> without overriding
>
> Hi-
>
> I'm new to ANTLR, and I'm trying to evaluate its suitability for a
> project I'm working on. I'd appreciate help with a few questions:
>
> - What is the status of C++ support? The wiki indicates that C++
> support is coming "later in 2008", but this is obviously out of date.
Use the C target and compile its output as C++. Keep custom code and
actions entirely out of the parser, and have it produce AST output.
>
> - One goal of the project is to provide cross-platform parsing and
> unparsing support, i.e. to generate parsers and unparsers in multiple
> target languages (primarily C++ and Java) from a single representation
> of the grammar. As far as I can see, the only way to accomplish this in
> ANTLR is to provide a grammar with AST output type which uses only
> rewrite rules and AST operators (and, for unparsing, a tree grammar
> with template output type), with no target-language code at all.
Actually, I use source code control for this. Start with a base definition
of all your grammars without any actions, then branch to specific targets
and add any target-specific code.
> However, I'm not sure this is feasible; many ANTLR features (e.g.
> attributes and predicates, custom error handling) and techniques (e.g.
> implementing case insensitivity or keywords-as-identifiers) require use
> of a specific target language. Is this approach workable? Are there
> better options I'm overlooking?
You keep all such code outside the grammar and within your application
code. There are few differences if you do that. Also use:
id: ID | KEYWORD1 | KEYWORD2 ... etc;
and not comparison code on ID to work out keywords.
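
As an illustration of keeping such code in the application rather than in
the grammar, here is a minimal Java sketch of case-insensitive matching done
in the character stream. It is not taken from the original message. The class
name CaseFoldingStream is hypothetical, and the class is self-contained so it
does not depend on the ANTLR runtime; it only mirrors the LA(int) lookahead
method of an ANTLR 3 character stream. With the ANTLR 3 Java runtime the same
idea would normally be a subclass of ANTLRStringStream that overrides LA(int).
The lexer grammar then uses upper-case literals only and contains no
target-language code.

```java
// Sketch only: a stand-in for an ANTLR 3 style character stream.
// The class name and its fields are hypothetical, for illustration.
final class CaseFoldingStream {
    static final int EOF = -1;

    private final char[] data; // original input, case preserved
    private int p = 0;         // index of the next character to consume

    CaseFoldingStream(String input) {
        this.data = input.toCharArray();
    }

    // Lookahead used for matching: the i-th upcoming character (1-based),
    // folded to upper case so the lexer sees "select" as "SELECT".
    int LA(int i) {
        int idx = p + i - 1;
        if (i <= 0 || idx >= data.length) {
            return EOF;
        }
        return Character.toUpperCase(data[idx]);
    }

    // Advance past one character, if any remain.
    void consume() {
        if (p < data.length) {
            p++;
        }
    }

    // Original text between start and stop (both inclusive), so token
    // text keeps the case the user typed.
    String substring(int start, int stop) {
        return new String(data, start, stop - start + 1);
    }

    int index() {
        return p;
    }
}
```

Because only the lookahead is folded, the text stored in tokens and AST
nodes keeps its original case, and the same grammar can be reused with an
equivalent stream written for another target.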
>
> - Another goal of the project is to provide a unified parsing framework
> for a family of closely related but distinct languages (specifically,
> SQL dialects).
You and everyone remotely interested in SQL ;)
> We want to be able to express the language grammars in
> terms of an inheritance hierarchy, where each language (other than the
> base) is specified in terms of its differences from the parent
> language. This seems like a natural fit for ANTLR's support for
> composite grammars, but I see two drawbacks with that approach: first,
SQL dialects are not really compatible enough to do that.
> the languages may differ in both lexical structure and syntax; since
> combined grammars cannot inherit from other combined grammars, this
> seems to imply that we'd need to maintain separate, parallel
> hierarchies of lexer and parser grammars, which are combined only in
> the leaves. Is there a cleaner solution? Second, the composition
> mechanism doesn't seem to support extending a grammar with new
> productions; only overriding existing productions.
Yes. Source code control is a better option here, especially if you use
one that is good at branches, such as Perforce.
Jim