[antlr-interest] Multiple-target parsers, and extending without overriding

Geoffrey Romer gromer at google.com
Fri Jan 7 16:06:16 PST 2011


On Tue, Jan 4, 2011 at 12:59 PM, Jim Idle <jimi at temporal-wave.com> wrote:

> > -----Original Message-----
> > From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> > bounces at antlr.org] On Behalf Of Geoffrey Romer
> > Sent: Tuesday, January 04, 2011 12:33 PM
> > To: antlr-interest
> > Cc: theov at google.com
> > Subject: [antlr-interest] Multiple-target parsers, and extending
> > without overriding
> >
> > Hi-
> >
> > I'm new to ANTLR, and I'm trying to evaluate its suitability for a
> > project I'm working on. I'd appreciate help with a few questions:
> >
> > - What is the status of C++ support? The wiki indicates that C++
> > support is coming "later in 2008", but this is obviously out of date.
>
>
> Compile the C output as C++, keep custom code and actions entirely out of
> the parser and produce AST outputs.
>
> >
> > - One goal of the project is to provide cross-platform parsing and
> > unparsing support, i.e. to generate parsers and unparsers in multiple
> > target languages (primarily C++ and Java) from a single representation
> > of the grammar. As far as I can see, the only way to accomplish this in
> > ANTLR is to provide a grammar with AST output type which uses only
> > rewrite rules and AST operators (and, for unparsing, a tree grammar
> > with template output type), with no target-language code at all.
>
> Actually I use source code control for this. Start with a base definition
> of all your grammars without any actions, then branch to specific targets
> and add any target specific code.
>
> > However, I'm not sure this is feasible; many ANTLR features (e.g.
> > attributes and predicates, custom error handling) and techniques (e.g.
> > implementing case insensitivity or keywords-as-identifiers) require use
> > of a specific target language. Is this approach workable? Are there
> > better options I'm overlooking?
>
> You keep all such code outside the grammar and within your application
> code.


But that seems like it's not always possible. So far as I can tell, there's
no way to keep syntactic or semantic predicates outside the grammar (nor
would it be very desirable if you could). Keeping rule arguments and return
values outside the grammar is a contradiction in terms. I can't even find a
way to skip whitespace without writing code in the target language. Am I
missing something?
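
To make the whitespace case concrete, here is a minimal sketch of the
idiom I keep running into in ANTLR 3 examples. The rule name WS and the
exact character set are my own choices for illustration; the point is
only that the part in braces is an embedded action that ends up in the
generated lexer, so it is no longer pure grammar notation.

```antlr
// ANTLR 3 lexer rule (sketch). Everything outside the braces is
// grammar notation; the braces hold an embedded action.
WS  : (' ' | '\t' | '\r' | '\n')+
      { $channel = HIDDEN; }   // route whitespace off the default channel
    ;
```

Without that action the whitespace tokens would be handed to the parser
like any other token.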


> There are few differences if you do that. Also use:
>
> id: ID | KEYWORD1 | KEYWORD2 ... etc;
>
> And not comparison code on ID to work out keywords.
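
If I understand the suggestion, it would look roughly like the sketch
below. The token names SELECT and FROM and the shape of ID are
placeholders I made up; the real grammar would list every keyword that
may also appear as an identifier.

```antlr
// Lexer rules (sketch): keywords are declared before ID so that the
// keyword rule is chosen when both rules match the same text.
SELECT : 'select' ;
FROM   : 'from' ;
ID     : ('a'..'z' | 'A'..'Z' | '_')
         ('a'..'z' | 'A'..'Z' | '0'..'9' | '_')*
       ;

// Parser rule: accept keyword tokens wherever an identifier is allowed.
// No target-language code is involved.
id     : ID | SELECT | FROM ;
```

Other parser rules would then reference id rather than ID wherever a
keyword should also be usable as a name.
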
>
> >
> > - Another goal of the project is to provide a unified parsing framework
> > for a family of closely related but distinct languages (specifically,
> > SQL dialects).
>
> You and everyone remotely interested in SQL ;)
>
> > We want to be able to express the language grammars in
> > terms of an inheritance hierarchy, where each language (other than the
> > base) is specified in terms of its differences from the parent
> > language. This seems like a natural fit for ANTLR's support for
> > composite grammars, but I see two drawbacks with that approach: first,
>
> SQL dialects are not really compatible enough to do that.
>
> > the languages may differ in both lexical structure and syntax; since
> > combined grammars cannot inherit from other combined grammars, this
> > seems to imply that we'd need to maintain separate, parallel
> > hierarchies of lexer and parser grammars, which are combined only in
> > the leaves. Is there a cleaner solution? Second, the composition
> > mechanism doesn't seem to support extending a grammar with new
> > productions; only overriding existing productions.
>
>
> Yes. Source code control is a better option here. Especially if you use
> one that is good at branches such as Perforce.
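
For reference, the parallel hierarchies I had in mind would look
something like the following. This is an untested sketch based on my
reading of the composite grammar documentation; all grammar, rule, and
token names are hypothetical, and each grammar would live in its own
file.

```antlr
// BaseSqlLexer.g: tokens shared by every dialect
lexer grammar BaseSqlLexer;
SELECT : 'select' ;
STAR   : '*' ;

// DialectLexer.g: inherits the base tokens and adds its own
lexer grammar DialectLexer;
import BaseSqlLexer;
LIMIT  : 'limit' ;
INT    : ('0'..'9')+ ;

// BaseSqlParser.g: syntax shared by every dialect
parser grammar BaseSqlParser;
query  : SELECT STAR ;

// DialectParser.g: inherits the base rules; a rule with the same
// name as an imported rule overrides it
parser grammar DialectParser;
options { tokenVocab = DialectLexer; }
import BaseSqlParser;
query  : SELECT STAR (LIMIT INT)? ;
```

Each additional dialect needs one more lexer grammar and one more
parser grammar, which is the duplication I was hoping to avoid.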
>
> Jim

