[antlr-interest] Question about idiom.

Sat Jan 9 18:04:06 PST 2010

2010/1/9 Kay Röpke <kroepke at classdump.org>

>
> On Jan 9, 2010, at 5:32 AM, Michael Richter wrote:
>
> > I keep coming across a pattern in a grammar I'm working on.  This pattern
> > looks something like this:
> >
> >   - A production can be *A*.
> >   - A production can be *B*.
> >   - A production can be *A B*.
> >
> > In the grammar I'm transcribing this from, the notation used is *(A & B)*.
> > Is there some convenient way to code that in ANTLR's EBNF notation?  I keep
> > having to do *(A | B | A B)*.  That isn't all that onerous as-is, I
> > admit, but imagine if A is five tokens long and B is also five tokens long
> > and then imagine this kind of pattern happening about twenty times in the
> > grammar.  Is there a way to concisely do this?
>
> What is the restriction on the parts of the production?
> I.e. what differentiates a valid production from an invalid one?
>

The restriction is exactly as I put it: you can have A (where A is a
multi-token sequence in a specified order), B (where B is a multi-token
sequence in a specified order) or A B.  It *must* be in the order provided,
and A and B are fixed token sequences.

Think of it this way: you're declaring a variable.  You have a token for the
variable, then an optional type specification (A -- multiple tokens) and an
optional initializer (B -- multiple tokens).  Both parts are optional, but
you *must* have at least one and the declarations *must* be in the order of
type then initializer if both are present.  The only way I've found to do it
is (A | B | A B), but this is painful when A and B are more than one token
in length and I've got about 20 of these things in the grammar.  This is
just begging for typos.
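
For illustration only, here is a sketch of one way the (A | B | A B) shape
might be written so that A and B each appear in a single place.  It assumes
ANTLR 3 syntax, and every rule and token name in it (varDecl, typeSpec,
initializer, ID, INT) is a hypothetical stand-in, not something from the
grammar I'm transcribing.  It relies on (A | B | A B) accepting the same
input as (A B? | B):

```antlr
grammar Decl;

// Hypothetical declaration rule: a variable name followed by a type spec,
// an initializer, or both (in that order), but never neither.
varDecl
    : ID ( typeSpec initializer?   // A alone, or A followed by B
         | initializer             // B alone
         ) ';'
    ;

// Stand-in for A: the multi-token type specification.
typeSpec
    : ':' ID
    ;

// Stand-in for B: the multi-token initializer.
initializer
    : '=' INT
    ;

ID  : ('a'..'z' | 'A'..'Z')+ ;
INT : ('0'..'9')+ ;
WS  : (' ' | '\t' | '\r' | '\n')+ { $channel = HIDDEN; } ;
```

Pulling A and B out into their own rules means the long token sequences are
typed once each, and only the short rule names get repeated in the twenty or
so places the pattern occurs.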