[antlr-interest] Grammar Perplexity in v3.0 (More)

Sun Nov 12 09:05:39 PST 2006

Terence,

On Sunday 12 November 2006 08:46, Terence Parr wrote:
> On Nov 12, 2006, at 8:44 AM, Randall R Schulz wrote:
> > plainTerm
> >
> >     :    AtomicWord ( '(' arguments ')' ) ?
> >
> > AtomicWord
> >
> >     :   LowerWord
> >
> >     ;
>
> These rules are a problem.  AtomicWord is unreachable as both rules
> can match its input.  You will never see it in the parser.
> Ter

Oh, I get it. You cannot (meaningfully) have lexical rules like

AtomicWord
    :    LowerWord
    ;

because the right-hand side (or one alternative, anyway) is 
indistinguishable from the rule head. The lexer generator has to pick 
one token type to emit, and in this case LowerWord was chosen, 
essentially "stranding" any parser rule that refers to AtomicWord.
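
If I have this right, there are two ways to restructure the rules so 
that only one token type is ever in play. This is an untested sketch: 
the body of LowerWord below is my assumption, standing in for the real 
definition, and the second variant only works if nothing else in the 
grammar needs LowerWord as a token in its own right.

    // Variant 1: make the wrapper a parser rule (lowercase first
    // letter), so the lexer only ever produces LowerWord tokens.
    atomicWord
        :    LowerWord
        ;

    // Variant 2: keep AtomicWord as the token and demote LowerWord to
    // a fragment, which other lexer rules can reference but which
    // never yields a token itself.
    AtomicWord
        :    LowerWord
        ;

    fragment
    LowerWord    // assumed definition, for illustration only
        :    'a'..'z' ( 'a'..'z' | 'A'..'Z' | '0'..'9' | '_' )*
        ;

In the first variant, plainTerm would refer to atomicWord rather than 
AtomicWord.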

Out of curiosity, why do productions such as these work for syntax 
rules but not for lexical rules?

I've noticed that when I have a lexical rule like this:

Dot: '.';

in addition to literal references to '.' in the grammar, ANTLRworks 
displays the literal '.' instances as the named lexical rule "Dot."
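
A minimal illustration of what I mean (the rule names sentence and 
formula are hypothetical, made up for this example):

    Dot : '.' ;

    sentence
        :    formula '.'    // literal reference; ANTLRworks shows it as Dot
        ;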

Perhaps this identification can be used to collapse lexer rules such as 
my ill-formed ones?

Randy