[antlr-interest] MismatchedTokenException and how to find errors in ANTLRWorks

Guntis Ozols guntiso at latnet.lv
Tue Feb 12 20:37:18 PST 2008


> > Sure it is confusing. I fell into this 'xy' trap, too.
> > If COLON is defined and STAR is defined, then creating
> > another token for ':*' in a *parser* rule is confusing.
> > It should be fixed in antlr.
> > At least, antlr should emit 'anonymous token' warning[s].
> >
> > I prefer literals because of simplicity / readability.
> > The tool needs to be patched, not users.

> If I understand your "problem" I don't think this can be fixed. If you
> type ':*', how can ANTLR know you don't want ':*' as a single token.

Because:
 - there is only one grammar file for both lexer and parser
 - ':*' is mentioned in a parser rule only
 - there are lexer rules in that same file for ':' and '*'
 So the existing lexer rules could be combined automatically to support
 the parser rule.
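
To make that concrete, here is a rough Python sketch of the combination
I have in mind. It is only a toy model, not ANTLR code: LEXER_TOKENS and
split_literal are made-up names, and the greedy longest-prefix split is
my assumption about how such a feature might work.

```python
# Toy model: split a parser literal into tokens the lexer already defines.
# LEXER_TOKENS and split_literal are hypothetical, not part of ANTLR.
LEXER_TOKENS = {":": "COLON", "*": "STAR"}


def split_literal(literal, tokens=LEXER_TOKENS):
    """Return the token names covering `literal`, or None if impossible."""
    names = []
    pos = 0
    while pos < len(literal):
        # Try the longest defined token text first (greedy, no backtracking).
        for text in sorted(tokens, key=len, reverse=True):
            if literal.startswith(text, pos):
                names.append(tokens[text])
                pos += len(text)
                break
        else:
            return None  # no defined token matches at this position
    return names


print(split_literal(":*"))  # ['COLON', 'STAR']
print(split_literal(":="))  # None
```

With such a split, ':*' in a parser rule would mean COLON STAR instead of
a new token. A literal that cannot be split this way would be the case
where the tool has to decide between an error and an anonymous token.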

> It is perfectly valid to have ':', '*' and ':*' tokens, so this is
> what ANTLR produces.
> And emitting a warning just because you use a perfectly valid (if easily
> misusable) construct doesn't seem right.
> Unfortunately I don't think this is a case where ANTLR can protect
> you from yourself.
> Perhaps literals could be allowed in the parser but only to refer to tokens
> defined in the lexer, generating an error otherwise. That should alleviate
> many (if not all) of the issues while still allowing the stylistic choice
> some seem to want.
>
> Tom.

So emitting a warning doesn't seem right, but generating an error is better?
I feel OK with that, although I actually think that allowing literals
in the parser makes some very simple grammars easier to write.
But magically creating tokens by default really is a bad choice.
Perhaps an option to explicitly allow the creation of anonymous tokens
is the best solution.
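
Roughly, the check could behave like the Python sketch below. Again this
is a toy model under my own assumptions: check_parser_literals and the
allow_anonymous flag are hypothetical names, and no such option exists
in ANTLR as far as this thread goes.

```python
# Toy model of the proposed check; allow_anonymous is a hypothetical option.
def check_parser_literals(parser_literals, lexer_literals, allow_anonymous=False):
    """Return (errors, anonymous) for the literals used in parser rules."""
    errors = []
    anonymous = []
    for lit in parser_literals:
        if lit in lexer_literals:
            continue  # refers to a token the lexer already defines
        if allow_anonymous:
            anonymous.append(lit)  # user explicitly opted in
        else:
            errors.append(f"literal {lit!r} has no lexer rule")
    return errors, anonymous


lexer = {":", "*"}
print(check_parser_literals([":", ":*"], lexer))
print(check_parser_literals([":", ":*"], lexer, allow_anonymous=True))
```

By default ':*' is reported as an error; only with the option switched on
does it become an anonymous token.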

Guntis


