[antlr-interest] Failure on OpenJDK on Debian
Sam Barnett-Cormack
s.barnett-cormack at lancaster.ac.uk
Wed Apr 1 05:58:28 PDT 2009
Gavin Lambert wrote:
> At 00:02 2/04/2009, Sam Barnett-Cormack wrote:
> >However, k=*, it'll do whatever lookahead is needed, so there
> >isn't actually an ambiguity with LL(*). It would be silly to
> >left-factor, say:
> >
> >EVERY : 'every';
> >EACH : 'each';
> >EVENT : 'event';
> >
> >Because it just makes it unreadable. ANTLR knows what to do with
> >this, so why left-factor? You'll end up with equivalent decision
> >making, even.
>
> Right, which is why those aren't the problem -- they can always be
> resolved with static lookahead, so they shouldn't take long to figure out.
>
> Where you can get into trouble is when there's a common left prefix
> involving a loop -- such as the INT vs FLOAT vs RANGE case.
But by the sound of it, in Ola's case, at least some of the collisions
are of the sort I describe:
> [java] warning(200): ioke.g:269:5: Decision can match input such as "'#'"
> using multiple alternatives: 1, 2
> [java] As a result, alternative(s) 2 were disabled for that input
Okay, that sounds like it probably ought to be factored, from what
little info we have.
> [java] warning(209): ioke.g:323:1: Multiple token rules can match input
> such as "'#'": T__38, Identifier, StringLiteral, RegexpLiteral, LineComment
> [java]
> [java] As a result, token(s)
> Identifier,StringLiteral,RegexpLiteral,LineComment were disabled for that input
At least T__38 is presumably finite-length and shouldn't be included.
Something sounds odd in the language if identifiers, string literals,
regex literals (and they are separate?) and one-line comments can all
start with a hash...
> [java] warning(209): ioke.g:202:1: Multiple token rules can match input
> such as "'['": T__34, Identifier
> [java]
> [java] As a result, token(s) Identifier were disabled for that input
Ditto above
> [java] warning(209): ioke.g:202:1: Multiple token rules can match input
> such as "'{'": T__36, Identifier
> [java]
> [java] As a result, token(s) Identifier were disabled for that input
And again...
Sounds like something may well be a bit wrong with the grammar (I would
have to look at it to judge better), but it also sounds like something
is wrong with the ambiguity detection (or it's falling back to k=1
without saying so).
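
For what it's worth, here is a rough Python sketch of the static-lookahead
point. It is not ANTLR's actual analysis, and the function names are made up
for illustration. It finds the smallest fixed k at which the three keyword
literals quoted above can be told apart, and shows which of them collide if
only k characters are examined:

```python
def min_fixed_lookahead(literals):
    """Return the smallest k at which the first k characters of every
    literal differ, or None if no such k exists (duplicate literals)."""
    # Hypothetical helper: models fixed lookahead as plain prefix comparison.
    if not literals or len(set(literals)) != len(literals):
        return None
    longest = max(len(s) for s in literals)
    for k in range(1, longest + 1):
        prefixes = {s[:k] for s in literals}
        if len(prefixes) == len(literals):
            return k
    return None


def collisions_at(literals, k):
    """Group the literals that share the same first k characters."""
    groups = {}
    for s in literals:
        groups.setdefault(s[:k], []).append(s)
    # Keep only prefixes that more than one literal starts with.
    return {prefix: group for prefix, group in groups.items() if len(group) > 1}


keywords = ["every", "each", "event"]
print(min_fixed_lookahead(keywords))   # smallest k that separates all three
print(collisions_at(keywords, 1))      # what a k=1 decision would see
```

For those three literals the sketch reports k=4, while at k=1 all three
collide on 'e'. A prefix that contains a loop, like the digits in front of
INT and FLOAT, has no such fixed k, which is the case Gavin describes.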
Sam