[antlr-interest] Matching tokens only at certain places
Emond Papegaaij
e.papegaaij at student.utwente.nl
Mon Jun 19 08:08:01 PDT 2006
On Monday 19 June 2006 16:13, Emond Papegaaij wrote:
> On Monday 19 June 2006 15:32, Emond Papegaaij wrote:
> This is what the resulting s0 DFA state in the generated code looks like:
>
> public DFA.State transition(IntStream input) throws RecognitionException {
>     int LA3_0 = input.LA(1);
>     if ( LA3_0=='{' && (sig)) {return s1;}
>     if ( LA3_0=='}' && (sig)) {return s2;}
>     if ( LA3_0=='i' && (sig)) {return s3;}
>     if ( LA3_0==';' ) {return s4;}
>     if ( (..)||(..)||(..) && (sig)) {return s5;}
>     if ( (..)||(..)||LA3_0==' ' && (sig)) {return s6;}
>     if ( (..)||..||(..)||(..)||(..)||(..)||..||(..) && (sig)) {return s7;}
>     NoViableAltException nvae =
>         new NoViableAltException("", 3, 0, input);
>     throw nvae;
> }
>
> It is clear that this disables all paths except "LA3_0==';'" when 'sig' is
> false. As a result the lexer will only accept ';' tokens as long as 'sig'
> is false. Am I using the {..}?=> predicates incorrectly?
Well, replying to myself again.
I've managed to get my example to parse correctly, but I had to perform some
weird tricks. First I had to re-enable the DFA paths that would accept input
that could have been lexed as METHOD_SIG_ACTION. That meant I had to
put '{!sig}?=>' predicates in all lexical rules.
However, the generated lexer still didn't work. Notice how ANTLR puts
the semantic predicates in the if statements:
if ( compareChar ('||' compareChar)* '&&' predicate)
In Java '&&' takes precedence over '||'. This results in the predicate only
affecting the last character comparison; all earlier comparisons match
regardless of the predicate. I believe this is a bug in the generated code.
Wrapping all character comparisons in one pair of parentheses in the
generated code made my grammar work.
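
To make the precedence problem concrete, here is a minimal standalone sketch.
It is not ANTLR output: the class name, the method names and the characters
'a', 'b' and 'c' are made up for illustration, and 'sig' is passed as a
parameter instead of being a lexer field. It only compares the two shapes of
the condition.

```java
public class PredicatePrecedence {
    // Shape as emitted by the generator. '&&' binds tighter than '||', so
    // this parses as (c == 'a') || (c == 'b') || ((c == 'c') && sig).
    static boolean ungrouped(char c, boolean sig) {
        return c == 'a' || c == 'b' || c == 'c' && sig;
    }

    // Hand-patched shape: the predicate gates the whole character set.
    static boolean grouped(char c, boolean sig) {
        return (c == 'a' || c == 'b' || c == 'c') && sig;
    }

    public static void main(String[] args) {
        boolean sig = false;
        for (char c : new char[] {'a', 'b', 'c'}) {
            System.out.println(c + ": ungrouped=" + ungrouped(c, sig)
                    + " grouped=" + grouped(c, sig));
        }
    }
}
```

With 'sig' false, the ungrouped form still matches 'a' and 'b' and only
rejects 'c', while the grouped form rejects all three.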
To come back to the first problem. Is it really preferable to let the gated
semantic predicates disable all paths that /could/ lead to a certain
alternative? To me it seems more logical to let the predicates remove all
paths that /will/ lead to a certain alternative. But I might be missing
something obvious.
Best regards,
Emond Papegaaij