[antlr-interest] Matching tokens only at certain places
Terence Parr
parrt at cs.usfca.edu
Mon Jun 19 11:00:22 PDT 2006
On Jun 19, 2006, at 8:08 AM, Emond Papegaaij wrote:
> On Monday 19 June 2006 16:13, Emond Papegaaij wrote:
>> On Monday 19 June 2006 15:32, Emond Papegaaij wrote:
>> This is what the resulting s0 DFA state in the generated code
>> looks like:
>>
>> public DFA.State transition(IntStream input) throws
>> RecognitionException {
>> int LA3_0 = input.LA(1);
>> if ( LA3_0=='{' && (sig)) {return s1;}
>> if ( LA3_0=='}' && (sig)) {return s2;}
>> if ( LA3_0=='i' && (sig)) {return s3;}
>> if ( LA3_0==';' ) {return s4;}
>> if ( (..)||(..)||(..) && (sig)) {return s5;}
>> if ( (..)||(..)||LA3_0==' ' && (sig)) {return s6;}
>> if ( (..)||..||(..)||(..)||(..)||(..)||..||(..) && (sig))
>> {return s7;}
>> NoViableAltException nvae =
>> new NoViableAltException("", 3, 0, input);
>> throw nvae;
>> }
>>
>> It is clear that this disables all paths except "LA3_0==';'" when
>> 'sig' is false. As a result the lexer will only accept ';' tokens as
>> long as 'sig' is false. Am I using the {..}?=> predicates incorrectly?
>
> Well, replying to myself again.
>
> I've managed to get my example to parse correctly, but I had to
> perform some weird tricks. First I had to re-enable the DFA paths
> that would accept input that could have been lexed as
> METHOD_SIG_ACTION. That meant I had to put '{!sig}?=>' predicates in
> all lexical rules.
That makes sense, I think. ANTLR can guess !sig, but only if there is
one other unpredicated path. You must tell ANTLR how to gate all
ambiguous paths if you use a predicate. Note that ~';' is pretty
much anything and will therefore conflict with every other rule.
> However when generating the lexer, it still didn't work. Notice how
> ANTLR puts
> the semantic predicates in the if statements:
> if ( compareChar ('||' compareChar)* '&&' predicate)
> In Java '&&' takes precedence over '||'. This results in the predicate
> only affecting the last character comparison. I believe this is a bug
> in the generated code. Grouping all character comparisons together in
> the generated code made my grammar work.
Doh! Consider me a moron. Sorry about that...adding to bug fix list
(well, will take 3 seconds to fix). Go into templates/Java/Java.stg
and change
dfaEdge(labelExpr, targetState, predicates) ::= <<
if ( <labelExpr> <if(predicates)>&& (<predicates>)<endif>) {
<targetState>
}
>>
to have (...) around <labelExpr> and same for
cyclicDFAEdge(labelExpr, targetStateNumber, edgeNumber, predicates) ::= <<
if ( <labelExpr> <if(predicates)>&& (<predicates>)<endif>) {s = <targetStateNumber>;}<\n>
>>
Sorry about that.
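To make the precedence problem concrete, here is a minimal Java sketch.
It is not ANTLR output: the class name, the method names, and the
characters 'a', 'b', 'c' are placeholders standing in for the elided
comparisons in the generated code.

```java
public class PrecedenceDemo {
    // Shape of the edge test before the fix. '&&' binds tighter than '||',
    // so this parses as: la=='a' || la=='b' || (la=='c' && sig).
    // The predicate therefore guards only the last comparison.
    static boolean ungrouped(int la, boolean sig) {
        return la == 'a' || la == 'b' || la == 'c' && sig;
    }

    // Shape after the fix: the whole label expression is parenthesized,
    // so the predicate guards every comparison.
    static boolean grouped(int la, boolean sig) {
        return (la == 'a' || la == 'b' || la == 'c') && sig;
    }

    public static void main(String[] args) {
        // With sig == false the edge should be invisible for every character.
        System.out.println(ungrouped('a', false)); // true: the edge leaks through
        System.out.println(grouped('a', false));   // false: the edge is gated
    }
}
```

With the predicate false, only the grouped form turns the edge off for
all three characters, which is what putting (...) around <labelExpr>
in the templates should achieve.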
> To come back to the first problem. Is it really preferable to let the
> gated semantic predicates disable all paths that /could/ lead to a
> certain alternative? To me it seems more logical to let the predicates
> remove all paths that /will/ lead to a certain alternative. But I
> might be missing something obvious.
The gated predicate gates all sequences associated with that token
in/out. It dynamically alters the prediction DFA to not see certain
paths. This lets you turn off various tokens when a predicate is
false. By default all tokens are visible, as if each had a {true}?=>
gated predicate.
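As a rough illustration of gating (not actual generated code), the
sketch below hand-writes one DFA state in which each edge is visible
only while its predicate holds. The class name, the state numbers, and
the choice of '{' and ';' as inputs are assumptions made for the
example; 'sig' plays the role of the flag from the grammar discussed in
this thread.

```java
public class GatedEdgeDemo {
    // Returns a hypothetical target state number, or -1 when no edge is
    // visible for this input (where a real lexer would report an error).
    static int transition(int la, boolean sig) {
        // Edge of a token gated by {sig}?=> : only visible while sig is true.
        if ((la == '{') && (sig)) { return 1; }
        // Edge of an ungated token: always visible.
        if ((la == ';')) { return 4; }
        // Edge of a token gated by {!sig}?=> : visible while sig is false.
        if ((la == '{') && (!sig)) { return 8; }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(transition('{', true));  // 1
        System.out.println(transition('{', false)); // 8
        System.out.println(transition(';', false)); // 4
    }
}
```

Without the third edge, '{' would have no viable path while sig is
false, which matches the behaviour reported at the start of the thread.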
Ter