[antlr-interest] Matching tokens only at certain places

Terence Parr parrt at cs.usfca.edu
Mon Jun 19 11:00:22 PDT 2006


On Jun 19, 2006, at 8:08 AM, Emond Papegaaij wrote:

> On Monday 19 June 2006 16:13, Emond Papegaaij wrote:
>> On Monday 19 June 2006 15:32, Emond Papegaaij wrote:
>> This is what the resulting s0 DFA state in the generated code looks like:
>>
>> public DFA.State transition(IntStream input) throws RecognitionException {
>>   int LA3_0 = input.LA(1);
>>   if ( LA3_0=='{' && (sig)) {return s1;}
>>   if ( LA3_0=='}' && (sig)) {return s2;}
>>   if ( LA3_0=='i' && (sig)) {return s3;}
>>   if ( LA3_0==';' ) {return s4;}
>>   if ( (..)||(..)||(..) && (sig)) {return s5;}
>>   if ( (..)||(..)||LA3_0==' ' && (sig)) {return s6;}
>>   if ( (..)||..||(..)||(..)||(..)||(..)||..||(..) && (sig)) {return s7;}
>>   NoViableAltException nvae =
>>     new NoViableAltException("", 3, 0, input);
>>   throw nvae;
>> }
>>
>> It is clear that this disables all paths except "LA3_0==';'" when 'sig' is
>> false. As a result the lexer will only accept ';' tokens as long as 'sig'
>> is false. Am I using the {..}?=> predicates incorrectly?
>
> Well, replying to myself again.
>
> I've managed to get my example to parse correctly, but I had to perform some
> weird tricks. First I had to re-enable the DFA paths that would accept input
> that could have been lexed as METHOD_SIG_ACTION. That meant I had to
> put '{!sig}?=>' predicates in all lexical rules.

That makes sense, I think.  ANTLR can infer !sig, but only if there is
1 other unpredicated path.  If you use a predicate, you must tell ANTLR
how to gate all ambiguous paths.  Note that ~';' matches pretty much
anything and will therefore conflict with every other rule.

> However when generating the lexer, it still didn't work. Notice how ANTLR puts
> the semantic predicates in the if statements:
>  if ( compareChar ('||' compareChar)* '&&' predicate)
> In Java '&&' takes precedence over '||'. This results in the predicate only
> affecting the last character comparison. I believe this is a bug in the
> generated code. Grouping all character comparisons together in the generated
> code made my grammar work.

Doh!  Consider me a moron.  Sorry about that...adding to bug fix list  
(well, will take 3 seconds to fix).  Go into templates/Java/Java.stg  
and change

dfaEdge(labelExpr, targetState, predicates) ::= <<
if ( <labelExpr> <if(predicates)>&& (<predicates>)<endif>) {
     <targetState>
}
 >>

to have (...) around <labelExpr> and same for

cyclicDFAEdge(labelExpr, targetStateNumber, edgeNumber, predicates) ::= <<
if ( <labelExpr> <if(predicates)>&& (<predicates>)<endif>) {s = <targetStateNumber>;}<\n>
 >>

Sorry about that.
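
To illustrate the precedence problem, here is a minimal Java sketch.  The
class and method names are hypothetical and are not taken from ANTLR's
generated code; the two methods only mirror the shape of an edge test
before and after wrapping the character comparisons in parentheses.

```java
// Hypothetical demo class; not ANTLR-generated code.
class PrecedenceDemo {
    // Shape of the edge test without grouping: c1 || c2 && pred.
    // Java parses this as c1 || (c2 && pred), so the predicate
    // guards only the last character comparison.
    static boolean ungrouped(int la, boolean sig) {
        return la == '{' || la == '}' && sig;
    }

    // Shape of the edge test with the comparisons grouped:
    // (c1 || c2) && pred. The predicate now guards every comparison.
    static boolean grouped(int la, boolean sig) {
        return (la == '{' || la == '}') && sig;
    }

    public static void main(String[] args) {
        // With sig == false the ungrouped form still accepts '{'.
        System.out.println(ungrouped('{', false)); // true
        System.out.println(grouped('{', false));   // false
    }
}
```

With the grouped form, a false predicate disables the whole edge, which is
what the parenthesized <labelExpr> is meant to achieve.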

> To come back to the first problem. Is it really preferable to let the gated
> semantic predicates disable all paths that /could/ lead to a certain
> alternative? To me it seems more logical to let the predicates remove all
> paths that /will/ lead to a certain alternative. But I might be missing
> something obvious.

The gated predicate gates all sequences associated with that token
in/out.  It dynamically alters the prediction DFA so that it does not
see certain paths.  This lets you turn off various tokens when a
predicate is false.  By default all tokens are visible, as if each had a
{true}?=> gated predicate.
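
As a rough picture of that gating, here is a hand-written Java sketch.  It
is not what ANTLR generates; the class name, the predict method, and the
BRACE and SEMI token names are assumptions made up for illustration.  Each
edge is visible only while its gate is true, and an edge without a gate is
always visible.

```java
// Hypothetical hand-written sketch; not ANTLR-generated code.
class GatedEdgeDemo {
    // Returns the name of the predicted token, or null if no edge is
    // visible for this input (the generated code would throw a
    // NoViableAltException in that case).
    static String predict(int la, boolean sig) {
        // Edge gated by sig: visible only while sig is true.
        if ((la == '{' || la == '}') && sig) {
            return "METHOD_SIG_ACTION";
        }
        // Edge gated by !sig: visible only while sig is false.
        if ((la == '{' || la == '}') && !sig) {
            return "BRACE";
        }
        // Ungated edge: always visible.
        if (la == ';') {
            return "SEMI";
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(predict('{', true));
        System.out.println(predict('{', false));
        System.out.println(predict(';', false));
    }
}
```

Flipping sig swaps which of the two brace edges the decision can see, while
the ';' edge is unaffected.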

Ter