[antlr-interest] Trouble with ANTLR 3 grammar

Mon Jul 3 16:39:58 PDT 2006

On Jul 1, 2006, at 5:26 AM, Emond Papegaaij wrote:

> On Friday 30 June 2006 20:51, Terence Parr wrote:
>> On Jun 30, 2006, at 11:43 AM, Emond Papegaaij wrote:
>>> I'm printing the tokens (for debugging) before parsing with a simple
>>> while loop. Maybe this is causing the problem? I've included the
>>> Main.java that, together with the grammar included in my first mail
>>> in this thread, triggers the problem.
>>
>> That should be ok; you can try tokens.toString() but that only prints
>> the text by default.
>>
>> Hmm...so when you print that stuff, the channel shows 99 on the
>> whitespace right before going to the parser?
>
> The lexer does hit 'channel=99', but only after the token is already
> emitted. Printing the channel inside the loop in Main shows '0'. For
> every WS token mWS is called twice. It seems that on the first call
> the token is emitted, and on the second call the channel is set. I
> can't explain why. Here is some output with println statements added
> at the start of the method, at the 'channel=99' statement and at the
> 'emit' statement:
>
> inWS
> emit(11,1,9,0,9,9)

Ok, ANTLR will match WS in guess mode (backtracking) and then do it  
again with feeling.  It only emits at the outermost token rule  
invoked (in case INT invokes DIGIT) and only if you have not emitted  
a token yourself:

             if ( token==null && ruleNestingLevel==1 ) {
                 emit(type,line,charPosition,channel,start,getCharIndex()-1);
             }

So, somehow during backtracking you are setting token (via emit()
maybe).  Do you have an emit or token assignment inside an init action?
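
To make the suspicion concrete, here is a minimal, self-contained sketch of how an early emit could produce the channel 0 you are seeing. It is not the ANTLR runtime or your generated lexer: the class EmitGuardSketch, the two-argument emit(), the run() helper and the emitInInitAction flag are all hypothetical, and only the guard condition and the token type 11 come from this thread.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated lexer; NOT ANTLR's actual classes.
class EmitGuardSketch {
    static final int WS = 11;      // token type as seen in the emit(11,...) output
    static final int HIDDEN = 99;  // the channel the WS rule is supposed to set

    final List<String> emitted = new ArrayList<>();
    String token = null;           // non-null once something has been emitted
    int ruleNestingLevel = 0;

    // Simplified emit: records only type and channel (assumed signature).
    void emit(int type, int channel) {
        token = "type=" + type + ",channel=" + channel;
        emitted.add(token);
    }

    // Sketch of a WS token rule. When emitInInitAction is true, an action at
    // the start of the rule emits before the channel has been changed.
    void mWS(boolean emitInInitAction) {
        ruleNestingLevel++;
        int channel = 0;
        if (emitInInitAction) {
            emit(WS, channel);     // premature emit: channel is still 0
        }
        // ... whitespace characters would be matched here ...
        channel = HIDDEN;          // the 'channel=99' action
        // Same guard as in the generated code: skipped if token is already set.
        if (token == null && ruleNestingLevel == 1) {
            emit(WS, channel);
        }
        ruleNestingLevel--;
    }

    static List<String> run(boolean emitInInitAction) {
        EmitGuardSketch lexer = new EmitGuardSketch();
        lexer.mWS(emitInInitAction);
        return lexer.emitted;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // guarded emit runs after channel=99
        System.out.println(run(true));  // early emit wins; guard is skipped
    }
}
```

In the second run the token goes out with channel 0 and the guarded emit never fires, which would match what you print in Main. Whether that is really what happens in your grammar is only a guess until we see where token gets set.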

This all works perfectly in my fuzzy java example.

Ter