[antlr-interest] newbie lookahead question

Fri Apr 21 22:05:59 PDT 2006

John,
Thanks for the help. What you say sounds clear, and I read the 
documentation on testLiterals=true. I thought, aha, that is the key: 
just set testLiterals to true and all will be fine.

However, when I try it in a grammar, it doesn't seem to work. Below is 
a test grammar I made up. When I give the parser the string 
"activate on", it fails with the message: Parse error: line 1:1: 
unexpected char: 'a'.

When I uncomment the three rules (ACTIVATE, ON and OFF), it parses fine 
and gives me a tree with the ACTIVATE token as the root node and the 
ON token as its one child, which is exactly what I wanted.
(In this case I am surprised that the tokens section does not create an 
ambiguity with those lexer rules.)

I checked the generated lexer code, and the hash table for looking up 
the three literals is being generated. However, the lexer stubbornly 
refuses to output the ACTIVATE token when the literals are defined only 
in the tokens section.

I'm probably doing something really stupid here, but I'm quite puzzled.

Thanks for your help,
Lance

class TestLexer extends Lexer;
options
{
    testLiterals = true;
    k=2;
}

tokens{ ACTIVATE="activate"; ON="on";OFF="off";}
//ACTIVATE: "activate";
//ON: "on";
//OFF: "off";
//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// Whitespace -- ignored
//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
WS    :    (    ' '
        |    '\t'
        |    '\f'
            // handle newlines
        |    (    options {generateAmbigWarnings=false;}
            :    "\r\n"  // Windows
            |    '\r'    // Macintosh
            |    '\n'    // Unix
            )
            { newline(); }
        )+
        { _ttype = Token.SKIP; }
    ;

class TestParser extends Parser;
options
{
        buildAST=true;
        k = 1;
        defaultErrorHandler=false;
}

statement: ACTIVATE^ (ON | OFF);

John B. Brodie wrote:

>Sir :-
>
>>Well maybe not. It seems I was wrong about the tokens section. It 
>>doesn't specify lexer rules so the tokens aren't detected and put into 
>>the token stream for the parser. Ah well. It seemed like a good idea at 
>>the time.
>>
>>Lance
>
>You are not wrong about the tokens{...} lexer section.
>
>The tokens{...} section operates in concert with the testLiterals=true
>option. Please review the antlr documentation for testLiterals.
>
>You can set options{ testLiterals=true; } either at the global level, so
>that all rules in your lexer consult the map generated from tokens{...},
>or on only those specific lexer rules that are pertinent (I prefer the
>latter).
>
>And, oh by the way, that stuff between the "s in the tokens{...} section *IS*
>a lexer rule --- it means:
>
>         'match this explicit string literal when testLiterals is true'
>
>
>(now if we only had a way to specify synonyms in the tokens{...} section,
>e.g. tokens{ TRUE="true","YES"; FALSE="false","NO"; } then life really would
>be easy ;-)
>
>Hope this helps...
>   -jbb
>
>  
>
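
The per-rule form of testLiterals described in the quoted reply could look
roughly like the sketch below. It is only an illustration under stated
assumptions: the ID rule and the SketchLexer name are hypothetical and do not
appear in the test grammar above. As far as the ANTLR 2 documentation
describes it, the literals table is consulted only after some lexer rule has
matched text, so a grammar with no rule that can match a run of letters may
never reach the lookup at all.

```
class SketchLexer extends Lexer;
options
{
    testLiterals = false;   // off globally; enabled only on the rule below
    k = 2;
}

tokens { ACTIVATE="activate"; ON="on"; OFF="off"; }

// Hypothetical rule, not part of the original grammar: it matches a run
// of letters, and the matched text is then looked up in the literals
// table. On a hit the token type becomes ACTIVATE, ON or OFF; otherwise
// it stays ID.
ID  options { testLiterals = true; }
    :   ('a'..'z' | 'A'..'Z')+
    ;
```

With a rule of this kind in place, the WS rule and the parser rule
statement: ACTIVATE^ (ON | OFF); from the test grammar would be expected to
work unchanged, though this sketch has not been run against the original
setup.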