[antlr-interest] newbie lookahead question
Lance Gutteridge
lance at thinkingworks.com
Fri Apr 21 22:05:59 PDT 2006
John,
Thanks for the help. What you say sounds clear, and I read the
documentation on testLiterals=true. I thought, aha, that is the key:
just set testLiterals to true and all will be fine.
However, when I try it in a grammar it doesn't seem to work. Below is
a test grammar I made up. When I give the parser the string
"activate on" it fails with the message "Parse error: line 1:1:
unexpected char: 'a'".
When I uncomment the three rules (ACTIVATE, ON and OFF) it parses fine
and gives me a tree with the ACTIVATE token as the root node and the
token ON as its one child, which is exactly what I wanted.
(In this case I am surprised that the tokens section does not create an
ambiguity with those lexer rules.)
I checked the generated lexer code, and the hash table for looking up
the three literals is being generated. However, the lexer stubbornly
refuses to output the token ACTIVATE when the literals are defined only
in the tokens section.
I'm probably doing something really stupid here, but I'm quite puzzled.
Thanks for your help,
Lance
class TestLexer extends Lexer;
options
{
    testLiterals = true;
    k = 2;
}
tokens{ ACTIVATE="activate"; ON="on";OFF="off";}
//ACTIVATE: "activate";
//ON: "on";
//OFF: "off";
//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
// Whitespace -- ignored
//++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
WS  :   (   ' '
        |   '\t'
        |   '\f'
            // handle newlines
        |   (   options { generateAmbigWarnings=false; }
            :   "\r\n"  // Windows
            |   '\r'    // Macintosh
            |   '\n'    // Unix
            )
            { newline(); }
        )+
        { _ttype = Token.SKIP; }
    ;
class TestParser extends Parser;
options
{
    buildAST = true;
    k = 1;
    defaultErrorHandler = false;
}
statement: ACTIVATE^ (ON | OFF);
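
A possible explanation, offered as an assumption rather than something
confirmed in this thread: in ANTLR 2 the tokens{...} literals are only
looked up after some lexer rule has already matched the text, and the
lexer above has no rule that matches letters, so 'a' is rejected before
any lookup can happen. Below is a minimal sketch of a lexer that gives
the literals table something to test. The ID rule and its letters-only
character set are hypothetical and not part of the original grammar.

```
class TestLexer extends Lexer;
options
{
    testLiterals = false;   // test literals only in the rules that ask for it
    k = 2;
}
tokens { ACTIVATE="activate"; ON="on"; OFF="off"; }

// Hypothetical rule (not in the original grammar): matches a run of
// letters, then checks the matched text against the tokens{...} literals.
// Text found in the table takes the literal's token type; anything else
// stays an ID.
ID
options { testLiterals = true; }
    :   ( 'a'..'z' | 'A'..'Z' )+
    ;

// The WS rule from the original grammar goes here, unchanged.
```

With a rule like this in place, the input "activate on" should reach
the parser as ACTIVATE followed by ON. Any other word would arrive as ID,
which the statement rule does not accept.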
John B. Brodie wrote:
>Sir :-
>
>
>
>>Well maybe not. It seems I was wrong about the tokens section. It
>>doesn't specify lexer rules so the tokens aren't detected and put into
>>the token stream for the parser. Ah well. It seemed like a good idea at
>>the time.
>>
>>Lance
>>
>>
>
>You are not wrong about the tokens{...} lexer section.
>
>The tokens{...} section operates in concert with the testLiterals=true
>option. Please review the antlr documentation for testLiterals.
>
>You are able to set the options{ testLiterals=true; } either at the global
>level so that all rules in your lexer inspect the tokens{...} generated map
>or you can set the options{ testLiterals=true; } on only those specific lexer
>rules that are pertinent (i prefer the latter).
>
>And, oh by the way, that stuff between the "s in the tokens{...} section *IS*
>a lexer rule --- it means:
>
> 'match this explicit string literal when testLiterals is true'
>
>
>(now if we only had a way to specify synonyms in the tokens{...} section,
>e.g. tokens{ TRUE="true","YES"; FALSE="false","NO"; } then life really would
>be easy ;-)
>
>Hope this helps...
> -jbb
>
>
>