[antlr-interest] tokenVocab option leads to incomplete DFA in lexer
Erik Kratochvíl
discontinuum at gmail.com
Fri Jan 25 05:29:32 PST 2008
Finally, I have found a way to fool the grammar generator, although this page
http://www.antlr.org/wiki/display/ANTLR3/Migrating+from+ANTLR+2+to+ANTLR+3
states that it is not possible to assign token types to certain literals
("Apparently, 'testLiterals' on tokens is no longer allowed (it is now
unnecessary).")
and a comment in org.antlr.tool.AssignTokenTypesWalker.java claims
// if lexer, don't allow aliasing in tokens section
If you create a Basic.tokens file that contains these lines
DEFINE=101
DECLARE=102
and then, in E.g (in the lexer grammar), you create a dedicated lexer
rule for each selected literal, like this
DECLARE: 'declare';
DEFINE: 'define';
everything will work :)
The whole grammar:
grammar E;

options {
    tokenVocab = Basic;
    output = AST;
    ASTLabelType = CommonTree;
}

program : ( statement )+ ;

statement
    : DEFINE ID '=' INT ';'
    | DECLARE ID ';'
    ;

DECLARE : 'declare';
DEFINE : 'define';
ID : ('a'..'z'|'A'..'Z')+ ;
INT : '0'..'9'+ ;
WS : ( '\n' | '\r' | ' ' | '\t' )+ { $channel = HIDDEN; } ;
The only drawback is that you have to use DECLARE instead of 'declare'
in the parser grammar (but this may also be perceived as an advantage,
because if you misspell DECLARE, antlr.Tool will detect it).
The generated E.tokens file contains the correct values:
DEFINE=101
INT=104
WS=105
DECLARE=102
ID=103
'='=106
';'=107
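
If you want to check this mechanically rather than by eye, something like the following might do. It is only a sketch and not part of ANTLR: the class TokensFileCheck and its methods parse and preserves are names made up for this example. It assumes every non-empty line of a .tokens file has the form name=type, and it splits at the last '=' so that literal entries such as '='=106 are read correctly.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not an ANTLR class): compares two .tokens files.
final class TokensFileCheck {

    // Parse the text of a .tokens file into a name -> token type map.
    // Assumption: each non-empty line looks like NAME=123 or 'literal'=123.
    static Map<String, Integer> parse(String text) {
        Map<String, Integer> types = new LinkedHashMap<>();
        for (String line : text.split("\\r?\\n")) {
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            // Split at the LAST '=' so that the literal '=' keeps its name.
            int eq = line.lastIndexOf('=');
            if (eq <= 0) {
                continue; // not a name=type line, skip it
            }
            String name = line.substring(0, eq);
            int type = Integer.parseInt(line.substring(eq + 1).trim());
            types.put(name, type);
        }
        return types;
    }

    // True if every token in 'imported' has the same type in 'generated'.
    static boolean preserves(Map<String, Integer> imported,
                             Map<String, Integer> generated) {
        for (Map.Entry<String, Integer> e : imported.entrySet()) {
            if (!e.getValue().equals(generated.get(e.getKey()))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Contents of Basic.tokens and E.tokens as quoted in this post.
        String basic = "DEFINE=101\nDECLARE=102\n";
        String generated = "DEFINE=101\nINT=104\nWS=105\nDECLARE=102\n"
                + "ID=103\n'='=106\n';'=107\n";
        System.out.println(preserves(parse(basic), parse(generated)));
    }
}
```

For the two files quoted in this post, the check reports that the imported types were preserved. In a real build you would read the file contents from disk instead of hard-coding them.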
--
Erik Kratochvíl