[antlr-interest] ENHANCEMENT - Have "lexer grammar" generate recognition for string literals in tokenVocab

Austin Hastings Austin_Hastings at Yahoo.com
Tue Oct 9 06:31:29 PDT 2007


When more than one parser is in use, they need to share a common 
(initial) vocabulary. This can come from the source lexer via the 
tokenVocab option.

But splitting the parser and lexer means that *all* parser tokens must be 
emitted by the lexer, which rules out using string literals like 'while' 
or 'if' in the parser. The parser assigns them token numbers as part of 
its vocabulary, but the lexer has no rules that emit those token numbers.

I would like the "lexer grammar" mode of ANTLR to be extended: besides 
accepting a tokenVocab (which it already does), it should generate the 
appropriate token rules for the string literals it finds in that tokenVocab.

This way, if my parser.g file contains:

whileStmt: 'while' '(' expr ')' statement ;

the parser will generate a parser.tokens file like:

'while' = 1
'(' = 2
')' = 3
...

and the lexer, via tokenVocab=parser.tokens, could then generate

Temp01: 'while' ;
Temp02: '(' ;
Temp03: ')' ;
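
To make the requested transformation concrete, here is a rough sketch in Python of the text-level step only: read the string-literal entries from a .tokens listing and print one lexer rule per literal. It is an illustration, not how ANTLR works internally. The function names and the TempNN naming scheme are hypothetical, the input format is assumed to be the `'literal' = number` lines shown above, and the sketch does not address how the lexer's token numbers would be kept in sync with the parser's.

```python
import re

# Matches lines such as  'while' = 1  or  'while'=1
# (a single-quoted literal, then "=", then a token number).
LITERAL_LINE = re.compile(r"^\s*('(?:[^'\\]|\\.)*')\s*=\s*(\d+)\s*$")


def lexer_rules_from_tokens(tokens_text):
    """Return one lexer rule string per string literal in a .tokens listing."""
    rules = []
    for line in tokens_text.splitlines():
        match = LITERAL_LINE.match(line)
        if match is None:
            # Named tokens (e.g. ID=4), blank lines, and "..." are skipped.
            continue
        literal = match.group(1)
        token_number = int(match.group(2))
        # Hypothetical naming scheme, mirroring Temp01, Temp02, ... above.
        rules.append("Temp%02d: %s ;" % (token_number, literal))
    return rules


def example_rules():
    """Run the sketch on the three entries from the example above."""
    sample = "'while' = 1\n'(' = 2\n')' = 3\n"
    return lexer_rules_from_tokens(sample)


if __name__ == "__main__":
    for rule in example_rules():
        print(rule)
```

Run on the three entries above, this prints one TempNN rule for each of 'while', '(' and ')'.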


Thanks,

=Austin

