[antlr-interest] operator inside a string

Gavin Lambert antlr at mirality.co.nz
Tue Jul 7 06:38:48 PDT 2009


At 22:03 7/07/2009, Bob Night wrote:
>I have the following grammar. Most of the time it works fine. The 
>problem begins when I try to parse a string like this one:
>
>"test_input OPERATOR another_test_input"
>
>The operator inside quotes is still recognized as an OPERATOR 
>token, while I'd like it to be recognized as a WORD token that is 
>part of the quote.
>
>grammar test;
>start_rule    :    expr (OPERATOR expr)* EOF;
>expr    :  quote | WORD;
>quote   :    '"' WORD+ '"';
>OPERATOR : 'OPERATOR';
>WORD     :    ('a'..'z'|'A'..'Z')+;

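The lexer tokenizes the input without knowing which parser rule is 
active, so the text OPERATOR always becomes an OPERATOR token, even 
between quotes. Unless you specifically want the parser to see the 
individual WORDs inside a string, you should probably turn quote 
into a lexer rule like so: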

   QUOTE : '"' (~'"')* '"' ;

This will produce a single QUOTE token for the entire quoted text 
(including the quotes themselves), regardless of whether the text 
contains keywords, symbols, or other things that aren't valid 
WORDs (like digits).
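
For reference, here is a rough, untested sketch of how the whole 
grammar might look with that change. The WS rule is my own 
assumption (your grammar as posted has no whitespace handling), and 
the $channel = HIDDEN action assumes ANTLR 3 with the Java target:

   grammar test;

   start_rule : expr (OPERATOR expr)* EOF ;
   expr       : QUOTE | WORD ;

   // whole quoted string as one token, quotes included
   QUOTE      : '"' (~'"')* '"' ;
   OPERATOR   : 'OPERATOR' ;
   WORD       : ('a'..'z'|'A'..'Z')+ ;

   // assumed: skip whitespace between tokens
   WS         : (' '|'\t'|'\r'|'\n')+ { $channel = HIDDEN; } ;

Note that expr now refers to the QUOTE token instead of the old 
quote parser rule. With something like this, OPERATOR should only be 
recognized as an operator when it appears outside of quotes.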


