[antlr-interest] lexer rule for string
Brian Smith
brian-l-smith at uiowa.edu
Tue Oct 15 23:10:33 PDT 2002
stephane brossier wrote:
> Hi,
>
> I am trying to recognize some strings in a C program.
>
> I first had a lexer rule defined as follows:
>
> STRING: '"' ~'"' '"';
>
> This worked pretty well until I had some traces like:
> printf("The string is \" the_string \" ");
>
> How can I make the lexer understand that \" is not the
> end of my string but is actually part of my string,
> since there is an escape character before it?
>
> Thanks,
> S.
Here is what I use to match both single-quoted and double-quoted strings
with escape sequences like \n, \", and \t, including octal escape
sequences. These rules also reject any string that contains a raw
newline or carriage return.
options { k=3; }

QUOTED_NAME
    : ( '"' ( QUOTED_CHARACTER | '\'' )* '"' )
    ;

STRING_LITERAL
    : ( '\'' ( QUOTED_CHARACTER | '"' )* '\'' )
    ;
// Note that QUOTED_CHARACTER doesn't allow unescaped single OR double quotes.
protected QUOTED_CHARACTER
    : (~( '\'' | '"' | '\r' | '\n' | '\\' ))
    | '\\' ( ( '\'' | '"' | 'n' | 'r' | 't' | 'b' | 'f' | '\\' )
           | OCTAL_DIGIT
             (options {greedy=true;} : OCTAL_DIGIT)?
             (options {greedy=true;} : OCTAL_DIGIT)?
           )
    ;

protected OCTAL_DIGIT
    : '0'..'7'
    ;
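
For anyone who wants to try the idea outside ANTLR, here is a rough Python
sketch of the same matching logic using the standard `re` module. It is only
an approximation of the grammar above, not generated lexer code: the pattern
names mirror the rule names, while `NAIVE` (the rule from the question, with
a `*` added so it can span more than one character) and the sample `line`
are assumptions made for illustration.

```python
import re

# Rough regex counterpart of QUOTED_CHARACTER: any character except a quote,
# CR, LF, or backslash; or a backslash followed by a simple escape character
# or by one to three octal digits (matched greedily, as in the grammar).
QUOTED_CHARACTER = r"""(?:[^'"\r\n\\]|\\(?:['"nrtbf\\]|[0-7]{1,3}))"""

# Double-quoted form (QUOTED_NAME): a bare ' is also allowed inside.
QUOTED_NAME = re.compile('"(?:' + QUOTED_CHARACTER + "|')*" + '"')

# Single-quoted form (STRING_LITERAL): a bare " is also allowed inside.
STRING_LITERAL = re.compile("'(?:" + QUOTED_CHARACTER + '|")*' + "'")

# Illustrative stand-in for the rule in the question: no escape handling,
# so it ends at the first double quote it sees.
NAIVE = re.compile(r'"[^"]*"')

line = r'printf("The string is \" the_string \" ");'

naive_match = NAIVE.search(line).group(0)
full_match = QUOTED_NAME.search(line).group(0)

print(naive_match)
print(full_match)
```

The naive pattern stops at the first escaped quote, while the escape-aware
pattern consumes each \" as a unit and only ends at the real closing quote.
Because newline and carriage return are excluded from the character class,
a string that spans lines does not match at all.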
- Brian