[antlr-interest] Easier way to do string literals?
Gavin Lambert
antlr at mirality.co.nz
Sun Oct 14 23:57:29 PDT 2007
At 18:13 15/10/2007, Rick Mann wrote:
>StringLiteral returns [String s]
> : '"' StringGuts '"' { $s = $StringGuts.text; }
> ;
[...]
>But it's not really working quite like I'd expect. The resulting
>text includes the quotes, and the escapes don't seem to really
>turn into the actual characters (I realize I need something more
>there).
Lexer rules don't support return values (since they already have a
return value: the token), so your "returns" block won't have any
effect there. That's why you're still getting the quotes. (There
should be a warning/error message about this, but apparently
that's not possible until ANTLR3 becomes self-hosted.)
There's an example in the wiki showing how to get rid of the
quotes by using setText, which is probably what you want
instead. (FYI: setText creates a copy of the token text, whereas
emit will use the same text as the main token stream. Which means
emit is faster but a little more finicky -- and not really
suitable in your case, since you also want to munge the internal
text by parsing the escapes.)
>I need to also run through the text and handle the escapes. This
>seems like the wrong approach, since it means I'm writing parse
>code in Java, which strikes me as underutilizing ANTLR.
Well, you're always going to have to write your own escape-parsing
code, since ANTLR can't make any guesses about what you want \n to
mean. Maybe it's a newline; maybe it's a placeholder for "the
contents of variable 'n'"; maybe it's something even more
esoteric.
StringLiteral
  : '"' StringGuts '"' { setText(ParseEscapes($StringGuts.text)); }
  ;
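ParseEscapes in the rule above is a helper you have to supply
yourself. As a rough sketch only, assuming you want C-style
escapes (\n, \t, \r, \" and \\) and want unknown escapes left
untouched, it might look something like this. The class name
StringEscapes and the choice of escapes are illustrative
assumptions, not anything ANTLR provides:

```java
// Hypothetical helper class; nothing here is part of ANTLR itself.
final class StringEscapes {
    private StringEscapes() {}

    // Converts backslash escapes in the raw text (quotes already stripped)
    // into the characters they stand for. Assumes C-style escapes.
    static String ParseEscapes(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            // Ordinary character, or a lone backslash at the very end:
            // copy it through unchanged.
            if (c != '\\' || i + 1 >= raw.length()) {
                out.append(c);
                continue;
            }
            char next = raw.charAt(++i); // the character after the backslash
            switch (next) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case 'r':  out.append('\r'); break;
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                default:
                    // Unknown escape: keep both characters as written.
                    out.append('\\').append(next);
            }
        }
        return out.toString();
    }
}
```

To call it unqualified from the lexer action, as the rule above
does, one option is to put the method body in the grammar's
@lexer::members section rather than in a separate class;
otherwise call it as StringEscapes.ParseEscapes(...).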
And yes, you have to use setText at this level. setText has no
effect in a fragment rule, so you can't handle it inside
EscapeSequence itself. Which would be nice, but it's just not
possible without a lot of dancing around.