[antlr-interest] Easier way to do string literals?
Gavin Lambert
antlr at mirality.co.nz
Mon Oct 15 00:29:47 PDT 2007
At 20:18 15/10/2007, Vaclav Barta wrote:
>quotedString returns [ String value ]
>@init {
> StringBuffer sb;
>} : {
> sb = new StringBuffer();
>}
> DQUOTE (
> EscapeSequence { sb.append($EscapeSequence.getText()); }
> | BareString { sb.append($BareString.getText()); }
> )* DQUOTE { $value = sb.toString(); }
> ;
That sort of thing is fine if all you're parsing is string
constants, but it breaks down in a larger language. For one thing,
you've probably got the lexer discarding whitespace automatically,
whereas whitespace needs to be preserved within strings. You're
also quite likely to get stray Identifier, Number, etc. tokens
in there, not just EscapeSequences and BareStrings. The same goes
for comment markers: block and line comment markers inside a
string have to be treated as part of the string, not as the start
of a comment. So that's something else you'd have to hoist to
parser level if you did things this way. It's just messy.
Now what you *could* do is to treat it like the island grammar
example and have a separate ANTLR grammar for parsing the
internals of strings, but that seems excessive to me for what
amounts to a simple string replace operation.
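To show what I mean by a simple replace operation, here is a rough
sketch in plain Java. It assumes the lexer already matches the whole
literal (quotes included) as a single token, so whitespace and comment
markers inside it are never tokenized separately. The class name
StringLiterals, the method unescape and the set of escapes handled
(\n, \t, \r, \" and \\) are my own assumptions for illustration, not
anything from the grammar quoted above.

```java
public final class StringLiterals {
    private StringLiterals() {}

    /**
     * Strips the surrounding double quotes from a raw literal and
     * resolves a small, assumed set of backslash escapes.
     */
    public static String unescape(String raw) {
        int end = raw.length() - 1;
        if (raw.length() < 2 || raw.charAt(0) != '"' || raw.charAt(end) != '"') {
            throw new IllegalArgumentException("not a double-quoted literal: " + raw);
        }
        StringBuilder sb = new StringBuilder(raw.length());
        // Walk only the characters between the two quotes.
        for (int i = 1; i < end; i++) {
            char c = raw.charAt(i);
            if (c != '\\') {
                sb.append(c); // ordinary character, whitespace included
                continue;
            }
            if (++i >= end) {
                throw new IllegalArgumentException("dangling backslash in: " + raw);
            }
            char e = raw.charAt(i);
            switch (e) {
                case 'n':  sb.append('\n'); break;
                case 't':  sb.append('\t'); break;
                case 'r':  sb.append('\r'); break;
                case '"':  sb.append('"');  break;
                case '\\': sb.append('\\'); break;
                default:
                    throw new IllegalArgumentException("unknown escape: \\" + e);
            }
        }
        return sb.toString();
    }
}
```

You would call something like this on the token's text from an action,
so the parser only ever sees one string token and never has to
reassemble it from pieces.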