[antlr-interest] lexing shell-like strings
Colin Walters
walters at verbum.org
Wed Jan 7 15:35:14 PST 2009
I have a project for which I would like to lex strings that have
somewhat "Unix shell-like" quoting semantics. Unix shell strings are
quite funky, but I'd be happy if I could express the following:
// Some straightforward stuff
somestring => [Token("somestring")]
"somestring" => [Token("somestring")]
two strings => [Token("two"), Token("strings")]
"one string" => [Token("one string")]
"error => parse error
// Here it gets a bit more subtle
one"string" => [Token("onestring")]
one"string"only => [Token("onestringonly")]
one\"string\"only => [Token("one\"string\"only")]
etc. At this point I don't need to replicate the differences between
' and ", though knowing how to would be interesting.
I'm sort of embarrassed to show you my attempts, but I've attached the
closest I have. It doesn't work for the one\"string\" case though.
One thing I wanted to try but couldn't find much documentation on is
writing essentially a totally custom lexer. I know how to parse
these strings in raw Java, but it wasn't completely clear to me which
methods to override, etc. Ideally, of course, I could express these
strings in the ANTLR lexer language. Hopefully someone can point me
the right way there!
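To make the intended behaviour concrete, here is roughly what I mean by
parsing these in raw Java. This is only a minimal sketch and not tied to
ANTLR at all: the class name ShellLikeLexer and the method tokenize are
made up for illustration, it returns plain strings rather than Token
objects, and it assumes a backslash simply makes the next character
literal (which is how I read the one\"string\"only case above).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of ANTLR: splits input into shell-like words.
final class ShellLikeLexer {

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inToken = false;  // true once the current word has started
        boolean inQuotes = false; // true while inside a "..." section

        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '\\') {
                // Assumption: backslash makes the next character literal.
                if (i + 1 >= input.length()) {
                    throw new IllegalArgumentException("trailing backslash");
                }
                current.append(input.charAt(++i));
                inToken = true;
            } else if (c == '"') {
                // Quotes only toggle state; they never end the word,
                // so one"string"only stays a single token.
                inQuotes = !inQuotes;
                inToken = true;
            } else if (Character.isWhitespace(c) && !inQuotes) {
                // Unquoted whitespace ends the current word, if any.
                if (inToken) {
                    tokens.add(current.toString());
                    current.setLength(0);
                    inToken = false;
                }
            } else {
                current.append(c);
                inToken = true;
            }
        }
        if (inQuotes) {
            throw new IllegalArgumentException("unterminated quote");
        }
        if (inToken) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With this sketch, the input one"string"only comes back as the single
token onestringonly, and the input "error throws an
IllegalArgumentException, standing in for the parse error.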
-------------- next part --------------
A non-text attachment was scrubbed...
Name: shellLike.g
Type: application/octet-stream
Size: 258 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20090107/ddc15201/attachment.obj