[antlr-interest] Re: Backslash ambiguity in lexer
Xue Yong Zhi
seclib at seclib.com
Mon Jan 30 08:30:20 PST 2006
Craig Williams wrote:
> Hi!
>
> How would you implement a lexer rule allowing single backslashes as well
> as normal escaped characters including double quotes within a string?
> For instance if all the following are considered to be valid strings:
>
> "asd\" "a\"b" "\"
>
> The grammar below succeeds only for the 2nd case ("a\"b"); it does not
> resolve the ambiguity when the last backslash in the string should be
> interpreted as a lone backslash, not as the start of an escaped quote.
>
> STRING_LITERAL
> options { paraphrase = "string literal"; }
> : '"' (options {greedy=false;}: (ESC)=> ESC | BACKSLASH | ~'"' )* '"'
> ;
>
The language you described is ambiguous. As it stands, there is no way
to tell whether \" is an escaped quote or a lone backslash followed by
the end of the string, so the lexer does not know which alternative to
take after seeing \".
You have to add something else to the language to help the lexer. For
example, if your intention is that a " followed by whitespace ends the
string, then you need to tell ANTLR so explicitly.
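
To make the disambiguation rule concrete, here is a minimal sketch in
Python rather than ANTLR grammar syntax. It assumes the rule suggested
above: a " closes the string only when it is followed by whitespace or
by the end of the input. The function name scan_strings is hypothetical
and not part of ANTLR.

```python
def scan_strings(text):
    """Split text into string literals.

    Assumed rule (not from any ANTLR API): a '"' closes a literal only
    when the next character is whitespace or the input ends there.
    Any other '"' is treated as part of the string body.
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        # Skip whitespace between literals.
        if text[i].isspace():
            i += 1
            continue
        if text[i] != '"':
            raise ValueError(f"expected opening quote at offset {i}")
        start = i
        i += 1  # consume the opening quote
        while True:
            if i >= n:
                raise ValueError(f"unterminated string starting at offset {start}")
            # One character of lookahead decides whether this quote closes.
            if text[i] == '"' and (i + 1 == n or text[i + 1].isspace()):
                i += 1  # consume the closing quote
                break
            i += 1
        tokens.append(text[start:i])
    return tokens


if __name__ == "__main__":
    for token in scan_strings(r'"asd\" "a\"b" "\"'):
        print(token)
```

With this rule all three of the strings from the question are accepted.
The price is that a string body can no longer contain a " followed by
whitespace, so the rule has to match what the language really intends.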
--
Xue Yong Zhi
http://seclib.blogspot.com