[antlr-interest] Re: Backslash ambiguity in lexer

Xue Yong Zhi seclib at seclib.com
Mon Jan 30 08:30:20 PST 2006


Craig Williams wrote:
> Hi!
>  
> How would you implement a lexer rule allowing single backslashes as well 
> as normal escaped characters including double quotes within a string?
> For instance if all the following are considered to be valid strings:
>  
> "asd\" "a\"b" "\"
>  
> The grammar below succeeds only for the 2nd case ("a\"b"). It does not 
> resolve the ambiguity when the last backslash in the string should be 
> interpreted as a lone backslash rather than as part of an escaped quote.
>  
> STRING_LITERAL
> options { paraphrase = "string literal"; }
>   : '"' (options {greedy=false;}: (ESC)=> ESC | BACKSLASH | ~'"' )* '"'
>   ;
>  

The language you describe is ambiguous. As specified, there is no way 
to tell whether \" is an escaped quote or a lone backslash followed by 
the end of the string, so the lexer cannot decide which alternative to 
take after seeing \".

You have to add something else to the language to help the lexer. For 
example, if you intend a " followed by whitespace to mark the end of a 
string, then you need to tell ANTLR that explicitly.
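
To make the whitespace idea concrete, here is a rough sketch in Python 
rather than ANTLR grammar syntax. It assumes one possible rule, which is 
only an illustration and may not be what you want: a " closes the string 
only when it is followed by whitespace or by the end of the input. The 
name scan_strings is made up for this example.

```python
WHITESPACE = " \t\r\n"


def scan_strings(text):
    """Split text into double-quoted string tokens.

    Assumed rule (hypothetical, for illustration only): a double quote
    closes the current string only if the next character is whitespace
    or there is no next character. Any other double quote, escaped or
    not, is treated as part of the string body.
    """
    tokens = []
    i = 0
    n = len(text)
    while i < n:
        # Skip whitespace between strings.
        if text[i] in WHITESPACE:
            i += 1
            continue
        if text[i] != '"':
            raise ValueError(f"expected opening quote at offset {i}")
        start = i
        i += 1  # consume the opening quote
        while True:
            if i >= n:
                raise ValueError(f"unterminated string at offset {start}")
            # One character of lookahead decides whether this quote closes.
            if text[i] == '"' and (i + 1 == n or text[i + 1] in WHITESPACE):
                i += 1  # consume the closing quote
                break
            i += 1
        tokens.append(text[start:i])
    return tokens


if __name__ == "__main__":
    # The three strings from the question, separated by single spaces.
    for token in scan_strings(r'"asd\" "a\"b" "\"'):
        print(token)
```

With this rule all three of your examples come out as separate strings. 
The price is that a string such as "a\" b" can no longer contain an 
escaped quote directly before a space, because that quote is now taken as 
the end of the string. Whether that trade-off is acceptable depends on 
your language.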

-- 
Xue Yong Zhi
http://seclib.blogspot.com


