[antlr-interest] Buglet in literal escaping

Johannes Luber jaluber at gmx.de
Thu Feb 12 12:15:36 PST 2009


Jim Idle wrote:
> Johannes Luber wrote:
>> Jim Idle wrote:
>>   
>>> Johannes Luber wrote:
>>>     
>>>>> Using '\\"' as a literal in an ANTLR grammar causes ANTLRv3.1.1 to emit:
>>>>>   match("\\"")
>>>>> which is an unterminated string constant.  Oops!
>>>>>     
>>>>>         
>>>> That's no bug; ANTLR does exactly what it is supposed to do. I think you want to use "\\\"" instead.
>>>>   
>>>>       
>>> In fact it was a bug in 3.1.1, take a look at the sequence again :-) It
>>> is a correct spec, assuming that Scott wants to match backslash followed
>>> by doublequote. ANTLR strings are in single quotes and there is no need
>>> to escape double quotes.
>>>
>>> Jim
>>>     
>>
>> But that would presuppose that ANTLR escapes the " on its own? And in
>> Java \\\" equals \", so I wasn't completely off target. :)
>>
>> Johannes
>>   
> Each target is responsible for converting the string as seen by ANTLR,
> where " is not required to be escaped, into the required representation
> for the target language. You must unescape the ANTLR string, then escape
> for the target language.
> 
> Thinking about this, you should check the C# target as it probably
> copied the Java code! I think that you can pretty much copy the new Java
> target code, at least algorithmically :-)
> 
> Jim

I'll take a look then - as always, when I sync the code. :)

Johannes
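
For reference, the two-step conversion Jim describes (unescape the ANTLR literal, then escape it for the target language) could be sketched in Java roughly as follows. This is not the actual ANTLR Java target code: the class and method names are hypothetical, and the set of escapes handled is an assumption based on the usual ANTLR 3 literal escapes (\n, \r, \t, \b, \f, \uXXXX, and a backslash before any other character).

```java
public final class LiteralEscapeSketch {

    // Step 1 (hypothetical helper): turn the body of an ANTLR single-quoted
    // literal into the raw characters it denotes.
    public static String unescapeAntlr(String body) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\' || i + 1 >= body.length()) {
                out.append(c);              // ordinary character, including a bare "
                continue;
            }
            char next = body.charAt(++i);   // character after the backslash
            switch (next) {
                case 'n': out.append('\n'); break;
                case 'r': out.append('\r'); break;
                case 't': out.append('\t'); break;
                case 'b': out.append('\b'); break;
                case 'f': out.append('\f'); break;
                case 'u':
                    // Assumes four hex digits follow; malformed input throws.
                    out.append((char) Integer.parseInt(body.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default:  out.append(next); break;   // \\  \'  \"
            }
        }
        return out.toString();
    }

    // Step 2 (hypothetical helper): render raw characters as a Java string
    // literal, including the surrounding double quotes.
    public static String escapeJava(String raw) {
        StringBuilder out = new StringBuilder("\"");
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '\\': out.append("\\\\"); break;
                case '"':  out.append("\\\""); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                default:   out.append(c);      break;
            }
        }
        return out.append('"').toString();
    }

    public static String toJavaLiteral(String antlrBody) {
        return escapeJava(unescapeAntlr(antlrBody));
    }

    public static void main(String[] args) {
        // The grammar literal '\\"' has the three-character body: \ \ "
        String body = "\\\\\"";
        System.out.println("match(" + toJavaLiteral(body) + ")");
        // prints: match("\\\"")
    }
}
```

With this ordering, the grammar literal '\\"' first becomes the two raw characters backslash and double quote, and only then is each one escaped for Java, so the emitted string constant is terminated properly.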

