[antlr-interest] How to handle backslashes correctly?

Tue Jan 18 08:10:28 PST 2011

Hello everybody.

I've got input files in which backslashes have different meanings depending on context. As a result, my lexer does not really know how to generate the tokens, and the parser does not do what I want. Maybe someone can help me check this?
A backslash before a linefeed means the linefeed is just whitespace, whereas a linefeed without a preceding backslash is not.
A backslash in some regions of the file is meant to be part of a file path (Windows).
A backslash in some regions of the file is part of a regular expression. I'm not interested in parsing the expression itself, so it should be handled like a string value.
A backslash before a quote inside a quoted string means the quote does not terminate the string.
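
The rules above can be sketched as a small hand-written tokenizer. This is Python rather than ANTLR, and the token names, the regular expressions and the tokenize function are all assumptions made for illustration; it is not the grammar discussed in this post. Keyword recognition is left out, so the directive name comes out as an ordinary VALUE.

```python
import re

# Hypothetical token rules (illustration only), tried in this order.
TOKEN_RULES = [
    ("WS",      r"(?:[ \t]|\\\r?\n)+"),        # blanks; backslash + linefeed counts as whitespace
    ("NEWLINE", r"\r?\n"),                     # a bare linefeed is a token of its own
    ("STRING",  r'"(?:\\.|[^"\\])*"'),         # \" does not terminate the string
    ("NOT",     r"!"),
    ("VALUE",   r'(?:\\[^\r\n]|[^\s!"\\])+'),  # a backslash stays part of the value
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_RULES))


def tokenize(text):
    """Return (token_name, lexeme) pairs, dropping whitespace tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


if __name__ == "__main__":
    for token in tokenize(r"BrowserMatch \bMSIE !no-gzip !gzip-only-text/html"):
        print(token)
```

In this sketch the backslash is consumed inside the VALUE pattern together with the character that follows it, so \bMSIE cannot be split into \b and MSIE. Whether the same idea carries over to an ANTLR lexer rule depends on how the other rules in the grammar compete for the backslash.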

I've created a grammar that, as far as I can tell, handles all of these cases. Now let's look at one fragment:

BrowserMatch \bMSIE !no-gzip !gzip-only-text/html

This should be parsed as
Keyword	BrowserMatch
value	\bMSIE
not	!
value	no-gzip
not	!
value	gzip-only-text/html

but it is parsed as
Keyword	BrowserMatch
unknown	\b
value	MSIE
...

My rule for value allows backslashes and the necessary letters, yet \bMSIE is still not recognized as a single value.
What can be the reason for that?

Hiran