[antlr-interest] Re: Error: How to deal with Special characters?

David Wigg wiggjd at lsbu.ac.uk
Mon Jul 25 03:24:49 PDT 2005


Hello,

What interested me about the message from Premkumar of 24 July 
was how, in some source code, a hyphen ("-") could be 
displayed as a "u" circumflex ("û") in DOS mode when the ISO 
8859-1 value of the first is 45 and that of the second is 251 (a 
difference of 206).
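
One possible explanation, offered only as a guess: the character in the file may not be the ASCII hyphen (45) at all, but byte 150 (0x96). Windows-1252 displays that byte as an en dash, which looks much like a hyphen, while DOS code page 437 displays the same byte as "û". The sketch below checks this with Python's standard codecs; the sample bytes are an assumption, not taken from the actual file.

```python
# Assumption for illustration: the file holds byte 0x96 (150) between the
# two words, rather than the ASCII hyphen 0x2D (45).
raw = b"Software \x96 Restricted"

# The same bytes decoded under two different code pages.
as_windows = raw.decode("cp1252")  # Windows-1252: 0x96 is an en dash (U+2013)
as_dos = raw.decode("cp437")       # DOS code page 437: 0x96 is u-circumflex (U+00FB)

# Print the code point each code page assigns to the tenth character.
print(hex(ord(as_windows[9])))  # 0x2013
print(hex(ord(as_dos[9])))      # 0xfb

# The arithmetic from the message: ISO 8859-1 values 45 and 251.
difference = ord("\u00fb") - ord("-")
print(difference)  # 206
```

If this guess is right, the difference of 206 is a coincidence of display: the stored byte never changes, only the code page used to show it.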

What happens when you use the hyphen for the subtraction 
operator in your source code?

What is the significance of it being in a comment?

What coding system is being used in the source code?

Is this a problem with a particular IDE?

Are we talking MS or UNIX?
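
One way to start answering these questions is to look at the raw bytes rather than at what an editor shows. The following is a minimal sketch that lists every byte above 127 with its line and column; the function name `find_non_ascii` and the sample line are hypothetical, chosen only for illustration.

```python
def find_non_ascii(data: bytes):
    """Return (line, column, byte_value) for every byte above 127."""
    hits = []
    line, col = 1, 0
    for b in data:
        if b == 0x0A:  # line feed: move to the next line, reset the column
            line += 1
            col = 0
            continue
        col += 1
        if b > 127:  # outside 7-bit ASCII
            hits.append((line, col, b))
    return hits


# Hypothetical sample: the comment line with byte 0x96 in place of the hyphen.
sample = b"/*  Computer Software \x96 Restricted Rights */\n"
print(find_non_ascii(sample))  # [(1, 23, 150)]
```

For a real file, the bytes could be read with `open(path, "rb").read()` and passed to the same function.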

Yours puzzled,

David.

Original message:

Message: 4
Date: Sun, 24 Jul 2005 06:57:31 -0700 (PDT)
From: Premkumar Rathanavelu <rprememail at yahoo.com>
Subject: [antlr-interest] Error: How to deal with Special 
characters?
To: antlr-interest at antlr.org
Message-ID: <20050724135731.27121.qmail at web80908.mail.scd.yahoo.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Everyone,
   In source code, comments often contain special characters
   such as û.
Consider a comment line:
/*  Computer Software - Restricted Rights */

In the above comment line, the hyphen ('-') between "Software" and 
"Restricted"
looks normal, but when we view it in a DOS editor it shows
'-' as û.
My comment line token:
Comment
  : "/*"
   ( {LA(2) != '/'}? '*'
   | EndOfLine //{newline();}
   | ~('*'| '\r' | '\n')
   )*
   "*/"  {$setType(Token.SKIP);}// newline();}
  ;

So, I placed a token with that special character in the parser.
But I'm still getting an error, and the file can no longer be parsed.

I'm a newbie. Please help me overcome the error.

Thanks in advance,
Prem


