[antlr-interest] How to create a case-insensitive parser using the C target?

Jim Idle jimi at temporal-wave.com
Tue Jul 10 12:28:38 PDT 2007


I have added case insensitivity to the C input streams, but this is not
part of the currently shipping library, and I am still testing it, so I
have not yet submitted it (probably today though, especially if someone
wants it).

All you do is the following:

// Create the input stream using the supplied file name
// (Use antlr3AsciiFileStreamNew for UCS2/16bit input).
//
input	= antlr3AsciiFileStreamNew(fName);

// This lexer has its tokens specified in upper case only and then we tell it
// to do upper case converted comparisons with the input stream. The tokens
// preserve the case of the text that actually matched but matched in a case
// insensitive way.
//
input->setUcaseLA(input, ANTLR3_TRUE);


The method call installs a version of LA() that always returns toupper()
of the input char, which means that it does not alter the actual input
stream but will MATCH in upper case. You then specify all your keywords
in upper case only.
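
To illustrate the idea outside of ANTLR, here is a minimal sketch of an
upper-casing lookahead. Everything in it (demo_stream, demo_la_exact,
demo_la_upper, demo_match) is a hypothetical name made up for this
illustration; it is not the runtime's actual code, only the same technique
in miniature: the lookahead hook folds each character with toupper(), while
the underlying buffer is left untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define DEMO_EOF (-1)

/* Hypothetical miniature input stream; NOT the ANTLR3 C runtime's type. */
typedef struct demo_stream {
    const unsigned char *data;  /* original text, never modified */
    size_t               size;  /* number of characters in data */
    size_t               pos;   /* index of the next unread character */
    int (*la)(struct demo_stream *s, size_t i);  /* lookahead hook */
} demo_stream;

/* Default lookahead: return the i-th character ahead (1-based) unchanged. */
static int demo_la_exact(demo_stream *s, size_t i)
{
    assert(s != NULL && i >= 1);
    size_t idx = s->pos + i - 1;
    return idx < s->size ? s->data[idx] : DEMO_EOF;
}

/* Upper-casing lookahead: the same character, folded with toupper(). */
static int demo_la_upper(demo_stream *s, size_t i)
{
    int c = demo_la_exact(s, i);
    return c == DEMO_EOF ? DEMO_EOF : toupper(c);
}

/* Try to match an upper-case keyword at the current position.
 * Consumes the keyword and returns 1 on success, returns 0 otherwise. */
static int demo_match(demo_stream *s, const char *keyword)
{
    size_t n = 0;
    for (; keyword[n] != '\0'; n++) {
        if (s->la(s, n + 1) != (unsigned char)keyword[n])
            return 0;
    }
    s->pos += n;
    return 1;
}
```

With la set to demo_la_exact, the input "Select" does not match the keyword
"SELECT"; after switching la to demo_la_upper it does, and the stream's data
still holds the mixed-case text, so the matched token keeps its original
spelling.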

This has been done for the 8-bit input stream only at the moment, though
I will add 16-bit UCS2 before too long.

Also note that it only uses toupper(), which means the system you are
using needs to have a locale-sensitive version. In general, for more
complicated streams you would use my supplied example to write a version
of the function that uses, say, IBM's ICU package (which I do not want
the standard runtime to be dependent on, of course).
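
As a rough idea of what such a replacement might look like without pulling
in ICU, the sketch below upper-cases ISO-8859-1 (Latin-1) characters by hand
instead of relying on the locale behind toupper(). The name latin1_toupper
is hypothetical and the function is only a stand-in for a real Unicode-aware
library; it is not part of the ANTLR runtime.

```c
#include <assert.h>

/* Hypothetical helper: upper-case one ISO-8859-1 code point (0..255)
 * without consulting the process locale. */
static int latin1_toupper(int c)
{
    assert(c >= 0 && c <= 0xFF);

    /* Plain ASCII letters. */
    if (c >= 'a' && c <= 'z')
        return c - 0x20;

    /* 0xE0..0xFE are the accented lower-case letters, except 0xF7,
     * which is the division sign. Their upper-case forms sit 0x20 lower. */
    if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
        return c - 0x20;

    /* Everything else is returned unchanged, including 0xDF (sharp s)
     * and 0xFF (y with diaeresis), which have no Latin-1 upper case. */
    return c;
}
```

A lookahead function that calls something like this in place of toupper()
would match accented keywords case-insensitively even where the C library's
locale support is not set up for it.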


Watch the Fisheye site, and when you see me check this change in, you can
get the dist tarball and give it a try. For now, specify your tokens in
upper case only and test with upper case until you have the update.

Jim

> -----Original Message-----
> From: antlr-interest-bounces at antlr.org [mailto:antlr-interest-
> bounces at antlr.org] On Behalf Of troy runkel
> Sent: Tuesday, July 10, 2007 11:22 AM
> To: ANTLR mail-list
> Subject: [antlr-interest] How to create a case-insensitive parser using
> the C target?
> 
> I'm using ANTLR v3 to build a parser that works with the C target.
> Currently the parser is case-sensitive and I need it to be
> case-insensitive.  It looks like there have been a number of
> discussions regarding case-insensitive parsers over the last year or
> so, but I couldn't find anything describing how to set up a
> case-insensitive parser for the C target.  Anybody out there know how
> to do this?  Thanks.
> 
> Troy Runkel

