[antlr-interest] Spaces in names

Dave Raskin dave.raskin at rimage.com
Wed Aug 8 14:54:18 PDT 2007


Gavin, thanks for suggestions! Clearly I was sleeping during my compiler
class at school ;-)

I will take a look at the examples,

Dave Raskin 

-----Original Message-----
From: Gavin Lambert [mailto:antlr at mirality.co.nz] 
Sent: Wednesday, August 08, 2007 3:49 PM
To: Dave Raskin
Cc: antlr-interest at antlr.org
Subject: RE: [antlr-interest] Spaces in names

At 01:56 9/08/2007, Dave Raskin wrote:
>Gavin, thanks for the reply!

First of all, make sure you hit the "Reply to All" button so that your
reply goes to the list as well.  You're more likely to get a response
that way.

>BTW, I am using C# generated code - could this be an issue?

No, this is a grammar problem, not a target problem.

>name_blank_name
>: NAME ( BLANK NAME )*
>;

That won't work, because BLANK tokens will still be getting skipped (or
if they're not, then it'll mess up other bits of your grammar).  As I
said, if you want to skip whitespace most of the time but treat it as
significant sometimes, then you have to do it in the lexer.


>I also tried changing the lexer rule, but got a compile error:

What was the error?

>NAME_BLANK_NAME
>: NAME ( BLANK NAME )*
>;

It's not quite as easy as that; since NAME can contain an arbitrary
number of characters, there's ambiguity between NAME and
NAME_BLANK_NAME.  You'd have to write a hybrid rule that matches both
forms and changes the token type depending on what it actually matched
if you wanted to do it this way.

But as I said in my original message, if you're trying to represent a
string value then this is completely the wrong approach, since it will
collapse whitespace, i.e. it would have no way of representing
"hello  world" (two spaces); it would always translate it into
"hello world" (one space).  Plus, the above doesn't take leading or
trailing space into account.
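
As a rough illustration of the collapsing problem (plain Python rather
than ANTLR, and the function name is made up for this sketch): once the
text has been split into NAME-like pieces with the blanks thrown away,
the only thing you can do is glue the pieces back together with a
single space, so the original spacing is gone.

```python
def rebuild_from_names(text):
    """Mimic NAME ( BLANK NAME )*: keep the names, discard the blanks,
    then rejoin with one space (hypothetical helper, not ANTLR output)."""
    names = text.split()      # whitespace runs are dropped, like skipped BLANKs
    return " ".join(names)    # only a single space can be put back


original = "  hello  world "
rebuilt = rebuild_from_names(original)
print(repr(original), "->", repr(rebuilt))
```

The inner double space and the leading/trailing blanks are all lost in
the rebuilt value.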

You need to make the lexer match an entire string (including the
quotes) as a whole.  That way you can preserve the exact contents of the
string.
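
To show the idea outside of ANTLR, here is a small hand-rolled
tokenizer sketch in Python.  The token shapes are assumptions (a
single-quoted string with no escapes, and a simple identifier for
NAME); your real grammar will differ.  The point is only that
whitespace is skipped between tokens but kept inside a string, because
the string is matched as one token.

```python
import re

# Assumed token shapes; adjust to the real dialect.
TOKEN_RE = re.compile(r"""
    (?P<STRING>'[^']*')      # whole quoted string, inner spaces kept
  | (?P<NAME>[A-Za-z_]\w*)   # identifier
  | (?P<WS>\s+)              # whitespace between tokens
""", re.VERBOSE)


def tokenize(text):
    """Return (type, text) pairs, skipping whitespace between tokens."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":          # skip blanks, but only between tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


print(tokenize("foo 'hello  world'   bar"))
```

The STRING token carries the quotes and both inner spaces, so the
parser never needs to see a BLANK token at all.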

>Is there an example grammar you can point to?  I haven't found one
>yet...

Have a look at examples/csharp/tinyc/lexer.g, specifically at the
CHAR_LITERAL and STRING_LITERAL rules.  You basically want a hybrid of
the two, though you can probably simplify the escape handling (only
supporting escaping backslashes and single quotes; though that depends
on your particular dialect).
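
For what the simplified escape handling might look like, here is a
Python sketch (not the tinyc rules themselves).  It assumes
single-quoted strings in which only \\ and \' are valid escapes; the
names STRING_RE and string_value are invented for the example.

```python
import re

# Assumed dialect: '...' with \\ and \' as the only escapes.
STRING_RE = re.compile(r"'((?:[^'\\]|\\['\\])*)'")


def string_value(literal):
    """Check a whole quoted literal and return its unescaped contents."""
    m = STRING_RE.fullmatch(literal)
    if m is None:
        raise ValueError(f"not a valid string literal: {literal!r}")
    # Drop the backslash in front of each escaped quote or backslash.
    return re.sub(r"\\(['\\])", r"\1", m.group(1))


print(string_value(r"'it\'s  a \\ path'"))
```

Everything between the quotes is preserved exactly, apart from the two
escapes being resolved; any other backslash sequence is rejected.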

