[antlr-interest] Antlr 3 and the newline token problem

Anthony Youngman Anthony.Youngman at eca-international.com
Mon Nov 28 03:32:17 PST 2005


And don't forget, in some languages newlines have semantic meaning.

For example, in DATABASIC, they can be a *block* terminator.

I have a load of code to handle newlines, for precisely this reason. Oh
- and don't forget you're jumping to assumptions that a newline is \r
and/or \n. I've worked on a system where it could be (iirc) \n or \n\0.
Lines were stored in an exact multiple of 16-bit words, null-padded ...

My preferred solution would be a special code eg '\lf' which is platform
dependent, but even that screws up - I do a load of work on Windows then
process it on linux - it's nice to work with tools that don't care.

Cheers,
Wol

-----Original Message-----
From: antlr-interest-bounces at antlr.org
[mailto:antlr-interest-bounces at antlr.org] On Behalf Of Martin Probst
Sent: 25 November 2005 23:29
To: antlr-interest at antlr.org
Subject: RE: [antlr-interest] Antlr 3 and the newline token problem

Am Freitag, den 25.11.2005, 20:58 +0000 schrieb Micheal J:
> Please don't post HTML mail to the list. 

Yeah, please don't do that. You too ;-)

> WS : '\r' '\n' {newline();}
>        | '\r'    {newline();}
>        | '\n'   {newline();}
>      ;

What about

NL: (('\r' ( '\n' /* Windows */ | /* empty, MAC OS <=9 */ )
    ) | '\n' /* the proper way ... */ ) { newline(); };

(untested, but works, doesn't it?).

It's a documentation issue, just put it into a prominently linked FAQ.
Also tell people which method to override in case they want to count
from 0 (e.g. "return super.getLine() - 1;").

I would opt against doing the newline magic behind the scenes
automagically. Doing newlines properly is a good exercise to get started
with ANTLR, and those who can't get it to work with some documentation
have much bigger problems. On the other hand, there are people parsing
binary files (line what?), people using Eclipse (that's the offset
people ;-)), people who want to provide proper line/column numbers to
users. It's easy to do all of these, just some minor Java work. IMHO
ANTLR should keep out of that business.

Martin


* ************************************************************************ *

This transmission is intended for the named recipient only. It may contain private and confidential information. If this has come to you in error you must not act on anything disclosed in it, nor must you copy it, modify it, disseminate it in any way, or show it to anyone. Please e-mail the sender to inform us of the transmission error or telephone ECA International immediately and delete the e-mail from your information system.

Telephone numbers for ECA International offices are: Sydney +61 (0)2 8272 5300, Hong Kong + 852 2121 2388, London +44 (0)20 7351 5000 and New York +1 212 582 2333.

* ************************************************************************ *



More information about the antlr-interest mailing list