[antlr-interest] Recovering white space in V3.0

Thu Jun 9 12:22:42 PDT 2005

On Jun 9, 2005, at 11:44 AM, Andy Tripp wrote:

> On this issue of "common" vs. "extreme" and whether to buffer all
> tokens or not...
>
> I think the main thing is that you want ANTLR to provide all the
> functionality that most people might want, and let people override
> (or turn off) stuff they don't want. Some people may want tokens
> buffered and others won't. Better to provide the buffering code and
> let people turn it off, rather than not buffering everything and
> having people who need it have to write code to do it.
>
> In other words, have CommonTokenStream do buffering and then maybe
> provide an alternative LeanTokenStream that doesn't. But don't just
> provide LeanTokenStream, because then people will have to write their
> own buffering code.

Exactly my plan, Andy! :)
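
To make the trade-off concrete, here is a minimal Java sketch of the two
flavors side by side. The names (SimpleTokenStream, LeanStream,
BufferingStream) are hypothetical and tokens are plain strings; this is
an illustration of the idea, not ANTLR's actual API.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical names throughout; this is NOT ANTLR's API.
interface SimpleTokenStream {
    // Consume and return the next token, or null at end of input.
    String next();
}

// Lean variant: passes tokens through one at a time and remembers nothing,
// so memory use stays flat but there is no way to look back or rewind.
class LeanStream implements SimpleTokenStream {
    private final Iterator<String> source;

    LeanStream(Iterator<String> source) { this.source = source; }

    public String next() { return source.hasNext() ? source.next() : null; }
}

// Buffering variant: pulls every token from the source up front, so callers
// can rewind (e.g. for backtracking) or inspect earlier tokens such as
// white space that the parser skipped.
class BufferingStream implements SimpleTokenStream {
    private final List<String> tokens = new ArrayList<>();
    private int p = 0; // index of the next token to return

    BufferingStream(Iterator<String> source) {
        while (source.hasNext()) tokens.add(source.next());
    }

    public String next() { return p < tokens.size() ? tokens.get(p++) : null; }

    int mark() { return p; }                  // remember the current position
    void rewind(int marker) { p = marker; }   // go back to a remembered position
    String get(int i) { return tokens.get(i); } // random access to any token
    int size() { return tokens.size(); }
}
```

With only the lean variant available, anyone who needs mark/rewind or
access to earlier tokens would have to write something like
BufferingStream themselves, which is Andy's point.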

As I noted privately this morning to Bryan Ewbank, I parsed
90,000-line C++ header files on my 90 MHz NeXT box with 64 MB of RAM
10 years ago with no ill effect (PCCTS buffered it all up to do
syntactic predicates).  I estimate that for Bryan's 100,000-line
files, you might consume 30 MB in Java to buffer all text and all
tokens.
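
A back-of-envelope sketch of how an estimate of roughly 30 megabytes
could break down. The class and method names are hypothetical, and the
per-line and per-object figures mentioned afterwards are illustrative
guesses chosen to land in that ballpark, not measurements and not
numbers from this thread.

```java
// Hypothetical helper; all inputs are caller-supplied assumptions.
class BufferEstimate {
    // Estimated bytes needed to hold all source text plus one object per token.
    static long estimateBytes(long lines, int charsPerLine, int tokensPerLine,
                              int bytesPerChar, int bytesPerToken) {
        long textBytes = lines * charsPerLine * bytesPerChar;     // raw text buffer
        long tokenBytes = lines * tokensPerLine * bytesPerToken;  // token objects
        return textBytes + tokenBytes;
    }
}
```

For example, BufferEstimate.estimateBytes(100_000, 40, 8, 2, 28)
assumes 40 characters and 8 tokens per line, 2 bytes per Java char, and
28 bytes per token object. That gives 8,000,000 bytes of text plus
22,400,000 bytes of tokens, or about 30.4 million bytes in total.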

Ter
--
CS Professor & Grad Director, University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com