[antlr-interest] Recovering white space in V3.0
Terence Parr
parrt at cs.usfca.edu
Mon Jun 13 11:28:31 PDT 2005
On Jun 11, 2005, at 7:58 AM, Andy Tripp wrote:
> Terence,
> I'm currently testing my Jazillian translator
> on gcc's libc. It's about 800,000 lines, and I keep all that as
> token streams in memory.
Hi :)
For all files all at once or one file at a time?
> It's
> not a pretty sight, and I'm off to buy more memory because my 1GB
> is no longer enough :( I'll be doing lots
> of memory profiling - I'm sure it's my fault, not yours :)
Hmm...I wonder if it's my fault!
> ...speaking of things being your fault...
> I spent the past week doing CPU profiling. One bottleneck
> for me was that makeToken() uses reflection (calling newInstance()) -
> I now have a setTokenFactory() method so that I provide my own
> makeToken() method.
I wondered how slow the reflection was...good to know. I'm avoiding
it in 3.0.
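A minimal sketch of the idea Andy describes, not ANTLR's actual code: a lexer that creates tokens through a pluggable factory can swap reflective instantiation for a direct constructor call. The names SimpleToken, TokenFactory, ReflectiveTokenFactory, and DirectTokenFactory are hypothetical and exist only for this illustration.

```java
// Hypothetical token type for illustration; this is not ANTLR's Token class.
class SimpleToken {
    int type;
    String text;
}

// Hypothetical factory hook, in the spirit of the setTokenFactory() method mentioned above.
interface TokenFactory {
    SimpleToken makeToken(int type, String text);
}

// Creates each token via reflection: flexible, but every token pays the reflective call.
class ReflectiveTokenFactory implements TokenFactory {
    private final Class<? extends SimpleToken> tokenClass;

    ReflectiveTokenFactory(Class<? extends SimpleToken> tokenClass) {
        this.tokenClass = tokenClass;
    }

    @Override
    public SimpleToken makeToken(int type, String text) {
        try {
            SimpleToken t = tokenClass.getDeclaredConstructor().newInstance();
            t.type = type;
            t.text = text;
            return t;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("cannot instantiate " + tokenClass, e);
        }
    }
}

// Creates each token with a plain constructor call, with no reflection on the hot path.
class DirectTokenFactory implements TokenFactory {
    @Override
    public SimpleToken makeToken(int type, String text) {
        SimpleToken t = new SimpleToken();
        t.type = type;
        t.text = text;
        return t;
    }
}

class TokenFactoryDemo {
    public static void main(String[] args) {
        // Both factories produce equivalent tokens; only the creation path differs.
        TokenFactory reflective = new ReflectiveTokenFactory(SimpleToken.class);
        TokenFactory direct = new DirectTokenFactory();
        System.out.println(reflective.makeToken(1, "foo").text);
        System.out.println(direct.makeToken(1, "foo").text);
    }
}
```

How much time the direct path saves depends on the JVM and the workload, so it is worth profiling both, as Andy did.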
> And now after fixing my own bottlenecks, nextToken() and LA() are
> right near the top in my list of CPU hogs :)
Well, the lexer is slow, hence the nextToken speed. LA is called a
HUGE amount in 2.x, repeatedly even in the same decision. 3.0
decisions are optimal in that they call input.LT(i) for token i at
most once during a single decision.
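The difference can be illustrated with a rough sketch. This is not code generated by ANTLR 2.x or 3.0; CountingTokenStream, the token type constants, and both predict methods are hypothetical, made up to count how often a decision asks for lookahead.

```java
// Hypothetical token stream that counts lookahead requests; not an ANTLR class.
class CountingTokenStream {
    private final int[] types;
    private int pos = 0;
    int lookaheadCalls = 0;

    CountingTokenStream(int... types) {
        this.types = types;
    }

    // Returns the type of the i-th token of lookahead (1-based), or -1 past the end.
    int LA(int i) {
        lookaheadCalls++;
        int idx = pos + i - 1;
        return idx < types.length ? types[idx] : -1;
    }
}

class Decisions {
    static final int ID = 1, ASSIGN = 2, LPAREN = 3;

    // Each alternative re-queries the stream, so the same token is fetched repeatedly.
    static int predictRepeated(CountingTokenStream in) {
        if (in.LA(1) == ID && in.LA(2) == ASSIGN) return 1;
        if (in.LA(1) == ID && in.LA(2) == LPAREN) return 2;
        if (in.LA(1) == ID) return 3;
        return 0; // no viable alternative
    }

    // Each lookahead depth is fetched at most once and then tested locally.
    static int predictOnce(CountingTokenStream in) {
        int la1 = in.LA(1);
        if (la1 != ID) return 0; // no viable alternative
        int la2 = in.LA(2);
        switch (la2) {
            case ASSIGN: return 1;
            case LPAREN: return 2;
            default:     return 3;
        }
    }
}

class LookaheadDemo {
    public static void main(String[] args) {
        CountingTokenStream a = new CountingTokenStream(Decisions.ID, Decisions.LPAREN);
        CountingTokenStream b = new CountingTokenStream(Decisions.ID, Decisions.LPAREN);
        int altA = Decisions.predictRepeated(a);
        int altB = Decisions.predictOnce(b);
        // Same predicted alternative, different number of lookahead calls.
        System.out.println("alt=" + altA + " repeated calls=" + a.lookaheadCalls);
        System.out.println("alt=" + altB + " once calls=" + b.lookaheadCalls);
    }
}
```

For the input ID LPAREN, both methods pick alternative 2, but the repeated version makes four lookahead calls where the other makes two.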
> That's when I know
> I must be done, when I can say "I've done about all I can do, and
> the rest must be Terence's fault" ;)
Sounds like a South Park episode..."Blame Terence!" Though, they
misspell "terrence and phillip".
> Obviously, I'm just kidding, and I love ANTLR, even if I don't
hooray!
> believe in treewalkers (or even believe in AST-generating
> parsers much, now that I think about it - guess I'm a lexer man).
:) "To thine own self be true!" :)
Ter
--
CS Professor & Grad Director, University of San Francisco
Creator, ANTLR Parser Generator, http://www.antlr.org
Cofounder, http://www.jguru.com