[antlr-interest] Tokens that span across char streams

Stanislav Sokorac sokorac at gmail.com
Wed Aug 26 20:03:05 PDT 2009


That's very true. It's probably because performance is something you can
get a "feel" for just by reading the code, while memory usage takes a lot
more analysis (sometimes system-wide) to determine.

Do you have any tips on how to keep the memory usage down for large files?
And, how hungry is "quite hungry"? :) Will parsing 20,000 lines of generic
C-style code take tens, hundreds, or thousands of MBs?
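
In case it helps anyone trying to put a number on this: below is a rough sketch (plain JVM, nothing ANTLR-specific) of how the heap retained by a parse could be estimated. HeapProbe and fakeParse are made-up names; fakeParse is only a stand-in for a real lexer/parser run, and since System.gc() is just a hint, the figures are approximate at best.

```java
import java.util.ArrayList;
import java.util.List;

class HeapProbe {
    /** Approximate heap in use right now, in bytes. System.gc() is only a hint. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        System.gc();
        return rt.totalMemory() - rt.freeMemory();
    }

    /**
     * Hypothetical stand-in for a parse: keeps one small object per character,
     * loosely like a token stream that holds on to everything until the end.
     */
    static List<int[]> fakeParse(String input) {
        List<int[]> retained = new ArrayList<>(input.length());
        for (int i = 0; i < input.length(); i++) {
            retained.add(new int[] { input.charAt(i), i });
        }
        return retained;
    }

    public static void main(String[] args) {
        // Build roughly 20,000 lines of C-like text as test input.
        StringBuilder sb = new StringBuilder();
        for (int line = 0; line < 20000; line++) {
            sb.append("int x").append(line).append(" = foo(bar, baz);\n");
        }
        String input = sb.toString();

        long before = usedHeapBytes();
        List<int[]> result = fakeParse(input); // swap in the real lexer/parser here
        long after = usedHeapBytes();

        // 'result' is used after the second measurement so it stays reachable.
        System.out.printf("retained roughly %d KB for %d chars (%d objects)%n",
                (after - before) / 1024, input.length(), result.size());
    }
}
```

Taking the second measurement while the result is still reachable is the point: it approximates what cannot be collected until the parse is finished.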

Stan

On Wed, Aug 26, 2009 at 10:30 PM, David-Sarah Hopwood <david-sarah at jacaranda.org> wrote:

> Stanislav Sokorac wrote:
> > I guess the tricky thing will be to insert this functionality without
> > significantly adding to the run time. If the stream has to check for
> > macros, and also mux between the regular stream and the macro definition,
> > I'm adding two 'if' checks on every single character. Maybe more if I'm
> > also selectively updating character positions.
>
> I wouldn't worry about that. There are already several dynamically
> dispatched method calls per character.
>
> > I could have the lexer signal to the stream when the switch is needed to
> > remove one of those, at least.
> >
> > Or am I over-optimizing here? Is the lexer already doing way more on every
> > character than I'm talking about here? I am going to be running into some
> > significantly large files, so I'd like to avoid overhead wherever I can...
>
> For parsing large files, I would worry more about memory. ANTLR lexer/
> parsers are quite memory-hungry, and typically almost nothing can be gc'd
> until the input is completely parsed and the CharStream and TokenStream
> objects have been discarded.
>
> (FWIW, I think most programmers systematically overestimate the performance
> effect of running additional code, and underestimate the effect of memory
> usage. The latter can absolutely kill performance if it leads to swapping.)
>
> --
> David-Sarah Hopwood  ⚥  http://davidsarah.livejournal.com
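
For reference, the muxing idea from the quoted text (the lexer signals the stream when to switch, so the stream itself never checks for macros) could be sketched roughly as below. This is a standalone illustration, not ANTLR's CharStream API: MuxCharSource, pushMacroBody and drain are hypothetical names, and character-position tracking is left out entirely.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class MuxCharSource {
    /** One input being read: either the main text or a macro body. */
    private static final class Source {
        final String text;
        int pos;
        Source(String text) { this.text = text; }
    }

    // Top of the stack is the source currently being read.
    private final Deque<Source> stack = new ArrayDeque<>();

    MuxCharSource(String mainInput) {
        stack.push(new Source(mainInput));
    }

    /** Called by the (hypothetical) lexer once it has recognised a macro use. */
    void pushMacroBody(String body) {
        stack.push(new Source(body));
    }

    /** Next character, or -1 once every source is exhausted. */
    int next() {
        while (!stack.isEmpty()) {
            Source top = stack.peek();
            if (top.pos < top.text.length()) {
                return top.text.charAt(top.pos++);
            }
            stack.pop(); // finished this source, fall back to the one below
        }
        return -1;
    }

    /** Reads everything that is left; handy for demos and tests. */
    static String drain(MuxCharSource src) {
        StringBuilder sb = new StringBuilder();
        for (int c = src.next(); c != -1; c = src.next()) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        MuxCharSource src = new MuxCharSource("x=;");
        StringBuilder seen = new StringBuilder();
        seen.append((char) src.next()).append((char) src.next()); // reads "x="
        src.pushMacroBody("42");  // lexer decides a macro expands here
        seen.append(drain(src));  // macro body first, then the rest of the main input
        System.out.println(seen);
    }
}
```

In the common case the per-character cost here is a single bounds check on the current source; the switch back to the outer input only happens when a source runs out.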