[antlr-interest] Re: newbie question

Tue Oct 26 09:46:18 PDT 2004

On Tue, 26 Oct 2004 16:32:23 -0000, tsipaggiedad <garyf at austinaggies.com> wrote:
> 
> Cool.. thanks.. the first thing (making it greedy) worked (I tried the
> second one a bit, but couldn't get my actions to work out correctly).
> 
> At any rate.. now.. I need to figure out how to improve its speed.  I
> assume there is a penalty (that I will have to live with) for
> "greedy=true".  However, are there some things I could do to make this
> speedier?  On my test platform, I'm only parsing about 15k lines/second.

To be really frank with you, if I were writing a CSV tokenizer
(really, only a lexer is needed), I'd write it by hand.

Writing a basic tokenizer that uses only one character of lookahead
and hands its consumer one token at a time (i.e. it doesn't need to
stick tokens into a queue) is pretty easy. It can also be made very
fast if you're willing to scan with a character pointer and rely on
the null character as a string terminator. A post I made a few years
back on the comp.compilers newsgroup describes the essence of the
process, written in Delphi/Object Pascal (but it's fairly readable and
translatable into any C-based language, since it uses PChar - the
equivalent of char*):

http://compilers.iecc.com/comparch/article/01-04-039
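
To give a rough idea of the shape of such a tokenizer, here is a
minimal sketch in C. It is not the code from the linked post, just an
illustration of the approach described above: one character of
lookahead, one current token, a char pointer, and '\0' as the end
marker. The names (Lexer, TokenKind, lexer_init, lexer_next) are made
up for this example, and it assumes the whole input sits in memory as
a null-terminated string. It also does not handle doubled quotes
inside quoted fields.

```c
#include <stddef.h>

/* Token kinds for a simple CSV stream (hypothetical names). */
typedef enum { TOK_FIELD, TOK_COMMA, TOK_NEWLINE, TOK_EOF } TokenKind;

typedef struct {
    const char *cur;   /* next unread character in the input */
    TokenKind   kind;  /* kind of the current token */
    const char *start; /* first character of the current field's text */
    size_t      len;   /* length of that text (0 for non-field tokens) */
} Lexer;

/* Assumption: src is null-terminated and outlives the lexer. */
void lexer_init(Lexer *lx, const char *src)
{
    lx->cur = src;
    lx->kind = TOK_EOF;
    lx->start = src;
    lx->len = 0;
}

/* Scan one token, deciding what to do from the single character at *p. */
TokenKind lexer_next(Lexer *lx)
{
    const char *p = lx->cur;

    lx->start = p;
    lx->len = 0;

    switch (*p) {
    case '\0':                      /* terminator: stay put, report EOF */
        lx->kind = TOK_EOF;
        break;
    case ',':
        p++;
        lx->kind = TOK_COMMA;
        break;
    case '\r':                      /* treat CR or CRLF as one newline */
        p++;
        if (*p == '\n')
            p++;
        lx->kind = TOK_NEWLINE;
        break;
    case '\n':
        p++;
        lx->kind = TOK_NEWLINE;
        break;
    case '"':                       /* quoted field: text between the quotes */
        p++;
        lx->start = p;
        while (*p != '\0' && *p != '"')
            p++;
        lx->len = (size_t)(p - lx->start);
        if (*p == '"')
            p++;                    /* skip the closing quote if present */
        lx->kind = TOK_FIELD;
        break;
    default:                        /* bare field: runs to a separator */
        while (*p != '\0' && *p != ',' && *p != '\r' && *p != '\n')
            p++;
        lx->len = (size_t)(p - lx->start);
        lx->kind = TOK_FIELD;
        break;
    }

    lx->cur = p;
    return lx->kind;
}
```

The consumer calls lexer_init once and then lexer_next in a loop until
it returns TOK_EOF, reading start and len whenever the token is a
TOK_FIELD. Nothing is copied or allocated per token, which is where
most of the speed comes from. An empty field shows up as two
consecutive TOK_COMMA tokens, so the consumer has to infer it.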

HTH,

-- Barry Kelly
