[antlr-interest] Incremental Parsing?

David Piepgrass qwertie256 at gmail.com
Thu Jul 19 07:23:53 PDT 2007


Sohail's suggestion makes the most sense to me. But if you just have a
lexer, not a parser, all you have to do is call NextToken() on your
lexer (don't even bother using CommonTokenStream).
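
To make the "just pull the next one" idea concrete, here is a rough sketch in plain Java. It is NOT ANTLR-generated code and not anyone's actual grammar: CsvRecordReader and nextRecord() are names I made up, and the hand-written loop only stands in for whatever your generated lexer does when you call NextToken() (nextToken() in the Java runtime). It reads from a Reader one character at a time, so only the current record is held in memory, and it assumes the usual CSV conventions (quotes doubled to escape them, newlines allowed inside quoted fields).

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated lexer: hands out one CSV record per call.
class CsvRecordReader {
    private final PushbackReader in;

    CsvRecordReader(Reader source) {
        this.in = new PushbackReader(source, 1);
    }

    /** Returns the next record, or null once the input is exhausted. */
    List<String> nextRecord() {
        try {
            return readRecord();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private List<String> readRecord() throws IOException {
        int c = in.read();
        if (c == -1) return null;          // nothing left to pull
        in.unread(c);

        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean quoted = false;
        while (true) {
            c = in.read();
            if (quoted) {
                if (c == -1) throw new IOException("unterminated quoted field");
                if (c == '"') {
                    int next = in.read();
                    if (next == '"') {
                        field.append('"');  // "" inside quotes is an escaped quote
                    } else {
                        quoted = false;     // closing quote; put back what follows
                        if (next != -1) in.unread(next);
                    }
                } else {
                    field.append((char) c); // commas and newlines are literal here
                }
            } else if (c == '"' && field.length() == 0) {
                quoted = true;              // opening quote at start of a field
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else if (c == '\n' || c == -1) {
                fields.add(field.toString());
                return fields;              // end of record
            } else if (c != '\r') {
                field.append((char) c);
            }
        }
    }

    public static void main(String[] args) {
        CsvRecordReader reader =
            new CsvRecordReader(new StringReader("a,\"x\"\"y\nz\",b\n1,2\n"));
        List<String> record;
        while ((record = reader.nextRecord()) != null) {
            System.out.println(record.size() + " fields: " + record);
        }
    }
}
```

The calling loop in main() is the part that matters: the caller asks for one record, handles it, and asks again, the same way you would loop on the lexer until it hands back EOF. Swap the StringReader for a buffered FileReader to run it over a real file.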

IIRC, all solutions suggested will suffer from the problem that the
entire file will be loaded into memory (although parsing can still be
incremental). I wonder if that file-mapping stream is available
yet--potentially it could eliminate that problem, although depending
on what you do with the records after you get them, having the entire
file loaded by the end might actually save memory because String
objects don't have to be created for each and every token...

On 7/16/07, Benji Smith <benji at benjismith.net> wrote:
> Is it possible (and straightforward) to implement an incremental
> parser with antlr?
>
> The reason I ask is that I recently wrote a JavaCC parser for CSV
> files (which correctly handles quoted fields, quote escaping, and
> fields containing newline characters (a huge pain in the ass if you're
> trying to parse CSV without writing a grammar)).
>
> The JavaCC parser works well, except that I can't parse the file
> record-by-record. I have to parse the entire file, returning a
> collection of Record objects at the end.
>
> With small files, that's fine. But I sometimes need to parse CSV files
> with millions of records, and it'd be very nice to simply pull the
> next record (as long as it complies with the grammar), much like you'd
> pull the next token from a TokenStream.
>
> Any tips on performing incremental parsing with antlr?
>
> Thanks!
>
> --benji smith
>


-- 
- David
http://qism.blogspot.com

