[antlr-interest] Out of memory - how to avoid retaining all tokens??
John Pool
j.pool at ision.nl
Thu Mar 4 23:32:35 PST 2010
I have an ANTLR application with which I process a large (100 MB) file by
means of a filter grammar (options {filter=true;}), in search of a simple
pattern. No backtracking is necessary. Each time the pattern is found, a
call is made to a (C#) method that does some bookkeeping.
The file is scanned with the following loop:
while (lexer.NextToken () != Token.EOF_TOKEN) { }
When the pattern is encountered, the C# method is called. From 'the book',
section 5.8 (the filter option), I understood that 'the lexer yields an
incomplete stream of tokens'. This, however, does not prevent an
out-of-memory exception from occurring after a while.
How can I prevent this from happening? No backtracking whatsoever needs to
occur, so no token history needs to be retained during the execution of the
above loop. I have tried inserting {Skip();} actions, but this does not
seem to help.
I noticed that the exception does not occur (and the file is scanned
considerably faster) when, in lexer.NextToken(), I comment out
int m = input.Mark();
and
input.Rewind(m);
but I am not sure what undesired side effects this may have.
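For anyone puzzling over why those two calls matter: a minimal sketch (in Java, ANTLR's native runtime language; the class and method names here are invented for illustration and are NOT the actual ANTLR runtime) of the general contract behind Mark()/Rewind(). A stream that must be able to rewind to a mark has to retain every character read since the oldest outstanding mark, so if marks are taken on every token and never released, the buffer grows with the whole input. Consuming without marks lets the buffer be discarded as it is read.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

// Simplified stand-in for a markable input stream (not the ANTLR API).
// Invariant: everything read since the oldest outstanding mark is kept,
// because rewind() may need it; with no mark outstanding, consumed data
// can be dropped immediately.
class MarkableStream {
    private final Reader in;
    private final StringBuilder buf = new StringBuilder(); // retained data
    private int pos = 0;          // cursor into buf
    private int oldestMark = -1;  // -1 => no outstanding mark

    MarkableStream(Reader in) { this.in = in; }

    // Record the current position so rewind() can return to it.
    int mark() {
        if (oldestMark < 0) oldestMark = pos;
        return pos;
    }

    void rewind(int m) { pos = m; }

    // Drop all marks, allowing the buffer to be compacted again.
    void release() {
        oldestMark = -1;
        compact();
    }

    // Return the next character, or -1 at end of input.
    int consume() throws IOException {
        if (pos == buf.length()) {
            int c = in.read();
            if (c < 0) return -1;
            buf.append((char) c);
        }
        int c = buf.charAt(pos++);
        if (oldestMark < 0) compact(); // nothing pins the buffer: discard
        return c;
    }

    private void compact() {
        buf.delete(0, pos);
        pos = 0;
    }

    // How much input is currently being retained.
    int buffered() { return buf.length(); }
}

public class Demo {
    public static void main(String[] args) throws IOException {
        String input = "x".repeat(10_000);

        // A mark held during the scan pins the buffer: everything is kept.
        MarkableStream withMark = new MarkableStream(new StringReader(input));
        withMark.mark();
        while (withMark.consume() >= 0) { }
        System.out.println("buffered with mark:    " + withMark.buffered());

        // Without marks, consumed data is discarded as the scan proceeds.
        MarkableStream noMark = new MarkableStream(new StringReader(input));
        while (noMark.consume() >= 0) { }
        System.out.println("buffered without mark: " + noMark.buffered());
    }
}
```

Whether the real C# runtime can safely run without those Mark()/Rewind() calls depends on whether the filter rules ever need to back up after a partial match; the sketch only shows why removing them changes the memory profile.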
Regards, John Pool