[antlr-interest] Out of memory - how to avoid retaining all tokens??

John Pool j.pool at ision.nl
Thu Mar 4 23:32:35 PST 2010

I have an ANTLR application with which I process a large (100 MB) file by
means of a filter grammar (@options{filter=true;}), in search of a simple
pattern. No backtracking is necessary. Each time the pattern is found, a
call is made to a (C#) method that does some bookkeeping. 

The file is scanned with the following loop: 

while (lexer.NextToken() != Token.EOF_TOKEN) { } 

When the pattern is encountered, the C# method is called. From 'the book' at
section 5.8 (filter option) I understood that 'the lexer yields an
incomplete stream of tokens'. This, however, does not prevent an out of
memory exception from occurring after a while. 

How can I prevent this from happening? No backtracking whatsoever needs to
occur, so no token history has to be retained during the execution of the
above loop. I have tried inserting {Skip();} statements, but this does not
seem to help.

I noticed that the exception does not occur (and scanning the file goes
considerably faster) when in lexer.NextToken() I comment out 

int m = input.Mark(); 

but I am not sure what undesired effect this may have.
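That symptom would be consistent with unreleased markers piling up inside the character stream: in the Java reference runtime (which I assume the C# port mirrors), Mark() pushes a saved stream state onto an internal list, and only a matching Rewind()/Release() pops it, so a Mark() per token that is never balanced retains one state object per token for a 100 MB input. Here is a minimal Java model of that bookkeeping — the class and method names are mine for illustration, not the runtime's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a markable stream: mark() saves the current
// stream state on a list; rewind()/release() pops it. A mark that is
// never released stays on the list, so memory grows with token count.
public class MarkableStream {
    private static class State {
        final int pos;
        State(int pos) { this.pos = pos; }
    }

    private final List<State> markers = new ArrayList<>();
    private int markDepth = 0;
    private int pos = 0;

    public void consume() { pos++; }

    public int mark() {
        markDepth++;
        markers.add(new State(pos)); // retained until released
        return markDepth;
    }

    public void rewind(int marker) {
        pos = markers.get(marker - 1).pos; // restore saved position
        release(marker);
    }

    public void release(int marker) {
        markDepth = marker - 1;
        // drop released states so they can be garbage-collected
        while (markers.size() > markDepth) {
            markers.remove(markers.size() - 1);
        }
    }

    public int retainedMarkers() { return markers.size(); }

    public static void main(String[] args) {
        // Leaky pattern: mark before every token attempt, never release
        // on a successful match -- one retained state per token.
        MarkableStream leaky = new MarkableStream();
        for (int token = 0; token < 100_000; token++) {
            leaky.mark();
            leaky.consume();
        }
        System.out.println(leaky.retainedMarkers()); // 100000

        // Balanced pattern: release every mark once the match succeeds.
        MarkableStream balanced = new MarkableStream();
        for (int token = 0; token < 100_000; token++) {
            int m = balanced.mark();
            balanced.consume();
            balanced.release(m); // keeps the marker list empty
        }
        System.out.println(balanced.retainedMarkers()); // 0
    }
}
```

If that is indeed what happens, releasing the marker after each successful match (rather than deleting the Mark() call outright) might be the safer workaround, since the rewind on a failed alternative still needs a valid marker to restore from.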


Regards, John Pool

