[antlr-interest] Small input and grammar causes out of memory error (Java)

jason.terhune at thomsonreuters.com
Tue Dec 16 13:50:26 PST 2008


Hi all,

I was playing with the TDD example on the wiki, and I was surprised to
see an out of memory error with my trivial grammar and input using
v3.1.1.  After I fixed my grammar by adding a space to the NONBREAKING
definition, the problem went away:
NONBREAKING : ('a'..'z' | 'A'..'Z' | ' ');

I won't pretend to understand parser generators, but it seems like this
problem should fail fast instead of consuming a bunch of memory.  Is
there an option I can set to avoid this?  Should I submit this as a bug?
I've pasted the grammar, test case and stack trace below.

Thanks,
Jason


--- grammar ---

grammar CSV;

options {
	language = Java;
}

@header {
  package com.trgr.parser;
}

@lexer::header {package com.trgr.parser;}

line returns [List<String> result]
@init {
    result = new ArrayList<String>();
}
:  term NEWLINE;

term returns [String parsedItem]
  : f=TERM { $parsedItem = $f.text;}
  |   // nothing
  ;

NEWLINE : '\r'? '\n';

NONBREAKING : ('a'..'z' | 'A'..'Z');

TERM : NONBREAKING*;


--- junit test ---

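package com.trgr.parser;

import static org.junit.Assert.assertEquals;

import java.io.IOException;
import java.util.List;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CharStream;
import org.antlr.runtime.CommonTokenStream;
import org.antlr.runtime.RecognitionException;
import org.junit.Test;

public class CSVParserTest {
	@Test
	public void testMultipleWords() throws IOException, RecognitionException {
	    CSVParser parser = createParser("Red Blue\n");
	    List<String> result = parser.line();
	    assertEquals(2, result.size());
	    // Compare individual elements, not the list itself
	    assertEquals("Red", result.get(0));
	    assertEquals("Blue", result.get(1));
	}
	
	// Builds the lexer -> token stream -> parser chain for a test string
	private CSVParser createParser(String testString) throws IOException {
	    CharStream stream = new ANTLRStringStream(testString);
	    CSVLexer lexer = new CSVLexer(stream);
	    CommonTokenStream tokens = new CommonTokenStream(lexer);
	    CSVParser parser = new CSVParser(tokens);
	    return parser;
	}
}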

--- exception ---

java.lang.OutOfMemoryError: Java heap space
	at java.util.Arrays.copyOf(Unknown Source)
	at java.util.Arrays.copyOf(Unknown Source)
	at java.util.ArrayList.ensureCapacity(Unknown Source)
	at java.util.ArrayList.add(Unknown Source)
	at org.antlr.runtime.CommonTokenStream.fillBuffer(CommonTokenStream.java:116)
	at org.antlr.runtime.CommonTokenStream.LT(CommonTokenStream.java:238)
	at org.antlr.runtime.CommonTokenStream.LA(CommonTokenStream.java:300)
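
One plausible reading of this trace, stated as an assumption rather than a confirmed diagnosis: because TERM is NONBREAKING*, it can match zero characters, so when the lexer reaches a character no rule accepts (the space in "Red Blue") it may keep emitting empty TERM tokens without advancing, and fillBuffer keeps adding them until the heap runs out. The sketch below is not ANTLR code. EmptyTokenDemo and its methods are hypothetical names for a toy hand-written tokenizer that mimics NEWLINE and TERM, and it shows the kind of fail-fast check asked about above: refuse any token that consumes no input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy tokenizer; it only imitates the NEWLINE and TERM rules
// from the grammar above and is not generated by or part of ANTLR.
public class EmptyTokenDemo {

    // Mirrors 'a'..'z' | 'A'..'Z' (no space, as in the failing grammar).
    static boolean isNonBreaking(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    // Mirrors NEWLINE : '\r'? '\n'; returns the matched length, or 0 if no match.
    static int matchNewline(String input, int pos) {
        int p = pos;
        if (p < input.length() && input.charAt(p) == '\r') {
            p++;
        }
        if (p < input.length() && input.charAt(p) == '\n') {
            return p + 1 - pos;
        }
        return 0;
    }

    // Mirrors TERM : NONBREAKING*; note that it can legitimately return 0.
    static int matchTerm(String input, int pos) {
        int end = pos;
        while (end < input.length() && isNonBreaking(input.charAt(end))) {
            end++;
        }
        return end - pos;
    }

    // Tokenizes the input, failing fast instead of looping on an empty match.
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<String>();
        int pos = 0;
        while (pos < input.length()) {
            int len = matchNewline(input, pos);
            if (len == 0) {
                len = matchTerm(input, pos);
            }
            if (len == 0) {
                // Without this guard the loop would add empty tokens forever.
                throw new IllegalStateException(
                    "zero-length token at index " + pos
                    + " (char '" + input.charAt(pos) + "')");
            }
            tokens.add(input.substring(pos, pos + len));
            pos += len;
        }
        return tokens;
    }
}
```

With this guard, tokenize("Red Blue\n") throws at index 3 (the space), while tokenize("Red\n") returns two tokens: Red and the newline.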


