[antlr-interest] gUnit freezing when choking on invalid input

Fri Jun 27 02:55:56 PDT 2008

Hi all,

I am running into an annoying issue: gUnit freezes when the parser
fails on a test input. Here is my test case, using ANTLR v3.0 and
gUnit 1.0.1.

----------[Expr.g]----------------------
grammar Expr;

options {
	output=AST;
}

tokens {
	PROG;
}

@header {
package com.expr;
}

@lexer::header {
package com.expr;
}

program:
	(expr ';')*							-> ^(PROG expr*)
;

// The following rules are deliberately incorrect
// Should be expr1 ('+' expr1)* and expr2 ('*' expr2)*
// for a real expression parser :-)
expr:
	expr1 '+' expr1						-> ^('+' expr1 expr1)
;

expr1:
	expr2 '*' expr2						-> ^('*' expr2 expr2)
;

expr2:
	INTEGER								-> INTEGER
|	'(' expr ')'						-> expr
;

INTEGER:
	('0'..'9')+
;
-------------------------------------------

----------[Expr.testsuite]-----------
gunit Expr;

@header {
package com.expr;
}

expr:
	"1+2"				-> (+ 1 2)
-------------------------------------------

With this input, gUnit freezes indefinitely. When I feed the parser
directly, I get "line 0:-1 mismatched input '<EOF>' expecting '*'",
which is correct given the badly written grammar (see the comments
above rules expr and expr1). However, gUnit does not see the error and
gets stuck. The JUnit code that gUnit generates with the -o option
freezes too, and reading it shows that it internally uses threads to
pass input to the parser and retrieve its output. The issue might have
something to do with a thread starving on a reader.

This is annoying when run from the command line: the whole test suite
freezes, so you have to run the parser against every test input by
hand to track down the actual error. This also makes gUnit unusable in
continuous integration scenarios.
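
As a stopgap for unattended runs, a call that may hang can be bounded
with a timeout so that one stuck test fails instead of blocking the
whole run. The following is only a minimal sketch of that idea in
plain Java (8 or later), not something gUnit provides: TimeoutGuard
and runWithTimeout are hypothetical names, and the two lambdas in main
stand in for a real parser call and for the frozen test.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of gUnit or ANTLR.
class TimeoutGuard {

    // Runs the task on a daemon worker thread and waits at most
    // timeoutMillis for its result. Throws TimeoutException if the
    // task has not finished by then.
    static <T> T runWithTimeout(Callable<T> task, long timeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "guarded-test");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            future.cancel(true);    // interrupts the worker if it is still running
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a parser call that returns normally.
        System.out.println(runWithTimeout(() -> "(+ 1 2)", 1000));

        // Stand-in for a call that never returns, like the frozen test.
        try {
            runWithTimeout(() -> { Thread.sleep(Long.MAX_VALUE); return "never"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

Note that the interrupt only ends a worker that is blocked in an
interruptible call; a thread stuck in a non-interruptible read would
be abandoned as a daemon thread rather than stopped.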

Another oddity is that gUnit behaves as expected, i.e. it reports the
error, when the AST construction is done inline:
-------------------------------------------
expr:
	expr1 '+'^ expr1
;

expr1:
	expr2 '*'^ expr2
;
-------------------------------------------

Shouldn't both construction methods be equivalent?

Any insights on these issues?

Thanks!

Thomas