[antlr-interest] Testing lexer grammars with gunit

Wed Nov 25 00:16:16 PST 2009

On Nov 24, 2009, at 12:17 AM, Gavin Lambert wrote:

> At 20:54 24/11/2009, Leon Su wrote:
>> gUnit P;
>> lexical-rule-name:
>> "input" OK
>> ...
>>
>> By the way, the next release of gUnit will allow you to test a  
>> lexer grammar individually with the syntax: gUnit lexer L;
>
> Does gUnit only support that kind of limited testing?  (I ask out of  
> ignorance; I've never really looked at it.)
>
> For lexer rules in particular, "OK" is a fairly meaningless test.   
> What'd be better is something like:
>
> gUnit P;
> LEXER:
>  "abc" ID
>  "abc123" ID
>  "123" INT
>  "a+b" ID["a"] PLUS ID["b"]
>  "a--b" ID DECREMENT ID
>  "a- -b" ID MINUS MINUS ID
>
> etc.  Then you could do lexer-only testing for lexer grammars and  
> lexer-and-parser testing for combined grammars.

gUnit treats every rule as a unit, the smallest testable part of a  
grammar, and tests whether each unit is fit for use on its own.
The test you suggested above could therefore be rewritten in gUnit  
format as follows:

gunit P;
ID:
"abc" OK
"abc123" OK
"a" OK
"b" OK
INT:
"123" OK
PLUS:
"+" OK
"-" FAIL
...

That said, I also like your idea of token-stream-style testing for  
lexer grammars.
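
To make the difference between the two styles concrete, here is a rough,
self-contained Python sketch. It is not gUnit and does not use ANTLR at
all: TOKEN_RULES, rule_matches and tokenize are hypothetical names, and
the regular expressions are only my guess at what ID, INT, PLUS, MINUS
and DECREMENT might look like in grammar P. The regex alternation is
tried in list order, which is a simplification and not how an ANTLR
lexer chooses between rules.

```python
import re

# Hypothetical stand-ins for the lexer rules of grammar P (assumed
# definitions). DECREMENT is listed before MINUS so that "--" is
# tried before "-".
TOKEN_RULES = [
    ("ID", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("INT", r"[0-9]+"),
    ("DECREMENT", r"--"),
    ("PLUS", r"\+"),
    ("MINUS", r"-"),
    ("WS", r"\s+"),
]


def rule_matches(rule_name, text):
    """Per-rule style (OK/FAIL): does one rule match the whole input?"""
    pattern = dict(TOKEN_RULES)[rule_name]
    return re.fullmatch(pattern, text) is not None


def tokenize(text):
    """Token-stream style: return (type, text) pairs, skipping whitespace."""
    master = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_RULES))
    pos, tokens = 0, []
    while pos < len(text):
        m = master.match(text, pos)
        if m is None:
            raise ValueError(f"no token matches at offset {pos}: {text[pos:]!r}")
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


# Per-rule checks, mirroring the OK / FAIL lines above.
print(rule_matches("ID", "abc123"))   # corresponds to: "abc123" OK
print(rule_matches("PLUS", "-"))      # corresponds to: "-" FAIL

# Token-stream checks, mirroring the inputs from the quoted suggestion.
for source in ["a+b", "a--b", "a- -b"]:
    print(source, [kind for kind, _ in tokenize(source)])
```

The per-rule checks can only say whether a single rule accepts a given
string. The token-stream checks also show how rules interact, for
example that "a--b" and "a- -b" come out as different token sequences,
which is the kind of thing the OK/FAIL form cannot express.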

-L