[antlr-interest] Lexer rule problem - ANTLR 2.7.7

Fri Feb 1 11:32:19 PST 2008

Hey, Gavin, thanks so much for the personalized reply!  I am continually
impressed by the level of interest and support from members of this group.

I have been using the gUnit environment for catching cases like this as you
suggest, and I'm reasonably happy with that approach.  However, there are
grammar constructs (such as the one in the example) which should, or at
least might, be detectable as ambiguous without having to create a test.
The problem with a test is that it is not going to be portable to the next
grammar I develop.

To some extent ANTLR already does this - for example, it can spot when a
rule has two ambiguous paths and will warn you.  I guess I was prompting for
this facility to be improved.  A prime example might be a check to ensure
that the cardinality of each token in a parse rule matches that in the tree
construction.  For example, to catch the fault in the following:
my_rule:
	a b+ c*  -> ^(a b+ c+)
	;

...which gives a RewriteEmptyStreamException if the right (wrong) input is
given.  I could write a test case, but only if I spot it.  The first person
to trip over this might be my customer...
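
As a rough illustration of the kind of check meant here, the following is a
minimal Python sketch, not anything ANTLR provides.  It assumes a flat rule
body with no nesting, alternatives or labels, and the function names
(min_counts, find_cardinality_mismatches) are hypothetical.  It reports every
element that the rewrite requires at least once but that the parse side is
allowed to match zero times.

```python
import re

# Matches a simple element name with an optional EBNF suffix, e.g. "b+" or "c*".
ELEMENT = re.compile(r"([A-Za-z_]\w*)([?*+]?)")


def min_counts(fragment):
    """Map each element name to the minimum number of times it must appear."""
    counts = {}
    for name, suffix in ELEMENT.findall(fragment):
        # "?" and "*" allow zero occurrences; no suffix or "+" needs at least one.
        counts[name] = 0 if suffix in ("?", "*") else 1
    return counts


def find_cardinality_mismatches(rule_body):
    """Return names the rewrite requires but the parse side may match zero times."""
    parse_side, rewrite_side = rule_body.split("->", 1)
    parsed = min_counts(parse_side)
    rewritten = min_counts(rewrite_side)
    # Names that only appear in the rewrite (e.g. imaginary tokens) are ignored.
    return sorted(
        name
        for name, needed in rewritten.items()
        if needed == 1 and parsed.get(name, 1) == 0
    )


print(find_cardinality_mismatches("a b+ c*  -> ^(a b+ c+)"))  # prints ['c']
```

Run against the rule above it flags c; changing the rewrite to ^(a b+ c*)
makes the report empty.  A real check would have to work on ANTLR's own
grammar model rather than on regular expressions.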

I love ANTLR, but boy is it frustrating at times!

Many thanks,

Mark Edgeworth 

Incoras Solutions Ltd.
Tel:  +44 (0)700 580 8048;  Fax: +44 (0)700 597 8009
Skype: 'markedgeworth'

-----Original Message-----
From: Gavin Lambert [mailto:antlr at mirality.co.nz] 
At 07:14 1/02/2008, Mark Edgeworth wrote:
 >As a relatively new user of ANTLR I have spent literally ages on
 >just this sort of issue recently.  ANTLR is a great tool and just
 >so powerful, but will often accept 'unusual' grammar constructs
 >from newbies like me that can never work properly.  Any extra
 >build time checks are worth their weight in gold to me as I
 >learn... (ie 'lint' for .g files?)

Unit testing.  Construct a test harness for your lexer/parser in 
the language of your choice (e.g. via JUnit, NUnit, CppUnit, or one 
of the many other frameworks) and run it frequently to ensure that 
changes you've made to your grammar haven't broken anything.

It's not quite build time (although it could be -- some build 
utilities can automatically run tests after building) but it's 
very specific and helps you make sure that your parser is doing 
what you think it's doing.

It's especially useful for testing the lexer in isolation, since 
you can't really do that properly in ANTLRWorks.
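
A minimal sketch of what such a lexer-only test might look like, written
here with Python's unittest.  The tokenize function and TOKEN_SPEC table are
hypothetical stand-ins for a generated lexer; in a real harness the tests
would feed the same strings to the ANTLR-generated lexer class and compare
the token types it emits.

```python
import re
import unittest

# Hypothetical stand-in for a generated lexer: (token type, pattern) pairs.
TOKEN_SPEC = [
    ("INT", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("WS", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return (type, text) pairs, skipping whitespace; raise on unknown input."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class LexerTest(unittest.TestCase):
    def test_identifier_and_integer(self):
        self.assertEqual(tokenize("x 42"), [("ID", "x"), ("INT", "42")])

    def test_rejects_unknown_character(self):
        with self.assertRaises(ValueError):
            tokenize("x $")
```

Saved to a file, the tests can be run with "python -m unittest" after every
grammar change, which is the habit being recommended above.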