[antlr-interest] sentences of a grammar

Gerrie Myburg [ MTN - Innovation Centre ] Myburg_G at mtn.co.za
Fri Jun 9 02:24:27 PDT 2006


I write some code that will take an antlr grammar definition, create a
graph from the grammar and will try to generate all possible sentences
from the grammar. The following rules are used to avoid infinite length
sentences.

(a)* is rewritten as ( |a|aa| ... ) to a max limit 
(a)+ is rewritten as ( a|aa| ... ) to a max limit
(a)? is rewritten as ( |a) 

Recursive loops are avoided by entering the loop once and once only.
Infinite recursion is detected and reported.

The sentences are created by finding all possible paths in the grammar
graph. Backtracking is used in the graph at every point in the path
where a '|' is encountered.

Token references are resolved at the moment by referring to a map of
token ref definitions. If no definition is supplied then the token ref
string is used instead.

The code is meant to be used on a grammar that has already been passed
by the antlr compiler.

I wrote the code to test the tree parser of a sql compiler. While
testing the code I found that the number of sentences generated for some
of the sql rules is enormous. The number of sentences can run into the
millions for most rules. The antlr grammar in turn yields about 8000
sentences.

The only sensible way of using the code is to generate a sentence then
feed it to the destination compiler directly instead of writing out a
file.

The sentences are not necessarily semantically correct at the moment. In
fact I am sure that it is not the case.

Does anyone want me to upload the code to the antlr website?


NOTE: This e-mail message is subject to the MTN Group disclaimer see http://www.mtn.co.za/disclaimer



More information about the antlr-interest mailing list