[antlr-interest] Progressive Slowdown in Parsing

A. Saake asaake at hotmail.de
Mon Dec 22 02:44:10 PST 2008


Thanks for the suggestions.

Can you please give me a hint, what GetKindOfOps is responsible for? My 
impression is, that (not scientific spoken) "all ealier rewrites are visited 
again". Is that possible?

My first attempt was to generate an AST and then process it with a tree 
grammar.

But the more my grammar growed, the more difficult it was to maintain the 
tree grammar. And so I concentrated on a single grammar with embedded 
templates, supported by some external tables and functions.

But if an AST output is the solution, I will try it again.

The 4GL itself is not very far away from C#, as the sample below may show. 
Here is a sample of the 4GL code:

        LOCAL VOID FUNCTION set_envment()

        LOCAL $durchl,	$konf_wert,	$konf_kz1, $baudb_cmd, $kasto,
	$aktionsstr, $aktionskunden, $pos,
	$prov_regel, $prov_grp, $prov_bereich, $prov_regzusatz
        BEGIN

	 /*-------Basisw{hrung und Euro-W"hrungs-Krzel und Euro-W"hrungs-Kurs*/
	SET $str_basis, $basis_kz_z, $basis_kz2_z TO
		SELECT konf_wert, konf_kz1, konf_kz2
		FROM kon
		WHERE konf_feld = 'Basiswaehrung'
		  AND konf_zen = $zent_id_z;

	IF $str_basis IS UNDEFINED AND $zent_id_z > 1 THEN
		SET $str_basis, $basis_kz_z, $basis_kz2_z TO
			SELECT konf_wert, konf_kz1, konf_kz2
			FROM kon
			WHERE konf_feld = 'Basiswaehrung'
			  AND konf_zen = 1;

	IF $str_basis IS UNDEFINED OR strlen$($str_basis) = 0 THEN
		SET $str_basis TO 'DM;EUR;DM'

                ... and so on

 and that is my C# result so far:

	  	private void set_envment() {
	  	  /* LOCAL $durchl,	$konf_wert,	$konf_kz1, $baudb_cmd, $kasto,
	  	  	$aktionsstr, $aktionskunden, $pos,
	  	  	$prov_regel, $prov_grp, $prov_bereich, $prov_regzusatz */

 	  	      using (UnitradeContext ctx = new UnitradeContext()) {
 	  	              ctx.ObjectTrackingEnabled = false;

 	  	              var q = (from c in ctx.KON
 	  	                            where (c.KONF_FELD == "Basiswaehrung" && 
c.KONF_ZEN == MainForm.zent_id_z)
  		                             select c).First();

 	  	              if (q != null) {
 	  	                str_basis = q.KONF_WERT;
 	  	                MainForm.basis_kz_z = q.KONF_KZ1;
 	  	                MainForm.basis_kz2_z = q.KONF_KZ2;
 	  	              }
 	  	          }

 	  	          if (str_basis == null && MainForm.zent_id_z > 1) {
  		            using (UnitradeContext ctx = new UnitradeContext()) {
 	  	  	ctx.ObjectTrackingEnabled = false;

 	  	  	var q = (from c in ctx.KON
 	  	  	              where (c.KONF_FELD == "Basiswaehrung" && c.KONF_ZEN == 
1)
  		                              select c).First();

 	  	                if (q != null) {
 	  	  	   str_basis = q.KONF_WERT;
 	  	  	   MainForm.basis_kz_z = q.KONF_KZ1;
 	  	  	   MainForm.basis_kz2_z = q.KONF_KZ2;
  		               }
 	  	           }
 	  	      }

		     if (str_basis == null || Sysfunc.Strlen(str_basis) == 0) {
 	  	        str_basis = "DM;EUR;DM";
 	  	    }

Thanks for your help!

Andreas

--------------------------------------------------
From: "Johannes Luber" <JALuber at gmx.de>
Sent: Monday, December 22, 2008 11:11 AM
To: "A. Saake" <asaake at hotmail.de>; <antlr-interest at antlr.org>
Subject: Re: [antlr-interest] Progressive Slowdown in Parsing

>> Hi,
>>
>> I'm new to ANTLR, but with support of the "The Definitive Reference ...",
>> the great tool ANTRLWorks and much optimism I got all problems solved, so
>> far.
>> ANTLR is a very impressive tool and probably will help me to migrate
>> legacy 4GL code to C#, as I hope.
>>
>> But now I'm facing a problem, where help would be very appreciated !
>>
>> I'm converting a 4GL "local function" with 3500 lines of code, mainly
>> constisting of assignments (SET ... TO ...), simple IF's and embedded SQL
>> Selects. This means, there are many flat, not deep statements.
>>
>> The first 2000 lines can be processed in under 1 min. If I parse the 
>> whole
>> script, time increases to 15 minutes.
>
> This sounds, if the the runtime of your program is O(n^2).
>
>> For a correct migration of this
>> script (it's an include file), I would have to embed it into another 3500 
>> lines
>> code script, and I'm afraid that it will need a very long time. Because 
>> of
>> the variable system of the 4GL (declaration of variables is not 
>> necessary,
>> so there's no scope, I have to estimate it from context), I will have to
>> run it many times.
>>
>> To find out, if the slowdown is from my grammar, I tried lot's of
>> syntactic predicates and so on, until I used a profiler, which names 
>> GetKindOfOps
>> as responsible for nearly 80% of the runtime.
>>
>> My grammar is a combined lexer and parser, output is template and I use
>> the token rewrite mechanism.
>>
>> Can you give me a hint, what I can do for more speed?
>> Would building an output tree and then generating templates be faster?
>
> The question is if you can cache the results of GetKindOfOps(). If yes 
> then using an output tree with a symbol table seems to be a good choice. 
> But as 4GL is probably very different from C#, the use of 
> TokenRewriteStream is a bad choice. After all, TokenRewriteStream is meant 
> only for small surgical changes on the input and not for complete 
> translations. Changing the grammar to TokenStream could mean a speedup 
> alone.
>
> Johannes
>>
>> Many thanks in advance !
>>
>> Andreas
>
> -- 
> Psssst! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen: 
> http://www.gmx.net/de/go/multimessenger
> 


More information about the antlr-interest mailing list