[antlr-interest] Progressive Slowdown in Parsing

Johannes Luber JALuber at gmx.de
Mon Dec 22 02:11:31 PST 2008


> Hi,
> 
> I'm new to ANTLR, but with support of the "The Definitive Reference ...",
> the great tool ANTRLWorks and much optimism I got all problems solved, so
> far. 
> ANTLR is a very impressive tool and probably will help me to migrate
> legacy 4GL code to C#, as I hope.
> 
> But now I'm facing a problem, where help would be very appreciated !
> 
> I'm converting a 4GL "local function" with 3500 lines of code, mainly
> constisting of assignments (SET ... TO ...), simple IF's and embedded SQL
> Selects. This means, there are many flat, not deep statements.
> 
> The first 2000 lines can be processed in under 1 min. If I parse the whole
> script, time increases to 15 minutes.

This sounds, if the the runtime of your program is O(n^2).

> For a correct migration of this
> script (it's an include file), I would have to embed it into another 3500 lines
> code script, and I'm afraid that it will need a very long time. Because of
> the variable system of the 4GL (declaration of variables is not necessary,
> so there's no scope, I have to estimate it from context), I will have to
> run it many times.
> 
> To find out, if the slowdown is from my grammar, I tried lot's of
> syntactic predicates and so on, until I used a profiler, which names GetKindOfOps
> as responsible for nearly 80% of the runtime. 
> 
> My grammar is a combined lexer and parser, output is template and I use
> the token rewrite mechanism.
> 
> Can you give me a hint, what I can do for more speed? 
> Would building an output tree and then generating templates be faster?

The question is if you can cache the results of GetKindOfOps(). If yes then using an output tree with a symbol table seems to be a good choice. But as 4GL is probably very different from C#, the use of TokenRewriteStream is a bad choice. After all, TokenRewriteStream is meant only for small surgical changes on the input and not for complete translations. Changing the grammar to TokenStream could mean a speedup alone.

Johannes
> 
> Many thanks in advance !
> 
> Andreas 

-- 
Psssst! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger


More information about the antlr-interest mailing list