[antlr-interest] Parser performance dropping as a function of line count

Rukmal Fernando rukmal_fernando at yahoo.com
Mon Oct 2 03:27:12 PDT 2006


Hi,

Thank you for the pointer. While I was checking what you said about syntactic predicates, I realised the gravity of my mistake: the identifier rule was using a syntactic predicate that I had somehow forgotten about.

I have removed it, and this seems to have rectified the problem. Thank you very much once again.

Out of curiosity, can anyone point out why a single misplaced syntactic predicate could have such an impact, especially one that grew with the file size even when the number of lines parsed was fixed? (In my case, the time the parser took to reach the first parser error at line 460 doubled for every 1000 lines in the source file.)

Best regards,

Rukmal.

----- Original Message ----
From: Ric Klaren <ric.klaren at gmail.com>
To: Rukmal Fernando <rukmal_fernando at yahoo.com>
Cc: antlr-interest at antlr.org
Sent: Monday, October 2, 2006 1:09:35 PM
Subject: Re: [antlr-interest] Parser performance dropping as a function of line count

Hi,

On 10/2/06, Rukmal Fernando <rukmal_fernando at yahoo.com> wrote:
> After trying a few PL/SQL grammar files which did not meet our particular needs, we decided to write a new PL/SQL parser from scratch, comprising a subset of PL/SQL features specific to our work. The (Lexer + Parser) grammar is only 670 lines with only one syntactic predicate.
>
> The problem we have is that we have PL/SQL files of around 18K lines of code, consisting of a PL/SQL package with various procedures and functions. We now have some serious performance problems with this.
>
> As an example, we have a parser error generated at line 460. When the last bit of the file is truncated to bring the file size to 1K lines, the parser takes about 15-16 seconds to reach the error. When the file is truncated to about 2K lines, it takes 29 seconds to reach the error. Likewise, when the file is truncated to 3K and 5K lines, it takes roughly 90 and 150 seconds respectively to reach the 460th line.

Did you try generating the parser with -traceParser to see exactly
what is happening? The times you list more or less hint that the
syntactic predicate is in a pretty bad place. Looking at the output of
-traceParser will tell you whether that is happening. I'm not 100% sure
whether the default trace behaviour shows whether the parser is
backtracking; I think it did, else you have to override the traceXX
methods.
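
To illustrate why a badly placed predicate could produce times that scale with file size, here is a toy model in Python. It is only a sketch under a stated assumption: that the predicate is attempted at every identifier and, in the worst case, scans ahead to the end of the input before rewinding. The names count_predicate_work and make_input, and the "NEVER" token, are hypothetical; none of this is taken from the actual grammar or from ANTLR's generated code.

```python
def count_predicate_work(tokens, stop_index):
    """Count tokens examined while parsing up to stop_index, assuming a
    speculative lookahead is attempted at every identifier."""
    examined = 0
    for i in range(stop_index):
        if tokens[i] == "ID":
            # Assumed worst case: the speculative match looks for a token
            # that never appears, so it runs to end of input, then rewinds.
            j = i
            while j < len(tokens) and tokens[j] != "NEVER":
                j += 1
            examined += j - i
        examined += 1  # normal consumption of the current token
    return examined


def make_input(n_tokens):
    # Alternate identifiers and statement terminators.
    return ["ID" if k % 2 == 0 else ";" for k in range(n_tokens)]


# Same stopping point (position 460), different total input sizes.
small = count_predicate_work(make_input(1000), 460)
large = count_predicate_work(make_input(2000), 460)
print(small, large)
```

In this model the parser always stops at the same position, yet the work more than doubles when the input doubles, because each speculative attempt pays for the rest of the file.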

A side note: did you check whether the lexer is the slowing factor?
ANTLR2's lexers are not really performance animals. You can easily
check this by making a loop that calls the lexer's nextToken() method
repeatedly.
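
A minimal sketch of such a timing loop, written in Python for brevity. FakeLexer is a hypothetical stand-in that splits on whitespace, and the EOF_TYPE value and the (type, text) token tuples are assumptions made for illustration; with a real generated lexer you would construct it over the source file and compare the token type against the runtime's own end-of-file constant.

```python
import time

EOF_TYPE = 1  # assumption: stand-in value for the end-of-file token type


class FakeLexer:
    """Hypothetical stand-in for a generated lexer."""

    def __init__(self, text):
        self._words = text.split()
        self._pos = 0

    def nextToken(self):
        # Return (type, text); signal end of input with EOF_TYPE.
        if self._pos >= len(self._words):
            return (EOF_TYPE, None)
        word = self._words[self._pos]
        self._pos += 1
        return (2, word)


def time_lexer(lexer):
    """Pull tokens until EOF and return (token count, elapsed seconds)."""
    start = time.perf_counter()
    count = 0
    while True:
        tok_type, _ = lexer.nextToken()
        if tok_type == EOF_TYPE:
            break
        count += 1
    return count, time.perf_counter() - start


count, seconds = time_lexer(FakeLexer("BEGIN NULL ; END ;"))
print(f"{count} tokens in {seconds:.6f} s")
```

If the lexer alone accounts for most of the total run time, the parser and its predicate are not the bottleneck; if it finishes quickly, the time is being spent in the parser.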

Cheers,

Ric






