[antlr-interest] ANTLR performance
Chrobot, Stefan
Stefan.Chrobot at sabre.com
Tue May 11 07:46:30 PDT 2010
Thanks for your response, Lorenzo!
This is exactly what's happening with my code.
I dropped the rewriting and created my own mechanism. The running time
dropped from ~10 s to ~0.1 s. My solution is below.
Stefan
1) I created a custom token class:
    internal class CustomToken : CommonToken
    {
        // Replacement text; null means "use the original token text".
        private string myText;

        public CustomToken(ICharStream input, int type, int channel,
                           int start, int stop)
            : base(input, type, channel, start, stop)
        {
        }

        // Overrides the text reported for this token.
        public void ParseAs(string text)
        {
            myText = text;
        }

        public override string Text
        {
            get
            {
                return myText ?? base.Text;
            }
            set
            {
                base.Text = value;
            }
        }
    }
2) Made the lexer emit CustomTokens:
    public override IToken Emit()
    {
        var token = new CustomToken(this.input, base.state.type,
            base.state.channel, base.state.tokenStartCharIndex,
            this.CharIndex - 1);
        token.Line = base.state.tokenStartLine;
        token.Text = base.state.text;
        token.CharPositionInLine = base.state.tokenStartCharPositionInLine;
        this.Emit(token);
        return token;
    }
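3) Added a "rewrite" method to the parser:
    private void ParseAs(CustomToken start, string text)
    {
        // The first token of the range carries the replacement text.
        start.ParseAs(text);
        // LT(-1) is the last token consumed so far.
        var stop = (CustomToken)input.LT(-1);
        // Blank out every remaining token in the range.
        for (int i = start.TokenIndex + 1; i <= stop.TokenIndex; ++i)
        {
            var token = (CustomToken)input.Get(i);
            token.ParseAs("");
        }
    }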
4) Set the grammar option:
    TokenLabelType = CustomToken;
Usage:
    assignment
        : ID '=' INT { ParseAs($assignment.start, "<assignment>"); }
        ;
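
The steps above do not show how the rewritten text is read back out. A minimal sketch of one way it could work, assuming the output is produced by concatenating the Text of every token in a single pass. It is self-contained and does not use the ANTLR runtime: SketchToken, RewriteSketch, Render and Demo are hypothetical names, and the token list is built by hand rather than by a lexer.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for CustomToken; no ANTLR dependency.
public sealed class SketchToken
{
    private readonly string originalText;
    private string overrideText;   // null until ParseAs is called

    public SketchToken(string text) { originalText = text; }

    public void ParseAs(string text) { overrideText = text; }

    // Same fallback as CustomToken.Text: the override wins when present.
    public string Text { get { return overrideText ?? originalText; } }
}

public static class RewriteSketch
{
    // Mirrors the parser's ParseAs helper: the first token of the range
    // carries the replacement and every later token in the range is blanked.
    public static void ParseAs(IList<SketchToken> tokens, int start, int stop, string text)
    {
        tokens[start].ParseAs(text);
        for (int i = start + 1; i <= stop; ++i)
        {
            tokens[i].ParseAs("");
        }
    }

    // Assumed output step: one linear pass that concatenates token texts.
    public static string Render(IEnumerable<SketchToken> tokens)
    {
        var sb = new StringBuilder();
        foreach (var token in tokens)
        {
            sb.Append(token.Text);
        }
        return sb.ToString();
    }

    // Hand-built tokens for "x = 1;" with the range "x = 1" rewritten.
    public static string Demo()
    {
        var tokens = new List<SketchToken>
        {
            new SketchToken("x"), new SketchToken(" "), new SketchToken("="),
            new SketchToken(" "), new SketchToken("1"), new SketchToken(";")
        };
        ParseAs(tokens, 0, 4, "<assignment>");
        return Render(tokens);
    }
}
```

Each rewrite touches only the tokens in its own range and the final pass visits each token once, so no list of pending operations has to be searched.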
-----Original Message-----
From: Lorenzo de Lara [mailto:ldelara at affsys.com]
Sent: Tuesday, May 11, 2010 4:35 PM
To: Chrobot, Stefan
Cc: antlr-interest at antlr.org
Subject: Re: [antlr-interest] ANTLR performance
I have noticed the same thing with rewrite=true and came upon this bug
report from 2008, which is currently still open:
http://www.antlr.org/jira/browse/ANTLR-371
The problem is that parsers with rewrite rules run in non-linear time on
any input with more than a few hundred rewrites. I've verified this in
both Java and C#. You can verify this for yourself by commenting out your
rewrite rules and running the parser; you should observe much closer to
linear runtime (5 minutes with rewrite rules on vs. 5 seconds with rewrite
rules off on a typical 1500-line input for us). The offending method is
GetKindOfOps in TokenRewriteStream, which takes up to 100% of the runtime
according to a Java profiling tool.
I've implemented the proposed fix (in Java), which does away with calling
GetKindOfOps completely, and can confirm it does result in much more
reasonable, linear-like performance, without introducing any new
problems, as far as I can tell.
-Lorenzo
On 2010-05-11, at 5:17 , Chrobot, Stefan wrote:
Hi,
I'm using ANTLR with the C# target. The generated parser is too
slow for my needs. My grammar uses k = 6.
Does this have a performance impact? What value should I target to get
optimum performance: 1 or *? Would changing the grammar to k = 1 or k = *
give a significant performance boost?
Stefan