[antlr-interest] ANTLR build process performance improvement

Thu Aug 11 12:59:28 PDT 2011

Hey Sam, thanks! I'm using it now and it's faster. Love the VS tool-chain!
I'd be dead in the water without it. And haven't bumped into any bugs with
the new version. So cool!

I've bummed into a slightly different problem and no matter how I arrange my
grammar (included) I can't seem to work around it. I think it might be a bug
in the SpecialStateTransition logic...

Below is what I'm trying to parse along with the trace using your enter-exit
partial methods (wonderful addition BTW) for my C# preprocessor interleaved
with when the tokens get pulled from the stream. Also interleaved are my
calls to toggle SkipSection which is trying to keep track of when code is
#ifdefed out. So what we see is that [#if] is pulled and then [false] and
[\r\n]. At that point in the parse I'm at the end of the pre-processor line
and so I look up the pp_conditional stack to see if I'm in a #ifdefed out
section of code. In this case I am so I set SkipSection to True. That
enables (via semantic predicate) my lexer rule
PP_SKIPPED_CHARACTERS=45 which should suck up any code that is not a pragma
statement (doesn't start with #). That's all well and good and so the next
thing that get tokenized is [#pragma warning disable] which is good.

Now at this point I expect that no tokens should get pulled until I reach
the pp_pragma production. I expect this because I figure ANLTER should be
able to predict where it needs to go without pulling any more tokens --
after all the only thing that can follow a [#pragma warning disable] token
is a list of integers. The actual behavior is that in my
pp_condition_section production ANTLR pulls [10] as a
PP_SKIPPED_CHARACTERS instead of an INTEGER because SkipSection is set to
True. If things had gone as expected and [10] had been pulled in the
pp_pragma production. If that had happened then SkipSection would have been
set to False and [10] would be pulled as an INTEGER.

The code that's actually pulling the [10] is DFA.Predict in the
SpecialStateTransition loop when trying to predict where to go for the
pp_conditional_section production:

pp_conditional_section
: { !SkipSection }? => input_section
| { SkipSection }? => pp_skipped_section
;

I'm guessing that this production is "special" because it's got those gated
semantic predicates and that's why DFA.Predict enters into the
SpecialStateTransition logic. What I don't understand is why it would need
to pull any more tokens to know where to go next. Do you think that's a bug
that it's pulling tokens in this case?

After writing this e-mail it occurred to me that I might manually try to do
the prediction. I did this by putting a break point in the
pp_conditional_section rule at the dfa.Predict line. But instead of asking
the DFA to do the prediction I just set-ip to the case I wanted (e.g. {...}?
=> pp_skipped_section). Then hit F5. And it works! I included the trace of
that run below. Given that I really do think that the SpecialStateTransition
logic (or there abouts) is being to aggressive about pulling tokens... what
do you think?

Thanks,
Chris

CSharpAst.Parse("#if false\r\n#pragma warning disable
10\r\n/*foo*/\r\n#endif");

Enter start 1
[@-1,0:2='#if',<38>,1:0]
 Enter input_section 2
  Enter input_section_part 3
   Enter pp_directive 6
    Enter pp_conditional 8
     Enter pp_if_section 9
[@-1,4:8='false',<4>,1:4]
      Enter pp_expression 17
       Enter pp_or_expression 18
        Enter pp_and_expression 19
         Enter pp_equality_expression 20
          Enter pp_unary_expression 21
           Enter pp_primary_expression 22
[@-1,9:10='\\r\\n',<29>,1:9]
           Leave pp_primary_expression 22
          Leave pp_unary_expression 21
         Leave pp_equality_expression 20
        Leave pp_and_expression 19
       Leave pp_or_expression 18
      Leave pp_expression 17
      Enter pp_conditional_block 12
       Enter pp_new_line 31
SkipSection = True
[@-1,11:33='#pragma warning disable',<42>,2:0]
       Leave pp_new_line 31
       Enter pp_conditional_section 13
[@-1,34:36=' 10',<45>,2:23]
[@-1,37:38='\\r\\n',<29>,2:26]
        Enter pp_skipped_section 14
         Enter pp_skipped_section_part 15
          Enter pp_directive 6
           Enter pp_leaf_directive 7
            Enter pp_pragma 29
SkipSection = False
             Enter pp_warning_list 30

 Here is the trace when I make the prediction by hand:

Enter start 1
[@-1,0:2='#if',<38>,1:0]
 Enter input_section 2
  Enter input_section_part 3
   Enter pp_directive 6
    Enter pp_conditional 8
     Enter pp_if_section 9
[@-1,4:8='false',<4>,1:4]
      Enter pp_expression 17
       Enter pp_or_expression 18
        Enter pp_and_expression 19
         Enter pp_equality_expression 20
          Enter pp_unary_expression 21
           Enter pp_primary_expression 22
[@-1,9:10='\\r\\n',<29>,1:9]
           Leave pp_primary_expression 22
          Leave pp_unary_expression 21
         Leave pp_equality_expression 20
        Leave pp_and_expression 19
       Leave pp_or_expression 18
      Leave pp_expression 17
      Enter pp_conditional_block 12
       Enter pp_new_line 31
SkipSection = True
[@-1,11:33='#pragma warning disable',<42>,2:0]
       Leave pp_new_line 31
       Enter pp_conditional_section 13
        Enter pp_skipped_section 14
         Enter pp_skipped_section_part 15
          Enter pp_directive 6
           Enter pp_leaf_directive 7
            Enter pp_pragma 29
SkipSection = False
[@-1,35:36='10',<28>,2:24]
             Enter pp_warning_list 30
[@-1,37:38='\\r\\n',<29>,2:26]
             Leave pp_warning_list 30
             Enter pp_new_line 31
SkipSection = True
[@-1,39:45='/*foo*/',<45>,3:0]
             Leave pp_new_line 31
            Leave pp_pragma 29
           Leave pp_leaf_directive 7
          Leave pp_directive 6
         Leave pp_skipped_section_part 15
         Enter pp_skipped_section_part 15
[@-1,46:47='\\r\\n',<29>,3:7]
[@-1,48:53='#endif',<35>,4:0]
         Leave pp_skipped_section_part 15
        Leave pp_skipped_section 14
       Leave pp_conditional_section 13
      Leave pp_conditional_block 12
     Leave pp_if_section 9
     Enter pp_endif 16
      Enter pp_new_line 31
SkipSection = True
      Leave pp_new_line 31
     Leave pp_endif 16
    Leave pp_conditional 8
   Leave pp_directive 6
  Leave input_section_part 3
 Leave input_section 2
Leave start 1

On Thu, Aug 11, 2011 at 9:31 AM, Sam Harwell <sharwell at pixelminegames.com>wrote:

> Hi “brave testers” :)****
>
> ** **
>
> I updated the MSBuild integration for the CSharp3 target to significantly
> improve its performance in several areas. I haven’t tested the update to see
> if it fixes the issues with ReSharper’s IntelliSense engine, but it sure
> would be sweet if it did!****
>
> ** **
>
> **1.       **Time to compile grammars should be reduced by 1-2 seconds per
> project containing grammars.****
>
> **2.       **The “lag” in the IDE when you change windows away from a
> modified grammar file and when you save a grammar file should be reduced by
> 1-2 seconds each time.****
>
> **3.       **When you open a project IntelliSense will be ready
> immediately as opposed to waiting until you save a grammar or build the
> project.****
>
> **4.       **When you add or remove a file from the project, IntelliSense
> won’t break.****
>
> ** **
>
> If you’d like to test out the new tool, it’s available in the following 7z
> file. Simply close Visual Studio and replace your existing Antlr3.targets
> and AntlrBuildTask.dll with the ones from this archive and you’re ready to
> go.****
>
> ** **
>
>
> http://www.tunnelvisionlabs.com/downloads/antlr/AntlrBuildTask-experimental-9029.7z
> ****
>
> ** **
>
> Thanks,****
>
> Sam****
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: CSharp.g
Type: application/octet-stream
Size: 8500 bytes
Desc: not available
Url : http://www.antlr.org/pipermail/antlr-interest/attachments/20110811/a6956cdb/attachment.obj