score:0

You realize that Stopwatch doesn't take actual thread activity into account, right? It measures elapsed wall-clock time, so it's the equivalent of timing your commute to work in the morning; a lot of things can impede your progress by varying amounts from day to day (a light you catch one day stops you the next, traffic jams, etc.).

The analogy holds pretty well for the computer: the OS could have interrupted your thread to do something else, your thread could have had to wait for page file operations (expansion, swapping), and so on. Try running each algorithm 2 or 3 times and averaging the times. Also, make sure your application is running in FullTrust, which bypasses all security (but not runtime integrity) permission checks. Lastly, if you can somehow multithread this profiler, you can obtain metrics about the actual number of cycles the algorithm needed from the CPU, which are independent of thread-scheduling delays.
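As a rough sketch of the "run it a few times and average" advice: the helper below is hypothetical (the names TimingSketch and AverageMilliseconds are mine, not from the question), and the untimed warm-up call is my own addition so that JIT compilation of the first call doesn't land in the measurement. It still measures wall-clock time, so scheduling noise is only averaged out, not removed.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Hypothetical helper: runs `action` once untimed (warm-up), then `runs`
    // more times, and returns the mean elapsed wall-clock time in milliseconds.
    public static double AverageMilliseconds(Action action, int runs)
    {
        if (action == null) throw new ArgumentNullException("action");
        if (runs <= 0) throw new ArgumentOutOfRangeException("runs");

        action(); // warm-up: lets the JIT compile the code before timing starts

        double totalMs = 0;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            totalMs += sw.Elapsed.TotalMilliseconds;
        }
        return totalMs / runs;
    }
}
```

You'd call it with something like TimingSketch.AverageMilliseconds(() => RunIntTest(), 3), where RunIntTest stands in for whichever test method you're timing.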

score:0

It's the call to enumerable.Count. When I increase the size of the array by a factor of 1000 and decrease the number of iterations in the performance test by a factor of 1000, the two versions perform almost the same, and I get the following results:

int?: did 1000 iterations in 488 ms
strings: did 1000 iterations in 437 ms

With this test, the string version is actually faster.

The reason for this, it seems, is that for the struct version, the compiler can inline the call to enumerable.Count. For the class version, however, it generates the IL below. The comments in the listing explain each step.

.method public hidebysig static int32  CountNonNull<class T>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!T> enumerable) cil managed
{
    .custom instance void [System.Core]System.Runtime.CompilerServices.ExtensionAttribute::.ctor() = ( 01 00 00 00 ) 
    .maxstack  4
    .locals init ([0] int32 CS$1$0000)
    IL_0000:  nop
    IL_0001:  ldarg.0
    IL_0002:  ldnull
    // Push the lambda for ReferenceEquals onto the stack.
    IL_0003:  ldftn      bool ConsoleApplication7.Program::'<CountNonNull>b__8'<!!0>(!!0)
    // Create a new delegate for the lambda.
    IL_0009:  newobj     instance void class [mscorlib]System.Func`2<!!T,bool>::.ctor(object,
                                                                                    native int)
    // Call the Count Linq method.
    IL_000e:  call       int32 [System.Core]System.Linq.Enumerable::Count<!!0>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>,
                                                                                class [mscorlib]System.Func`2<!!0,bool>)
    IL_0013:  stloc.0
    IL_0014:  br.s       IL_0016
    IL_0016:  ldloc.0
    IL_0017:  ret
}

For the struct version, it doesn't have to do any of this and it just inlines something like this:

var enumerator = enumerable.GetEnumerator();
int result = 0;

try
{
    // MoveNext must be called before Current is read for the first time.
    while (enumerator.MoveNext())
    {
        if (enumerator.Current.HasValue)
            result++;
    }
}
finally
{
    enumerator.Dispose();
}

This is of course much faster than the IL for the class version.
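To make the difference concrete, here is a minimal sketch of the two shapes side by side. I'm assuming the class version in the question looks roughly like the Count call with a ReferenceEquals lambda that the IL shows; the names CountSketch, CountNonNullLinq and CountNonNullLoop are made up for illustration. The second method does by hand what the delegate-based one does through a Func invocation per element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CountSketch
{
    // Delegate-based version: Enumerable.Count invokes the Func once per element.
    public static int CountNonNullLinq<T>(IEnumerable<T> source) where T : class
    {
        return source.Count(x => !ReferenceEquals(x, null));
    }

    // Hand-written loop: same result, no delegate involved.
    public static int CountNonNullLoop<T>(IEnumerable<T> source) where T : class
    {
        int result = 0;
        foreach (T item in source)
        {
            if (item != null)
                result++;
        }
        return result;
    }
}
```

Both return the same count for the same input, so you can swap one for the other in the benchmark and see how much of the gap comes from the delegate call.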

