score:2

Union and Except both use deferred execution. See the Remarks section of the documentation for details.

http://msdn.microsoft.com/en-us/library/bb341731

http://msdn.microsoft.com/en-us/library/bb300779

Also, in situations like this, when you are not sure what's happening under the hood, you can use Reflector or any other .NET decompiler. It's really helpful.

score:0

A LINQ query has an underlying collection, but it is not a collection itself; it is executed only when it is enumerated. You can materialize a query, for example, by calling ToList() or ToArray().

Several Enumerable extension methods first try to cast the IEnumerable<T> to an IList<T> (e.g. if they need an indexer) or to an ICollection<T> (e.g. if they need a Count). If the cast succeeds, the method uses the "native" member, which does not need to enumerate the query; otherwise it enumerates the sequence.

Search for the term "deferred" on MSDN to see whether a method is executed deferred or immediately. If you inspect the source code (e.g. via ILSpy), you can often detect deferred-executed methods by looking for the yield keyword.
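To illustrate the yield hint, here is a minimal sketch of a hypothetical iterator method (Evens and log are made-up names, not part of LINQ). It assumes C# 9 or later for top-level statements. Calling the method only creates the iterator; the body does not start until something enumerates it.

```csharp
using System;
using System.Collections.Generic;

var log = new List<string>();

// Hypothetical iterator method: because it contains "yield",
// calling it runs none of its body.
IEnumerable<int> Evens(IEnumerable<int> source)
{
    log.Add("iterator body started");
    foreach (var n in source)
    {
        if (n % 2 == 0)
        {
            yield return n;
        }
    }
}

var query = Evens(new[] { 1, 2, 3, 4 });   // no work done yet
int entriesAfterCall = log.Count;           // the body has not logged anything

var result = new List<int>(query);          // enumeration runs the body
int entriesAfterEnumeration = log.Count;

Console.WriteLine($"{entriesAfterCall} {entriesAfterEnumeration} {string.Join(",", result)}");
// prints: 0 1 2,4
```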

Union and Except are implemented using deferred execution, so you need to call ToList or ToArray if you want to persist the result.
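A small sketch of what that means in practice, assuming two in-memory lists (the variable names are made up for illustration): the queries pick up a change made to the source after they were defined, while the ToList snapshot does not.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var first = new List<int> { 1, 2, 3 };
var second = new List<int> { 3, 4 };

// Building the queries does not read either list yet.
IEnumerable<int> union = first.Union(second);
IEnumerable<int> except = first.Except(second);

// Materialize a snapshot of the union as it is right now.
List<int> unionSnapshot = union.ToList();

// Change a source after the queries were defined.
first.Add(5);

// The deferred queries see the new element; the snapshot does not.
Console.WriteLine(string.Join(",", union));          // 1,2,3,5,4
Console.WriteLine(string.Join(",", except));         // 1,2,5
Console.WriteLine(string.Join(",", unionSnapshot));  // 1,2,3,4
```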

Here's an example of the type check described above: the implementation of Enumerable.Count (decompiled):

// Fast path: a generic collection already knows its size.
ICollection<TSource> collection = source as ICollection<TSource>;
if (collection != null)
{
    return collection.Count;
}
// Same fast path for non-generic collections.
ICollection collection2 = source as ICollection;
if (collection2 != null)
{
    return collection2.Count;
}
// Fallback: enumerate the whole sequence and count the items.
int num = 0;
using (IEnumerator<TSource> enumerator = source.GetEnumerator())
{
    while (enumerator.MoveNext())
    {
        num++;
    }
}
return num;
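To see the difference between the two paths from the outside, here is a rough sketch. It assumes .NET 6 or later, where Enumerable.TryGetNonEnumeratedCount is available; Lazy and yielded are hypothetical names used only for this demo.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

int yielded = 0;

// Hypothetical lazy source: counts how many items it hands out.
IEnumerable<int> Lazy()
{
    for (int i = 0; i < 3; i++)
    {
        yielded++;
        yield return i;
    }
}

var list = new List<int> { 10, 20, 30 };

// A List<T> is an ICollection<T>, so its count is available without enumerating.
bool listIsCheap = list.TryGetNonEnumeratedCount(out int listCount);

// A yield-based iterator is not a collection, so no cheap count exists.
bool lazyIsCheap = Lazy().TryGetNonEnumeratedCount(out _);
int yieldedAfterTry = yielded;

// Count() falls back to walking the whole sequence.
int counted = Lazy().Count();

Console.WriteLine($"{listIsCheap} {listCount} {lazyIsCheap} {yieldedAfterTry} {counted} {yielded}");
// prints: True 3 False 0 3 3
```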

score:1

The answer to your question depends on two things:

  • Type of operation: lazy (deferred) or greedy. Lazy operations are not executed immediately; they are deferred until code starts materializing data from your LINQ source. Greedy operations always execute immediately.
    • Examples of lazy operations: .Union, .Except, .Where, .Select and most other LINQ operations
    • Greedy operations: .ToList, .Count, .ToArray and all operations that materialize data
  • Source of data for your LINQ operations. With LINQ to Objects (in-memory data), lazy operations are still deferred; they run in process each time the result is enumerated. With LINQ to external data sources (such as SQL), lazy operations only build up the query, which is usually sent to the source at materialization.

Using these two rules you can anticipate how LINQ will behave:

  1. .Count and .ToList execute immediately and materialize data
  2. After .ToList you have a collection in memory, and all following operations run against that collection instead of the original source (.Count will execute once again, but on the list)
  3. .Union and .Except are lazy regardless of the data source. In memory they run when the result is enumerated; against SQL they become part of the query that is executed at materialization.

Example for LINQPad. I have one enumerable and apply the lazy (deferred) .Where and .Select operations to it, before and after materializing it with the greedy .Count or .ToList:

void Main() 
{ 
    "get enumerable".Dump(); 
    var samplesEnumerable = GetSamples(); 

    "get count on enumerable #1".Dump(); 
    samplesEnumerable.Count().Dump(); 

    "get enumerable to list #1".Dump();  
    var list = samplesEnumerable.ToList();   

    "get count on list #1".Dump();   
    list.Count().Dump(); 

    "get count on list again #2".Dump(); 
    list.Count().Dump(); 

    "get where/select enumerable #1".Dump(); 
    samplesEnumerable 
        .Where(sample => { sample.Dump(); return sample.Contains("5"); }) 
        .Select(sample => { sample.Dump(); return sample; })
        .Dump(); 

    "get where/select list #1".Dump(); 
    list 
        .Where(sample => { sample.Dump(); return sample.Contains("5"); }) 
        .Select(sample => { sample.Dump(); return sample; })
        .Dump(); 
} 

string[] samples = new [] { "data1", "data2", "data3", "data4", "data5" }; 

IEnumerable<string> GetSamples() 
{ 
    foreach(var sample in samples)  
    { 
        sample.Dump(); 
        yield return sample; 
    } 
}

Sample output. Key points:

  1. On data that has not been materialized, every .Count and .ToList retrieves the data again and again

    • get count on enumerable #1
    • get where/select enumerable #1
  2. After materializing the data, the enumerable is not retrieved any more

    • get enumerable to list #1
    • get count on list #1
    • get count on list again #2
    • get where/select list #1

Output:

get enumerable
get count on enumerable #1
data1
data2
data3
data4
data5

5
get enumerable to list #1
data1
data2
data3
data4
data5
get count on list #1

5
get count on list again #2

5
get where/select enumerable #1
data1
data1
data2
data2
data3
data3
data4
data4
data5
data5
data5

data5 

get where/select list #1
data1
data2
data3
data4
data5
data5

data5
