score:0

Accepted answer

It turns out in the example I was looking at the Where method wasn't being run on the database itself, but a list already pulled from the database.

ie. this doesn't work:

var airplanes = _dataContext.Airplanes.Where(p => Test(p.Status.ToString(), "a");

but this does:

var airplanes = _dataContext.Airplanes.ToList().Where(p => Test(p.Status.ToString(), "a");

Although in my case that sort of defeats the purpose of doing LINQ instead of iteration.

score:-1

LINQ to Entities has to know how to translate your code into SQL query it can run against database.

When you use (I removed ToString call from the code you posted, because ToString doesn't work with LINQ to Entities):

var airplanes = _dataContext.Airplanes.Where(p => p.Status.Contains("a"));

compiler translates your lambda into Expression Tree that LINQ provider can traverse and generate proper SQL:

You can have the C# or Visual Basic compiler create an expression tree for you based on an anonymous lambda expression, or you can create expression trees manually by using the System.Linq.Expressions namespace.

quote from Expression Trees (C# and Visual Basic)

It knows how to translate call to IEnumerable.Contains because it's one of the methods that was added to it as 'known' stuff - it knows it has to generate IN statement the same way it knows that != should be translated into <> in SQL.

When you use

var airplanes = _dataContext.Airplanes.Where(p => Test(p.Active, "a");

all the Expression Tree has is Test methods call, which LINQ provider knows nothing about. It also can't inspect it's content to find that it's actually just wrapping around Contains call, because the methods get's compiled to IL, not to Expression Tree.

Appendix

As of why Where(p => Test("abc", "a")) word and Where(p => Test(s, "a")) doesn't I'm not 100% sure, but my guess is LINQ provider is smart enough to see that you're calling it with two constant values, so it just tries to execute it and see if it can get a value back, that could be treated as constant in the SQL query.

score:1

In the first two cases, the call to Test doesn't relate to the parameter in the lambda, so they both reduce to p => true.

In the third this similarly happens, though it'll reduce to p => true sometimes, and p => false sometimes, but either way when it comes to creating the expression, the result of calling Test is found and then feed into the expression as a constant.

In the fourth the expression includes sub-expressions accessing a property of an entity and calling Contains, both of which EF understands and can turn into SQL.

In the fifth the expression contains sub-expressions accessing a property and calling Test. EF doesn't understand how to translate the call to Test so you need to either associate it with a SQL function, or rewrite Test so it creates an expression rather than calculating the result directly.

More on expressions as promised:

Let's start with two things that you may already know all about, but if you don't then the rest will be harder to understand.

The first is what p => p.Status.Contains("a") actually means.

Which is, on its own, absolutely nothing. Unlike most expressions in C# lambdas can't have a type without a context.

1 + 3 has a type, which is int and hence in var x = 1 + 3 the compiler gives x the type int. Even long x = 1 + 3 starts with the int expression 1 + 3 and then casts it to long after that.

p => p.Status.Contains("a") doesn't have a type. Even (Airplane p) => p.Status.Contains("a") doesn't have a type, and so var λ = (Airplane p) => p.Status.Contains("a"); isn't allowed.

Instead the type of a lambda expression will be either a delegate type or an Expression strongly typed to a delegate. So all of these are allowed (and mean something):

Func<Airplane, bool> funcλ = p => p.Status.Contains("a");
Expression<Func<Airplane, bool>> expFuncλ = p => p.Status.Contains("a");
delegate bool AirplanePredicate(Airplane plane);
AirplanePredicate delλ = p => p.Status.Contains("a");
Expression<AirplanePredicate> expDelλ = p => p.Status.Contains("a");

Okay. Maybe you knew that, if not you do now.

The second thing is what Where actually does in Linq.

The Queryable form of Where (we'll ignore the Enumerable form for now, and come back to it) is defined thus:

public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource> source, Expression<Func<TSource, bool>> predicate)

IQueryable<T> represents something that can obtain 0 or more items. It can do four things, through four methods:

  1. Give you an enumerator to enumerate through those items (inherited from IEnumerable<T>).
  2. Tell you what type of item it has (which will be typeof(T) but it's inherited from IQueryable where its not as obvious).
  3. Tell you what its query provider is.
  4. Tell you what its expression is.

Now, those last two are the important bits here.

If you start with new List<Airplane>().AsQueryable then the query provider will be a EnumerableQuery<Airplane> which is a class that handles dealing with querys about in-memory enumerations of Airplane, and its expression will represent returning that list.

If you start with _dataContext.Airplanes the provider will be a System.Data.Entity.Internal.Linq.DbQueryProvider which is a class that handles dealing with EF queries about databases, and its expression will represent running SELECT * FROM Airplanes on a database and then creating an object for each row returned.

Now, the job of Where is to get the provider to create a new IQueryable that represents filtering the results of the expression we start with according to the Expression<Func<Airplane, bool>> passed to it.

The fun bit is that this is wonderfully self-referencial: The expression returned by Where when you call it with arguments of IQueryable<Airplane> and Expression<Func<Airplane, bool>> in fact represents calling Where with arguments of IQueryable<Airplane> and Expression<Func<Airplane, bool>>! It's like calling Where results in Where saying "hey, you should call Where here".

So, what happens next?

Well, sooner or later we do some operation that results in an IQueryable not being used to return another IQueryable but some other object representing the results of a query. For the sake of simplicity let's say we just start enumerating through the results of our single Where.

If it were Linq-to-objects then what we'd have is a queryable with an expression that means:

Take the Expression<Func<Airplane, bool>> and compile it so you have a Func<Airplane, bool> delegate. Cycle through every element in the list, calling that delegate with it. If the delegate returns true then yield that element, otherwise don't.

(This incidentally is what the Enumerable version of Where does directly with a Func<Airplane, bool> instead of an Expression<Func<Airplane, bool>>. Remember when I said that the result of Where was an expression saying "hey, you should call Where here"? That's pretty much what it does, but because the provider now picks the Enumerable form of Where and uses the Func<Airplane, bool> rather than the Expression<Func<Airplane, bool>>, we get the results we want. This also means that as long as the operations provided on IQueryable<T> have an equivalent provided on IEnumerable<T> linq-to-objects can cater for everything linq generally caters for).

But this is not linq-to-objects, it's EF so what we have is an expression that means:

Take the Expression<Func<Airplane, bool>> and turn it into a SQL boolean expression such as can be used in a SQL WHERE clause. Then add that as a WHERE clause to the earlier expression (which translates to SELECT * FROM Airplanes).

The tricky bit here is "and turn it into a SQL boolean expression".

When your lambda was p => p.Status.Contains("a") then the SQL could be produced of (depending on SQL version) CONTAINS (Status, 'a') or Status LIKE '%a%' or something else for a different type of database. The end result is hence SELECT * FROM Airplanes WHERE Status LIKE '%a%' or so on. EF knows how to break down that expression into component expressions, and how to turn .Status into a column access and how to turn string's Contains(string value) into SQL where clauses.

When your lambda was p => Test(p.Status.ToString(), "a") the result was a NotSupportedException because EF does not know how to turn your Test method into SQL.

Okay. That's the meat, lets get to the pudding.

Can you elaborate on what you mean by "rewrite Test so it creates an expression rather than calculating the result directly".

A problem here is that I don't know quite what your end goal was, as in just where you wanted to be flexible. So I'm going to do something that does the equivalent of .Where(p => Test(p.Status.ToString(), someStringParameter)) in three ways; an easy way, a pretty easy way, and a hard way, where they can be made more flexible in a variety of ways.

First the easiest way:

public static class AirplaneQueryExtensions
{
  public static IQueryable<Airplane> FilterByStatus(this IQueryable<Airplane> source, string statusMatch)
  {
    return source.Where(p => p.Status.Contains(statusMatch));
  }
}

Here you can use _dataContext.Airplanes.FilterByStatus("a") and it is as if you'd used your working Where(). Because that's precisely what it is doing. We haven't really done a lot here, though there certainly is scope for DRY on more complicated Where() calls.

Roughly equally easy:

public static Expression<Func<Airplane, bool>> StatusFilter(string sought)
{
  return p => p.Status.Contains(sought);
}

Here you can use _dataContext.Airplanes.Where(StatusFilter("a")) and againt it's pretty much the same as if you'd used your working Where(). Again, we haven't done a lot here, but there's scope for DRY if the filter were more complicated.

Now for the fun version:

public static Expression<Func<Airplane, bool>> StatusFilter(string sought)
{
  var param = Expression.Parameter(typeof(Airplane), "p");                      // p
  var property = typeof(Airplane).GetProperty("Status");                        // .Status
  var propExp = Expression.Property(param, property);                           // p.Status
  var soughtExp = Expression.Constant(sought);                                  // sought
  var contains = typeof(string).GetMethod("Contains", new[]{ typeof(string) }); // .Contains(string)
  var callExp = Expression.Call(propExp, contains, soughtExp);                  // p.Status.Contains(sought)
  var lambda = Expression.Lambda<Func<Airplane, bool>>(callExp, param);         // p => p.Status.Contains(sought);
  return lambda;
}

This does pretty much exactly the same as the previous version of StatusFilter does behind the scenes, except that identifies types, methods and properties using .NET metadata tokens and we use typeof() and names.

As the comments in each line show, the first line obtains an expression representing a property. We don't really need to give it a name since we won't be using it directly in source-code, but we call it "p" anyway.

The next line gets the PropertyInfo for Status and the subsequent creates an expression representing getting that for p, hence p.Status.

The next line creates an expression representing the constant value of sought. While sought isn't a constant generally, it is in terms of the overall expression we are creating (this is why EF was able to treat Test("abc", "a") as a constant true rather than having to translate it).

The next line gets the MethodInfo for Contains and in the next line we create an expression representing calling that on the result of p.Status with sought as an argument.

And finally we create an expression that ties them all together into the equivalent of p => p.Status.Contains(sought), and return it.

This is clearly a lot more work than just doing p => p.Status.Contains(sought). And well, that's sort of the point of having lambdas for expressions in C#, so that we normally don't have to do that work.

Indeed, to have a true expression-based equivalent of your Test we find ourselves doing:

public static MethodCallExpression Test(Expression a, string b)
{
  return Expression.Call(a, typeof(string).GetMethod("Contains", new[]{ typeof(string) }), Expression.Constant(b));
}

But then to use it we need to do more expression based work, because we can't just p => Test(p.Status, "a") because p.Status isn't an expression in that context. We have to do:

public static Expression<Func<Airplane, bool>> UseTest(string b)
{
  var param = Expression.Parameter(typeof(Airplane));
  return Expression.Lambda<Func<Airplane, bool>>(Test(Expression.Property(param, typeof(Airplane).GetProperty("Status")), b), param);
}

And now finally we can use _dataContext.Airplanes.UseTest("a"). Phew!

The expression-based approach has two advantages though.

  1. We can use it if we want to do some manipulation with the expressions beyond want lambdas allow direction, such as in https://stackoverflow.com/a/30795217/400547 where they're use alongside reflection to be able to specify the property to access as a string.
  2. You hopefully know enough about how linq works behind the scenes to know all you need to know to understand fully why some of the queries in your question were working and some where not.

Related Query

More Query from same tag