score:14

Accepted answer

I am trying to understand why this linq does not compile

The key to understanding is to read the section of the specification on how queries are lowered into normal code.

Let's start with your query:

from fundInvoices in paidfundInvoices
from p in fundInvoices.Value
group p by p.VendorId into ps
select new Payment {
  FundId = fundInvoices.Key.FundId, // ERROR here
  Value = ps.Sum(p => p.Amount)
}

OK, step one. The rule in the spec is:

A query expression with a continuation
  from … into x …
is translated into
  from x in ( from … ) …

Your query is now

from ps in (
  from fundInvoices in paidfundInvoices
  from p in fundInvoices.Value
  group p by p.VendorId)
select new Payment {
  FundId = fundInvoices.Key.FundId, // ERROR here
  Value = ps.Sum(p => p.Amount)
}

And now it should be clear why fundInvoices is not in scope in the select clause. fundInvoices is a range variable of a completely different query.
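The equivalence of the two forms can be checked on a small data set. This is only a minimal sketch, not the asker's actual code: paidfundInvoices is a hypothetical stand-in built from anonymous types, an anonymous type replaces Payment, and the select projects the group key instead of the out-of-scope fundInvoices so that both queries compile.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data: each entry has a Key (with a FundId) and a Value (its invoices).
var paidfundInvoices = new[]
{
    new { Key = new { FundId = 1 }, Value = new[] { new { VendorId = 10, Amount = 100m }, new { VendorId = 20, Amount = 50m } } },
    new { Key = new { FundId = 2 }, Value = new[] { new { VendorId = 10, Amount = 25m } } },
};

// Continuation form: "into ps" starts a new query over the groups.
var withInto =
    from fundInvoices in paidfundInvoices
    from p in fundInvoices.Value
    group p by p.VendorId into ps
    select new { VendorId = ps.Key, Value = ps.Sum(p => p.Amount) };

// Nested form produced by the continuation rule.
var nested =
    from ps in (
        from fundInvoices in paidfundInvoices
        from p in fundInvoices.Value
        group p by p.VendorId)
    select new { VendorId = ps.Key, Value = ps.Sum(p => p.Amount) };

// Anonymous types compare by value, so SequenceEqual checks the results element by element.
Console.WriteLine(withInto.SequenceEqual(nested)); // prints True
```

In both forms the only range variable visible in the final select is ps; fundInvoices and p belong to the inner query.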

But in case that is not clear, let's keep going. The next rule is:

A query expression of the form
  from x in e select v
is translated into
  ( e ) . Select ( x => v )

Your query is now

((from fundInvoices in paidfundInvoices
  from p in fundInvoices.Value
  group p by p.VendorId))
  .Select(ps => 
    new Payment {
      FundId = fundInvoices.Key.FundId,
      Value = ps.Sum(p => p.Amount)
    })

Now we can translate the inner query:

A query expression with a second from clause followed by something other than a select clause
  from x1 in e1 from x2 in e2 …
is translated into
  from * in ( e1 ) . SelectMany( x1 => e2 , ( x1 , x2 ) => new { x1 , x2 } ) …

The * is a "transparent identifier" and we'll see what it means in a minute.

Your query is now

((from * in (paidfundInvoices).SelectMany(
   fundInvoices => fundInvoices.Value, 
   (fundInvoices, p) => new {fundInvoices, p}) 
  group p by p.VendorId))
  .Select(ps => 
    new Payment {
      FundId = fundInvoices.Key.FundId,
      Value = ps.Sum(p => p.Amount)
    })
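The SelectMany call from this step can be run on its own to see what the transparent identifier stands for. The following is a rough sketch with hypothetical stand-in data (anonymous types in place of the real fund and invoice classes); the variable name pairs is also just illustrative.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data: each entry has a Key (with a FundId) and a Value (its invoices).
var paidfundInvoices = new[]
{
    new { Key = new { FundId = 1 }, Value = new[] { new { VendorId = 10, Amount = 100m }, new { VendorId = 20, Amount = 50m } } },
    new { Key = new { FundId = 2 }, Value = new[] { new { VendorId = 10, Amount = 25m } } },
};

// The result selector packs both range variables into one anonymous object.
var pairs = paidfundInvoices.SelectMany(
    fundInvoices => fundInvoices.Value,
    (fundInvoices, p) => new { fundInvoices, p }).ToList();

foreach (var pair in pairs)
{
    // Later clauses of the same query reach the original variables through the pair.
    Console.WriteLine($"fund {pair.fundInvoices.Key.FundId}, vendor {pair.p.VendorId}, amount {pair.p.Amount}");
}
```

Each element of pairs carries one fundInvoices entry together with one of its invoices, which is how a later clause in the same query can still refer to both names.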

Final rule:

A query expression of the form
  from x in e group v by k
is translated into
  ( e ) . GroupBy ( x => k , x => v )

So that's

((((paidfundInvoices).SelectMany(
   fundInvoices => fundInvoices.Value, 
   (fundInvoices, p) => new {fundInvoices, p}))
  .GroupBy(* => p.VendorId, * => p)))
  .Select(ps => 
    new Payment {
      FundId = fundInvoices.Key.FundId,
      Value = ps.Sum(p => p.Amount)
    })

The * means "bring the pair of members of the anonymous type selected in the select-many into scope". Desugar that, remove the unnecessary parens, and we have the final form of the query:

paidfundInvoices
  .SelectMany(
    fundInvoices => fundInvoices.Value, 
    (fundInvoices, p) => new {fundInvoices, p})
  .GroupBy(pair => pair.p.VendorId, pair => pair.p)
  .Select(ps => 
    new Payment {
      FundId = fundInvoices.Key.FundId,
      Value = ps.Sum(p => p.Amount)
    })

And now it should be very clear why fundInvoices is not in scope in the continuation. It would be in scope in the GroupBy thanks to transparent identifier desugaring, but it is not at all in scope in the Select.
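Since fundInvoices is still reachable inside the GroupBy, one possible way to get the fund into the Select is to make it part of the grouping key. The sketch below assumes that grouping per fund and vendor (rather than per vendor alone) is acceptable, which changes the shape of the result; the data and the anonymous result type are hypothetical stand-ins for the real classes.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data: fund 1 has two invoices for vendor 10.
var paidfundInvoices = new[]
{
    new { Key = new { FundId = 1 }, Value = new[] { new { VendorId = 10, Amount = 100m }, new { VendorId = 10, Amount = 20m }, new { VendorId = 20, Amount = 50m } } },
    new { Key = new { FundId = 2 }, Value = new[] { new { VendorId = 10, Amount = 25m } } },
};

var payments = paidfundInvoices
    .SelectMany(
        fundInvoices => fundInvoices.Value,
        (fundInvoices, p) => new { fundInvoices, p })
    // The pair still exposes fundInvoices here, so the fund can go into the key.
    .GroupBy(
        pair => new { pair.fundInvoices.Key.FundId, pair.p.VendorId },
        pair => pair.p)
    // In the Select only ps is in scope, but its Key now carries the FundId.
    .Select(ps => new { ps.Key.FundId, ps.Key.VendorId, Value = ps.Sum(p => p.Amount) })
    .ToList();

foreach (var payment in payments)
{
    Console.WriteLine($"fund {payment.FundId}, vendor {payment.VendorId}, total {payment.Value}");
}
```

With this data the result has three groups, because vendor 10 under fund 1 and vendor 10 under fund 2 are kept apart by the composite key.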

More generally: in LINQ, scopes flow from declarations on the left to usages on the right, with some exceptions: an into removes range variables from scope, not all range variables are in scope at all places in a join clause, and so on. Read the spec for more details.
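The two exceptions named above can be seen in a short sketch. The funds and invoices arrays and their property names are hypothetical, made up only for this illustration.

```csharp
using System;
using System.Linq;

// Hypothetical data for illustration only.
var funds = new[] { new { FundId = 1, Name = "General" }, new { FundId = 2, Name = "Capital" } };
var invoices = new[] { new { FundId = 1, Amount = 100m }, new { FundId = 2, Amount = 25m }, new { FundId = 1, Amount = 50m } };

var joined =
    from f in funds
    // Left of "equals" may use f but not i; right of "equals" may use i but not f.
    join i in invoices on f.FundId equals i.FundId
    select new { f.Name, i.Amount };

// Swapping the key expressions ("on i.FundId equals f.FundId") is rejected by the compiler.

var perFund =
    from row in joined
    group row by row.Name into g
    // "row" is out of scope from here on; only g is visible.
    select new { Name = g.Key, Total = g.Sum(r => r.Amount) };

foreach (var entry in perFund)
{
    Console.WriteLine($"{entry.Name}: {entry.Total}");
}
```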

score:1

Once you use group … into, you can no longer access the original range variables. If you need them, put them in the grouped element and read them back from the group's elements:

from fundInvoices in paidfundInvoices
from p in fundInvoices.Value
group new { fundInvoices, p } by p.VendorId into ps
select new Payment
{
  // ps is a group of { fundInvoices, p } pairs, so take the fund from an element
  FundId = ps.First().fundInvoices.Key.FundId,
  Value = ps.Sum(x => x.p.Amount)
}
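A runnable version of this approach is sketched below, with hypothetical stand-in data and an anonymous type in place of Payment. Because ps is a group of { fundInvoices, p } elements, the fund is read from an element of the group (here the first one) and the amounts are summed through x.p.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in data: vendor 10 has invoices under both funds.
var paidfundInvoices = new[]
{
    new { Key = new { FundId = 1 }, Value = new[] { new { VendorId = 10, Amount = 100m }, new { VendorId = 20, Amount = 50m } } },
    new { Key = new { FundId = 2 }, Value = new[] { new { VendorId = 10, Amount = 25m } } },
};

var payments =
    (from fundInvoices in paidfundInvoices
     from p in fundInvoices.Value
     group new { fundInvoices, p } by p.VendorId into ps
     select new
     {
         // Only ps is in scope here; the fund comes from the first element of the group.
         FundId = ps.First().fundInvoices.Key.FundId,
         Value = ps.Sum(x => x.p.Amount)
     }).ToList();

foreach (var payment in payments)
{
    Console.WriteLine($"fund {payment.FundId}, total {payment.Value}");
}
```

Note that a vendor group can contain invoices from several funds (vendor 10 does in this data), in which case First() reports only the fund of the first element. Whether that is acceptable depends on what FundId is supposed to mean for a per-vendor payment.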
