score:1

Accepted answer
rows
    .groupby(
        r => new { r.date.year, r.date.month, r.date.day, r.type },
        (r, rr) => new { r.year, r.month, r.day, r.type, amount = rr.sum(rrr => rrr.amount) })
    .groupby(
        r => new { r.year, r.month, r.day },
        (r, rr) => new { r.year, r.month, r.day, amount = rr.orderbydescending(rrr => rrr.type).select(rrr => rrr.amount).first() })
    .groupby(
        r => new { r.year, r.month },
        (r, rr) => new { r.year, r.month, amount = rr.sum(rrr => rrr.amount) })

ratio behind this is very simple: the requirement "if there's at least one type 2 record, sum only type 2 records" can be implemented by simply grouping records by type (within days, of course). why does it work? because we split all records into two groups, type 2 (which should be used if there is at least one type 2 record), and type 1 (which in fact means "all records when no type 2 present"). second part (choosing sum) is even simpler: we just order groups (within day) by descending type (i.e. sum for type 2, sum for type 1) and take first one, which gives us type 2 if present and type 1 otherwise.

frankly, it's a kind of "smart code" everybody hates, because nobody could understand from a glance how it works.

score:0

how about this?

from r in rows
group r by new { r.date.year, r.date.month }
into g
let type2days = g.where( a => a.type == 2 ).select( a => a.date.day ).distinct()
let filtered = g.where( a => a.type == 2 || type2days.contains(a.date.day) == false )
select
    new
        {
            date = new datetime(g.key.year, g.key.month, 1),
            hours = filtered.sum(a => a.amount)
        };

score:1

how would this look to you?

var query =
    from r in rows.toarray()
    group r by new { r.date.year, r.date.month } into g
    let lookup = g.tolookup(x => x.type, x => x.amount)
    let hours = lookup[2].any() ? lookup[2].sum() : lookup[1].sum()
    select new
    {
        date = new datetime(g.key.year, g.key.month, 1),
        hours,
    };

note the .toarray() as you need to bring your data into memory to make this work.

i assumed that type was an integer with values of either 1 or 2.


on further thought i don't think it is possible to do this kind of grouping in a single query without bringing it into memory.

so the best alternative is to minimize the memory. if this doesn't work then you'll need to break it down into several queries.

var query1 =
    from r in rows
    group r.amount by new
    {
        r.date.year,
        r.date.month,
        r.type,
    } into g
    select new
    {
        g.key.year,
        g.key.month,
        g.key.type,
        amount = g.sum(),
    };

var query2 =
    from r in query1.toarray()
    group r by new
    {
        r.year,
        r.month,
    } into g
    let lookup = g.tolookup(x => x.type, x => x.amount)
    let hours = lookup[2].any() ? lookup[2].sum() : lookup[1].sum()
    select new
    {
        date = new datetime(g.key.year, g.key.month, 1),
        hours,
    };

is this a option a possibility?

score:1

okay, i think i've got this figured out. i've tested it with some data, and that works.

testdata and sql testquery:

declare @table table (
    datum datetime,
    amount int,
    [type] int
)

insert into @table (datum, amount, [type]) values
('2012-01-01',200,1),
('2012-01-01',100,2),
('2012-01-02',500,1),
('2012-03-01',200,1),
('2012-03-01',100,1),
('2012-03-02',500,2)

select month(datum), year(datum), count(*), sum(amount)
from @table t
inner join (
    select day (datum) as _day, month(datum) as _month, year(datum) _year,
    max([type]) as _type
    from @table
    group by day (datum), month(datum), year(datum)
) x
on _month = month (t.datum)
and _year = year(t.datum)
and _day = day(t.datum)
and _type = t.[type]
group by month(datum), year(datum)

result:

(no column name)    (no column name)    (no column name)    (no column name)
1   2012    2   600
3   2012    3   800

the translated linq query, tested with l2s and a 'real' test-table with test-data.

using (dataclasses1datacontext ctx = new dataclasses1datacontext()) {
    var rows = ctx.tests;

    var query = rows
        .join(
            rows.groupby(rr =>
                new { rr.datum.day, rr.datum.month, rr.datum.year },
                (key, data) => new { year = key.year, month = key.month, day = key.day, maxtype = data.select(xxx => xxx.type).max() }
            ),
            rr => new { day = rr.datum.day, month = rr.datum.month, year = rr.datum.year, type = rr.type },
            rr => new { day = rr.day, month = rr.month, year = rr.year, type = rr.maxtype },
            (r, r1) => r

        )
        .groupby(r =>
            new { year = r.datum.year, month = r.datum.month },
            (key, data) => new { year = key.year, month = key.month, amount = data.select(xx => xx.amount).sum() }
        )
        .tolist();
}

this returns the same results.

and, for the fun of it, the sql-query that l2s generated from the linq query.

select [t7].[value] as [year], [t7].[value2] as [month], (
    select sum([t8].[amount])
    from [dbo].[test] as [t8]
    inner join (
        select [t11].[value3], [t11].[value2], [t11].[value], (
            select max([t12].[type])
            from [dbo].[test] as [t12]
            where ((([t11].[value] is null) and (datepart(day, [t12].[datum]) is null)) or (([t11].[value] is not null) and (datepart(day, [t12].[datum]) is not null) and ((([t11].[value] is null) and (datepart(day, [t12].[datum]) is null)) or (([t11].[value] is not null) and (datepart(day, [t12].[datum]) is not null) and ([t11].[value] = datepart(day, [t12].[datum])))))) and ((([t11].[value2] is null) and (datepart(month, [t12].[datum]) is null)) or (([t11].[value2] is not null) and (datepart(month, [t12].[datum]) is not null) and ((([t11].[value2] is null) and (datepart(month, [t12].[datum]) is null)) or (([t11].[value2] is not null) and (datepart(month, [t12].[datum]) is not null) and ([t11].[value2] = datepart(month, [t12].[datum])))))) and ((([t11].[value3] is null) and (datepart(year, [t12].[datum]) is null)) or (([t11].[value3] is not null) and (datepart(year, [t12].[datum]) is not null) and ((([t11].[value3] is null) and (datepart(year, [t12].[datum]) is null)) or (([t11].[value3] is not null) and (datepart(year, [t12].[datum]) is not null) and ([t11].[value3] = datepart(year, [t12].[datum]))))))
            ) as [value4]
        from (
            select [t10].[value], [t10].[value2], [t10].[value3]
            from (
                select datepart(day, [t9].[datum]) as [value], datepart(month, [t9].[datum]) as [value2], datepart(year, [t9].[datum]) as [value3]
                from [dbo].[test] as [t9]
                ) as [t10]
            group by [t10].[value], [t10].[value2], [t10].[value3]
            ) as [t11]
        ) as [t13] on (datepart(day, [t8].[datum]) = [t13].[value]) and (datepart(month, [t8].[datum]) = [t13].[value2]) and (datepart(year, [t8].[datum]) = [t13].[value3]) and ([t8].[type] = [t13].[value4])
    where ((([t7].[value] is null) and (datepart(year, [t8].[datum]) is null)) or (([t7].[value] is not null) and (datepart(year, [t8].[datum]) is not null) and ((([t7].[value] is null) and (datepart(year, [t8].[datum]) is null)) or (([t7].[value] is not null) and (datepart(year, [t8].[datum]) is not null) and ([t7].[value] = datepart(year, [t8].[datum])))))) and ((([t7].[value2] is null) and (datepart(month, [t8].[datum]) is null)) or (([t7].[value2] is not null) and (datepart(month, [t8].[datum]) is not null) and ((([t7].[value2] is null) and (datepart(month, [t8].[datum]) is null)) or (([t7].[value2] is not null) and (datepart(month, [t8].[datum]) is not null) and ([t7].[value2] = datepart(month, [t8].[datum]))))))
    ) as [amount]
from (
    select [t6].[value], [t6].[value2]
    from (
        select datepart(year, [t0].[datum]) as [value], datepart(month, [t0].[datum]) as [value2]
        from [dbo].[test] as [t0]
        inner join (
            select [t3].[value3], [t3].[value2], [t3].[value], (
                select max([t4].[type])
                from [dbo].[test] as [t4]
                where ((([t3].[value] is null) and (datepart(day, [t4].[datum]) is null)) or (([t3].[value] is not null) and (datepart(day, [t4].[datum]) is not null) and ((([t3].[value] is null) and (datepart(day, [t4].[datum]) is null)) or (([t3].[value] is not null) and (datepart(day, [t4].[datum]) is not null) and ([t3].[value] = datepart(day, [t4].[datum])))))) and ((([t3].[value2] is null) and (datepart(month, [t4].[datum]) is null)) or (([t3].[value2] is not null) and (datepart(month, [t4].[datum]) is not null) and ((([t3].[value2] is null) and (datepart(month, [t4].[datum]) is null)) or (([t3].[value2] is not null) and (datepart(month, [t4].[datum]) is not null) and ([t3].[value2] = datepart(month, [t4].[datum])))))) and ((([t3].[value3] is null) and (datepart(year, [t4].[datum]) is null)) or (([t3].[value3] is not null) and (datepart(year, [t4].[datum]) is not null) and ((([t3].[value3] is null) and (datepart(year, [t4].[datum]) is null)) or (([t3].[value3] is not null) and (datepart(year, [t4].[datum]) is not null) and ([t3].[value3] = datepart(year, [t4].[datum]))))))
                ) as [value4]
            from (
                select [t2].[value], [t2].[value2], [t2].[value3]
                from (
                    select datepart(day, [t1].[datum]) as [value], datepart(month, [t1].[datum]) as [value2], datepart(year, [t1].[datum]) as [value3]
                    from [dbo].[test] as [t1]
                    ) as [t2]
                group by [t2].[value], [t2].[value2], [t2].[value3]
                ) as [t3]
            ) as [t5] on (datepart(day, [t0].[datum]) = [t5].[value]) and (datepart(month, [t0].[datum]) = [t5].[value2]) and (datepart(year, [t0].[datum]) = [t5].[value3]) and ([t0].[type] = [t5].[value4])
        ) as [t6]
    group by [t6].[value], [t6].[value2]
    ) as [t7]

i have no idea how efficient this sql is, you're going to have to try it.


Related Query

More Query from same tag