Accepted answer

When this type of question comes up the answer is always the same: The PLINQ overhead is higher than the gains.

This happens because the work items are extremely small (grouping by a char, or creating a new object from trivial inputs). It works much better when they are bigger.


It's really hard to tell what's going on there strictly based on the code you provided.

TPL uses thread pool threads. The thread pool starts up with about 10 running threads. If you need more threads then the thread pool will create new ones about once every second as long as a new thread is needed. If your loop resulted in more than 10 parallel operations, it would need to spend time spinning up a new thread.Correction: the number of threads a parallel loop needs takes away from available threads in the thread pool. The thread pool tries to keep a minimum number of available threads in that pool, if it notices that threads are taking too long, it will spin up new ones to compensate--which takes resources. Lots of parts of the framework use the thread pool, so there's all sorts of opportunities that could be stressing the thread pool. Starting up a thread is fairly expensive.

The other possibly is that if your number of iterations was more than the number of available CPUs, a lot of context switching resulted. Context switching is expensive and impacts the load on the CPUs as well as how fast the OS can switch between threads.

If you provide more detail, like the input data, I can provide more detail in the answer.

Related Articles