C# Parallel and Concurrent Programming

Async vs Parallel

Async and parallel code solve different problems, so the first step is deciding whether the work is I/O-bound or CPU-bound.

| Scenario | Use | Reason |
|---|---|---|
| HTTP calls, DB queries | async/await | I/O-bound, threads wait |
| Image processing | Parallel | CPU-bound, needs compute |
| File compression | Parallel | CPU-bound |
| Reading many files | async/await | I/O-bound |
| Complex calculations | Parallel | CPU-bound |

// I/O-bound - use async
public async Task<IEnumerable<string>> DownloadAllAsync(IEnumerable<string> urls)
{
    var tasks = urls.Select(url => httpClient.GetStringAsync(url));
    return await Task.WhenAll(tasks);
}

// CPU-bound - use parallel
public IEnumerable<ProcessedImage> ProcessImages(IEnumerable<Image> images)
{
    return images.AsParallel()
        .Select(img => ProcessImage(img))
        .ToList();
}

Parallel Class

Execute loops in parallel using multiple threads.

Parallel.For

// Process indices in parallel
Parallel.For(0, 100, i =>
{
    ProcessItem(i);
});

// With parallelism control
Parallel.For(0, 100,
    new ParallelOptions { MaxDegreeOfParallelism = 4 },
    i => ProcessItem(i));

// With local state for thread-safe accumulation
long total = 0;
Parallel.For(0, 1000,
    () => 0L, // Initialize local state once per worker task
    (i, state, localSum) => localSum + ComputeValue(i), // Process
    localSum => Interlocked.Add(ref total, localSum)); // Combine

Parallel.ForEach

var items = GetItems();

Parallel.ForEach(items, item =>
{
    ProcessItem(item);
});

// With options
var options = new ParallelOptions
{
    MaxDegreeOfParallelism = Environment.ProcessorCount,
    CancellationToken = cancellationToken
};

Parallel.ForEach(items, options, item =>
{
    options.CancellationToken.ThrowIfCancellationRequested();
    ProcessItem(item);
});

// Custom partitioner: NoBuffering hands out one item at a time, which lowers
// latency for slow or streaming sources at the cost of more synchronization
var partitioner = Partitioner.Create(items, EnumerablePartitionerOptions.NoBuffering);
Parallel.ForEach(partitioner, item => ProcessItem(item));

Breaking and Stopping

Parallel.For(0, 1000, (i, state) =>
{
    if (FoundTarget(i))
    {
        // Stop - don't start new iterations, but finish running ones
        state.Stop();
        return;
    }

    // Break - stop at this index, complete lower indices
    if (ShouldBreak(i))
    {
        state.Break();
    }

    ProcessItem(i);
});
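
The difference between the two shows up in the loop result. The sketch below is illustrative only: `BreakDemo` and `FindFirstMultiple` are hypothetical names, not part of the code above. It relies on the guarantee that after `Break()` every iteration below the break index still runs, so the lowest matching index is the one reported.

```csharp
using System;
using System.Threading.Tasks;

public static class BreakDemo
{
    // Returns the lowest index at which Break() was called,
    // or null if no iteration called Break().
    public static long? FindFirstMultiple(int count, int divisor)
    {
        ParallelLoopResult result = Parallel.For(1, count, (i, state) =>
        {
            if (i % divisor == 0)
            {
                state.Break(); // iterations below i are still guaranteed to run
            }
        });

        return result.LowestBreakIteration;
    }
}
```

If `Stop()` were used instead of `Break()`, `LowestBreakIteration` would be null and lower indices might never run, so `Stop()` suits "any match will do" searches while `Break()` suits "first match" searches.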

PLINQ (Parallel LINQ)

Parallelize LINQ queries for CPU-bound operations.

Basic PLINQ

var numbers = Enumerable.Range(1, 1_000_000);

// Sequential
// Cast to long: n * n overflows int for large n, and Sum() over int throws on overflow
var sumSeq = numbers
    .Where(n => n % 2 == 0)
    .Select(n => (long)n * n)
    .Sum();

// Parallel - just add AsParallel()
var sumPar = numbers
    .AsParallel()
    .Where(n => n % 2 == 0)
    .Select(n => (long)n * n)
    .Sum();

// When order matters
var ordered = numbers
    .AsParallel()
    .AsOrdered() // Maintain source order
    .Where(n => IsPrime(n))
    .Take(100)
    .ToList();

PLINQ Options

var results = source
    .AsParallel()
    .WithDegreeOfParallelism(4)     // Limit threads
    .WithExecutionMode(ParallelExecutionMode.ForceParallelism)  // Always parallel
    .WithMergeOptions(ParallelMergeOptions.NotBuffered)  // Stream results
    .WithCancellation(cancellationToken)
    .Select(item => Process(item))
    .ToList();

When PLINQ Helps

// GOOD for PLINQ - expensive per-item operation
var processed = images
    .AsParallel()
    .Select(img => ResizeAndCompress(img))  // CPU-intensive
    .ToList();

// BAD for PLINQ - overhead exceeds benefit
var doubled = numbers
    .AsParallel()
    .Select(n => n * 2)  // Too simple
    .ToList();

// BAD - I/O bound (use async instead)
var contents = files
    .AsParallel()
    .Select(f => File.ReadAllText(f))  // I/O, not CPU
    .ToList();
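
For the I/O-bound case, an async version might look like the sketch below. `FileReader` and `ReadAllAsync` are hypothetical names, and `File.ReadAllTextAsync` assumes .NET Core 2.0 or later.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading.Tasks;

public static class FileReader
{
    // Starts every read up front and awaits them together,
    // without dedicating a worker thread to each pending file.
    public static async Task<string[]> ReadAllAsync(IEnumerable<string> paths)
    {
        IEnumerable<Task<string>> reads = paths.Select(path => File.ReadAllTextAsync(path));
        return await Task.WhenAll(reads); // results keep the order of paths
    }
}
```

For very large file sets this starts every read at once, so it may be worth combining with a concurrency limit.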

Concurrent Collections

Thread-safe collections for concurrent access.

ConcurrentDictionary

var cache = new ConcurrentDictionary<string, Data>();

// Add or get (under contention the factory may run more than once; only one result is stored)
var value = cache.GetOrAdd("key", key => LoadData(key));

// Add or update
var updated = cache.AddOrUpdate(
    "key",
    key => CreateNew(key),           // Add factory
    (key, old) => UpdateExisting(old) // Update factory
);

// Thread-safe read
if (cache.TryGetValue("key", out var data))
{
    Process(data);
}

// Thread-safe remove
if (cache.TryRemove("key", out var removed))
{
    Cleanup(removed);
}

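// Atomic update pattern (a separate dictionary, because the values are ints)
// The update delegate may be retried under contention, so keep it side-effect free
var counters = new ConcurrentDictionary<string, int>();
counters.AddOrUpdate("counter",
    _ => 1,
    (_, current) => current + 1);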
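
`GetOrAdd` may invoke its value factory more than once when several threads miss on the same key at the same time. When the factory is expensive or has side effects, one common workaround is to store `Lazy<T>` values. The sketch below assumes a hypothetical `OnceCache<TValue>` wrapper; it is one way to do this, not the only one.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public sealed class OnceCache<TValue>
{
    private readonly ConcurrentDictionary<string, Lazy<TValue>> entries = new();

    public TValue GetOrAdd(string key, Func<string, TValue> load)
    {
        // Racing threads may each build a Lazy, but only the one stored
        // in the dictionary is ever evaluated, so load runs once per key.
        Lazy<TValue> lazy = entries.GetOrAdd(key,
            k => new Lazy<TValue>(() => load(k), LazyThreadSafetyMode.ExecutionAndPublication));

        return lazy.Value;
    }
}
```

With `ExecutionAndPublication`, an exception thrown by `load` is cached and rethrown on later reads of the same key, which may or may not be the behavior you want.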

ConcurrentQueue and ConcurrentStack

var queue = new ConcurrentQueue<WorkItem>();

// Producer
queue.Enqueue(new WorkItem());

// Consumer
if (queue.TryDequeue(out var item))
{
    Process(item);
}

// ConcurrentStack - LIFO
var stack = new ConcurrentStack<int>();
stack.Push(1);
stack.PushRange(new[] { 2, 3, 4 });

if (stack.TryPop(out var value)) { }
if (stack.TryPeek(out var top)) { }

ConcurrentBag

Unordered collection optimized for scenarios where the same thread produces and consumes.

var bag = new ConcurrentBag<Result>();

Parallel.ForEach(items, item =>
{
    var result = Process(item);
    bag.Add(result);  // Thread-local storage optimization
});

var allResults = bag.ToArray();

BlockingCollection

Producer-consumer pattern with blocking operations.

using var collection = new BlockingCollection<WorkItem>(boundedCapacity: 100);

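// Producer
var producer = Task.Run(() =>
{
    foreach (var item in GetItems())
    {
        collection.Add(item);  // Blocks while the collection is full
    }
    collection.CompleteAdding();  // Lets the consumer's loop end
});

// Consumer
var consumer = Task.Run(() =>
{
    // Blocks while empty; ends after CompleteAdding() once drained
    foreach (var item in collection.GetConsumingEnumerable())
    {
        Process(item);
    }
});

// Wait for both tasks so the collection is not disposed while in use
await Task.WhenAll(producer, consumer);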

// With timeout
if (collection.TryAdd(item, TimeSpan.FromSeconds(5)))
{
    // Added successfully
}

Thread Synchronization

Choosing a Synchronization Primitive

Different primitives solve different problems. Choosing the wrong one leads to either poor performance or subtle bugs.

| Primitive | Use When | Trade-offs |
|---|---|---|
| lock | Simple mutual exclusion | Easy to use but blocks threads |
| SemaphoreSlim | Limiting concurrent access (e.g., max 5 connections) | More flexible than lock |
| ReaderWriterLockSlim | Many reads, few writes | Added complexity; pays off only in read-heavy scenarios |
| Interlocked | Simple numeric operations | Fastest, but limited to specific ops |
| Concurrent collections | Shared data structures | Thread-safe by design |

Use lock when:

  • You need simple mutual exclusion
  • The protected code is fast (no I/O, no async)
  • Only one thread should execute the critical section at a time

Use SemaphoreSlim when:

  • You need to limit concurrency (e.g., max 10 parallel HTTP calls)
  • You need async-compatible synchronization
  • You need to coordinate across async methods

Use ReaderWriterLockSlim when:

  • Reads vastly outnumber writes
  • Read operations don’t modify shared state
  • You can tolerate the added complexity

Use Interlocked when:

  • You only need simple atomic operations (increment, compare-exchange)
  • Maximum performance is critical
  • You’re implementing lock-free algorithms

Prefer concurrent collections over manually synchronizing regular collections.

lock Statement

private readonly object lockObj = new();
private int counter;

public void Increment()
{
    lock (lockObj)
    {
        counter++;
    }
}

// Don't lock on 'this' or Type objects
// BAD: lock (this)
// BAD: lock (typeof(MyClass))

SemaphoreSlim

Limit concurrent access to a resource.

private readonly SemaphoreSlim semaphore = new(maxCount: 3);

public async Task ProcessAsync(Item item)
{
    await semaphore.WaitAsync();
    try
    {
        await DoWorkAsync(item);
    }
    finally
    {
        semaphore.Release();
    }
}

// Process many items with limited concurrency
public async Task ProcessAllAsync(IEnumerable<Item> items)
{
    // ProcessAsync already waits on the semaphore, so at most 3 items run at once.
    // Acquiring it again here would deadlock once every permit is held by a waiting caller.
    var tasks = items.Select(item => ProcessAsync(item));

    await Task.WhenAll(tasks);
}
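
On .NET 6 and later, `Parallel.ForEachAsync` offers another way to cap concurrency for async work without managing a semaphore by hand. A minimal sketch, where `Throttled` and `RunAsync` are hypothetical names:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class Throttled
{
    // Runs work for every item with at most maxConcurrency invocations in flight.
    public static Task RunAsync<T>(
        IEnumerable<T> items,
        int maxConcurrency,
        Func<T, CancellationToken, ValueTask> work,
        CancellationToken ct = default)
    {
        var options = new ParallelOptions
        {
            MaxDegreeOfParallelism = maxConcurrency,
            CancellationToken = ct
        };

        return Parallel.ForEachAsync(items, options, work);
    }
}
```

A call such as `await Throttled.RunAsync(urls, 10, async (url, token) => await FetchAsync(url, token));` would keep at most ten requests active, assuming a `FetchAsync` helper of your own.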

ReaderWriterLockSlim

Multiple readers or single writer.

private readonly ReaderWriterLockSlim rwLock = new();
private Dictionary<string, string> data = new();

public string? Read(string key)  // null when the key is absent
{
    rwLock.EnterReadLock();
    try
    {
        return data.TryGetValue(key, out var value) ? value : null;
    }
    finally
    {
        rwLock.ExitReadLock();
    }
}

public void Write(string key, string value)
{
    rwLock.EnterWriteLock();
    try
    {
        data[key] = value;
    }
    finally
    {
        rwLock.ExitWriteLock();
    }
}
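
When a read sometimes has to turn into a write (a get-or-add, for example), `ReaderWriterLockSlim` provides an upgradeable read mode. The sketch below is one possible shape, using a hypothetical `ReadMostlyCache` class rather than the fields shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class ReadMostlyCache
{
    private readonly ReaderWriterLockSlim rwLock = new();
    private readonly Dictionary<string, string> data = new();

    public string GetOrAdd(string key, Func<string, string> create)
    {
        rwLock.EnterUpgradeableReadLock(); // only one thread at a time may hold this mode
        try
        {
            if (data.TryGetValue(key, out var existing))
                return existing;

            rwLock.EnterWriteLock(); // upgrade: waits for current readers to leave
            try
            {
                var created = create(key);
                data[key] = created;
                return created;
            }
            finally
            {
                rwLock.ExitWriteLock();
            }
        }
        finally
        {
            rwLock.ExitUpgradeableReadLock();
        }
    }
}
```

Because only one thread can hold upgradeable mode at a time, heavy traffic through `GetOrAdd` is serialized; plain readers that use `EnterReadLock` can still run alongside it.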

Interlocked Operations

Atomic operations without locks.

private long counter;

public void Increment() => Interlocked.Increment(ref counter);
public void Decrement() => Interlocked.Decrement(ref counter);
public void Add(long value) => Interlocked.Add(ref counter, value);
public long Read() => Interlocked.Read(ref counter);

// Compare and swap
private int state;

public bool TryTransition(int from, int to)
{
    return Interlocked.CompareExchange(ref state, to, from) == from;
}

// Exchange
public long GetAndReset()  // counter is a long, so the previous value is a long
{
    return Interlocked.Exchange(ref counter, 0);
}
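
`CompareExchange` is also the building block for updates that `Interlocked` does not provide directly. The sketch below tracks a running maximum with a retry loop; `HighWaterMark` and `Observe` are hypothetical names chosen for illustration.

```csharp
using System.Threading;

public sealed class HighWaterMark
{
    private long max = long.MinValue;

    public long Value => Interlocked.Read(ref max);

    // Raises the stored maximum to candidate if it is larger.
    // If another thread changes max between the read and the swap, the loop retries.
    public void Observe(long candidate)
    {
        long seen;
        do
        {
            seen = Interlocked.Read(ref max);
            if (candidate <= seen) return; // nothing to update
        } while (Interlocked.CompareExchange(ref max, candidate, seen) != seen);
    }
}
```

The same read, compute, compare-and-swap shape works for any update that can be expressed as a pure function of the current value.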

Task Parallel Library Patterns

Task.Run for CPU-Bound Work

// Offload CPU-bound work from UI thread
private async void Calculate_Click(object sender, EventArgs e)
{
    var result = await Task.Run(() =>
    {
        return ExpensiveCalculation();
    });

    DisplayResult(result);
}

// Don't wrap async methods in Task.Run
// BAD:
await Task.Run(async () => await httpClient.GetStringAsync(url));

// GOOD:
await httpClient.GetStringAsync(url);

Continuation Tasks

var task = GetDataAsync();

// Continue when task completes
var continuation = task.ContinueWith(t =>
{
    if (t.IsCompletedSuccessfully)
        Process(t.Result);
    else if (t.IsFaulted)
        HandleError(t.Exception);
});

// Prefer async/await over ContinueWith
var data = await GetDataAsync();
Process(data);

TaskCompletionSource

Bridge between callback-based APIs and Task-based APIs.

public Task<string> DownloadAsync(string url)
{
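    // RunContinuationsAsynchronously keeps awaiting code from running inline on the callback thread
    var tcs = new TaskCompletionSource<string>(TaskCreationOptions.RunContinuationsAsynchronously);

    // WebClient is obsolete in modern .NET; it stands in here for any event-based API
    var client = new WebClient();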
    client.DownloadStringCompleted += (s, e) =>
    {
        if (e.Cancelled)
            tcs.SetCanceled();
        else if (e.Error != null)
            tcs.SetException(e.Error);
        else
            tcs.SetResult(e.Result);
    };

    client.DownloadStringAsync(new Uri(url));

    return tcs.Task;
}

Channels (Modern Producer-Consumer)

using System.Threading.Channels;

// Bounded channel - WriteAsync waits (asynchronously) when full
var bounded = Channel.CreateBounded<Message>(new BoundedChannelOptions(100)
{
    FullMode = BoundedChannelFullMode.Wait,
    SingleReader = false,
    SingleWriter = false
});

// Unbounded channel - never blocks writes
var unbounded = Channel.CreateUnbounded<Message>();

// Producer
public async Task ProduceAsync(ChannelWriter<Message> writer, CancellationToken ct)
{
    try
    {
        while (!ct.IsCancellationRequested)
        {
            var message = await GetNextMessageAsync(ct);
            await writer.WriteAsync(message, ct);
        }
    }
    finally
    {
        writer.Complete();
    }
}

// Consumer
public async Task ConsumeAsync(ChannelReader<Message> reader, CancellationToken ct)
{
    await foreach (var message in reader.ReadAllAsync(ct))
    {
        await ProcessMessageAsync(message);
    }
}

// Usage
var channel = Channel.CreateBounded<Message>(100);
var producerTask = ProduceAsync(channel.Writer, cts.Token);
var consumerTask = ConsumeAsync(channel.Reader, cts.Token);
await Task.WhenAll(producerTask, consumerTask);
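
To see the pieces working end to end, here is a small self-contained sketch with one producer and several consumers sharing a bounded channel. `ChannelPipeline` and `SumAsync` are hypothetical names, and the capacity of 16 is arbitrary.

```csharp
using System.Linq;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelPipeline
{
    // Writes 1..count into a bounded channel and sums the values with several consumers.
    public static async Task<long> SumAsync(int count, int consumers)
    {
        var channel = Channel.CreateBounded<int>(capacity: 16);

        var producer = Task.Run(async () =>
        {
            try
            {
                for (int i = 1; i <= count; i++)
                    await channel.Writer.WriteAsync(i); // waits asynchronously while the buffer is full
            }
            finally
            {
                channel.Writer.Complete(); // lets ReadAllAsync finish once the buffer drains
            }
        });

        var workers = Enumerable.Range(0, consumers).Select(_ => Task.Run(async () =>
        {
            long subtotal = 0;
            await foreach (int value in channel.Reader.ReadAllAsync())
                subtotal += value; // each item is delivered to exactly one consumer
            return subtotal;
        })).ToArray();

        await producer;
        long[] subtotals = await Task.WhenAll(workers);
        return subtotals.Sum();
    }
}
```

Each consumer keeps its own subtotal, so no synchronization is needed until the results are combined at the end.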

Thread-Safe Patterns

Lazy Thread-Safe Initialization

// Lazy<T> with thread safety
private readonly Lazy<ExpensiveObject> lazy =
    new Lazy<ExpensiveObject>(() => new ExpensiveObject());

public ExpensiveObject Instance => lazy.Value;

// Double-check locking (manual pattern)
private volatile ExpensiveObject? instance;
private readonly object lockObj = new();

public ExpensiveObject Instance
{
    get
    {
        if (instance == null)
        {
            lock (lockObj)
            {
                instance ??= new ExpensiveObject();
            }
        }
        return instance;
    }
}

Immutable State Updates

// Thread-safe state updates using immutable types
private ImmutableList<Item> items = ImmutableList<Item>.Empty;
private readonly object lockObj = new();

public void AddItem(Item item)
{
    lock (lockObj)
    {
        items = items.Add(item);
    }
}

// Or using Interlocked with compare-exchange
private ImmutableList<Item> items = ImmutableList<Item>.Empty;

public void AddItemLockFree(Item item)
{
    ImmutableList<Item> initial, updated;
    do
    {
        initial = items;
        updated = initial.Add(item);
    } while (Interlocked.CompareExchange(ref items, updated, initial) != initial);
}
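
The compare-exchange loop above can also be delegated to `ImmutableInterlocked.Update` from `System.Collections.Immutable`, which retries the transformation until the swap succeeds. A minimal sketch, with `ItemLog` as a hypothetical class and `string` standing in for `Item`:

```csharp
using System.Collections.Immutable;
using System.Threading;

public sealed class ItemLog
{
    private ImmutableList<string> items = ImmutableList<string>.Empty;

    // Readers get a consistent snapshot that later writes never modify.
    public ImmutableList<string> Snapshot => Volatile.Read(ref items);

    // The transformer may run more than once under contention, so it must be side-effect free.
    public void Add(string item) =>
        ImmutableInterlocked.Update(ref items, list => list.Add(item));
}
```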

Common Pitfalls

Race Conditions

// BAD - race condition
private int counter;
public void Increment()
{
    counter++;  // Not atomic: read-modify-write
}

// GOOD - atomic increment
public void IncrementSafe()
{
    Interlocked.Increment(ref counter);
}

Deadlocks

// DEADLOCK potential - acquiring locks in different order
public void Method1()
{
    lock (lockA)
    {
        lock (lockB) { }  // Waits for B
    }
}

public void Method2()
{
    lock (lockB)
    {
        lock (lockA) { }  // Waits for A
    }
}

// SOLUTION - always acquire locks in same order
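
One way to enforce a consistent order is to derive it from the data itself. The sketch below uses a hypothetical `Account` class and orders locks by a unique `Id`; it assumes ids never collide between distinct accounts.

```csharp
public sealed class Account
{
    private readonly object gate = new();

    public int Id { get; }
    public decimal Balance { get; private set; }

    public Account(int id, decimal balance)
    {
        Id = id;
        Balance = balance;
    }

    // Always locks the account with the smaller Id first,
    // so two opposite transfers can never wait on each other.
    public static void Transfer(Account from, Account to, decimal amount)
    {
        if (from.Id == to.Id) return; // same account: nothing to move

        Account first = from.Id < to.Id ? from : to;
        Account second = from.Id < to.Id ? to : from;

        lock (first.gate)
        {
            lock (second.gate)
            {
                from.Balance -= amount;
                to.Balance += amount;
            }
        }
    }
}
```

With this ordering, `Transfer(a, b, ...)` and `Transfer(b, a, ...)` running at the same time both take the locks in the same sequence, which removes the cycle that causes the deadlock.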

Closure Capture in Loops

// BAD - all tasks capture same variable
for (int i = 0; i < 10; i++)
{
    Task.Run(() => Console.WriteLine(i));  // Might print 10 ten times
}

// GOOD - capture copy
for (int i = 0; i < 10; i++)
{
    int captured = i;
    Task.Run(() => Console.WriteLine(captured));
}

// Or use foreach (captures correctly since C# 5)
foreach (var item in items)
{
    Task.Run(() => Process(item));  // OK
}

Version History

| Feature | Version | Significance |
|---|---|---|
| Parallel class | .NET Framework 4.0 | Parallel loops |
| PLINQ | .NET Framework 4.0 | Parallel LINQ |
| Concurrent collections | .NET Framework 4.0 | Thread-safe collections |
| Task Parallel Library | .NET Framework 4.0 | Task-based parallelism |
| SemaphoreSlim | .NET Framework 4.0 | Lightweight semaphore |
| Channels | .NET Core 2.1 | Modern producer-consumer |
| IAsyncEnumerable | .NET Core 3.0 | Async streams |

Key Takeaways

Use async for I/O, parallel for CPU: Async frees threads during I/O waits. Parallel uses multiple threads for CPU work.

Limit parallelism: Don’t spawn unlimited parallel tasks. Use MaxDegreeOfParallelism or semaphores.

Prefer concurrent collections: ConcurrentDictionary and friends are optimized for concurrent access.

Lock minimally: Hold locks for the shortest time possible. Consider lock-free alternatives.

Use channels for producer-consumer: Channels provide efficient, modern producer-consumer patterns.

Test concurrent code carefully: Race conditions are timing-dependent, so a single passing run proves little. Use stress tests that repeat operations under load, along with concurrency-testing tools.
