
Exploring concurrency, parallelism and asynchronous programming in .NET


In the world of modern software development, the ability to perform multiple tasks simultaneously is important for building responsive and efficient applications. Have you ever wondered how your favorite apps handle multiple operations at once without slowing down? Or how complex computations are performed in the blink of an eye? Concurrency, parallelism and asynchronous programming are the key techniques that make this possible.

Concurrency is the ability of a system to manage multiple tasks at the same time, although not necessarily simultaneously. It executes tasks in a way that makes progress on several tasks without waiting for one to complete before starting another. Imagine you are juggling multiple balls, giving each one a little attention — this is concurrency.

Parallelism takes concurrency a step further by executing multiple tasks simultaneously, typically on multiple CPU cores. This allows multitasking, where different parts of a program run at the same time. It’s like having multiple jugglers, each handling their own set of balls at the same time.

Asynchronous programming allows tasks to run independently of the main program flow, keeping applications responsive even while waiting for long-running operations to complete. This is important for tasks like fetching data from a server or reading files, which can take time. It’s like setting an alarm clock to ring when a task is done, while you move on to other things. When the alarm rings, you come back to it — this is asynchronous programming.

Threads and Tasks

Concurrency and parallelism are achieved using techniques such as threads, tasks and parallel loops. At the core of concurrency in .NET are threads and tasks. A thread is the smallest unit of execution within a process, while a task represents an asynchronous operation that can be scheduled to run on a thread.

When you create a new thread in .NET, the runtime allocates resources and schedules the thread to run on one of the CPU cores. The operating system’s scheduler manages the execution of threads, switching between them to ensure fair resource allocation. Tasks provide a higher-level abstraction over threads, making it easier to work with asynchronous operations. The Task Parallel Library (TPL) in .NET manages tasks, handling their creation, scheduling and execution on the available threads.

// Queue work on a thread-pool worker thread
Task.Run(() =>
{
    Thread.Sleep(1000);
    Console.WriteLine("Task completed!");
});
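
For comparison, the same work can run on a dedicated thread created directly; a minimal sketch:

var thread = new Thread(() =>
{
    Thread.Sleep(1000);
    Console.WriteLine("Thread completed!");
});
thread.Start();
thread.Join(); // block the caller until the thread finishes

Unlike Task.Run, which borrows a pooled worker, new Thread pays the full cost of creating a thread with its own stack, which is wasteful for short-lived work.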

When a new thread is created, the .NET runtime initializes a new thread object, allocating necessary resources like stack memory and a thread control block (TCB) to store thread-specific information. The runtime makes a system call to the operating system to create the actual thread at the OS level, allocating additional resources and setting up the thread’s execution context. The thread is initially placed in a “new” state and transitions to the “ready” state when it is ready to run.

The OS scheduler manages the execution of threads. It decides which threads run on which CPU cores and for how long, using scheduling algorithms to manage this process efficiently. Threads in the “ready” state are placed in a ready queue, with different queues for different priority levels. The scheduler periodically performs context switching, saving the state of the currently running thread and loading the state of the next thread to be executed. This allows multiple threads to share CPU time. The scheduler assigns threads to available CPU cores based on factors such as priority, load balancing and affinity (preferred execution on certain cores).

When a new task is created using Task.Run or other methods, the Task Parallel Library (TPL) queues the task for execution. The TPL uses the .NET thread pool to manage a pool of worker threads. The thread pool dynamically adjusts the number of worker threads based on workload, adding or removing threads to optimize performance. The TPL uses a work-stealing algorithm to balance the load among threads. Idle threads can “steal” tasks from the queues of busy threads. Worker threads execute tasks from the queue, updating the task’s state upon completion. Completed tasks may trigger continuation tasks or callbacks.
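
A sketch of such a continuation, chained with ContinueWith (the values here are illustrative):

Task.Run(() => 42)                 // executed by a thread-pool worker
    .ContinueWith(antecedent =>
    {
        // runs only after the antecedent task has completed
        Console.WriteLine($"Result: {antecedent.Result}");
    });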

Asynchronous programming

Asynchronous programming in .NET is simplified using the async and await keywords. These allow you to write non-blocking code that keeps your application responsive while waiting for long-running operations to complete. When you use the await keyword, the method pauses its execution until the awaited task completes. The .NET runtime manages this by releasing the thread to perform other tasks while waiting and resuming the method on the captured context once the awaited task is done.

public async Task<string> FetchDataAsync()
{
    using (HttpClient client = new HttpClient())
    {
        string data = await client.GetStringAsync("https://example.com");
        return data;
    }
}

When an async method is called, it returns a Task object that represents the ongoing operation. This allows the calling code to continue executing or to await the task itself. The compiler transforms the async method into a state machine. This state machine keeps track of the method’s progress and knows how to continue execution after each await. When the method hits an await, the current context (e.g., the synchronization context for UI threads) is captured. This ensures that after the awaited task completes, the method can resume on the same context, maintaining thread affinity where necessary. The thread executing the method is released back to the thread pool, allowing it to perform other tasks. This is what makes async methods non-blocking. Once the awaited task completes, the runtime schedules the continuation of the async method. This could be on the original context (e.g., the UI thread) or on a thread-pool thread, depending on the captured context.
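
When thread affinity is not needed (library code, for instance), capturing the context can be skipped with ConfigureAwait(false), so the continuation may resume on any thread-pool thread; a minimal sketch based on the earlier example:

public async Task<string> FetchDataNoContextAsync()
{
    using (HttpClient client = new HttpClient())
    {
        // Do not capture the synchronization context;
        // the continuation can run on any thread-pool thread.
        string data = await client.GetStringAsync("https://example.com")
                                  .ConfigureAwait(false);
        return data;
    }
}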

By releasing threads during long-running operations, asynchronous programming allows the CPU to perform other tasks, improving overall utilization. Since async methods don’t block threads, there’s less contention for thread pool resources, leading to better performance and scalability. Applications, especially UI-based ones, stay responsive because the main thread isn’t blocked by long-running operations.

Impact on OS and hardware

The OS scheduler manages the execution of threads. It decides which threads run on which CPU cores and for how long, balancing different factors to optimize system performance. The scheduler uses algorithms like round-robin, priority-based scheduling or multilevel queue scheduling to allocate CPU time to threads, ensuring fair resource allocation and reducing context-switching overhead. When the scheduler switches from one thread to another, it performs a context switch, saving the state of the current thread and loading the state of the next one. The scheduler distributes threads across CPU cores to balance the load, preventing some cores from being overloaded while others are underutilized.
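
In .NET you can hint at these scheduling decisions through the Thread.Priority property; a minimal sketch (the OS scheduler remains free to weigh the hint as it sees fit):

var worker = new Thread(() =>
{
    // low-priority background work
});
worker.Priority = ThreadPriority.BelowNormal; // hint to the OS scheduler
worker.Start();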

At the hardware level, the CPU executes instructions from multiple threads, using its architecture to run tasks efficiently in parallel. Hyper-Threading allows a single physical CPU core to execute multiple threads simultaneously. Each core has two separate sets of architectural state, appearing as two logical cores to the OS. This improves throughput by utilizing execution units within the core that would otherwise sit idle. Multiple CPU cores allow full parallelism, where each core can execute a different thread concurrently. CPUs also exploit Instruction-Level Parallelism (ILP) by executing multiple instructions from the same thread in parallel. Techniques such as pipelining, superscalar execution and out-of-order execution enhance parallelism at the instruction level.

Asynchronous methods in .NET are designed to enhance system efficiency and responsiveness. When an async method awaits a task, the runtime releases the thread back to the thread pool, allowing other tasks to run. This dynamic management of threads prevents the creation of too many threads, reducing the load on the CPU and avoiding excessive context switching. Asynchronous methods ensure that threads are not blocked while waiting for I/O operations, which leads to better CPU and memory utilization because threads do not sit idle.
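
For example, an asynchronous file read blocks no thread for the duration of the I/O; a minimal sketch, assuming a modern .NET runtime where File.ReadAllTextAsync is available:

public async Task<string> ReadFileAsync(string path)
{
    // No thread is blocked while the OS performs the read;
    // the continuation is scheduled once the I/O completes.
    return await File.ReadAllTextAsync(path);
}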

The CPU periodically interrupts running threads to allow the OS scheduler to make scheduling decisions. During an interrupt, the CPU saves the state of the current thread (registers, program counter, etc.) and invokes the scheduler. The scheduler loads the state of the next thread to be executed and restores its context. The scheduler assigns threads to available CPU cores.

Parallelism and the power of modern CPUs

Parallelism breaks down a task into smaller sub-tasks that can be processed simultaneously, using multiple CPU cores to improve performance. Modern CPUs come with multiple cores and support simultaneous multithreading (SMT) technologies, such as Intel’s Hyper-Threading.

A large task is divided into smaller, independent sub-tasks. Each sub-task is self-contained and can be executed concurrently, and each is assigned to a separate thread. Modern CPUs with multiple cores can execute multiple threads simultaneously. Hyper-Threading further enhances this capability by allowing each core to handle multiple hardware threads. For example, if a CPU has four cores, it can handle four sub-tasks at the same time, speeding up the overall execution compared to sequential processing.

The OS scheduler allocates CPU time to threads and distributes them across the available cores. It ensures that threads are balanced to maximize CPU utilization and minimize idle time. The scheduler dynamically adjusts the distribution of threads to balance the load across cores, preventing any single core from becoming a bottleneck. When a core switches from one thread to another, the OS performs a context switch. Although context switching introduces some overhead, the performance gains from parallelism typically outweigh this cost.

In .NET, the TPL provides a simple way to parallelize tasks. The Parallel class and Task class are commonly used to create and manage parallel tasks.

Parallel.For(0, 100, i =>
{
    ProcessItem(i);
});
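
The degree of parallelism can also be capped explicitly; a sketch using ParallelOptions, assuming the same ProcessItem method:

// Limit the loop to the number of logical cores
var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

Parallel.For(0, 100, options, i =>
{
    ProcessItem(i);
});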

PLINQ (Parallel LINQ) enables parallel processing of LINQ queries, allowing operations on collections to be executed in parallel.

var results = data.AsParallel().Where(item => item.IsValid).ToList();
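
PLINQ can be tuned in the same spirit; a sketch assuming the same data collection:

var results = data.AsParallel()
                  .WithDegreeOfParallelism(Environment.ProcessorCount)
                  .Where(item => item.IsValid)
                  .ToList();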

Combining asynchronous programming with parallelism allows for efficient and non-blocking operations that utilize multiple cores.

public async Task ProcessDataAsync(List<int> data)
{
    var tasks = data.Select(item => Task.Run(() => ProcessItem(item))).ToArray();
    await Task.WhenAll(tasks);
}

When a parallel task is created in .NET, the runtime interacts with the OS to request resources; the task is queued in the thread pool, and an available worker thread picks it up for execution. To prevent race conditions and ensure thread-safe operations, .NET provides synchronization mechanisms such as lock, mutex and semaphore. These mechanisms coordinate access to shared resources and ensure data integrity.

Race conditions

Concurrency introduces the challenge of managing access to shared resources. A race condition occurs when the behavior of a software system depends on the relative timing of events, such as the order in which threads execute, access shared resources or modify data. This can lead to unpredictable and incorrect outcomes.

Race conditions occur due to the concurrent execution of instructions by the CPU. Modern CPUs have multiple cores and each core can execute threads independently. This concurrency allows multiple threads to run simultaneously, potentially accessing the same memory locations. When multiple threads access shared resources (variables, data structures or hardware registers) without proper synchronization, they may read or write data inconsistently.

Let’s imagine two threads, A and B, incrementing a shared variable counter.

  • Thread A reads counter (value = 0)
  • Thread B reads counter (value = 0)
  • Thread A increments counter (value = 1)
  • Thread B increments counter (value = 1)
  • Both threads write back their values, resulting in counter being 1 instead of the expected 2.

int counter = 0;
Parallel.For(0, 1000, i =>
{
    counter++; // Potential race condition
});

Multitasking allows the OS to interrupt running threads, leading to potential race conditions if threads access shared resources without synchronization. During context switches, the OS saves the state of the currently running thread and restores the state of the next thread to be executed. This switching can occur at any time, causing race conditions if not managed properly. Deadlocks occur when two or more threads are waiting for each other to release resources, causing them to be stuck indefinitely. Techniques to prevent deadlocks include resource ordering and timeout-based locking mechanisms.
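
A minimal sketch of timeout-based locking with Monitor.TryEnter, which gives up after a deadline instead of waiting forever (the 500 ms timeout and the shared counter field are illustrative):

private static readonly object resourceLock = new object();

public bool TryIncrement()
{
    // Give up after 500 ms instead of blocking indefinitely
    if (Monitor.TryEnter(resourceLock, TimeSpan.FromMilliseconds(500)))
    {
        try
        {
            counter++;
        }
        finally
        {
            Monitor.Exit(resourceLock);
        }
        return true;
    }
    return false; // lock not acquired; the caller can retry or back off
}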

To prevent race conditions, developers must use synchronization mechanisms to ensure that only one thread accesses a shared resource at a time or that operations on shared resources are atomic. .NET provides synchronization mechanisms, such as locks, semaphores and concurrent collections.
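
For example, the concurrent collections in System.Collections.Concurrent handle the locking internally; a minimal sketch with ConcurrentDictionary (the word-counting scenario is illustrative):

private static readonly ConcurrentDictionary<string, int> wordCounts =
    new ConcurrentDictionary<string, int>();

public void CountWord(string word)
{
    // AddOrUpdate performs the read-modify-write as a single thread-safe operation
    wordCounts.AddOrUpdate(word, 1, (key, current) => current + 1);
}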

Mutexes (mutual exclusion locks) are locking mechanisms that ensure only one thread can access a critical section of code at a time. When a thread acquires a mutex, other threads trying to acquire it are blocked until the mutex is released.

private static readonly Mutex mutex = new Mutex();

public void Increment()
{
    mutex.WaitOne();
    try
    {
        counter++;
    }
    finally
    {
        mutex.ReleaseMutex();
    }
}

A semaphore controls access to a resource by allowing a fixed number of threads to access it simultaneously. It’s useful for managing a pool of resources.

// Allows up to 3 threads
private static readonly Semaphore semaphore = new Semaphore(3, 3);

public void AccessResource()
{
    semaphore.WaitOne();
    try
    {
        // Access shared resource
    }
    finally
    {
        semaphore.Release();
    }
}

Monitors (or locks) are higher-level synchronization mechanisms that provide mutual exclusion and the ability to wait for a condition to be met.

private static readonly object lockObject = new object();

public void Increment()
{
    lock (lockObject)
    {
        counter++;
    }
}

Modern CPUs and OSs provide atomic operations, which are indivisible and cannot be interrupted. These operations ensure that race conditions do not occur during the execution of critical instructions.

public void Increment()
{
    Interlocked.Increment(ref counter);
}

In .NET, the volatile keyword ensures that a variable is always read from and written to main memory, preventing the compiler and CPU from reordering those accesses in ways that might lead to race conditions. Note that volatile alone does not make compound operations such as counter++ atomic; for those, Interlocked is the right tool.

private volatile int counter;

public void SetCounter(int value)
{
    counter = value;
}

Use thread-local storage for variables that multiple threads access but should not share; each thread gets its own isolated copy of the data.

private static ThreadLocal<int> threadLocalVariable = new ThreadLocal<int>(() => 0);

public void UseThreadLocal()
{
    // Each thread has its own copy of the variable
    threadLocalVariable.Value = 42;
}

Event loop

The event loop maintains a task queue, which holds callbacks and tasks that need to be executed. It continuously iterates through the task queue, picking up tasks and executing them one at a time. Asynchronous I/O operations, such as reading a file, network requests and timers, register callbacks in the task queue without blocking the main execution thread.

In .NET, the SynchronizationContext class and the Task-based Asynchronous Pattern (TAP) facilitate the execution of asynchronous tasks. The event loop enables concurrency by allowing multiple tasks to be managed and executed. This is important for I/O tasks, where the main thread can continue executing other tasks while waiting for I/O operations to complete. The SynchronizationContext class is responsible for capturing the current execution context and scheduling tasks on the appropriate thread. This is widely used for UI applications, where tasks need to be executed on the main UI thread.

public async Task PerformAsyncTask()
{
    // Capture the current context (non-null on UI threads)
    var context = SynchronizationContext.Current;
    await Task.Run(() =>
    {
        // Perform background work
    }).ContinueWith(t =>
    {
        context.Post(_ =>
        {
            // Update UI on the main thread
        }, null);
    });
}

Task-based Asynchronous Pattern (TAP) uses the event loop to manage asynchronous tasks. The async and await keywords simplify the development of asynchronous code, allowing tasks to be executed without blocking the main thread. When an async method awaits a task, it registers the continuation of the method as a callback in the event loop. The method returns control to the caller, allowing the main thread to perform other tasks. Once the awaited task completes, the event loop picks up the continuation and continues the method.
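
This callback registration can be made explicit with TaskCompletionSource, a common bridge from callback-based code into TAP; a minimal sketch (the simulated work is illustrative):

public Task<int> RunWithCallbackAsync()
{
    var tcs = new TaskCompletionSource<int>();

    // Simulate a callback-based API completing on another thread
    ThreadPool.QueueUserWorkItem(_ =>
    {
        Thread.Sleep(100);  // pretend work
        tcs.SetResult(42);  // completing the task schedules any awaiting continuation
    });

    return tcs.Task;
}

// Usage: the await registers the rest of the calling method as the continuation
// int result = await RunWithCallbackAsync();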

To sum up

Understanding concurrency, parallelism and asynchronous programming opens up a world of possibilities for creating efficient, responsive and scalable applications. From the high-level abstractions provided by tasks and async methods to the low-level interactions with the OS scheduler and CPU, these concepts empower developers to use the full potential of modern hardware.
