The dangers of ThreadLocal

Languages and frameworks evolve. As developers, we constantly have to learn new things and unlearn what we already know. Speaking for myself, unlearning is the most difficult part of continuous learning. When I first came into contact with multi-threaded applications in .NET, I stumbled over the ThreadStatic attribute. I made a mental note that this attribute is particularly helpful when you have static fields that should not be shared between threads. When .NET Framework 4.0 was released, I discovered the ThreadLocal<T> class and how it does a better job of assigning default values to thread-specific data. So I unlearned the ThreadStaticAttribute, favoring ThreadLocal<T> instead.

Fast forward a while, to when I started digging into async/await. I fell victim to the belief that thread-specific data would still work. So I was wrong, again, and had to unlearn, again! If only I had known about AsyncLocal earlier.

Let's learn and unlearn together!

TL;DR

  • A Task is not a Thread. A Task is a future or promise that is eventually executed by a worker Thread.
  • If you need ambient data local to the asynchronous control flow, for example to cache WCF communication channels, use AsyncLocal (provided by .NET 4.6 or the Core CLR) instead of ThreadStaticAttribute or ThreadLocal.

There is no spoon

Some things just stick in our minds. When the world around us evolves, those things might no longer be true. All of us who started programming with .NET in the pre-.NET 4.0 era have internalized what threads are and how useful they are for executing intensive operations in parallel. We knew the difference between foreground and background threads and how important it is not to block the UI thread.

Then the Task Parallel Library was released and turned our world upside down. When looking at System.Threading.Tasks.Task, we suddenly felt like Neo from The Matrix and realized there is no spoon, or in our context, no Thread!

I'm not the first one to use the Matrix analogy to describe the difference between a Thread and a Task. Stephen Cleary's excellent 2013 post, "There Is No Thread," uses the same analogy and dives deeper into the differences. I highly recommend reading it.

A Task under the Task Parallel Library is a future or promise. A Task is something you want to be done. In contrast, a Thread is one of the many possible workers who might perform that task. By the contract of a Task, we don't know whether it will be scheduled, whether it will be immediately executed, or whether it is already done the moment we declared it. The Task Parallel Library runtime has the built-in smarts to decide whether a task is executed on the thread that created it or if it needs to be scheduled on the worker thread pool or the IO thread pool. Furthermore, just because a thread was working on a given task doesn't mean that thread will execute all the continuations of that task.
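To make the last point tangible, here is a minimal sketch that records which thread runs the code before and after an await. The class and method names (ThreadHopDemo, ObserveAsync) are made up for illustration and are not part of the example used in the rest of this post.

```csharp
using System;
using System.Threading.Tasks;

public static class ThreadHopDemo
{
    // Returns the managed thread IDs observed before and after an await.
    public static async Task<(int Before, int After)> ObserveAsync()
    {
        int before = Environment.CurrentManagedThreadId;

        // The continuation after this await is scheduled again and may be
        // picked up by any available thread pool thread.
        await Task.Delay(10).ConfigureAwait(false);

        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }
}
```

Calling `var (before, after) = await ThreadHopDemo.ObserveAsync();` from a console application will often yield two different IDs, but nothing guarantees it. That uncertainty is exactly the point: the Task makes no promise about which thread does the work.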

This gets even more complex when we introduce async/await into the equation. Any time you write an await expression, along with your friend ConfigureAwait(false), the thread currently executing the task can yield back and start executing other tasks. When the awaited operation completes, the remainder of the method (the continuation) is again scheduled as a Task on the currently responsible TaskScheduler. The previously responsible thread might pick up that task and continue working on it. Alternatively, any other available thread could do it. The following code illustrates this:

static dynamic Local;
static ThreadLocal<string> ThreadLocal = new ThreadLocal<string>(() => "Initial Value");

public async Task ThereIsNoSpoon()
{
    // Assign the ThreadLocal to the dynamic field
    Local = ThreadLocal;

    Console.WriteLine($"Before TopOne: '{Local.Value}'");
    await TopOne().ConfigureAwait(false);
    Console.WriteLine($"After TopOne: '{Local.Value}'");
    await TopTen().ConfigureAwait(false);
    Console.WriteLine($"After TopTen: '{Local.Value}'");
}

static async Task TopOne()
{
   await Task.Delay(10).ConfigureAwait(false);
   Local.Value = "ValueSetBy TopOne";
   await Somewhere().ConfigureAwait(false);
}

static async Task TopTen()
{
   await Task.Delay(10).ConfigureAwait(false);
   Local.Value = "ValueSetBy TopTen";
   await Somewhere().ConfigureAwait(false);
}

static async Task Somewhere()
{
   await Task.Delay(10).ConfigureAwait(false);
   Console.WriteLine($"Inside Somewhere: '{Local.Value}'");
   await Task.Delay(10).ConfigureAwait(false);
   await DeepDown().ConfigureAwait(false);
}

static async Task DeepDown()
{
   await Task.Delay(10).ConfigureAwait(false);
   Console.WriteLine($"Inside DeepDown: '{Local.Value}'");
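   // Fire-and-forget: Ignore() is an extension method that discards the Task and suppresses warning CS4014
   Fire().Ignore();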
}

static async Task Fire()
{
   await Task.Yield();
   Console.WriteLine($"Inside Fire: '{Local.Value}'");
}

The above code should be relatively straightforward to understand. The entry point is the method ThereIsNoSpoon, which calls two methods, TopOne and TopTen. These methods set the ThreadLocal value to ValueSetBy TopOne and ValueSetBy TopTen, respectively. Both methods call the method Somewhere, which prints the value of the ThreadLocal and calls DeepDown. DeepDown prints the value of the ThreadLocal again, and then kicks off an asynchronous method called Fire without awaiting it (hence the method Ignore, which suppresses the compiler warning CS4014). This code uses a tiny trick that lets us reuse it later to demonstrate AsyncLocal: the methods in the execution path access the static field Local, which is typed as dynamic. Both ThreadLocal<T> and AsyncLocal<T> provide a Value property, so we can assign either one to Local without duplicating code.
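The Ignore method is not part of the .NET Framework. One possible way to write it, as a sketch rather than the definitive implementation, is an extension method that deliberately does nothing with the task. The class name TaskIgnoreExtensions is made up for this example.

```csharp
using System.Threading.Tasks;

public static class TaskIgnoreExtensions
{
    // Hypothetical helper: passing the Task to a method counts as "using" it,
    // so the compiler no longer reports CS4014 for the un-awaited call.
    // Note that this sketch also means exceptions thrown by the task go unobserved.
    public static void Ignore(this Task task)
    {
        // Intentionally empty.
    }
}
```

With this in scope, Fire().Ignore() compiles without warnings and returns immediately, while Fire continues on its own.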

The output of the above code looks similar to the following (your result might vary):

Before TopOne: 'Initial Value'
Inside Somewhere: 'ValueSetBy TopOne'
Inside DeepDown: 'ValueSetBy TopOne'
After TopOne: 'ValueSetBy TopOne'
Inside Fire: 'Initial Value'
Inside Somewhere: 'Initial Value'
Inside DeepDown: 'ValueSetBy TopTen'
Inside Fire: 'Initial Value'
After TopTen: 'ValueSetBy TopTen'

Before the TopOne method executes, the ThreadLocal has its default value, Initial Value. TopOne then assigns the value ValueSetBy TopOne to the ThreadLocal. That value stays visible only as long as the continuations happen to run on the thread that set it. The Fire method is started without being awaited, and its continuation gets scheduled on the default TaskScheduler, the ThreadPool. Whenever that continuation runs on a different thread, the previously assigned value is not there, and Fire can only read the ThreadLocal's default value. In our example, reading the initial value has no consequences. But what if you used the ThreadLocal to cache expensive communication objects such as WCF channels? You'd assume that those expensive objects are cached and reused when executing the Fire method. But they wouldn't be. This would create a dangerous hot path in your codebase, furiously creating new expensive objects with each invocation. You'd have a difficult time discovering this until your shopping cart system crashed spectacularly on the busiest day of the holiday shopping season.
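Because the thread pool makes the output above vary from run to run, the underlying effect is easier to see with a dedicated second thread. The following sketch sets a ThreadLocal on one thread and reads it from another. The names ThreadLocalVisibilityDemo and Cache, and the string "ExpensiveChannel" standing in for a cached object, are assumptions made for this illustration.

```csharp
using System;
using System.Threading;

public static class ThreadLocalVisibilityDemo
{
    // The factory supplies the per-thread default value.
    static readonly ThreadLocal<string> Cache =
        new ThreadLocal<string>(() => "Initial Value");

    // Sets the value on the calling thread, then reads it from a second thread.
    public static (string OnSettingThread, string OnOtherThread) Run()
    {
        Cache.Value = "ExpensiveChannel"; // stand-in for a cached object

        string seenElsewhere = "";
        var other = new Thread(() => seenElsewhere = Cache.Value);
        other.Start();
        other.Join();

        return (Cache.Value, seenElsewhere);
    }
}
```

Run() returns "ExpensiveChannel" for the thread that set the value and "Initial Value" for the other thread. A continuation that lands on a different thread pool thread is in the same position as that second thread: whatever was cached is simply not visible to it.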

Long story short: a Task is not a Thread. Along with async/await, we must say goodbye to thread-local data! Say hello to AsyncLocal.

Bend it with your mind

AsyncLocal<T> is a class introduced in the .NET Framework 4.6 (also available in the new CoreCLR). According to MSDN, AsyncLocal<T> "represents ambient data that is local to a given asynchronous control flow."

Let's decipher that statement. An asynchronous control flow can be seen as the call stack of an asynchronous method call chain. In the example above, the asynchronous control flow, from the point of view of the AsyncLocal, starts when we set the Value property. So the two control flows would be:

Flow 1: TopOne > Somewhere > DeepDown > Fire

Flow 2: TopTen > Somewhere > DeepDown > Fire

So the promise of the AsyncLocal is that we can assign a value to it that's present as long as we are inside the same asynchronous control flow. We'll prove that with the following code.

static AsyncLocal<string> AsyncLocal = new AsyncLocal<string> { Value = "Initial Value" };

public async Task BendItWithYourMind()
{
    // Assign the AsyncLocal to the dynamic field
    Local = AsyncLocal;

    // This code is the same as before but shown again for clarity
    Console.WriteLine($"Before TopOne: '{Local.Value}'");
    await TopOne().ConfigureAwait(false);
    Console.WriteLine($"After TopOne: '{Local.Value}'");
    await TopTen().ConfigureAwait(false);
    Console.WriteLine($"After TopTen: '{Local.Value}'");
}

The output of the above code looks similar to this:

Before TopOne: 'Initial Value'
Inside Somewhere: 'ValueSetBy TopOne'
Inside DeepDown: 'ValueSetBy TopOne'
After TopOne: 'Initial Value'
Inside Fire: 'ValueSetBy TopOne'
Inside Somewhere: 'ValueSetBy TopTen'
Inside DeepDown: 'ValueSetBy TopTen'
Inside Fire: 'ValueSetBy TopTen'
After TopTen: 'Initial Value'

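As we can see, the code finally behaves as we anticipated from the beginning. Outside the two asynchronous control flows, the AsyncLocal keeps its default value, "Initial Value": the assignments made inside TopOne and TopTen do not flow back to the caller, which is why the lines After TopOne and After TopTen print the default. Inside a flow, as soon as we assign the Value property, the value remains set for everything called from there, even if we call the Fire method without awaiting it.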
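The same behavior can be reduced to two methods. This is a minimal sketch with made-up names (AsyncLocalFlowDemo, Ambient, ChildAsync, RunAsync), showing that a value set in a called async method is visible further down that flow but not back in the caller.

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class AsyncLocalFlowDemo
{
    static readonly AsyncLocal<string> Ambient = new AsyncLocal<string>();

    static async Task<string> ChildAsync()
    {
        Ambient.Value = "SetByChild";
        await Task.Delay(10).ConfigureAwait(false);

        // Still the child's value after the await, whichever thread resumes.
        return Ambient.Value;
    }

    public static async Task<(string InChild, string InParent)> RunAsync()
    {
        Ambient.Value = "SetByParent";
        string inChild = await ChildAsync().ConfigureAwait(false);

        // The child's assignment did not flow back up to the caller.
        return (inChild, Ambient.Value);
    }
}
```

Awaiting AsyncLocalFlowDemo.RunAsync() returns ("SetByChild", "SetByParent"). Values flow down into the methods you call, and changes made there stay there.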

Sometimes Local just isn't local enough

We've just seen how [ThreadStaticAttribute] and ThreadLocal no longer work as expected when we combine them with asynchronous code using the async/await keywords. So if you want to build robust code that needs to access ambient data local to the current asynchronous control flow, you must use AsyncLocal and upgrade to .NET 4.6 or the Core CLR. If you can't upgrade your project to one of these platform versions, you can try to mimic AsyncLocal by using the ExecutionContext instead.

Now that you've seen AsyncLocal, let me tell you the unlearning isn't over yet! In the next installment, I'll show you how you can restructure your existing code so you won't even need ambient data anymore. Stay tuned, and don't bend too many spoons in the meantime!