Async pitfalls (Part 4 of 4)

We’re on a journey of discovering the basics of writing good asynchronous code. So far we’ve covered the basics of the TPL and discussed the async and await keywords. In this last part we take a look at some of the pitfalls that may arise as we write asynchronous code.

#1 async void is ONLY for high level handlers

This is probably the most important pitfall to understand. async void is commonly used, and it’s difficult to see why we must avoid it unless we fully understand how async and await work.

Let’s recall how async and await work.

  • A method marked with async enables the await keyword.
  • When an await is hit on an incomplete Task, the method is suspended and control returns to the caller.
  • The message loop will continue to check the status of the Task.
  • When the Task is completed, the message loop gives control back to the await statement and processing continues after the await.

Let’s think about this carefully. If our message loop checks the status of the Task, but our async method returns void, how can it know that the method has finished? The answer is: it cannot. So the caller assumes that the method is complete and continues its processing. Now we have a race condition, and this opens a can of worms.

[Image: asyncvoid_pitfall]

Here’s a little demonstration of async void gone bad. Follow the step numbers on the image and the explanation below:

  1. Here we are in the Loaded event. (It’s fine to use async void here, as it is a high level handler.) In this step we call the method GetLoggedInUserAsync(), which is an async void method (alarm bells should go off now).
  2. We make an async call to the database to get the user name. We await the result and then assign it to the variable _loggedInUser.
  3. Since we hit an await in step 2, control returns to the caller, so MyApp_Loaded gets control again. But since GetLoggedInUserAsync() did not return a Task, there’s no way for the message loop to track the Task’s completion, so it assumes the method is done and the code continues at step 3, printing “Logged in user is: “. Why is _loggedInUser blank? Because it only gets set asynchronously at step 2, and since we returned void, the Loaded handler printed before the variable had a value.
  4. Finally we get a result from the database and the Task returned by GetUserNameFromDbAsync() completes. The message loop picks up that this Task is completed, returns control to the method, and _loggedInUser is finally set. Unfortunately, far too late.

Here’s our result (the username wasn’t set in time):

[Image: asyncvoid_pitfallresult]

How would we correct this code? Very easily, we simply:

  1. Change the async void to async Task
  2. Await the task in the Loaded event.
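The fix from the two steps above can be sketched as a self-contained console version. Note that GetUserNameFromDbAsync, its delay and the returned username are stand-ins for the database call in the walkthrough:

```csharp
using System;
using System.Threading.Tasks;

class Demo
{
    static string _loggedInUser;

    // Returning Task instead of void lets the caller observe completion.
    public static async Task GetLoggedInUserAsync()
    {
        _loggedInUser = await GetUserNameFromDbAsync();
    }

    // Stands in for the database call from the walkthrough;
    // the delay and username are made up for this example.
    static async Task<string> GetUserNameFromDbAsync()
    {
        await Task.Delay(100);
        return "Johnny";
    }

    public static async Task Main()
    {
        // Awaiting the Task guarantees _loggedInUser is set before we print.
        await GetLoggedInUserAsync();
        Console.WriteLine("Logged in user is: " + _loggedInUser);
    }
}
```

Because the caller awaits a Task rather than firing an async void method, the print only runs once the field has a value.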

[Image: asyncvoid_pitfallsolution]

Here’s the result (As it was intended):

[Image: asyncvoid_pitfallresultcorrect]

Do NOT use async void; rather use async Task. The only exception is for high level event handlers (such as Loaded events, Click events etc.).

#2 Mixing await and TPL may cause deadlocks

If we’re going to start making our methods async, it’s important to make all our methods async, right up the calling tree. A common mistake is to use await and .Result interchangeably. Let’s have a look at why this is a problem.


private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var loggedInUser = GetLoggedInUserAsync().Result;
}

public async Task<string> GetLoggedInUserAsync()
{
   return await GetUserNameFromDbAsync();
}

When we run the above code in WPF, kicking off our async method from the UI thread context, we get a deadlock. At first glance it doesn’t make sense why.

The answer is that, by default, await schedules the continuation of a method on the same SynchronizationContext it started on.

Let’s walk through the code snippet to understand this better.

  • In GetLoggedInUserAsync we await a Task.
  • It was called from our UI thread context, which means the continuation with the awaited result must run on the UI thread again.
  • But in MainWindow_Loaded we used .Result, which blocks the UI thread until a result is received.
  • So MainWindow_Loaded blocks waiting for a result, whilst GetLoggedInUserAsync tries to resume its continuation on that blocked UI thread.
  • This causes a deadlock.

Solution

There are two options we can pursue:

  1. Make sure async and await are used all the way
      • Mark MainWindow_Loaded as async void (acceptable, since it is a high level event handler)
      • await the result instead of asking for .Result.
    
    private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
    {
       var loggedInUser = await GetLoggedInUserAsync();
    }
    
    public async Task<string> GetLoggedInUserAsync()
    {
       return await GetUserNameFromDbAsync();
    }
    
    
  2. The second option would be to use .ConfigureAwait(false)
    • Add ConfigureAwait(false) in all the places where you await a task
    • This overrides the default behaviour of continuing on the captured context and therefore resolves the deadlock
private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var loggedInUser = GetLoggedInUserAsync().Result;
}

public async Task<string> GetLoggedInUserAsync()
{
   return await GetUserNameFromDbAsync().ConfigureAwait(false);
}

A good rule of thumb is to use async all the way up; but when creating class libraries that handle asynchronous work, make sure you use ConfigureAwait(false) throughout, as you cannot control how consuming code will call your async methods.

#3 Parallel class, ask if sequence matters

When using the Parallel class to do iterations in parallel, always ask the question: does sequence matter? If the answer is yes, then consider a different approach, as Parallel class iterations don’t guarantee sequential processing. Also bear in mind that the iterations may not necessarily run in parallel: if the class deems that the work will complete more quickly synchronously, it will run it synchronously.
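A minimal sketch of the ordering issue: the indices below are handed to worker threads, so the printed order can differ on every run.

```csharp
using System;
using System.Threading.Tasks;

class OrderDemo
{
    public static void Main()
    {
        // The iterations run in parallel, so there is no guarantee
        // the items print in the order 0..9.
        Parallel.For(0, 10, i => Console.WriteLine("Processing item " + i));
    }
}
```

If the order of results matters, collect them and sort afterwards, or fall back to a sequential loop.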

#4 Thread affinity restrictions with certain technologies

This is an age-old pitfall, but I believe it deserves a mention. Some technologies impose thread affinity restrictions that require code to run on a specific thread. Make sure you consider these restrictions depending on the technology you are using.

For example, in WPF we cannot update the UI except from the UI thread. If we’re on a different thread, we must invoke the UI’s Dispatcher to marshal our piece of work onto the UI thread.
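As a sketch of what that looks like inside a WPF window (statusText and DoHeavyWork are hypothetical names for this example):

```csharp
// We're on a background thread here, so we must marshal the UI
// update back onto the UI thread via the window's Dispatcher.
Task.Run(() =>
{
    string result = DoHeavyWork();   // hypothetical long-running work

    // Setting statusText.Text directly from this thread would throw an
    // InvalidOperationException; Dispatcher.Invoke runs the delegate
    // on the UI thread instead.
    Dispatcher.Invoke(() => statusText.Text = result);
});
```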

#5 Unobserved Exceptions

A Task that throws an exception which is never observed will not be handled by user code. We need to observe our tasks (await, .Result or .Wait()) or tap into the TaskScheduler.UnobservedTaskException event to handle these exceptions.
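A sketch of hooking the event (note that the event only fires when the faulted, unobserved Task is garbage collected, so the timing is non-deterministic):

```csharp
using System;
using System.Threading.Tasks;

class UnobservedDemo
{
    public static void Main()
    {
        // Handler for exceptions from tasks that were never observed.
        TaskScheduler.UnobservedTaskException += (s, e) =>
        {
            Console.WriteLine("Unobserved: " + e.Exception.InnerException.Message);
            e.SetObserved();   // mark as observed so it isn't rethrown
        };

        // A faulted task that nobody awaits or inspects.
        Task.Run(() => throw new InvalidOperationException("boom"));

        System.Threading.Thread.Sleep(500);

        // The event is raised during finalization of the faulted task,
        // so it may only fire after a garbage collection.
        GC.Collect();
        GC.WaitForPendingFinalizers();
    }
}
```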

#6 Parallel isn’t always faster

When opting to make some operation asynchronous, it’s important to ask whether it is necessary to do so. When a task is quick and simple to run synchronously, we may make our application slower by forcing it onto separate threads. Remember that whenever we run operations in parallel, there’s a cost of synchronization involved.

A common area of “over-parallelization” is making nested loops parallel. In such a case, try to keep only the outer loop parallel whilst the inner loops run synchronously.
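For example, a sketch of that shape (rows, cols and ProcessCell are hypothetical):

```csharp
// Parallelise only the outer loop; the inner loop stays synchronous,
// so each worker processes a whole row without extra coordination cost.
Parallel.For(0, rows, row =>
{
    for (int col = 0; col < cols; col++)
    {
        ProcessCell(row, col);   // hypothetical per-cell work
    }
});
```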

#7 Thread safety

This is an important concept to grasp. Once we start performing tasks in parallel, we will run into scenarios where different threads share common resources. If multiple threads can access a method simultaneously without compromising the data, we say that the method is “thread-safe”.

So let’s play out a little scenario. We have an application where we sell bikes. To sell a bike, we call the SellBike() method and increment a totalSold variable to keep track of bikes sold. We have one rule: we may only sell 5 bikes in total. After that we must stop.

Here’s the synchronous version of the above scenario:


int totalSold = 0;
public void MyApp_Loaded(object sender, EventArgs e)
{
   for (int i = 0; i < 10; i++)
   {
      SellBike();
   }
   Console.WriteLine("Sold " + totalSold + " bikes");
}

private void SellBike()
{
   if (totalSold < 5)
   {
      // Sell Bike
      Thread.Sleep(1);

      totalSold += 1;
   }
}

No matter how many times we run the above application, we always get this output:

[Image: bikeresult_sync — output “Sold 5 bikes”]

Now somewhere along the way, we decide that we need to make our application parallel. We change the for loop to Parallel.For and enjoy our performance gain:


int totalSold = 0;
public void MyApp_Loaded(object sender, EventArgs e)
{
   Parallel.For(0, 10, (i) =>
   {
      SellBike();
   });
   Console.WriteLine("Sold " + totalSold + " bikes");
}

private void SellBike()
{
   if (totalSold < 5)
   {
      // Sell Bike
      Thread.Sleep(1);

      totalSold += 1;
   }
}

Only to realise that suddenly we’re selling too many bikes. Here’s the output:

[Image: bikeresult_async — output “Sold 9 bikes”]

What went wrong?

How was it possible to sell 9 bikes when our code limited us to only selling 5? Thread safety.

We’re now iterating in parallel, and the SellBike method gets hit by multiple threads. Different threads hit the if statement and ask: have I sold 5 bikes yet? If the check passes, the thread enters the code block, sells a bike and then increments the totalSold variable. The problem is that if several threads get past the if check before the earlier ones have updated the totalSold variable, it’s too late to chase them out of the if statement, since the condition has already evaluated to true.

Solution

We need to make sure that if one thread is “busy selling” a bike, we wait until it’s done. Once it’s done we allow the next thread to sell a bike. This way the integrity of our business rule to only sell 5 bikes stays intact.

We can use the lock statement. We lock on an object, and when a thread hits the lock statement, it checks whether any other thread holds a lock on that object. If so, the thread waits until the other thread releases the object. Here’s the code:


private object sync = new object();
private void SellBike()
{
   lock (sync)
   {
      if (totalSold < 5)
      {
         // Sell Bike
         Thread.Sleep(1);

         totalSold += 1;
      }
   }
}

Now our method is “Thread-safe”, since multiple threads can access it at the same time without compromising the data.

[Image: bikeresult_sync — output “Sold 5 bikes”]

Something interesting to note is that the lock keyword actually makes use of the Monitor class (System.Threading namespace) under the hood. It uses the Enter method to allow a thread to enter the code block and the Exit method to release the object, allowing another thread to lock on it and enter the code block. Here’s the decompiled C# from the IL code:

[Image: IL_monitorclass]

The Monitor class is known as a synchronization primitive: a class which aids in synchronizing threads to avoid race conditions (as we saw in the bike example). There are many synchronization primitives to choose from, each with their own benefits. I’ll briefly touch on the lightweight ones introduced in .NET 4.0.

ManualResetEventSlim

  • A popular signalling synchronization primitive
  • Use this if a Thread needs to “Wait” for another thread to signal it.
  • Once the other thread gives the signal, the thread can continue its work.
  • This is a way for threads to “talk” to each other. “Hey, let me know when you’re done”, “Ok, I’m done”…
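The “let me know when you’re done” conversation above can be sketched like this (the 200 ms of “work” is made up for the example):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class SignalDemo
{
    public static void Main()
    {
        var ready = new ManualResetEventSlim(false);

        Task.Run(() =>
        {
            Thread.Sleep(200);   // "I'm busy doing my work..."
            ready.Set();         // "Ok, I'm done"
        });

        ready.Wait();            // "Hey, let me know when you're done"
        Console.WriteLine("Signal received, continuing.");
    }
}
```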

[Image: ManualResetEventSlim]

SemaphoreSlim

  • Very similar to the Monitor class, except that it allows n threads to access a resource / code block.
  • It’s much like a bouncer at the door of a code block, only allowing n threads to enter at a time while the others wait outside.
  • When a thread exits the resource, the “bouncer” allows another thread to enter.
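A sketch of the “bouncer” in action, letting two threads in at a time (the counts and sleep are made up for the example):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class BouncerDemo
{
    // The "bouncer": at most 2 threads inside at any time.
    static readonly SemaphoreSlim Bouncer = new SemaphoreSlim(2);

    public static void Main()
    {
        Parallel.For(0, 6, i =>
        {
            Bouncer.Wait();          // wait outside until let in
            try
            {
                Console.WriteLine("Thread " + i + " inside");
                Thread.Sleep(100);   // do some work
            }
            finally
            {
                Bouncer.Release();   // let the next thread in
            }
        });
    }
}
```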

[Image: SemaphoreSlim]

CountdownEvent

  • This is a signalling synchronization primitive
  • A thread will wait until n other threads have signalled.
  • E.g. a thread will say, “I’m waiting until 3 other threads have given me the signal!”
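The “waiting for 3 signals” example can be sketched as (the 100 ms of “work” is made up):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class CountdownDemo
{
    public static void Main()
    {
        // Wait until 3 other threads have given the signal.
        var countdown = new CountdownEvent(3);

        for (int i = 0; i < 3; i++)
        {
            Task.Run(() =>
            {
                Thread.Sleep(100);   // do some work
                countdown.Signal();  // "here's my signal!"
            });
        }

        countdown.Wait();            // blocks until all 3 signals arrive
        Console.WriteLine("All 3 threads have signalled.");
    }
}
```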

[Image: CountdownEvent]

Barrier

  • This is used when you want parallel threads to “wait” at a point until they’re all done, and then continue on together.
  • A great way to synchronize without requiring the control of a parent / master thread.
  • E.g. 3 threads do work, and when a thread is done it waits, asking the other 2 “Are you done?” Once all the threads are done, they all continue together: “Ok, ready? Let’s go!”
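That “wait for everyone, then go together” step can be sketched like this (the sleeps simulate the threads finishing at different times):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class BarrierDemo
{
    public static void Main()
    {
        // 3 participants; the callback runs once all of them have arrived.
        var barrier = new Barrier(3, b =>
            Console.WriteLine("Phase " + b.CurrentPhaseNumber + " complete"));

        var workers = new Task[3];
        for (int i = 0; i < 3; i++)
        {
            int id = i;
            workers[i] = Task.Run(() =>
            {
                Thread.Sleep(50 * (id + 1)); // finish at different times
                barrier.SignalAndWait();     // "Are you done?" ... "Let's go!"
                // All three workers continue together from here.
            });
        }
        Task.WaitAll(workers);
    }
}
```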

[Image: Barrier]

Concurrent Collections

A last thing to consider when it comes to thread safety is concurrent collections. The standard collections are not thread-safe; one thread could be adding an item whilst another is removing items. We could implement specific locking before accessing or manipulating collections, but fortunately Microsoft has already done a lot of this work for us.

Here’s a list of collections which can be safely used for multi-threading:

  • ConcurrentBag<T> – a thread-safe collection of items (unordered)
  • ConcurrentDictionary<TKey, TValue> – a thread-safe dictionary
  • ConcurrentQueue<T> – a thread-safe queue
  • ConcurrentStack<T> – a thread-safe stack
  • IProducerConsumerCollection<T> – an interface to implement our own custom thread-safe collection for producer/consumer scenarios. ConcurrentBag<T>, ConcurrentQueue<T> and ConcurrentStack<T> implement this interface, and BlockingCollection<T> wraps one.
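As a quick sketch, ConcurrentBag can absorb parallel adds that would corrupt a plain List<T>:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class ConcurrentDemo
{
    public static void Main()
    {
        // A List<int> here could corrupt its internal state under
        // parallel Adds; ConcurrentBag handles them safely.
        var bag = new ConcurrentBag<int>();

        Parallel.For(0, 1000, i => bag.Add(i));

        Console.WriteLine(bag.Count);   // always 1000
    }
}
```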

Conclusion

That concludes this series exploring asynchronous coding. To summarise, we’ve touched on the basics of the TPL, the async and await keywords, and the common pitfalls of asynchronous code.

Threading and asynchronous programming is a massive topic and there’s much to learn especially when it comes to understanding the lower-level workings of Threading, thread pools, thread affinity etc. I trust that this series has given you a good basis to start using asynchronous code. Comments and suggestions are welcome.
