Intro to async coding (Part 1 of 4)

In this series, we’ll lay the foundation for writing good, maintainable asynchronous code in .NET. It is aimed at developers with little or no experience in asynchronous programming.

What is multi-tasking, and why do we need it?

Traditionally, applications were mostly written to run synchronously: only a single task could take place at a time, and any other tasks had to follow one after the next. Since then, multi-core machines have become the norm. They are capable of truly executing multiple tasks in parallel (Task A can run while Task B is busy). The result is a more responsive application, a performance gain and, well, just a better overall user experience.

Before we start playing around with what the .NET Framework offers us, let’s lay a basic foundation of “multi-tasking”.

Some important jargon

Processor / CPU: The workhorse. We write some instructions, and this is the guy who ends up doing the actual work.

Thread: Holds a sequence of instructions to be executed by the CPU. If we write a synchronous application, all our instructions are handled by a single thread.

Dispatcher: Each thread has its own dispatcher. The dispatcher may queue instructions onto its own thread, but may not touch anyone else’s thread.

Time Slicing: A CPU core can only handle one thread at a time. At any given moment our system typically has hundreds of threads that need CPU time (for the OS, services, other applications, etc.), but we don’t have hundreds of cores (not yet, at least). So the CPU does what’s called time slicing (different from multi-tasking): it does some work on one thread, then switches to do work on another thread, then switches again, and so on. This gives us the illusion of parallel processing.


“Old school” multi-threading (before TPL)

The Thread class has been around since .NET 1.1. Consider the following example:

We create a new thread and do our heavy lifting on it, freeing up the Main thread to do something else.


using System;
using System.Threading;

class Program
{
    private static void DoHeavyWork()
    {
        //...
    }

    static void Main(string[] args)
    {
        // DoHeavyWork() will run on a separate thread.
        new Thread(DoHeavyWork).Start();

        // Main thread freed up...
        Console.WriteLine("Done");
    }
}

Easy enough, so why do we need TPL?

Even with this simple piece of code, we already have a fundamental flaw. Do you spot it?

Well done if you spotted it quickly. Our thread with the heavy work kicks off, and then our Main thread continues and prints “Done”. This isn’t right: the heavy work on the other thread is still busy. What now? We could allow the thread to join up with our Main thread as follows:


using System;
using System.Threading;

class Program
{
    private static void DoHeavyWork()
    {
        //...
    }

    static void Main(string[] args)
    {
        // DoHeavyWork() will run on a separate thread.
        var thread1 = new Thread(DoHeavyWork);
        thread1.Start();

        // Main thread is free to do other work here...

        // Block until thread1 has finished its work.
        thread1.Join();
        Console.WriteLine("Done");
    }
}

Nice, that seemed to work! The thread we created will do the work and then meet up with the Main thread at the Join() call. The Main thread blocks there until thread1 is done with all its work.

Success? Well, we’ve only got one thread spawning from the main task, and we’re not dealing with results (our method has no return type). The problem is that threads don’t have return types, because they are, well… threads, not methods.

To return a value from a thread, we would probably create a shared variable, set it in our thread, and then read the value of the variable once the thread is done.
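A minimal sketch of that shared-variable approach (the field name and the value 42 are just illustrative):

```csharp
using System;
using System.Threading;

class Program
{
    // Shared field: the worker thread writes it, the Main thread reads it.
    private static int _result;

    private static void DoHeavyWork()
    {
        // ...simulate heavy work, then publish the result via the shared field.
        _result = 42;
    }

    static void Main(string[] args)
    {
        var worker = new Thread(DoHeavyWork);
        worker.Start();

        // Join() guarantees the worker has finished (and its write to
        // _result is visible) before we read the field.
        worker.Join();
        Console.WriteLine("Result: " + _result); // prints "Result: 42"
    }
}
```

This works because Join() both waits for the worker to finish and makes its writes visible to the Main thread, but the moment more threads get involved, the simplicity evaporates.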

But what if we added another thread whose work depends on the first thread’s result? Now we start introducing thread-safety and synchronization concerns: we need to signal thread2 that it can continue once thread1 has set its result, and we also have to make sure the threads access the result variable at the correct time and one at a time (avoiding race conditions), etc.
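One way to sketch that signaling, using a ManualResetEventSlim as the “thread1 is done” flag (the values and names here are purely illustrative):

```csharp
using System;
using System.Threading;

class Program
{
    private static int _result;

    // Signalled by thread1 once _result has been written.
    private static readonly ManualResetEventSlim ResultReady =
        new ManualResetEventSlim(false);

    static void Main(string[] args)
    {
        var thread1 = new Thread(() =>
        {
            _result = 21;       // produce the first result
            ResultReady.Set();  // signal thread2 that it may continue
        });

        var thread2 = new Thread(() =>
        {
            ResultReady.Wait(); // block until thread1 has set _result
            Console.WriteLine(_result * 2); // prints 42
        });

        thread1.Start();
        thread2.Start();
        thread1.Join();
        thread2.Join();
    }
}
```

Already we need a synchronization primitive, careful ordering, and Join() calls just to chain two results together, and this is still the simple case.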

Let’s stir the pot even more… What would it mean for performance and overhead if we end up with lots of parallel processing? We’d have to constantly start up and tear down threads, each time paying a cost: setting up Thread Local Storage for time slicing, creating queues, managing the thread’s lifecycle, and so on. This is where the ThreadPool comes in, giving us efficiency and control over our threads.
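A quick sketch of borrowing pooled threads instead of creating our own (the CountdownEvent is just one way, for illustration, to wait for all the work items to finish):

```csharp
using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        // Counts down once per completed work item.
        var done = new CountdownEvent(3);

        for (int i = 0; i < 3; i++)
        {
            int id = i; // capture the loop variable per work item

            // Borrow a pooled thread instead of creating and destroying our own.
            ThreadPool.QueueUserWorkItem(state =>
            {
                Console.WriteLine("Work item {0} on pooled thread {1}",
                    id, Thread.CurrentThread.ManagedThreadId);
                done.Signal();
            });
        }

        done.Wait(); // block until all three work items have finished
        Console.WriteLine("Done");
    }
}
```

Note that the three work items may run in any order, and the pool may reuse the same thread for several of them — that reuse is exactly the overhead saving.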

What’s the point I’m trying to make?

Even though there is a solution for every spanner thrown into our works, the problem is that we have exponentially added complexity and reduced readability and maintainability. Unfortunately, more often than not this leads to a decrease in quality, with bugs that are difficult to find and reproduce. Very quickly, asynchronous code has escaped the comfort zone of the majority of us developers and entered a realm reserved for experts and the brave.

The stage is set for TPL

Fortunately, our tale doesn’t end here. Let’s look at TPL in Part 2 and see if, and how, it makes our lives easier.
