Posted in .NET Development

Machine Learning – Tic Tac Toe in C#

A while back I created a Tic Tac Toe game in Python and trained an opponent (or bot) using an Artificial Neural Network (ANN). At that stage I really wanted to try out some reinforcement learning, but since it was a pet project on the side and I’m not as comfortable with Python, I just didn’t end up getting to it.

So… recently I thought, let me accept the challenge but do it in C#. It certainly was a lot easier for me to implement in .NET. I also finally got to get my hands dirty with Blazor WebAssembly, which was great. Being able to run client-side code in C# (no JavaScript) is really a revelation. There were some limitations with running the MiniMax algorithm client-side in WebAssembly, which you can check out in the Part 2 video (below).

I’m happy to have finally wrapped this up, so I hope you enjoy it as much as I’ve enjoyed making it.

Play the game here: https://tictactoe.filteredcode.com/
Code on GitHub: https://github.com/NielsFilter/TicTacToe.Net
YouTube videos: filteredCode – TicTacToe AI in C# Playlist

Part 1: Creating the game in C#

Creating the game and laying the foundations to create AI bots

Part 2: Creating a MiniMax algorithm bot

Creating a MiniMax algorithm bot
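To give an idea of what the MiniMax bot is doing under the hood, here’s a minimal, self-contained sketch (my own illustrative names and board representation, not the repo’s actual code):

```csharp
using System;
using System.Linq;

// A minimal MiniMax (negamax) sketch for Tic Tac Toe.
// Board is 9 chars: 'X', 'O' or ' ' (space = empty).
static class MiniMax
{
    // Score of the position for the player about to move: 1 win, 0 draw, -1 loss.
    public static int Score(char[] board, char player)
    {
        char opponent = player == 'X' ? 'O' : 'X';
        if (HasWon(board, opponent)) return -1; // the opponent's last move won
        if (!board.Contains(' ')) return 0;     // board full: draw

        int best = int.MinValue;
        for (int i = 0; i < 9; i++)
        {
            if (board[i] != ' ') continue;
            board[i] = player;                              // try the move
            best = Math.Max(best, -Score(board, opponent)); // negamax: their loss is our win
            board[i] = ' ';                                 // undo
        }
        return best;
    }

    // Index of the best move for the player.
    public static int BestMove(char[] board, char player)
    {
        char opponent = player == 'X' ? 'O' : 'X';
        int bestScore = int.MinValue, bestMove = -1;
        for (int i = 0; i < 9; i++)
        {
            if (board[i] != ' ') continue;
            board[i] = player;
            int score = -Score(board, opponent);
            board[i] = ' ';
            if (score > bestScore) { bestScore = score; bestMove = i; }
        }
        return bestMove;
    }

    static bool HasWon(char[] b, char p) => new[]
    {
        new[]{0,1,2}, new[]{3,4,5}, new[]{6,7,8}, // rows
        new[]{0,3,6}, new[]{1,4,7}, new[]{2,5,8}, // columns
        new[]{0,4,8}, new[]{2,4,6}                // diagonals
    }.Any(line => line.All(i => b[i] == p));
}
```

Because the full game tree is searched, this bot never loses; the Part 2 video covers the performance caveats of running it in WebAssembly.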

Part 3: Creating the Q-Learning bot

Created a Reinforcement Learning (Q-Learning) bot
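The heart of a Q-Learning bot is just a table of state/action values and one update rule. A minimal sketch (hypothetical names, not the repo’s implementation):

```csharp
using System;
using System.Collections.Generic;

// Q(s,a) <- Q(s,a) + alpha * (reward + gamma * max over a' of Q(s',a') - Q(s,a))
class QLearner
{
    const double Alpha = 0.1; // learning rate
    const double Gamma = 0.9; // discount factor

    readonly Dictionary<(string State, int Action), double> _q =
        new Dictionary<(string, int), double>();

    // Unseen state/action pairs default to 0.
    public double Get(string state, int action) =>
        _q.TryGetValue((state, action), out var v) ? v : 0.0;

    public void Update(string state, int action, double reward,
                       string nextState, IEnumerable<int> nextActions)
    {
        // Best value achievable from the next state (0 if terminal / no actions).
        double maxNext = 0.0;
        foreach (var a in nextActions)
            maxNext = Math.Max(maxNext, Get(nextState, a));

        double old = Get(state, action);
        _q[(state, action)] = old + Alpha * (reward + Gamma * maxNext - old);
    }
}
```

During training the bot plays thousands of games against itself or another bot, rewarding moves that lead to wins and penalizing losses; at play time it simply picks the action with the highest Q value for the current board state.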

More bots coming?

My goal was Reinforcement Learning and it worked out really well. Now that I’ve ticked that box I’m happy to leave it as is (for now). All this is done in my own private capacity and spare time is a luxury I don’t have too much of at the moment.

But…

If I were to pick this up again in the future, two ideas that I wanted to play around with are:

  1. Supervised learning algorithm – Linear Regression (or something similar)
    • The thought is to capture all states from other games and “featurize” the various states. For instance features like “Occupies Middle”, “2 of yours with an empty space” (naming is hard…)
    • With decent features defined, the plan would be to make use of ML.Net to train
  2. Use a Neural Network to solve
    • Not as important to me since I’ve done this in Python already
    • But I’d like to test out a package like TensorFlow.NET and see whether the feature set and level of support are good enough to do some production Neural Networks purely in .NET.

Posted in .NET Development

Azure zero downtime deployments

It’s been a really long time since my last blog post. I’ve decided I’m going to try using video instead of writing. It’s just easier to follow than long articles and screenshots.

The aim with this video:

  • Take an existing web application
  • Push it to Azure DevOps git repo
  • Deploy site to Azure App Service
  • Continuous Deployment should have no downtime
Posted in .NET Development, Tutorials

IEnumerable vs IQueryable – Part 1

What’s the difference between IQueryable and IEnumerable? This is probably the second most frequent question I’ve been asked (number 1 has to be understanding delegates).

The purpose of this article isn’t to formally define the interfaces, but rather to paint an easy-to-understand picture of how they differ. Then in Part 2 we get practical with 8 code snippet questions where we can test our understanding of the topic.

The Difference (Short answer)

  • IEnumerable – querying a collection in Memory
  • IQueryable – querying an External Data Source (most commonly a database)

What is IEnumerable?

The IEnumerable interface exposes a GetEnumerator method, which returns an IEnumerator that allows us to iterate through the collection. In plain English: an IEnumerable is a collection of items you can loop through.

Did you know, even arrays are inherently IEnumerable? See the snippet below:

int[] nums = { 1, 2, 3, 4 };
bool isEnumerable = nums is IEnumerable<int>; // True

isEnumerable in the above code is true. This kind of makes sense, since an array (just like IEnumerable) is a collection of items we can loop through.

What makes IEnumerable special?

The most special thing about IEnumerable is that we can query items using LINQ.

There are a bunch of LINQ methods (Where, Select, Take, OrderBy, First etc.) which are simply extension methods for the IEnumerable interface. As with all extension methods, just include the namespace (System.Linq) and the whole range of LINQ extensions is available to filter our collection.

Something else that’s very important to understand when using IEnumerable is Deferred Execution. In short, a LINQ Query only captures the intent. It holds off as long as it can and only does the actual filtering when any of the following happens:

  • Iterate the results (e.g. foreach)
  • Call ToList
  • Get a single result from the query (e.g. .Count() or .First())

Understanding Deferred Execution is key to using IEnumerable correctly. Not understanding this can lead to unnecessary performance issues. Check out Part 2 to see if you understand it correctly.
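A tiny demo of deferred execution in action (variable names are my own):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var nums = new List<int> { 1, 2, 3 };

// This only captures the intent "numbers greater than 1" - nothing runs yet.
IEnumerable<int> query = nums.Where(n => n > 1);

// Modify the list BEFORE enumerating...
nums.Add(4);

// ...and the new item shows up, proving the filter only ran now.
Console.WriteLine(string.Join(",", query)); // 2,3,4
```

If the filter had executed eagerly on the line where the query was declared, the output would have been “2,3”.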

What is IQueryable?

Firstly, IQueryable inherits from IEnumerable. This means it is inherently also a collection of items that you can loop through. We can also write LINQ queries against an IQueryable.

IQueryable is used when querying a data source (let’s say a database). So if we are using Entity Framework (EF), we can write a LINQ query as follows and it will actually produce a SQL query:

[Image: an EF LINQ query being translated to a SQL query]

In the above, our LINQ Query was translated to a SQL Query. When it is executed, the query will be run against our SQL database and return results to memory. Remember, the filtering does NOT happen in memory, but in the database (read this sentence again to make sure you’ve got it).

How does LINQ suddenly become SQL?

There are 2 important properties on the IQueryable interface: Expression and Provider

[Image: the IQueryable interface showing its Expression and Provider properties]

  • Expression – This is the Expression Tree built up from the LINQ Query
  • Provider – Tells us how to translate the Expression Tree into something else

In our case (using EF with a SQL database) what happened:

  • We created a simple LINQ query
  • This built up an Expression Tree
  • The Expression Tree gets passed to the Provider
  • Our provider translates the Expression Tree into a SQL Query
  • As soon as we use our results (deferred execution), the SQL Query will execute against a database.
  • The results are returned and stored into memory.
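We can peek at the Expression and Provider without a database: AsQueryable wraps any in-memory collection in an IQueryable (an EnumerableQuery), whose provider simply compiles the expression tree and runs it in memory instead of translating it to SQL:

```csharp
using System;
using System.Linq;

int[] nums = { 1, 2, 3, 4 };

IQueryable<int> query = nums.AsQueryable().Where(n => n > 2);

// The captured intent, as an expression tree.
Console.WriteLine(query.Expression);

// The provider that decides what to do with that tree
// (EF's provider would translate it to SQL; this one runs it in memory).
Console.WriteLine(query.Provider.GetType().Name);

// Only now, on enumeration, does anything actually execute.
Console.WriteLine(string.Join(",", query)); // 3,4
```

Swap the in-memory provider for EF’s and the exact same LINQ query becomes SQL; that separation of Expression and Provider is the whole trick.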

Great uses for IQueryable

Think about the way IQueryable works for a moment. Let’s say we have a custom data source, like a file which appends data with some separators we defined. If we find ourselves constantly reading these files and sifting through the text to get hold of data, we could instead use IQueryable and create our own Query Provider. This would allow us to write LINQ queries to get data from our files.

Another popular place IQueryable is used is for ASP.NET WebAPI OData. We can expose our REST endpoints and allow the person using our Web Service to filter only the data they need without pulling all data down to the client first. OData is basically just a standard that allows us to use URLs to filter specific data.

Example: Let’s say our REST service returns a list of 100 000 People: (http://mysite.com/People). But in our app we only want the people whose surnames contain the search text “Filter”.

Without the power of IQueryable and OData, we would either have to:

  • Pull all 100 000 people down to our client and then locally in memory filter for those 10 people with surname “Filter” that we actually need.
  • Or create an endpoint specifically for searching for people by surname, passing a query string parameter “Filter”.

Neither of these are great. But using Web API with OData, we could create a controller that returns IQueryable<Person> and then allow our app to:

  • Send a custom URL: http://mysite.com/People?$filter=contains(Surname,'Filter')
  • On the server, the IQueryable Expression Tree is built up from the OData URL
  • The Provider translates the Expression Tree to a SQL Query
  • The SQL executes against the database only getting 10 items from it
  • These 10 items are returned to the app in the client’s requested format (e.g. JSON)

So with the power of IQueryable and OData, we indirectly queried the database via a URL, without having to write extra server code and without pulling data we did not need (less bandwidth, less server processing and a minimal client memory footprint).

Side note: LINQ Query Syntax vs Extension Methods

Not directly related to the topic, but a question I’ve been asked several times as well: is it better to use Query Syntax or Extension Methods?

Query Syntax:

var result = from n in nums
             where n > 2
             select n;

Extension Methods:

var result = nums.Where(n => n > 2);

They both compile down to the Extension Methods in the end. The Query Syntax is simply a language feature added to simplify complex queries. Use whichever is more readable and maintainable for you.

I prefer to use Extension Methods for short queries and Query Syntax for complex queries.

Conclusion

If you missed everything, just remember this:

  • IEnumerable – queries a collection in Memory
  • IQueryable – queries an External Data Source (most commonly a database)

If you are comfortable with these interfaces, go to Part 2 and test yourself with 8 quick code snippet questions on the topic.

Posted in .NET Development, Architecture

Should my code be “Technical” or “Domain” focused

  > How do we structure our solution?
  > What do we name our files?
  > How do we organize the folders in the project?
  > How do we structure our code regions?

It’s probably safe to say we’ve all sat with these questions and still do every time we expand our projects or create new ones.

The big question that this article addresses is whether we should organize our code based on the “Domain” or rather on “Technical” implementations.

Let’s quickly define both and see which is better.

Technical Focus

This approach organizes code with a technical or functional focus. This is a more traditional way of organizing an application, but still very much in use today. Let’s see how this would look practically:

Code Regions

Regions are defined according to the functional implementation. If it’s a method and it’s public it goes to Public Methods, regardless of what the method does.

[Image: code regions grouped by technical function]
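In code, a technically-focused class might look something like this (a hypothetical hospital example of my own):

```csharp
using System.Collections.Generic;

// Members are grouped by what they ARE, regardless of what they do.
public class PatientService
{
    #region Fields

    private readonly List<string> _admittedPatients = new List<string>();

    #endregion

    #region Public Properties

    public int AdmittedCount => _admittedPatients.Count;

    #endregion

    #region Public Methods

    public void AdmitPatient(string name) => _admittedPatients.Add(name);
    public void DischargePatient(string name) => _admittedPatients.Remove(name);

    #endregion
}
```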

Project Layout

For example, when creating an MVC application, File -> New Project lays out the folders with a technical focus. If you create a View, regardless of what it does, it goes into the Views folder.

[Image: MVC project folders organized by technical function]

Solution Architecture

The traditional layered architecture is a very common practice. This approach organizes projects according to their function. If I have a Business Logic or Service class, it will go into the Domain project, regardless of what it does.

[Image: traditional layered solution architecture]

In short it’s a “What it IS” approach

You’ll see that in each of the above cases, we’ve organised according to what something IS and not what it DOES. So if we’re developing a Hospital application, a Restaurant Management system, or even a Live Sport Scoring dashboard, the structure for these vastly different domains will look almost identical.

Domain Focus

This approach organizes code with a domain or business focus. The focus on the domain has definitely been popularized by designs such as DDD, TDD, SOA, Microservices etc. Let’s see how this would look practically:

Code Regions

Regions are defined according to the domain implementation. Anything related to “Admitting a patient to the Hospital” will go in the Admit Patient region, regardless of what it is.

[Image: code regions grouped by domain]
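In code, domain-focused regions on a hypothetical hospital class (my own example) might look like this:

```csharp
using System.Collections.Generic;

// Members are grouped by what they DO, regardless of what they are.
public class HospitalWard
{
    private readonly List<string> _patients = new List<string>();

    #region Admit Patient

    public void AdmitPatient(string name) => _patients.Add(name);
    public bool IsAdmitted(string name) => _patients.Contains(name);

    #endregion

    #region Discharge Patient

    public void DischargePatient(string name) => _patients.Remove(name);

    #endregion
}
```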

Project Layout

Taking the MVC example mentioned earlier for Project Layout, we would now see folders according to the specific domain. If we create something that is related to “customer feedback”, it would go in the CustomerFeedback folder, regardless of what it is (view, controller, script etc.)

[Image: MVC project folders organized by domain]

Solution Architecture

Architecture would be based around a type of SOA or Microservices approach, where each domain would exist independently in its own project. If we have a new domain in a live sport scoring app, such as “Cricket”, we would create a new project for Cricket and everything related to it will go in there, regardless of what it is.

[Image: solution architecture organized by domain]

In short it’s a “What it DOES” approach

You’ll see that in each of the above cases, we’ve organised according to what something DOES and not what it IS. So once again, if we’re developing a Hospital application, a Restaurant Management system and a Live Sport Scoring dashboard, the structure for these vastly different domains will look completely different.

So which is best?

Firstly let’s just put it out there that there’s a “3rd approach” as well, a hybrid between the two. For example, we could have a Properties region (which is technical), and then an Admit Patient region (which is domain) for all domain-related methods.

So which is best? Well let’s see…

Why Technical is better than Domain

1. Every project’s layout and all page regions are identical.

We as developers are often very technically oriented, so this would feel right at home as we can feel in control even if we’re clueless about the domain.

2. Fewer pieces

Since there are only so many technicalities within a project, once we’ve grouped by them, the number of regions, folders or projects will never grow.

3. Layer specific skills or roles

If the development team’s roles in a project are technical-specific, this approach is great. Each developer has their specific folder or project which they work on and maintain. For example you have one developer only creating views, another only doing domain specific validations, another only focusing on data access etc.

Why Domain is better than Technical

1. We’re solving business problems

As technical as we developers can be, at the end of the day, if we’re not solving domain-specific problems, we’re failing as software developers. Since business is our core and the technical only the tool to get us there, organizing code, folders and projects by domain makes much more sense.

2. Scales better

When the application expands or the scope widens, it often means that the new implementations don’t affect or bloat existing code as each domain is “isolated” from the next  (closer adherence to the Single Responsibility and Open/closed principles).

3. Everything is together

Often developers are responsible for all, or at least most, layers of the technical implementation. If we for instance had to expand our Live Sport Scoring web dashboard to include tennis, we would very easily end up working with data access code, business rules and validations, view models, views, scripts, styles, controllers etc., and these are just for a typical web application implementation. We could easily have a few more.

The point is, we often work with all of these while solving a single domain problem. So if we for example have a tennis folder and our tennis specific scripts, styles, views, controllers etc. were together, that would already be much more productive.

4. Reusable

This only really affects architecture, but if a project is built and isolated by domain, it becomes reusable by different applications on its own. In an enterprise environment, this is really useful.

For example, if a large corporate business has internal procurement rules or procedures, but many different systems for its departments (the cafeteria, HR, finances etc.), then an SOA-type approach would enable you to have one project which handles all the procurement procedures. All the different flavours of applications can go through this procurement service, ensuring that the correct, consistent procedures are used for every procurement in every department.

Conclusion

So I haven’t yet said which one is best. For me personally, my bias definitely lies more with organizing projects around the domain.

Once again, this is no silver bullet answer or solution, but remember that there is most definitely a wrong approach for a specific project or problem. Here are some questions that we should ask, testing our approach to existing systems:

  • Are there any areas where we suffer under lack of productivity?
  • If so, would the different approach be better?
  • If so, would changing the approach be too great an adjustment for the benefits it would provide?

But the ultimate questions really are:

  • Are the business needs currently being met?
  • And are the developers happy and in consensus with the approach?

As the good old saying goes: “Don’t fix something that’s not broken”.

I’d love to hear thoughts from your experience with either approach and any opinions, shortfalls or benefits you’ve experienced.

Posted in .NET Development, News

Exciting times for .NET developers

It’s definitely a good time to be a .NET developer. Microsoft has been around for a very long time and has often been labelled (rightfully, I suppose) as “slow” and “closed” in their approach, isolating their products and services solely to users on their platform. But this has changed drastically in recent years. There are many reasons to be excited.

They’ve gone Agile

Don’t believe me? See this interesting article from Steve Denning on forbes.com. A company of 128 000 employees not only adopting the agile approach but doing so very successfully is no small feat.

Much of their recent development is completely open-source on GitHub. Now anyone can see their progress, use or test pre-releases, provide feedback or even modify code on their behalf and commit it for review and approval. The earlier you get feedback on a product, the more solid the foundation and the sooner you end up with a stable release.

.NET Core is Cross-platform

Yip, you can now host your ASP.NET Core 1.0 web site on anything from a Mac, Linux or even a Raspberry Pi. How is this possible? .NET Core has been built to be completely modular, and the .NET assemblies can be deployed as NuGet packages without having to “install” the framework first. As for the runtime, .NET Core has what’s called the DNX, which hosts the application on any of the mentioned platforms and includes the CoreCLR, so we don’t lose the managed goodies like garbage collection.


Here are some other ways in which doors have opened for developers from vastly different technology backgrounds:

  • Visual Studio Code is a free, lightweight code editor running on Windows, OS X or Linux
  • There is built-in tooling in VS for building cross-platform hybrid Cordova mobile apps (TACO); no more command-line compiling as in the past.
  • Native Windows Mobile or Store apps (UWP) can also be written with HTML and JavaScript (this enables pretty much every web developer to create Native Windows apps without a steep learning curve of XAML and C#)
  • Visual Studio has first-class support for GitHub source control directly from VS
  • Azure has support for pretty much any popular platform, development technology, source control etc.
  • VS also has built-in support for popular task runners such as Gulp or Grunt and package managers such as Bower and npm
  • If you prefer creating sites with NodeJS, VS even has tooling for that
  • Even though this has been around for quite some time, if you have a different language background such as Python or Ruby, you can create Desktop or Web projects from VS with these. For example, it blew me away that you can create a WPF application having a XAML front-end with Python code-behind. (This makes use of .NET’s DLR, which bridges the gap allowing dynamically typed languages such as Python to run on the .NET framework.)

The point to take from this is that the focus of Microsoft is no longer an attempt at a form of monopoly, but creating platforms and tools that invite different developers to freely use their products, tools and frameworks (and I assume the goal is to ultimately get them to use Azure).

They went big with Azure

Microsoft is really putting a lot more emphasis on their cloud platform, Azure.

It’s also a great platform to allow a local network to move to the cloud using their “Infrastructure as a Service” (IaaS) or even “Platform as a Service” (PaaS) offerings. This obviously saves cost and time spent on hardware and software maintenance, updates, hotfixes etc.

The consumption payment model “pay for what you use” is really attractive especially for start-ups and allows easy and flexible scaling. I’ve got a couple of tiny prototype applications running on Azure at the moment and so far, everything’s still free because of the low traffic.

Starting fresh

Haven’t we all had those projects where our great designs or approaches seem to get in the way years down the line as things change?

This is interesting, because if there’s any company with years of “backward” compatibility caked into its software (things it would rather have done differently as times changed and the way its APIs get used changed), it’s Microsoft. Backward compatibility means stability, but it also often means a lack of performance and scalability over time (especially if you’re still supporting legacy APIs from a decade ago).

Someone at Microsoft was bold enough to make the call for some rewrites. Off the top of my head, these are things they’ve recently rewritten completely from the ground up:

  • The C# Compiler (Roslyn)
  • .NET Core
  • ASP.NET Core 1.0
  • Entity Framework Core 1.0

These are only the ones I know about, and they’re not small either. Besides Roslyn, nothing is directly “backward compatible”, but rather “conceptually” compatible, transferring existing concepts to the new frameworks rather than simply porting code as is.

In case you were wondering, ASP.NET Core 1.0 was initially called ASP.NET vNext and then became ASP.NET 5 with MVC 6, which ran on .NET Core 5 using EF 7. Now that’s a mouthful, so last week they announced it’s been renamed to Core 1.0 (it makes sense for a rewrite to start again at 1.0). So at least for now, it’s referred to as:

  • ASP.NET Core 1.0
  • .NET Core 1.0
  • Entity Framework Core 1.0

Performance matters

It’s no longer fair to label Microsoft products as slow. There are a lot of smart people who have put much effort into reducing memory footprints and optimizing performance. To name a few performance benefits I’ve picked up on recently as a developer:

  • If you’re running .NET Native (such as UWP apps) you get the performance of C++ and the productivity of managed C#
  • The RyuJIT compiler [link to other article] means your app will just be a bit faster without doing anything, especially the start-up times.
  • And here’s my favourite: ASP.NET Core 1.0 benchmarks compared to the NodeJS web stack (built on Google’s V8 engine).
    • On a Linux server ASP.NET Core is 2.3x faster
    • On a Windows server, it’s more than 8x faster with 1.18 million requests per second!

[Image: ASP.NET Core 1.0 vs NodeJS benchmark chart]

Want to see some code?

I’ve been exploring and keeping an eye on ASP.NET Core 1.0 as it goes through the pre-release phases. I’ve personally found it to be quite a big change from ASP.NET 4.6 and hope to be sharing a few nuggets soon on some great features I’ve enjoyed when I get the time.

Posted in .NET Development, Tutorials

Property Explosion with Roslyn API

Conclusion

Yip, we’ll start at the end. This article aims to show the powers of Roslyn and hopefully inspire some great ideas to help grow the already massive ecosystem of free tooling for Visual Studio (VS).

The final product is this life changing VS Refactoring tool, called Property Explosion

[Image: Property Explosion in the Extensions and Updates dialog]

Go give it a try: in Visual Studio go to Tools -> Extensions and Updates -> search for Property Explosion in the Online section -> Download and Install -> (restart VS when done).

Now open up any C# project, click on a property (either Auto-property or Full-property) and Hit Ctrl + . and you’ll get a suggestion to either Explode the property (make it full) or Crunch it (convert to Auto property).

[Image: Crunch Property code action in Visual Studio]

Life Changing Stuff!  If you’re interested in how we got here, read on.

This article will discuss installation, give a high-level overview of the code, and touch on VSIX deployment. My source code is available on GitHub (see links at the bottom).

Now the Beginning

Now we’ve seen the future and taken away the dramatic climax (like watching The Village for the second time).

So, what is Roslyn? As quickly mentioned in a previous article, Roslyn is an open-source implementation of the C# / VB.NET compilers. But one of its key features is the API it exposes, allowing us to create analysis and refactoring tools. We’ll be using the Roslyn API to build a VS refactoring tool.

Installation

You’ll need Visual Studio 2015 and the Roslyn SDK (If you don’t have the SDK installed, you can install it directly through VS, which is pretty neat).

  • Open up Visual Studio 2015
  • File -> New Project
  • Under Visual C# go to Extensibility.

If you have installed Roslyn SDK:

  • Click on Code Refactoring (VSIX) Project and OK 

If not:

  • Choose Install Visual Studio Extensibility Tools directly out of Visual Studio and hit OK

[Image: Install Visual Studio Extensibility Tools option in the New Project dialog]

  • Then choose Download the .NET Compiler Platform SDK, hit OK and download it

[Image: Download the .NET Compiler Platform SDK option]

  • Now you can choose Code Refactoring (VSIX) Project and OK 

Getting started

Now that you’ve created a Code Refactoring project, you’ll see there’s some default plumbing set up.

There are 2 projects created.

  1. Portable class library, which holds the refactoring entry point and code logic.
  2. VSIX project, which only holds a vsixmanifest file and references the class library.

Make sure the VSIX project is set as the startup project. A .VSIX file is basically a Visual Studio extension installation file. This is what gets deployed to the Visual Studio Gallery, enabling users to download the tool.

You should now be able to hit F5 and debug the project as is. This will open up another instance of VS 2015 with the VSIX installed and the debugger attached to the new VS instance.

  • Open up a project (or create a new Console App).
  • Now click on the class and hit Ctrl + .
  • You’ll see a suggestion pop up to reverse the class name e.g. Program becomes margorP.

To debug and step through the code and see what’s happening is really easy:

  • Put a breakpoint in the ComputeRefactoringsAsync method in the CodeRefactoringProvider class
  • Now in the new VS instance, click on a class name again and hit Ctrl + . and the breakpoint will be hit.

How does code refactoring work?

  1. User hits Ctrl + . and VS will look for installed VSIX tools
  2. The Roslyn API will call the Entry Point in custom code (CodeRefactoringProvider)
  3. Custom code makes decisions on which Code Actions to register.
  4. User sees a little Context Menu pop up with Code Action(s) available
  5. User clicks on the Code Action, and code is refactored.

Property Explosion Code

It’s a bit out of scope to go through each piece of the Property Explosion code. We will however take a high level look at what was done and why.

Here’s the PropertyExplosion Repository on GitHub. Click Download to Zip, unzip and open up the solution

So now in the context of the Property Explosion code, a user hits Ctrl + . and the Entry Point in our custom code gets hit.

Registering Code Action(s)

The first thing we do at the entry point is check whether we’re dealing with a property and, if so, whether it’s an Auto Property or a Full Property. If we have an Auto Property, we register the Explode Code Action, which can expand the property. Similarly, if we have a Full Property, we register the Crunch Code Action, which collapses it back to an Auto Property.

Now we need to rewrite our code

We could simply do some code refactoring directly in the code action method, but it seems the preferred scalable approach is to use a Syntax Rewriter. In our project, we’ve got the PropertyCollapser and PropertyExploder rewriters. Syntax rewriters work on the Visitor Design Pattern. It’s a tedious one to get used to if you’ve never worked with it before, but in the Syntax Rewriter, you can see how and why it’s very useful.

What that means in our case is that we call Visit on a node, and then each element in the Syntax Tree will be traversed and “visited”. If the current element being visited is one that we care about (for example the property in question), then we can override the Rewriter’s Visit method for the specific node type and manipulate the Syntax Node.


public override SyntaxNode VisitPropertyDeclaration(PropertyDeclarationSyntax propertyDeclaration)
{
  if (propertyDeclaration == this._fullProperty)
  {
    //... Create a new property ...
    return newProperty;
  }

  return base.VisitPropertyDeclaration(propertyDeclaration);
}

See the code above, where we override the VisitPropertyDeclaration method. This method will be hit for every property that is traversed in the Syntax Tree. Since we only care about one specific property in question, we do a check. Is this the property we’re busy refactoring? If Yes, then we go on to build a new property and return it.

That’s easy enough, replacing an existing node. But how do we add a new node with the Visitor Pattern? Let’s take adding a new field. In our Visit method we check whether we’re busy with the property’s parent; if so, we create a new field, insert it and return the “updated” parent. Easy as that.

Use the existing code as reference, put some breakpoints down and see how the Visit methods get called and used to refactor Nodes.

Now we simply return the new root (which is the one modified by the Rewriters) to the Document and our code is successfully refactored.

Deployment

Once you’re happy with the refactoring tool, it’s time to deploy. Deployment is really easy.

  1. Check that you’re happy with the config in the vsixmanifest file
  2. Change VS build type to Release, and build the solution.
  3. Go to the bin\release folder and you’ll find a .VSIX file.
  4. Login or Register at Visual Studio Gallery
  5. Click Upload and Upload the .VSIX file
  6. Once Uploaded, make sure to click Publish and Your tool is available for download immediately from VS Gallery or directly from VS (Tools -> Extensions and Updates).

Things to consider

1) “Rewriting” code is more complicated than a simple statement

We need to think like a compiler rather than a developer. It’s crucial to understand that we work with these 3 things to build up code: Syntax Nodes, Syntax Tokens and Trivia.

Let’s take this statement for example:



string myString = "Hello World";


That’s as easy as it gets. We see a simple one-liner. The compiler sees Nodes, Tokens and Trivia:

[Image: Roslyn Syntax Visualizer showing the Nodes, Tokens and Trivia of the statement]

It’s very demoralizing realising the complexity of a simple statement. But it needn’t be. If you’ve installed the .NET Compiler Platform SDK, go to View –> Other Windows –> Roslyn Syntax Visualizer. This Window is a life saver as it demonstrates your current code’s Syntax Tree. DON’T BE A HERO, USE THE VISUALIZER!!!

If you’re wondering how something should be expressed, code it, click on it and the Roslyn Syntax Visualizer will show you a granular break down of its Nodes, Tokens and Trivia.

Nodes

  • These are the main building block of the syntax tree (the blue ones in the above image).
  • Statements, Expressions, Clauses, Declarations etc.

Tokens 

  • These are almost like little extras. They cannot be a parent to other Nodes or Tokens (the green ones)
  • Keywords, Literals, Semi-colons etc.

Trivia

  • These are there for formatting purposes and don’t affect code directly (the white/grey ones)
  • Comments, Whitespaces, Regions etc.

2) How to find a Member

There are two ways of finding something: via the Syntax Tree or the Semantic Model.

Syntax Tree

  • Not much extra info on a Node (e.g. can’t determine if and where a member is referenced)
  • Optimized for performance.

Semantic Model

  • Rich with extra compile time info (e.g. references to a member)
  • Much slower than the Syntax Tree because it often triggers a compilation of code

Always try to use the Syntax Tree first and only fall back to the Semantic Model when you have to. The Semantic Model is very powerful and is an important feature of the Roslyn API.

3) Treat the vsixmanifest file with care

I managed to mess up my vsixmanifest file. I was able to build the project, deploy it, download & install it, but nothing happened. I thought the fault was in the code, so I tried debugging, which stopped working as well. No logs, no error messages, nothing worked any more.

You’ll probably want to configure the compatibility of VS and .NET Framework versions that your tool can run on. So much like any config file, take care when making changes.

If you’ve messed it up, create a new Refactoring Code Project and use the fresh vsixmanifest file as a reference to fix the existing one.

That’s it…

That’s it from a high level overview for building a code refactoring tool using the Roslyn API. I hope it provided you with an idea to get started. Please feel free to download my code, step through it, use it, improve it, abuse it or sell it for millions.

Some links…

Posted in .NET Development, News

A world of change…

The world of technology and development looks vastly different now than it did 3 or 4 years ago. With all this change it’s bound to happen that new lingo is tossed around, moments before another new best thing hits the development world. If we don’t embrace the change and open our minds up to learn, we quickly feel like a fish out of water and are left behind.

About 2 years ago I was considering gaining some skills outside of .NET, especially as the market for open-source and cross-platform development was growing and making some noise. However, I’m delighted to see what Microsoft has been busy with lately; it seems they’re embracing the market change as well and steering themselves in that direction. Let me touch on some of the changes to frameworks, compilers, application models and the IDE and see if we can make sense of them.

A disclaimer: this is written from a .NET perspective, not objectively about development as a whole.

.Net frameworks

After .Net 4.5 there are a couple of new frameworks that have made an appearance. Why so many and what are they?

  • .NET 4.5.1
  • .NET 4.5.2
  • .NET 4.6
  • .NET Core
  • .NET Native
  • .NET 2015

.NET 4.5.1

The biggest reason for this release is Windows 8.1. Both Windows 8.1 Store Apps and Windows 8.1 Phone Store Apps need .NET 4.5.1.

Some other smaller enhancements & features:

  • JIT improvements (Specifically for Multi-core machines)
  • 64-bit edit and continue (without stopping app)

.NET 4.5.2

There are 2 noticeable changes in .NET 4.5.2, one for ASP.NET and the other (yes, believe it) for Windows Forms.

ASP.NET: The bigger change in ASP.NET is probably HostingEnvironment.QueueBackgroundWorkItem. In the past, if you suggested to “fire a task on a separate thread and forget about it”, serious red flags were raised. This was because IIS needs to recycle your application regularly and if it happened to do so while your task was busy, the work would never complete. HostingEnvironment.QueueBackgroundWorkItem allows you to “fire and forget”, return a response to the user and the task can safely continue (only up to a max of 90 seconds).
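
As a rough sketch of the usage (full .NET Framework only; the controller and SendWelcomeEmailAsync are hypothetical stand-ins for your own long-running work):

```csharp
using System.Threading;
using System.Threading.Tasks;
using System.Web.Hosting; // System.Web assembly, full .NET Framework only

public class AccountController
{
    public void Register()
    {
        // Fire and forget: ASP.NET will try to keep the app domain alive
        // (up to ~90 seconds) until the queued work item completes.
        HostingEnvironment.QueueBackgroundWorkItem(ct => SendWelcomeEmailAsync(ct));
        // ...return a response to the user immediately
    }

    // Hypothetical stand-in for the real background work
    private Task SendWelcomeEmailAsync(CancellationToken ct) => Task.Delay(5000, ct);
}
```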

Windows Forms: As devices support higher and higher resolutions, scaling in WinForms became a problem. Things such as the little drop-down list arrow became absolutely tiny. .NET 4.5.2 has introduced a feature to allow resizing for high resolutions, solving this problem.

.NET 4.6

This is the next full version of the .NET Framework. There are a whole bunch of new features and improvements. Some of the many new features include:

  • Better event tracing
  • Base Class Library (BCL) changes.
  • New cryptography APIs
  • Plenty of ASP.NET enhancements

But probably the most notable feature for me is the new JIT compiler, RyuJIT. This is a 64-bit JIT compiler optimized for 64-bit computing. The great thing is you’ll get better performance without actually doing anything (on 64-bit machines).

.NET Core

This guy has made some headlines. Imagine a .NET Framework that could be deployed via NuGet. No need for specific framework prerequisites to be installed, but a framework that ships with the application. Imagine no more, this is what .NET Core has brought to the table. .NET Core is also modular, which means that you don’t need all parts of the framework, only those that you care about.

The biggest feature though, in my opinion, is that .NET is no longer limited to Windows. It is a cross-platform implementation of the .NET Framework. Yip, we can now deploy .NET applications on Linux or Mac. It’s important to note that .NET Core does not yet have everything that the full .NET Framework has.

.NET Native

I’m sure you’ve heard hard-core C++ junkies say “If you wrote this in C++ it would be much faster”. Why don’t we all switch to C++ for a little performance gain? That’s easy: productivity almost always trumps performance. Does it really matter to the client that it took 350ms to execute instead of 150ms? What matters is that it took only 10 minutes to develop instead of 30! Not just that, but we can also rest assured that our memory is safely managed by the CLR’s Garbage Collector.

Well .NET Native is an interesting twist to this age old tale. It allows you to compile code directly into native (machine code) instead of IL code (which only gets converted to native code at runtime by the JIT compiler). This way it avoids needing to run on the full CLR as the usual .NET applications do, but still includes a refactored runtime for Garbage Collection.

Can we still step through our code and edit and continue? Fortunately yes we can. When “Debugging” the code actually runs off the CoreCLR (Part of .NET Core) and is not natively compiled. This also prevents extended compilation times each time we debug.

Some benefits using .NET Native:

  • Faster startup times (JIT doesn’t need to convert to native code at runtime)
  • Smaller memory footprints (optimizations made to chuck out what we won’t need at runtime)
  • C++ compilation with C# productivity.

Of course this coin also has 2 sides. Limitations:

  • Must compile into specific architecture (since JIT used to handle this, we must now make both x86 and x64 builds)
  • Limited (currently) to Windows Store development

.NET 2015

.NET 2015 is an umbrella term for these new .NET “components”:

dotnet2015

Compilers

What’s new on the compiler forefront and why should we care? Whether you’re indifferent to understanding the different compilers and their benefits or care deeply about the matter, knowing what’s new and what that means for development is important. So what is new?

  • Roslyn
  • RyuJIT
  • .NET Native Compilation

Before I jump into these, we very briefly need to highlight how a .NET application compiles and executes.

  1. We write some C#
  2. Compile. This runs our code through a compiler (the C# compiler in our case – csc.exe) which outputs IL (Intermediate Language) code.
  3. Running our application, the JIT compiler (Part of CLR) converts IL code to native code as needed
  4. Native code is executed and cached (in memory)

Compilation

This image was taken from this blog, which does a fantastic job at explaining the basics of the JIT compiler.

Roslyn

Roslyn is a rewrite (from ground up) of the C# and VB.NET compilers. It’s an open source solution, allowing us as developers to not only view how items get compiled, but also get our hands dirty in customizing compilation (if needed). The compilers have been written in their own languages (C# compiler code is C# and VB.NET compiler code is VB.NET).

Probably the most important feature that Roslyn brings to the table is a set of APIs that allow us to create some interesting things, such as implementing customized IntelliSense or refactorings. The API allows us to do static analysis, which means we can analyse code without actually having to execute it. Also, Roslyn can compile code “on-the-fly”. This means that you don’t need to recompile code before running it again, as the Roslyn compiler will do this for you in memory.

A silly refactoring example might be, in Visual Studio, clicking on a global variable, hitting Ctrl + . and then choosing our custom “Convert to Property”. Our code written with the Roslyn API will then grab the variable, analyse it, perhaps adapt it to some naming standard and convert it to a property. We can now easily build and deploy our refactoring tool to NuGet, allowing others to easily download and use our tool.

Although Roslyn is probably not something most developers will get their hands dirty with, it certainly will open a flood-gate for Visual Studio productivity tools and extensions.

RyuJIT

Although JIT compilers are nothing new, the next-generation 64-bit JIT compiler for .NET has been released and dubbed RyuJIT. Its performance is a lot better compared to the previous 64-bit JIT compiler. The heaviest workload of a JIT compiler is at startup as it starts converting IL code to native code and caches it in memory. RyuJIT now starts up to 30% faster.

.NET Native Compilation

We’ve already touched on .NET Native and what makes it so valuable. The .NET Native compiler compiles all our code (including .NET Framework and 3rd-party code) directly into machine (native) code. We’ve discussed the advantages of this earlier. This is just to mention the new compilation chain .NET Native uses to magically convert code to machine code.

Application Models

So we’ve looked at the new Frameworks and touched on new compilers. There’s one more area that also boasts change. The application model. The one making the most noise IMO is ASP.NET 5, but UWP for Windows 10 is still worthy of some attention:

  • Universal Windows Applications (UWP)
  • ASP.NET 5.

Universal Windows Applications (UWP)

UWP for Windows 10 has recently been released. What is UWP? Basically it allows us to develop a single application which can be deployed to a whole range of Windows devices: Desktop, Mobile, Xbox, Surface Hub etc.

The great thing is UWP supports quite a variety of languages: C++, C#, VB, JavaScript, HTML, XAML. So, whether you’re from a web background (and are familiar with HTML and JavaScript) or a WPF background (XAML and C#), you’ll be able to comfortably develop the apps to your strengths.

The biggest feature for me is that UWP is now optimized by the .NET Native runtime (I won’t go through the benefits of this again. See the .NET Framework section of why this is cool).

ASP.NET 5

ASP.NET 5 (previously referred to as ASP.NET vNext) has been released and boasts some great and interesting features and changes. First off, a lot has changed and to touch on all changes is out of the scope of this article. I’ll mention some things that stood out for me personally.

  • First and top of my list, ASP.NET 5 can run both on the full .NET Framework and on .NET Core. Running on .NET Core means it’s now possible to host our sites on OSX or Linux.
  • ASP.NET no longer supports Web Forms, only ASP.NET MVC
  • No more VB yet? Only C# is supported at the moment.
  • Some great TagHelpers which are closer to pure HTML to be used instead of the usual Razor HtmlHelpers.
  • Support for some popular client-side libraries such as GruntJS, NPM and Bower.
  • Built-in support for Dependency Injection

Plenty more info and goodies to be found on the ASP.NET 5 site

IDE

Tying together all the new features, we have a new IDE (Visual Studio 2015).

Visual Studio 2015

Visual Studio 2015 RTM has been out since mid-late July (I think) and using it for a couple of weeks I’ve noticed some cool features:

  • We can now debug lambda expressions. This is VERY cool. (Quick watch a collection, run a Linq Query and get results immediately)
  • Visual Studio has built in support for Cordova. Previously we’ve needed to compile from the command line.
  • When running from one breakpoint to another, the elapsed time shows. (No more manual timer code to check how long a method took)
  • Compiler support for C# 6 (and VB 14 of course).
  • VS Premium and Ultimate merged into Enterprise. So if you previously had a Premium account, you’ll now get “upgraded” to Enterprise. This finally allows the use of the CodeLens feature (already around in VS 2013)
  • Includes a built in Android Emulator which can be used to debug Xamarin / Cordova apps.

Conclusion

Seeing the effort and improvements that Microsoft has put into these new products, and their shift towards a more open-source, cross-platform ecosystem, I believe there are exciting times ahead. At least for the next while, in my opinion, it’s looking both promising and safe to be a .NET developer!

Posted in .NET Development, Asynchronous Programming, Tutorials

Async pitfalls (Part 4 of 4)

We’re busy with a journey of discovering the basics of writing good asynchronous code. Here’s a quick road map:

We’ve covered the basics of TPL and discussed the async and await keywords. In this last part we take a look at some of the pitfalls that may arise as we write asynchronous code.

#1 async void is ONLY for high level handlers

This is probably the most important pitfall to understand. async void is commonly used, and it’s difficult to see why we must avoid it unless we fully understand how async and await work.

Let’s recall how async and await work.

  • A method marked with async enables the await keyword.
  • When await is hit, the awaited operation runs asynchronously and control is returned to the caller.
  • The message loop will continue to check the status of the Task.
  • When the Task is completed, the message loop will give control back to the await statement and processing will continue after the await.

Let’s think about this carefully. If our message loop checks on the Task to see its status, but our async method returns void, how can it know that it’s finished? The answer is it cannot. So what happens is the caller will assume that the method is complete and continue its processing. Now we have a race condition and this opens a can of worms.

asyncvoid_pitfall

See a little demonstration of async void gone bad. Follow the step numbers on the image and explanation below:

  1. Here we are in the Loaded event. (It’s fine to use async void here as it is a high level handler). In this step we call the method GetLoggedInUserAsync() which is an async void method (alarm bells should go off now).
  2. You can see that we do an async call to the database to get the user name. We await the result and then set it to a variable _loggedInUser.
  3. Since we hit an await in step 2, remember that control returns back to the caller. This means that MyApp_Loaded gets control again. But since GetLoggedInUserAsync() did not return a Task, there’s no way for the message loop to track the Task’s completion, so an assumption is made that it’s done and you’ll see that the code will continue at number 3, printing “Logged in user is: ”. But why is _loggedInUser blank? Because it gets set asynchronously at step 2 and since we returned void, the Loaded event already printed without the variable having a value yet.
  4. Now finally we get a result from the database and the Task returned at GetUserNameFromDbAsync() is complete. The Message loop picks up that this Task is completed, returns control to the method and finally _loggedInUser is set. But unfortunately far too late.

Here’s our result (the username wasn’t set in time):

asyncvoid_pitfallresult

How would we correct this code? Very easy, we simply

  1. Change the async void to async Task
  2. Await the task in the Loaded event.

asyncvoid_pitfallsolution

Here’s the result (As it was intended):

asyncvoid_pitfallresultcorrect

Do NOT use async void, rather use async Task. The only exception is for high level event handlers (such as Loaded events, Click events etc).
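
The same fix can be boiled down to a console sketch (the names are illustrative): because the method returns Task instead of void, the caller has something to await, so the value is guaranteed to be set before it’s used.

```csharp
using System;
using System.Threading.Tasks;

string loggedInUser = "";

// Returning Task (not void) lets the caller await completion
async Task GetLoggedInUserAsync()
{
    await Task.Delay(100);        // stand-in for the database call
    loggedInUser = "Niels";
}

await GetLoggedInUserAsync();     // caller awaits: no race condition
Console.WriteLine("Logged in user is: " + loggedInUser);
```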

#2 Mixing await and TPL may cause deadlocks

If we’re going to start making our methods async, it can be quite important to make all our methods async, right up the calling tree. A common mistake is to use await and .Result interchangeably. Let’s have a look at why this is a problem.


private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var loggedInUser = GetLoggedInUserAsync().Result;
}

public async Task<string> GetLoggedInUserAsync()
{
   return await GetUserNameFromDbAsync();
}

When running the above code in WPF, we kick off our async method from the UI Thread context and we get a deadlock. At first glance it doesn’t make sense as to why.

The answer is that by default await schedules the continuation of a function on the same SynchronizationContext.

Let’s walk through the code snippet to understand this better.

  • In line 8 we await a Task.
  • This was called from our UI Thread context. This means that continuation with awaited results will run on our UI Thread again.
  • But in line 3 we used .Result which blocks the Thread until a result is received.
  • So line 3 blocks waiting for a result, whilst line 8 tries to return the awaited result onto the UI thread
  • This causes a deadlock.

Solution

There are 2 options we can pursue:

  1. Make sure async and await are used all the way
      • Change line 1 to async void
      • Change line 3 to await the results instead of asking for .Result.
    
    private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
    {
       var loggedInUser = await GetLoggedInUserAsync();
    }
    
    public async Task<string> GetLoggedInUserAsync()
    {
       return await GetUserNameFromDbAsync();
    }
    
    
  2. The second option would be to use .ConfigureAwait(false)
    • Add ConfigureAwait(false) to all the places where you return a task
    • This will override the default behaviour of continuing on the “saved context” and therefore resolve the deadlock
private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var loggedInUser = GetLoggedInUserAsync().Result;
}

public async Task<string> GetLoggedInUserAsync()
{
   return await GetUserNameFromDbAsync().ConfigureAwait(false);
}

I would say a rule of thumb is to use async all the way up, but when creating class libraries that handle asynchronous work, then make sure that you litter it with ConfigureAwait(false) as you cannot control how the consuming code will call your async methods.

#3 Parallel class, ask if sequence matters

When using the Parallel class to do iterations in parallel, always ask the question: does sequence matter? If the answer is yes, then consider a different approach. Parallel class iterations don’t guarantee sequential processing. Also bear in mind that the iterations may not necessarily run in parallel. If the class deems that the work will complete quicker synchronously, it will run it that way.
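
A small sketch of the point: a Parallel.For processes every item exactly once, but the order in which the iterations actually run is not guaranteed.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

var order = new ConcurrentQueue<int>();

Parallel.For(0, 100, i =>
{
    order.Enqueue(i);   // records the order items were actually processed in
});

// Every iteration ran exactly once...
Console.WriteLine("Processed: " + order.Count);
// ...but the recorded order may differ from 0..99
Console.WriteLine("In sequence: " + order.SequenceEqual(Enumerable.Range(0, 100)));
```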

#4 Thread affinity restrictions with certain technologies

This is an age-old pitfall, but I believe it deserves a mention. Some technologies impose thread affinity restrictions that require code to run on a specific thread. Make sure you consider these restrictions depending on the technology you are using.

For example in WPF we cannot update the UI except from the UI Thread. In this scenario we must bear in mind that if we’re on a different thread that we must invoke the UI’s dispatcher to pop our piece of work onto the UI Thread.

#5 Unobserved Exceptions

An unobserved Task throwing an exception will not be handled by user code. We need to observe the tasks (await, .Result or .Wait) or tap into the TaskScheduler.UnobservedTaskException event to handle these exceptions.
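
A minimal sketch of observing a task: awaiting a faulted task re-throws its exception, so an ordinary try/catch can handle it.

```csharp
using System;
using System.Threading.Tasks;

// A task that has already faulted (stand-in for any failing async operation)
Task faulted = Task.FromException(new InvalidOperationException("boom"));

string handled = "not handled";
try
{
    await faulted;            // observing the task re-throws its exception
}
catch (InvalidOperationException ex)
{
    handled = ex.Message;     // the exception is now handled by user code
}
Console.WriteLine(handled);   // prints "boom"
```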

#6 Parallel isn’t always faster

When opting to make some operation asynchronous, it’s important to ask whether it is necessary to do so. When a task is quick and simple when it runs synchronously, we may make our application slower by forcing tasks to run on separate threads. Remember that whenever we run operations in parallel, there’s a cost of synchronization involved.

A common area of “over-parallelization” would be to make nested loops asynchronous. In such a case, try to only keep the outer loop asynchronous whilst the inner loops can run synchronously.
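For example, summing a small matrix: only the outer loop over rows runs in parallel, while each row is summed synchronously, so synchronization happens once per row rather than once per cell.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

int[][] matrix =
{
    new[] { 1, 2, 3 },
    new[] { 4, 5, 6 },
    new[] { 7, 8, 9 },
};

long total = 0;

// Parallelize only the outer loop; the cheap inner loop stays synchronous
Parallel.For(0, matrix.Length, row =>
{
    long rowSum = 0;
    for (int col = 0; col < matrix[row].Length; col++)
    {
        rowSum += matrix[row][col];
    }
    Interlocked.Add(ref total, rowSum);  // one synchronization per row, not per cell
});

Console.WriteLine("Total: " + total);    // 45
```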

#7 Thread safety

This is an important concept to grasp. Once we start performing tasks in parallel, we will run into scenarios where different threads share common resources. If multiple threads can access a method simultaneously without compromising the data, we say that the method is “thread-safe”.

So let’s play out a little scenario. We have an application where we sell bikes. To sell a bike, we call SellBike() method and we increment a totalSold variable to keep track of bikes sold. We have one rule that we may only sell 5 bikes in total. After that we must stop.

Here’s the synchronous version of the above scenario:


int totalSold = 0;
public void MyApp_Loaded(object sender, EventArgs e)
{
   for (int i = 0; i < 10; i++)
   {
      SellBike();
   }
   Console.WriteLine("Sold " + totalSold + " bikes");
}

private void SellBike()
{
   if (totalSold < 5)
   {
      // Sell Bike
      Thread.Sleep(1);

      totalSold += 1;
   }
}

No matter how many times we run the above application, we always get this output:

bikeresult_sync

Now somewhere along the way, we decide that we need to make our application asynchronous. We change the for loop to Parallel.For and enjoy our performance gain:


int totalSold = 0;
public void MyApp_Loaded(object sender, EventArgs e)
{
   Parallel.For(0, 10, (i) =>
   {
      SellBike();
   });
   Console.WriteLine("Sold " + totalSold + " bikes");
}

private void SellBike()
{
   if (totalSold < 5)
   {
      // Sell Bike
      Thread.Sleep(1);

      totalSold += 1;
   }
}

Only to realise that suddenly we’re selling too many bikes. Here’s the output:

bikeresult_async

What went wrong?

How was it possible to sell 9 bikes when our code limited us to only selling 5? Thread safety.

We’re now iterating in parallel and the SellBike method gets hit from multiple threads. Different threads hit the if statement at line 13 and ask: have I sold fewer than 5 bikes? If the answer is yes, the thread enters the code block, sells a bike and then increments the totalSold variable. The problem is that if all the threads manage to get past line 13 before 5 of the threads have updated the totalSold variable, it’s too late to chase them out of the if statement, since it has already evaluated to true.

Solution

We need to make sure that if one thread is “busy selling” a bike, we wait until it’s done. Once it’s done we allow the next thread to sell a bike. This way the integrity of our business rule to only sell 5 bikes stays intact.

We can use the lock statement. We lock on an object and then when a thread hits the lock statement, it checks whether any other thread has locked on the object. If so, the thread will wait until the other thread releases the object. Here’s the code:


private object sync = new object();
private void SellBike()
{
   lock (sync)
   {
      if (totalSold < 5)
      {
         // Sell Bike
         Thread.Sleep(1);

         totalSold += 1;
      }
   }
}

Now our method is “Thread-safe”, since multiple threads can access it at the same time without compromising the data.

bikeresult_sync

Something interesting to note is that when we use the lock keyword, it actually makes use of the Monitor class (System.Threading namespace) in the IL code. It uses the Enter method to allow a thread to enter the code block and the Exit method to release the object, allowing another thread to lock on it and enter the code block. Here’s the decompiled C# from the IL code.

IL_monitorclass

The Monitor class is known as a synchronization primitive: a class which aids in the synchronizing of threads to avoid race conditions (as we saw in the bike example). There are many synchronization primitives to choose from, each with their own benefits. I’ll briefly touch on the lightweight ones introduced in .NET 4.0.

ManualResetEventSlim

  • A popular signalling synchronization primitive
  • Use this if a Thread needs to “Wait” for another thread to signal it.
  • Once the other thread gives the signal, the thread can continue its work.
  • This is a way for threads to “talk” to each other. “Hey, let me know when you’re done”, “Ok, I’m done”…

ManualResetEventSlim

SemaphoreSlim

  • Very similar to the Monitor class except that it allows n amount of threads to access a resource / code block
  • It’s much like a bouncer at the door of a code block. Only allowing n amount to enter at a time while the threads wait outside.
  • When a thread exits the resource, the “bouncer” allows another thread to enter.

SemaphoreSlim

CountdownEvent

  • This is a signalling synchronization primitive
  • A thread will wait until n amount of other threads have signaled.
  • E.g. a thread will say, “I’m waiting until 3 other threads have given me the signal!”

CountdownEvent

Barrier

  • This is used when you want parallel threads to “wait” at a point until they’re all done and then together continue again.
  • A great way to synchronize without requiring the control of a parent / master thread.
  • E.g. 3 Threads do work and when a thread is done, waits asking the other 2 “Are you done?” Once all the threads are done, they all continue together “Ok, ready? let’s Go!”

Barrier
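
As a sketch of the “bouncer” idea from the SemaphoreSlim description above (the InterlockedMax helper is just for the demo), at most 2 threads are ever inside the guarded block at once:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

var bouncer = new SemaphoreSlim(2);   // at most 2 threads allowed in at a time
int inside = 0, maxInside = 0;

Parallel.For(0, 20, i =>
{
    bouncer.Wait();                   // wait outside until the bouncer lets us in
    try
    {
        int now = Interlocked.Increment(ref inside);
        InterlockedMax(ref maxInside, now);
        Thread.Sleep(5);              // pretend to do some work
        Interlocked.Decrement(ref inside);
    }
    finally
    {
        bouncer.Release();            // leaving: the bouncer admits the next thread
    }
});

Console.WriteLine("Max threads inside at once: " + maxInside);

// Demo helper: atomically keep the maximum value seen so far
static void InterlockedMax(ref int target, int value)
{
    int current;
    while (value > (current = Volatile.Read(ref target)) &&
           Interlocked.CompareExchange(ref target, value, current) != current) { }
}
```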

Concurrent Collections

A last thing to consider when it comes to thread safety is concurrent collections. The standard collections are not thread safe, as one thread could be adding an item whilst another is removing item(s). We could implement specific locking procedures before accessing or manipulating collections, but fortunately for us, Microsoft already did a lot of this work for us.

Here’s a list of collections which can be safely used for multi-threading:

  • ConcurrentBag<T> – Thread-safe collection of items (unsorted)
  • ConcurrentDictionary<TKey, TValue> – Thread-safe Dictionary
  • ConcurrentQueue<T> – Thread-safe Queue
  • ConcurrentStack<T> – Thread-safe Stack
  • IProducerConsumerCollection<T> – An interface to implement our own custom thread-safe collection for manipulating and reading. ConcurrentBag, ConcurrentQueue, ConcurrentStack & BlockingCollection implement this interface.
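
A quick sketch with ConcurrentDictionary (sticking with the bike shop theme): 100 parallel updates to shared counts need no explicit locking.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

var stock = new ConcurrentDictionary<string, int>();

// 100 parallel "deliveries" safely update shared counts, no lock needed
Parallel.For(0, 100, i =>
{
    string item = i % 2 == 0 ? "bike" : "helmet";
    stock.AddOrUpdate(item, 1, (key, count) => count + 1);
});

Console.WriteLine("bikes: " + stock["bike"]);      // 50
Console.WriteLine("helmets: " + stock["helmet"]);  // 50
```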

Conclusion

That concludes this series of exploring asynchronous coding. To summarise, we’ve touched on the basics of:

Threading and asynchronous programming is a massive topic and there’s much to learn especially when it comes to understanding the lower-level workings of Threading, thread pools, thread affinity etc. I trust that this series has given you a good basis to start using asynchronous code. Comments and suggestions are welcome.

Posted in .NET Development, Asynchronous Programming, Tutorials

async and await (Part 3 of 4)

We’re busy with a journey of discovering the basics of writing good asynchronous code. Here’s a quick road map:

In Part 2, we explored the essentials of TPL. In .NET 4.5 we’ve been presented with the new kids on the block: async & await. The huge advantage of these 2 is that they even further simplify the code and enable you to run logic asynchronously on a single thread. Yip, re-read that sentence again. It blew my mind the first time I did. This is a huge thing when it comes to writing non-blocking UI code.

Let’s test these magic keywords:

We’ll write a little WPF application with 3 TextBoxes. Then on the form load, we do 3 different “heavy calculations” and place each result into a TextBox. We’ll use TPL basics we’ve learned up until now.

Xaml:


<StackPanel>
   <TextBox x:Name="txt1" />
   <TextBox x:Name="txt2" />
   <TextBox x:Name="txt3" />
</StackPanel>

Here’s our TPL code behind:


private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var t1 = HeavyCalculation(2000000000);
   var t2 = HeavyCalculation(1800000000);
   var t3 = HeavyCalculation(int.MaxValue);

   txt1.Text = t1.Result.ToString();
   txt2.Text = t2.Result.ToString();
   txt3.Text = t3.Result.ToString();
}

public Task<string> HeavyCalculation(int number)
{
   return Task.Run(() =>
   {
      int total = 0;
      for (int i = 0; i < number; i++)
      {
         unchecked { total += i; }
      }
      return total.ToString();
   });
}

The above works. The 3 tasks run asynchronously and the main thread carries on. But if we look at lines 7, 8, 9, we set our TextBoxes to our Task results. This blocks the Main Thread until the results are obtained, resulting in our UI Thread hanging / freezing for a little bit. See the image below of the UI “freezing”:

wpf_freeze

Don’t get me wrong, this is already loads better than calling all 3 of our methods synchronously. But we can make this a much better user experience using async & await. Let’s see how we would do that:


private async void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   var t1 = HeavyCalculation(2000000000);
   var t2 = HeavyCalculation(1800000000);
   var t3 = HeavyCalculation(int.MaxValue);

   txt1.Text = await t1;
   txt2.Text = await t2;
   txt3.Text = await t3;
}

public async Task<string> HeavyCalculation(int number)
{
   return await Task.Run(() =>
   {
      int total = 0;
      for (int i = 0; i < number; i++)
      {
         unchecked { total += i; }
      }
      return total.ToString();
   });
}

  • In the above code we added the async keyword to the methods (this simply enables the await keyword in the method).
  • On line 14 we added the await keyword.
  • Now on lines 7, 8, 9, instead of explicitly asking for the results (t1.Result), we use the await keyword.
  • The UI is completely responsive and as the results “come in”, the TextBox values will be set.

See the screenshots below:

wpf_async1 wpf_async2 wpf_done

Immediately the UI is responsive and the user can interact with our system even though our results aren’t all ready yet. As the results come in, the TextBoxes get populated.

How does it do this?

(Using the above example as a reference)

  • The Main Thread kicks off the 3 tasks asynchronously as before.
  • Now the UI Thread continues until it reaches line 7.
  • In the first example (t1.Result) the UI Thread blocked until the result is available.
  • Using await, instead of blocking the UI thread, the method returns control to the caller and the UI becomes responsive.
  • Now the “message pump” / “message loop” will continue to check the status of the Task. (how the message loop works is out of the scope of this article).
  • Once the result comes in, the Task object is marked as completed.
  • The message loop will give control back to the await at line 7 and continues until line 8.
  • If the result is ready, the result will be returned, otherwise, control is sent back to the caller (UI Thread) until a result comes in. This is how the UI becomes responsive.

Something important to note is that by default await “saves” the SynchronizationContext of the function calling it. This can cause deadlocks when .Result or .Wait and await are used together. (See async pitfalls, Part 4)

Error handling

Exception handling using async and await is much simpler as well. We can simply use a try … catch statement as if we were writing synchronous code:



try
{
   int result = await GetResult();
}
catch(Exception)
{
   // Handle exception...
}

What’s Next?

That’s it for covering the basics of async and await. In Part 4, we will look at some of the common pitfalls of asynchronous programming.

Posted in .NET Development, Asynchronous Programming, Tutorials

Essentials of TPL (Part 2 of 4)

We’re busy with a journey of discovering the basics of writing good asynchronous code. Here’s a quick road map:

In Part 1 we’ve set the stage for asynchronous programming by discussing some basic jargon and touched on the difficulties of the traditional approach to Multi-tasking. We quickly saw how it can become very complex and tedious to debug and maintain.

Fortunately we have the Task Parallel Library (TPL). TPL was introduced by Microsoft to provide an easier approach to implement parallel processing in applications. Much of the plumbing is now abstracted away and as a result, a more understandable API is available.

Task

The Task class (System.Threading.Tasks namespace) is our main player in TPL. We basically wrap up an operation into a Task which can then be executed asynchronously.

Is a Task the same as a Thread?

It’s very important to understand that a Task is NOT a Thread. Behind the scenes, a Task by default makes use of the ThreadPool. It’s also important to note that a new Task does NOT guarantee a new Thread; it could run on the same thread it is called from.

If you’re a Web Developer, you should be familiar with JavaScript Promises. A Task is much like a JS Promise.

Please note there are big differences between Tasks and JS Promises, the statement is merely to help relate a Task conceptually to something we’re already familiar with.

A Task is a piece of work that can run asynchronously and is more closely related to data or program flow. It’s a higher-level, data-oriented abstraction of a piece of work. A Thread, on the other hand, is a more explicit low-level component. It also doesn’t map very nicely onto our Application / Domain logic.

There are 2 Task classes:

  • Task – No return type. An example would be to write to a log file. The operation gets executed and no return value is expected.
  • Task<TResult> – TResult is the return type. An example would be to get a customer by their Id. A Customer instance result would be expected from the Task. (Task<Customer>).
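A minimal sketch of both flavours (the Customer class and GetCustomerById are hypothetical, purely for illustration):

```csharp
using System;
using System.Threading.Tasks;

public class Customer
{
    public int Id;
    public string Name;
}

public static class TaskFlavours
{
    // Hypothetical lookup, for illustration only.
    public static Customer GetCustomerById(int id) =>
        new Customer { Id = id, Name = "Alice" };

    public static string Run()
    {
        // Task: no return value expected, e.g. writing to a log.
        Task logTask = Task.Run(() => Console.WriteLine("Writing to log..."));

        // Task<Customer>: a Customer instance is expected as the result.
        Task<Customer> customerTask = Task.Run(() => GetCustomerById(42));

        logTask.Wait();
        return customerTask.Result.Name; // blocks until the result is ready
    }
}
```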

Starting a task

There are different ways to start a task:

  • var t = new Task(…); t.Start();
    • Slower than Task.Factory.StartNew(…).
    • Handy when you need to create a task but only start it later.
  • var t = Task.Factory.StartNew(…);
    • Use this if you’re using .Net 4.0 to start Tasks.
  • var t = Task.Run(…);
    • Introduced in .Net 4.5.
    • Same as Task.Factory.StartNew except that it includes specific defaults that were often repetitively passed, making it easier to call.

Why is Task.Factory.StartNew faster than task.Start()? A Task can only ever be started once. The Start method on the instance has validation checks to ensure that a Task is not started again if it has already been run. When starting the Task from the Factory, this validation is not needed as it obviously has not run before since the Factory itself creates it.

Finally time to see some code. Let’s set the scene with some “heavy” synchronous code:


public static int HeavyCalculation(int number)
{
   int total = 0;
   for (int i = 0; i < number; i++)
   {
      unchecked { total += i; }
   }
   return total;
}

Here we have a method which receives a number, loops from 0 until the number passed, whilst keeping a running total. The total is then returned to the calling code.

The unchecked operator simply means that if we go past the maximum value an integer can hold, instead of throwing an overflow exception, it wraps around to the minimum value and carries on.

static void Main(string[] args)
{
   Stopwatch sw = Stopwatch.StartNew();

   int answer1 = HeavyCalculation(2000000000);
   int answer2 = HeavyCalculation(1800000000);
   int answer3 = HeavyCalculation(int.MaxValue);

   int sumTotal = answer1 + answer2 + answer3;

   sw.Stop();

   Console.WriteLine("Answer: " + sumTotal);
   Console.WriteLine("Time: " + sw.ElapsedMilliseconds + " ms");

   Console.ReadLine();
}

Our calling code simply does 3 calculations on some big numbers; the answers are then summed up and written to the console window. It took about 12 seconds to execute.
ResultSync1

Let’s convert the 3 calculations into Tasks and allow them to run asynchronously.

static void Main(string[] args)
{
   Stopwatch sw = Stopwatch.StartNew();

   var t1 = Task.Run(() => HeavyCalculation(2000000000));
   var t2 = Task.Run(() => HeavyCalculation(1800000000));
   var t3 = Task.Run(() => HeavyCalculation(int.MaxValue));

   int sumTotal = t1.Result + t2.Result + t3.Result;

   sw.Stop();

   Console.WriteLine("Answer: " + sumTotal);
   Console.WriteLine("Time: " + sw.ElapsedMilliseconds + " ms");

   Console.ReadLine();
}

ResultAsync1

This time it took about 4.5 seconds to execute. Same results and almost 3 times faster, that’s not bad at all!

Let’s quickly discuss what happened in the above code.

  • The application will quickly run through to line 9.
  • At this point, all 3 Task have been started asynchronously.
  • The reason the application pauses at line 9 is because now we are requesting the Task’s results.
  • It cannot give a Result until it has one, so it blocks.
  • Fortunately, since all 3 Tasks ran in parallel, they will get their results at more or less the same time.

Don’t worry if you don’t quite understand how the “waiting” of Tasks works, we’ll get to that soon.

Observing Tasks

So we can create Tasks and send them off into the “oblivion” to do what they need to do, and in the meantime we do some other neat stuff. Sometimes we need to get back some results, or at least wait for tasks to finish before doing our next set of work.

Waiting for a Task to return a Result, or simply waiting until a Task is done, is known as “observing a Task“.

There are 2 ways to observe a Task:

  • Asking for the Result
  • Pause, by calling Wait

Asking for the Result.


var t1 = Task.Run<int>(...);
var t2 = Task.Run(() => GetName());

int number = t1.Result;
string name = t2.Result;

In the above code, we created 2 Tasks (a Task<int> and a Task<string>). I’ve intentionally used Task.Run<int> for the first task and plain Task.Run for the second to show that there are different ways of calling Task.Run().

For the first Task we explicitly say that our Task returns an integer, and therefore any method called in Run MUST return an integer.

For the second Task, we simply say Task.Run and, depending on the return type of the method passed into Run, the Task determines its result type.

Both Tasks (t1 and t2) are started and run asynchronously. Then at Line 4 the main thread will wait until t1 is done and bring back the result. After that in Line 5, the main thread waits until t2 is done and brings back a result.

Pause by calling Wait

We can observe a task by using the Wait method. This is often done when there is no return value, but we still want to make sure the Task completes before continuing. Here are some different ways to Wait for tasks:

  • t1.Wait();
    • We can call the Wait method on the Task instance.
    • This means that the calling thread stops its execution and waits until the Task instance is done before continuing.
  • Task.WaitAll(t1, t2, t3)
    • Calling the static WaitAll method allows us to pass in an array of Tasks.
    • The calling thread will now wait at this point until all the tasks passed have completed before continuing
  • Task.WaitAny(t1, t2, t3)
    • Calling the static WaitAny method allows us to pass in an array of Tasks.
    • The calling thread will now wait at this point until any of the tasks passed have completed.
    • Once ANY of the Tasks have completed, the calling thread will continue execution, regardless of whether the other Tasks are done or not.
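The difference between WaitAll and WaitAny is easy to see with three tasks of different durations (the sleeps simply simulate work):

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class WaitDemo
{
    public static int Run()
    {
        var t1 = Task.Run(() => Thread.Sleep(100));
        var t2 = Task.Run(() => Thread.Sleep(300));
        var t3 = Task.Run(() => Thread.Sleep(500));

        // Returns the index of the first task to finish (most likely t1).
        int firstDone = Task.WaitAny(t1, t2, t3);

        // Blocks here until ALL three tasks have completed.
        Task.WaitAll(t1, t2, t3);

        return firstDone;
    }
}
```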

Task Exceptions

Are we now TPL Masters? Not yet, we still need to touch on Exception Handling. What happens when exceptions are thrown?

There are 2 exception “quirks” which caught me out and so I consider them very important.

  1. Tasks only throw AggregateExceptions.
    • It’s possible that multiple exceptions are thrown together (as we’re running our Tasks asynchronously) and therefore the thrown Exceptions are bundled into an AggregateException.
    • To get the thrown Exception(s) details, we use the InnerExceptions property (Note: this is InnerExceptions, not the standard InnerException).
  2. If a Task is NOT observed, the Exception will be unhandled.
    • Very important! If you do not observe a Task (Wait for it or ask for its Result), its exception WILL remain unhandled, even if you wrap the neatest try/catch around it.
    • See the snapshot below. The exception in MyNewMethod is unhandled, even though there’s a try/catch around the Task calling it.

AsyncExceptions

Let’s practice what we preach and implement better Exception handling.


var t1 = Task.Run(() => MyNewMethod());
try
{
   t1.Wait();
}
catch (AggregateException aggEx)
{
   foreach(var ex in aggEx.InnerExceptions)
   {
      if (ex is NotImplementedException)
      {
         Console.WriteLine("NotImplementedException: " + ex.Message);
      }
      else
      {
         Console.WriteLine("Generic Exception: " + ex.Message);
      }
   }
}

In the above code we:

  1. Observe the exception (Wait() at line 4)
  2. Only catch an AggregateException
  3. Iterate the InnerExceptions and handle them individually.

Since the AggregateException is simply a “bundling” of exceptions, what if we wanted certain exceptions to be handled, whilst others we want to bubble upwards for the caller to handle?

We could check the Exception type and then rethrow the Exceptions we don’t want handled. The AggregateException class has a nice little ace up its sleeve for this purpose: the Handle(Func<Exception, bool>) method:


static void Main(string[] args)
{
   var t1 = Task.Run(() => MyNewMethod());
   try
   {
      t1.Wait();
   }
   catch (AggregateException aggEx)
   {
      aggEx.Handle(HandleException);
   }
}

private static bool HandleException(Exception ex)
{
   if (ex is NotImplementedException)
   {
      Console.WriteLine("Not Implemented!");
   }
   else
   {
      Console.WriteLine("Generic Exception: " + ex.Message);
   }
   return true; // True means we've handled the exception.
}

Here we simply create a HandleException method which takes an Exception and returns a boolean value. All the Exceptions within AggregateException will be passed to this method one at a time. If false is returned, the Exception is unhandled and will be bundled up into a new AggregateException and then rethrown to the caller. If true is returned, the exception is handled.

ILHandle

For interest’s sake, I checked out the decompiled code of the Handle method to see exactly what it does. In the highlighted parts we see that the “handled = false” exceptions are simply added to a list of Exceptions and rethrown as a new AggregateException.

If we do want to catch our unobserved exceptions we can subscribe to the TaskScheduler’s UnobservedTaskException event.


TaskScheduler.UnobservedTaskException += TaskScheduler_UnobservedTaskException;
...
private static void TaskScheduler_UnobservedTaskException(object sender, UnobservedTaskExceptionEventArgs e)
{
   //...
}

The above method will only get hit once the Garbage Collector makes its rounds and collects a Task which threw an unobserved exception.

Cancelling Tasks

Now we’re capable of sending off Tasks asynchronously and waiting for them when we need to. But what happens if we need to cancel a long running Task? If we waited for the Task, we may have to wait for a long time. Once again TPL offers us a simple way to handle this. The image below describes the Cancelling process.

CancellationProcess

Let’s put some code to the idea.


CancellationTokenSource cts = new CancellationTokenSource();
public void DoWork()
{
   try
   {
      var t1 = Task.Run(() => MyLongMethod(cts.Token));
      t1.Wait();
   }
   catch (AggregateException aggEx)
   {
       //... Handle OperationCanceledException
   }
}

public void CancelTask()
{
   cts.Cancel();
}

public void MyLongMethod(CancellationToken token)
{
   for (int i = 0; i < 10; i++)
   {
      if (token.IsCancellationRequested)
      {
         // Clean up task here.
         throw new OperationCanceledException();
      }
      Thread.Sleep(500); // Simulating some work.
   }
}

Let's step through the above code:

  1. Calling code creates a CancellationTokenSource. (line 1)
  2. The CancellationTokenSource instance exposes a Token property (a CancellationToken). This token is passed to the Task(s). (line 6)
  3. The Task must do its work while continuously checking whether the token has been cancelled. (line 24)
  4. When tasks must be cancelled, call Cancel() on the source, which changes the Token’s status to Cancelled (line 17). A CancellationToken is a small struct wrapping a reference to its CancellationTokenSource, so all the tokens passed into tasks observe the same source, and changing the status to Cancelled will “signal” all Tasks.
  5. The next time the Tasks check their Token’s status, they will see it’s been cancelled. They then stop their work, clean up and throw OperationCanceledException. (lines 26-27)
  6. The calling code will catch this exception and handle it as required. (line 11)

Using the ThrowIfCancellationRequested method on the token is easier: it wraps up lines 24-28 into a single method call.
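With ThrowIfCancellationRequested, the loop body of a hypothetical long-running method shrinks to a single call:

```csharp
using System.Threading;

public static class CancelDemo
{
    public static void MyLongMethod(CancellationToken token)
    {
        for (int i = 0; i < 10; i++)
        {
            // Throws OperationCanceledException as soon as Cancel() has been
            // called, replacing the manual IsCancellationRequested check
            // and explicit throw.
            token.ThrowIfCancellationRequested();

            Thread.Sleep(100); // Simulating some work.
        }
    }
}
```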

Chaining Tasks

Seldom do we find code quite as simple as the examples in this article so far. Let’s paint a new scenario:

We have an application that needs to call MyFirstTask and then when that is done, it must call MySecondTask and finally when this is done it must write “Finished” to a log file. From what we learnt, we could do the following:

// Not the best way!
var t1 = Task.Run(() => MyFirstTask());
t1.Wait();
var t2 = Task.Run(() => MySecondTask());
t2.Wait();
var t3 = Task.Run(() => WriteToLog("Finished"));
t3.Wait();

Above, we execute a task and wait for it to finish before kicking off the next one. Here’s a better way to represent the same code:

var t1 = Task.Run(() => MyFirstTask())
           .ContinueWith((prevTask) => MySecondTask())
           .ContinueWith((prevTask) => WriteToLog("Finished"));

t1.Wait();

Here we see chaining taking place. Instead of creating new tasks for every step, we simply chained them: once the first Task completes, the ContinueWith executes, and so on. One other thing to note is that the continuation takes a single parameter, prevTask, which is the antecedent (the previous Task which kicked off the ContinueWith). This way we can access the Result of the first Task from within the ContinueWith.

The Task Factory has some static methods which take multiple tasks:

  • Task.Factory.ContinueWhenAll(tasks, (allTasks) => ….) – will wait until ALL Tasks are done before running the continuation Task.
  • Task.Factory.ContinueWhenAny(tasks, (firstTask) => …) – will wait until ANY Task is done before running the continuation Task.

None of the Continue methods will block the calling thread! If you call Task.Factory.ContinueWhenAll(…) the calling code will continue on, since we have chained our continuation Task. If the calling code needs to stop, the Wait methods should be used.

Something else worth mentioning is that the ContinueWith method has an overload accepting TaskContinuationOptions. This allows us to chain a Task only when the previous one fails, is cancelled, is successful, etc.
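For example, a continuation that only runs when the antecedent task faults (WriteToLog and the messages are illustrative):

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationDemo
{
    public static string LastLog = "";

    // Hypothetical logger, for illustration only.
    public static void WriteToLog(string message) => LastLog = message;

    public static void Run()
    {
        var t = Task.Run(() => throw new InvalidOperationException("boom"))
            // Runs ONLY if the antecedent task threw an exception.
            .ContinueWith(
                prev => WriteToLog("Failed: " + prev.Exception.InnerException.Message),
                TaskContinuationOptions.OnlyOnFaulted);

        t.Wait();
    }
}
```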

Scheduler / Dispatcher

So what if we were using WPF and tried to update the UI from within a Task? We would still get the good old cross-thread exception. Remember in Part 1 we said that each Thread has a Dispatcher which dispatches instructions to that thread; no one else may dispatch instructions to that specific thread. So in our little WPF application, we need to make sure that UI changes are passed to the Dispatcher governing the UI Thread.

Here’s a little WPF example that would cause a cross-thread exception:


private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   Task.Factory.StartNew(() => SetText());
}

private void SetText()
{
   string text = "A";
   for (int i = 0; i < 100; i++)
   {
      text += i;
   }
   txt.Text = text;
}

And the result (as we expected):

crossThreadException

There are different ways of solving this:

  • We could invoke our UI code via txt’s Dispatcher (as we would have done before TPL):
    txt.Dispatcher.BeginInvoke(new Action(() => txt.Text = text));
  • Another way would be to leave SetText as is (without polluting it with Dispatcher logic) and changing the “context” on which the TaskScheduler runs. Here’s how that would look:

private void MainWindow_Loaded(object sender, RoutedEventArgs e)
{
   TaskScheduler uiScheduler = TaskScheduler.FromCurrentSynchronizationContext();
   Task.Factory.StartNew(() => SetText(), CancellationToken.None, TaskCreationOptions.None, uiScheduler);
}

As you can see in the code above, we get the current context (which is the UI context) and pass it in as a parameter when we kick off our Task, so the Task executes on the UI thread.

Perhaps a more realistic example of the above would be to call GetText (which does the heavy loading) and then chain it, passing the UI context into the ContinueWith method, where we would set txt.Text = prevTask.Result. This way our heavy processing happens on a different thread and only our UI logic is scheduled on the UI Thread.
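Here’s a console-flavoured sketch of that pattern (GetText is hypothetical; in a real WPF app the scheduler would come from TaskScheduler.FromCurrentSynchronizationContext() and the continuation would set txt.Text):

```csharp
using System.Threading;
using System.Threading.Tasks;

public static class UiChainDemo
{
    // Hypothetical heavy loader, for illustration only.
    public static string GetText() => "heavy result";

    public static string Run(TaskScheduler uiScheduler)
    {
        string shown = null;

        // Heavy work runs on a ThreadPool thread...
        var t = Task.Factory.StartNew(() => GetText())
            // ...while only the "UI" update is scheduled on the given
            // scheduler (in WPF: txt.Text = prev.Result).
            .ContinueWith(prev => { shown = prev.Result; },
                          CancellationToken.None,
                          TaskContinuationOptions.None,
                          uiScheduler);

        t.Wait();
        return shown;
    }
}
```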

Parallel class

A great helper / utility class which forms part of TPL is the static Parallel class. There are 3 main methods in the Parallel class which we can use:

  • Parallel.Invoke
  • Parallel.For
  • Parallel.ForEach

Parallel.Invoke

This method simply executes multiple methods in parallel, kind of like what Task.Run() does.


Parallel.Invoke(
  () => FirstMethod(),
  () => SecondMethod(),
  () => ThirdMethod());

So why would there be yet another way to execute some tasks? The answer is that this way is faster.

Parallel.Invoke only takes methods without return types (Actions). The reason for the performance increase is that Task.Run and Task.Factory.StartNew have to create a Task instance for every action performed in parallel, whereas Parallel.Invoke does not return anything.

Simply put, use this method if you have a set of methods that must run in parallel and you don’t need to observe individual Tasks. Note that Parallel.Invoke itself blocks until all the actions have completed.

Parallel.For and Parallel.ForEach

When we’re iterating items, each item is processed one at a time. What if we could split the work across cores, significantly cutting the time it would take to process the entire collection? With Parallel.For and Parallel.ForEach we can now process our iterations in parallel.

Only ever use these methods if the order in which the items are processed does not matter.

Some examples:


Parallel.For(1, 10, i =>
{
   Console.WriteLine(i);
});

or


var nums = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
Parallel.ForEach(nums, (num) =>
{
   Console.WriteLine(num);
});

Both examples will produce something like this:

ParallelLoopResult

PLINQ

PLINQ is parallel support for LINQ. There is an IEnumerable extension method, .AsParallel(), which returns a ParallelQuery. When we run LINQ queries on a ParallelQuery, the actual execution of the query may happen in parallel. I say may, as there is no guarantee that the query will be processed in parallel; TPL will sometimes opt to execute the query sequentially if it deems that to be faster.

The reason why processing the LINQ query in parallel may be slower than processing it sequentially is that the cost of creating and synchronizing threads may outweigh the cost of simply executing the query.

Here’s some code:


List<Customer> lstCustomers = ListAllCustomers();
var vips = from cust in lstCustomers.AsParallel()
           where cust.Status == "VIP"
           select cust;

Same LINQ, just call .AsParallel() on your IEnumerable and that’s it. Doesn’t get much easier than that 🙂

What’s Next?

That’s it for covering the basics of TPL. In Part 3, we will look at the async and await keywords introduced in .NET 4.5 and what makes them unique.