Category: .NET

Serial vs Parallel task execution

This time let’s talk a bit about the difference between serial and parallel task execution.

The idea is simple: if we have two or more operations that depend on one another (e.g. the result of one goes as input into another), then we need to run them in serial, one after the other.

Total execution time will be the sum of the time taken by the single steps. Plain and easy.

What if instead the operations don’t interact? Can each be executed on its own path so we can collect the results later on? Of course! That is called parallel execution.

It’s like those electric racing tracks: each car gets its own lane, they can’t hit or interfere with each other, and the race is over when every car completes the circuit.

So how can we do that? Luckily for us, in .NET we can start a bunch of tasks and then use Task.WhenAll() or Task.WaitAll() to wait for all of them while they run in parallel.

Both methods do more or less the same thing. The main difference is that Task.WaitAll waits for all of the provided Task objects to complete execution, blocking the current thread until everything has completed.

Task.WhenAll instead returns a Task that can be awaited on its own. The calling method will continue when the execution is complete but you won’t have a thread hanging around waiting.

So in the end, the total time will be more or less (give or take a few milliseconds, heh) the same as the most expensive operation in the set.
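To make the difference concrete, here is a minimal sketch of the two approaches. It is not the code from the repository: SlowOperationAsync and the delays are assumptions made up for illustration, with Task.Delay standing in for real work.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class SerialVsParallelDemo
{
    // Hypothetical operation: simulates some I/O-bound work and returns its input.
    public static async Task<int> SlowOperationAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value;
    }

    // Serial: the second operation starts only after the first has completed.
    public static async Task<int> RunSerialAsync()
    {
        var first = await SlowOperationAsync(1, 200);
        var second = await SlowOperationAsync(2, 300);
        return first + second;
    }

    // Parallel: both operations are started first, then awaited together.
    public static async Task<int> RunParallelAsync()
    {
        var firstTask = SlowOperationAsync(1, 200);
        var secondTask = SlowOperationAsync(2, 300);
        int[] results = await Task.WhenAll(firstTask, secondTask);
        return results[0] + results[1];
    }

    // Measures how long an asynchronous function takes to complete.
    public static async Task<TimeSpan> MeasureAsync(Func<Task<int>> run)
    {
        var stopwatch = Stopwatch.StartNew();
        await run();
        return stopwatch.Elapsed;
    }
}
```

With these made-up delays, the serial version should take roughly the sum of the two (about 500 ms), while the parallel one should take roughly as long as the slowest operation (about 300 ms).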

I’ve prepared a small repository on GitHub to demonstrate the concepts, feel free to take a look. It’s a very simple .NET Core console application showing how to execute two operations in serial and then in parallel.

Here’s a screenshot I got on my MacBook Pro:

Know your data structures – List vs Dictionary vs HashSet

Are there any cases when it doesn’t really matter how your data is structured, as long as you’re fulfilling the task at hand? Or is it always important to use the perfect data structure for the job? Let’s find out!

List, Dictionary and HashSet have quite different purposes and use cases. Specifically, Lists should be used when all you have to do is enumerate the items or access them randomly via index.

Lists are very similar to plain arrays. Essentially they wrap an array that grows once its current capacity is exceeded. It’s the standard and probably the most used collection. Items can be accessed randomly via the [] operator in constant time. Adding or removing at the end costs O(1) as well, except when the capacity is exceeded and the backing array has to be reallocated and copied. Doing it at the beginning or in the middle requires all the following items to be shifted, which costs O(n).

Dictionaries and HashSets instead are specialised collections intended for fast-lookup scenarios. They basically use a hash function to compute a hash code for each key (or item), and use that code to decide where the entry is stored. The same hash code can later be used to quickly retrieve the associated item.

They both share more or less the same asymptotic complexity for all the operations. The real difference is the fact that with a Dictionary we can create key-value pairs (with the keys being unique), while with a HashSet we’re storing an unordered set of unique items.

It’s also extremely important to note that when using HashSets, items have to properly implement GetHashCode() and Equals().

With Dictionaries, instead, that is needed for the type used as key.
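As a rough illustration of what “properly implement” means, here is a small sketch. The Point class is hypothetical and exists only for this example: two instances with the same coordinates are treated as the same item, so a HashSet keeps only one of them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical item type with value-based equality.
public sealed class Point : IEquatable<Point>
{
    public Point(int x, int y)
    {
        X = x;
        Y = y;
    }

    public int X { get; }
    public int Y { get; }

    // Two points are equal when their coordinates match.
    public bool Equals(Point other) => !(other is null) && X == other.X && Y == other.Y;

    public override bool Equals(object obj) => Equals(obj as Point);

    // Equal points must produce the same hash code.
    public override int GetHashCode() => (X * 397) ^ Y;
}

public static class EqualityDemo
{
    // Counts the distinct points by loading them into a HashSet.
    public static int CountUnique(IEnumerable<Point> points) =>
        new HashSet<Point>(points).Count;
}
```

Without the two overrides, the default reference equality would be used and every new Point instance would count as a different item, even with identical coordinates.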

I wrote a very small profiling application to check lookup times of List, Dictionary and HashSet. It first generates an array of Guids and uses it as the source dataset while running the tests.

The code is written in C# using .NET Core 2.2 and was executed on a mid-2012 MacBook Pro. Here’s what I got:

Collection creation

Lists here perform definitely better, likely because Dictionaries and HashSets have to pay the cost of creating the hash used as key for every item added.

Collection creation and lookup

Here things start to get interesting: the first case shows the performance of creation and a single lookup. More or less the same stats as simple creation. In the second case instead the lookup is performed 1000 times, leading to a net win for Dictionary and HashSet. This is obviously due to the fact that a lookup on a List takes linear time ( O(n) ), while it takes constant time on average ( O(1) ) for the other two data structures.

Lookup of a single item

In this case Dictionary and HashSet win in both executions, because the collections have already been populated and only the lookup is being measured.

Lookup in a Where()

For the last example the system is looping over an existing dataset and performing a lookup for the current item. As expected, Dictionaries and HashSet perform definitely better than List.
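The scenario above can be sketched roughly like this. It is not the profiling code itself: the class and method names are assumptions made up for illustration. Both methods return the same count, but the List version scans the whole list for every item, while the HashSet version does a hash lookup.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CrossLookupDemo
{
    // Hypothetical dataset: 'count' freshly generated Guids.
    public static Guid[] GenerateDataset(int count) =>
        Enumerable.Range(0, count).Select(_ => Guid.NewGuid()).ToArray();

    // O(n * m): every Contains call scans the list from the start.
    public static int CountMatchesWithList(IEnumerable<Guid> items, List<Guid> known) =>
        items.Where(item => known.Contains(item)).Count();

    // O(n) on average: every Contains call is a hash lookup.
    public static int CountMatchesWithHashSet(IEnumerable<Guid> items, HashSet<Guid> known) =>
        items.Where(item => known.Contains(item)).Count();
}
```

Wrapping each call in a Stopwatch with a dataset of a few thousand items is usually enough to see the gap grow as the collections get bigger.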

It’s easy to see that in almost all cases it makes no difference which data structure is used if the dataset is relatively small, less than 10000 items. The only case where the choice matters is when we need to cross two collections and do a search.

Using Decorators to handle cross-cutting concerns — Part 2 : a practical example

In my previous article I discussed a bit about how to use the Decorator pattern to implement cross-cutting concerns and reduce clutter in your codebase.

Today it’s going to be a bit more practical: we’ll be looking at a small demo I published on GitHub that makes use of Decorators as well as some other interesting things like .NET Attributes, CQRS and Dependency Injection.

I’m not going to deep dive into the details of CQRS as it would obviously take too much time and it’s outside the scope of this article. I’m using it here because query/command handlers usually expose just one method so there is no need to implement a big interface. Also, I like the pattern a lot 🙂

So let’s go straight to the code! The repository is available here: https://github.com/mizrael/cross-cutting-concern-attributes

It’s a very small .NET Core WebAPI application, nothing particularly fancy. No infrastructure of course, there’s no need for this article.

There’s just one API controller, exposing a single GET endpoint to retrieve a list of “values”. I might have called it “stuff” instead of “values”, it’s just an excuse to retrieve some data from the backend.

As you may have noticed, there’s no direct reference to the query handler in the API controller: I prefer to use MediatR to avoid injecting too many things in the constructor. It has become a habit so I’m doing it even when there’s just one dependency.

For those who don’t know it, MediatR acts as a simple in-process message bus, allowing quick dispatch of commands, queries and events. So, basically, it’s a very handy tool when implementing CQRS.

The ValuesArchiveHandler class handles the actual execution of the query. Actually it’s not doing much, apart from returning a fixed list of strings.

What we’re actually interested in is that small attribute, [Instrumentation]. It is just a marker; the real grunt work is done elsewhere. I could have used an interface as well of course, but there are several reasons why I didn’t.

First of all, I prefer to avoid empty interfaces: an interface is a contract, and an interface without methods doesn’t define any contract.

Moreover, attributes can always be configured to not propagate to descendant types automatically, something you cannot do with interfaces.
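Here is a small sketch of what that looks like. The attribute name matches the one used in the demo, but the AttributeUsage settings and the two handler classes are assumptions made for illustration, not a copy of the repository code.

```csharp
using System;

// Marker attribute. Inherited = false stops it from propagating to derived classes.
[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class InstrumentationAttribute : Attribute { }

// Hypothetical handlers, only here to show the difference.
[Instrumentation]
public class BaseHandler { }

public class DerivedHandler : BaseHandler { }

public static class AttributeDemo
{
    // Even when searching the inheritance chain, attributes declared with
    // Inherited = false are not picked up from base classes.
    public static bool IsInstrumented(Type type) =>
        type.IsDefined(typeof(InstrumentationAttribute), inherit: true);
}
```

With this setup BaseHandler is reported as instrumented while DerivedHandler is not. An interface implemented by BaseHandler would instead always be implemented by DerivedHandler too.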

Now, take a look at the InstrumentationQueryHandlerDecorator class. It’s a query handler Decorator, so it gets an instance of a query handler injected in the constructor, and uses it in the Handle() method.

This decorator is not doing anything particularly fancy, it’s just using Stopwatch to track how much time the inner handler is taking to complete.

What we’re interested in is the constructor: there the system is checking if the inner instance has been marked with the [Instrumentation] attribute, flipping a boolean value based on the result. That bool will then be used in the Handle() method to turn the instrumentation on or off. That’s it!
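The idea can be sketched as follows. This is a simplified approximation, not the class from the repository: the IQueryHandler interface, the ValuesArchive query, the PlainHandler class and the LastElapsed property are assumptions introduced to keep the example self-contained (the demo goes through MediatR instead).

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

[AttributeUsage(AttributeTargets.Class, Inherited = false)]
public sealed class InstrumentationAttribute : Attribute { }

// Hypothetical handler contract, standing in for the real one.
public interface IQueryHandler<TQuery, TResult>
{
    Task<TResult> Handle(TQuery query);
}

public sealed class InstrumentationQueryHandlerDecorator<TQuery, TResult> : IQueryHandler<TQuery, TResult>
{
    private readonly IQueryHandler<TQuery, TResult> _inner;
    private readonly bool _enabled;

    public InstrumentationQueryHandlerDecorator(IQueryHandler<TQuery, TResult> inner)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        // Check once, at construction time, whether the inner handler is marked.
        _enabled = inner.GetType().IsDefined(typeof(InstrumentationAttribute), false);
    }

    // Hypothetical: exposes the last measurement instead of writing to a logger.
    public TimeSpan? LastElapsed { get; private set; }

    public async Task<TResult> Handle(TQuery query)
    {
        if (!_enabled)
            return await _inner.Handle(query);

        var stopwatch = Stopwatch.StartNew();
        try
        {
            return await _inner.Handle(query);
        }
        finally
        {
            LastElapsed = stopwatch.Elapsed;
        }
    }
}

// Hypothetical query and handlers used to exercise the decorator.
public sealed class ValuesArchive { }

[Instrumentation]
public sealed class ValuesArchiveHandler : IQueryHandler<ValuesArchive, string[]>
{
    public Task<string[]> Handle(ValuesArchive query) =>
        Task.FromResult(new[] { "value1", "value2" });
}

public sealed class PlainHandler : IQueryHandler<ValuesArchive, string[]>
{
    public Task<string[]> Handle(ValuesArchive query) =>
        Task.FromResult(new string[0]);
}
```

Wrapping ValuesArchiveHandler records a timing on every call, while wrapping PlainHandler simply forwards the call and leaves LastElapsed empty.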

I’m using StructureMap as my IoC container and I’m taking care of the handler registration here. In the same file I also decorate all the query handlers with the InstrumentationQueryHandlerDecorator.

Keep in mind that I could have added some smarts here and checked at registration time whether a particular handler had been decorated with the [Instrumentation] attribute.

That would probably be a better solution as it would avoid runtime type checks, handling everything during the application bootstrap.

I’ll probably add this to the repository, I left it out to keep things simple 🙂

This article is also available on Medium as part of a series.

Using Decorators to handle cross-cutting concerns

I was actually planning to post this article here, but I was migrating to another server last week and it took a week for the domain to point to the new DNS. Turns out this gave me the chance to try Medium instead, so I published my first article there.

This time I’ll be writing about a very simple but powerful technique to reduce boiler-plate caused by cross-cutting concerns. In this post we’ll explore a simple way to encapsulate them in reusable components using the Decorator pattern.

Let’s first talk a bit about “cross cutting concerns”. On Wikipedia we can find this definition:

Cross-cutting concerns are parts of a program that rely on or must affect many other parts of the system.

In a nutshell, they represent almost everything not completely tied to the domain of the application but that can affect in some way the behaviour of its components.

Examples can be:
– caching
– error handling
– logging
– instrumentation

Instrumentation for instance can lead to a lot of boilerplate code which eventually will create clutter and pollute your codebase. You’ll basically end up with a lot of code like this:
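Something along these lines, for instance. This is a made-up sketch rather than code from a real project: OrdersService, its methods and the in-memory log are hypothetical. Note how the same timing code is repeated in every method and drowns the actual logic.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical service where every method repeats the same timing boilerplate.
public sealed class OrdersService
{
    private readonly List<string> _log = new List<string>();

    public IReadOnlyList<string> Log => _log;

    public int CountOrders()
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return 42; // the actual logic is just this line
        }
        finally
        {
            _log.Add($"CountOrders took {stopwatch.ElapsedMilliseconds} ms");
        }
    }

    public string FindOrder(int id)
    {
        var stopwatch = Stopwatch.StartNew();
        try
        {
            return $"order-{id}"; // again, one line of real logic
        }
        finally
        {
            _log.Add($"FindOrder took {stopwatch.ElapsedMilliseconds} ms");
        }
    }
}
```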

Of course, being IT professionals, you can quickly come up with a decent solution, find the common denominator, extract the functionality, refactor and so on.

So…how would you do it? One option would be to use the Decorator pattern! It’s a very common pattern and quite easy to understand:

Basically you have a Foo class that you need somewhere that implements a well known interface, and you need to wrap it into some cross-cutting concern. All you have to do is:

  1. create a new container class implementing the same interface
  2. inject the “real” instance
  3. write your new logic where you need
  4. call the method on the inner instance
  5. sit back and enjoy!
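The steps above can be sketched like this. IFoo, Foo and LoggingFooDecorator are placeholder names chosen for illustration, and logging is just one possible cross-cutting concern.

```csharp
using System;

// The well-known interface.
public interface IFoo
{
    string Bar(string input);
}

// The "real" implementation, free of any cross-cutting concern.
public sealed class Foo : IFoo
{
    public string Bar(string input) => input.ToUpperInvariant();
}

// Steps 1 and 2: a wrapper implementing the same interface, with the real instance injected.
public sealed class LoggingFooDecorator : IFoo
{
    private readonly IFoo _inner;
    private readonly Action<string> _log;

    public LoggingFooDecorator(IFoo inner, Action<string> log)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _log = log ?? throw new ArgumentNullException(nameof(log));
    }

    public string Bar(string input)
    {
        // Step 3: the new logic goes around the call.
        _log($"Calling Bar with '{input}'");
        // Step 4: call the method on the inner instance.
        var result = _inner.Bar(input);
        _log($"Bar returned '{result}'");
        return result;
    }
}
```

Callers keep depending on IFoo only, so swapping `new Foo()` for `new LoggingFooDecorator(new Foo(), Console.WriteLine)` requires no change on their side.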

Very handy. Of course it can be quite awkward in case your interface has a lot of methods, but in that case you might have to reconsider your architecture as it is probably breaking SRP.

One option would be moving to CQS/CQRS. In the next post of the series we will see a practical example and discuss why those patterns can be an even more interesting option when combined with Decorators.

Stay tuned!

Testing the boundaries of your Web APIs

How do you make sure an entire piece of software you wrote works? And how would you do that if your system doesn’t have a UI? Well, simply by testing the boundaries of course!

From time to time I like to extract pieces of code from what I’m working on and create small repos just to showcase a single functionality or idea. 

This time I’m putting some effort into TDD on APIs, and after a few refactorings I came up with a nice structure that you can use as a starting skeleton for a simple system. You can find all the sources here on GitHub.

The demo is very simple, just a single controller that stores and provides user details. Nothing fancy. The user model class exposes only three properties: id, full name and email.

A few points worth noting though:

  • the class is immutable. I wrote a bit about the concept here.
  • I’m adopting the Special Case (or Null Object) pattern a lot these days. Hence the NullUser static property.
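A model along those lines might look like the sketch below. The property names follow the description above, but the constructor, the empty values used for NullUser and the small in-memory repository are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Immutable model: all the properties are set in the constructor and never change.
public sealed class User
{
    public User(Guid id, string fullName, string email)
    {
        Id = id;
        FullName = fullName ?? throw new ArgumentNullException(nameof(fullName));
        Email = email ?? throw new ArgumentNullException(nameof(email));
    }

    public Guid Id { get; }
    public string FullName { get; }
    public string Email { get; }

    // Special Case / Null Object: returned instead of null when no user is found.
    public static User NullUser { get; } = new User(Guid.Empty, string.Empty, string.Empty);
}

// Hypothetical in-memory persistence, just enough to show NullUser in action.
public sealed class InMemoryUsersRepository
{
    private readonly Dictionary<Guid, User> _users = new Dictionary<Guid, User>();

    public void Save(User user) => _users[user.Id] = user;

    public User FindById(Guid id) =>
        _users.TryGetValue(id, out var user) ? user : User.NullUser;
}
```

Callers can then compare the result with User.NullUser instead of scattering null checks around.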

Persistence is done in-memory as it’s obviously outside the scope. Moreover, as you can see the Tests project contains only the end-to-end tests, no unit/integration test to cover the persistence layer.

The testing infrastructure is where things get interesting, even though it’s actually fairly straightforward. An xUnit fixture fires up a TestServer and bootstraps the application using (possibly) the same settings as the real system.

A shared WebHostBuilderFactory class is responsible for building the required IWebHostBuilder instance.

That’s it!

Ok, just to be honest, I got the idea from Mark Seemann: he has a very interesting course on Pluralsight named “Outside-in TDD”. If you have the chance, I strongly suggest you watch it.

So, now that we have our nice infrastructure ready, all we have to do is write our tests! Since this is a Web API, these might be considered either “functional” or “end to end”.

Honestly I think it’s simply a naming thing and doesn’t change the fact that probably these should be the first tests you would write.

Why? Because (and Mark explains it really well in his course) you’re ensuring from the consumer’s perspective that your APIs do what they’re expected to do.

You’ll be “testing the boundaries”.

But most importantly, you’re validating your acceptance criteria and making sure your system works. 

Everything else is just an implementation detail.

So what are we testing here? The routes of course! Our API manages users and, since it is RESTful, we’re asserting that all the HTTP verbs do what we expect them to do.

Most of these tests should derive directly from the acceptance criteria written by your Product Owner. In case you don’t have one but instead rely on some (even vague) specifications, a good starting point is simply testing inputs and outputs. 

Happy testing!

© 2019 Davide Guida
