Good Practices for Enterprise Integrations over the Azure Service Bus

Integrating systems over a good messaging infrastructure has many benefits if done right. Some of these are system decoupling and platform agnosticism, independent scalability, high reliability, spike protection, a central place to keep an eye on your integrations, and more, depending on your situation.

Many of these benefits are vast topics on their own. But the point of this article is to go over some of the pitfalls of implementing large-scale integrations over a messaging infrastructure, and guidelines for avoiding them.

These are based on lessons learned from a recent project that involved improving a client’s existing integration between 5+ large systems. While the messaging infrastructure in use was the Azure Service Bus, I believe many of these points apply to other message-based integrations and service buses as well.

1. Don’t roll your own

Don’t re-invent the wheel.

Not rolling your own should really be common sense in all projects, yet you still see it done unnecessarily more often than you should.

Unless your aim is to actually provide a competing solution that works better than what’s available and you have the resources to do so, you will get it wrong, not think of the edge cases, and it will eventually cost you more as your efforts are diverted from delivering business value to maintaining your custom solution.

So stand on the shoulders of giants as much as you can and use tried and tested infrastructure that’s already available.

2. Don’t bypass your messaging infrastructure

This is closely related to the previous point and includes plugging in your own service bus-esque solutions alongside a proper infrastructure like the Azure Service Bus, for example, storing messages in a database table for later processing.

One of the major initial issues I identified with the client’s integration was that, to avoid queues getting full, they were retrieving messages from the queues, storing them in an MSSQL database and processing them from there. This introduced a big flaw in the design, losing out on many of the benefits that come with ASB:

  • Proper locking of a message while it’s being processed, to prevent duplicate processing. Implementing your own locking mechanism on a database table is complex, error-prone and can easily cause deadlocks.

  • Scaling and performance. Even if you get the locking right, it will be nowhere near as performant as a proper message queue in high-traffic scenarios. You’ll have a high-write, high-read situation (a lot of messages coming in and a lot of polling), which is very hard to optimize a SQL database table for.

  • Availability and reliability. A good messaging platform like ASB is highly available and stores your data redundantly.

  • Filtering, dead lettering, automatic expiry and a myriad of other features that come with ASB.

This leads back to pitfall #1 above: to get this right, they would essentially have had to roll their own messaging infrastructure, so they ended up with two problems instead of solving the original one.

Treat the root problem, not the symptom.

‘What was the original problem you were trying to fix?’ ‘Well, I noticed one of the tools I was using had an inefficiency that was wasting my time.’

3. Use shared topics and a pub-sub pattern instead of queues

Imagine the scenario of updating customer details in a CRM system, which should be reflected in the sales system. There are only two systems, so it might be tempting to do something like this:

An exclusive topic/queue for sending a single system’s customer update messages, effectively creating a point-to-point integration.

This might look fine if you’re only starting out with a couple of systems; however, avoiding this route is critical for future maintainability.

Let’s say after some time, the business needs a third legacy system – one that drives many other parts of the business – integrated as well:

Keep it simple?

There are several problems with the above:

  • Single points of failure. If any of the systems is slow or unavailable, the link breaks and data no longer flows to the next system. This actually caused one of the client’s main systems to have data that was weeks old until the faulty system was properly fixed and had processed its backlog of messages.

  • If, some time later, you add a third integration between the legacy system and CRM as in the diagram, you have all of a sudden inadvertently integrated all of the systems by forming a circle, which may not have been your intention. It also becomes much more difficult to reliably stop infinite loops, where one message keeps going round and round between all the systems involved. This can result in invalid data and even more wasted resources. More on having a mechanism to stop infinite message loops below.

If you want to have redundancy, you’ll end up having to integrate each system with all the other ones:

Fancy some spaghetti?

With only 3 systems and one simple message, this is already looking like a mess. As you add more systems and integrations, it becomes exponentially costlier and harder to develop and maintain.

There are many flavours of these point-to-point integrations, and each comes with its own myriad of problems.

A much better approach, which addresses most of the above issues and is easily maintainable, is a shared topic that all participating systems publish and subscribe to:

The publishers don’t have to know about the other systems. Any interested system can subscribe to the topic, effectively getting its own queue.

By using this approach, you further decouple the systems, keep things extensible for plugging in new systems, and keep complexity and maintenance costs linear. If subscribing systems need to know where a message originated, that can be included in a standard message body format or, better yet, in message properties, as those can drive filters and actions.

4. Be mindful of loops – subscribers shouldn’t republish

It’s very easy to cause infinite loops that wreak havoc on all the systems involved. In the above example, it can easily happen if each system receives a message from the topic, updates a customer and, as a result, republishes a CustomerUpdated message back to the topic.

A simple solution to this problem, which works with the above pub-sub model, is that a system’s action based on a message received from the topic shouldn’t cause the same message to be republished back to the topic.
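The rule can be sketched with an in-memory model (all names here are hypothetical, not Service Bus APIs): changes that originate inside a system get published; changes that arrive from the topic are only applied, never republished.

```csharp
using System;
using System.Collections.Generic;

// Illustrative shapes only - a stand-in for a real topic subscriber.
record CustomerUpdated(string CustomerId, string Email);

class SalesSystem
{
    public Dictionary<string, string> Customers { get; } = new();
    public List<CustomerUpdated> Published { get; } = new();

    // A change that originated inside this system: apply it and publish it.
    public void UpdateCustomerLocally(string id, string email)
    {
        Customers[id] = email;
        Published.Add(new CustomerUpdated(id, email));
    }

    // A change received from the shared topic: apply it, but do NOT
    // republish - republishing here is exactly what creates infinite loops.
    public void HandleFromTopic(CustomerUpdated message)
    {
        Customers[message.CustomerId] = message.Email;
    }
}
```

The asymmetry between the two methods is the whole point: only locally originated changes reach the topic.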

5. Include event timestamps in messages

Each message should have a timestamp. Not the one automatically stamped on messages by the service bus, but one your system includes in the message, describing when the event the message represents happened within that system.

This gives other systems a means to avoid acting on outdated data. E.g. it’s possible that by the time a subscribing system receives a CustomerUpdated message, it has already applied a more recent update. A simple timestamp check can prevent the more recent data from being overwritten.

While you’re at it, make that timestamp a UTC one so you don’t run into time zone conversion and datetime arithmetic issues. You only care about the instant when something happened, not the local time, and a UTC datetime simply represents that.
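A minimal sketch of that staleness check (property names are illustrative; EventTimestampUtc is the timestamp your publisher sets, not the broker’s enqueue time):

```csharp
using System;

// The timestamp travels inside the message body, set by the publisher in UTC.
record CustomerUpdated(string CustomerId, string Email, DateTime EventTimestampUtc);

class CustomerRecord
{
    public string Email { get; private set; } = "";
    public DateTime LastUpdatedUtc { get; private set; } = DateTime.MinValue;

    // Returns false when the message is older than the data we already hold.
    public bool TryApply(CustomerUpdated message)
    {
        if (message.EventTimestampUtc <= LastUpdatedUtc)
            return false; // stale update - don't overwrite newer data

        Email = message.Email;
        LastUpdatedUtc = message.EventTimestampUtc;
        return true;
    }
}
```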

6. Messages should be the source of truth

‘And the whole setup is just a trap to capture escaping logicians. None of the doors actually lead out.’

All necessary information from the publishing system should be self-contained in each published message. That means all the information the subscribing systems need to be able to act on the message.

A simple example of this is the CRM system publishing a CustomerUpdated message that only says a specific customer’s email address has been updated, without including the updated email. The Sales system then has to call a CRM API to retrieve the updated email address. However, by the time Sales processes the message and makes that call, the customer may have been updated again in CRM, resulting in data inconsistency.

This major anti-pattern not only introduces flaws and invalid data, but takes away many of the benefits of using a service bus in the first place, such as system decoupling.

So keep and treat messages as the single source of truth. On the other hand, don’t include unnecessary information and keep them lightweight to make the most of your topic space.
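What a self-contained payload might look like, continuing the example (all field names are made up for illustration):

```csharp
using System;
using System.Text.Json;

// A self-contained CustomerUpdated payload: it carries the new value itself,
// so subscribers never have to call back into CRM and risk reading newer
// data than the event describes.
record CustomerUpdated(
    string CustomerId,
    string Email,               // the updated value, not just "email changed"
    DateTime EventTimestampUtc, // when the update happened in the source system
    string SourceSystem);       // lets subscribers identify the origin

static class Publisher
{
    public static string Serialize(CustomerUpdated message) =>
        JsonSerializer.Serialize(message);
}
```

Note it still stays lightweight: only the fields subscribers need, not the entire customer aggregate.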

7. Have naming conventions for topics and subscriptions

Having each topic and subscription follow a naming convention brings many monitoring and troubleshooting upsides:

  • As the integrations within the service bus namespace grow, it remains easy to find the target topic and subscription.

  • You can see what’s going on in your service bus namespace when using an external tool like the Service Bus Explorer.

  • It can potentially drive an automatic monitoring tool that can visualize all the integrations and the flow of data.

An example of such a convention could be:

Topic Name                         Subscription Names
YourCompanyName.CustomerUpdated    YourCompanyName.CRM

Like most conventions, the value mainly comes from there being one, rather than the convention per se.
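A tiny helper can capture whatever convention you settle on; the "{Company}.{Event}" / "{Company}.{System}" format below is just this article’s example:

```csharp
// Centralizing name construction keeps the convention from drifting
// across publishers and subscribers.
static class BusNaming
{
    public static string Topic(string company, string eventName) =>
        $"{company}.{eventName}";

    public static string Subscription(string company, string systemName) =>
        $"{company}.{systemName}";
}
```

So `BusNaming.Topic("YourCompanyName", "CustomerUpdated")` yields `YourCompanyName.CustomerUpdated`, matching the table above.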

8. Have proper logging and monitoring to quickly identify and deal with issues

Integrations, once in place, shouldn’t just disappear into the black box that is your service bus.

It is crucial to have proper monitoring, instrumentation and logging in place to be notified of irregularities as soon as possible. You should have automatic alarm bells for at least the following:

  • A subscription’s incoming rate becomes consistently higher than its processing rate. This is fine for short periods, e.g. traffic spikes; however, if it continues, your topic will eventually get full.

  • A topic’s used space is consistently increasing.

  • A subscription’s dead-letter queue message count is consistently increasing.

The above are closely related and are usually a sign that a subscribing system is unavailable or having stability or performance issues. These issues need to be dealt with ASAP. Depending on how well the publishing systems cope with the topic they’re publishing to being full, it could also lead to problems in those systems. For the client, this actually accumulated a backlog of failed tasks in a publishing system that was limited and hosted externally. It hopelessly retried over and over, unnecessarily using significant resources, which affected other parts of the system, e.g. by delaying important marketing material.

In the meantime, to stop the topic from getting full, you could set up forwarding to redirect messages for the problematic system to a different backup topic, subscription or queue until the problem is resolved. Just don’t move them to a database.

Azure Monitor can help here. Having structured logging of important information to a centralized logging server such as Seq can also be really beneficial, however, be careful not to create a lot of noise by logging the world.
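The alarm conditions above can be expressed as simple rules. This sketch uses made-up metric names; in practice the numbers would come from Azure Monitor metrics, sampled over a window long enough to ignore short spikes:

```csharp
// Two samples of each metric let us detect "consistently increasing".
record SubscriptionMetrics(
    double IncomingPerMinute,
    double ProcessedPerMinute,
    long UsedSpaceBytes,
    long PreviousUsedSpaceBytes,
    long DeadLetterCount,
    long PreviousDeadLetterCount);

static class Alarms
{
    public static bool ShouldAlert(SubscriptionMetrics m) =>
        m.IncomingPerMinute > m.ProcessedPerMinute          // falling behind
        || m.UsedSpaceBytes > m.PreviousUsedSpaceBytes      // topic filling up
        || m.DeadLetterCount > m.PreviousDeadLetterCount;   // DLQ growing
}
```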

9. Don’t forget the dead-letter queue

All queues and topic subscriptions automatically come with a supplementary sub-queue, called the dead-letter queue (DLQ). Messages can end up here for a number of reasons:

  • By the engine itself, as a result of your messaging entity configuration, e.g. automatic dead-lettering of expired messages or filter evaluation errors.

  • By the engine itself, when a message exceeds the maximum delivery count – 10 by default. For example, your subscriber receives the message but fails to process it, maybe due to an exception.

  • At the subscriber’s request, in message processing code.

See here for a complete list and their details.

These dead-letter queues contribute to the overall used space of the topic, so an operator should keep an eye on them using your monitoring tools and empty them regularly – either by resubmitting messages that failed due to transient errors, or by discarding poison messages.

10. Performance is not just a nice-to-have

Having performant message processors ensures your integrations run smoothly and can withstand traffic spikes without your topics getting full. Here are some tips that can increase performance dramatically:

  • Use AMQP and avoid HTTP polling. This should be the default if you’re using the new .NET Standard Service Bus client library; you can read about the benefits here. Also be careful not to use the old library – most of the documentation around still points to that.

  • Use the asynchronous callback-based message pump API. Sending/receiving messages to/from the message broker is an inherently IO based asynchronous operation – you shouldn’t hold up threads for them.

  • Process messages concurrently. Many programmers shy away from writing concurrent code, as it is more complex and error-prone. That’s usually a good instinct; however, the free lunch has long been over, and this is one of the scenarios where concurrent code really shines. Concurrent doesn’t have to mean unnecessary multithreading: if you use the asynchronous APIs and leverage truly async code where possible, even a single thread can accomplish orders of magnitude more. This needs to be accompanied by proper asynchronous synchronization so you don’t, for example, process two messages for the same customer simultaneously.

  • Keep connections to the service bus alive (i.e. the clients) and don’t recreate them as that’s expensive. They are designed to be kept alive and reused.

  • Leverage prefetching, with a message lock duration tweaked based on your message processing times. When using the default lock expiration of 60 seconds, Microsoft recommends a prefetch count of 20 times the maximum processing rate of your subscriber; e.g. if the subscriber processes 10 messages per second, the prefetch count could be 10 x 20 = 200. The idea is to prefetch comfortably below the number of messages your subscriber can process, so their locks haven’t expired by the time it gets around to processing them. You can read more about that here.

  • Use partitioned topics. One of their biggest benefits is that they can grow to 80 GB in size, compared to just 5 GB for unpartitioned ones. That gives you a lot more time to deal with the issues explained above, and you almost never need to worry about them getting full. They also have better throughput, performance and reliability. There’s really no good reason not to use them.

By combining the above, I was able to improve the processing time per message on a critical subscription for the client from ~15 seconds to ~3 seconds and total messages processed per hour from ~240 to ~12000.
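The concurrency and synchronization tips above can be sketched with nothing but the BCL (the real message pump would be the Service Bus client’s callback-based receiver; the limit of 16 is an arbitrary example):

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Processes messages concurrently without dedicating threads to waiting,
// while serializing messages that belong to the same customer.
class ConcurrentProcessor
{
    // Caps overall in-flight messages.
    private readonly SemaphoreSlim _throttle = new SemaphoreSlim(16, 16);

    // One gate per customer, so two messages for the same customer are
    // never processed simultaneously.
    private readonly ConcurrentDictionary<string, SemaphoreSlim> _perCustomer = new();

    public async Task HandleAsync(string customerId, Func<Task> process)
    {
        await _throttle.WaitAsync();
        var gate = _perCustomer.GetOrAdd(customerId, _ => new SemaphoreSlim(1, 1));
        await gate.WaitAsync();
        try
        {
            await process(); // truly async work - no thread is blocked here
        }
        finally
        {
            gate.Release();
            _throttle.Release();
        }
    }
}
```

SemaphoreSlim’s WaitAsync is what makes this asynchronous synchronization rather than thread blocking.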

11. Have message versioning and remain backward compatible

It’s a matter of when, not if, your messages will need to change. To make moving forward easier and seamless, have a message versioning strategy from the start and make those changes in a backward-compatible way.

Prepare for the situation where different subscriptions of a single topic contain different message versions. This allows subscribers to be upgraded at their own pace, without blocking those that can already process the new version.

Old message versions can ultimately be retired when all subscribers are upgraded.
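One possible backward-compatible scheme (field names are illustrative): carry a version marker, and only ever add fields – never rename or remove them – so subscribers still on v1 keep deserializing successfully and simply ignore what they don’t know about.

```csharp
// v1 shape, as older subscribers know it.
record CustomerUpdatedV1(int Version, string CustomerId, string Email);

// v2 adds a field without touching the existing ones.
record CustomerUpdatedV2(
    int Version,
    string CustomerId,
    string Email,          // unchanged from v1
    string? PhoneNumber);  // new in v2; ignored by v1 subscribers
```

With JSON serialization, a v1 subscriber deserializing a v2 payload into its own smaller shape keeps working, because extra properties are ignored by default.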

12. Have idempotent subscribers

Even with other measures in place, such as duplicate detection, it’s very likely that your subscriber will receive the same message twice. For example, this can happen when a subscriber’s lock on a message expires while it’s still processing it, and the message is released back to the subscription queue. So you have to make sure your subscribers process messages idempotently. This can be achieved via various mechanisms depending on your circumstances, but checking against message timestamps or unique message IDs can be a simple, effective measure.
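A minimal sketch of the message-ID variant; in production the "seen" set would live in durable storage and be pruned, but the in-memory version illustrates the check:

```csharp
using System;
using System.Collections.Generic;

// Skips duplicate deliveries keyed on a unique message ID.
class IdempotentHandler
{
    private readonly HashSet<string> _seen = new();

    public bool TryProcess(string messageId, Action process)
    {
        if (!_seen.Add(messageId))
            return false; // duplicate delivery - already handled, do nothing

        process();
        return true;
    }
}
```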


In conclusion: service buses, like any other tool in our toolbox, can be misused or abused, perhaps more easily than some others, and that’s led many to hate them. But in the right situation they are very powerful. Following the above guidelines should hopefully help you build a solid foundation for large-scale integrations over service buses and avoid huge improvement costs later. Because changing upstream design always costs exponentially more downstream.

The Dangerous EF Core Feature: Automatic Client Evaluation

Recently when going through our shiny new ASP.NET Core application’s Seq logs, I accidentally spotted a few entries like this:

The LINQ expression ‘foo’ could not be translated and will be evaluated locally.


I dug around in the code and found the responsible queries. Some of them were quite complex with many joins and groupings, while some of the other ones were very simple stuff, like someStringField.Contains("bar", StringComparison.OrdinalIgnoreCase).

You may have spotted the problem right away. StringComparison.OrdinalIgnoreCase is a .NET concept. It doesn’t translate to SQL and you can’t blame EF Core for that. As a matter of fact if you run the same query in full-blown Entity Framework, you’ll get a NotSupportedException telling you it can’t convert your perdicate to a SQL expression. And that’s a good thing! because it prompts you to review your query, and if you really want to have a predicate in your query that only makes sense in the CLR world, you can decide if doing a ToList() at some point in your IQueryable to pull down the results of your query from the database into memory makes sense. And when you have your results in memory you can continue with your query shenanigans without having to worry if the rest of your query is translatable to SQL. Or you may decide that you don’t need that StringComparison.OrdinalIgnoreCase, because your database collation is case-insensitive anyway.

The point is, by default, you are in control and can make explicit decisions based on your circumstance.

Not anymore in Entity Framework Core. Apparently there’s this concept of mixed client/server evaluation in EF Core. What it effectively does is: if you put something in an IQueryable LINQ query that can’t be translated to SQL (or to a query in your underlying database), it tries to magically make it work for you by taking the untranslatable bits out and running them locally. And it’s enabled by default!

That’s a huge and extremely dangerous behavior change compared to the full Entity Framework. Consider this familiar entity:

public class Person
{
    public string FirstName { get; private set; }
    public string LastName { get; private set; }
    public List<Address> Addresses { get; private set; }
    public List<Order> Orders { get; private set; }
    // ... more properties and child collections
}

And imagine someone writing an EF query like this:

var results = dbContext.Persons
    .Include(p => p.Addresses)
    .Include(p => p.Orders)
    .Where(p => p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase))
    .ToList();

EF Core can’t translate p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase) into a query that can be run on the database, so what it does is pull the whole Persons table, as well as the whole Orders and Addresses tables, down from the database into memory and then run the .Where(p => p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase)) filter on the results. 🤦

It’s not hard to imagine the performance repercussions of that on any real-sized application with a significant number of users. It can easily bring applications to their knees. Compare that to the full Entity Framework, where you get an exception. The fact that this hugely different behavior is the default is mind-blowing!

You could argue that it’s the developer’s fault for including something like StringComparison.OrdinalIgnoreCase in the IQueryable predicate. While that’s true, you can’t blame people newer to EF for expecting something like that to magically work – especially since it does! And for more senior devs, it can easily sneak past unnoticed when context-switching between LINQ-to-entities and LINQ-to-objects.

Also, untranslatable things like StringComparison.OrdinalIgnoreCase in your query aren’t the only culprit that results in client evaluation. If you have too many joins or groupings, the query can become too complex for EF Core and make it fall back to local evaluation.

So if your EF Core queries are magically working without a hitch, you probably want to keep an eye on your logs, as you may be getting client evaluation. Or better yet, if you don’t want that additional cognitive overhead, disable it altogether and make it throw like the good old Entity Framework:

/* Startup.cs */

public void ConfigureServices(IServiceCollection services)
{
    services.AddDbContext<YourContext>(optionsBuilder =>
        optionsBuilder.ConfigureWarnings(warnings =>
            warnings.Throw(RelationalEventId.QueryClientEvaluationWarning)));
}

/* Or in your context's OnConfiguring method */

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
    optionsBuilder.ConfigureWarnings(warnings =>
        warnings.Throw(RelationalEventId.QueryClientEvaluationWarning));
}

I believe languages and frameworks should always make it harder to make mistakes, especially ones like this with potentially devastating consequences, so I strongly disagree with this new default behavior and will be keeping it disabled.

Create Your Own Visual Studio Code Snippets

Visual Studio Code Snippets are awesome productivity enhancers; I can only imagine how many millions of keystrokes I’ve saved over the years by making a habit out of using them.

Although snippets for a lot of the common code you use daily might not be available out of the box, adding them yourself is very simple.

Here are some samples for creating Console.ReadLine & Console.ReadKey snippets:


<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Title>Console.ReadLine</Title>
      <Shortcut>cr</Shortcut>
      <Description>Code snippet for Console.ReadLine</Description>
    </Header>
    <Snippet>
      <Code Language="csharp">
        <![CDATA[Console.ReadLine();]]>
      </Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>

<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Title>Console.ReadKey</Title>
      <Shortcut>ck</Shortcut>
      <Description>Code snippet for Console.ReadKey</Description>
    </Header>
    <Snippet>
      <Code Language="csharp">
        <![CDATA[Console.ReadKey();]]>
      </Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>

You can save the above as .snippet files and then import them via Tools > Code Snippet Manager... > Import... and use them by typing cr or ck and hitting TAB twice.

So go ahead and create handy ones for things you find yourself typing all the time. You can refer to this MSDN article for more details.

Allowing Only One Instance of a C# Application to Run

Making a singleton application, i.e. preventing users from opening multiple instances of your app, is a common requirement which can be easily implemented using a Mutex.

A Mutex is similar to a C# lock, except it can work across multiple processes, i.e. it is a computer-wide lock. Its name comes from the fact that it is useful in coordinating mutually exclusive access to a shared resource.

Let’s take a simple Console application as an example:

class Program
{
    static void Main()
    {
        // main application entry point
        Console.WriteLine("Hello World!");
    }
}

Using a Mutex, we can change the above code to allow only a single instance to print Hello World! and the subsequent instances to exit immediately:

static void Main()
{
    // Named Mutexes are available computer-wide. Use a unique name.
    using (var mutex = new Mutex(false, "SingletonApp"))
    {
        // TimeSpan.Zero to test the mutex's signal state and
        // return immediately without blocking
        bool isAnotherInstanceOpen = !mutex.WaitOne(TimeSpan.Zero);
        if (isAnotherInstanceOpen)
        {
            Console.WriteLine("Only one instance of this app is allowed.");
            return;
        }

        // main application entry point
        Console.WriteLine("Hello World!");

        mutex.ReleaseMutex();
    }
}

Note that we’ve passed false for the initiallyOwned parameter, because we want to create the mutex in a signaled/ownerless state. The subsequent WaitOne call will try to put the mutex in a non-signaled/owned state.

Once an instance of the application is running, the SingletonApp Mutex will be owned by that instance, causing further WaitOne calls to return false until the running instance relinquishes ownership of the mutex by calling ReleaseMutex.

Keep in mind that only one thread can own a Mutex object at a time, and just as with the lock statement, it can be released only from the same thread that has obtained it.

Modeling PowerToys for Visual Studio 2013

I rarely use tools that generate code; however, one that has become a fixed asset of my programming toolbox is Visual Studio’s class designer. It’s a great productivity tool that helps you quickly visualize and understand the class structure of projects, classes and class members. It’s also great for presenting a code base that doesn’t come with a UI, e.g. a class library.

It also lets you quickly wire-frame your classes when doing top-down design, but it’s limited in that respect; for example, it doesn’t support auto-implemented properties, which I almost always use in my types – instead it blurts out a verbose property declaration along with a backing field. Fortunately, almost all of these issues are fixed by the great Modeling PowerToys Visual Studio add-in by Lie, which turns Class Designer into an amazing tool.

When I finally upgraded from my beloved Visual Studio 2010 to 2013, in the midst of all the horrors of VS 2013 I also found out that this add-in had not been updated to support later versions and that the original author seems to be inactive. So I upgraded it myself and decided to put it here for fellow developers who also happen to like the tool:

Download Link

To install the add-in, extract the ZIP file contents to %USERPROFILE%\Documents\Visual Studio 2013\Addins and restart Visual Studio.

Please note that I’m not the author of this add-in, I merely upgraded it for VS 2013.

Git: Commit with a UTC Timestamp and Ignore the Local Timezone

When you git commit, Git automatically uses your system’s local timezone by default. So, for example, if you’re collaborating on a project from Sydney (UTC+10) and make a commit, it will look like this in git log for everyone:

commit c00c61e48a5s69db5ee4976h825b521ha5bx9f5d
Author: Your Name <>
Date:   Sun Sep 28 11:00:00 2014 +1000 # <-- your local time and timezone offset

Commit message here

If you find it rather unnecessary to include your local timezone in your commits and would like to commit in UTC time, for example, you have two options:

  1. Changing your computer’s timezone before doing a commit.
  2. Using the --date commit option to override the author date used in the commit, like this:

     git commit --date=2014-09-28T01:00:00+0000

The first option is very inconvenient – changing the system’s timezone back and forth between UTC and local for commits is just silly – so let’s forget about that. The second option, however, has potential, but manually inputting the current UTC time for each commit is cumbersome. We’re programmers, there’s gotta be a better way…

Bash commands and aliases to the rescue! We can use the date command to output the UTC time in an ISO 8601 format, which is accepted by git commit’s --date option:

git commit --date="$(date --utc +%Y-%m-%dT%H:%M:%S%z)"

We can then alias it to a convenient git command like utccommit:

git config --global alias.utccommit '!git commit --date="$(date --utc +%Y-%m-%dT%H:%M:%S%z)"'

Now whenever we want to commit with a UTC timestamp, we can just:

git utccommit -m "Hey! I'm committing with a UTC timestamp!"