Code Reviews: An Accurate Window into a Team's Health

Pull requests and code reviews are among the most powerful software quality tools we have in our tool-belt.

Even when flying (han) solo on a project, I tend to create feature branches and PRs and review my own code after a short break; I usually find a few things I can improve when doing that, with the added benefit of being able to switch easily to other work if needed.

In teams, the benefits are many and more significant: an extra pair of eyes is always better at spotting improvement opportunities and catching mistakes; reviews consistently raise the quality of the project in all areas to the best of the whole team's knowledge rather than any individual's; and they reduce project risk by sharing knowledge and breaking down information silos while upskilling everyone. To a much, much lesser degree of importance, they guard the project against YOLO pushes of crap code – that's more relevant for public, open-source projects; if that's your main reason for using pull requests and code reviews, it's a sign of bigger root problems in the team.

However, this post isn’t about those more commonly known benefits; it’s about how code reviews are one of the best places to observe and gauge a team’s health. I recently thought about this as I was reading the excellent book called “The Five Dysfunctions of a Team”, recommended to me by my dear colleague Mehdi Khalili.

In the book, Patrick Lencioni talks about the following five core team dysfunctions which build on top of each other. I’ll go through them and note how they can be observed in PRs and code reviews.

The Five Dysfunctions of a Team

Absence of trust

The most fundamental dysfunction in a team is a lack of trust, which is essentially team members not being willing to be vulnerable and open with each other. This results in a huge waste of time and energy, as team members pour both into defensive behaviours and become reluctant to ask each other for help.

Putting your code up in front of colleagues to review and give feedback on is essentially an act of vulnerability and people opposing the idea of code reviews is a strong sign of this dysfunction.

Tackling this issue has to start with the leaders. Team and tech leads need to keep their minds open and look for opportunities to learn from and ask for help from junior people. You also have to cultivate a safe culture where mistakes aren’t frowned upon, but are seen as an opportunity to learn and improve. Otherwise, they keep happening and growing silently until they lead to big failures.

Fear of conflict

When there is trust, conflict becomes nothing but the pursuit of truth, an attempt to find the best possible answer. – Patrick Lencioni

Healthy, constructive conflict and discussion within a team is a good thing (tension isn't; don't confuse the two). Diversity of thought leads to improvements and better outcomes.

If there is trust, but code reviews are going through with little or no comments or discussions, that’s a strong sign that the team has a fear of conflict and wants to maintain an artificial harmony. At this point, pull requests and code reviews become merely red tape.

To tackle this, encourage everyone to give feedback or even ask questions. Encourage opinions, but have an open mind and hold them loosely. However, be careful to not fall into the trap of bike-shedding on trivial issues.

Having code review etiquette also definitely helps. Be nice and show how something could be improved rather than just criticising. Call out the positive things you like. Don’t ever make feedback personal; it should always be about the code and don’t forget that people are not their code!

Lack of commitment

This happens when people haven’t had a chance to provide their input or be heard. When people don’t weigh in on something, they don’t buy in to it. This is not about seeking consensus (although it’s good if you have that); it’s more about making sure people are heard.

If there is no fear of conflict, but people are reluctant to perform code reviews or give their vote to them, that is a sign of either this dysfunction or the next one.

To solve this, team members (including seniors) should regularly bounce ideas off each other and ask for other team members’ opinions about how to approach a particular problem, even if they already have in mind what they think is a solid approach.

Avoidance of accountability

In a well-functioning team, it’s the responsibility of each team member to hold one another accountable, and accept it when others hold them accountable. Otherwise, they’ll end up with low standards.

This happens when people have a lack of clarity on the work being done. They don’t understand it properly or they don’t know why it’s being done.

The signs of this dysfunction in code reviews are very similar to the previous one's: people are either reluctant to perform code reviews, or they make trivial comments but then leave them in a pending state without approving or voting otherwise.

To solve this, have clear communication around expectations and ensure the team understands why something is being done. Abiding by the user story template of “As a [persona], I want [feature], so that [outcome]” and having clear and detailed acceptance criteria helps here; if you have trouble with these, that’s usually because of poor connection and communication with the business or end-users.

Inattention to results

This is often because individual contributions and goals are prioritized over the team’s goal and collective business outcomes, which is usually a result of big egos, politics and individual KPIs.

If the previous dysfunctions don’t exist, but you find that people have to be constantly pushed and nudged to review pending pull requests, and it usually takes a long time before they happen, that’s a strong sign of this dysfunction, as people are prioritizing their WIP individual contributions over shipping work that’s nearly done.

To tackle this, you need to move towards a team-oriented culture and thinking. The whole of a well-functioning team is greater than the sum of its parts.

Goals, recognitions and rewards should primarily be for teams rather than individuals. However, it’s crucial that teams are small enough that each individual feels the effects of their contributions towards the collective outcome, otherwise, the motivating effects are lost.

 

Note that code reviews don’t fix these dysfunctions by themselves; but they are a great place where the dysfunctions are exhibited, letting you observe and identify them – a critical first step in fixing them. This, in addition to the benefits mentioned in the beginning, makes code reviews the single highest impact practice you can have to help you improve your product and your team, both from a technical and people perspective.

Integration Testing in Azure Functions with Dependency Injection

I’m a big proponent of integration and functional tests (or “subcutaneous tests” if there’s a UI). Done efficiently and sparingly, they give you the biggest bang for your buck when it comes to confidence in the overall well-being of the majority of your system.

Dependency injection is ubiquitous these days, and ASP.NET Core MVC’s seamless support for integration testing via the Microsoft.AspNetCore.Mvc.Testing NuGet package has made this kind of testing simple when using dependency injection. However, I found that when dealing with Azure Function Apps, a similar setup is not as effortless as installing a package and using a WebApplicationFactory, especially since Azure Function projects aren’t set up for dependency injection out of the box. Fortunately, after a bit of digging under the hood of Microsoft.AspNetCore.Mvc.Testing and Microsoft.AspNetCore.TestHost, I’ve been able to create a similar testing experience which I’ll go through below.

The setup isn’t identical. While Microsoft.AspNetCore.Mvc.Testing bootstraps an in-memory test server that you can communicate with using an HTTP client for functional testing, here we’ll be operating at a level below that by just bootstrapping a test IHost. The goal here is to set up integration tests while leveraging existing dependency injection setup and settings without duplication.

Sample Function App With Dependency Injection

Let’s assume the below Azure Function app, with one HTTP endpoint that responds with the answer to life, the universe and everything:

public class SuperFunction
{
	readonly IHitchhikerGuideToTheGalaxy _hitchhikerGuideToTheGalaxy = new HitchhikerGuideToTheGalaxy();

	[FunctionName(nameof(AnswerToLifeTheUniverseAndEverything))]
	public IActionResult AnswerToLifeTheUniverseAndEverything(
		[HttpTrigger(AuthorizationLevel.Anonymous)] HttpRequest req)
	{
		return new OkObjectResult(_hitchhikerGuideToTheGalaxy.GetTheAnswerToLifeTheUniverseAndEverything());
	}
}

public interface IHitchhikerGuideToTheGalaxy
{
	int GetTheAnswerToLifeTheUniverseAndEverything();
}

public class HitchhikerGuideToTheGalaxy : IHitchhikerGuideToTheGalaxy
{
	readonly ISuperComputer _superComputer = new SuperComputer();

	public int GetTheAnswerToLifeTheUniverseAndEverything() => _superComputer.CalculateTheAnswerToLifeTheUniverseAndEverything();
}

public interface ISuperComputer
{
	int CalculateTheAnswerToLifeTheUniverseAndEverything();
}

public class SuperComputer : ISuperComputer
{
	public int CalculateTheAnswerToLifeTheUniverseAndEverything() => 42;
}

But we all know that new is glue for dependencies, so let's inject them. Dependency injection has been supported in Azure Functions since version 2.x. To register IHitchhikerGuideToTheGalaxy and ISuperComputer as services and inject them, we'll create a Startup class, similar to what we have in ASP.NET Core applications:

using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.DependencyInjection;
using SuperFunctionApp;

[assembly: FunctionsStartup(typeof(Startup))]

namespace SuperFunctionApp
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            builder.Services.AddTransient<IHitchhikerGuideToTheGalaxy, HitchhikerGuideToTheGalaxy>();
            builder.Services.AddTransient<ISuperComputer, SuperComputer>();
        }
    }
}

It has to inherit from FunctionsStartup and you also need to add a FunctionsStartupAttribute assembly attribute that points to the Startup class. Both of these types exist in the Microsoft.Azure.Functions.Extensions NuGet package which you need to install.

Now we can have some dependency injection goodness with the Function Host automatically injecting the dependencies for us:

public class SuperFunction
{
	readonly IHitchhikerGuideToTheGalaxy _hitchhikerGuideToTheGalaxy;

	public SuperFunction(IHitchhikerGuideToTheGalaxy hitchhikerGuideToTheGalaxy)
	{
		_hitchhikerGuideToTheGalaxy = hitchhikerGuideToTheGalaxy;
	}

	[FunctionName(nameof(AnswerToLifeTheUniverseAndEverything))]
	public IActionResult AnswerToLifeTheUniverseAndEverything(
		[HttpTrigger(AuthorizationLevel.Anonymous)] HttpRequest req)
	{
		return new OkObjectResult(_hitchhikerGuideToTheGalaxy.GetTheAnswerToLifeTheUniverseAndEverything());
	}
}

public interface IHitchhikerGuideToTheGalaxy
{
	int GetTheAnswerToLifeTheUniverseAndEverything();
}

public class HitchhikerGuideToTheGalaxy : IHitchhikerGuideToTheGalaxy
{
	readonly ISuperComputer _superComputer;

	public HitchhikerGuideToTheGalaxy(ISuperComputer superComputer)
	{
		_superComputer = superComputer;
	}

	public int GetTheAnswerToLifeTheUniverseAndEverything() => _superComputer.CalculateTheAnswerToLifeTheUniverseAndEverything();
}

public interface ISuperComputer
{
	int CalculateTheAnswerToLifeTheUniverseAndEverything();
}

public class SuperComputer : ISuperComputer
{
	public int CalculateTheAnswerToLifeTheUniverseAndEverything() => 42;
}

But how do we now properly test that our function correctly returns what we know is the obvious answer to life, the universe and everything? Testing the various bits separately would cross into the realm of unit testing, while duplicating our dependency injection setup is far from ideal; we want to keep and use all our existing DI configuration from our Startup in our tests.

Setting Up An Azure Function Test Host

The secret to using our existing Startup configuration and DI for integration testing lies in bootstrapping a .NET Core Generic Host via the ConfigureWebJobs extension method:

var startup = new Startup();
var host = new HostBuilder()
	.ConfigureWebJobs(startup.Configure)
	.Build();

The use of ConfigureWebJobs might seem a bit strange, but since Azure Functions are built on top of the Web Jobs SDK, there is a significant shared and compatible API surface. Here we are using the overload that takes an Action<IWebJobsBuilder> configure argument. Our Startup class inherits from FunctionsStartup, which implements IWebJobsStartup, providing a compatible Configure method that ends up calling our Configure(IFunctionsHostBuilder builder) method!

Putting the above together, we can write an integration test that initializes our HTTP function inside a test host, reusing all the dependency injection and settings from our "real" code, and verifies that it correctly returns the true answer to life, the universe and everything:

public class SuperFunctionTests
{
	readonly SuperFunction _sut;

	public SuperFunctionTests()
	{
		var startup = new Startup();
		var host = new HostBuilder()
			.ConfigureWebJobs(startup.Configure)
			.Build();

		_sut = new SuperFunction(host.Services.GetRequiredService<IHitchhikerGuideToTheGalaxy>());
	}

	[Fact]
	public void Test()
	{
		// arrange
		var req = new DefaultHttpRequest(new DefaultHttpContext());

		// act
		var result = (OkObjectResult)_sut.AnswerToLifeTheUniverseAndEverything(req);

		// assert
		Assert.Equal(42, result.Value);
	}
}
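
If a test needs to swap one of the registered services for a test double, one hedged variation (the FakeSuperComputer type below is hypothetical, made up just for this example) is to layer an extra ConfigureServices call on top of the same Startup; with the default container, the registration added last wins when the service is resolved:

public class FakeSuperComputer : ISuperComputer
{
	public int CalculateTheAnswerToLifeTheUniverseAndEverything() => 43;
}

// Same bootstrap as before, with one extra registration that overrides
// the real SuperComputer for this test host only.
var host = new HostBuilder()
	.ConfigureWebJobs(new Startup().Configure)
	.ConfigureServices(services =>
		services.AddSingleton<ISuperComputer>(new FakeSuperComputer()))
	.Build();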

If you’d like to quickly try it for yourself, I have shared my preferred way to set up Azure Function projects as a .NET Core Template on GitHub. The template includes a test set up with the above approach, plus using the familiar appsettings.json file for settings instead of local.settings.json and Environment Variables, to help with ease and simplicity of deployment, as well as a build defined via Azure YAML Pipelines.

How to Version Your .NET NuGet Packages, Libraries and Assemblies + Azure YAML Pipelines Example using .NET Core CLI

I often see arbitrary patterns used for versioning packages and assemblies, especially for internal projects, with the only common goal being that the versions are increased with each release. There is rarely a distinction between NuGet package Version, AssemblyVersion, FileVersion and InformationalVersion, and they're commonly all set to the same value, if set at all. That's not to blame anyone; having all these different ways to set a version is darn confusing.

However, if you plan to release a public NuGet package, or an internal package that could be heavily used, you'll make your life and your users' lives a lot easier, and avoid creating a "dependency hell", if you distinguish between the above versions and follow a meaningful versioning strategy like Semantic Versioning.

If you haven’t heard about Semantic Versioning, here’s the TL;DR:

  • It solves the problems of version lock, the inability to upgrade a package with minor changes and bug fixes without having to release new versions of every dependent package, and version promiscuity, the inability to safely upgrade a package with breaking changes without automatically breaking dependent packages.

  • The versioning scheme is {major}.{minor}.{patch}-{tag}+{buildmetadata}. {major}, {minor} and {patch} are numeric, while {tag} and {buildmetadata} can be alphanumeric.

  • Bug fixes increment the {patch} number.

  • Backward compatible additions/changes increment the {minor} number.

  • Only breaking changes increment the {major} number. You can refer to CoreFX’s rules for classifying breaking changes as a guideline.

  • The -{tag} and +{buildmetadata} parts are optional. -{tag} is used to denote pre-release unstable versions and so is used less frequently, although I recommend always including the +{buildmetadata} to provide valuable information – more on this below.

So how do you follow and apply that in the .NET ecosystem?

I’ve recently released a NuGet package, called EF.Auditor – a simple and lightweight auditing library for Entity Framework Core that supports Domain Driven Design. As part of releasing this package, I refreshed my knowledge around versioning, the multitude of version attributes and researched versioning strategies used by Microsoft and popular packages to arrive at the sweet spot of versioning packages and assemblies. I’ll go through the highlights below. If you’d like to skip the details and just see the versioning strategy, jump to the “Putting it all together” part.

What the heck are NuGet package Version, AssemblyVersion, FileVersion and InformationalVersion?

If you’re confused by all the different versions, you’re not alone. However, while they might all look the same, they have quite different characteristics and effects.

NuGet package version

This is the version in your package’s nuspec file in the <version> element.

The NuGet package version is displayed on NuGet.org, the Visual Studio NuGet package manager and is the version number users will commonly see, and they’ll refer to it when they talk about the version of a library they’re using. The NuGet package version is used by NuGet and has no effect on runtime behavior.

Since this version is the most visible version to developers, it's a good idea to update it using Semantic Versioning. SemVer indicates the significance of changes between releases and helps developers make an informed decision when choosing what version to use. For example, going from 1.0 to 2.0 indicates that there are potentially breaking changes.

However, since Visual Studio < 2017 and NuGet client < 4.3.0 do not support SemVer 2.0.0, I suggest sticking with SemVer 1.0.0 here for maximum compatibility.

AssemblyVersion

Commonly defined in the csproj file, or AssemblyInfo.cs in old project formats, the assembly version is what the CLR uses at runtime to select which version of an assembly to load.

The .NET Framework CLR demands an exact match to load a strong-named assembly. For example, if Library1, Version=1.0.0.0 was compiled with a reference to Newtonsoft.Json, Version=11.0.0.0, the .NET Framework will only load that exact version 11.0.0.0. To load a different version at runtime, a binding redirect must be added to the .NET application's config file.

Strong naming combined with assembly version enables strict assembly version loading. While strong naming a library has a number of benefits, it often results in runtime exceptions complaining that an assembly can't be found, which then require binding redirects to fix. .NET Core assembly loading has been relaxed: the .NET Core CLR will automatically load assemblies with a higher version at runtime, but the .NET Framework CLR needs the entire AssemblyVersion to be an exact match for an assembly to be loaded.

For maximum binary compatibility with the .NET Framework CLR and strong-naming scenarios, only sync the major version of this attribute to match the major part of your semantic version and leave the other parts as 0. This is the strategy followed by .NET Framework BCL, CoreFX and many other popular packages like Entity Framework Core and Newtonsoft.Json.

Providing a * in place of the numbers is likely one of the worst things you can do, as it makes the compiler set the numbers based on some arcane rules, and you'd need a calculator to get any valuable information out of them. It would also cause an error in .NET Core projects, as their builds are deterministic by default.

Note that although Assembly Version follows a similar {Major}.{Minor}.{BuildNumber}.{Revision} schema, it has significant limitations so you won’t be able to apply SemVer here anyway:

  • It only supports numbers for all parts.
  • Each number can only go up to 65534, so you wouldn’t be able to, for example, have a useful full date such as 190920 included in your revision. Using an auto-incremented build number from your CI here would also get you only so far.

FileVersion

Commonly defined in the csproj file, or AssemblyInfo.cs in old project formats, the assembly file version is used to display a file version in Windows and has no effect on runtime behavior. Setting this version is optional. It’s visible in the File Properties dialog in Windows Explorer:

File Version

It’s also the version displayed in the Windows Explorer tooltip when you hover over an assembly:

File Version

File Version follows the same versioning schema as Assembly Version and has the same limitations, so you can’t use this one for proper SemVer versioning either. But for consistency and preventing user confusion, I recommend keeping the {Major}.{Minor}.{BuildNumber} parts of this attribute in sync with the {major}.{minor}.{patch} from your semantic version. Changing this attribute frequently won’t create the same issues as changing Assembly Version explained above, as this attribute is never used by .NET.

InformationalVersion

Commonly defined in the csproj file, or AssemblyInfo.cs in old project formats, the assembly informational version is similar to File Version in that setting it is optional, it has no effect on runtime behavior and it shows up in File Properties dialog in Windows Explorer as Product version:

Product Version

The major difference is this version can be any string and doesn’t have the same limitations as Assembly Version and File Version, which makes it perfect for being set to your full SemVer 2.0.0 version.

The InformationalVersion assembly attribute can be read at runtime and used to identify the exact version of your package or application, for example in logging:

string semanticVersion = assembly.GetCustomAttribute<AssemblyInformationalVersionAttribute>().InformationalVersion;
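
For example, here's a minimal sketch of logging it at startup, assuming the attribute is present on the entry assembly (which it is whenever InformationalVersion was set at build time):

using System;
using System.Reflection;

class VersionLogger
{
	static void Main()
	{
		// Read the InformationalVersion baked into the entry assembly at build time.
		var attribute = Assembly.GetEntryAssembly()
			?.GetCustomAttribute<AssemblyInformationalVersionAttribute>();

		Console.WriteLine($"Running version: {attribute?.InformationalVersion ?? "unknown"}");
	}
}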

Putting it all together

Combining the above information and after trying many different approaches through the years, I’ve found the below strategy to be at the sweet spot of versioning NuGet packages and .NET assemblies:

  • Set your NuGet package version to the {major}.{minor}.{patch}-{tag} part of your Semantic Version
  • Follow the SemVer specs for bumping the {major}, {minor} and {patch} parts
  • Set your AssemblyVersion to {major}.0.0.0
  • Set your FileVersion to {major}.{minor}.{patch}.0
  • Set your InformationalVersion to the full Semantic Version of {major}.{minor}.{patch}-{tag}+{buildmetadata}
    • For {buildmetadata}, use the pattern of {BuildDate:yyyyMMdd}{BuildRevision}.{commitId}. This provides you with super-useful information at a glance:
      • When the build was done
      • The exact commit in your source control history the build was from, which helps you quickly check out that version for troubleshooting
  • Only include the -{tag} part if you’re releasing an unstable pre-release version, e.g. -alpha

How you implement the strategy would depend on your platform, tooling and build process. While you could go with tools like GitVersion, my preferred method these days is a simple source-controlled way of managing the version through Azure YAML Pipelines, close to where the rest of my build definition is.

Azure YAML Pipelines example

Here’s EF.Auditor’s versioning done through Azure YAML Pipelines:

name: $(Date:yyyyMMdd)$(Rev:r)

pool:
  vmImage: 'Ubuntu 16.04'

variables:
  buildConfiguration: 'Release'
  versionMajor: 1
  versionMinor: 1
  versionPatch: 5

steps:
- powershell: |
    $mainVersion = "$(versionMajor).$(versionMinor).$(versionPatch)"
    $commitId = "$(Build.SourceVersion)".Substring(0,7)
    Write-Host "##vso[task.setvariable variable=mainVersion]$mainVersion"
    Write-Host "##vso[task.setvariable variable=semanticVersion]$mainVersion+$(Build.BuildNumber).$commitId"
    Write-Host "##vso[task.setvariable variable=assemblyVersion]$(versionMajor).0.0"
  name: SetVersionVariables

- powershell: |
    dotnet pack --configuration Release /p:AssemblyVersion='$(assemblyVersion)' /p:FileVersion='$(mainVersion)' /p:InformationalVersion='$(semanticVersion)' /p:Version='$(mainVersion)' --output '$(Build.ArtifactStagingDirectory)'
  name: Pack

- task: PublishBuildArtifacts@1

I’ve simplified the pipeline to the bits relevant to versioning; you can see the full pipeline definition on GitHub.

Common C# async and await misconceptions

.NET programmers have traditionally shied away from writing asynchronous code, and mostly for good reason. Writing asynchronous code used to be arduous work, and the result was difficult to reason about, debug and maintain. That was exacerbated when you threw concurrency into the mix – parallel or asynchronous – as that's harder for our brains to consciously follow, optimised/trained as they are for non-concurrent, sequential logic.

The compiler magic of async/await since C# 5.0 and the Task-based Asynchronous Pattern has immensely improved that experience, mainly by abstracting asynchronous code to read like synchronous code. In fact it’s been such a big hit that it’s spread and been adopted by many other popular languages such as JavaScript, Dart, Python, Scala and C++.

This has resulted in a trend of more and more asynchronous code popping up in our code-bases. That's mostly a good thing, because leveraging asynchrony in the right place can lead to significant performance and scalability improvements. However, with great magic comes great misconception. I'll go over some of the most common ones I keep encountering below.

async/await means multi-threading

“The method isn’t async, it’s single-threaded.” – too many devs

I continue to hear and see many varieties of the above statement, and it's the most common misconception in my experience. The overloading of terms like asynchronous, concurrent and parallel, and their interchangeable use in the industry, can take some of the blame, and maybe some of it falls on Task living under the System.Threading.Tasks namespace.

async doesn’t magically make your code asynchronous. It don’t spin up worker threads behind your back either. In fact, it doesn’t really do anything other than enable the use of the await keyword in a method. It was only introduced to not break existing codebases that had used await as an identifier and to make await usage heuristics simpler. That’s it!

await is a bit more complicated and is quite similar to how the yield keyword works, in that it yields flow of control back to the caller and creates a state-machine by causing the compiler to register the rest of the async method as a continuation. That continuation is run whenever the awaited task completes. The compiler transforms the below:

var result = await task;
statements

Into essentially:

var awaiter = task.GetAwaiter();
awaiter.OnCompleted(() =>
{
	var result = awaiter.GetResult();
	statements
});

That’s the gist. The compiler emits more complex code, however, none of that involves spinning worker threads. It also doesn’t magically make your task’s implementation asynchronous either. In fact, if the implementation is synchronous it all runs synchronously, but slooower.

You can’t have concurrency with purely asynchronous code

So I just said async doesn’t require multiple threads. That means unless you mix in multi-threading you can’t have concurrency right?

Wrong.

Truly asynchronous operations can be performed concurrently by one thread. Heresy! How can one thread perform multiple operations at the same time?

That’s possible because IO operations leverage IOCP ports at the device driver level. The point of IO completion ports is you can bind thousands of IO handles to a single one, effectively using a single thread to drive thousands of asynchronous operations concurrently.

Take this sample code that makes 5 HTTP calls to a slow API one by one:

async Task DoSequentialAsync()
{
	// A made-up slow endpoint that responds after ~2 seconds.
	var slowApiUrl = new Uri("http://slow-api-that-responds-after-2-seconds");
	for (int i = 0; i < 5; i++)
	{
		var client = new WebClient();
		await client.DownloadStringTaskAsync(slowApiUrl);
	}
}

For me it takes a little more than 10 seconds to complete, while using a single thread. No surprises here.

Now, what do you think happens if we change the code to fire off all the calls instead of awaiting them one by one, and then await the combination of all those tasks?

async Task DoConcurrentAsync()
{
	// The same made-up slow endpoint as above.
	var slowApiUrl = new Uri("http://slow-api-that-responds-after-2-seconds");
	var tasks = new List<Task>();
	for (int i = 0; i < 5; i++)
	{
		var client = new WebClient();
		tasks.Add(client.DownloadStringTaskAsync(slowApiUrl));
	}

	await Task.WhenAll(tasks);
}

The moment of truth! The whole thing takes a little more than 2 seconds this time, and all with just a single thread. Here's a more complete example that keeps track of the threads used and the exact timing. Go ahead and run it for yourself, may you become a believer.

How that’s possible is that as soon as each call is fired and registered on a completion port, while it is “in flight”, we get the thread back and use it to fire the next one. That’s also the same reason why doing something like the above with tasks that use an EF context to query the database asynchronously will cause EF to blow up with a DbConcurrencyException saying concurrent queries on the same context aren’t supported.

All code should be async by default for better performance

We saw that async/await generates a state machine and a significant amount of extra code, which involves a new type, extra heap allocations, try/catch blocks for exception handling, and marshalling between execution contexts. As you can imagine, all of that comes with some overhead, which inevitably makes asynchronous code run slower than its synchronous counterpart.

If it’s slower, why do we use asynchronous code?

In rich client applications which have a UI thread, or single-threaded applications, it’s for responsiveness. Because you won’t block the main thread and freeze the app while waiting for an asynchronous operation.

It’s a different story in ASP.NET web applications and APIs. ASP.NET request processing is multi-threaded with a thread per request model. Releasing the thread when awaiting an asynchronous request in an API call doesn’t create a more responsive experience for the user because it still has to complete before the user gets the response. The benefit here is scalability and throughput – getting more bang (request handling) for your buck (hardware resources).

Threads are maintained in a thread pool and map to limited system resources. Depending on many different factors, the underlying scheduler might decide to spin more threads and add to this pool which involves some microseconds of overhead per thread. That may seem insignificant, but for an ASP.NET API serving a lot of users, it can quickly add up to significant numbers. By releasing the thread when it’s idling for an IO-bound work to complete, such as calling another API, it can be used to serve another request. It also protects against usage bursts since the scheduler doesn’t suddenly find itself starved of threads to serve new requests.

This means your API can make better use of available system resources and deal with a lot more requests before it falls over. However, you only get these benefits when you leverage truly asynchronous code on inherently asynchronous IO bound operations that have significant latency like reading a file from a slow drive or making a network call.

If async code is used for performing synchronous operations, or even asynchronous operations that are guaranteed to complete in milliseconds, the overhead outweighs the gains and results in worse performance. In one benchmark this detriment was measured to be about 300% in total execution time, 600% in memory and 25% in CPU usage. So give it a second think before making that next method async by default. Horses for courses.

You need to use async/await in the whole call stack

Have you seen code like this before?

// FooController.cs

public class FooController
{
	public async Task<Foo> GetFoo()
	{
		return await _fooService.GetFooAsync();
	}
}


// FooService.cs

public class FooService
{
	public async Task<Foo> GetFooAsync()
	{
		return await _fooContext.GetAsync();
	}
}

You should have asynchronous code in the whole call stack to not hold up threads, but you don’t have to actually use the async/await keywords unless you need to.

FooController.GetFoo and FooService.GetFooAsync just return another task without doing anything else. These “proxy” methods don’t need to use async/await at all and can be simplified to:

// FooController.cs

public class FooController
{
	public Task<Foo> GetFoo()
	{
		return _fooService.GetFooAsync();
	}
}


// FooService.cs

public class FooService
{
	public Task<Foo> GetFooAsync()
	{
		return _fooContext.GetAsync();
	}
}

It is still asynchronous code, but it’s more efficient as it avoids creating an unnecessary state-machine. It is also less code and reads better, so it’s simply cleaner code.

Note that there are some gotchas involved with omitting the async/await keywords in more complex methods. For example, returning a Task from inside a using block without awaiting it exits the block and results in immediate disposal, which might not be what you expect. Exception behaviour is also slightly different when a synchronous part of the method throws and the calling code happens to await the method some time after invoking it, e.g. firing off a bunch of tasks and then awaiting them all with Task.WhenAll: that exception will be thrown synchronously at the call site rather than when the task is awaited. To avoid these issues and their cognitive overhead, I recommend only omitting the async/await keywords in simple proxy methods like the examples above.
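
Here's a hedged sketch of that first gotcha (the class is made up for illustration): returning the task directly from inside a using block disposes the resource while the work may still be in flight, whereas awaiting keeps it alive until the work completes:

using System.Net.Http;
using System.Threading.Tasks;

public class PageFetcher
{
	// Risky: the HttpClient is disposed as soon as the task is returned,
	// potentially while the request is still in flight.
	public Task<string> GetPageUnsafe(string url)
	{
		using (var client = new HttpClient())
		{
			return client.GetStringAsync(url);
		}
	}

	// Safe: await keeps the using block (and the client) alive until the
	// request has actually completed.
	public async Task<string> GetPageSafe(string url)
	{
		using (var client = new HttpClient())
		{
			return await client.GetStringAsync(url);
		}
	}
}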

 

Hopefully this helps debunk some of the most common misconceptions about the magic of async/await in C# and .NET.

Good Practices for Enterprise Integrations over the Azure Service Bus

Integrating systems over a good messaging infrastructure has many benefits if done right. Some of these are system/network decoupling and platform agnosticism, independent scalability, high reliability, spike protection, a central place to keep an eye on your integrations and more depending on your situation.

Many of these benefits are vast topics on their own. But the point of this article is to go over some of the pitfalls when implementing large-scale integrations over a messaging infrastructure and guidelines for avoiding them.

These are based on my learning from a recent project which involved improving a client’s existing integration between 5+ large systems. While the messaging infrastructure in use was the Azure Service Bus, I believe most of these points apply to all other message-based integrations and service buses.

1. Don’t roll your own

Don't re-invent the wheel Don’t re-invent the wheel.

Not rolling your own should really be a no-brainer in all projects, yet you still see custom solutions built unnecessarily more often than you should.

Unless your aim is to provide a competing solution that works better than what’s available and you have the resources to maintain it, you will get it wrong, miss edge cases, and it will eventually cost you more as your efforts are diverted from delivering business value to maintaining your custom solution.

So stand on the shoulders of giants as much as you can by using tried and tested infrastructure that's already available.

2. Don’t bypass your messaging infrastructure

This is closely related to the previous point and includes plugging in your own service bus-esque solutions alongside a proper infrastructure like the Azure Service Bus, for example, storing messages in a database table for later processing.

One of the major initial issues I identified with the client's integration was that, to avoid queues getting full, they were retrieving messages from the queues, storing them in an MSSQL database and doing the real processing from there. This introduced a big flaw in the design, losing out on many benefits that come with ASB:

  • Proper locking of the message when it’s being processed, to prevent duplicate processing. Implementing your own locking mechanism on a database table is complex, error-prone and can easily cause dead-lock situations.

  • Scaling and performance. Even if you get the locking right, it won’t be anywhere near as performant as a proper message queue for high-traffic scenarios. You’ll have a high-write, high-read situation (a lot of messages coming in and a lot of polling) which is very hard to optimize a SQL database table for.

  • Availability and reliability. A good messaging platform like ASB, is highly available and redundantly stores your data.

  • Filtering, dead lettering, automatic expiry and a myriad of other features that come with ASB.

This leads straight back to pitfall #1 above: to get it right, they would essentially have had to roll their own messaging infrastructure, so they ended up with two problems instead of solving the first one.

Treat the root problem, not the symptom.

‘What was the original problem you were trying to fix?’ ‘Well, I noticed one of the tools I was using had an inefficiency that was wasting my time.’

3. Use shared topics and a pub-sub pattern instead of queues

Imagine the scenario of updating customer contact details in a CRM system, which should be reflected in the sales system. There are only two systems, so it might be tempting to do something like this:

An exclusive topic/queue for sending a single system's customer update messages, effectively creating a point-to-point integration.

This might look fine if you’re only starting with a couple systems, however not going down this route is critical for future maintainability.

Let’s say after some time, the business needs a third legacy system – one that drives many other parts of the business – integrated as well:

Keep it simple?

There’s several problems with the above:

  • Single points of failure. Each of the systems being slow or unavailable breaks the link and the data no longer flows to the next system. This actually caused one of the client’s main systems to have data that was weeks old until the faulty system was properly fixed and processed its backlog of messages.

  • If, some time later, you add a third integration between the legacy system and CRM as in the diagram, you have all of a sudden – and perhaps inadvertently – created an integration between all of the systems by forming a circle. It also becomes much more difficult to reliably stop infinite loops between all the systems involved, where one message keeps going round and round, which can result in invalid data and even more wasted resources. More on a mechanism to stop infinite message loops below.

If you want to have redundancy, you’ll end up having to integrate each system with all the other ones:

Fancy some spaghetti?

With only 3 systems and one simple message, this is already looking like a mess. As you add more systems and integrations, it becomes exponentially costlier and harder to develop and maintain.

There are many flavours of these point-to-point integrations, and each comes with a similar set of problems.

A much better approach that addresses most of the above issues and is easily maintainable, is a shared topic that all participating systems publish and subscribe to:

Point-to-Point The publishers don’t have to know about the other systems. Any interested system can subscribe to the topic, effectively getting its own queue.

By using this approach, you further decouple the systems, keep it extensible for plugging new systems and keep complexity and maintenance costs linear. If subscribing systems need to know where a message originated from, that can be included in a standard message body format, or better yet, message properties as they can drive filters and actions.
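
As an illustration, here's a minimal sketch of publishing to such a shared topic with the Microsoft.Azure.ServiceBus client – the connection string, topic name and property values are placeholders made up for this example:

using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Newtonsoft.Json;

public class CustomerUpdatedPublisher
{
	readonly ITopicClient _topicClient =
		new TopicClient("<service-bus-connection-string>", "customercontactdetailsupdated");

	public Task PublishAsync(object customerUpdatedEvent)
	{
		var body = Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(customerUpdatedEvent));
		var message = new Message(body);

		// Identify the originating system via a message property, so
		// subscriptions can filter on it without parsing the body.
		message.UserProperties["Source"] = "CRM";

		return _topicClient.SendAsync(message);
	}
}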

4. Be mindful of loops – subscribers shouldn’t republish

It’s very easy to cause infinite loops that can wreak havoc on all the systems involved. From the above example, it can easily happen if each system received a message from the topic, updates a customer and as a result republishes a CustomerUpdated message back to the topic.

One solution to this problem that works with the above pub-sub model is that a system's action based on a message received from the topic shouldn't cause the same message to be republished back to the topic.

5. Include domain event timestamps in messages

Each message should have a domain event timestamp. I'm not referring to the one that is automatically stamped on messages by the service bus, but one included by your system in the message. The timestamp should capture when the event the message describes actually happened within that system, e.g. you care about when a customer updated their details in a system, not when the message was published and not when another system finally gets around to processing it.

This provides a means for other systems to not act based on outdated data. e.g. it’s possible that by the time a subscribing system receives a CustomerUpdated message, it’s already applied a more recent update. A simple timestamp check can prevent the more recent data from being overwritten.

While you’re at it, make that timestamp a UTC one – or better yet, use a type that unambiguously represents an instant, like DateTimeOffset – so you don’t run into time zone conversion and datetime arithmetic issues. You only care about the instant when something happened, not the local time, and a UTC datetime simply represents that.

6. Messages should be the single source of truth

‘And the whole setup is just a trap to capture escaping logicians. None of the doors actually lead out.’

Each published message should be self-contained, carrying all the necessary information from the publishing system – that is, everything the subscribing systems need in order to act on the message.

A simple scenario of how breaking this pattern could cause problems:

  1. The CRM system publishes a CustomerUpdated message that only states that a specific customer's details have been updated, without including the details.

  2. The Sales system makes a call to the CRM API to retrieve the details.

Problems:

  • There needs to be a direct means of communication between the two systems, whereas before they only needed to communicate with the messaging infrastructure. This takes away system and network decoupling, one of the major benefits of using a service bus.

  • By the time the message is processed by Sales, the CRM API might not even be available.

  • It puts additional load on the CRM system since its API would need to be called for each message.

  • By the time the message is processed by Sales, the customer details may have been changed again. If the Sales system’s action was dependent on the historical data at the time of the original event, such as auditing, it is going to retrieve inconsistent data.

So keep and treat messages as the single source of truth. On the other hand, don’t include unnecessary information and keep them lightweight to make the most of your topic space.

7. Have naming conventions for topics and subscriptions

Each topic and subscription following a naming convention brings many monitoring and troubleshooting upsides:

  • As the integrations within the service bus namespace grow, it remains easy to find the target topic and subscription.

  • You can see what’s going on in your service bus namespace more easily when using an external tool like the Service Bus Explorer.

  • It could potentially drive an automatic monitoring tool that can visualize all the integrations and the flow of data between various systems.

An example of such a convention could be:

  • Topic name: CustomerContactDetailsUpdated
  • Topic subscription names: CRM, Sales, Legacy

Like most conventions, the value mainly comes from there being one, rather than the convention per se.

8. Have proper logging and monitoring to quickly identify and deal with issues

Integrations, once in place, shouldn’t just become part of a black box that’s your service bus.

It is crucial to have proper logging and instrumentation in place, with monitoring on top of those, so you're notified of irregularities as soon as possible. You should have automatic alarm bells for at least the following:

  • A subscription’s incoming rate becomes consistently higher than its processing rate. This should be fine for short periods of time e.g. traffic spikes. However if it continues your topic would eventually get full.

  • A topic’s used space is consistently increasing.

  • A subscription’s dead-letter queue message count is consistently increasing.

The above are closely related and are usually a sign that a subscribing system is having issues processing messages. These issues need to be dealt with ASAP. Depending on how the publishing systems cope with the topic they're publishing to being full, it could also lead to problems in those systems. For the client, this accumulated a backlog of failed tasks in a publishing system that was limited and hosted externally; it kept hopelessly retrying over and over, consuming significant resources unnecessarily, which affected other parts of that system, such as delaying the sending of important marketing material.

As a last resort, to stop the topic from getting full, you could set up forwarding to redirect messages for the problematic system to a different backup topic subscription or queue until the problem is resolved. Just don’t move them to a database.

Azure Monitor can help here. Having structured logging of important information to a centralized logging server such as Seq could also be beneficial, however, be careful not to create a lot of noise by logging the world.

9. Don’t forget the dead-letter queue

All queues and topic subscriptions automatically come with a supplementary sub-queue, called the dead-letter queue (DLQ). Messages can end up here for a number of reasons:

  • By the engine itself as a result of your messaging entity configuration, e.g. automatically dead-lettering expired messages or messages that cause filter evaluation errors.

  • By the engine itself when a message exceeds the maximum delivery count – 10 by default – for example because your subscriber keeps receiving the message but fails to process it due to an exception.

  • By the subscriber’s explicit request in message processing code.

See here for a complete list and their details.

These dead-letter queues contribute to the overall used space of the topic, so you should keep an eye on them using your monitoring tools and empty them regularly – either by resubmitting messages that ended up there due to transient errors, or by discarding poison messages.

10. Performance is not just a nice-to-have

Having performant message processors ensures your integrations run smoothly and can withstand traffic spikes without your topics getting full. Here are some tips that can increase performance dramatically:

  • Use AMQP and avoid HTTP polling. This should be the default if you're using the new .NET Standard Service Bus client library; you can read about the benefits here. Don't use the old library – most of the documentation around still points to it.

  • Use the asynchronous callback-based message pump API. Sending/receiving messages to/from the message broker is an inherently IO based asynchronous operation – you shouldn’t hold up threads for them.

  • Process messages concurrently. Many programmers shy away from writing concurrent code, as it is more complex and error-prone. That's usually a sound instinct; however, the free lunch has long been over, and this is one of the scenarios where concurrent code really shines. Concurrent code doesn't mean you have to do unnecessary multithreading. If you use the asynchronous APIs and leverage truly async code where possible, even a single thread can accomplish orders of magnitude more. This needs to be accompanied by proper asynchronous synchronization so you don't, for example, process two messages for the same customer simultaneously. A sketch combining the message pump, concurrency and prefetching follows this list.

  • Keep connections to the service bus alive (i.e. the clients) and don’t recreate them as that’s expensive. They are designed to be kept alive and reused.

  • Leverage prefetching with a message lock duration that’s tweaked based on message processing times. When using the default lock expiration of 60 seconds, Microsoft recommends 20 times the maximum processing rate of your subscriber. e.g. if the subscriber processes 10 messages per second the prefetch count could be 10 x 20 = 200. The idea is to prefetch comfortably below the number of messages your subscriber can process, so they aren’t expired by the time it gets around to processing them. You can read more about that here.

  • Use partitioned topics. One of their biggest benefits is that they can go up to 80 GB in size compared to just 5 GB for unpartitioned ones. That gives you a lot more time to deal with the issues explained above, and you almost never need to worry about them getting full. They also offer better throughput, performance and reliability. There's really no good reason not to use them.
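
Here's a minimal sketch of wiring several of these together with the Microsoft.Azure.ServiceBus message pump – the connection string, entity names and numbers are placeholders, not recommendations:

using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public class CustomerUpdatedSubscriber
{
	readonly SubscriptionClient _subscriptionClient;

	public CustomerUpdatedSubscriber()
	{
		_subscriptionClient = new SubscriptionClient(
			"<service-bus-connection-string>", "customercontactdetailsupdated", "sales")
		{
			// Prefetch a batch of messages so the pump rarely waits on the network.
			PrefetchCount = 200
		};

		// The asynchronous, callback-based message pump with concurrent processing.
		_subscriptionClient.RegisterMessageHandler(
			HandleMessageAsync,
			new MessageHandlerOptions(HandleErrorAsync)
			{
				MaxConcurrentCalls = 10,
				AutoComplete = true
			});
	}

	Task HandleMessageAsync(Message message, CancellationToken cancellationToken)
	{
		// Do the actual (truly asynchronous) processing here.
		return Task.CompletedTask;
	}

	Task HandleErrorAsync(ExceptionReceivedEventArgs args)
	{
		Console.WriteLine(args.Exception);
		return Task.CompletedTask;
	}
}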

By combining the above approaches, I improved the processing time on a critical subscription for the client from ~15 seconds to ~3 seconds per message and the message processing rate per hour from ~240 to ~12000.

11. Have message versioning and remain backward compatible

It’s only a matter of when, not if, that your messages need to be changed. To make moving forward easier and seamless, have a message versioning strategy to start with and make those changes in a backward compatible way.

Prepare yourself for the situation where different subscriptions of a single topic contain different message versions. This allows subscribers to be upgraded at their own pace, while not blocking those that can already process the new version.

Old message versions can ultimately be retired when all subscribers are upgraded.

12. Have idempotent subscribers

Even with other measures in place, such as duplicate detection, it's highly likely that your subscriber will receive the same message twice. For example, this happens when a message's lock expires while a subscriber is still processing it and the message is released back to the subscription queue. So you have to make sure your subscribers process messages idempotently. This can be achieved via various mechanisms depending on your circumstances, but checking against domain event timestamps (as above) or unique message IDs can be a simple, effective measure.
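
For the message ID route, a hedged sketch could look like the below – a real subscriber would track processed IDs in a durable store (e.g. a table with a unique constraint) rather than in memory:

using System.Collections.Concurrent;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;

public class IdempotentMessageHandler
{
	// In-memory set purely for illustration; use durable storage in practice.
	readonly ConcurrentDictionary<string, bool> _processedMessageIds =
		new ConcurrentDictionary<string, bool>();

	public Task HandleAsync(Message message)
	{
		// A redelivered message carries the same MessageId, so a second
		// delivery is detected here and skipped.
		if (!_processedMessageIds.TryAdd(message.MessageId, true))
		{
			return Task.CompletedTask;
		}

		return ProcessAsync(message);
	}

	Task ProcessAsync(Message message) => Task.CompletedTask;
}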

 

In conclusion, service buses, like any other tool in our toolbox, can be misused, perhaps more easily than some of the others, which has led to many hating them. But they are powerful and very beneficial in the right situation. Following the above guidelines should help you build a solid foundation for large-scale integrations over service buses and not end up with huge improvement costs – because changing upstream design costs exponentially more downstream.

The Dangerous EF Core Feature: Automatic Client Evaluation

Update: starting from EF Core 3.0-preview 4, this damaging default behavior has been greatly limited, although not completely turned off.

Recently when going through our shiny new ASP.NET Core application’s logs, I spotted a few entries like this:

The LINQ expression ‘foo’ could not be translated and will be evaluated locally.

Huh?

I dug around in the code and found the responsible queries. Some of them were quite complex with many joins and groupings, while some of the other ones were very simple, like someStringField.Contains("bar", StringComparison.OrdinalIgnoreCase).

You may have spotted the problem right away. StringComparison.OrdinalIgnoreCase is a .NET concept. It doesn't translate to SQL, and EF Core can't be blamed for that. In fact, if you run the same query in the classic Entity Framework, you'll get a NotSupportedException telling you it can't convert your predicate to a SQL expression, and that's a good thing, because it prompts you to review your query. If you really want a predicate in your query that only applies in the CLR world, you can decide whether it makes sense in your case to do a ToList() or similar at some point in your IQueryable to pull the results of your query from the database into memory, or you may decide that you don't need that StringComparison.OrdinalIgnoreCase after all, because your database collation is case-insensitive anyway.

The point is that, by default you are in control and can make explicit decisions based on your circumstances.

That’s unfortunately not the case in Entity Framework Core because of its concept of mixed client/server evaluation. What mixed evaluation effectively does is, if you have anything in an IQueryable LINQ query that can’t be translated to SQL, it tries to magically and silently make it work for you, by taking the untranslatable bits out and running them locally… and it’s enabled by default! what could go wrong?

That’s an extremely dangerous behavior change compared to the good old Entity Framework. Consider this entity:

public class Person
{
	public string FirstName { get; set; }
	public string LastName { get; set; }
	public List<Address> Addresses { get; set; }
	public List<Order> Orders { get; set; }
}

And this query:

var results = dbContext.Persons
	.Include(p => p.Addresses)
	.Include(p => p.Orders)
	.Where(p => p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase))
	.ToList();

EF Core can’t translate p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase) into a query that can be run on the database, so it pulls down the whole Persons table, as well as the whole Orders and Addresses tables from the database into the memory and then runs the .Where(p => p.LastName.Equals("Amini", StringComparison.OrdinalIgnoreCase)) filter on the results 🤦🏻‍♂️

The fact that this behavior is enabled by default is mind-blowing! It's not hard to imagine the performance repercussions on any real-size application with significant usage. It can easily bring applications to their knees. Frameworks should make it difficult to make mistakes, especially ones with potentially devastating consequences like this.

You might be thinking that it’s the developer’s fault for including something like StringComparison.OrdinalIgnoreCase in the IQueryable prediate, but having untranslatable things like that in your query isn’t the only culprit that results in client evaluation. If you have too many joins or groupings, the query could become too complex for EF Core and make it fall back to local evaluation.

So if you’re using EF Core, you want to keep an eye on your logs to spot client evaluations. If you don’t want that additional cognitive overhead, you can disable it altogether and make it throw like the good old Entity Framework:

/* Startup.cs */

public void ConfigureServices(IServiceCollection services)
{
	services.AddDbContext<YourContext>(optionsBuilder =>
	{
		optionsBuilder
			.UseSqlServer(@"Server=(localdb)\mssqllocaldb;Database=EFQuerying;Trusted_Connection=True;")
			.ConfigureWarnings(warnings => warnings.Throw(RelationalEventId.QueryClientEvaluationWarning));
	});
}

/* Or in your context's OnConfiguring method */

protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
{
	optionsBuilder.ConfigureWarnings(warnings => warnings.Throw(RelationalEventId.QueryClientEvaluationWarning));
}

However, I found that disabling client evaluation made EF Core so limited that it became practically unusable. That's why I'm staying away from it for now and will re-evaluate it when version 3 is out to see if it's production-ready yet.

Create Your Own Visual Studio Code Snippets

Visual Studio Code Snippets are awesome productivity enhancers; I can only imagine how many millions of keystrokes I’ve saved over the years by making a habit out of using them.

Although snippets for a lot of the common code you use daily might not be available out of the box, adding them yourself is very simple.

Here are some samples for creating Console.ReadLine & Console.ReadKey snippets:

Console.ReadLine:

<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets  xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Title>cr</Title>
      <Shortcut>cr</Shortcut>
      <Description>Code snippet for Console.ReadLine</Description>
      <Author>Community</Author>
      <SnippetTypes>
        <SnippetType>Expansion</SnippetType>
      </SnippetTypes>
    </Header>
    <Snippet>
      <Declarations>
        <Literal Editable="false">
          <ID>SystemConsole</ID>
          <Function>SimpleTypeName(global::System.Console)</Function>
        </Literal>
      </Declarations>
      <Code Language="csharp">
        <![CDATA[$SystemConsole$.ReadLine();$end$]]>
      </Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>

Console.ReadKey:

<?xml version="1.0" encoding="utf-8" ?>
<CodeSnippets  xmlns="http://schemas.microsoft.com/VisualStudio/2005/CodeSnippet">
  <CodeSnippet Format="1.0.0">
    <Header>
      <Title>ck</Title>
      <Shortcut>ck</Shortcut>
      <Description>Code snippet for Console.ReadKey</Description>
      <Author>Community</Author>
      <SnippetTypes>
        <SnippetType>Expansion</SnippetType>
      </SnippetTypes>
    </Header>
    <Snippet>
      <Declarations>
        <Literal Editable="false">
          <ID>SystemConsole</ID>
          <Function>SimpleTypeName(global::System.Console)</Function>
        </Literal>
      </Declarations>
      <Code Language="csharp">
        <![CDATA[$SystemConsole$.ReadKey();$end$]]>
      </Code>
    </Snippet>
  </CodeSnippet>
</CodeSnippets>

You can save the above as .snippet files and then import them via Tools > Code Snippet Manager... > Import... and use them by typing cr or ck and hitting TAB twice.

So go ahead and create handy ones for things you find yourself typing all the time. You can refer to this MSDN article for more details.

Allowing Only One Instance of a C# Application to Run

Making a singleton application, i.e. preventing users from opening multiple instances of your app, is a common requirement which can be easily implemented using a Mutex.

A Mutex is similar to a C# lock, except it can work across multiple processes, i.e. it is a computer-wide lock. Its name comes from the fact that it is useful in coordinating mutually exclusive access to a shared resource.

Let’s take a simple Console application as an example:

class Program
{
    static void Main()
    {
        // main application entry point
        Console.WriteLine("Hello World!");
        Console.ReadKey();
    }
}

Using a Mutex, we can change the above code to allow only a single instance to print Hello World! and the subsequent instances to exit immediately:

static void Main()
{
    // Named Mutexes are available computer-wide. Use a unique name.
    using (var mutex = new Mutex(false, "saebamini.com SingletonApp"))
    {
        // TimeSpan.Zero to test the mutex's signal state and
        // return immediately without blocking
        bool isAnotherInstanceOpen = !mutex.WaitOne(TimeSpan.Zero);
        if (isAnotherInstanceOpen)
        {
            Console.WriteLine("Only one instance of this app is allowed.");
            return;
        }

        // main application entry point
        Console.WriteLine("Hello World!");
        Console.ReadKey();
        mutex.ReleaseMutex();
    }
}

Note that we’ve passed false for the initiallyOwned parameter, because we want to create the mutex in a signaled/ownerless state. The WaitOne call later will try to put the mutex in a non-signaled/owned state.

Once an instance of the application is running, the saebamini.com SingletonApp Mutex will be owned by that instance, causing further WaitOne calls to evaluate to false until the running instance relinquishes ownership of the mutex by calling ReleaseMutex.

Keep in mind that only one thread can own a Mutex object at a time, and just as with the lock statement, it can be released only from the same thread that has obtained it.

Modeling PowerToys for Visual Studio 2013

I rarely use tools that generate code, however, one that has become a fixed asset of my programming toolbox is Visual Studio’s class designer. It’s a great productivity tool that helps you quickly visualize and understand the class structure of projects, classes and class members. It’s also great for presentation of code-base that does not come with a UI, e.g. a Class Library.

It also lets you quickly wire-frame your classes when doing top-down design, but it is limited in that aspect; for example, it does not support auto-implemented properties, which I almost always use in my types – instead it blurts out a verbose property declaration along with a backing field. Fortunately, almost all of these issues are fixed with the great Modeling PowerToys Visual Studio add-in by Lie, which turns Class Designer into an amazing tool.

When I finally upgraded from my beloved Visual Studio 2010 to 2013, in the midst of all the horrors of VS 2013, I also found out that this add-in had not been updated to support later versions and the original author seemed to be inactive, so I upgraded it myself and decided to put it here for other fellow developers who also happen to like the tool:

Download Link

To install the add-in, extract the ZIP file contents to %USERPROFILE%\Documents\Visual Studio 2013\Addins and restart Visual Studio.

Please note that I’m not the author of this add-in, I merely upgraded it for VS 2013.

Git: Commit with a UTC Timestamp and Ignore Local Timezone

When you git commit, Git automatically uses your system’s local timezone by default, so for example if you’re collaborating on a project from Brisbane (UTC +10) and do a commit, your commit will look like this:

commit c00c61e48a5s69db5ee4976h825b521ha5bx9f5d
Author: Your Name <your@email.com>
Date:   Sun Sep 28 11:00:00 2014 +1000 # <-- your local time and timezone offset

Commit message here

If you find it rather unnecessary to include your local timezone in your commits, and would like to commit in UTC time for example, you have two options:

  1. Changing your computer’s timezone before doing a commit.
  2. Using the --date commit option to override the author date used in the commit, like this:

     git commit --date=2014-09-28T01:00:00+0000
    

The first option is obviously very inconvenient – changing the system's timezone back and forth between UTC and local for commits is just silly – so let's forget about that. The second option, however, seems to have potential, but manually inputting the current UTC time for each commit is cumbersome. We're programmers; there's gotta be a better way…

Bash commands and aliases to the rescue! We can use the date command to output the UTC time in an ISO 8601 format that's accepted by git commit's date option:

git commit --date="$(date --utc +%Y-%m-%dT%H:%M:%S%z)"

We can then alias it to a convenient git command like utccommit:

git config --global alias.utccommit '!git commit --date="$(date --utc +%Y-%m-%dT%H:%M:%S%z)"'

Now whenever we want to commit with a UTC timestamp, we can just:

git utccommit -m "Hey! I'm committing with a UTC timestamp!"