Initially I thought to myself, “this is the most misunderstood wait type that exists in the HA space for SQL Server”, then I realized maybe this isn’t the case… So, I pondered over this question, “is it truly misunderstood?” and came to the (possibly incorrect) realization that it is quite accurate in the general SQL Server’s users’ space of understanding. I also concluded that, really, it’s the way the wait is used in SQL Server coupled with how waits work in SQL Server, which leads to how it is viewed. Let me explain….
What Is HADR_SYNC_COMMIT?
This seems like an easy answer that anyone can read from the Docs page:
Waiting for a transaction commit processing on the synchronized secondary databases to harden the log. This wait is also reflected by the Transaction Delay performance counter. This wait type is expected for synchronous-commit Availability Groups and indicates the time to send, write, and acknowledge log commit to the secondary databases.
This is entirely correct, it’s the time spent waiting for commit processing of the log blocks to be hardened on synchronous commit replicas (This is a gross overgeneralization and simplification of the process and what a “hardened commit” means in this specific scenario). Seems easy enough, right? I mean, we’re just waiting for the log to be hardened, this shouldn’t really take a long time in the age of 40 Gb networking with sub ms latencies in closely connected datacenters running servers with local NVMe. I agree, but that’s in the best of circumstances, realistically people use public clouds, slow storage, 3rd party drivers, and a whole host of items that sit between the SQL Server instances. This isn’t about terrible setups of AGs, it’s about why I think this wait type is misunderstood, and to be fair, I don’t think it should have been the wait type that it is since it breaks the fundamental contract of waits.
Hurry Up and Wait
Waits in SQL Server happen in a generally predictable way. There is some action taking place, there is a need to wait on a resource (or, for example, when going preemptive and it leaves the purview of SQL Server). While the resource is being obtained, the wait time adds up and when the resource is obtained then the wait is satisfied and the time spent waiting is the time recorded. Seems straightforward and it’s how most waits work… but there’s always that one that doesn’t follow the rule. In comes HADR_SYNC_COMMIT!
HADR_SYNC_COMMIT works in mostly the traditional way, with one giant caveat which is that you shouldn’t see a wait for it longer than 2-3 seconds (unless your instance is extremely overburdened). That is quite the departure from the typical waits, though it won’t change the overall wait time in the aggregated system values form sys.dm_os_wait_stats, it does affect commonly used wait logic (such as waits longer than x seconds). What does this mean? It means, if you’re querying sys.dm_exec_requests (because you’re not using a legacy holdover from the year 2000 that may or may not properly report data) and looking for HADR_SYNC_COMMIT waits longer than 10 seconds, you’ll never find it. That’s right, after about 2-3 seconds, the wait will reset to 0. If you don’t believe me, try it yourself. This breaks the fundamental contract of waits showing in said DMVs as it is expected that the wait monotonically increases, not be reset for the *same* instance of the wait. One could argue that it truly is working like the rest of the waits, and to an extent that would be true in that there was an event that reset the wait. It would not be true in the pedantic sense of how waits work, though. This is due to the way certain items are implemented internally for AlwaysOn and is not a new style or type of wait. There are a few other wait types that adhere to this model, however they are almost exclusively used in the opposite way – as in they increase when the task is sleeping and then is reset when the task wakes up.
This doesn’t mean that there isn’t anything to understand how long and why the request was waiting. In fact, there’s a lovely extended event with the on the nose name of “hadr_db_commit_mgr_harden_still_waiting” that will fire when there are requests waiting on a log block, however it’s in the debug channel for AlwaysOn, which to me means people aren’t really going to look at it. Additionally, let’s say you’re a super DBA and already had this event set, it fires. Now what? You’ll get the total elapsed time waiting on a specific log block whose id will be in the extended event as well. Ok, so you’ll get the total time waiting for a specific log block in a specific database on a specific replica. Awesome. What do you even do with that? The answer is, unless you’re gathering a whole host of other data then a whole lot of nothing.
This means that the wait type only really answers the fundamental question of “what am I waiting on” to the most generalized extent. Asking further questions such as, “why is it waiting” cannot be answered by the wait, nor by the extended event associated with the wait.
But Wait, There’s More
The major issue with having synchronous commit replicas fall behind is that no other requests can continue to make progress if there is an outstanding wait for a log block that is before it. This means if you have log block 123 that your request is committed in, but log block 98 is still waiting to be acknowledged for harden, then nothing that needs past log block 98 will be able to be marked as hardened and thus will be in the wait state – even though the log blocks that hold all their information have already been hardened. This causes a massive blocking event, causing new requests to continue to pile up if they are running any sort of DML operation in that database. This can lead to worker thread exhaustion among other scheduling or processing issues.
The opposite occurs when the log block is eventually (hopefully!) hardened and acknowledged, which is that all the requests are immediately signaled and can continue… however there might be more to the batch, which then requires that many of these tasks find a scheduler and run. If the scheduler is already loaded due to many requests being queued up waiting on the HADR_SYNC_COMMIT then there will most likely be scheduling issues until the load normalizes. This can cause problems with items such as loading massive amounts of data in large batches or having “microservices” do thousands of extremely tiny and chatty requests.
There’s Always a Catch
To truly understand what the issue might be with the log block harden sequence, the entire process needs to be followed from start to finish throughout all synchronous commit replicas. This means if there are the maximum set in the environment of 3, then the primary and each synchronous commit replica will need to be traced. If you’ve worked on this before, you’ll know that it generates an insane amount of data in a very short time, which is based on the number of replicas in the environment and the number of log blocks created (which is roughly the log generation rate divided by the average block size). There are a host of other events needed as well, and in some of the worst cases, there are tens of gigabytes of data generated in just a few minutes.
Since SQL Server uses cooperative scheduling, the scheduling and processing infrastructure will also need to be monitored. Things such as long quantum usage (quantum thief’s) can wreak havoc. This is one of the reasons I continually preach about not overloading the replicas and having the proper amount of hardware for the peak load. Assume we had the same above scenario where we’re waiting for log block 98 to harden, except we have a thread that is monopolizing the processor (maybe another program running on the server) or there is something that isn’t yielding properly in SQL Server. In this case the harden message may have returned to the primary, however it hasn’t been processed yet and is just waiting its turn in line. This is quite common especially on secondary replicas which are used as readable copies. The secondary replica getting pegged with requests will absolutely cause the primary to show HADR_SYNC_COMMIT if it’s waiting on processing.
When Availability Groups are enabled and created on the instance of SQL Server, it goes from the self-contained instance to a distributed system. This means any issue or problem in the entire architecture can ripple throughout the chain. Think about this the next time someone asks if you can have secondary replicas be much smaller than the primary, changing the value of max worker threads, adding even more databases to the server, etc.
Please don’t make the knee jerk reaction that since this wait type occurs that there’s a problem with Availability Groups, it’s generally an issue that has nothing to do with the Availability Group feature but is more indicative of an environmental or infrastructure issue at a much different level.