Non-WSFC Dumping Woes – Assert in ucsconnectionsend.cpp

Setup

If you’re using availability groups with read-scale or Linux (cluster type = None/External), you might want to watch the number of databases you put into a single availability group. An AG with a large number of databases (~200) seems to work without issue, but going higher may result in constant asserts, causing dumps to be taken.
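To keep an eye on this, a rough sketch in Python follows. It assumes you have already pulled one row per AG database from the `sys.availability_groups` and `sys.availability_databases_cluster` catalog views (the query is included as a string only and is not run here; `cluster_type_desc` needs SQL Server 2017 or later). The function name `crowded_groups` and the `SOFT_LIMIT` constant are made up for illustration, and 200 is the approximate figure from this post, not a documented limit.

```python
from collections import Counter

# Query that could produce the input rows; shown for reference, not executed.
AG_DATABASE_QUERY = """
SELECT ag.name, ag.cluster_type_desc, adc.database_name
FROM sys.availability_groups AS ag
JOIN sys.availability_databases_cluster AS adc
  ON adc.group_id = ag.group_id;
"""

# Approximate threshold taken from this post (an assumption, not a hard limit).
SOFT_LIMIT = 200


def crowded_groups(rows, soft_limit=SOFT_LIMIT):
    """Return {ag_name: database_count} for non-WSFC AGs at or above soft_limit.

    rows is an iterable of (ag_name, cluster_type_desc, database_name) tuples.
    """
    counts = Counter(
        ag_name
        for ag_name, cluster_type, _database in rows
        # Only cluster type NONE (read-scale) or EXTERNAL (Linux) is affected.
        if cluster_type.upper() in ("NONE", "EXTERNAL")
    )
    return {name: n for name, n in counts.items() if n >= soft_limit}
```

WSFC-based AGs are filtered out on purpose, since they are not affected by this issue.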

Work Around

If you’re getting close to 200 databases, the workaround is to create another availability group and put any additional databases in that. Rinse, repeat.
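The "rinse, repeat" part can be sketched as a simple chunking step. This is only a planning aid under stated assumptions: the function name `plan_groups`, the `ag1`, `ag2`, ... naming scheme, and the default of 200 per group are all hypothetical, and it does not create anything on the server.

```python
def plan_groups(databases, max_per_group=200, prefix="ag"):
    """Split a list of database names into groups of at most max_per_group.

    Returns a dict mapping a hypothetical AG name to its list of databases,
    preserving the original order of the input list.
    """
    if max_per_group < 1:
        raise ValueError("max_per_group must be at least 1")
    plan = {}
    for start in range(0, len(databases), max_per_group):
        # Groups are numbered from 1: ag1, ag2, ...
        name = f"{prefix}{start // max_per_group + 1}"
        plan[name] = databases[start:start + max_per_group]
    return plan
```

Passing a smaller `max_per_group` leaves some headroom below the point where the asserts start showing up.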

What’s The Issue

Without getting too far into the depths of UCS (Unified Communication Stack) and AGs, in Windows Clustering the cluster database (registry) is used to store AG metadata in a non-human-readable format. Since clustering takes care of updating the cluster database on each node, this happens seamlessly for SQL Server. Each node then has a proper (hopefully) copy of the cluster database so that failovers or other operations that require checking the metadata of an AG work appropriately.

Read-Scale and Linux AGs don’t have a cluster database, so both need some other method to keep the AG metadata up to date on each node. This is done via messages between the nodes on the UCS transport (like every other message used for AlwaysOn). Each database takes a certain amount of space in the in-memory AG metadata, and when this size becomes too large, it starts to run into the limits of UCS functionality as it currently exists. Note that WSFC-based AGs do not have this problem. Though there is a technical limit to the size of a registry key, it is highly unlikely that this would be hit before some other resource exhaustion takes place, such as worker threads.

Sample Dump Comment

Location: “ucsconnectionsend.cpp”
Expression: m_pcscBoxcar->GetMessageCount () > 0

5 thoughts on “Non-WSFC Dumping Woes – Assert in ucsconnectionsend.cpp”

  1. Is there a hard limit to the size of the in-memory AG metadata? Or does it depend on the size of the server’s memory?

      1. Oh nevermind I see it more now – it has to do with the messaging. Universal something something for messaging probably 😀

        1. Unified Communication Stack = UCS, I probably should update to explain that – Thanks 🙂

    1. There is not; it’ll continue to grow. It doesn’t take up much space: for example, with ~220 databases it takes about 96k of space.
