The road to Hell is paved with good intentions.
I get it, your security team is the best ever. You work in an industry (or the public sector) that requires servers to be “compliant”, which in practice means constantly and incessantly scanning them for vulnerabilities, in production, of course. I know that for some this checks a required box, and that’s a discussion about regulatory oversight and requirements rather than the technical one we’re focused on here. Then, at some point, you check your SQL Server and notice some errors in the errorlog: something about a weird Unicode login being attempted, SA without a password, connection errors, and so on, some of which might be on your TCP mirroring endpoint (Database Mirroring, Always On, etc.). This is almost always (I have yet to witness a scenario where this isn’t the underlying issue) caused by a security vulnerability scanner, the most common one being Nessus.
Example Error 9642 – Generic error with actual error placeholder:
An error occurred in a Service Broker/Database Mirroring transport connection endpoint, Error: %i, State: %i. (Near endpoint role: %S_MSG, far endpoint address: '%.*hs')
Example Error 8474 – Note that this is the error number reported inside the 9642 error:
An error occurred in a Service Broker/Database Mirroring transport connection endpoint, Error: 8474, State: 11. (Near endpoint role: %S_MSG, far endpoint address: '%.*hs')
Example Error 17836:
Length specified in network packet payload did not match number of bytes read; the connection has been closed. Please contact the vendor of the client library.
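If you want a quick way to see how often these show up, you can filter the errorlog text for the error numbers above. A minimal sketch in Python, assuming you've exported or copied the errorlog to a readable text file (the sample lines and timestamps below are invented for illustration; the error-number list is just the ones discussed in this post):

```python
import re

# Error numbers discussed in this post; illustrative, not exhaustive.
SCANNER_ERRORS = ("9642", "8474", "17836")

def suspect_lines(errorlog_lines):
    """Return errorlog lines that mention the endpoint/TDS errors above."""
    pattern = re.compile(r"\b(?:%s)\b" % "|".join(SCANNER_ERRORS))
    return [line for line in errorlog_lines if pattern.search(line)]

# Sample lines paraphrased from a typical errorlog (timestamps invented):
sample = [
    "2024-01-02 00:00:13.51 Logon  Error: 17836, Severity: 20, State: 17.",
    "2024-01-02 00:00:14.02 spid99 BACKUP DATABASE successfully processed.",
    "2024-01-02 00:00:15.77 spid42 An error occurred in a Service Broker/"
    "Database Mirroring transport connection endpoint, Error: 8474, State: 11.",
]

for line in suspect_lines(sample):
    print(line)
```

If the hits cluster at the same times every day (or every five minutes), you've likely found your scanner's schedule.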
What’s really happening.
The long and the short of it: 8474 state 11 means the Service Broker message (that’s another blog post, about how people keep telling me SB functionality isn’t used for mirroring or Always On) is corrupt, and 17836 state 20 means there was a problem with the TDS packet. Various checks are made as part of defensive programming to keep SQL Server safe when bad actors, such as vulnerability scanners, attempt to do bad things. In most cases the data and headers of the TCP packets have been altered, which is the security software attempting buffer overruns and underruns, among other objectionable things. This can be confirmed a few different ways, but the easiest is a network trace and inspecting the packets coming across.
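To make the 17836 case concrete, here is a minimal sketch of the kind of sanity check involved, based on the publicly documented MS-TDS packet header layout (SQL Server's actual validation is more involved; the packet bytes below are invented examples):

```python
import struct

TDS_HEADER_LEN = 8  # per MS-TDS: type, status, length, spid, packet id, window

def check_tds_packet(packet: bytes) -> bool:
    """Return True if the declared length matches the bytes actually read.

    This mirrors the idea behind error 17836: the Length field in the TDS
    header (bytes 2-3, big-endian, and it includes the 8-byte header itself)
    must agree with the number of bytes that arrived on the wire.
    """
    if len(packet) < TDS_HEADER_LEN:
        return False
    declared = struct.unpack_from(">H", packet, 2)[0]
    return declared == len(packet)

# A well-formed packet: header declares 10 bytes, and 10 bytes arrived.
good = bytes([0x12, 0x01, 0x00, 0x0A, 0x00, 0x00, 0x00, 0x00]) + b"\x00\x00"
# A fuzzed packet: header claims 0xFFFF bytes but only 10 arrived,
# the sort of mangling a vulnerability scanner sends to probe for overruns.
bad = bytes([0x12, 0x01, 0xFF, 0xFF, 0x00, 0x00, 0x00, 0x00]) + b"\x00\x00"
```

When the check fails, the safe response is exactly what SQL Server does: close the connection and log it, rather than trust the declared length.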
Is this cause for concern? Not really… but would you subject something you count on every day to a constant barrage of bad actors? Probably not. While it shouldn’t cause any actual issue, the stars sometimes align, and the shit hits the fan. If the security team is cool with being the people who eat the giant dollar cost when the server does have an issue, then wow, kudos, carry on doing bad things in production. If they aren’t, then maybe it’s time to rethink hitting your airplane with birds before it takes off to prove how resilient it is to bird strikes. It’s stupid.
You have two major options. Option 1 is to have your security team whitelist that server/port, or take a combo approach that still allows checkboxes to be checked but doesn’t cause a bunch of clutter in the errorlog. Option 2 is to just stop incessantly scanning the server, which I’ve seen happen at 5-minute intervals, all day, in some financial industry scenarios.
Considering the ROOT CAUSE is an application *maliciously* attempting to cause issues by doing bad things, it’s a wonder people run it against production in the first place. Since this isn’t a SQL Server issue, you’ll want to contact someone who cares enough about your production server to stop actively trying to break it, or at the very least whoever is in charge of monitoring and alerting (if that even exists) so that you aren’t alerted (my personal favorite is to copy the entire security team + manager + VP on each one of these alerts) when this happens every Tuesday at midnight.
Footnote: This doesn’t mean you should stop checking your production environment for vulnerabilities entirely. It does ask the reader to think about how often this should be done, given good change management processes and skills are in place, and about what to expect when the testing is run against SQL Server.