Monday 7 December 2015

Port exhaustion on Windows server

We had a set of servers that started showing what seemed to be random connection errors. The errors occurred across services and across servers.

Our first thought was that it was a hardware or centralised switch problem, but the network guys assured us the switch had no problem.

Then we started leaning towards configuration problems on the boxes. These servers ran multiple services, all heavy socket consumers.

Whenever an outbound connection is made to a known port, Windows also assigns a temporary local port for the client side of that connection. These temporary ports, known as ephemeral or short-lived ports, are assigned from a pool which can be configured.

netsh int ipv4 show dynamicportrange tcp
Protocol tcp Dynamic Port Range
---------------------------------
Start Port      : 50001
Number of Ports : 255
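
To make the range explicit, here is a minimal Python sketch that parses the netsh output above and works out the last port in the pool. The function name is hypothetical and the sample text is simply the output shown above; it is an illustration, not part of any Windows tooling.

```python
import re


def parse_dynamic_port_range(netsh_output):
    """Return (start_port, number_of_ports, last_port) from netsh output text."""
    start = re.search(r"Start Port\s*:\s*(\d+)", netsh_output)
    count = re.search(r"Number of Ports\s*:\s*(\d+)", netsh_output)
    if start is None or count is None:
        raise ValueError("could not find Start Port / Number of Ports")
    start_port = int(start.group(1))
    number_of_ports = int(count.group(1))
    # The range is inclusive of the start port, hence the -1.
    last_port = start_port + number_of_ports - 1
    return start_port, number_of_ports, last_port


# The output captured on the affected server.
SAMPLE = """Protocol tcp Dynamic Port Range
---------------------------------
Start Port      : 50001
Number of Ports : 255
"""

start_port, number_of_ports, last_port = parse_dynamic_port_range(SAMPLE)
print(f"ephemeral ports {start_port}-{last_port} ({number_of_ports} total)")
```

On a live box the same function could be fed the text captured from the netsh command.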

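This is an incredibly low number for a server doing heavy comms: only 255 ephemeral ports are available for outbound connections at any one time. Bear in mind that when a connection is closed, its port is not released immediately. It stays in the TIME_WAIT state for a period set by TcpTimedWaitDelay, so that stray packets from the old connection cannot be mistaken for part of a new one. The documented default on older Windows versions is 240 seconds; 30 seconds is the lowest permitted value and the setting commonly recommended for busy servers.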
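
As a rough worked example of why 255 ports is so few: if every outbound connection is short-lived and its port is held for the full TcpTimedWaitDelay after closing, the pool can sustain at most (number of ports / delay) new connections per second. The function name and the delay values below are illustrative assumptions, not measurements from these servers.

```python
def max_sustained_connection_rate(port_count, time_wait_seconds):
    """Upper bound on new outbound connections per second.

    Assumes each connection is closed locally and its port then sits in
    TIME_WAIT for the whole delay before it can be reused.
    """
    return port_count / time_wait_seconds


# 255 is the pool size from the netsh output; the delays are example values.
for delay in (30, 120, 240):
    rate = max_sustained_connection_rate(255, delay)
    print(f"TIME_WAIT {delay:>3}s -> about {rate:.1f} new connections/second")
```

Even with the shortest delay shown, the pool supports fewer than ten new connections per second under these assumptions, which a server running several socket-heavy services can easily exceed.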

Also, HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\Tcpip\Parameters showed a MaxUserPort of 1279, which conflicted with the range given by Start Port + Number of Ports (50001 to 50255).

In a state of port exhaustion, the Microsoft documentation says you can get non-specific errors that just look like general connection errors.

e.g.
System.Net.WebException: The underlying connection was closed: An unexpected error occurred on a send.
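
Because the error itself gives no hint, one way to check for exhaustion is to count how many TCP connections are sitting in TIME_WAIT and compare that with the size of the dynamic port pool. The sketch below assumes the usual column layout of `netstat -an` on Windows (Proto, Local Address, Foreign Address, State); the function name is hypothetical and the sample rows are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state from `netstat -an` style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP rows look like: TCP  <local>  <remote>  <STATE>
        if len(fields) >= 4 and fields[0].upper() == "TCP":
            states[fields[3]] += 1
    return states


# Made-up sample rows, not output captured from the affected servers.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:50001         10.0.0.9:1433          ESTABLISHED
  TCP    10.0.0.5:50002         10.0.0.9:1433          TIME_WAIT
  TCP    10.0.0.5:50003         10.0.0.9:80            TIME_WAIT
  TCP    0.0.0.0:135            0.0.0.0:0              LISTENING
  UDP    0.0.0.0:123            *:*
"""

states = count_tcp_states(SAMPLE)
print("TIME_WAIT connections:", states["TIME_WAIT"])
```

If the TIME_WAIT count is close to the Number of Ports reported by netsh, port exhaustion is a likely explanation for the errors.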

References
https://msdn.microsoft.com/en-us/library/aa560610(v=bts.20).aspx
http://www.outsystems.com/forums/discussion/6956/how-to-tune-the-tcp-ip-stack-for-high-volume-of-web-requests/
https://technet.microsoft.com/en-us/library/cc938219.aspx
