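We have a "standard" three-tier architecture. Our middle tier is hosted in IIS and is accessed via .NET Remoting. These errors occur between the web and web service servers (the front-end tier) and the application servers (the middle tier) they remote to. We get this error 3-10 times per day, out of roughly 130K calls over the course of the day.
The exception and stack trace always look similar to this: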
Exception Type: System.Net.WebException
Message: The underlying connection was closed: An unexpected error occurred on a receive.
Server stack trace:
   at System.Runtime.Remoting.Channels.Http.HttpClientTransportSink.ProcessResponseException(WebException webException, HttpWebResponse& response)
   at System.Runtime.Remoting.Channels.Http.HttpClientTransportSink.ProcessMessage(IMessage msg, ITransportHeaders requestHeaders, Stream requestStream, ITransportHeaders& responseHeaders, Stream& responseStream)
   at System.Runtime.Remoting.Channels.BinaryClientFormatterSink.SyncProcessMessage(IMessage msg)
Exception rethrown at [0]:
   at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
   at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
   at XXXXX.BusinessFacade.Interface.XXXXInterface.SubmitXXXX(
   at XXX.XXXXWebServicesLibrary.XXXXService.CreateXXXXXX.RunXXXXMethod()
   at XXX.XXXXWebServicesLibrary.XXXXService.XXXXXXMethod`2.RunMethod()
   at XXX.XXXXWebServicesLibrary.XXXXXWebMethod`2.Run()
HandleReturnMessage()

Inner Exception:
Exception Type: System.IO.IOException
Message: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.PooledStream.Read(Byte[] buffer, Int32 offset, Int32 size)
   at System.Net.Connection.SyncRead(HttpWebRequest request, Boolean userRetrievedStream, Boolean probeRead)
Read()

Inner Exception:
Exception Type: System.Net.Sockets.SocketException
Message: An existing connection was forcibly closed by the remote host
   at System.Net.Sockets.Socket.Receive(Byte[] buffer, Int32 offset, Int32 size, SocketFlags socketFlags)
   at System.Net.Sockets.NetworkStream.Read(Byte[] buffer, Int32 offset, Int32 size)
Receive()
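No particular remoting call causes this. It can happen on any of them, which seems to rule out an application-specific cause. The only common factor is the innermost exception: "Exception Type: System.Net.Sockets.SocketException Message: An existing connection was forcibly closed by the remote host".
The front-end and middle tiers are separated by a firewall, and we also use a VIP device. I strongly suspect a problem with our network/firewall configuration, but our network people are just scratching their heads and have not offered any suggestions.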
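While a failure rate of 0.003% may seem insignificant, our partners scrutinize our communications very closely, and I am just waiting for this to become something they notice. When that time comes, I don't want to have to say "I don't know".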
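As a quick sanity check on that figure, the sketch below converts the daily counts quoted above (3-10 errors out of roughly 130K calls) into a percentage. The function name is made up for illustration, and the inputs are the approximate numbers from this post, not measured data.

```python
def failure_rate_percent(failures_per_day: int, calls_per_day: int) -> float:
    """Return the share of failed calls as a percentage."""
    return 100.0 * failures_per_day / calls_per_day


CALLS_PER_DAY = 130_000  # approximate daily call volume quoted in the post

low = failure_rate_percent(3, CALLS_PER_DAY)    # best day: 3 errors
high = failure_rate_percent(10, CALLS_PER_DAY)  # worst day: 10 errors

print(f"{low:.4f}% to {high:.4f}%")  # prints "0.0023% to 0.0077%"
```

So the quoted 0.003% sits near the low end of the observed range; on a bad day the rate is closer to 0.008%.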
Does anyone have ideas on how to gather more information about this, or any suggestions I could pass on to the network people to resolve it?
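The problem was the Cisco CSS. We established this by pointing the tier 1 servers directly at the tier 2 servers and running for 48 hours without observing the problem. Once we had determined it was the CSS, we corrected the problem by adjusting the extremely low default value of this parameter:
"The default flow inactivity timeout (in seconds) for a TCP or UDP port. If the flow is idle for the time specified in the timeout value, the CSS tears down the flow and reclaims the flow resources."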
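We set it to 84 (that is, 84 increments of 16 seconds, or 1,344 seconds). Since the default HTTP keep-alive is 120 seconds, the default value was too low: the CSS was tearing down idle connections that the servers still considered open.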
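The arithmetic behind that setting can be sketched as follows. This is a simplified model, not CSS configuration: the function names are made up for illustration, the 16-second increment and the 120-second keep-alive are the values quoted in this post, and the only assumption about the device is that it drops a flow once it has been idle for the full timeout.

```python
CSS_INCREMENT_SECONDS = 16     # per the post: the setting counts 16-second increments
HTTP_KEEP_ALIVE_SECONDS = 120  # keep-alive default quoted in the post


def flow_timeout_seconds(multiplier: int) -> int:
    """Idle time after which the device would tear down a flow."""
    return multiplier * CSS_INCREMENT_SECONDS


def min_safe_multiplier(keep_alive_seconds: int) -> int:
    """Smallest multiplier whose timeout strictly exceeds the keep-alive."""
    return keep_alive_seconds // CSS_INCREMENT_SECONDS + 1


def reuse_is_safe(idle_seconds: int, multiplier: int) -> bool:
    """True if a connection idle for idle_seconds is still open on the device."""
    return idle_seconds < flow_timeout_seconds(multiplier)


print(flow_timeout_seconds(84))                      # 1344
print(min_safe_multiplier(HTTP_KEEP_ALIVE_SECONDS))  # 8, i.e. 128 seconds
print(reuse_is_safe(HTTP_KEEP_ALIVE_SECONDS, 84))    # True
```

Any timeout shorter than the keep-alive leaves a window in which a pooled connection looks alive to the caller but has already been dropped in the middle, which matches the "forcibly closed by the remote host" symptom. Under this model anything above 120 seconds would close that window; 84 simply leaves a very wide margin.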