Sql-Server

SQL Server 2014 Standard FCI does not fail over

  • July 11, 2018

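I have an odd situation: a previously working FCI now fails regularly, without failing over, after a large backlog of Windows updates was installed. It sometimes seems to happen around backup time: the FCI fails, the SQL Server service cannot be restarted, and the cluster then apparently never tries to start it on the other node. The policy on the "SQL Server" resource is set to "attempt restart on current node" and also to "if restart is unsuccessful, fail over all resources", but the latter does not appear to happen.

This is Windows Server 2012 (not R2) Standard. Both nodes run the same SQL Server version, all Windows updates are installed, NIC drivers and firmware are current, the DNS servers are healthy, and cluster validation passes. The relevant part of the cluster log: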

00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: [cxl::Pinger-""EXAMPLE-SQL""] Host not registered."
00000d2c.00001528::2018/06/27-00:42:20.212  WARN    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: [cxl::Pinger-""EXAMPLE-SQL""] Could not find any endpoints for remote target"
00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: PingName internal returned 258
00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Setting resource specific message to Name Resolution Not Yet Available
000006ac.00001e84::2018/06/27-00:42:20.212  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3354
00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: Slow Operation, FinishWithReply: 258"
00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: InternalReplyHandler: 258
00000d2c.00001528::2018/06/27-00:42:20.212  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:22.896  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.00001528::2018/06/27-00:42:22.896  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:22.896  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:22.896  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:24.737  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.00001528::2018/06/27-00:42:24.737  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:24.737  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:24.737  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:27.903  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.0000136c::2018/06/27-00:42:27.903  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:27.903  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:27.903  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:29.744  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.00001528::2018/06/27-00:42:29.744  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:29.744  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:29.744  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:32.911  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.0000136c::2018/06/27-00:42:32.911  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:32.911  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:32.911  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:34.752  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.0000136c::2018/06/27-00:42:34.752  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:34.752  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:34.752  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:37.919  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.0000136c::2018/06/27-00:42:37.919  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:37.919  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:37.919  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:39.760  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.0000136c::2018/06/27-00:42:39.760  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:39.760  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:39.760  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:42.927  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.00001528::2018/06/27-00:42:42.927  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:42.927  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:42.927  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:44.768  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.0000136c::2018/06/27-00:42:44.768  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:44.768  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:44.768  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.00001528::2018/06/27-00:42:47.935  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.0000136c::2018/06/27-00:42:47.935  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.0000136c::2018/06/27-00:42:47.935  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.0000136c::2018/06/27-00:42:47.935  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:49.775  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.00001528::2018/06/27-00:42:49.775  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:49.775  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:49.775  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:52.942  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.00001528::2018/06/27-00:42:52.942  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:52.942  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:52.942  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000e70.000008dc::2018/06/27-00:42:53.379  ERR [RES]   "SQL Server <SQL Server>: [sqsrvres] Failure detected, diagnostics heartbeat is lost"
00000e70.000008dc::2018/06/27-00:42:53.379  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] IsAlive returns FALSE
00000e70.000008dc::2018/06/27-00:42:53.379  WARN    [RHS]   Resource SQL Server IsAlive has indicated failure.
000006ac.00001e84::2018/06/27-00:42:53.379  INFO    [RCM]   "HandleMonitorReply: FAILURENOTIFICATION for 'SQL Server', gen(0) result 1/0."
000006ac.00001e84::2018/06/27-00:42:53.379  INFO    [RCM]   Res SQL Server: Online -> ProcessingFailure( StateUnknown )
000006ac.00001e84::2018/06/27-00:42:53.379  INFO    [RCM]   TransitionToState(SQL Server) Online-->ProcessingFailure.
000006ac.0000151c::2018/06/27-00:42:53.379  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3355
000006ac.00001e84::2018/06/27-00:42:53.379  INFO    [RCM]   "rcm::RcmGroup::UpdateStateIfChanged: (SQL Server (MSSQLSERVER), Online --> Pending)"
000006ac.00001e84::2018/06/27-00:42:53.379  ERR [RCM]   rcm::RcmResource::HandleFailure: (SQL Server)
00000e70.000010f4::2018/06/27-00:42:53.379  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQLMoreResults() returns -1 with following information
00000e70.000010f4::2018/06/27-00:42:53.379  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [HYT00] [Microsoft][SQL Server Native Client 11.0]Query timeout expired (0)
00000e70.000010f4::2018/06/27-00:42:53.379  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [01000] [Microsoft][SQL Server Native Client 11.0][SQL Server]  (0)
00000e70.000010f4::2018/06/27-00:42:53.379  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] No more diagnostics results
00000e70.000010f4::2018/06/27-00:42:53.379  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Discard the pending result sets
000006ac.00000ac8::2018/06/27-00:42:53.379  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3356
000006ac.0000096c::2018/06/27-00:42:53.379  INFO    [RCM]   ignored non-local state Pending for group SQL Server (MSSQLSERVER)
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   "resource SQL Server: failure count: 0, restartAction: 0 persistentState: 1."
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   resource SQL Server will not be restarting; isLowPriority: true; numDependents: 1
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   Res SQL Server: ProcessingFailure -> WaitingToTerminate( Failed )
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   TransitionToState(SQL Server) ProcessingFailure-->[WaitingToTerminate to Failed].
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   Res SQL Server Agent: Online -> WaitingToTerminate( OfflineDueToProvider )
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   TransitionToState(SQL Server Agent) Online-->[WaitingToTerminate to OfflineDueToProvider].
000006ac.00002660::2018/06/27-00:42:53.395  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3357
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   Res SQL Server Agent: [WaitingToTerminate to OfflineDueToProvider] -> Terminating( OfflineDueToProvider )
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   TransitionToState(SQL Server Agent) [WaitingToTerminate to OfflineDueToProvider]-->[Terminating to OfflineDueToProvider].
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   SQL Server not yet ready to terminate; dependent SQL Server Agent still terminating.
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   Will NOT try to long delay restart SQL Server.
000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   RecursivelyCancelRestart: SQL Server Agent in state [Terminating to OfflineDueToProvider]
000006ac.00000ac8::2018/06/27-00:42:53.410  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3358
00000e70.000010f4::2018/06/27-00:42:53.442  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [24000] [Microsoft][SQL Server Native Client 11.0]Invalid cursor state (0)
00000e70.000010f4::2018/06/27-00:42:53.442  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Diagnostics is stopped
00000e70.0000105c::2018/06/27-00:42:53.442  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Online worker helper is stopped
00000e70.000010f4::2018/06/27-00:42:53.442  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Disconnect from SQL Server
00000e70.000010f4::2018/06/27-00:42:54.456  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Connect to SQL Server ...
00000e70.000010f4::2018/06/27-00:42:54.565  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] The connection was established successfully
00000e70.000010f4::2018/06/27-00:42:54.580  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Diagnostics is started
00000e70.000016c4::2018/06/27-00:42:54.580  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Online worker helper is started
00000e70.000010f4::2018/06/27-00:42:54.580  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server component 'system' health state has been changed from '' to 'clean' at 2018-06-27 00:42:54.577
00000e70.000010f4::2018/06/27-00:42:54.580  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server component 'resource' health state has been changed from '' to 'clean' at 2018-06-27 00:42:54.577
00000e70.000010f4::2018/06/27-00:42:54.612  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server component 'query_processing' health state has been changed from '' to 'clean' at 2018-06-27 00:42:54.577
00000e70.000010f4::2018/06/27-00:42:54.612  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server component 'io_subsystem' health state has been changed from '' to 'clean' at 2018-06-27 00:42:54.577
00000e70.000010f4::2018/06/27-00:42:54.612  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server component 'events' health state has been changed from '' to 'unknown' at 2018-06-27 00:42:54.577
00000d2c.0000136c::2018/06/27-00:42:54.783  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.00001528::2018/06/27-00:42:54.783  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:54.783  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:54.783  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
000006ac.0000151c::2018/06/27-00:42:55.906  INFO    [RCM]   "HandleMonitorReply: TERMINATERESOURCE for 'SQL Server Agent', gen(0) result 0/0."
000006ac.0000151c::2018/06/27-00:42:55.906  INFO    [RCM]   Res SQL Server Agent: [Terminating to OfflineDueToProvider] -> OfflineDueToProvider( StateUnknown )
000006ac.0000151c::2018/06/27-00:42:55.906  INFO    [RCM]   TransitionToState(SQL Server Agent) [Terminating to OfflineDueToProvider]-->OfflineDueToProvider.
000006ac.0000151c::2018/06/27-00:42:55.906  INFO    [RCM]   Res SQL Server: [WaitingToTerminate to Failed] -> Terminating( Failed )
000006ac.00001e84::2018/06/27-00:42:55.906  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3359
000006ac.0000151c::2018/06/27-00:42:55.906  INFO    [RCM]   TransitionToState(SQL Server) [WaitingToTerminate to Failed]-->[Terminating to Failed].
00000e70.000008dc::2018/06/27-00:42:55.906  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Request to terminate SQL Server
00000e70.000008dc::2018/06/27-00:42:55.906  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Stop service MSSQLSERVER immediately
00000e70.000016c4::2018/06/27-00:42:55.906  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Online worker was asked to terminate
00000e70.000010f4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQLMoreResults() returns -1 with following information
00000e70.000016c4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Online worker helper is stopped
00000e70.000010f4::2018/06/27-00:42:56.063  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [08S01] [Microsoft][SQL Server Native Client 11.0]TCP Provider: The specified network name is no longer available. (64)
00000e70.000010f4::2018/06/27-00:42:56.063  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [08S01] [Microsoft][SQL Server Native Client 11.0]Communication link failure (64)
00000e70.000010f4::2018/06/27-00:42:56.063  ERR [RES]   SQL Server <SQL Server>: [sqsrvres] ODBC Error: [01000] [Microsoft][SQL Server Native Client 11.0][SQL Server]  (0)
00000e70.000010f4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] No more diagnostics results
00000e70.000010f4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Diagnostics is stopped
00000e70.000010f4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Disconnect from SQL Server
00000e70.000010f4::2018/06/27-00:42:56.063  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Extended Event logging is stopped
00000e70.000010f4::2018/06/27-00:42:56.078  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Extended Event target state:
00000e70.000010f4::2018/06/27-00:42:56.078  INFO    [RES]   "SQL Server <SQL Server>: [sqsrvres] Extended Event session summary: dropped buffers = 0, dropped events = 0"
00000e70.000010f4::2018/06/27-00:42:56.078  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Online worker is stopped
000006ac.00000de8::2018/06/27-00:42:57.155  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3360
00000d2c.0000136c::2018/06/27-00:42:57.950  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:c65409d9-3cd6-4670-a26e-07912ecc888b:Netbios
00000d2c.00001528::2018/06/27-00:42:57.950  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:57.950  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:57.950  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000d2c.0000136c::2018/06/27-00:42:59.791  INFO    [RES]   Network Name: Agent: Sending request Netname/RecheckConfig to NN:80999ce3-1900-4137-85d5-4c4bc9309ced:Netbios
00000d2c.00001528::2018/06/27-00:42:59.791  INFO    [RES]   "Network Name <Cluster Name>: Netbios: Slow Operation, FinishWithReply: 0"
00000d2c.00001528::2018/06/27-00:42:59.791  INFO    [RES]   Network Name:  [NN] got sync reply: 0
00000d2c.00001528::2018/06/27-00:42:59.791  INFO    [RES]   "Network Name <Cluster Name>: Netbios: End of Slow Operation, state: Initialized/Idle, prevWorkState: Idle"
00000e18.000021cc::2018/06/27-00:42:59.931  INFO    [RES]   Physical Disk <Cluster Disk 3>: VolumeIsNtfs: Volume \\?\GLOBALROOT\Device\Harddisk1\ClusterPartition1\ has FS type NTFS
00000e70.000008dc::2018/06/27-00:43:00.992  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Service was stopped successfully
00000e70.000008dc::2018/06/27-00:43:00.992  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] Terminate handling is completed
00000e70.000008dc::2018/06/27-00:43:00.992  INFO    [RES]   SQL Server <SQL Server>: [sqsrvres] SQL Server resource state is changed from 'ClusterResourceOnline' to 'ClusterResourceFailed'
00000e70.000008dc::2018/06/27-00:43:00.992  WARN    [RHS]   returning ResourceExitStateTerminate.
000006ac.00001e84::2018/06/27-00:43:00.992  INFO    [RCM]   "HandleMonitorReply: TERMINATERESOURCE for 'SQL Server', gen(1) result 0/0."
000006ac.00001e84::2018/06/27-00:43:00.992  INFO    [RCM]   Res SQL Server: [Terminating to Failed] -> Failed( StateUnknown )
000006ac.00001e84::2018/06/27-00:43:00.992  INFO    [RCM]   TransitionToState(SQL Server) [Terminating to Failed]-->Failed.
000006ac.00001e84::2018/06/27-00:43:00.992  INFO    [RCM]   "rcm::RcmGroup::UpdateStateIfChanged: (SQL Server (MSSQLSERVER), Pending --> Failed)"
000006ac.00002660::2018/06/27-00:43:00.992  INFO    [RCM]   moved 0 tasks from staging set to task set.  TaskSetSize=0
000006ac.00002660::2018/06/27-00:43:00.992  INFO    [RCM]   "rcm::RcmPriorityManager::StartGroups: [RCM] done, executed 0 tasks"
000006ac.0000151c::2018/06/27-00:43:00.992  INFO    [GEM]   Node 1: Sending 1 messages as a batched GEM message with gid 3361
000006ac.0000096c::2018/06/27-00:43:00.992  INFO    [RCM]   ignored non-local state Failed for group SQL Server (MSSQLSERVER)
00000d2c.0000136c::2018/06/27-00:43:02.724  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: HealthCheck: EXAMPLE-SQL
00000d2c.0000136c::2018/06/27-00:43:02.724  INFO    [RES]   "Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: End of Slow Operation, state: Initialized/Reading, prevWorkState: Reading"
00000d2c.0000136c::2018/06/27-00:43:02.724  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: PingName internal returned 0
00000d2c.0000136c::2018/06/27-00:43:02.724  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Dns: Endpoint is up
00000d2c.0000136c::2018/06/27-00:43:02.724  INFO    [RES]   Network Name <SQL Network Name (EXAMPLE-SQL)>: Setting resource specific message to
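
The excerpt is long, so it can help to filter it down to the WARN/ERR entries and to the RCM line that records the restart decision. Below is a minimal Python sketch of that, not a definitive tool. It assumes the line layout seen above (process.thread, timestamp, level, component, message) and that some messages are wrapped in quotes with doubled inner quotes. It also assumes the restartAction number printed by RCM corresponds to the documented RestartAction resource property (0 = do not restart, 1 = restart without failing over the group, 2 = restart and fail over once the threshold is exceeded). The names parse_line, summarize and SAMPLE are made up for this example; the three sample lines are copied from the log above.

```python
import re

# Layout assumed from the excerpt: pid.tid::timestamp  LEVEL  [COMPONENT]  message
LINE_RE = re.compile(
    r"^(?P<pid>[0-9a-f]{8})\.(?P<tid>[0-9a-f]{8})::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>INFO|WARN|ERR)\s+\[(?P<component>[A-Z]+)\]\s+(?P<message>.*)$"
)
RESTART_RE = re.compile(r"failure count: (\d+), restartAction: (\d+)")

# Assumption: these are the values of the RestartAction resource property.
RESTART_ACTIONS = {
    0: "ClusterResourceDontRestart",
    1: "ClusterResourceRestartNoNotify",
    2: "ClusterResourceRestartNotify",
}


def parse_line(line):
    """Split one cluster log line into its fields, or return None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Undo the quoting seen on some lines ("..." with "" for inner quotes).
    entry["message"] = entry["message"].strip().strip('"').replace('""', '"')
    return entry


def summarize(lines):
    """Return (WARN/ERR entries, restart decisions) found in the lines."""
    problems = []
    decisions = []
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue
        if entry["level"] in ("WARN", "ERR"):
            problems.append(entry)
        found = RESTART_RE.search(entry["message"])
        if found:
            action = int(found.group(2))
            decisions.append(
                (entry["ts"], int(found.group(1)), action,
                 RESTART_ACTIONS.get(action, "unknown"))
            )
    return problems, decisions


# Three lines copied from the excerpt above.
SAMPLE = [
    '00000e70.000008dc::2018/06/27-00:42:53.379  ERR [RES]   "SQL Server <SQL Server>: [sqsrvres] Failure detected, diagnostics heartbeat is lost"',
    '00000e70.000008dc::2018/06/27-00:42:53.379  WARN    [RHS]   Resource SQL Server IsAlive has indicated failure.',
    '000006ac.00001e84::2018/06/27-00:42:53.395  INFO    [RCM]   "resource SQL Server: failure count: 0, restartAction: 0 persistentState: 1."',
]

if __name__ == "__main__":
    problems, decisions = summarize(SAMPLE)
    for entry in problems:
        print(entry["ts"], entry["level"], entry["component"], entry["message"])
    for ts, failures, action, name in decisions:
        print(ts, "failure count", failures, "restartAction", action, name)
```

If the mapping assumption holds, the restartAction: 0 entry at 00:42:53.395, together with "will not be restarting", would be worth comparing against the policy configured on the resource.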

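Since I posted this, updated NIC drivers and firmware became available, and installing them resolved the problem, sort of. My FCI no longer fails because the network is now stable (it looks like high NIC load was causing all sorts of problems, and the NIC updates fixed those), but I am not convinced it would fail over even if the problem came back.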

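Sorry for posting this as an answer, but I can't "comment" yet. I see "SQL Server <SQL Server>: [sqsrvres] IsAlive returns FALSE" in the log. If I remember correctly, there is an account that performs this check, but I don't remember which one. Check the SQL Server logs for failed logins or anything related. It could be that the account lacks permissions, or that authentication is running into a problem. Antivirus software can sometimes cause issues as well. If possible, please paste the SQL Server log from the time the problem occurred.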
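
One way to check for failed logins around the failure is to filter a copy of the SQL Server ERRORLOG by time. The sketch below is only an illustration under stated assumptions: each entry starts with "YYYY-MM-DD hh:mm:ss.ff", followed by a source column and the message; failed logins show up as "Login failed" text or error 18456; and the file is UTF-16 encoded. The function names login_failures_near and scan_file are made up for this example.

```python
import re
from datetime import datetime, timedelta

# Assumed ERRORLOG entry layout: "YYYY-MM-DD hh:mm:ss.ff <source> <message>"
ENTRY_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.\d{2}\s+(\S+)\s+(.*)$"
)


def login_failures_near(lines, failure_time, window_minutes=5):
    """Return (time, source, message) for login failures near failure_time."""
    window = timedelta(minutes=window_minutes)
    hits = []
    for line in lines:
        match = ENTRY_RE.match(line.rstrip("\r\n"))
        if match is None:
            continue  # continuation line or unrelated text
        when = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        message = match.group(3)
        is_login_failure = "Login failed" in message or "Error: 18456" in message
        if is_login_failure and abs(when - failure_time) <= window:
            hits.append((when, match.group(2), message))
    return hits


def scan_file(path, failure_time, window_minutes=5):
    """Scan a copy of the ERRORLOG file (UTF-16 encoding is an assumption)."""
    with open(path, encoding="utf-16") as handle:
        return login_failures_near(handle, failure_time, window_minutes)
```

For the failure in the cluster log above, failure_time would be datetime(2018, 6, 27, 0, 42, 53). An empty result does not rule out an authentication problem; it only means nothing matched in that window.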

Source: https://dba.stackexchange.com/questions/210882