Microsoft Hyper-V Cluster CSV Enters Paused State

 

Sometimes, without any warning, a CSV (Cluster Shared Volume, i.e. a LUN) enters a paused state. This can cause VMs to restart, go into a saved state, or show other unwanted behavior!

This can happen for many reasons: storage disconnections, iSCSI network problems, bad subnets or IP addresses on the iSCSI network, stuck switches, and Active Directory errors such as a misconfigured DNS server or a wrong cluster configuration on the DC, among others.

Sometimes the error codes are simple and clear, but sometimes they are terse and not clear enough to tell what went wrong.

For example, Event ID 5120:

Event ID: 5120
Source: Microsoft-Windows-FailoverCluster
Level: Error
Description: Cluster Shared Volume “volume_name” is no longer available on this node because of ‘STATUS_BAD_NETWORK_PATH(c00000be)’. All I/O will temporarily be queued until a path to the volume is re-established.

Event ID: 5120
Source: Microsoft-Windows-FailoverCluster
Level: Error
Description: Cluster Shared Volume “volume_name” is no longer available on this node because of ‘STATUS_CONNECTION_DISCONNECTED(c000020c)’. All I/O will temporarily be queued until a path to the volume is reestablished.

Event ID: 5120
Source: Microsoft-Windows-FailoverCluster
Level: Error
Description: Cluster Shared Volume “volume_name” is no longer available on this node because of ‘STATUS_MEDIA_WRITE_PROTECTED(c00000a2)’. All I/O will temporarily be queued until a path to the volume is reestablished.

Event ID: 5142
Source: Microsoft-Windows-FailoverCluster
Description: Cluster Shared Volume “volume_name” (‘Cluster Disk #’) is no longer accessible from this cluster node because of error ‘ERROR_TIMEOUT(1460)’. Please troubleshoot this node’s connectivity to the storage device and network connectivity.
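
To see whether any node has logged these events, you can query the event log from PowerShell. This is a minimal sketch, assuming the events are written to the System log on the node where you run it; the 7-day window is an arbitrary choice, not something from the events themselves.

```powershell
# Pull recent CSV-related cluster events from the System log.
# 5120 and 5142 are the event IDs shown above; the 7-day window is an assumption.
$events = Get-WinEvent -FilterHashtable @{
    LogName   = 'System'
    Id        = 5120, 5142
    StartTime = (Get-Date).AddDays(-7)
} -ErrorAction SilentlyContinue

# Show when each event fired and its full description, newest first.
$events |
    Sort-Object TimeCreated -Descending |
    Select-Object TimeCreated, Id, MachineName, Message |
    Format-List
```

The status name in the description (for example STATUS_BAD_NETWORK_PATH) is usually the most useful part, so read the Message field first.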

You can get this error without really understanding what the hell went wrong. Apparently these error codes come from SMB error codes, full list here.

Sometimes this happens when a CSV volume is accessed from a passive (non-coordinator) node: the disk I/O to the owning (coordinator) node is routed through a ‘preferred’ network adapter and requires SMB to be enabled on that network adapter. For SMB connections to work, check that the following protocols are enabled on these network adapters:

  • Client for Microsoft Networks
  • File and Printer Sharing for Microsoft Networks
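
Instead of opening the adapter properties on every node, the two bindings can be listed from PowerShell. A small sketch, assuming the NetAdapter module that ships with Windows Server 2012 and later; the adapter name in the commented-out line is hypothetical.

```powershell
# ms_msclient = Client for Microsoft Networks
# ms_server   = File and Printer Sharing for Microsoft Networks
Get-NetAdapterBinding -ComponentID ms_msclient, ms_server |
    Sort-Object Name, ComponentID |
    Format-Table Name, DisplayName, ComponentID, Enabled -AutoSize

# To enable a missing binding on one adapter
# ('Cluster-Net' is a hypothetical adapter name, replace it with yours):
# Enable-NetAdapterBinding -Name 'Cluster-Net' -ComponentID ms_server
```

Any adapter used for cluster or CSV traffic that shows Enabled = False for either component is worth a closer look.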

In any case, you should also check the SMB connections between all nodes. First check the SMB version and connections. Run from PowerShell on a Hyper-V node:

PS C:\> Get-SmbConnection

Then run

PS C:\> Get-SmbClientConfiguration

Make sure EnableMultiChannel is set to True.
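
To run the same check on every node at once, something like the following may help. It is only a sketch: it assumes the FailoverClusters module is available and that PowerShell remoting is allowed between the nodes.

```powershell
# Assumption: run on a cluster node with the FailoverClusters module installed
# and PowerShell remoting enabled between nodes.
$nodes = (Get-ClusterNode).Name

# Report the multichannel setting of the SMB client on each node.
Invoke-Command -ComputerName $nodes -ScriptBlock {
    Get-SmbClientConfiguration | Select-Object EnableMultiChannel
} | Format-Table PSComputerName, EnableMultiChannel -AutoSize

# On the local node, list the active SMB connections with their negotiated dialect.
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect, NumOpens

# To turn multichannel on where it is False (asks for confirmation):
# Set-SmbClientConfiguration -EnableMultiChannel $true
```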

You can make adjustments to your node's SMB configuration; all the information and registry keys are on these pages:

https://blogs.msdn.microsoft.com/openspecification/2013/03/27/smb-2-x-and-smb-3-0-timeouts-in-windows/

 

CIFS and SMB Timeouts in Windows

 


* You can also add all the cluster IP addresses and names to the hosts file on all the nodes.
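
A rough sketch of how the hosts entries could be added from an elevated PowerShell prompt. All names and addresses below are hypothetical placeholders; replace them with your own cluster name, node names and IPs.

```powershell
# Hypothetical names and addresses: replace with your cluster's real values.
$entries = @(
    '10.0.0.10  hvcluster01'
    '10.0.0.11  hvnode01'
    '10.0.0.12  hvnode02'
)

$hostsPath = "$env:SystemRoot\System32\drivers\etc\hosts"
$existing  = Get-Content -Path $hostsPath

# Append only the lines that are not already in the file, so the script can be re-run safely.
$entries |
    Where-Object { $existing -notcontains $_ } |
    Add-Content -Path $hostsPath
```

Keep in mind that hosts entries have to be maintained by hand, so update them whenever a node or cluster address changes.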

Good Luck

 

 

 

3 Comments

  • Alstar says:

    Hi,
    I have a question about the Get-SmbConnection cmdlet. On one of my Hyper-V nodes I can see output when I run this command, but on the other node I don’t. Is that how it should be, or should I see output on both nodes when I run this cmdlet?
    The multi channel is True on both nodes.

    Thanks
    A

    • admin says:

      You should see the result on any node you run this…make sure you allow it by policy set-executionpolicy …

  • N says:

    We had the same problem ongoing for 8 months. MS support were not able to resolve it. We eventually disabled the firewall, as we noticed security event logs that indicated critical traffic being dropped (ports 53 & 445).
    We later discovered this problem to be related to a known issue with IPsec offload being enabled when the firewall is also enabled. Disable IPsec offload 😄
