Site-aware Failover Clusters in Windows Server 2016

Windows Server 2016 introduces site-aware clusters. Nodes in stretched clusters can now be grouped based on their physical location (site). Cluster site awareness enhances key operations during the cluster lifecycle, such as failover behavior, placement policies, heartbeating between the nodes, and quorum behavior. In the remainder of this blog I will explain how you can configure sites for your cluster, the notion of a “preferred site”, and how site awareness manifests itself in your cluster operations.

Configuring Sites

A node's site membership is configured by creating a fault domain of type Site and then setting that site as the parent of the node.

For example, in a four-node cluster with nodes Node1, Node2, Node3 and Node4, to assign Node1 and Node2 to a site named Seattle and Node3 and Node4 to a site named Denver, do the following:

Launch PowerShell as an Administrator and type:

# Create Site Fault Domains
New-ClusterFaultDomain -Name Seattle -Type Site -Description "Primary" -Location "Seattle DC"
New-ClusterFaultDomain -Name Denver -Type Site -Description "Secondary" -Location "Denver DC"

# Set Fault Domain membership
Set-ClusterFaultDomain -Name Node1 -Parent Seattle
Set-ClusterFaultDomain -Name Node2 -Parent Seattle

Set-ClusterFaultDomain -Name Node3 -Parent Denver
Set-ClusterFaultDomain -Name Node4 -Parent Denver
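
To confirm that the sites and node memberships took effect, it can help to list the fault domains afterwards. The following is a minimal sketch, assuming the Seattle and Denver sites from the example above and a PowerShell session on a cluster node with the FailoverClusters module available:

```powershell
# List every fault domain known to the cluster (nodes as well as sites).
# Each node should now report its site as its parent.
Get-ClusterFaultDomain

# Show only the site-level fault domains created above.
Get-ClusterFaultDomain -Type Site
```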

Configuring sites enhances the operation of your cluster in the following ways:

Failover Affinity

  • Groups fail over to a node within the same site before failing over to a node in a different site
  • During Node Drain, VMs are moved first to a node within the same site before being moved cross-site
  • CSV load balancing distributes volumes within the same site

Storage Affinity

Virtual Machines (VMs) follow storage and are placed in the same site where their associated storage resides. VMs will begin live migrating to the same site as their associated CSV one minute after the storage is moved.

Cross-Site Heartbeating

You now have the ability to configure the thresholds for heartbeating between sites. These thresholds are controlled by the following new cluster properties:

Property            Default Value  Description
CrossSiteDelay      1000           Amount of time, in milliseconds, between each heartbeat sent to nodes on dissimilar sites
CrossSiteThreshold  20             Missed heartbeats before interface considered down to nodes on dissimilar sites

To configure the above properties launch PowerShell as an Administrator and type:

(Get-Cluster).CrossSiteDelay = <value>
(Get-Cluster).CrossSiteThreshold = <value>
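
Before changing either value, it can be useful to read the current settings and see how long a cross-site outage they tolerate. The sketch below assumes it is run on a cluster node; the variable names are illustrative only. With the defaults in the table above, the product works out to 1000 ms x 20 = 20 seconds.

```powershell
# Read the current cross-site heartbeat settings.
$cluster = Get-Cluster
$cluster | Format-List -Property CrossSiteDelay, CrossSiteThreshold

# Approximate time, in seconds, that cross-site heartbeats can be missed
# before the interface is considered down (delay in ms x missed heartbeats).
$toleranceSeconds = ($cluster.CrossSiteDelay * $cluster.CrossSiteThreshold) / 1000
"Cross-site heartbeat tolerance: $toleranceSeconds seconds"
```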

You can find more information on other properties controlling failover clustering heartbeating.

The following rules define the applicability of the thresholds controlling heartbeating between two cluster nodes:

  • If the two cluster nodes are in two different sites and two different subnets, then the Cross-Site thresholds will override the Cross-Subnet thresholds.
  • If the two cluster nodes are in two different sites and the same subnet, then the Cross-Site thresholds will override the Same-Subnet thresholds.
  • If the two cluster nodes are in the same site and two different subnets, then the Cross-Subnet thresholds will be effective.
  • If the two cluster nodes are in the same site and the same subnet, then the Same-Subnet thresholds will be effective.
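
The four rules above reduce to a simple precedence: site membership is checked first, then subnet. The function below is only an illustration of that decision, not a cluster API; the name Get-HeartbeatThresholdScope and its parameters are hypothetical.

```powershell
# Hypothetical helper: returns which set of heartbeat thresholds applies
# to a pair of nodes, following the four rules listed above.
function Get-HeartbeatThresholdScope {
    param(
        [Parameter(Mandatory)] [bool] $SameSite,
        [Parameter(Mandatory)] [bool] $SameSubnet
    )

    # Different sites: Cross-Site thresholds win regardless of subnet.
    if (-not $SameSite) { return 'CrossSite' }

    # Same site, different subnets: Cross-Subnet thresholds apply.
    if (-not $SameSubnet) { return 'CrossSubnet' }

    # Same site and same subnet: Same-Subnet thresholds apply.
    return 'SameSubnet'
}

# Two nodes in different sites that share a stretched subnet.
Get-HeartbeatThresholdScope -SameSite $false -SameSubnet $true
```

For the example call, the function returns CrossSite, matching the second rule.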

Configuring Preferred Site

In addition to configuring the site a cluster node belongs to, a “Preferred Site” can be configured for the cluster. The Preferred Site is a placement preference and will typically be your primary datacenter site.

Before the Preferred Site can be configured, the site being chosen as the preferred site needs to be assigned to a set of cluster nodes. To configure the Preferred Site for a cluster, launch PowerShell as an Administrator and type:

(Get-Cluster).PreferredSite = <Site assigned to a set of cluster nodes>
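
As a concrete illustration, the sketch below assumes the Seattle site from the earlier fault domain example and assumes that the property takes the site's fault domain name as a string:

```powershell
# Make Seattle (a site that already has nodes assigned) the cluster's preferred site.
(Get-Cluster).PreferredSite = "Seattle"

# Read the value back to confirm the change.
(Get-Cluster).PreferredSite
```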

Configuring a Preferred Site for your cluster enhances operation in the following ways:

Cold Start

During a cold start, VMs are placed in the preferred site.

Quorum

  • Dynamic Quorum drops weights from the disaster recovery (DR) site first, i.e. the site which is not designated as the Preferred Site, to ensure that the Preferred Site survives if all things are equal. In addition, nodes are pruned from the DR site first during regroup after events such as asymmetric connectivity failures.
  • During a Quorum Split, i.e. the even split of two datacenters with no witness, the Preferred Site is automatically elected to win
    • The nodes in the DR site drop out of cluster membership
    • This allows the cluster to survive a simultaneous 50% loss of votes

Note: The LowerQuorumPriorityNodeID property, which previously controlled this behavior, is deprecated in Windows Server 2016.

Preferred Site and Multi-master Datacenters

The Preferred Site can also be configured at the granularity of a cluster group, i.e. a different preferred site can be configured for each group. This enables a datacenter to be active and preferred for specific groups/VMs.

To configure the Preferred Site for a cluster group, launch PowerShell as an Administrator and type:

(Get-ClusterGroup -Name <GroupName>).PreferredSite = <Site assigned to a set of cluster nodes>
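
For a multi-master layout, each datacenter can be made the preferred site for a different set of groups. In the sketch below, "SQL-VM" and "Web-VM" are hypothetical group names, and Seattle and Denver are the sites from the earlier fault domain example:

```powershell
# Hypothetical groups: prefer Seattle for one workload and Denver for another.
(Get-ClusterGroup -Name "SQL-VM").PreferredSite = "Seattle"
(Get-ClusterGroup -Name "Web-VM").PreferredSite = "Denver"

# Review the per-group setting alongside the current owner node.
Get-ClusterGroup | Format-Table -Property Name, OwnerNode, PreferredSite
```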

Placement Priority

Groups in a cluster are placed based on the following site priority:

  1. Storage affinity site
  2. Group preferred site
  3. Cluster preferred site
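
The priority order above amounts to "use the first of the three sites that is set". The function below is a purely illustrative model of that ordering, not the cluster's actual placement logic; Resolve-PlacementSite and its parameters are hypothetical names.

```powershell
# Hypothetical helper: picks the placement site by walking the priority list
# (storage affinity site, then group preferred site, then cluster preferred site).
function Resolve-PlacementSite {
    param(
        [string] $StorageSite,
        [string] $GroupPreferredSite,
        [string] $ClusterPreferredSite
    )

    foreach ($candidate in @($StorageSite, $GroupPreferredSite, $ClusterPreferredSite)) {
        # Skip entries that are not configured.
        if (-not [string]::IsNullOrEmpty($candidate)) { return $candidate }
    }

    # No site preference applies.
    return $null
}

# A group with no storage affinity, preferred site Denver, in a cluster that prefers Seattle.
Resolve-PlacementSite -GroupPreferredSite 'Denver' -ClusterPreferredSite 'Seattle'
```

In this example the group preferred site takes precedence over the cluster preferred site, so the call returns Denver.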

Additional Information: Fault Domain Awareness in WS2016

Fault domains are introduced for clustering in Windows Server 2016 and provide Node, Chassis, Rack, and Site awareness. See this blog as well as the videos below to learn more about fault domain awareness.

Part 1: Overview

Part 2: Using PowerShell

Part 3: Using XML

Part 4: Location, Description

 

This article was originally published by Microsoft's Premier Field Engineering Blog. You can find the original article here.