Scale-Out File Server Improvements in Windows Server 2019

SMB Connections move on connect

Scale-Out File Server (SOFS) relies on DNS round robin for inbound connections sent to cluster nodes. When using Storage Spaces on Windows Server 2016 and older, this behavior can be inefficient: if the connection is routed to a cluster node that is not the owner of the Cluster Shared Volume (aka the coordinator node), all data is redirected over the network to another node before returning to the client. The SMB Witness service detects this lack of direct I/O and moves the connection to the coordinator node. This can lead to delays.

In Windows Server 2019, this process is much more efficient. The SMB Server service determines whether direct I/O on the volume is possible. If direct I/O is possible, it lets the connection proceed. If the I/O would be redirected, it moves the connection to the coordinator node before I/O starts. Synchronous client redirection required changes in the SMB client, so only Windows Server 2019 and Windows 10 Fall 2017 clients can use this new functionality when talking to a Windows Server 2019 Failover Cluster. SMB clients from older OS versions will continue relying on the SMB Witness to move them to a more optimal server.
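The decision described above can be sketched as a small toy model. This is not how the SMB Server service is implemented. The `Node` class, the `direct_io` flag, and the `place_connection` function are hypothetical names introduced purely for illustration, and the model assumes a single coordinator node per volume.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Node:
    name: str
    direct_io: bool  # hypothetical flag: can this node do direct I/O on the volume?


def place_connection(landed_on: Node, coordinator: Node,
                     client_supports_move: bool) -> Tuple[str, str]:
    """Return (node that ends up serving the connection, what happened)."""
    if landed_on.direct_io:
        # Direct I/O is possible, so the connection stays where DNS sent it.
        return landed_on.name, "direct I/O, connection stays"
    if client_supports_move:
        # Newer clients are moved to the coordinator before any I/O starts.
        return coordinator.name, "moved on connect, before I/O starts"
    # Older clients stay put and rely on SMB Witness to move them later.
    return landed_on.name, "redirected I/O until SMB Witness moves the client"


node_a = Node("NodeA", direct_io=True)   # coordinator in this example
node_b = Node("NodeB", direct_io=False)
for supports_move in (True, False):
    print(place_connection(node_b, node_a, supports_move))
```

The only difference between the two client generations in this model is when the move happens: before the first I/O for newer clients, or after redirected I/O has already begun for older ones.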

As a note here, I wanted to point out when a move would and would not occur in a stretch scenario, as it depends on the storage you are using. For my example, my Scale-Out File Server is running on NodeA in SiteA. All nodes' IP addresses are registered in DNS, and round robin determines where a client connects.

If you have a stretch Failover Cluster and the storage presents itself as symmetric (meaning all nodes have access to the drives), the client connection will be moved to SiteA as described above.

But let's say you are using SAN storage and it is asymmetric; meaning, each site has its own SAN storage and there is replication between them. This is the process that will occur:

1. A client connection is sent to a node in SiteB.

2. The node in SiteB will retain that connection.

3. All data requests will be redirected over the CSV network to SiteA.

4. Data is retrieved and sent back over the CSV network to the node in SiteB.

5. The node in SiteB then sends the data to the client.

6. Rinse, repeat for all other data requests.
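To compare the symmetric and asymmetric cases side by side, here is a rough sketch that lists the hops a single read request takes. It is a simplification that assumes one node per site. The function name `request_hops` and its parameters are made up for this illustration and do not correspond to any Windows API.

```python
from typing import List


def request_hops(landing_site: str, coordinator_site: str,
                 symmetric_storage: bool) -> List[str]:
    """List the hops for one read request, following the behavior in the post."""
    if landing_site == coordinator_site or symmetric_storage:
        # The connection either landed on the coordinator's site or was
        # moved there on connect, so the request is served directly.
        site = coordinator_site
        return [
            f"client -> node in {site}",
            f"node in {site} reads storage",
            f"node in {site} -> client",
        ]
    # Asymmetric storage: the landing node keeps the connection and every
    # request is redirected over the CSV network to the coordinator's site.
    return [
        f"client -> node in {landing_site}",
        f"node in {landing_site} -> node in {coordinator_site} (CSV network)",
        f"node in {coordinator_site} reads storage",
        f"node in {coordinator_site} -> node in {landing_site} (CSV network)",
        f"node in {landing_site} -> client",
    ]


for symmetric in (True, False):
    print("symmetric" if symmetric else "asymmetric")
    for hop in request_hops("SiteB", "SiteA", symmetric):
        print("  " + hop)
```

In this model the asymmetric case adds two CSV network hops to every request, which is the "rinse, repeat" cost from the list above.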

Infrastructure Scale-Out File Server

There is a new Scale-Out File Server role in Windows Server 2019 called Infrastructure File Server. When you create an Infrastructure File Server, it automatically creates a single namespace share for the CSV drive (e.g. \\InfraSOFSName\Volume1). In hyper-converged configurations, an Infrastructure SOFS allows an SMB client (Hyper-V host) to communicate with guaranteed Continuous Availability (CA) to the Infrastructure SOFS SMB server. There can be at most one Infrastructure SOFS cluster role on a Failover Cluster.

To create the Infrastructure SOFS, you would need to use PowerShell.  For example:
Add-ClusterScaleOutFileServerRole -Cluster MyCluster -Infrastructure -Name InfraSOFSName

SMB Loopback

Server Message Block (SMB) has been enhanced to work properly with SMB local loopback, where a node connects to a share it hosts itself; this was previously not supported. Hyper-converged SMB loopback CA is achieved when virtual machines access their virtual disk (VHDX) files and the owning VM identity is forwarded between the client and server.

This is a role that Cluster Sets takes advantage of, where the path to the VHD/VHDX is placed as \\InfraSOFSName\Volume1. This \\InfraSOFSName\Volume1 path can then be used by the virtual machine whether it is local or remote.

Identity Tunneling

In Windows Server 2016, if virtual machines are hosted on a SOFS share, you must grant the machine accounts of the compute nodes permission to access the VHD/VHDX files. If the virtual machines and VHD/VHDX files are on the same cluster, then the user must have rights. This can make management difficult, as two sets of permissions are needed.

In Windows Server 2019, when using SOFS, we now have “identity tunneling” on Infrastructure shares. When you access an Infrastructure share from the same cluster or Cluster Set, the application token is serialized and tunneled to the server, and VM disk access is done using that token. This works even if your identity is Local System, a service, or a virtual machine account.

John Marlin
Senior Program Manager
High Availability and Storage

Follow me on Twitter @JohnMarlin_MSFT


This article was originally published by Microsoft’s Failover Clustering Blog. You can find the original article here.