Failover Cluster File Share Witness and DFS

First published on MSDN on Apr 13, 2018
One of the quorum models for Failover Clustering uses a file share as a witness resource. As a recap, the File Share Witness is given a vote in the cluster when needed and can act as a tie breaker if there is ever a split between nodes (mainly seen in multi-site scenarios).
However, over the years, we have seen cases where this share is placed on a DFS share. This is an awfully bad idea and one not supported by Microsoft. Please do not take this as a stance against DFS. DFS is a great feature with numerous deployments out there. I am specifically talking about putting a File Share Witness on a DFS share.
Let me give you an example of what can happen on a Windows cluster. Let's take a 4-node multi-site cluster with two nodes at each site running SQL FCI. Each site has shared drives that use some sort of replication (Storage Replica, for those Ned fans). The cluster connects to a file share witness that is part of a DFS share. So, it would look something like this.

All is fine, dandy and working smoothly. But this is what can happen if there is some sort of break in communications between the two sites.

What has happened is a loss of connectivity between the two sites. Site A already has the file share witness and places a lock on it so no one else can come along and take it. Because it is already running SQL, it stays status quo. Site B is where the problem occurs. Since it cannot communicate with Site A, it has no idea what is going on. The Site B nodes do what they are supposed to do, which is arbitrate for the Cluster Group and the witness resource. When they try to connect, the DFS referral sends them to one of the other machines hosting the share, and the connection succeeds. The Site B nodes see that they have the witness, so they start bringing everything online, which includes SQL and its databases. For those not so familiar with Failover Clustering and all its jargon, this is known as a split brain.
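
The vote arithmetic behind this scenario can be sketched with a toy model. This is not cluster code and does not reflect how the cluster service is implemented. The names `WitnessTarget`, `site_has_quorum`, and `simulate` are hypothetical, and the model assumes five total votes (four nodes plus the witness) with a simple majority of three. It only illustrates why one lockable share yields a single surviving site, while two DFS targets let both sites claim the witness vote.

```python
from dataclasses import dataclass
from typing import Dict, Optional

TOTAL_VOTES = 5                   # assumption: 4 nodes + 1 witness vote
MAJORITY = TOTAL_VOTES // 2 + 1   # 3 votes needed to keep quorum


@dataclass
class WitnessTarget:
    """One physical folder behind the witness path; holds at most one lock."""
    name: str
    locked_by: Optional[str] = None

    def try_lock(self, site: str) -> bool:
        # The lock succeeds if the folder is free or already held by this site.
        if self.locked_by in (None, site):
            self.locked_by = site
            return True
        return False


def site_has_quorum(site: str, node_votes: int, target: WitnessTarget) -> bool:
    # A site gains the witness vote only if it can lock the folder it reaches.
    witness_vote = 1 if target.try_lock(site) else 0
    return node_votes + witness_vote >= MAJORITY


def simulate(referrals: Dict[str, WitnessTarget]) -> Dict[str, bool]:
    """referrals maps each site to the folder it reaches during the partition."""
    return {site: site_has_quorum(site, 2, target)
            for site, target in referrals.items()}


# Plain file share: both sites reach the same folder, so only one lock exists.
share = WitnessTarget("fileserver")
plain = simulate({"A": share, "B": share})

# DFS share: each site is referred to a different folder, so both locks succeed.
dfs = simulate({"A": WitnessTarget("dfs-target-1"),
                "B": WitnessTarget("dfs-target-2")})

print(plain)  # {'A': True, 'B': False}
print(dfs)    # {'A': True, 'B': True}
```

In the plain-share case Site B ends up with two votes out of five and stays down. In the DFS case both sites count three votes, which is the split brain described above.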

So as far as each side's view of membership goes, both have quorum, and SQL clients are connecting and writing to and updating the databases. When connectivity is restored between the sites and we get back to our normal cluster view, we think everything is all roses again.
However, remember that each side had its SQL databases being written to. Once the replication begins again, a very possible outcome is that everything that was written on one of the sides is now gone.
So as pointed out earlier:
This is an awfully bad idea.
Microsoft does not support running a File Share Witness on DFS shares.
For Windows Server 2019, additional safeguards have been added to help protect against misconfigurations. We have added logic to check whether the share is on DFS.
In Failover Cluster Manager, if you go through the quorum configuration wizard and try to use a DFS share, it will fail on the Summary Page with this dialog:

If you attempt to set it through PowerShell, it will fail with this error:

PS C:\Windows\system32> Set-ClusterQuorum -FileShareWitness \\contoso.com\dfs-share
Set-ClusterQuorum : There was an error configuring the file share witness '\\contoso.com\dfs-share'.
Unable to save property changes for 'File Share Witness'.
The request is not supported

Logic has also been added to the online of the File Share Witness resource, as well as to the thorough resource health check (IsAlive), to validate whether the share is on DFS. If DFS is added after the fact, these checks will fail the resource.
John Marlin
Senior Program Manager
High Availability and
Follow me on Twitter @JohnMarlin_MSFT


This article was originally published by Microsoft's Core Infrastructure and Security Blog.