- 1 Introduction
- 2 How to Test S2D for Health and Performance
- 3 Create a CSV (Cluster Shared Volume)
- 4 Test Network LAN Server-to-Server
- 5 Monitor Storage Performance
- 6 Workload Stress S2D Cluster – VMFleet
- 7 Monitor Network Speed with Resource Monitor
- 8 Monitor Performance and Error Counters with Performance Monitor
- 9 References
Introduction
This article is the second of a four-part series.
- Configuring Core Cluster
- Troubleshooting Storage Clusters [this article]
- Configuring Storage Network Infrastructures
- Managing Storage Clusters
How to Test S2D for Health and Performance
This article provides a guideline for performing health tests on an S2D cluster. The most common configuration issues for a new Storage Spaces Direct (S2D) cluster involve the networking and storage components. The tests documented here cover only those two components and provide a starting procedure for more comprehensive testing and troubleshooting.
The testing procedures covered are:
- Stress load each network port and test for performance and errors. Network issues often present as reduced network speed caused by hardware problems in the network interconnecting the physical servers.
- Perform a simple storage stress test on each cluster node. This uses a simple file copy or a performance tool to move data from the physical node to the S2D storage system.
- Load the cluster with a group of VMs running stress tests using VMFleet. This will simulate a large number of VM application workloads.
Instructions on monitoring are included at the end of this article.
Create a CSV (Cluster Shared Volume)
Create the volume with two or three storage tiers, depending on your hardware configuration and the types of physical disks in place. The following storage performance tests can optionally be performed on various Storage Spaces virtual disk configurations. Since each virtual disk configuration may perform differently, this procedure is a valuable method to verify that storage volumes perform as desired.
For details, see the documentation on creating Storage Spaces volumes.
Once the Cluster Shared Volume is created, it should appear in the Disks section of the Failover Cluster Manager console.
The local path on each physical server takes the form C:\ClusterStorage\Volume1 by default; in this example the volume is at C:\ClusterStorage\VMSpace1.
Storage tests use the direct local file path rather than accessing storage via an SMB share.
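As a rough sketch of the volume creation described above, the following PowerShell creates a two-tier volume and confirms that it appears as a Cluster Shared Volume. The pool name pattern S2D*, the tier names Performance and Capacity, and the tier sizes are assumptions for illustration; list the tiers on your own cluster first and substitute the names and sizes that apply.

```powershell
# List the storage tiers defined on this cluster (names vary by configuration)
Get-StorageTier | Select-Object FriendlyName, MediaType, ResiliencySettingName

# Create a tiered CSV volume named VMSpace1.
# Assumed values: pool name pattern, tier names, and tier sizes.
New-Volume -StoragePoolFriendlyName 'S2D*' `
           -FriendlyName 'VMSpace1' `
           -FileSystem CSVFS_ReFS `
           -StorageTierFriendlyNames Performance, Capacity `
           -StorageTierSizes 200GB, 800GB

# Confirm the volume is registered as a Cluster Shared Volume
Get-ClusterSharedVolume

# Confirm the local mount point exists on this node
Get-ChildItem -Path 'C:\ClusterStorage'
```

Running the last two commands on every node is a quick way to confirm that each server sees the same local path.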
Test Network LAN Server-to-Server
This step is optional, but useful. Often network issues can be caused by network cable issues or Top of Rack (TOR) switches in the path between cluster nodes. This step will perform network tests on each Network Interface Card (NIC) port.
The test can be performed using either a network performance software utility or by performing file copies to and from the cluster node.
Simple File Copy Tests
This test will require creating a network share on the boot disk on the target server. This will bypass Storage Spaces, isolating the environment to only the servers and local area network.
- Each network port will require an IP address. The server network interfaces will not have IP addresses by default. This IP address will be used to isolate network traffic to the specific NIC port.
- Create an SMB share on the C: drive of each cluster node.
- Select a large file, copy it to the SMB share, and then retrieve it from the share.
- Multiple copy operations will likely be required to place a large load on the high-speed network interfaces.
- Monitor the network performance and error counters on both source and target cluster server.
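The steps above might be scripted along these lines. The interface alias NIC1, the 192.168.10.x addresses, the share name NetTest, and the file C:\Temp\large.bin are all hypothetical placeholders, not values from this article.

```powershell
# --- On the target node ---
# Assign an address to the NIC port under test (alias and address are placeholders)
New-NetIPAddress -InterfaceAlias 'NIC1' -IPAddress 192.168.10.12 -PrefixLength 24

# Create a folder on the boot disk and share it, bypassing Storage Spaces
New-Item -Path 'C:\NetTest' -ItemType Directory -Force | Out-Null
New-SmbShare -Name 'NetTest' -Path 'C:\NetTest' -FullAccess 'Administrators'

# --- On the source node ---
# Run several copies in parallel to place a heavier load on the link
$jobs = 1..4 | ForEach-Object {
    Start-Job -ArgumentList $_ -ScriptBlock {
        param($n)
        Copy-Item -Path 'C:\Temp\large.bin' `
                  -Destination "\\192.168.10.12\NetTest\large_$n.bin" -Force
    }
}
$jobs | Wait-Job | Receive-Job
$jobs | Remove-Job

# Retrieve one copy to exercise the opposite direction
Copy-Item -Path '\\192.168.10.12\NetTest\large_1.bin' -Destination 'C:\Temp\large_back.bin' -Force
```

Addressing the share by the IP address of a specific port, rather than by server name, is what keeps the traffic on the NIC being tested.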
Network Test Utility
Numerous network software tools are available which will both create a load of network traffic and provide performance statistics.
- Configure the utility as required to send traffic from a source cluster node to a target cluster node. Some utilities have a software component on both the source and target server. Others require a network share.
- Run the performance tests.
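The article does not name a specific utility. As one possible example, Microsoft's NTttcp uses a receiver component on the target node and a sender component on the source node. The address 192.168.10.12, the thread count, and the duration below are assumptions for illustration only.

```powershell
# On the target (receiver) node: listen with 8 threads on any CPU for 60 seconds.
# 192.168.10.12 is a placeholder for the receiver's address on the NIC under test.
.\ntttcp.exe -r -m 8,*,192.168.10.12 -t 60

# On the source (sender) node: send to the same receiver address for 60 seconds
.\ntttcp.exe -s -m 8,*,192.168.10.12 -t 60
```

Both sides take the receiver's IP address, so repeating the pair once per NIC port tests each port in turn.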
Review Test Results
The file copies between servers should be fast and consistent. Network errors will cause sporadic performance drops and spikes in the Windows Resource Monitor graphs.
Run Windows Performance Monitor and add the following counters:
- RDMA Activity
- Network Adapter
- Network Interface
Configuring Windows Performance Monitor and adding counters is documented later in this article.
Look at the following Network Adapter and Network Interface counters:
- Output Queue Length
- Packets Outbound Discarded
- Packets Outbound Errors
- Packets Received Discarded
- Packets Received Errors
Output Queue Length should be low. Larger queue lengths occur when network congestion causes packets to queue until they can be processed.
The discarded and error packet counters should be very low, and usually zero, on a healthy network.
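The same counters can also be sampled from PowerShell with Get-Counter. This is a minimal sketch that takes one sample of the Network Interface counters listed above and shows only the instances with a nonzero value; the one-second sample interval is an arbitrary choice.

```powershell
# Counter paths for the Network Interface counter set, all instances
$counters = @(
    '\Network Interface(*)\Output Queue Length'
    '\Network Interface(*)\Packets Outbound Discarded'
    '\Network Interface(*)\Packets Outbound Errors'
    '\Network Interface(*)\Packets Received Discarded'
    '\Network Interface(*)\Packets Received Errors'
)

# Take a single sample and keep only counters that are above zero
(Get-Counter -Counter $counters -SampleInterval 1 -MaxSamples 1).CounterSamples |
    Where-Object CookedValue -gt 0 |
    Sort-Object Path |
    Format-Table Path, CookedValue -AutoSize
```

An empty result means every queue, discard, and error counter read zero at the moment of sampling.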
Monitor Storage Performance
This section describes storage performance test procedures using both simple file copies and the Microsoft DiskSpd (Disk Speed) utility.
File Copy Tests
This section describes a simple storage test that copies large files from an S2D storage cluster node to the Storage Spaces Direct folder path.
The direct path uses the local path: C:\ClusterStorage\<volume name>
Copy data on each cluster node to the local path for the volume being tested. The data being written and read will be processed by Storage Spaces Direct and distributed across all storage nodes.
Monitor the read/write speed during the file copy for speed and consistency. File Explorer shows a performance graph during copies. Spiky or sporadic performance likely indicates configuration or health issues.
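For a repeatable alternative to watching the File Explorer graph, the copy can be timed from PowerShell. The two functions below are a hypothetical sketch, not part of any Microsoft tool: Get-ThroughputMBps converts a byte count and duration into MB/s, and Measure-CopyThroughput repeats the copy so the results can be compared for consistency. The paths in the usage comment are placeholders.

```powershell
# Convert a byte count and elapsed seconds into megabytes per second
function Get-ThroughputMBps {
    param(
        [Parameter(Mandatory)][long]$Bytes,
        [Parameter(Mandatory)][double]$Seconds
    )
    if ($Seconds -le 0) { throw 'Seconds must be greater than zero.' }
    [math]::Round(($Bytes / 1MB) / $Seconds, 1)
}

# Copy a file to the destination folder several times and report each run
function Measure-CopyThroughput {
    param(
        [Parameter(Mandatory)][string]$Source,
        [Parameter(Mandatory)][string]$DestinationFolder,
        [int]$Iterations = 3
    )
    $bytes = (Get-Item -LiteralPath $Source).Length
    for ($i = 1; $i -le $Iterations; $i++) {
        $target  = Join-Path $DestinationFolder "copytest_$i.bin"
        $elapsed = Measure-Command {
            Copy-Item -LiteralPath $Source -Destination $target -Force
        }
        [pscustomobject]@{
            Iteration = $i
            Seconds   = [math]::Round($elapsed.TotalSeconds, 2)
            MBps      = Get-ThroughputMBps -Bytes $bytes -Seconds $elapsed.TotalSeconds
        }
        # Remove the test copy so repeated runs do not fill the volume
        Remove-Item -LiteralPath $target -Force
    }
}

# Example usage (placeholder paths):
# Measure-CopyThroughput -Source 'C:\Temp\large.bin' -DestinationFolder 'C:\ClusterStorage\VMSpace1'
```

Large differences between iterations are the scripted equivalent of a spiky graph. Operating system caching can inflate the figures for smaller files, so a file well above the size of available memory gives a more representative number.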
Review Physical Disks Health
As the data copies are in process, run the following PowerShell command to show the health status of Storage Spaces disk drives:
Get-StoragePool -IsPrimordial $False | Get-PhysicalDisk
This command will create a list of disk drives and the health status of each. The HealthStatus field of each disk should be “Healthy”.
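On clusters with many drives it can help to list only the disks that are not healthy. A minimal sketch using the same cmdlets:

```powershell
# Show only the pool disks whose HealthStatus is anything other than Healthy
Get-StoragePool -IsPrimordial $false |
    Get-PhysicalDisk |
    Where-Object HealthStatus -ne 'Healthy' |
    Select-Object FriendlyName, SerialNumber, MediaType, HealthStatus, OperationalStatus
```

No output means every disk in the pool currently reports Healthy.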
The following PowerShell command shows the physical disk SMART counters:
Get-StoragePool -isPrimordial $false | Get-PhysicalDisk | Get-StorageReliabilityCounter | fl
The following counters should be monitored periodically (sample output shown):
- ReadErrorsCorrected : 0
- ReadErrorsTotal : 24
- ReadErrorsUncorrected : 24
- WriteErrorsCorrected :
- WriteErrorsTotal :
- WriteErrorsUncorrected :
Some errors may be expected. Error counters that increment quickly, or at an accelerating rate, can indicate that a disk will soon fail critically.
For documentation on Storage Spaces health states, see Troubleshoot Storage Spaces Direct health and operational states.
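Because the rate of change matters more than the absolute value, one approach is to capture the counters before and after a test run and compare them. Get-ErrorCounterDelta below is a hypothetical helper written for this illustration; it matches disks by the DeviceId property of the reliability counter objects and treats blank counters as zero.

```powershell
# Report every error counter that increased between two snapshots
function Get-ErrorCounterDelta {
    param(
        [Parameter(Mandatory)][object[]]$Before,
        [Parameter(Mandatory)][object[]]$After
    )
    $fields = 'ReadErrorsTotal', 'ReadErrorsUncorrected',
              'WriteErrorsTotal', 'WriteErrorsUncorrected'

    # Index the earlier snapshot by DeviceId for quick lookup
    $index = @{}
    foreach ($b in $Before) { $index[[string]$b.DeviceId] = $b }

    foreach ($a in $After) {
        $old = $index[[string]$a.DeviceId]
        if ($null -eq $old) { continue }   # disk not present in the earlier snapshot
        foreach ($f in $fields) {
            # Blank (null) counters cast to 0
            $delta = [long]$a.$f - [long]$old.$f
            if ($delta -gt 0) {
                [pscustomobject]@{ DeviceId = $a.DeviceId; Counter = $f; Increase = $delta }
            }
        }
    }
}

# Example usage:
# $before = Get-StoragePool -IsPrimordial $false | Get-PhysicalDisk | Get-StorageReliabilityCounter
# ... run the storage tests ...
# $after  = Get-StoragePool -IsPrimordial $false | Get-PhysicalDisk | Get-StorageReliabilityCounter
# Get-ErrorCounterDelta -Before $before -After $after
```

Any row in the output identifies a disk whose error count grew during the test window and is worth a closer look.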
Microsoft Disk Speed Utility
Microsoft has a utility available on GitHub called Disk Speed (DiskSpd). DiskSpd can perform a variety of disk performance benchmark tests and creates a performance report.
Download DiskSpd and run this utility on one of the cluster storage nodes. Use the Storage Spaces volume as the target disk to test.
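An example command to test random concurrent reads of 4 KB blocks follows. DiskSpd expects a file as its target, so the command points at a test file on the volume (the file name testfile.dat is only an example):
diskspd -c2G -b4K -F8 -r -o32 -W60 -d60 -Sh C:\ClusterStorage\VMSpace1\testfile.dat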
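Real workloads are rarely read-only, so a mixed test is a reasonable follow-up. The sketch below reuses the parameters from the example above and adds a 30 percent write ratio and latency statistics. The write ratio, the test file name testfile.dat, and the output path are assumptions chosen for illustration.

```powershell
# Parameters carried over from the read test:
#   -c2G  create a 2 GB test file      -b4K  4 KB block size
#   -F8   8 threads in total           -r    random I/O
#   -o32  32 outstanding I/Os/thread   -W60  60 s warm-up
#   -d60  60 s measured duration       -Sh   disable software and hardware caching
# Added here:
#   -w30  30 percent writes            -L    collect latency statistics
.\diskspd.exe -c2G -b4K -F8 -r -o32 -W60 -d60 -Sh -w30 -L `
    'C:\ClusterStorage\VMSpace1\testfile.dat' > 'C:\Temp\diskspd-mixed.txt'
```

Redirecting the report to a file makes it easier to compare runs across nodes or across virtual disk configurations.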
Review Test Results
As the storage performance tests are in process, monitor the following Windows Performance Monitor counters:
- Cluster CSVFS
- RDMA Activity
- Storage Spaces Tier
- Storage Spaces Virtual Disk
- Storage Spaces Write Cache
- Cluster Storage Cache Stores
The counters for Storage Spaces Direct are numerous and complex. Check that the counter values are reasonable for your hardware. The Cluster Storage Cache Stores counters show cache performance.
The RDMA Activity counters report on RDMA health and activity.
For more information on storage performance counters, see Windows Performance Monitor Disk Counters Explained.
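Before adding these counter sets to a monitoring session, it can be useful to confirm which of them exist on a given node. A minimal sketch using Get-Counter -ListSet with the counter set names listed above:

```powershell
# Counter sets named in this article
$sets = 'Cluster CSVFS', 'RDMA Activity', 'Storage Spaces Tier',
        'Storage Spaces Virtual Disk', 'Storage Spaces Write Cache',
        'Cluster Storage Cache Stores'

foreach ($name in $sets) {
    # -ErrorAction SilentlyContinue suppresses the error raised for a missing set
    $set = Get-Counter -ListSet $name -ErrorAction SilentlyContinue
    if ($set) {
        '{0}: {1} counters' -f $set.CounterSetName, $set.Counter.Count
    }
    else {
        '{0}: not present on this node' -f $name
    }
}
```

A set reported as not present usually means the corresponding feature or hardware (for example, RDMA-capable adapters) is not installed or enabled on that server.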
Workload Stress S2D Cluster – VMFleet
Microsoft provides a software package that creates a workload for S2D hyperconverged systems. VMFleet launches a group of VMs running DiskSpd and places a load on the S2D cluster network and storage subsystems.
VMFleet is part of the DiskSpd project available on GitHub. Instructions on installing and running VMFleet can be found at Leverage VM Fleet Testing the Performance of Storage Space Direct.
VMFleet can be configured to run for an extended time and apply as much of a stress workload as desired. At the end of the tests, VMFleet will return a report of performance calculations including overall storage performance measurements.
Monitor Network Speed with Resource Monitor
Windows Resource Monitor can be launched on any of the storage node desktops.
Run: All Programs → Windows Administrative Tools → Resource Monitor
Select the Network tab to display the network activities for each physical and virtual network interface.
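Resource Monitor shows live throughput, but reduced speed is sometimes simply a port that negotiated a lower link rate than expected. As a quick complementary check, the negotiated speed of each physical adapter can be listed from PowerShell:

```powershell
# List physical adapters with their link state and negotiated speed
Get-NetAdapter -Physical |
    Sort-Object Name |
    Format-Table Name, InterfaceDescription, Status, LinkSpeed -AutoSize
```

A port reporting a lower LinkSpeed than its peers points toward a cable, transceiver, or switch port issue.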
Monitor Performance and Error Counters with Performance Monitor
The following steps show how to configure Windows Performance Monitor to select and display specific system counters.
Windows Performance Monitor can be launched on any of the storage node desktops.
- Start Windows Performance Monitor as Administrator.
Run: All Programs → Windows Administrative Tools → Performance Monitor
- Select Performance Monitor
- Click on the green plus icon to add counters to the Performance Monitor.
After clicking the icon, the Add Counters form will appear.
- To add counter groups, select the counter group, then click Add.
Continue selecting another group or click OK to display all the selected counters.
- The default display is a line graph of each counter.
- Click on the display type pulldown arrow and select Report.
The Report layout will display all of the active counters selected.
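When a test runs for longer than someone can watch the console, the same counters can be logged to a file instead. This sketch samples two Network Interface counters every 5 seconds for one minute and writes them to CSV. The counter selection, the interval, and the output path C:\Temp\s2d-counters.csv are assumptions to adjust as needed.

```powershell
# Counters to log (any counter path shown in Performance Monitor can be used)
$counters = @(
    '\Network Interface(*)\Bytes Total/sec'
    '\Network Interface(*)\Output Queue Length'
)

# 12 samples at 5-second intervals, flattened to one row per counter instance
Get-Counter -Counter $counters -SampleInterval 5 -MaxSamples 12 |
    ForEach-Object { $_.CounterSamples } |
    Select-Object Timestamp, Path, CookedValue |
    Export-Csv -Path 'C:\Temp\s2d-counters.csv' -NoTypeInformation
```

The resulting file can be opened in a spreadsheet to chart throughput and queue length over the duration of a test.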
References
- S2D Volume Planning
- Performance tuning for Storage Spaces Direct
- S2D Software Storage Bus Cache
- Monitor Storage Tiers Performance in Windows Server 2012 R2
- Understanding S2D Storage Cache
- Understanding SSD endurance: drive writes per day (DWPD), terabytes written (TBW), and the minimum recommended for Storage Spaces Direct
- Network-Related Performance Counters