For CPU-bound workloads, Fsv2 can improve query response time and workload throughput by 20-40%, compared to Gen5 hardware.
What is Fsv2-series hardware?
At the Ignite conference in November 2019, we introduced the preview of two new hardware generations in Azure SQL Database, M-series and Fsv2-series. In a previous blog post, we described a workload that achieved 3x scalability using M-series hardware. In this post, we will take a closer look at the Fsv2-series hardware, available in preview for single databases and elastic pools in the General Purpose service tier.
The most notable feature of Fsv2 virtual machines in Azure is high CPU clock speed. The base processor frequency in these machines is 2.7 GHz, compared to 2.3 GHz in the widely used Gen5 hardware. In addition, unlike Gen5, all Fsv2 cores in a machine can simultaneously increase processor frequency up to 3.4 GHz in response to increased load. A single core can further burst to 3.7 GHz.
This makes Fsv2 hardware very attractive for running compute-intensive database workloads, such as OLTP workloads dominated by many small queries. Higher clock speed on Fsv2 results in faster response time for individual queries, and in a significant cumulative gain in workload throughput.
Comparing performance between Gen5 and Fsv2 hardware
How does Fsv2 compare to the existing Gen5 hardware that is broadly used in Azure SQL Database today? To see the difference in performance between Gen5 and Fsv2 for a typical OLTP workload, we ran a HammerDB test with 60 virtual users against two copies of the same 50 GB database, one using Gen5 hardware and one using Fsv2 hardware. Since Fsv2 databases are available only with 72 cores in the current preview, we used an 80-core Gen5 database as the closest match.
Performance metrics for the two tests were visualized on a Grafana dashboard. Here is a screenshot of the dashboard for the Gen5 database during the test:
And here is the same dashboard for the Fsv2 database:
In these tests, Fsv2 achieved 5,571 batch requests/second on average during a representative time interval, while Gen5 achieved 4,596 batch requests/second in a similar time interval. In other words, Fsv2 outperformed Gen5 by 19%, even though Gen5 had 8 more cores. Normalizing throughput by core, and compensating for the difference in the number of cores, Fsv2 outperformed Gen5 by 30% for this OLTP workload.
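The per-core normalization can be sketched in a few lines of Python. This is a back-of-the-envelope calculation using only the two average throughput figures and the core counts quoted above. The helper `percent_gain` is a hypothetical name introduced for illustration. Computed this way, the raw and per-core gains come out at roughly 21% and 35%, in the same range as the 19% and 30% quoted above. The exact percentages depend on the measurement intervals and on how the normalization is done, so treat the output as illustrative.

```python
# Average throughput and core counts quoted in the text.
FSV2_BATCH_PER_SEC = 5571
GEN5_BATCH_PER_SEC = 4596
FSV2_CORES = 72
GEN5_CORES = 80


def percent_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent (illustrative helper)."""
    return (new / old - 1.0) * 100.0


# Raw comparison: ignores that Gen5 had 8 more cores.
raw_gain = percent_gain(FSV2_BATCH_PER_SEC, GEN5_BATCH_PER_SEC)

# Per-core comparison: divide each throughput by its core count first.
fsv2_per_core = FSV2_BATCH_PER_SEC / FSV2_CORES
gen5_per_core = GEN5_BATCH_PER_SEC / GEN5_CORES
per_core_gain = percent_gain(fsv2_per_core, gen5_per_core)

print(f"Raw throughput gain:      {raw_gain:.1f}%")
print(f"Fsv2 batch req/sec/core:  {fsv2_per_core:.2f}")
print(f"Gen5 batch req/sec/core:  {gen5_per_core:.2f}")
print(f"Per-core throughput gain: {per_core_gain:.1f}%")
```

Dividing by core count before comparing is what separates the effect of faster cores from the effect of simply having more of them.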
Does the higher clock speed of Fsv2 help workloads other than OLTP? By how much could it improve analytical queries? To find out, we wrote a typical aggregation query, and ran it against the same two databases. We used both serial and parallel query execution to see the performance impact of query parallelism on different hardware.
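-- Aggregate order line amounts and quantities per warehouse.
SELECT o.o_w_id,
       SUM(ol.ol_amount) AS w_amount,
       COUNT_BIG(ol.ol_quantity) AS w_quantity
FROM dbo.orders AS o
INNER JOIN dbo.order_line AS ol
ON o.o_id = ol.ol_o_id
GROUP BY o.o_w_id
ORDER BY o.o_w_id
OPTION (MAXDOP N); -- N is a placeholder: 1 for the serial run, 8 for the parallel run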
Query duration was as follows:
For this analytical query, Fsv2 again outperformed Gen5, by 26% at MAXDOP 1 and by 43% at MAXDOP 8.
Choosing between Fsv2 and other hardware generations
How should you choose between Fsv2 and other hardware in Azure SQL Database, such as Gen5 or M-series? Not surprisingly, the answer largely depends on workload characteristics.
As we have just shown, a faster clock speed offers a clear advantage for OLTP and analytical workloads dominated by CPU-bound queries. However, Fsv2 has less memory per core than Gen5 (1.89 GB/core vs. 5.1 GB/core), so for IO-intensive workloads that require more memory for the buffer pool (data cache), the benefits of higher clock speed may be offset by higher cumulative storage IO waits due to more frequent reads from storage. Similarly, analytical queries that require large memory grants may have to wait for other queries to complete and release memory before they can acquire a grant.
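To put the per-core memory figures in perspective, the short Python sketch below multiplies them by the core counts of the two databases used in the tests (72 for Fsv2, 80 for Gen5). It assumes that total memory scales linearly with core count, as the per-core figures imply. The actual memory available to a database is defined by the service's published resource limits, so the totals are an approximation, and `total_memory_gb` is a hypothetical helper name.

```python
# Per-core memory figures quoted in the text, and the core counts used in the tests.
FSV2_GB_PER_CORE = 1.89
GEN5_GB_PER_CORE = 5.1
FSV2_CORES = 72
GEN5_CORES = 80


def total_memory_gb(cores: int, gb_per_core: float) -> float:
    """Approximate total memory, assuming it scales linearly with core count."""
    return cores * gb_per_core


fsv2_total = total_memory_gb(FSV2_CORES, FSV2_GB_PER_CORE)
gen5_total = total_memory_gb(GEN5_CORES, GEN5_GB_PER_CORE)
memory_ratio = gen5_total / fsv2_total

print(f"Fsv2 (72 cores): ~{fsv2_total:.0f} GB")
print(f"Gen5 (80 cores): ~{gen5_total:.0f} GB")
print(f"Gen5 has ~{memory_ratio:.1f}x the memory of Fsv2 at these sizes")
```

At these sizes, the Gen5 database has about three times the memory of the Fsv2 database, which is why a working set that fits in the Gen5 buffer pool may not fit on Fsv2.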
The “sweet spot” for Fsv2 is CPU-bound workloads that do not require a large buffer pool or many large memory grants. The workload should also have low to moderate tempdb space usage, because Fsv2 has less local SSD storage for the tempdb database. IO-bound workloads that require more memory for the buffer pool, or workloads requiring a large tempdb database, will likely work better on Gen5 or M-series hardware.
Since local SSD storage space is limited on Fsv2, this hardware is only available in the General Purpose service tier, which uses remote Azure Premium storage. Workloads that require low-latency local SSD storage, or the higher availability provided by fast failovers, will work better on Gen5 or M-series hardware in the Business Critical service tier. Workloads that use very large databases (up to 100 TB), and/or need multiple readable replicas, fast scaling, and fast database restore, will work best in the Hyperscale service tier.
Regardless of the hardware used, tuning a workload to reduce unnecessary data IO improves performance, often changing a workload from being IO-bound to being CPU-bound, and thus able to benefit from the higher CPU clock speed of Fsv2. The Automatic tuning features of Azure SQL Database help you reduce data IO by using optimal indexes and avoiding query plan regressions. Additional improvements can be achieved via manual query performance tuning. For well-tuned workloads that are primarily limited by CPU clock speed, Fsv2 is an excellent hardware choice that will improve query response time and workload throughput.
Fsv2-series is a new hardware option in Azure SQL Database that complements existing Gen5 and M-series hardware and is targeted toward CPU-intensive workloads, where it can improve query response time and workload throughput by 20-40% compared to Gen5 hardware. While in the current preview only the largest Fsv2 databases with 72 cores are available, smaller SKUs are on the roadmap for both single databases and elastic pools. Once available, a broad range of customers will be able to evaluate Fsv2-series for their new and existing Azure SQL Database workloads, and take advantage of performance improvements provided by fast CPUs of Fsv2 hardware.