The world of hyper-converged infrastructure (HCI) is witnessing a revolutionary shift with VMware’s introduction of vSAN 8.0 Express Storage Architecture (ESA) on VxRail. This advanced technology promises to unlock new performance levels for customers, specifically when paired with 100 Gigabit Ethernet (GbE) cards. A detailed analysis we did with Bill Leslie, Director of Technical Marketing Engineering at Dell Technologies, sheds light on this significant advancement.
Understanding vSAN 8.0 ESA
Hardware Evolution and ESA
Previous iterations of vSAN struggled to fully leverage advancements in hardware, particularly in the networking domain. With vSAN 8.0 ESA, VMware has rearchitected the storage solution to exploit the latest hardware technologies effectively. This particular implementation we’re assessing is delivered through Dell’s VxRail, integrating these advancements into the HCI platform.
ESA vs. Original vSAN
It’s crucial to understand that ESA does not replace the original vSAN; rather, it offers an alternative architecture. The original vSAN, designed around 2012, was built with the then-prevalent spinning hard disk drives and the shift from 1Gb to 10Gb networks in mind. ESA, on the other hand, is a reimagined approach that maximizes the capabilities of current-generation hardware, including NVMe flash drives and advanced networking technologies.
Testing and Performance Analysis
Isolating the Impact of 100 GbE Network Cards
In traditional system architectures, mechanical disk units were quite often the bottleneck. With today’s NVMe semiconductor-based flash storage devices, network performance becomes more critical and exposes new bottlenecks. The objective of this study was to isolate the impact of faster networking to determine the preferred configuration for vSAN ESA.
Table 1 shows the test configurations. The benchmarks, run by Dell, compare two VxRail clusters that use identical hardware, software, and policies and differ only in their networking configuration.
One cluster used a 25 Gigabit Ethernet (GigE) configuration, while the other used a faster 100 GigE configuration. The network interface cards were sourced from Broadcom. The difference in networking was deliberate, to show how network capability affects the overall performance of the clusters. Because the benchmarks hold every variable constant except the networking hardware, any performance differences can be attributed to network speed.
The results shown below are clear. The 100 GbE network cards showed a marked improvement in performance, demonstrating the importance of advanced networking in realizing the full potential of vSAN ESA.
Real-World Test Scenarios
RAID 1 mirroring has historically been a common technique for addressing the performance challenges of de-staging cache to backend disk systems. However, mirroring is more expensive than RAID 5 and RAID 6 parity approaches, which forced customers into suboptimal tradeoffs, i.e., choosing more expensive RAID 1 mirroring to maintain consistent performance. New architectural innovations address this conundrum by enabling advanced networking technologies to be both high performance and cost effective.
- Block Size Variation and Read-Write Mix: Figures 1 and 2 show tests that vary block sizes and read-write ratios to simulate real-world workloads. Performance with 100 GbE was significantly higher, which suggests that the new ESA architecture can cost-effectively leverage higher-performance networking.
- RAID Configurations and Data Services: With vSAN ESA, RAID 5 and RAID 6 configurations with compression outperform the original vSAN’s RAID 1 setup. This shift not only boosts performance but also enhances usable storage capacity and cost efficiency.
In the analysis above, the blue line represents the 25 GigE config and the green line represents the 100 GigE config. The key point of focus is the ‘knee of the curve’ for each line, which shows the rate of latency degradation (vertical axis) at the various IOPS levels (horizontal axis).
The workload tested is defined by its block size (22 KB), its read-write mix, and the RAID configuration used. The purpose is to give customers enough information to replicate these tests in similar configurations. Note the RAID 6 configuration, which reflects significant improvements in vSAN ESA’s RAID architecture compared to the vSAN Original Storage Architecture (OSA). Notably, RAID 5 and RAID 6 configs in ESA perform better with compression enabled than RAID 1 in OSA.
The inflection point in the blue curve (25 GigE) indicates the theoretical limits of that config’s capacity. In contrast, the green line (100 GigE) shows performance exceeding these limits, particularly on the right side of the chart where the 25 GigE latency curve spikes. This difference is crucial, as it demonstrates a nearly 50% performance gain with the 100 GigE over the 25 GigE in this specific workload. The implication is that using 100 GigE in VxRail or vSAN ESA environments can unlock significant untapped potential, highlighting the importance of choosing the right network to fully harness a system’s capabilities.
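To make the ‘knee of the curve’ comparison concrete, the sketch below shows one simple way to read such a chart: pick a latency ceiling and find the highest load each curve sustains before crossing it. This is an illustrative approach, not the method Dell used. The function name, the 1.0 ms ceiling, and both sample curves are invented placeholders in arbitrary load units; they are not the measured results from Figure 1.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]  # (load in IOPS, latency in ms)


def max_iops_under_latency(curve: Sequence[Point], limit_ms: float) -> float:
    """Highest load at which latency stays at or below limit_ms.

    Assumes the points are sorted by load and latency never decreases.
    Interpolates linearly between the last point under the limit and
    the first point over it.
    """
    if not curve or curve[0][1] > limit_ms:
        return 0.0
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y1 > limit_ms:
            # The curve crosses the ceiling between these two samples.
            return x0 + (limit_ms - y0) * (x1 - x0) / (y1 - y0)
    # The curve never crosses the ceiling within the sampled range.
    return curve[-1][0]


# Hypothetical curves in arbitrary load units (NOT the published data).
slower_network = [(10, 0.5), (20, 0.6), (30, 1.0), (35, 3.0)]
faster_network = [(10, 0.5), (20, 0.55), (30, 0.6), (40, 0.8), (50, 2.0)]

LIMIT_MS = 1.0  # assumed latency ceiling for the comparison
slow = max_iops_under_latency(slower_network, LIMIT_MS)
fast = max_iops_under_latency(faster_network, LIMIT_MS)
print(f"sustained load: {slow:.1f} vs {fast:.1f} ({fast / slow - 1:.0%} more)")
```

Applying the same ceiling to both curves keeps the comparison fair; the choice of ceiling depends on what latency the application can tolerate.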
Increasing the Block Size to 32KB
Figure 2 below shows the results when varying the block size to 32KB.
Figure 2 above shows the impact of increasing the block size to 32KB with a higher percentage of read operations. The previous test represented a more write-intensive environment, which is typically more challenging for performance.
The key observation is that the difference in performance curves between the two network configurations is even more dramatic. The 25 GigE configuration (blue line) shows a steep spike in latency, i.e., performance degradation, while the 100 GigE configuration (green line) spikes much later and more gradually. This suggests that the 100 GigE network handles increased reads and larger block sizes much more efficiently than the 25 GigE network.
The data show the top-end limit being hit by the 25 GigE network, indicating the performance bottleneck customers face. In contrast, the higher performance networking config unlocks the potential of the newer generation Intel processors used in the test and shows these CPUs can significantly enhance performance when paired with the 100 GigE network. This combination pushes the performance much further, almost doubling the overall level compared to the 25 GigE network.
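A back-of-the-envelope calculation helps explain why a 25 GigE link can become the ceiling at this block size. The sketch below converts nominal line rate into bytes per second and then into the number of 32 KiB transfers a single port could carry. It is a rough upper bound under stated assumptions: it ignores Ethernet and protocol overhead, replication traffic between nodes, and the number of ports per node, none of which are given in the text. The function names are illustrative.

```python
def line_rate_bytes_per_s(gbit_per_s: float) -> float:
    """Nominal line rate in bytes per second (1 Gbit = 1e9 bits)."""
    return gbit_per_s * 1e9 / 8


def iops_ceiling(gbit_per_s: float, block_bytes: int) -> float:
    """Upper bound on block transfers per second over one port.

    Assumes every byte on the wire is payload, which real traffic never
    achieves, so actual ceilings are lower.
    """
    return line_rate_bytes_per_s(gbit_per_s) / block_bytes


BLOCK_BYTES = 32 * 1024  # 32 KiB, matching the block size in this test

for speed in (25, 100):
    gb_per_s = line_rate_bytes_per_s(speed) / 1e9
    ceiling = iops_ceiling(speed, BLOCK_BYTES)
    print(f"{speed} GigE: {gb_per_s:.3f} GB/s, "
          f"about {ceiling:,.0f} transfers/s per port")
```

The nominal ratio between the two links is 4x, while the measured gain is closer to 2x, which would be consistent with other components becoming the limit once the network is no longer the constraint.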
It’s important to reiterate that all other variables (processors, memory, drives, software, etc.) remained constant during this test, underscoring that the observed performance improvements are attributable solely to the upgrade from 25 GigE to 100 GigE networking. The test results serve as an “eye-opener,” demonstrating how significantly performance can differ with the appropriate networking choice in VxRail and vSAN ESA nodes.
Takeaway: These results challenge the conventional notion that performance bottlenecks restrict the use of more cost effective RAID 5 and 6 configurations with VxRail and vSAN.
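The cost side of this takeaway comes down to capacity overhead: mirroring stores full extra copies, while parity schemes store a fraction. The sketch below compares raw-to-usable ratios for three layouts. The stripe widths (two-copy mirror, 4+1, 4+2) are assumed common layouts chosen for illustration; the tests described here do not state which widths were used, and the figures are before any compression savings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    name: str
    data: int        # data fragments per stripe (1 for a mirror)
    redundancy: int  # parity fragments, or extra mirror copies

    @property
    def overhead(self) -> float:
        """Raw capacity consumed per unit of usable capacity."""
        return (self.data + self.redundancy) / self.data


# Assumed layouts for illustration only.
LAYOUTS = [
    Layout("RAID 1 (2 copies)", data=1, redundancy=1),
    Layout("RAID 5 (4+1)", data=4, redundancy=1),
    Layout("RAID 6 (4+2)", data=4, redundancy=2),
]


def usable_tb(raw_tb: float, layout: Layout) -> float:
    """Usable capacity from raw capacity, before compression."""
    return raw_tb / layout.overhead


RAW_TB = 100.0  # hypothetical raw capacity
for layout in LAYOUTS:
    print(f"{layout.name}: {layout.overhead:.2f}x overhead, "
          f"{usable_tb(RAW_TB, layout):.1f} TB usable of {RAW_TB:.0f} TB raw")
```

Under these assumptions, the parity layouts yield noticeably more usable capacity from the same drives, which is why parity performance that matches or beats mirroring changes the cost picture.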
Top End Performance Envelope
The previous tests showed real world workloads so that customers can conduct similar tests in their environments. Often with performance benchmarks, tests are run to show the architectural potential of the system. Even though such workloads may not represent real world examples, they are instructive to identify the architectural limits of new systems. Figure 3 below shows such tests.
The intent of these benchmark tests, as explained by Bill Leslie, is to explore the performance boundaries of these systems with the largest block sizes, a scenario where previous generations faced challenges, particularly with cache drives in vSAN. Large block, high throughput workloads often lead to performance bottlenecks, causing customers to develop workarounds to maintain consistent performance levels.
The bars in Figure 3 above show a near doubling of throughput potential across various workload types when using 100 GigE networking cards in these nodes. This improvement is especially notable for write operations, which were previously identified as a major performance challenge in the vSAN OSA architecture. The near doubling of write throughput shows that the technology in these platforms is being fully harnessed.
Takeaway: The test results demonstrate a substantial enhancement in the system’s ability to handle large block, high throughput workloads, particularly in write operations. This indicates that the latest VxRail and vSAN technologies, with their advanced networking hardware components, offer a significant performance advantage over previous generations, especially in scenarios that demand high capacity and throughput.
Implications for Customers
Cost-Performance Analysis
While initially more expensive on a component basis, the 100 GbE networking option in the context of vSAN ESA offers a compelling cost-performance ratio. This setup requires fewer nodes for the same workload, translating to significant savings when evaluating a full total cost of ownership (TCO).
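The node-count argument can be sketched with simple arithmetic. The example below assumes per-node capability scales linearly and applies the roughly 50% per-node uplift reported for the first workload above; every other number (the target load, the per-node baseline, and the cost units) is a hypothetical placeholder, not pricing or sizing data from Dell or VMware.

```python
import math


def nodes_needed(target_load: float, load_per_node: float) -> int:
    """Smallest whole number of nodes that covers the target load."""
    return math.ceil(target_load / load_per_node)


def cluster_cost(nodes: int, node_cost: float, nic_cost: float) -> float:
    """Total hardware cost with one networking option per node."""
    return nodes * (node_cost + nic_cost)


# Hypothetical inputs in arbitrary units (placeholders only).
TARGET_LOAD = 1200.0     # workload the cluster must sustain
BASELINE_PER_NODE = 100.0  # per-node capability with 25 GigE
UPLIFT = 1.5             # ~50% gain, taken from the first test above
NODE_COST = 100.0        # cost of a node excluding networking
NIC_COST_25 = 1.0        # 25 GigE networking per node
NIC_COST_100 = 4.0       # 100 GigE networking per node

nodes_25 = nodes_needed(TARGET_LOAD, BASELINE_PER_NODE)
nodes_100 = nodes_needed(TARGET_LOAD, BASELINE_PER_NODE * UPLIFT)

print(f"25 GigE:  {nodes_25} nodes, "
      f"cost {cluster_cost(nodes_25, NODE_COST, NIC_COST_25):.0f}")
print(f"100 GigE: {nodes_100} nodes, "
      f"cost {cluster_cost(nodes_100, NODE_COST, NIC_COST_100):.0f}")
```

With these placeholder inputs, the pricier networking option wins because node cost dominates; a real TCO analysis would substitute actual quotes and also account for switches, licensing, power, and minimum cluster sizes.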
Expanded Use Cases
vSAN ESA extends the scope of HCI applications, making it suitable for more demanding tasks like performance-intensive databases. It provides a blend of high performance, operational efficiency, and cost-effectiveness, challenging the traditional limitations of HCI.
Future-Proofing Infrastructure
With the rapid evolution of technology, investing in vSAN 8.0 ESA and 100 GbE networking prepares enterprises for future demands, helping their infrastructure handle increasing data volumes and complex data-intensive and AI workloads more efficiently. We see this as a game-changing innovation that combines higher-performance networking with a balanced architecture to eliminate historical tradeoffs.
This technological leap heralds a new era in HCI, redefining performance benchmarks and setting new standards for enterprise storage solutions.
To delve deeper into the technical details and performance metrics of VMware’s vSAN 8.0 ESA on VxRail, readers can access the white papers and benchmark studies on Dell Technologies’ InfoHub:
This research was commissioned by Dell Technologies, Inc.