The impact of Generative AI (GenAI) on data center infrastructure has been significant, reshaping the requirements for high-performance computing and altering how data centers are traditionally designed and operated. In particular, the growing power and cooling demands of GenAI workloads pose new challenges for facility design and operations.
I recently had the opportunity to meet with Wes Cummins, CEO of Applied Digital, to discuss these new requirements.
Cummins noted that a traditional data center cabinet typically draws around 7.5 kilowatts (kW) to power several servers plus networking and storage. By contrast, a single NVIDIA H100 server, essential for high-performance AI tasks, requires over 10 kW on its own. This disparity points to a fundamental shift: next-generation data centers must be designed with far more power per rack than traditional setups.
How much more power? Cummins stated that Applied Digital designs all new facilities with power densities ranging from 50 kW to 150 kW per rack. This upgrade is crucial for handling the dense clusters of GPUs required for AI computations. The increased power density also introduces challenges related to cooling, as managing heat becomes more complex with higher power consumption.
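To put those figures in perspective, here is a minimal arithmetic sketch using the numbers quoted above (a ~10 kW H100 server, a 7.5 kW legacy cabinet, and Applied Digital's 50 kW to 150 kW design range); the figures are illustrative, not engineering specifications:

```python
# Illustrative sketch: how many ~10 kW AI servers fit within a rack's power
# budget, using the approximate figures quoted in the article.
SERVER_POWER_KW = 10.0        # assumed draw of one NVIDIA H100 server
TRADITIONAL_RACK_KW = 7.5     # typical legacy cabinet power budget
AI_RACK_RANGE_KW = (50, 150)  # Applied Digital's new design range per rack

def servers_per_rack(rack_kw: float, server_kw: float = SERVER_POWER_KW) -> int:
    """Whole servers that fit within a rack's power budget."""
    return int(rack_kw // server_kw)

# A legacy 7.5 kW rack cannot power even one H100-class server.
print(servers_per_rack(TRADITIONAL_RACK_KW))   # 0
# The new design range supports dense GPU clusters instead.
for rack_kw in AI_RACK_RANGE_KW:
    print(rack_kw, "kW ->", servers_per_rack(rack_kw), "servers")  # 5 and 15
```

The takeaway matches Cummins' point: the legacy power budget cannot host even one H100-class server, while the new design range supports five to fifteen per rack.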
Design Considerations for AI Data Centers
Next, we discussed what organizations should consider when planning next-generation data centers to support AI workloads. There are several critical criteria, including:
- Access to Power: The foremost consideration is ensuring a robust and reliable power supply. GenAI data centers require substantial power, often exceeding 200 megawatts at a single location. Applied Digital’s approach includes securing high-capacity and renewable power sources. For example, their upcoming Ellendale, North Dakota campus will provide 400 megawatts of critical IT load, a testament to their commitment to meeting high power demands.
- Cooling Efficiency: Efficient cooling is integral to managing the heat generated by these high-performance computing environments. Applied Digital favors northern regions for their data centers due to cooler climates, which enhance cooling efficiency and reduce Power Usage Effectiveness (PUE). PUE is a metric used to gauge data center efficiency, and the goal is to be as close to 1 as possible; a lower PUE indicates that more of the power consumed is going to actual computing rather than cooling and other infrastructure needs. The Ellendale facility, for instance, is projected to achieve a PUE of 1.17.
- Site Selection: The choice of physical location is influenced by environmental factors and the ability to drive operational efficiency. Northern climates not only provide natural cooling but also help in minimizing operational costs. Additionally, proximity to renewable energy sources, such as wind farms, plays a critical role in reducing the environmental footprint of these facilities.
- Connectivity and Latency: Connectivity remains a vital aspect of data center infrastructure, particularly for AI applications that involve large-scale data transfer. Cummins emphasized the importance of fiber connectivity, stating that data centers should have at least two diverse fiber routes with ultra-high bandwidth capabilities. This redundancy ensures availability and reliable data transfer, which is crucial for AI training operations. While GenAI training workloads are less sensitive to latency outside the data center, internal latency is vital. That is why NVIDIA incorporates InfiniBand technology for high-speed data transfer between GPUs, as it delivers low-latency connections within the data center. Ethernet solutions are also being used, thanks to their widespread adoption and available skill sets; to help, the Ultra Ethernet Consortium is focused explicitly on enhancing Ethernet for GenAI workloads.
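The PUE metric mentioned above has a simple definition: total facility power divided by IT equipment power. A minimal sketch of that relationship, using the Ellendale figures quoted above (400 MW of critical IT load and a projected PUE of 1.17):

```python
# Minimal sketch of the PUE relationship: PUE = total facility power / IT power.
# The 400 MW IT load and 1.17 PUE come from the Ellendale figures above.

def total_facility_power_mw(it_load_mw: float, pue: float) -> float:
    """Total power the facility must supply to deliver a given IT load at a given PUE."""
    return it_load_mw * pue

def overhead_mw(it_load_mw: float, pue: float) -> float:
    """Power spent on cooling and other non-IT infrastructure."""
    return it_load_mw * (pue - 1.0)

print(total_facility_power_mw(400, 1.17))  # ~468 MW total draw
print(overhead_mw(400, 1.17))              # ~68 MW of cooling/infrastructure overhead
```

In other words, at a PUE of 1.17, roughly 68 of every 468 megawatts go to cooling and other overhead; a facility with a PUE of 1.5 would burn 200 MW of overhead to deliver the same 400 MW of compute.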
Sustainability Initiatives
Sustainability is a significant concern in the data center industry, especially with the increasing power demands of AI. Applied Digital recommends addressing these concerns through various strategies:
- Renewable Energy: A key component of any new data center’s sustainability strategy is sourcing power from renewable sources. Choosing renewable energy aligns with global efforts to reduce the carbon footprint of data centers and supports the transition to greener energy sources. For example, the Applied Digital Ellendale, ND facility will be powered by wind energy, with approximately two gigawatts of wind power feeding into its substation.
- Efficiency Improvements: Beyond sourcing renewable energy, organizations should focus on improving the efficiency of their operations. Achieving a low PUE ensures that a greater portion of the power consumed is used directly for computing rather than cooling. The Applied Digital facility in North Dakota exemplifies this commitment with its projected PUE of 1.17.
- Heat Recovery: Looking toward the future, organizations should explore innovative ways to utilize the excess heat generated by their facilities. For instance, transitioning to liquid cooling systems makes capturing and repurposing heat easier. Potential applications include heating greenhouses, supporting shrimp farming, or powering other agricultural projects on or near the data center. Applied Digital believes several such options could drive higher levels of sustainability and deliver additional benefits to local communities.
Ecosystem and Partnerships
Building and operating high-performance data centers requires collaboration with various partners. Applied Digital relies on a diverse ecosystem of technology providers to deliver optimal, energy-efficient data center environments:
- Technology Partners: Key partners include NVIDIA, Super Micro, and Dell. Applied Digital is an elite-tier partner with NVIDIA, which provides it with advanced GPU technology and support. Super Micro supplies servers built around NVIDIA GPUs, while Dell also offers high-performance NVIDIA GPU solutions. These partnerships ensure Applied Digital's data centers have the latest and most efficient technologies.
- Supply Chain and Design: Beyond hardware, Applied Digital collaborates with partners for data center design and construction. This collaboration extends to leveraging NVIDIA’s expertise in data center design, ensuring that facilities are optimized for high-performance computing and AI workloads.
Future Development and Opportunities
Applied Digital is preparing for future expansion as demand for AI continues to grow. The Ellendale campus, while still being built out, is already under a Letter of Intent (LOI) with one party. In addition, Applied Digital has contracts for power resources in several other locations that meet the criteria above. Potential clients can benefit from its expertise and advanced infrastructure solutions as it prepares to roll out new sites. Organizations interested in securing space in these next-generation data centers are encouraged to connect with Applied Digital directly.
Conclusion
The conversation with Wes Cummins provided valuable insights into the evolving landscape of next-generation data centers driven by GenAI. Applied Digital’s approach to addressing the high power and cooling demands, optimizing connectivity and latency, and incorporating sustainable options where possible highlights its leadership in this space. By leveraging strong partnerships and focusing on efficiency, Applied Digital is well-positioned to meet the growing needs of AI and high-performance computing.
For more information about Applied Digital and its AI data center solutions, visit the Applied Digital Corporation (APLD) website.