Welcome to the AI Factory Era: A Preview of Dell Technologies World 2025

In just a few years, computing has undergone a massive shift. What was once a marketplace dominated by general‑purpose servers and monolithic datacenters has fractured into a complex ecosystem of specialized accelerators, hyper‑scaled clusters, edge‑enabled devices, large‑scale cloud providers, and sovereign‑cloud platforms. At the center of this transformation stand the two sides of the market's growth: 1) NVIDIA, the incumbent kingpin of merchant GPUs, and 2) everyone else, the established semiconductor and infrastructure players.

I’m often asked who will win: NVIDIA or everyone else? The answer is both. The demand for AI applications is fueling an infrastructure renaissance not seen since the 1990s, a 100× value enablement that weaves together competitive narratives and sets the agenda for Dell Technologies World.

Let’s explore the trends powering the “AI Factory Era.” AI Factories are the large‑scale systems that support training, reasoning, and inference at unprecedented scale. I’ll break down the following areas, each critical to this supercycle of transformation:

  • GPU and xPU Adoption: Why AI acceleration is the linchpin of modern compute, and how mixed GPU/xPU clusters are reshaping CapEx and supply‑chain strategies.
  • Integrated AI‑Factory Solutions: The rise of open vs. closed systems, hyperscalers vs. enterprises, and the urgent need for holistic hardware‑and‑software stacks.
  • Portfolio Diversification & Market Recovery: How broad technology portfolios are protecting vendors, even as industrial and automotive verticals falter.
  • Edge AI, Telco & Sovereign‑Cloud Momentum: Why operators at theCUBE’s recent event at Mobile World Congress demand on‑country AI and hybrid‑cloud architectures, and how these trends force a rethink of network design.
  • AI Model Efficiency & New Usage Patterns: From reasoning‑driven token surges to multi‑step inference, and how these innovations create a flywheel of AI Factory growth.
  • Revenue Growth & Operating Performance: The business case for massive AI refresh cycles, R&D‑driven differentiation, and the emergence of Chief AI Officers.
  • Tariff & Supply‑Chain Agility: How U.S.‑China tensions have scrambled manufacturing capabilities, and why agile logistics are now a competitive advantage.
  • Emerging Areas to Watch: From the “datacenter as computer” mindset to co‑packaged silicon‑optic modules, open‑stack momentum, and hyperscaler bellwethers.


GPU and xPU Adoption: The New Cost‑and‑Control Imperative

AI workloads—whether foundation‑scale training or real‑time reasoning—are hungry beasts, devouring FLOPS, memory bandwidth, network interconnects, and rack‑scale power. NVIDIA’s GPUs have fed that hunger for years, carving out over 90% share of AI system shipments. Their innovative GPUs, the CUDA ecosystem, and smart software libraries have made them the default choice for Amazon, Google, Microsoft, and Meta as they built ever‑more ambitious AI Factories.

But this dominance carries risk. Hyperscalers routinely face “GPU rationing”—quarterly quotas, long lead times, and price volatility. They’re at the mercy of shifting U.S. export controls and potential Chinese counter‑measures. For companies spending tens of billions on AI CapEx, the ability to “own their destiny” is a strategic imperative. Many, like AWS, are building—or contemplating—custom chips and system software.

Others are rallying around xPUs: custom AI accelerators co‑designed with hyperscaler partners to speed critical kernels. These ASICs sit alongside NVIDIA GPUs in standard PCIe slots, enabling mixed clusters that deliver comparable throughput at 20–30% lower CapEx. More importantly, they provide a credible second source of compute, giving buyers the bargaining power to secure GPU allocations on better terms and timelines. As one industry executive said, “It’s not just about saving money—it’s about controlling the future of your AI infrastructure.”
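A back‑of‑the‑envelope model illustrates the CapEx argument. The sketch below is illustrative only; the unit prices and the 50% xPU offload share are assumptions for the exercise, not vendor figures:

```python
# Illustrative CapEx comparison: all-GPU cluster vs. mixed GPU/xPU cluster.
# All unit prices and the offload share are hypothetical assumptions.

GPU_PRICE = 30_000      # assumed cost per merchant GPU, USD
XPU_PRICE = 18_000      # assumed cost per custom xPU, USD
ACCELERATORS = 10_000   # accelerators needed for the target throughput

all_gpu_capex = ACCELERATORS * GPU_PRICE

xpu_share = 0.5         # assume half the kernels run well on xPUs
mixed_capex = (ACCELERATORS * (1 - xpu_share) * GPU_PRICE
               + ACCELERATORS * xpu_share * XPU_PRICE)

savings = 1 - mixed_capex / all_gpu_capex
print(f"All-GPU CapEx: ${all_gpu_capex:,.0f}")   # $300,000,000
print(f"Mixed CapEx:   ${mixed_capex:,.0f}")     # $240,000,000
print(f"Savings:       {savings:.0%}")           # 20%, the low end of the cited range
```

Under these assumptions the blended cluster lands at the low end of the 20–30% savings the market is reporting; a larger xPU share or a wider price gap pushes toward the high end.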

Open Systems vs. Closed Silos

The rise of xPUs amplifies a deeper debate. Should AI Factories be built on open, composable architectures or closed, tightly integrated stacks?

  • Open systems champion modularity: pick the best GPU, the best network switch, the best storage array, and bolt them together. Hyperscalers have long favored this path, building custom racks with disaggregated components at scale.
  • Closed systems promise turnkey simplicity: one vendor provides compute, network, storage, and software in a validated stack. Enterprises, which are often less prepared to assemble and maintain giant clusters, are gravitating toward this model, especially for PoCs.

Today’s AI landscape swings between these extremes. Hyperscalers value open composability despite the integration burden, while enterprises crave turnkey AI solutions but worry about vendor lock‑in. Proofs‑of‑Concept are bleeding into production, yet startups—starved for clarity on where to add value—still struggle to articulate services in a half‑open, half‑closed world.

Integrated Solutions for the AI‑Factory Era

The vision of an AI Factory extends beyond GPUs or xPUs. It spans training, reasoning, and inference—each with unique workload profiles. It demands networking and storage architectures tailored to massive clusters and hardware‑aware orchestration software that places workloads optimally.

Training: Bulk Parallelism at Scale

  • High‑performance compute: GPUs/xPUs with balanced tensor throughput and memory bandwidth.
  • High‑bandwidth fabrics: Switch ASICs such as Broadcom’s Tomahawk series, NVIDIA’s Spectrum line, or emerging custom fabrics optimized for collective operations.
  • Low‑latency optics: Pluggable transceivers or co‑packaged silicon‑photonics modules to minimize link failures and power consumption.

Reasoning: The 10× Token Surge

The rise of multi‑step reasoning, with on‑the‑fly decision branches and chain‑of‑thought processing, changes everything. A 10‑million‑token Q&A can balloon to 100 million tokens once reasoning is enabled, multiplying network traffic and GPU cycles tenfold. Clusters must rethink interconnect design, buffer sizing, and burst capacity to support these peaks.
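To make the scaling concrete, here is a minimal sketch of the arithmetic. The 10× token multiplier comes from the figures above; the per‑token fabric and compute costs are purely illustrative assumptions:

```python
# Sketch of how reasoning-driven token growth scales cluster load.
# The 10x token multiplier comes from the text; per-token costs are assumed.

baseline_tokens = 10_000_000    # plain Q&A session, per the article
reasoning_multiplier = 10       # chain-of-thought expansion
reasoning_tokens = baseline_tokens * reasoning_multiplier

# Illustrative per-token costs (assumptions, not measurements):
bytes_per_token_on_fabric = 2_048   # inter-node traffic per token
flops_per_token = 2e9               # forward-pass compute per token

fabric_gb = reasoning_tokens * bytes_per_token_on_fabric / 1e9
petaflops = reasoning_tokens * flops_per_token / 1e15

print(f"Tokens: {reasoning_tokens:,} (10x the baseline)")
print(f"Fabric traffic: ~{fabric_gb:,.0f} GB; compute: ~{petaflops:,.0f} PFLOPs")
```

Whatever the true per‑token constants are, they multiply through linearly: a 10× token surge is a 10× surge in traffic and cycles, and it arrives in bursts rather than as a smooth average.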

Inference: Edge to Cloud

Inference workloads range from high‑throughput, low‑latency cloud requests to resource‑constrained real‑time edge deployments. Use cases—agentic chatbots, generative recommendation engines, digital twins—often demand sub‑10 ms response times. Network operators envision AI‑powered network slices in cell towers, while embedded devices—from autonomous drones to industrial robots—require specialized ASICs and chip‑plus‑FPGA hybrids.

The optimal AI Factory spans central cloud clusters for training, regional on‑premises clusters for inference, and edge micro‑clusters for low‑latency tasks. Hardware configurations—GPU/xPU ratios, NIC speeds, NVMe topologies—must adapt fluidly to workload demands.
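One way to picture that fluid adaptation is as declarative per‑tier cluster profiles. The sketch below is hypothetical; the ratios, NIC speeds, and drive counts are made‑up values chosen to show the tiering idea, not recommendations:

```python
from dataclasses import dataclass

@dataclass
class ClusterProfile:
    """Hypothetical per-tier hardware profile; every value is illustrative."""
    tier: str
    gpu_share: float      # fraction of accelerators that are GPUs vs. xPUs
    nic_gbps: int         # per-node network speed
    nvme_per_node: int    # local NVMe drives per node

PROFILES = [
    ClusterProfile("central-training",   gpu_share=0.8, nic_gbps=800, nvme_per_node=8),
    ClusterProfile("regional-inference", gpu_share=0.4, nic_gbps=200, nvme_per_node=4),
    ClusterProfile("edge-micro",         gpu_share=0.1, nic_gbps=25,  nvme_per_node=1),
]

for p in PROFILES:
    print(f"{p.tier}: {p.gpu_share:.0%} GPU, {p.nic_gbps} Gbps NIC, {p.nvme_per_node}x NVMe")
```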

Portfolio Diversification & Market Recovery

A broad mix of systems and technologies is essential, covering general‑purpose AI deployments and highly specialized vertical workloads. Although enterprise AI budgets have stalled, they’re beginning to rebound. A breakout is expected in 2026 as organizations seek efficiency and ROI. Meanwhile, industrial and automotive sectors highlight supply‑chain risks and uncertainties that diversified portfolios mitigate.

Vendors with broad portfolios of chips, software, and services are better insulated against downturns. Not all vendors will win with a single SKU; diversification is key.

The Enterprise Break‑Out

Enterprises, slower and sometimes hesitant to invest in AI at scale, are quietly ramping up budgets. By 2026, corporate AI CapEx is expected to accelerate sharply, driven by:

  • Efficiency mandates: CFOs demanding proof of ROI from PoC pilots.
  • CapEx rationalization: Stretching every dollar via mixed GPU/xPU clusters and hardware‑accelerated inference appliances.
  • C‑Suite sponsorship: The rise of Chief AI Officers who centralize strategy, consolidate platform decisions, and drive enterprise‑wide roll‑outs.

Early examples include financial services piloting real‑time credit scoring clusters, manufacturers embedding AI‑driven defect detection, and retailers personalizing shopping experiences with on‑prem inference.

Edge AI, Telco & Sovereign‑Cloud Momentum

At Mobile World Congress 2025 in Barcelona, conversations on theCUBE with 30+ enterprises and operators revealed a consensus: sovereign‑cloud AI is real and urgent. National regulations mandate that sensitive data and model execution stay within domestic borders. Operators want turnkey, on‑country AI clouds that rival AWS, Azure, and Google Cloud but remain under local control. They can deploy inference nodes within regulated borders, integrate them into national 5G cores, and offer AI‑as‑a‑service without offloading data to foreign clouds.

Edge AI & Physical‑AI Convergence

Edge AI embeds intelligence in the physical world—smart cameras that detect safety hazards, autonomous logistics robots, and digital twins monitoring factory floors. New AI‑enabled chipsets will power these devices, blurring OT and IT boundaries.

By 2026, tens of millions of edge nodes will form a distributed AI environment that complements centralized cloud systems. Physical AI and robotics will become major vectors in AI infrastructure.

AI Model Efficiency & New Usage Patterns

Every month brings major leaps in algorithmic efficiency, yet efficiency fuels demand rather than dampening it, as new reasoning workloads absorb the gains. Traditional Q&A might consume 10 million tokens; multi‑step chain‑of‑thought reasoning blasts that to 100 million tokens. Real‑time agents consulting knowledge graphs add more compute layers, driving ever‑larger clusters.

The Innovation Flywheel

  1. Algorithmic and system improvements reduce per‑token costs by 2–4×.
  2. New usage models (chain‑of‑thought, retrieval‑augmented generation, multi‑modal tasks) increase per‑session token consumption by 10–20×.
  3. Cluster purchases scale with demand, driving rack deployments.
  4. Hyperscalers and enterprises renew or expand contracts for GPUs, xPUs, switches, and optics, boosting vendor revenues.
  5. Vendor R&D doubles down on integration, power efficiency, and turnkey automation, feeding back into algorithmic gains.
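The net effect of steps 1 and 2 can be sanity‑checked with simple arithmetic using the ranges cited above; even in the best case for efficiency, total demand still grows:

```python
# Net compute demand when efficiency gains and usage growth compound.
# Both ranges are the ones cited in the flywheel steps above.

efficiency_gain = (2, 4)     # per-token cost drops 2-4x (step 1)
usage_growth = (10, 20)      # per-session tokens grow 10-20x (step 2)

worst_case = usage_growth[0] / efficiency_gain[1]   # most efficiency, least usage
best_case = usage_growth[1] / efficiency_gain[0]    # least efficiency, most usage

print(f"Net demand multiplier: {worst_case:.1f}x to {best_case:.1f}x")
# Even a 4x per-token cost reduction cannot offset a 10x usage jump:
# total demand still grows at least 2.5x, which keeps the flywheel spinning.
```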

This self‑reinforcing cycle cements AI Factories’ centrality in computing and rewards vendors with unified, end‑to‑end platforms.

Revenue Growth & Operating Performance

Vendors that master AI Factory solutions stand to reap windfalls. Hyperscalers committed over $300 billion in CapEx for 2025, two‑thirds of which will support AI compute and networking. Enterprises, now guided by Chief AI Officers, are poised to unlock new CapEx waves for AI‑native infrastructure.

Tariff & Supply‑Chain Agility

No discussion of global AI infrastructure is complete without addressing U.S.‑China tensions. In April 2025, proposed semiconductor tariffs threatened 10–25% duties on key components. Manufacturers are passing those duties through to customers and relocating manufacturing footprints from China and Vietnam to Mexico and Canada. Agile logistics, turning geopolitical risk into a competitive moat, will cement customer trust in an uncertain world.

Emerging Areas to Watch

As we enter the heart of the AI Factory era, several innovation curves demand attention:

  • Datacenter as Computer: Racks and blades dissolve into pooled accelerators, composable networks, and smart storage. System architects become software‑defined infrastructure designers.
  • Co‑Packaged Silicon & Optics: Breaking copper‑reach limits could enable 1,000+ GPU/xPU clusters per hop, if power and thermal budgets are mastered in a single module.
  • Scale‑Up vs. Scale‑Out: Will clusters adopt fat‑node topologies (scale‑up) or disaggregated fabrics (scale‑out)? The outcome will shape switch ASIC design, optical standards, and cabling.
  • Open‑Stack Momentum: Hyperscalers’ blueprints—DC/OS, Kubernetes, Ray—could merge with enterprise frameworks, spawning new AI orchestration platforms.
  • AI‑Driven System Configuration: AI itself will optimize infrastructure, using reinforcement learning to tune cluster parameters, allocate resources, and auto‑remediate faults; a toy sketch of the idea follows this list.
  • Bellwether CapEx Spenders: Amazon, Alphabet, Microsoft, and Meta. Their combined AI spending—over $200 billion annually—will set the price‑performance bar for hardware suppliers.
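To ground the AI‑driven configuration bullet, here is a toy sketch of automated parameter tuning. It uses simple noisy hill climbing against a fake reward function rather than full reinforcement learning; the batch‑size objective and every number in it are invented for illustration:

```python
import random

# Toy sketch of AI-driven configuration: a noisy hill-climbing tuner
# searching for the batch size that maximizes a simulated throughput score.
# The reward function is fake; a real system would read live telemetry,
# and production tuners would use proper RL or Bayesian optimization.

def throughput(batch_size: int) -> float:
    """Fake reward that peaks near batch size 512, with telemetry-like noise."""
    return -abs(batch_size - 512) + random.uniform(-8, 8)

batch, step = 64, 64
for _ in range(40):
    candidate = max(1, batch + random.choice([-step, step]))
    if throughput(candidate) > throughput(batch):  # greedy: keep only improvements
        batch = candidate

print(f"Tuned batch size: {batch}")  # lands near 512 on most runs
```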

What’s Next on the AI Factory Roadmap

We stand at the dawn of a new computing era. AI Factories will drive the next wave of productivity, automation, and innovation across every industry. Established and emerging players are racing to fill niches in the AI‑factory supply chain.

For enterprises and hyperscalers alike, the message is clear: the accelerator, network, cloud, and supply‑chain choices you make today will determine your competitive posture for the decade to come. The AI Factory era starts now.
