The modern application stack is undergoing one of the most consequential transformations since the birth of cloud-native. With serverless computing returning to center stage and edge environments now treated as primary destinations—not just content caches—developers are embracing new tools and approaches that can scale intelligently, run everywhere, and keep pace with the demands of real-time AI.
At the core of this shift is WebAssembly (Wasm), which has quickly evolved from a browser-based novelty into a legitimate foundation for portable, secure, and high-performance execution. WebAssembly’s platform-agnostic capabilities, coupled with the resurgence of serverless design principles, are unlocking new opportunities for organizations trying to bridge modern innovation with legacy infrastructure. Whether you’re deploying applications to retail branches, manufacturing floors, or smart cities, the ability to build once and run anywhere—quickly—is critical.
As Matt Butcher, CEO of Fermyon, put it during our recent conversation:
“The neat thing about a binary-neutral format like WebAssembly is that when you’re dealing with AI inferencing or things that are typically hardware-bound, having that layer of abstraction means the developer can write code that’s going to work and run anywhere.”
The rise of WebAssembly and serverless at the edge
There’s a growing consensus among developers and infrastructure teams alike: the edge is no longer a second-class citizen. In fact, according to our research, 39% of organizations are already using WebAssembly in some capacity, and we expect that number to climb rapidly. What’s driving this momentum? The confluence of serverless design patterns, edge-native architectures, and AI-driven applications is forcing a rethink of where—and how—compute happens.
As Matt explained, serverless began as a novel way to offload work onto cloud provider infrastructure, with AWS Lambda pioneering the approach. That model, however, was built on spare compute cycles and first-generation infrastructure, not optimized runtimes. Cold-start latency, often measured in hundreds of milliseconds, posed significant performance problems for frontend experiences and real-time systems.
“We hit a limit… You cannot run serverless functions as frontline compute because there’s a cold start problem… 200 to 500 milliseconds sounds small, but it’s significant. Google research shows bounce rates start climbing at just 100 milliseconds of delay.”
To push serverless beyond its utility compute roots, developers need faster startup times, better portability, and the ability to run workloads closer to users. That’s where WebAssembly shines. Lightweight, architecture-neutral, and blazing fast, Wasm is proving to be the runtime of choice for next-gen edge-native applications.
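To make that concrete, here’s a minimal sketch of a serverless edge function written with Fermyon’s Spin framework (more on Spin below) and its Rust SDK. The handler name is ours and SDK details vary by version, but the shape is representative: one small function compiled to a Wasm component that instantiates in well under a millisecond, sidestepping the cold-start problem Matt describes.

```rust
use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::http_component;

// A complete serverless function: no server process, no container
// image, just a Wasm component the platform instantiates per request.
#[http_component]
fn handle_request(_req: Request) -> anyhow::Result<impl IntoResponse> {
    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body("Hello from the nearest point of presence")
        .build())
}
```

Because the compiled artifact is architecture-neutral, the same component can be run locally with `spin up` and pushed out to a global network with `spin deploy`, with no cross-compilation matrix to maintain.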
As performance-sensitive workloads move to the edge—driven by e-commerce, media, and AI—the ability to execute secure, fast, and portable code is essential. “Edge-native” isn’t just a buzzword. It reflects a reality where compute spans a continuum, from data centers to retail branches to cars and cameras in the field.
“Edge-native computing means the application doesn’t live in the data center and get accelerated by the edge—it lives at the edge… You’re no longer just deploying to Kubernetes in US-West-2. You’re deploying to a vast network of global points of presence.”
AI inferencing meets WebAssembly: A new performance paradigm
It’s impossible to talk about edge computing without mentioning AI. According to our recent research, 54% of edge workloads now use AI in production, and inferencing at the edge is becoming a standard part of modern architectures. But the traditional answer of throwing more GPUs at the problem isn’t viable in constrained environments.
Matt emphasized the importance of separating AI training and inference. While training models still requires massive GPU clusters, inferencing can—and should—happen closer to the user, reducing latency and improving personalization. Fermyon’s work with Akamai, the world’s largest edge network, reflects this vision.
“You can’t upgrade all of Akamai’s points of presence to have huge hardware profiles. You need a runtime that’s lightweight and performant. That’s where WebAssembly is a perfect fit.”
Running inference at the edge isn’t just about performance—it’s also about sovereignty and relevance. Delivering prompts that are geographically and contextually appropriate makes AI outputs more useful and efficient. With tools like Fermyon’s platform and Google’s new autoscaling capabilities for WebAssembly, organizations are starting to unlock this edge-AI synergy in real deployments.
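As a sketch of how this looks in code, the following assumes Spin’s Serverless AI interface; the handler name is hypothetical, and the available models and SDK details depend on the version and host platform. The architectural point is what matters: the prompt is handled on the edge node where the request lands rather than shipped to a centralized GPU cluster.

```rust
use spin_sdk::http::{IntoResponse, Request, Response};
use spin_sdk::{http_component, llm};

// Inference runs where the request arrives, so prompt and response
// stay in-region: lower latency and a simpler sovereignty story.
#[http_component]
fn infer_at_edge(req: Request) -> anyhow::Result<impl IntoResponse> {
    let prompt = String::from_utf8_lossy(req.body());

    // Ask the platform-hosted model for a completion; the application
    // code never provisions or addresses a GPU directly.
    let result = llm::infer(llm::InferencingModel::Llama2Chat, &prompt)?;

    Ok(Response::builder()
        .status(200)
        .header("content-type", "text/plain")
        .body(result.text)
        .build())
}
```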
Looking ahead: The year of Wasm, Java, and AI optimization
2025 is shaping up to be a breakout year for WebAssembly. One major driver? Broader language support. With Oracle now bringing Java into the Wasm ecosystem, the final barrier to mainstream enterprise adoption may be falling. When developers can use WebAssembly with .NET, Java, and Rust without rewriting legacy applications, it opens up significant transformation potential.
“When you see WebAssembly meet Java and .NET—that’s the tipping point. You’ll start to see very massive adoption of WebAssembly in classic enterprise architectures.”
The edge-native, Wasm-powered future won’t be evenly distributed, but it’s already here for those who are paying attention. The rise of developer tooling (like Spin and SpinKube, now under the CNCF), AI model optimization for inferencing, and real-world announcements from players like Akamai and Google are moving us from theory to execution.
The developer experience is also catching up. As Matt and I discussed, more organizations are realizing they don’t need to rewrite the world—they just need to extend and modernize what they already have, with the right abstractions in place.
Whether you’re deploying to smart cities, retail locations, or embedded devices, WebAssembly is becoming the foundational substrate for a new generation of apps—apps that are smarter, faster, and closer to the people they serve.