At the “What’s Next with AWS?” event in San Francisco, Amazon Web Services delivered a clear message to enterprise organizations: the AI conversation is no longer about simply accessing models. It is now about operationalizing agents, simplifying inference, and building AI-powered applications that can execute real business workflows at scale.
The announcements and discussions throughout the event reflected a broader shift occurring across the enterprise market. Organizations are moving beyond AI experimentation and proofs of concept toward operational deployment. The emphasis is rapidly shifting from training models to managing inference, orchestrating agents, integrating enterprise systems, and measuring Return on AI (ROAI).
That transition matters because enterprises are beginning to realize that agentic AI is not a model problem. It is an infrastructure, orchestration, data platform, and token economics problem.
During my discussion with Bharat Sandhu, AWS’s lead for AI and ML marketing, the focus repeatedly returned to one central theme: AWS aims to solve what it calls the “last mile” of AI adoption.
The Enterprise AI Market Has Reached an Inflection Point
One of the most revealing data points shared during the discussion was the scale of growth AWS is seeing around inference and agents. According to Sandhu, the number of tokens processed on Amazon Bedrock during Q1 2026 exceeded the total number of tokens processed across all previous years combined.
That statistic is important for several reasons.
First, it reinforces how quickly enterprise demand for AI services is accelerating. Second, it highlights how inference, not training, is rapidly becoming the dominant operational workload in enterprise AI environments. Finally, it shows that organizations are increasingly standardizing around platforms that simplify deployment, governance, and operational management.
The reality is that the barriers to entry for AI development have collapsed.
Where organizations previously needed specialized data science teams and highly customized infrastructure stacks, enterprises can now begin building agentic workflows with far less operational friction. AWS is betting heavily that simplifying this process will become one of the biggest competitive differentiators in the AI market.
Bedrock Becomes the Operational Layer for Enterprise AI
A major focus of the event was the continued expansion of Amazon Bedrock as both an inference platform and an agent-building environment.
AWS positioned Bedrock as more than simply a managed model marketplace. The company increasingly views Bedrock as the operational foundation for enterprise AI applications. This includes (a minimal inference sketch follows the list):
- Hosting models
- Managing inference
- Supporting agent runtimes
- Enabling orchestration
- Simplifying framework integration
- Providing cost optimization controls
- Managing security and governance
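To make “managing inference” concrete, here is a minimal sketch using the Bedrock Converse API via boto3. The region, model ID, and prompt are illustrative assumptions; the call requires AWS credentials and a model that has been enabled in your account.

```python
import boto3

# Bedrock's runtime client exposes a unified Converse API across hosted models.
bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")

response = bedrock.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # illustrative model choice
    messages=[
        {"role": "user", "content": [{"text": "Summarize our Q3 incident reports."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
# Token counts come back with every call, which is what makes the cost
# tracking and ROAI accounting discussed later in this piece practical.
print(response["usage"])  # {'inputTokens': ..., 'outputTokens': ..., 'totalTokens': ...}
```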
The most strategically important announcement centered on Bedrock Managed Agents powered by OpenAI.
AWS described the enterprise agent market as splitting into two primary architectural models:
- Opinionated, tightly integrated model-and-harness systems optimized for simplicity and speed
- Open, modular architectures that prioritize flexibility and customization
This split reflects a broad industry direction and represents a significant evolution in how enterprises assemble agentic systems.
AWS explained that modern agentic systems increasingly rely on what it called a “harness”: the orchestration and operational logic surrounding a model (sketched in code after the list). That includes:
- Tool calling
- Workflow execution
- Memory handling
- API integration
- Runtime orchestration
- Skill execution
- MCP connectivity
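As a conceptual illustration, a harness is essentially a loop that lets the model alternate between reasoning and acting. The sketch below is hypothetical: `call_model` is a scripted stand-in for a real Bedrock invocation, and the tool registry is invented for the example.

```python
import json

# Hypothetical harness loop: route the model's tool requests to real
# functions, feed results back, and stop at a final answer.
TOOLS = {
    "lookup_order": lambda args: {"order_id": args["order_id"], "status": "shipped"},
}

def call_model(messages):
    """Scripted fake model: requests a tool once, then answers.
    A real harness would call Bedrock here and parse the response."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "lookup_order", "args": {"order_id": "A-1001"}}
    return {"content": "Order A-1001 has shipped."}

def run_agent(user_request, max_steps=5):
    messages = [{"role": "user", "content": user_request}]
    for _ in range(max_steps):
        reply = call_model(messages)
        if "tool" in reply:                       # model asked to act
            result = TOOLS[reply["tool"]](reply["args"])
            messages.append({"role": "tool", "content": json.dumps(result)})
            continue                              # hand the result back to the model
        return reply["content"]                   # final answer
    return "Step budget exhausted."

print(run_agent("Where is order A-1001?"))
```

Everything in the harness, from tool schemas and memory to step budgets and error handling, is surface area that model and harness builders are now tuning together.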
The key insight is that models and harnesses are now being optimized together. OpenAI’s models and orchestration layers, for example, are increasingly co-developed to improve performance and reduce operational complexity. AWS is integrating these tightly coupled systems into Bedrock so enterprises can deploy production-grade agents more quickly and with less tuning.
This reflects a broader shift in enterprise AI architecture.
The market is moving away from isolated model experimentation and toward integrated AI operational systems.
AgentCore Supports an Open AI Ecosystem
At the same time, AWS is carefully balancing openness and abstraction.
While Bedrock Managed Agents provide a simplified, opinionated approach, AWS continues to position AgentCore as the more flexible and open framework for organizations that want greater control over model selection, orchestration logic, and agent architectures.
This dual-platform strategy is important because enterprises are increasingly worried about lock-in.
Some organizations want tightly integrated platforms that abstract away complexity. Others want the flexibility (see the routing sketch after this list) to:
- Use multiple models
- Optimize for cost or latency
- Fine-tune workflows
- Swap orchestration frameworks
- Build multi-agent systems
- Customize inference pipelines
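One common pattern behind “use multiple models” and “optimize for cost or latency” is a routing layer. The sketch below is hypothetical: the model IDs follow Bedrock’s naming, but the prices and the routing rule are illustrative stand-ins for a real classifier.

```python
# Hypothetical cost-aware router across two models of different sizes.
MODELS = {
    "small": {"id": "amazon.nova-lite-v1:0", "usd_per_1k_input": 0.00006},  # assumed price
    "large": {"id": "anthropic.claude-3-5-sonnet-20240620-v1:0", "usd_per_1k_input": 0.003},
}

def pick_model(prompt: str, needs_reasoning: bool = False) -> str:
    """Send short, simple requests to the cheap model; escalate the rest.
    The length threshold and flag stand in for a learned router."""
    if needs_reasoning or len(prompt) > 4_000:
        return MODELS["large"]["id"]
    return MODELS["small"]["id"]

print(pick_model("Summarize this support ticket."))                       # -> small model
print(pick_model("Plan a phased data migration.", needs_reasoning=True))  # -> large model
```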
AWS appears to understand that the enterprise market will not converge around a single AI architecture pattern.
Instead, organizations will operate across a spectrum ranging from fully managed AI environments to highly customized open agent ecosystems.
AI Workflows Are Evolving Into Enterprise Applications
Another major takeaway from the event was the evolution from AI workflows into AI-native applications.
AWS outlined four primary enterprise adoption patterns emerging today:
- Deterministic workflows enhanced with AI
- Coding agents and developer productivity tools
- Existing applications augmented with AI services
- Entirely new AI-native, cloud-native applications built from the ground up
This progression is important.
Most enterprises begin with incremental automation of processes. They enhance existing workflows with summarization, OCR, translation, or conversational interfaces. Over time, organizations evolve toward applications where AI becomes the primary interaction layer.
The discussion of Strava’s “Athlete Intelligence” capability illustrated this transition well. AWS described how Bedrock enables Strava to combine user data, AI summarization, voice interaction, and personalized insights directly into the customer experience.
This is the future direction of enterprise software.
Applications are increasingly becoming adaptive, conversational, and agent-driven rather than static systems of record.
Connect Reflects the Shift Toward Agentic Enterprise Applications
AWS also highlighted how Connect is evolving beyond its traditional contact-center role into a broader platform for agentic enterprise applications.
That evolution matters because many enterprises are beginning to treat AI agents as a new application delivery model.
Organizations increasingly want to:
- Automate workflows
- Connect data systems
- Integrate enterprise APIs
- Execute tasks autonomously
- Generate insights dynamically
- Create conversational operational interfaces
The implication extends well beyond customer service into domains such as healthcare and supply chain operations.
AI agents are becoming the interaction layer between users, enterprise data, and operational systems.
This mirrors what I continue to see across the market: AI is becoming the new user experience layer for enterprise infrastructure and applications.
My Take: ROAI Becomes the Enterprise Decision Framework
One of the most important parts of the conversation centered on ROAI, or Return on AI.
As organizations operationalize AI, they are realizing that AI economics are fundamentally different from traditional infrastructure economics. Token consumption, inference variability, context-window growth, reasoning complexity, and orchestration overhead all create new operational cost dynamics.
This is forcing enterprises to rethink how they evaluate AI investments.
AWS emphasized that organizations should focus on business-level ROI rather than token-level optimization alone. At the same time, AWS outlined several mechanisms designed to reduce inference costs (a back-of-envelope cost example follows the list):
- Model selection flexibility
- Prompt caching
- Intelligent routing
- Reserved capacity
- Model distillation
- Dynamic workload optimization
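A back-of-envelope example shows why one of these levers, prompt caching, moves ROAI. Every price and discount below is an assumption for illustration, not quoted Bedrock pricing.

```python
# Assumed prices: $3/M input tokens, $15/M output tokens, cached input at 10%.
PRICE_IN = 3.00 / 1_000_000
PRICE_OUT = 15.00 / 1_000_000
CACHE_DISCOUNT = 0.10

def request_cost(prompt_tokens, output_tokens, cached_tokens=0):
    uncached = prompt_tokens - cached_tokens
    return (uncached * PRICE_IN
            + cached_tokens * PRICE_IN * CACHE_DISCOUNT
            + output_tokens * PRICE_OUT)

# An agent that re-sends a 20k-token system prompt on each of 1,000 calls:
baseline = 1_000 * request_cost(22_000, 800)
cached = 1_000 * request_cost(22_000, 800, cached_tokens=20_000)
print(f"baseline ${baseline:.2f} vs cached ${cached:.2f}")  # ~$78.00 vs ~$24.00
```

Under these assumed numbers, caching the shared prompt cuts the bill by roughly two-thirds without touching the model or the workload.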
This is where the infrastructure conversation becomes critical.
Inference economics are rapidly becoming the defining battleground of enterprise AI.
Organizations are moving from measuring GPU utilization (is the hardware busy?) toward measuring GPU productivity (how much useful work does each GPU-hour actually deliver?).
That changes how enterprises think about (a sizing sketch follows the list):
- Memory architectures
- KV cache persistence
- CPU/GPU coordination
- RAM optimization
- Distributed inference
- Storage integration
- Edge deployment
- Power consumption
- Latency guarantees
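A quick sizing sketch shows why KV cache persistence and memory architecture sit on that list. The model dimensions below describe a hypothetical large dense transformer and are assumptions, not any specific product.

```python
# KV cache grows linearly with context length: 2 (K and V) x layers x
# KV heads x head dim x bytes, per token, per concurrent request.
layers = 80           # assumed depth
kv_heads = 8          # assumed grouped-query-attention KV heads
head_dim = 128
bytes_per_value = 2   # fp16/bf16

def kv_cache_bytes(context_tokens: int) -> int:
    return 2 * layers * kv_heads * head_dim * bytes_per_value * context_tokens

for ctx in (8_000, 128_000):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 1e9:.1f} GB per request")
```

At roughly a third of a megabyte per token under these assumptions, one 128k-token context pins about 42 GB of accelerator memory before model weights are counted, which is why persisting, sharing, and evicting KV caches is becoming its own discipline.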
Inference Is Becoming the Most Important AI Infrastructure Challenge
One of the strongest technical discussions in the conversation centered on inference architecture itself.
Sandhu explained that inference workloads are significantly more complicated than many organizations initially assume. Unlike training workloads, inference involves a constantly shifting combination of (see the sketch after this list):
- CPU-intensive tokenization
- GPU-intensive prefill operations
- Memory-intensive context management
- De-tokenization workloads
- KV cache coordination
- Dynamic request variability
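Rough roofline arithmetic, with assumed hardware and model numbers, shows why prefill and decode stress hardware so differently.

```python
# Assumed: a 70B-parameter model in fp16 on an accelerator with ~1 PFLOP/s
# of usable compute and ~3 TB/s of memory bandwidth.
params = 70e9
flops_per_token = 2 * params   # ~2 FLOPs per parameter per generated/processed token
gpu_flops = 1e15
gpu_bandwidth = 3e12
weight_bytes = params * 2      # fp16 weights

# Prefill processes all prompt tokens in parallel -> compute-bound.
prefill_seconds = (50_000 * flops_per_token) / gpu_flops

# Decode streams every weight from memory per generated token -> bandwidth-bound.
decode_tokens_per_s = gpu_bandwidth / weight_bytes

print(f"prefill of a 50k-token prompt: ~{prefill_seconds:.0f}s of pure compute")
print(f"decode ceiling at batch size 1: ~{decode_tokens_per_s:.0f} tokens/s")
```

That asymmetry, compute-bound prefill against bandwidth-bound decode, is exactly why batching, caching, and disaggregated inference architectures have become first-order design decisions.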
This is why inference architectures are becoming increasingly complex.
Every request can vary dramatically in size, reasoning depth, latency sensitivity, and compute requirements. A simple conversational prompt may consume minimal resources, while a large reasoning workflow involving massive document sets can generate highly unpredictable operational demands.
This is also why AWS continues investing heavily across:
- Trainium
- Graviton
- Bedrock optimization
- Distillation
- Intelligent routing
- Distributed inference architectures
The enterprise market is beginning to understand that inference is where AI value is operationalized—and where infrastructure economics are ultimately won or lost.
So What?
The significance of the “What’s Next with AWS?” event is not simply the announcements themselves.
The larger story is that AWS is positioning itself as the operational platform for enterprise agentic AI.
The company is building:
- Agent orchestration layers
- Managed inference services
- Open and opinionated agent frameworks
- AI-native application tooling
- Cost optimization controls
- Infrastructure abstractions
- Enterprise governance models
- Distributed inference architectures
AWS understands that the enterprise AI market is rapidly shifting from experimentation toward operationalization.
Organizations are no longer asking: “What model should we use?”
They are increasingly asking: “How do we operationalize AI securely, efficiently, economically, and at scale?”
That is a fundamentally different conversation.
And based on the announcements and discussions coming out of San Francisco, AWS is clearly trying to provide that operational layer of the AI stack.
Feel free to reach out and stay connected: email rob@smugetconsulting.com, follow @realstrech on x.com, or comment on my LinkedIn posts.

