Fueling Real-Time AI with Federated Queries

The Stale Data Crisis

A global study revealed that 82% of companies are making business decisions based on stale information, even though 86% say they need real-time data to make smart decisions. This gap is slowing the development of AI-powered applications. In this episode of AppDevANGLE, I sat down with Justin Borgman, CEO and co-founder of Starburst, to discuss the key to solving this crisis: building an application architecture that can query data where it lives. Starburst, built on the open-source query engine Trino, enables a Data as a Product paradigm, giving developers the right data, governed and in real time, to build responsible AI features.

The AI Data Challenge: Context, Controls, and Hallucinations

AI is fundamentally a data challenge. For developers building chatbots and interactive applications, the core difficulty lies in accessing the right, relevant, and timely data to give LLMs context and reduce hallucinations.

Traditional data architectures force data movement (ETL), which creates lag, complexity, and cost. Starburst, rooted in the Trino project (which began as Presto at Facebook), provides a query engine that avoids this by running federated queries across diverse sources without moving the data.

Crucially, in the enterprise context, simply accessing data is not enough; building on that data also means solving a governance challenge:

“Not every user should have access to all of the data that’s available. And so you need to really bring together multiple disciplines to build a successful AI-driven application around governance, access controls, and then, of course, the performance…”

Solving the “stale data crisis” for AI means unifying access control policies in a central layer so that row-level, column-level, and data masking policies are applied to every query, regardless of the data’s physical location.
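
To make the idea of a central policy layer concrete, here is a minimal sketch in Python. It is not Starburst's or Trino's API: the `POLICIES` table, the role names, and the `apply_policy` helper are all hypothetical, and a real engine would enforce these rules inside the query planner rather than on rows in application code. The point is only that one policy definition can drive row filtering, column filtering, and masking, no matter where the rows came from.

```python
from typing import Callable, Dict, List, Set

Row = Dict[str, object]


def mask(value: object) -> str:
    """Hide everything except the last four characters of a value."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


# Hypothetical central policy table: one entry per role.
# "columns" is the column-level policy, "masked" the masking policy,
# and "row_filter" the row-level policy.
POLICIES: Dict[str, Dict[str, object]] = {
    "analyst": {
        "columns": ["customer_id", "region", "card_number"],
        "masked": {"card_number"},
        "row_filter": lambda row: row["region"] == "EU",
    },
    "admin": {
        "columns": ["customer_id", "region", "card_number", "email"],
        "masked": set(),
        "row_filter": lambda row: True,
    },
}


def apply_policy(rows: List[Row], role: str) -> List[Row]:
    """Apply the role's row, column, and masking rules to a result set."""
    policy = POLICIES[role]
    columns: List[str] = policy["columns"]
    masked: Set[str] = policy["masked"]
    row_filter: Callable[[Row], bool] = policy["row_filter"]

    visible: List[Row] = []
    for row in rows:
        if not row_filter(row):
            continue  # row-level policy: drop rows the role may not see
        visible.append(
            {col: mask(row[col]) if col in masked else row[col] for col in columns}
        )
    return visible


# Illustrative rows; in practice these could come from any source.
ROWS: List[Row] = [
    {"customer_id": 1, "region": "EU", "card_number": "4111111111111111", "email": "a@example.com"},
    {"customer_id": 2, "region": "US", "card_number": "5500000000000004", "email": "b@example.com"},
]

analyst_view = apply_policy(ROWS, "analyst")
admin_view = apply_policy(ROWS, "admin")
```

In this sketch the analyst sees one EU row with a masked card number and no email column, while the admin sees both rows unmasked. Because the rules live in one table, adding a new data source does not require restating them.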

Data as a Product: The Human and Technical API

The concept of Data as a Product (DaaP), which originated with the Data Mesh paradigm, is changing how developers approach application design, ownership, and systems architecture.

Under DaaP, data is treated as a curated, governed building block. Instead of relying on centralized data teams, ownership is decentralized to domain experts. This creates two distinct “APIs” for developers:

  1. The Human API: A Product Owner is assigned to manage the data product end-to-end. This person is the single point of contact who ensures the data is fit for purpose, pure, and reliable for the application’s specific business outcome. This is the human interface for data governance.
  2. The Technical API: The data product is exposed through a standardized query engine. Starburst/Trino allows developers to use standard SQL to query data that might be in a legacy Teradata warehouse, an operational store like MongoDB or Cassandra, or a cloud data warehouse like Snowflake, all in a single query.
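
As a rough illustration of what "a single query" across sources looks like, the sketch below uses Python's built-in `sqlite3` module as a stand-in. This is not Trino: it attaches two separate in-memory SQLite databases (the names `warehouse` and `opstore` are made up for the example) and joins across them in one SQL statement. In Trino the same idea is expressed by addressing tables as `catalog.schema.table`, with each catalog pointing at a different system.

```python
import sqlite3
from typing import List, Tuple


def run_federated_query() -> List[Tuple[str, float]]:
    """Join tables that live in two separate databases with one SQL statement."""
    conn = sqlite3.connect(":memory:")

    # Two independent in-memory databases stand in for two source systems.
    conn.execute("ATTACH DATABASE ':memory:' AS warehouse")
    conn.execute("ATTACH DATABASE ':memory:' AS opstore")

    conn.execute("CREATE TABLE warehouse.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE opstore.orders (customer_id INTEGER, amount REAL)")

    conn.executemany(
        "INSERT INTO warehouse.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO opstore.orders VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )

    # One query spans both databases; no data is copied between them first.
    federated_sql = """
        SELECT c.name, SUM(o.amount) AS total
        FROM warehouse.customers AS c
        JOIN opstore.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    try:
        return conn.execute(federated_sql).fetchall()
    finally:
        conn.close()


totals = run_federated_query()
```

The developer writes ordinary SQL and never handles the two stores separately; the engine resolves where each table lives. A real federated engine adds the hard parts this sketch skips, such as connectors, pushdown, and distributed execution.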

This paradigm shift is vital because the consumer of data is increasingly becoming the AI agent itself, which requires the same (if not stricter) standards of purity and governance as a human analyst.

Unlocking Legacy Systems and Safe AI Experimentation

The challenge of legacy systems is particularly acute for DaaP. Existing systems of record have embedded compliance and security controls that are lost when the data is extracted. The federated query approach addresses this in two ways:

  • Centralized Governance, Distributed Data: Starburst ensures that access control policies are defined centrally and applied to every query that runs through the system, bridging the security gap between old and new systems.
  • Open Formats and Cost Control: To keep the data foundation cost-effective and open, the combination of Trino and the open table format Iceberg (originally developed at Netflix) is becoming a de facto standard. Iceberg provides the low-cost, open-source storage structure, while Trino provides the high-performance query layer, allowing organizations to avoid the enormous costs of proprietary data processing.

The DaaP mindset is essential for safely incorporating AI features into the CI/CD pipeline. Developers must rethink monolithic systems in favor of these modular, curated building blocks. To support safe experimentation, testing must focus on hallucination detection, bias mitigation, and consistency validation across different model versions using fixed benchmark datasets.
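
One way to picture consistency validation against a fixed benchmark is the small Python sketch below. Everything in it is illustrative: `model_v1` and `model_v2` are canned stubs standing in for two model versions, the benchmark has three made-up prompts, and exact matching after normalization is a deliberately crude proxy for correctness. A real pipeline would call actual models and use stronger scoring.

```python
from typing import Callable, Dict, List

# Fixed benchmark: the same prompts and expected answers for every model version.
BENCHMARK: List[Dict[str, str]] = [
    {"prompt": "What is the refund window?", "expected": "30 days"},
    {"prompt": "What is the capital of France?", "expected": "Paris"},
    {"prompt": "Which currency does the EU store use?", "expected": "EUR"},
]

# Canned answers standing in for two model versions (hypothetical stubs).
_V1_ANSWERS = {
    "What is the refund window?": "30 days",
    "What is the capital of France?": "Paris",
    "Which currency does the EU store use?": "EUR",
}
_V2_ANSWERS = {
    "What is the refund window?": "30 days",
    "What is the capital of France?": "paris ",
    "Which currency does the EU store use?": "USD",
}


def model_v1(prompt: str) -> str:
    return _V1_ANSWERS[prompt]


def model_v2(prompt: str) -> str:
    return _V2_ANSWERS[prompt]


def normalize(text: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def accuracy(model: Callable[[str], str], benchmark: List[Dict[str, str]]) -> float:
    """Share of benchmark prompts the model answers as expected."""
    hits = sum(
        normalize(model(case["prompt"])) == normalize(case["expected"])
        for case in benchmark
    )
    return hits / len(benchmark)


def consistency(
    model_a: Callable[[str], str],
    model_b: Callable[[str], str],
    benchmark: List[Dict[str, str]],
) -> float:
    """Share of benchmark prompts on which two versions give the same answer."""
    same = sum(
        normalize(model_a(case["prompt"])) == normalize(model_b(case["prompt"]))
        for case in benchmark
    )
    return same / len(benchmark)


def release_gate(threshold: float = 0.9) -> bool:
    """Pass only if the new version is accurate and agrees with the old one."""
    return (
        accuracy(model_v2, BENCHMARK) >= threshold
        and consistency(model_v1, model_v2, BENCHMARK) >= threshold
    )


gate_passed = release_gate()
```

Here the second version drifts on one answer, so the gate fails and the change would be held back in CI. Keeping the benchmark fixed is what makes the comparison between versions meaningful.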

The future of AppDev is being built by the “citizen developer”: finance, marketing, and sales teams using AI to create applications. This explosion of non-professional developers accessing core data requires a data foundation that is inherently secure and correct. Starburst is actively building for this future with the imminent release of its MCP server, enabling agent-to-agent connectivity where multiple AI agents will communicate with Starburst as the foundational “data dealer.”

Conclusion and Next Steps

The consequences of using “wrong data” for AI are too high to ignore. Deep data integrity is the only way to build reliable, responsible, and scalable AI applications.

For those looking to explore this new architecture, the best advice is to experiment with the open technologies:

  • Explore Starburst Galaxy (available with a free tier on starburstdata.com) to practice connecting to disparate data sources and running federated queries.
  • Investigate Iceberg and Trino to understand how open formats and decoupled query engines enable a cost-effective, governed, and highly performant data layer.
  • Adopt the Data Product philosophy by assigning ownership and governance to curated data sets, turning your data from a sprawling liability into a foundational, trustworthy asset for the AI era.
