Inside the Inaugural Collate Summit '26 Customer Awards, Where Proof Beats Hype

Jun 17, 2026
Steve Wooledge

AI demos on enterprise data are everywhere this year. Production deployments that hold up are rarer, and the teams running them have the results to prove it. At Collate Summit '26, we honored five: OpenAI, Yelp, Scout24, Ambry Genetics, and DMG.

The inaugural Collate x OpenMetadata Awards recognize organizations putting AI to work on their data in production. Their work spans the full range of what that takes, from open source contributions to regulated clinical governance to internal AI data agents serving thousands of employees. What connects these award winners is proof over promise. Rather than pointing a model at raw tables and hoping, they did the work to give their agents the context to answer reliably, and they can show what it delivered: answers in seconds instead of minutes, governed access at scale, measurable confidence in the results.

Collate Summit '26, held virtually on June 10, brought the Collate and OpenMetadata community together to share that work. The five customer award winners spoke at the event, sharing insights from their experiences putting AI for Data into production.

"With AI for Data, the variable is context, not the model. The enterprises shipping reliable AI on their data today are the ones that built and own their context layer, and that layer is data context, semantics, and memory," said Suresh Srinivas, CEO and co-founder, Collate. "AI infers and reasons about data well when it has precise meaning to work from, and it infers confidently wrong when it doesn't. The teams we recognized this year understood that early. They built the foundation first, and their AI agents run on top of it. We exist because of customers like these, and watching what they shipped this year is the most rewarding part of the journey."

AI Innovator of the Year: OpenAI

OpenAI's internal AI data agent, "Kepler," answers data questions in natural language for more than 3,500 employees across a platform of roughly 70,000 datasets and 580-plus petabytes of data queried daily. That production track record earned OpenAI the AI Innovator of the Year award for the boldest use of AI on its own data.

OpenMetadata serves as the knowledge foundation for the agent's context. Rather than pointing a model at raw tables, OpenAI built a layered context model and gave the agent memory, so it learns from prior queries and improves over time. On a repeat question, response time dropped from 22 minutes 41 seconds to 1 minute 22 seconds once the agent had learned where the right data lived.

"Models are really smart, but they're not the full answer, and context is really what makes the difference," said Bonnie Xu, Tech Lead, Data Productivity, at OpenAI. "It's been really useful for us to have OpenMetadata sit on top of our data warehouse for all this extra metadata context that is easy to find for both agents and humans."

OpenMetadata in Production: Yelp

Across roughly 100,000 data assets, Yelp built an in-house OpenMetadata MCP designed for how AI agents search rather than how humans do, using six read-only tools tuned for agent retrieval. The work took the open source project further than anyone, earning Yelp the OpenMetadata in Production award.

The work delivered measurable efficiency. Yelp's search tooling uses about 66 percent fewer tokens than the native equivalent (266 docstring tokens versus 794) and cuts payloads by roughly 80 percent on large context bundles. Yelp contributed those search optimizations back upstream, its first contributions to the open source project it had built on.

"The foundation of data discovery is good search," said Amy Forest, Software Engineer at Yelp. "The great thing about OpenMetadata is the focus on search, the API support is broad and accessible, and it's open source and extensible. That's how we ended up contributing back to OpenMetadata, on top of a few search optimizations."

Business Impact of the Year: Scout24

Scout24 gave every AI agent across its stack a single source of truth to draw on, replacing its legacy data catalog with what it calls a "context catalog" built on OpenMetadata, RDF, and an MCP-first architecture. The enterprise-wide impact earned it the Business Impact of the Year award.

The result is a self-reinforcing system Scout24 calls the AI Virtuous Cycle, where every interaction enriches the context layer and the enriched context produces better answers. Confidence is measurable: answers grounded in context found in Collate carry an 87 percent confidence score. Centralized PII governance and policy enforcement give agents and people consistent, governed access to the same trusted data.

"Simply allowing LLMs to query data without context leads to a lot of complications, and because of this we have Collate as our governance and context catalog," said Angelita Frozza Sanches, Head of Data Platform Engineering at Scout24. "We all talk about AI, but the real product is not AI. Context is the real product. AI just multiplies it."

Data Governance Excellence: Ambry Genetics

Ambry Genetics, now part of Tempus AI, governs more than 11,000 data assets across six PHI-free database servers in a regulated, multi-system clinical estate. That best-in-class governance at clinical scale earned the genetic diagnostics company the Data Governance Excellence award.

Using Collate's automated classification, Ambry applies 29 distinct PHI element tags across its source assets to stand up a safe, PHI-free research environment. The program is governed under CAP, CLIA, HIPAA, and FDA 21 CFR Part 11, with FAIR data principles built in, and it uses a clinical data product model to unify mutable genomic data into a single, versioned source of truth.

"We have very specific state, local, and federal requirements for safe handling of patient-derived genetic data, and I can't overstate the need to document and prove that all of these regulations are being adhered to before any analysis or data sharing can occur," said Dan Kostecki, Data Engineer at Ambry Genetics. "We're audit-prepared at all times, and Collate and the clinical data product concept are making that much easier for all parties involved."

Design Partner of the Year: Divisions Maintenance Group

Nearly 80 percent of Divisions Maintenance Group's (DMG) operational staff now use Collate, with 400 to 450 users live. As the first customer for Collate AI Analytics, DMG co-defined the requirements that shipped into it, including Excel and CSV upload and a split-panel interface, work that earned it the Design Partner of the Year award.

"I don't worry anymore that we're going to be making the wrong decisions based on data," said Peeyush Nahar, Chief Product and Technology Officer at DMG. "Collate AI Analytics has been a big boost to us being able to get self-service dashboards and analytics through a chat-based AI agent."

Neil Taylor, VP of Engineering at DMG, credited the partnership itself: "I don't think we would have been able to deliver as good a product, either of us individually, as what that partnership delivered."

Recordings of the customer and product sessions from Collate Summit '26 are now available on demand.
