Democratizing your data access with AI agents

Snowflake provides a fully managed data platform that developers can build AI apps on. We’re happy to have Stack Exchange data available on the Snowflake Marketplace. Connect with Jeff on LinkedIn and Twitter. Congrats to Timeless for throwing a Lifejacket to “Using pandas to read HTML.”

Transcript

[Intro music]

Ryan Donovan: Hello everyone and welcome to the Stack Overflow Podcast, a place to talk all things software and technology. I am your host, Ryan Donovan, and today we are talking about Snowflake, the AI data platform. My guest today is Jeff Hollan, who’s Director of Product over at Snowflake. Welcome to the show, Jeff.

Jeff Hollan: Hey, thanks so much, Ryan. It’s awesome to be here.

Ryan Donovan: Our pleasure. So, at the top of the show, we like to get to know our guests, see how they got into software and technology.

Jeff Hollan: For me, I was actually really lucky. When I was in fourth grade, my school was selected for a pilot program where they started this new thing—I even remember it was called Chill—and we got to learn how to do some programming. I remember showing up after school, and we learned how to do some QBasic programming and built some simple little console apps. I just fell in love right away. I felt like my creative juices were finally flowing, and it honestly just took off from there.

I’ve always really loved tinkering, building, and coding, from fourth grade on. And, you know, it’s taken me all sorts of places, from cloud to AI. At the time, it wasn’t common at all to teach fourth graders to code, so I consider myself pretty lucky.

Ryan Donovan: We are a couple of years deep into the AI revolution, and data is very much a big part of that. What does an AI data platform do for somebody who’s trying to implement AI?

Jeff Hollan: I think about my own personal life. I use various LLMs, whether that’s coding assistants or ChatGPT. They are super useful. For instance, when my kids were sick earlier this week, I was asking questions like, “Is this normal?” and got some helpful answers. But when I switch to my day job as a product manager at Snowflake, those AI models stop being as useful. It’s not because they aren’t powerful, but because the questions I care about are very specific — like what Snowflake customers are doing with our product or who some new customers are.

If I put those questions into any LLM, it gives me a big shrug emoji. It doesn’t know what’s happening with Snowflake usage because that context is specific to my organization. So, what we’re building at Snowflake is a way to make it easy and secure to have rich conversations with AI that include your unique business context—your “special sauce”—which is embedded in your data.

Ryan Donovan: The “data platform” part is an interesting bit of specificity there. AI typically needs a database, vector storage, and probably something else attached. What is the platform beyond that?

Jeff Hollan: Those are the primary components. Especially for agentic AI, Snowflake provides several building blocks out of the box:

  • Fast vectorized lookups: For LLMs, a key pattern is retrieval-augmented generation (RAG), where you bring in relevant snippets of data at just the right time. Snowflake offers services like Cortex Search to enable very fast and powerful vector searches (there’s a minimal sketch of this pattern after the list).
  • Queryable data: While many agentic AI solutions focus on RAG with unstructured data, sometimes you want to ask business questions like “What has revenue been like in the last three days?” This requires generating the right SQL queries to pull fresh, accurate data directly from your data warehouse. Snowflake has tools to enable this.
  • Governance layer: For example, ensuring that role access controls prevent agents from accessing data they shouldn’t. This is crucial for data security and compliance.
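
As a rough illustration of the retrieval-augmented pattern from the first bullet, here is a minimal Python sketch. The `search_snippets` and `complete` functions are hypothetical placeholders standing in for a vector search service (such as Cortex Search) and an LLM endpoint; this is not Snowflake’s actual API.

```python
# Minimal RAG sketch. `search_snippets` and `complete` are hypothetical
# stand-ins for a vector search service and an LLM endpoint.

def search_snippets(query: str, k: int = 3) -> list[str]:
    """Placeholder for a semantic/vector search over your documents."""
    corpus = [
        "Cortex Search indexes documents for fast semantic lookup.",
        "RAG injects retrieved snippets into the model's prompt.",
        "Role-based access controls gate what an agent may read.",
    ]
    return corpus[:k]  # a real service would rank by similarity to `query`

def complete(prompt: str) -> str:
    """Placeholder for an LLM completion call."""
    return f"(model answer grounded in {prompt.count('- ')} snippets)"

def answer(question: str) -> str:
    # Fetch relevant snippets at just the right time...
    snippets = search_snippets(question)
    context = "\n".join(f"- {s}" for s in snippets)
    # ...then hand them to the model alongside the question.
    prompt = (
        "Answer using ONLY the context below.\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return complete(prompt)

print(answer("How does RAG keep answers grounded?"))
```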

Ryan Donovan: With SQL querying, are you querying external databases, production or analytics databases, or is that part of the platform too?

Jeff Hollan: Snowflake’s core strength is querying massive amounts of data quickly. The vast majority of data Snowflake queries resides natively inside Snowflake’s platform, powered by its own compute engine. However, Snowflake also supports data stored in open table formats in external storage like Amazon S3 or Azure Blob Storage. It can query both native and external data efficiently.

Ryan Donovan: For those external queries, do you bring the data into the platform, or is each query served in place?

Jeff Hollan: We mainly see two patterns: the data is either imported into Snowflake, or the storage remains outside but Snowflake handles the compute with open formats like Iceberg. Federated querying, where you query external databases like Postgres that live outside of Snowflake, exists but is less common due to the complexity and added layers.
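
To make those two patterns concrete, here is a hedged sketch using the Snowflake Python connector (`snowflake-connector-python`). The connection values, table names, and the `my_s3_volume` external volume are placeholders, and the `CREATE ICEBERG TABLE` options are indicative rather than copy-paste ready, since they depend on your catalog setup.

```python
# Sketch of the two patterns: data imported into Snowflake-managed
# storage vs. data left in your own bucket as Iceberg, with Snowflake
# supplying the compute. All names and credentials are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="...",
    warehouse="MY_WH", database="MY_DB", schema="PUBLIC",
)
cur = conn.cursor()

# Pattern 1: the table lives in Snowflake's native storage.
cur.execute("SELECT COUNT(*) FROM native_events")

# Pattern 2: storage stays in an external bucket in open Iceberg
# format; Snowflake queries it with its own engine. (DDL is indicative.)
cur.execute("""
    CREATE ICEBERG TABLE IF NOT EXISTS ext_events (id INT, payload STRING)
      CATALOG = 'SNOWFLAKE'
      EXTERNAL_VOLUME = 'my_s3_volume'
      BASE_LOCATION = 'events/'
""")
cur.execute("SELECT COUNT(*) FROM ext_events")
```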

Ryan Donovan: In the last few months, I’ve talked to companies dealing with querying across different storage formats. What about Snowflake makes those queries so fast?

Jeff Hollan: The key innovation was the separation of storage and compute. Previously, scaling databases required scaling storage and compute together, which was less flexible. Snowflake was designed at the rise of the cloud, with AWS S3 for storage and EC2 for compute, which are independent resources. This allows flexible scaling — you can assign many compute cores to heavy queries or just a few cores to lighter queries without scaling storage.
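
As a concrete illustration of that separation, here is a small sketch using standard Snowflake SQL through the Python connector. Warehouse names are placeholders; the point is that compute is provisioned and resized independently while the underlying storage never moves.

```python
# Two differently sized warehouses can serve the SAME stored data:
# many cores for heavy analytics, few cores for light queries.
import snowflake.connector

conn = snowflake.connector.connect(account="...", user="...", password="...")
cur = conn.cursor()

cur.execute("CREATE WAREHOUSE IF NOT EXISTS HEAVY_WH WAREHOUSE_SIZE = 'XLARGE'")
cur.execute("CREATE WAREHOUSE IF NOT EXISTS LIGHT_WH WAREHOUSE_SIZE = 'XSMALL'")

# Scale compute up for a burst, then back down -- storage is untouched.
cur.execute("ALTER WAREHOUSE LIGHT_WH SET WAREHOUSE_SIZE = 'LARGE'")
cur.execute("ALTER WAREHOUSE LIGHT_WH SET WAREHOUSE_SIZE = 'XSMALL'")
```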

Ryan Donovan: We recently moved our public platform to the cloud and realized that we don’t always need all the compute, but we do need the memory to keep data in memory for speed.

Jeff Hollan: That flexibility—to mix and match compute and storage resources—is one of the coolest parts of cloud infrastructure, enabling you to architect precisely what you need.

Ryan Donovan: With AI, LLMs, and agents, what next-level data platform features are you developing to optimize the agentic workflow?

Jeff Hollan: A few areas stand out. First, more data sources. The Snowflake Marketplace is central here. It allows organizations to easily add relevant third-party data, federated data, or partner data to enhance AI capabilities. For example, Stack Overflow data is available on the Snowflake Marketplace, giving agents access to extensive programming Q&A content. This was built in partnership with Stack Overflow, ensuring fairness and credit for the data provider.

Ryan Donovan: That’s exciting because a lot of folks train on our data anyway. Making it available while protecting the community is important.

Jeff Hollan: Exactly. It’s a win-win: consumers get valuable data; providers maintain credit and protection.

Ryan Donovan: Can you combine these data sources and integrate them into AI workflows?

Jeff Hollan: Definitely. For example, I use a Product Management Agent inside Snowflake that accesses technical documentation and usage data, blending multiple sources to answer complex questions. Agents can chain together multiple steps, like looking up documentation, querying SQL data, then conducting web searches, to rapidly get useful answers.

Ryan Donovan: How do these marketplace data integrations work? Is it a one-click process?

Jeff Hollan: It really is that easy. I demonstrated this myself earlier: within 30 seconds, I added a data set from the marketplace, created an agent and assigned it access to that data, and started chatting with it. It’s seamless.
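
Under the hood, the Marketplace’s one-click “Get” flow roughly amounts to mounting the provider’s share as a local, read-only database. Here is a hedged sketch of the equivalent SQL; the share, database, and table names are hypothetical.

```python
# Mount a provider's share as a database, then query it immediately.
# Share, database, and table names below are hypothetical.
import snowflake.connector

conn = snowflake.connector.connect(account="...", user="...", password="...")
cur = conn.cursor()

cur.execute(
    "CREATE DATABASE IF NOT EXISTS SO_DATA "
    "FROM SHARE provider_account.stack_overflow_share"
)
cur.execute("SELECT COUNT(*) FROM SO_DATA.PUBLIC.POSTS")
print(cur.fetchone())
```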

Ryan Donovan: I know you’re moving towards becoming an AI-first company. What does that mean to you and what initiatives support that goal?

Jeff Hollan: It spans two main categories:

  • Productivity for Snowflake users: Enhancing the efficiency of data scientists, engineers, and analysts, letting them get done in an hour what might have taken a day. For instance, our Data Science Agent can build machine learning models by guiding users through the process and automating steps.
  • Democratizing insights and automating mundane tasks: Many tasks like digging through dashboards are time consuming and tedious. Agents can provide fast, conversational answers, making data insights accessible to more people in an organization without deep expertise.

Overall, AI-first means embedding AI “magic” throughout Snowflake’s features, improving productivity and accessibility.

Ryan Donovan: Our surveys find adoption is high but mistrust remains. How do you ensure AI workflows are safe, reliable, and not “BS-ing” on the spot?

Jeff Hollan: Trust is critical and skepticism healthy. We approach it in several ways:

  • Improving accuracy: Using business context and certified queries to ensure answers are as accurate as possible.
  • Transparency: Showing users how answers are derived, providing confidence scores and metadata. For example, indicating which query produced the answer or what data source it came from.
  • Grounding responses: Our agents rarely generate answers from thin air; instead, they first fetch and check relevant data (like Stack Overflow or live SQL data) before responding. This approach massively boosts reliability (see the sketch after this list).
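
Here is a small sketch of that grounding pattern: the agent answers only from retrieved sources and returns them so users can verify. The `retrieve` and `llm` functions are hypothetical stand-ins, not any specific Snowflake API.

```python
# Grounded answering: refuse when nothing relevant is retrieved, and
# always return the sources alongside the answer. Names are illustrative.
from dataclasses import dataclass

@dataclass
class Source:
    title: str
    url: str
    text: str

def retrieve(question: str) -> list[Source]:
    """Placeholder for Cortex Search / SQL lookups."""
    return [Source("Using pandas to read HTML",
                   "https://stackoverflow.com/q/placeholder",
                   "pd.read_html parses <table> tags into DataFrames.")]

def llm(prompt: str) -> str:
    """Placeholder for an LLM call constrained to the given sources."""
    return "pd.read_html returns a list of DataFrames. [1]"

def grounded_answer(question: str) -> dict:
    sources = retrieve(question)
    if not sources:  # no data -> refuse rather than improvise
        return {"answer": "I don't have data to answer that.", "sources": []}
    context = "\n".join(f"[{i + 1}] {s.title}: {s.text}"
                        for i, s in enumerate(sources))
    answer = llm(f"Use ONLY these sources and cite by number.\n"
                 f"{context}\nQuestion: {question}")
    return {"answer": answer, "sources": [s.url for s in sources]}

print(grounded_answer("How do I read HTML tables with pandas?"))
```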

Ryan Donovan: Attribution is important for us as a data provider. We want users to know where a response comes from and verify its trustworthiness.

Jeff Hollan: Exactly. Providing links, vote counts, or source documents helps users judge credibility.

Ryan Donovan: Agents seem central to Snowflake’s AI strategy. What bets are you placing?

Jeff Hollan: Agents will augment user productivity by offloading mundane tasks while keeping humans in the loop. Snowflake doesn’t aim to replace entire job functions but to accelerate workflows dramatically—tasks that took a week might take a day.

We’re investing heavily in trust, compliance, and observability around agents. For example, our acquisition of TruEra allows us to monitor agent quality and behavior. This ensures large-scale enterprise deployments meet high standards.

Ryan Donovan: Observability at the LLM level is still challenging, but agentic workflows make it easier by segmenting operations and logging actions.

Jeff Hollan: Precisely. LLMs are non-deterministic, which complicates testing and debugging. Agents let us test components individually while the LLM manages orchestration, reducing risk.
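
One way to picture this: wrap every tool the agent can call so each hop logs its inputs, outputs, and latency, and each tool stays deterministic and unit-testable on its own while the LLM only handles orchestration. The names below are illustrative.

```python
# Step-level observability for an agent loop: each tool call emits a
# structured log line and can be tested in isolation.
import json
import logging
import time
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def traced(tool: Callable) -> Callable:
    """Wrap a tool so every invocation is logged with its latency."""
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = tool(*args, **kwargs)
        log.info(json.dumps({
            "tool": tool.__name__,
            "args": repr(args),
            "ms": round((time.perf_counter() - start) * 1000, 2),
        }))
        return result
    return wrapper

@traced
def run_sql(query: str) -> list:
    return []  # placeholder: deterministic, testable on its own

@traced
def web_search(q: str) -> list:
    return []  # placeholder

# The LLM decides which tool to call next; every hop leaves a log line.
run_sql("SELECT 1")
web_search("snowflake iceberg tables")
```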

Ryan Donovan: Are you enabling tool usage, agent-to-agent communication, and standards like MCP?

Jeff Hollan: Yes, we’re excited about adopting emerging standards like MCP and agent-to-agent protocols. The adoption curve for MCP has been astonishing—a few months ago it was new, now widely embraced. These standards simplify interoperability and development.
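
For a sense of why a shared standard helps: MCP tools are discovered and invoked with ordinary JSON-RPC 2.0 messages, so any compliant client can talk to any compliant server. The sketch below shows only the message shapes (transport and handshake omitted; the tool name and arguments are hypothetical).

```python
# Minimal MCP-style JSON-RPC 2.0 messages: list available tools, then
# invoke one. The tool name and arguments are hypothetical.
import json

list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

call_tool = {
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {
        "name": "run_query",  # hypothetical tool exposed by a server
        "arguments": {"sql": "SELECT COUNT(*) FROM posts"},
    },
}

print(json.dumps(call_tool, indent=2))
```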

Ryan Donovan: What’s the next big unsolved problem in AI platforms?

Jeff Hollan: Two major areas:

  • Autonomous agents: Today, most AI interactions are conversational and manual. The future involves agents working autonomously behind the scenes—generating reports, making decisions, driving workflows.
  • Organizational context: Agents today understand data and tasks, but they lack deep knowledge about how individual organizations operate—their structure, processes, and culture. Capturing and embedding that “business semantic” layer will empower agents to act more meaningfully and effectively.

This ties into designing semantic views and guardrails that encode business understanding and security policies, so agents can operate in line with company rules.
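
A rough sketch of what such a business-semantics layer might capture follows; the structure is illustrative, not Snowflake’s exact semantic-view schema.

```python
# Illustrative "business semantics" layer: map business terms to
# physical tables and attach the guardrails an agent must respect.
semantic_model = {
    "entities": {
        "revenue": {
            "table": "FINANCE.PUBLIC.ORDERS",       # hypothetical table
            "measure": "SUM(amount_usd)",
            "time_column": "order_date",
        },
    },
    "synonyms": {"sales": "revenue", "bookings": "revenue"},
    "guardrails": {
        "allowed_roles": ["ANALYST", "PM_AGENT"],   # hypothetical roles
        "deny_columns": ["customer_ssn"],           # never exposed to agents
        "row_filter": "region = 'EMEA'",            # hypothetical policy
    },
}

print(semantic_model["synonyms"]["sales"])  # prints "revenue"
```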

Ryan Donovan: How do security policies fit as guardrails into this context?

Jeff Hollan: They must be defined clearly so agents can respect them automatically. Integrating these guardrails is an exciting and ongoing challenge.

Ryan Donovan: Any advice for listeners interested in AI adoption?

Jeff Hollan: Absolutely: experiment! Try out different AI technologies to understand what they do well and where they fall short. AI isn’t magic—it’s a powerful tool with limitations and considerations.

I’ve personally used agents for tasks from workout recommendations to generating demos for customers. These experiences help reveal AI’s potential and boundaries.

If you’re serious about AI’s future impact, start exploring now. It will transform how we work over the next decade, and understanding it firsthand is invaluable.

Ryan Donovan: Great. Time to shout out someone who contributed on Stack Overflow: the Lifejacket badge winner “Timeless” for their answer on “Using pandas to read HTML.” Check the show notes if you’re curious.

I’m Ryan Donovan, editing the blog here at Stack Overflow and hosting this podcast. If you liked what you heard or have questions, email me at [email protected]. You can also find me on LinkedIn.

Jeff Hollan: I’m Jeff Hollan, Director of Product at Snowflake, working on building AI agents in Snowflake apps. You can find me on X (Twitter) @JeffHollan or on LinkedIn. Feel free to reach out with any questions or interests.

https://stackoverflow.blog/2025/09/23/democratizing-your-data-access-with-ai-agents/
