Episode 60

Beyond Checklists: Evaluating Conversational AI

In this episode of the Data Science Salon Podcast, we sit down with Carlos Aguilar, Head of Product at Hex and founder of Hashboard, to discuss a topic critical for every data team: how to properly evaluate AI analytics tools.

Carlos shares why traditional checklist-based evaluations fall short for conversational AI and generative analytics tools, and how focusing on context, workflow, and real-user testing can dramatically improve the chances of success. Drawing on his experience leading the Data Insights team at Flatiron Health, he offers practical guidance for both end-users and data teams.

Key Highlights:

  • End-User vs Data Team Evaluation: Why both perspectives are crucial for measuring AI effectiveness.
  • Context Management: How setting up reference questions helps ensure accurate, relevant answers.
  • Workflow & Observability: Why monitoring and iterating on AI outputs is essential for real-world success.
  • Lessons from the Field: Examples of tools that look good in demos but fail in production, and how to avoid those pitfalls.

🎧 Tune in to Episode 60 to learn how to evaluate AI analytics tools the right way and ensure your data team deploys solutions that actually work in practice.


Be sure to mark your calendars for the 9th annual DSS ATX on Feb 18, where we will focus on GenAI and Intelligent Agents in the Enterprise. Join us to hear from experts on how AI is shaping the future of the enterprise. https://www.datascience.salon/austin/