Episode 59

Reproducible EDA – Building Trustworthy Analytics Pipelines

In this episode of the Data Science Salon Podcast, we sit down with Leon Shpaner and Oscar Gil, data scientists and the creators of the open-source EDA Toolkit. Leon brings over 15 years of experience in predictive modeling across healthcare, finance, and education, while Oscar has collaborated with him to make exploratory data analysis (EDA) more intuitive, reproducible, and standardized for data science projects.

Together, they share how applied EDA practices remain the backbone of trustworthy analytics pipelines in both academic and industry settings. Their discussion highlights the challenges and lessons learned from building the EDA Toolkit, and why reproducible workflows are more important than ever in the age of AI and ML.

Key Highlights:

  • Reproducible EDA: How to standardize exploratory data analysis workflows for consistent and trustworthy insights.
  • Open-Source Innovation: The design and impact of the EDA Toolkit, bridging research, healthcare, and education.
  • Best Practices for Analytics: Lessons learned from creating tools that make EDA more intuitive and scalable across projects.
  • The Future of Data Science Workflows: Why reproducibility and standardization matter in modern AI/ML pipelines.

🎧 Tune in to Episode 59 to hear Leon Shpaner and Oscar Gil’s insights on building reproducible, reliable, and effective data science workflows, and how open-source tools can transform analytics practices across domains.


Be sure to mark your calendars for the 9th annual DSS ATX on Feb 18, where we will focus on GenAI and Intelligent Agents in the Enterprise. Join us to hear from experts on how AI is shaping the future of the enterprise. https://www.datascience.salon/austin/