Welcome to the official podcast for Data Science Salon!

The Data Science Salon series is a unique vertical-focused conference that brings together specialists face-to-face to educate each other, illuminate best practices, and innovate new solutions in a casual atmosphere.

In our podcast, our Senior Content Advisor Q McCallum holds deep, engaging interviews with guests ranging from DSS speakers to authors of data science books to professionals whose work interacts with the AI world. All of them bring their unique perspectives on trends and business use cases in the world of data science, machine learning, and AI.

NEW – Episode Twenty Six

The roles of economists in data science, with Dr. Amar Natt

We’ve all heard the term “economist,” sure. But exactly what does an economist do? And as economics is a very data-driven field, where does their work intersect with data science, machine learning, and AI?

To answer that question, Senior Content Advisor Q McCallum spoke with Amar Natt, PhD. She’s an economist at Econ One Research, and her work focuses on advanced analytics and predictive modeling. Does that sound like ML to you? Well, Amar explains that it’s similar in some ways, different in others. From there, she tells us about techniques economists can learn from data scientists, and what data scientists can pick up from econ. (Hint: “causal inference.” You heard it here first.)

Episode Twenty Five

ML at The Home Depot with Pat Woowong: The Falloff Model and Lead Scoring

When people think about The Home Depot, they probably think more about lumber and tile than they do ML models. Sure, there is plenty of lumber. But machine learning also plays a key role in the business, in places that customers can see as well as the behind-the-scenes operations.

Senior Content Advisor Q McCallum met up with Pat Woowong, Director of Data Science at The Home Depot, to explore how the company mixes its very rich dataset with domain knowledge to employ machine learning deep inside the business. To frame this, he walked Q through the Falloff Model and Lead Scoring, two projects that his team deployed to address the unique challenges of a company that handles both retail and services.

During our conversation, we discussed: understanding where models fit into the bigger business picture; using expert domain knowledge to drive feature selection and feature engineering; the value of process; and, to top it off, what it’s like to work at The Home Depot.

Episode Twenty Four

Coffee Chat: Inspiring ML Use Cases in Retail Delivering Measurable Impact

This episode is a coffee chat recording from DSS Virtual in May 2022. Charles Irizarry (Phygital) and Ankita Mangal (P&G) share war stories about ML use cases in retail and eCommerce, brokering data, and protecting the important principles of data ethics and privacy. Ankita shares the digital transformation journey that P&G undertook, her growth together with P&G, and some of the incredible technologies P&G has developed to better serve its customers worldwide.

Episode Twenty Three

Data Science and Data Engineering in the Federal Space with Dr. Pragyansmita Nayak

A lot of data scientists work in the private sector: finance, adtech, retail, and all that. Today’s guest offers her perspective on what it means to do data work in the federal space.

In this conversation, our Senior Content Advisor Q McCallum spoke with Dr. Pragyansmita Nayak, Chief Data Scientist at Hitachi Vantara Federal. They explored how different federal agencies use data and how they share datasets with each other. They also talked about how to measure operational efficiency when you can’t rely on metrics like “profit.” And, the big question: should we release t-shirts that read “just give me my AI solution!”?

You can find Pragyan online:
Twitter: https://twitter.com/SorishaPragyan
LinkedIn: http://linkedin.com/in/pragyansmita

The book Q mentioned is Army of None, by Paul Scharre.

Episode Twenty Two

Software Development Skills in ML/AI

In this episode, our Senior Content Advisor Q McCallum met up with Murium Iqbal from Etsy. They spoke about an important skill for data scientists: software development!

Data scientists write a lot of code, sure, but few of them come from a formal software dev background. That can lead them to struggle with slow, buggy code that ultimately holds back the company’s ML efforts. Want to write cleaner, more performant code? Looking for ways to make those model deployments more reproducible? Listen to Murium and Q explore topics such as writing tests, using Docker to isolate dependencies, and learning best practices from your software developer teammates.

Episode Twenty One

Coffee Chat: Model Interpretability and How to Create Trust in AI Products

This episode is a recording of the panel conversation at the virtual Data Science Salon in April 2022, which focused on AI & machine learning applications in the enterprise.

Charles Irizarry (CEO & Co-Founder at Strata.ai) had the chance to talk to Amarita Natt (Managing Director, Data Science at Econ One Research), Preethi Raghavan (VP, Data Science Practice Lead at Fidelity Investments) and Serg Masís (Climate and Agronomic Data Scientist at Syngenta) about the important topic of model interpretability and how to create trust in AI products.

Episode Twenty

Coffee Chat: DSS Hybrid Miami 2022

This episode is a recording of the coffee chat at the hybrid Data Science Salon Miami, which focused on AI & machine learning applications in the enterprise.

Charles Irizarry, CEO & Co-Founder at Strata.ai, had the chance to talk to Nirmal Budhathoki, Senior Data Scientist at VMware Carbon Black, and Moody Hadi, Group Manager – New Product Development & Financial Engineering at S&P Global. Tune in to hear about ML techniques they are using in their current roles, tools to put ML into production, model explainability, and future trends.

Episode Nineteen

Communal Computing and AI with Chris Butler – Pt. 2

In the previous episode, our Senior Content Advisor Q McCallum met with product manager Chris Butler to explore the role of uncertainty and how it relates to AI product management. That conversation sets the stage for Chris and Q to talk about communal computing today.

Chris starts by explaining what shared, AI-backed devices mean for data collection, analysis, and regulation. After that, Chris and Q explore important questions such as: What are some challenges in getting communal computing devices to coordinate? How do social norms mix with assumptions made by the ML models behind these devices? What do we lose when we use data lakes? How do product managers and machine learning engineers interact on these kinds of projects? What do communal computing devices have in common with software developers on shared platforms? And, most importantly: what does all of this have to do with the film Napoleon Dynamite …?

Episode Eighteen

Coffee Chat: DSS Virtual 2021/12: Applying AI & Machine Learning to Finance & Technology

This episode is a recording from our recent Data Science Salon event, which focused on applying AI and ML to finance and technology. Our Senior Content Advisor Q McCallum sat down with data scientists Linda Liu (Hyrecar) and Giacomo Vianello (Cape Analytics) to talk about their work. We explored the techniques and tools for the various data projects they’re running, some of the challenges of working with geospatial data, and how they approach R&D efforts in the company. (The hint for that last one: balance, discipline, and structure rule the day. Very practical.)

Episode Seventeen

AI, Product, and Uncertainty with Chris Butler – Pt. 1

Welcome to our first two-part episode! Our Senior Content Advisor, Q McCallum, caught up with product manager Chris Butler to talk about the intersection of AI and product. In particular, Chris’s two decades of professional experience have taught him a lot about the role of uncertainty: we dig deep into what that term really means, how much data scientists need to concern themselves with uncertainty in their work, and how this relates to a company’s values.

This discussion also explores the context around which we collect data, polysocial reality, design individualism, and contextual integrity. (Yes, we covered a lot of ground in just 45 minutes.)

Because of our tight schedule, Chris and Q had to stop before they could get to their second topic. That’s why Chris will be back in the next episode to talk about communal computing and what that means for AI.

Episode Sixteen

Coffee Chat: DSSe Virtual 2021

Today’s episode is a recording of the Coffee Chat from our Data Science Salon Elevate series. Elevate is our unique women-focused virtual conference that includes BIPOC, members of the LGBTQIA+ community, and other underrepresented groups.
Formulated.by’s Senior Content Advisor, Q McCallum, caught up with Vidhi Chugh (Walmart), Piyanka Jain (Aryng), and Tempest van Schaik (Microsoft). Our guests explored the impact of the Covid-19 pandemic on hiring and retention, then shifted to a discussion on finding and serving as a mentor.

Episode Fifteen

Analytics vs. Data Science vs. ML Research: Economist Sonali Syngal Shares Her View

The world of data has a lot of hazy definitions. This leads to confusion as people use the same terms in a conversation but mean very different things. Three such terms that are often conflated are “analytics,” “data science,” and “machine learning research.” How do we tell the difference between them? And what are the different duties and qualifications of these roles?

Episode Fourteen

Charting a Course: from Physics PhD to Professional Data Scientist with Dr Resham Sarkar

There’s no single path to a data scientist role. Practitioners come from fields as varied as software development, economics, and academia. Many people in that last group aren’t sure what it’s like to transition from an advanced degree program into industry. That’s why I was happy to speak with Dr Resham Sarkar, a machine learning expert who heads up personalization at Slice. Before she started building ML around pizza, she completed a PhD in physics and then worked in insurtech. What was it like to move from a physics lab into the data scientist’s chair? How did she find that first job? And what elements of her PhD experience have proven especially valuable in her machine learning work? Join us in this conversation to find out.

Episode Thirteen

Data Monetization Strategies with Micheline Casey

The idea of turning data into money has been a draw since the early days of the term “Big Data.” As many companies have learned, sometimes the hard way, this isn’t always easy and it’s hardly guaranteed to work.
That’s where today’s guest comes in. For this episode, Formulated.by’s Senior Content Advisor Q McCallum sat down with Micheline Casey to explore the what, why, and how of a company monetizing its data. There are a lot of matters to consider, ranging from technology to policy to business model, and she’s seen them all.

Episode Twelve

Software Testing, Performance Tuning, and Code Handoff for Data Scientists

Data scientists and ML engineers write a lot of code: building data pipelines, wiring up models, and sometimes translating concepts from research papers into algorithms.
Once in a while, that code runs into performance problems. These can be painful to debug when you don’t come from a formal software development background. That’s why Formulated.by’s Senior Content Advisor Q McCallum rang up Matt Godbolt to learn the deep details of software testing, tracing performance bugs, working with data at scale, and how data scientists can work with developers to prepare their code for a production handoff.

Episode Eleven

Coffee Chat at DSSVirtual for Healthcare, Finance & Technology

We recorded this episode at our February 2021 Data Science Salon Virtual on Healthcare, Finance & Technology. Formulated.by’s Senior Content Advisor, Q McCallum, sat down with Ayda Farhadi, Senior Data Scientist at UPS, and Vasileios Stathias, Lead Data Scientist at Sylvester Comprehensive Cancer Center, to discuss applying AI to healthcare.

Episode Ten

Trading, Risk, and Reinsurance with Otakar Hubschmann

Our Senior Content Advisor Q McCallum sat down with Otakar Hubschmann, Head of Applied Data at TransRe, to talk about ML/AI in the world of reinsurance. They take a deep dive into the insurance industry and the role reinsurance plays there, with a side trip to show how this differs from the quantitative finance you see in hedge funds. Along the way, Otakar offers his favorite tips for hiring data scientists. (Whether you’re applying for a job, or hiring for one, take note.)

Episode Nine

Virtual Coffee Chat: Live from DSS Virtual

We recorded this episode at our December 2020 Data Science Salon Virtual on Finance & Technology. Formulated.by’s Senior Content Advisor, Q McCallum, sat down with some new friends to discuss trends and challenges in the world of AI:

Thulasi Nambiar – Senior Manager, Marketing Data Science at Prosper
Jeff Sharpe – Senior Manager / Tech Lead at CapitalOne
Sonali Syngal – Applied Scientist and Project Lead AI Garage at Mastercard

Episode Eight

Virtual Coffee Chat: Live from DSS Virtual

We recorded this episode at our November 2020 Virtual Data Science Salon on Retail & Ecommerce. Formulated.by’s Content Advisor, Roger Magoulas, sat down with some of the event’s speakers to talk about data science trends and challenges in retail & ecommerce.

Phillip Rossi, Head of Data Science at Shopify
Laya Shamgah, Data Scientist at Lowe’s Company
Jeffrey Yau, Head of Data Science at Walmart Labs
Samantha Cvetkovski, Data Science Manager at Mindbody

Episode Seven

Automated Content Moderation and the Intersection of AI and Law

Today’s podcast is about the intersection of AI and the law. Formulated.by’s Senior Content Advisor, Q McCallum, spoke with Shane Glynn, an attorney who has deep knowledge of the tech and AI worlds. He’s worked for a couple of law firms that you may have heard of, and for a tech company that you have most certainly heard of.
Shane gave us an attorney’s view on AI practices, explored the ways in which an attorney can help with an AI effort, and explained how, when, and why AI teams should involve their legal counsel. (Hint: early. Very early.) Shane also talked about the legal and technical aspects of AI-driven, automated content moderation.
At the end of the episode, Shane mentions some blog posts that Q wrote on AI lessons learned from the world of algorithmic trading. That series starts here.

Episode Six

Virtual Coffee Chat: Live from DSS Virtual

We recorded this episode at our September 2020 Data Science Salon virtual event on Media, Advertising, & Entertainment. Formulated.by’s Senior Content Advisor, Q McCallum, sat down with some new friends to discuss trends and challenges in the world of AI:

Anne Bauer – Director of Data Science at The New York Times
Yves Bergquist – Director of the AI & Neuroscience in Media Project at USC
Kim Martin – Engineering Leader of Data Science and Engineering at Netflix
Dominick Rocco – Data Scientist at phData

Episode Five

Mission and Purpose in Data Science: Lessons from the Military and Intelligence

How can mission and purpose drive a data professional? And what happens when we can no longer trust the data that’s presented to us?
Richard Dunks served as a member of the US Army and the intelligence community (IC), where he honed skills that he now uses in his civilian pursuits as a data scientist, trainer, and educator. He recently caught up with Q McCallum (Senior Content Advisor at Formulated.by, the company behind Data Science Salon) to talk about what his time in the IC taught him about data analysis, having a sense of mission, and what it means to lose trust in data.

Episode Four

Marcello La Rocca on Algorithms and Data Structures

The term “algorithms” has several meanings, from machine learning models to tools of Wall St traders. Then there’s the classic computer science definition: a set of instructions for solving problems. Think “simulated annealing,” “evolutionary computing,” or “LRU cache.” These are the sorts of algorithms we’ll explore today.

Episode Three

Jean-Georges Perrin on Spark and Data Quality

Our guest for this episode is Jean-Georges Perrin, the author of Spark in Action, 2nd edition. We talk about his career path (he’s been doing “big data” since before the term existed), what inspired him to write Spark in Action, and where Spark fits in your company’s data efforts. He also shares his thoughts on data quality.

Episode Two

Applications of Data Science in Media & Entertainment

The Media and Entertainment industry has undeniably been heavily disrupted by changes in technology. Listen as Ayan Battacharya, Advanced Analytics Specialist Leader at Deloitte Consulting, and Harini Krishnan, Data Scientist at Capsule8, share observations they’ve garnered from their own experience on the state of data science in Media & Entertainment, live from DSS NYC 2019.

Episode One

Prolific vs. private data in media advertising @ DSS NYC

In June 2019, over 200 data scientists gathered at Viacom HQ in New York to hear key industry players’ takes on what makes an effective data-driven strategy. Q McCallum, Senior Content Advisor at Formulated.by, took a deeper dive into the major topics of concern for data science when he spoke with DSS NYC speakers Lauren Lombardo, Senior Data Scientist at Nielsen, and Sergey Fogelson, Vice President of Data Science and Modeling at Viacom. Listen as they speak about current practices and debate the ways in which the growth of AI will impact advertising.