THE FUTURE OF APPLIED AI
in Finance and Banking

New York | December 11, 2025

Jayeeta Putatunda

Senior Data Scientist at Fitch Ratings

Jayeeta is a Senior Data Scientist with several years of industry experience in Natural Language Processing (NLP), statistical modeling, product analytics, and implementing ML solutions for specialized use cases in both B2C and B2B domains. Currently, Jayeeta works at Fitch Ratings, a global leader in financial information services. She is an avid NLP researcher who explores state-of-the-art open-source models to build impactful products, and she firmly believes that data, in all its forms, is the best storyteller.

Jayeeta has also led multiple NLP workshops in association with Women Who Code and GitNation, among others. She has been invited to speak at the International Conference on Machine Learning (ICML 2022), ODSC East, MLConf EU, the WomenTech Global Conference, Data Science Salon, The AI Summit, and Data Summit Connect, to name a few. She is an ambassador for Women in Data Science at Stanford University, and a data science mentor at Girl Up (United Nations Foundation) and WomenTech Network, where she aims to inspire more women to pursue STEM.

Jayeeta was nominated for the WomenTech Global Awards 2020 and was spotlighted in the list of Top 100 Women Who Break the Bias 2022. She holds an MS in Quantitative Methods and Modeling and a BS in Economics and Statistics, and is based in New York City.

PREVIOUS VIDEOS

Watch in-person: JUNE 18 @ 12:10 – 12:40 PM ET

Decoding LLMs: Challenges in Evaluation

Large Language Models (LLMs) have breathed new life into natural language processing, revolutionizing fields from conversational AI to content generation. However, as these models grow in complexity and scale, evaluating their performance presents many challenges.

One of the primary challenges in LLM evaluation is the absence of standardized benchmarks that comprehensively capture the capabilities of these models across diverse tasks and domains. Another is the black-box nature of LLMs, which makes it difficult to understand their decision-making processes and to identify biases. In this talk, we address fundamental questions such as what constitutes an effective evaluation metric in the context of LLMs, and how these metrics align with real-world applications.
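To make that question concrete, here is a minimal sketch (illustrative only, not material from the talk) of one of the simplest reference-based metrics: token-level F1, as used in SQuAD-style question-answering evaluation. Its reliance on lexical overlap with a single gold answer illustrates exactly the kind of gap between metric and real-world quality that the talk examines.

```python
# Minimal, illustrative token-level F1 between a model output and a reference.
from collections import Counter


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1: harmonic mean of precision and recall over tokens."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return float(pred_tokens == ref_tokens)
    # Count tokens shared between prediction and reference (with multiplicity).
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Example: a near-match scores high even though the phrasing differs slightly.
print(token_f1("Fitch affirmed the AA rating", "Fitch has affirmed the AA rating"))
```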

As the LLM field undergoes dynamic growth and the rapid evolution of new architectures, it also requires evaluation methodologies that continuously adapt to changing contexts. Open-source initiatives play a pivotal role in addressing the challenges of LLM evaluation: driving progress, facilitating the development of standardized benchmarks, and enabling researchers to benchmark LLM performance consistently across various tasks and domains. We will also review some of these open-source evaluation metrics and walk through code using demo data from Kaggle.
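As a rough preview of what such a walkthrough might look like (this is a hypothetical sketch, not the talk's actual code), open-source metrics like ROUGE can be computed over a table of model outputs and references with the Hugging Face `evaluate` library. The file name and column names below are placeholder assumptions; the specific Kaggle demo dataset is not specified here.

```python
# Hypothetical sketch: scoring generated summaries against references with an
# open-source metric (ROUGE) from the Hugging Face `evaluate` library.
import pandas as pd
import evaluate  # pip install evaluate rouge_score

# Placeholder file and column names; substitute the actual demo data.
df = pd.read_csv("demo_data.csv")
predictions = df["model_summary"].tolist()
references = df["reference_summary"].tolist()

# Load the open-source ROUGE implementation and score all pairs at once.
rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=predictions, references=references)
print(scores)  # e.g. {'rouge1': ..., 'rouge2': ..., 'rougeL': ..., 'rougeLsum': ...}
```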