Watch live | Wednesday, June 16 @ 2:00 p.m. ET
Spark NLP is the most widely used NLP library in the enterprise, thanks to its production-grade, trainable, and scalable implementations of state-of-the-art deep learning and transfer learning NLP research. It is also open source under the permissive Apache 2.0 license, officially supports Python, Java, and Scala, and is backed by a highly active community and the John Snow Labs (JSL) team.
The Spark NLP library implements core NLP algorithms including lemmatization, part-of-speech tagging, dependency parsing, named entity recognition, spell checking, multi-class and multi-label text classification, sentiment analysis, emotion detection, unsupervised keyword extraction, and state-of-the-art Transformers such as BERT, ELECTRA, ELMo, ALBERT, XLNet, and Universal Sentence Encoder.
The latest release, Spark NLP 3.0, comes with more than 1,100 pretrained models, pipelines, and Transformers in 190+ languages. It also delivers massive speedups on both CPU and GPU devices while extending support to the latest computing platforms, such as new Databricks runtimes and EMR versions.
The talk will focus on how to scale Apache Spark / PySpark applications in YARN clusters, use GPUs in Databricks' new Apache Spark 3.x runtimes, and efficiently manage large-scale datasets in resource-demanding NLP applications. We will share benchmarks, tips and tricks, and lessons learned from scaling Spark NLP.
Maziyar Panahi - Spark NLP Lead at John Snow Labs
Maziyar Panahi is a Senior Data Scientist and Spark NLP Lead at John Snow Labs with over a decade of experience in public research. He is a senior Big Data engineer and cloud architect with extensive experience in computer networks and software engineering. He has been developing software and planning networks for the last 15 years. He previously worked as a network engineer in high-level places after completing his Microsoft and Cisco training (MCSE, MCSA, and CCNA).
For the past decade, he has been designing and implementing large-scale databases and real-time web services in public and private clouds such as AWS, Azure, and OpenStack. He is one of the early adopters and main maintainers of the Spark NLP library. He is currently employed by the French National Centre for Scientific Research (CNRS) as a Big Data engineer and System/Network Administrator working at the Institute of Complex Systems of Paris (ISCPIF).
Watch live | Wednesday, June 23 @ 1:00 p.m. ET
Today, NLP (Natural Language Processing) algorithms power a wide range of intelligent applications, from smart devices, customer service chatbots, and document processing to search and targeting. It’s hard to develop a state-of-the-art NLP application, and it’s even harder to monitor and guarantee quality and consistency in production.
With models making key product and business decisions, it’s imperative that we have access to specialized production monitoring tools and techniques designed with the complexity and unique approaches of NLP algorithms in mind. For example, knowing whether your production model is making inaccurate predictions requires ground truth, which is complex and time-consuming to obtain once you consider languages, geographies, context, emotions, and other NLP nuances. On top of that, ground truth for NLP is often ambiguous rather than black and white.
In this talk, we will discuss why monitoring your NLP models is a fundamentally complex problem and cover the key considerations for a model monitoring system. Finally, we will dig into a specific NLP use case and demonstrate how to use the new Verta Model Monitoring capability to monitor the performance of any NLP model, identify model and data drift and errors, segment model inputs and outputs by cohort, and perform root cause analysis.
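To make the idea of data drift concrete, here is a minimal sketch that compares the distribution of predicted labels in a baseline window against a production window using the population stability index (PSI). It is an illustration only and does not use or represent the Verta API; the function names, label set, sample data, and smoothing constant are all assumptions made for this example.

```python
import math
from collections import Counter


def label_distribution(labels, categories, smoothing=1e-6):
    """Return the share of each category in `labels`.

    A small smoothing term (an assumption of this sketch) keeps every
    share strictly positive so the logarithm below is always defined.
    """
    counts = Counter(labels)
    total = len(labels) + smoothing * len(categories)
    return {c: (counts.get(c, 0) + smoothing) / total for c in categories}


def population_stability_index(baseline, production, categories):
    """PSI = sum over categories of (p - q) * ln(p / q),
    where q is the baseline share and p is the production share."""
    q = label_distribution(baseline, categories)
    p = label_distribution(production, categories)
    return sum((p[c] - q[c]) * math.log(p[c] / q[c]) for c in categories)


# Hypothetical sentiment predictions from two time windows.
CATEGORIES = ["positive", "neutral", "negative"]
baseline = ["positive"] * 50 + ["neutral"] * 30 + ["negative"] * 20
production = ["positive"] * 20 + ["neutral"] * 30 + ["negative"] * 50

psi_same = population_stability_index(baseline, baseline, CATEGORIES)
psi_shifted = population_stability_index(baseline, production, CATEGORIES)

print(f"PSI, baseline vs. itself:    {psi_same:.4f}")
print(f"PSI, baseline vs. production: {psi_shifted:.4f}")
```

A PSI near zero means the two windows look alike, while larger values indicate a shift; a commonly cited rule of thumb treats values above roughly 0.25 as significant, though the right threshold depends on the application. Note that this check needs no ground truth, which is why distribution comparisons are often used as an early warning when labels are slow or expensive to obtain.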
Meeta Dash - VP Product at Verta
As VP Product at Verta, Meeta Dash is building MLOps tools to help data science teams track, deploy, operate, and monitor models and bring order to enterprise AI/ML chaos. Prior to Verta, Meeta held several product leadership roles at Appen, Figure Eight, Cisco Systems, Tokbox/Telefonica, and Computer Associates, building ML data platforms, voice and conversational AI products, and analytics and operational monitoring tools. Meeta has an MBA from UC Davis and an engineering degree from National Institute of Technology, India.
Watch live | Thursday, June 24 @ 3:00 p.m. ET
It’s happened again. You built another AI model that will never see the light of day because it won’t make it past the AI “valley of death”: the crossover from model development to model deployment across your enterprise. The handoff between data science and engineering teams is fraught with friction and with outstanding questions about governance, accountability, and who is responsible for different parts of the pipeline and process. Even worse? A patchwork approach to building an AI pipeline leaves many organizations open to risk because it lacks a holistic view of security and monitoring.
Join us to learn about approaches and solutions for configuring a ModelOps pipeline that’s right for your organization. You’ll discover why it’s never too early to plan for operationalization of models, regardless of whether your organization has 1, 10, 100, or 1,000 models in production.
The discussion will also reveal the merits of an open container specification that allows you to easily package models and deploy them to production anywhere. Finally, we will present new approaches to monitoring model drift and explainability that help manage expectations with business leaders, all through a centralized AI software platform called Modzy®.
Clayton Davis - Head of Data Science at Modzy
Clayton Davis is Head of Data Science at Modzy, where he oversees model development, operational data science capability development, and AI research. Prior to his role at Modzy, Mr. Davis spent over 15 years leading data science work for commercial and government organizations. His experience has spanned the data science spectrum, from analytic macro creation to cloud-based deep learning research and petabyte-scale big data processing on Hadoop clusters. He has a passion for solving complex puzzles and holds a graduate degree in Physics.