Upcoming webinars

Accurate Table Extraction from Documents & Images with Spark OCR   

Watch live | Wednesday, August 11 @ 2:00 p.m. ET

Extracting data formatted as a table (tabular data) is a common task — whether you’re analyzing financial statements, academic research papers, or clinical trial documentation. Table-based information varies heavily in appearance, fonts, borders, and layouts. This makes the data extraction task challenging even when the text is searchable – but more so when the table is only available as an image.

This webinar presents how Spark OCR automatically extracts tabular data from images. This end-to-end solution includes computer vision models for table detection and table structure recognition, as well as OCR models for extracting text & numbers from each cell. The implemented approach provides state-of-the-art accuracy for the ICDAR 2013 and TableBank benchmark datasets.

Presented by

Mykola Melnyk

Mykola Melnyk is a senior Scala, Python, and Spark software engineer with 15 years of industry experience. He has led teams and projects building machine learning and big data solutions in a variety of industries – and is currently the lead developer of the Spark OCR library at John Snow Labs.

1 Line of Code to Use 200+ State-of-the-Art Clinical and Biomedical NLP Models

Watch live | Wednesday, September 16 @ 2:00 p.m. ET

In this webinar, Christian Kasim Loan will show you how to use hundreds of state-of-the-art medical models across medical and healthcare domains in 1 line of code. These include Named Entity Recognition (NER) models for Adverse Drug Events, Anatomy, Diseases, Chemicals, Clinical Events, Human Phenotypes, Posology, Radiology, Measurements, and many other fields, plus best-in-class resolution algorithms that map the extracted entities to medical code terminologies such as ICD10, ICD-O, RXNORM, SNOMED, LOINC, and many more.

Additionally, we will showcase how to extract relationships between predicted entities in the Posology, Drug Adverse Effects, Temporal Features, Body Part, and Procedures domains, and how to de-identify your text documents.

Finally, we will take a look at the latest NLU Streamlit features and how you can leverage them to visualize all model predictions and test them out with 0 lines of code in your web browser!

Presented by

Christian Kasim Loan - Data Scientist and Spark/Scala ML Engineer
