Daryl Kang

Lead Data Scientist at Forbes

Daryl Kang is a data scientist at Forbes. He is passionate about media and entertainment, and about the role technology plays in advancing the industry. He graduated summa cum laude from UCLA with a degree in economics and earned his master’s in data science from Columbia University. Having come from a non-technical background, he regards his combined insider and outsider perspective on data science as an advantage in the workplace. In his current role at Forbes, Daryl works closely with various stakeholders to create data-driven solutions that empower journalists at every step of the creative process.

WATCH LIVE: September 23 @ 2:25PM – 2:55PM ET

Summarizing News Clusters Through Text Vectorization

Information gathering, an essential part of the journalistic process, is often burdened by the vast and disorienting body of material available. The process is especially challenging in the fast-paced world of news media, where technology has enabled us to communicate across the globe in real time. Fortunately, technology has also given us new tools and methods to tackle these challenges. Through the application of data science and natural language processing, we seek a more organized and automated way of initiating the journalistic research process. We can begin by summarizing the cluster of existing news articles surrounding a developing story and visualizing the landscape of reports and opinions already expressed. Text vectorization allows us to distill the contents of an article into its semantic components, which can then be compared against those of other articles in the space. This gives the journalist a quick overview of the news landscape, helping her find a unique angle for her story in a crowded field and providing an organized starting point for further research.
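To make the idea of comparing vectorized articles concrete, here is a minimal sketch. The abstract does not say which vectorization method the talk uses, so TF-IDF weighting with cosine similarity is assumed purely for illustration; the function names and the three sample headlines are hypothetical and do not come from the talk.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents each term appears in.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Smoothed inverse document frequency (an assumed, commonly used variant).
    idf = {
        term: math.log((1 + n_docs) / (1 + count)) + 1
        for term, count in doc_freq.items()
    }

    vectors = []
    for tokens in tokenized:
        term_counts = Counter(tokens)
        vectors.append({term: count * idf[term] for term, count in term_counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(weight * weight for weight in a.values()))
    norm_b = math.sqrt(sum(weight * weight for weight in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical headlines standing in for full articles.
articles = [
    "Central bank raises interest rates to curb inflation",
    "Inflation fears push the central bank to raise rates again",
    "Local team wins the championship after a dramatic final",
]

vectors = tfidf_vectors(articles)
sim_01 = cosine(vectors[0], vectors[1])  # two reports on the same story
sim_02 = cosine(vectors[0], vectors[2])  # reports on unrelated stories
print(f"same story: {sim_01:.3f}, unrelated: {sim_02:.3f}")
```

In this sketch, the two headlines about the central bank share several terms and score higher than the unrelated pair, which shares none. Pairwise scores of this kind could then feed a clustering or visualization step, though the approach presented in the talk may differ.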