Understanding Unstructured Data With Language Models

ODSC - Open Data Science
6 min read · May 21, 2019


Much of our machine learning capability comes from structured data, but the real payoff lies in the messy, unstructured data underneath. If we want to gain practical insights, machines have to learn to parse things like social media posts filled with misspellings and sarcasm, or handwritten doctor's notes with illegible lettering.

So how do machines do this? Alex Peattie, co-founder of PEG, shares his thoughts on where language models have been and how they may help machines decipher these difficulties.

Origins of Language Models

Roughly 80 years ago, Alan Turing and a group of brilliant minds at Bletchley Park gathered to break the ciphers of the Enigma machine. Their work helped win the war and began the journey toward one of our most persistent problems: how to define a language with data.
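To make the idea of "defining a language with data" concrete, here is a minimal, illustrative sketch (not from the talk): a bigram language model with add-one smoothing, trained on a tiny invented corpus. A well-formed sentence scores higher than a misspelled one, which hints at how such models can help machines cope with noisy, unstructured text.

```python
# A minimal sketch of a statistical language model: bigram counts with
# add-one (Laplace) smoothing over a tiny toy corpus. The corpus and test
# sentences are invented purely for illustration.
from collections import Counter
import math

corpus = [
    "the patient reports mild pain",
    "the patient reports no pain",
    "the doctor notes mild swelling",
]

# Count unigrams and bigrams, with sentence-boundary markers.
unigrams, bigrams = Counter(), Counter()
for sentence in corpus:
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    unigrams.update(tokens)
    bigrams.update(zip(tokens, tokens[1:]))

vocab_size = len(unigrams)

def log_prob(sentence: str) -> float:
    """Log-probability of a sentence under the smoothed bigram model."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    total = 0.0
    for prev, curr in zip(tokens, tokens[1:]):
        # Smoothing gives unseen bigrams a small non-zero probability.
        total += math.log(
            (bigrams[(prev, curr)] + 1) / (unigrams[prev] + vocab_size)
        )
    return total

# The clean sentence scores higher than the misspelled one.
print(log_prob("the patient reports mild pain"))
print(log_prob("the pateint reports mild pian"))
```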

