AI Researchers Create the Largest Database for Studying Career Identity

ODSC - Open Data Science

Researchers at Stony Brook University have developed the largest dataset for studying career identity by analyzing over 51 million English-language Twitter biographies spanning six years. Their findings were published in a paper titled “The Evolution of Occupational Identity in Twitter Biographies.”

The dataset is intended to provide new insights into how individuals perceive their professional roles and how these identities evolve over time.

The study was led by Ph.D. students Xingzhi Guo and Dakota Handzlik, together with Distinguished Teaching Professor Steven Skiena of the Department of Computer Science and Jason J. Jones of the Sociology Department. The team examined 435 million biography updates made between February 2015 and July 2021. Many of these updates involved users revising their job titles, offering valuable data on career mobility and prestige.
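To make the idea of a job-title revision concrete, here is a minimal Python sketch of how title changes might be pulled out of one user's successive biographies. It is an illustration only, not the researchers' method: the title list, the function names extract_titles and title_transitions, and the sample biographies are all hypothetical, and the article does not describe how the study actually matched titles.

```python
import re

# Hypothetical, tiny title vocabulary for illustration only;
# the article does not say which titles the study matched or how.
JOB_TITLES = ["ceo", "owner", "teacher", "nurse", "clerk", "tattoo artist"]

# Longest titles first so multi-word titles are tried before shorter ones.
_TITLE_RE = re.compile(
    r"\b("
    + "|".join(re.escape(t) for t in sorted(JOB_TITLES, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_titles(bio):
    """Return the set of known job titles mentioned in a biography string."""
    return {match.group(1).lower() for match in _TITLE_RE.finditer(bio)}


def title_transitions(bio_history):
    """Return (old_title, new_title) pairs for one user's biography history.

    bio_history is a list of biography strings in chronological order.
    An update counts as a transition only when a title disappears and
    a different title appears in the same edit.
    """
    transitions = []
    for before, after in zip(bio_history, bio_history[1:]):
        old, new = extract_titles(before), extract_titles(after)
        dropped, added = old - new, new - old
        transitions.extend((d, a) for d in sorted(dropped) for a in sorted(added))
    return transitions


# Made-up biographies for a single hypothetical user.
history = ["Clerk. Coffee lover.", "Clerk and dog dad", "Nurse | dog dad"]
print(title_transitions(history))
```

In this sketch, the second update changes the wording but not the title, so only the final edit counts as a transition. A real pipeline would need a far larger title vocabulary and some way of telling job titles apart from ordinary uses of the same words.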

The study highlights how career identity shapes behavior and social interactions. “Work is an essential part of our daily lives. It significantly contributes to our sense of identity and how we behave around others,” said Skiena.

For instance, a business owner may adopt different traits than a teacher, while a nurse may exhibit more patience than a race-car driver.

Key Findings on Career Identity

The research found that individuals are more likely to highlight prestigious roles in their biographies. Titles such as “CEO” and “Owner” appeared more frequently than “Restaurant Busser” or “Clerk,” despite the latter representing a larger workforce. The study also revealed that individuals tend to remain within the same job category, reinforcing their sense of belonging within a particular professional group.

One notable observation emerged from the COVID-19 pandemic. In 2020, the number of users including job titles in their bios declined, aligning with the spike in unemployment during the early months of the crisis. The dataset also uncovered shifts in occupational mobility, showing that half of the top 10 entry-level roles were in personal care and service occupations, such as estheticians, doulas, and tattoo artists.

According to the U.S. Bureau of Labor Statistics, these jobs are projected to grow 14% between 2021 and 2031, nearly double the overall job market growth rate. Conversely, the most frequently exited roles fell within media, sports, and entertainment, fields often considered aspirational but challenging to sustain as long-term careers.

Implications and Future Research

The research marks the most comprehensive study of social media biographies to date, offering significant implications for workforce analysis. The dataset can help organizations enhance job prestige, recognize employees, and refine workforce policies without relying on private resume data.

It also provides insights into economic trends and labor market shifts, contributing to well-informed policymaking. Looking ahead, the research team aims to explore the persistence of self-identity, the evolution of aspirational careers, and the impact of generative AI on workforce dynamics.

Their findings continue to provide a valuable lens into how professional identities are shaped and adapted in an ever-changing job market.
