10 Important Mental Health Datasets for Data Science and AI Projects

4 min read · Apr 14, 2025

The intersection of data science and mental health has never been more important. As awareness around mental health grows globally, so does the need for high-quality, accessible datasets that can drive impactful research, innovation, and product development.

Whether you’re working on a predictive model for early diagnosis, building a chatbot for mental health support, or simply looking to expand your data skills, these mental health datasets offer a valuable starting point. Below, we’ve rounded up 10 top mental health-related datasets from Kaggle and beyond. Each includes a brief summary and a direct link so you can dive right in.

1. Mental Health in Tech Survey

This dataset explores the prevalence of mental health issues within the tech industry. It includes anonymized responses from employees on company support structures, willingness to discuss mental health at work, and more.

  • Data Highlights: Demographics, work environment, mental health history, and treatment.
  • Ideal For: Exploring workplace mental health, building predictive models, or conducting sentiment and correlation analysis.

🔗 Mental Health in Tech Survey
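For the correlation-style analysis mentioned above, a minimal sketch might compare the share of respondents who sought treatment across groups. The column names (`remote_work`, `treatment`) and the rows below are illustrative assumptions, not values taken from the dataset; check the actual file's schema before adapting this.

```python
from collections import Counter

# Hypothetical survey rows: the column names and values are illustrative
# assumptions, not taken from the actual dataset.
rows = [
    {"remote_work": "Yes", "treatment": "Yes"},
    {"remote_work": "Yes", "treatment": "No"},
    {"remote_work": "No", "treatment": "Yes"},
    {"remote_work": "No", "treatment": "Yes"},
    {"remote_work": "No", "treatment": "No"},
]

def treatment_rate_by(rows, group_key, target_key="treatment", positive="Yes"):
    """Return {group value: share of rows whose target equals `positive`}."""
    totals = Counter()
    positives = Counter()
    for row in rows:
        group = row[group_key]
        totals[group] += 1
        if row[target_key] == positive:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}

rates = treatment_rate_by(rows, "remote_work")
for group, rate in rates.items():
    print(f"remote_work={group}: {rate:.0%} sought treatment")
```

A gap between groups in a table like this is only a starting point: survey respondents are self-selected, so differences should be read as descriptive rather than causal.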

2. Depression Detection Using Text

This dataset includes text samples labeled for depressive sentiment, making it ideal for NLP-based mental health projects.

  • Data Highlights: User-generated text labeled as depressed or not.
  • Ideal For: Sentiment analysis, depression classification models, and chatbot training.

🔗 Depression Detection Using Text

3. Suicide Rates Overview 1985 to 2016

Covering 1985 to 2016 across 100+ countries, this dataset provides a macro-level view of suicide statistics by age, gender, and socioeconomic factors.

  • Data Highlights: Suicide rates, GDP, generation, country, and age groups.
  • Ideal For: Time-series analysis, public health research, and data storytelling.

🔗 Suicide Rates Overview
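As a rough sketch of the time-series use case, the snippet below aggregates demographic rows into one rate per 100,000 people per year. It assumes columns named `year`, `suicides_no`, and `population`; those names and the toy numbers are assumptions for illustration, so verify them against the downloaded CSV.

```python
import csv
import io
from collections import defaultdict

# Toy rows with made-up numbers; the column names are assumed, not confirmed.
SAMPLE_CSV = """country,year,sex,suicides_no,population
A,2000,male,10,100000
A,2000,female,5,100000
A,2001,male,12,100000
A,2001,female,4,100000
"""

def rate_per_100k_by_year(csv_text):
    """Sum counts and population per year, then compute a rate per 100k."""
    suicides = defaultdict(int)
    population = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        year = int(row["year"])
        suicides[year] += int(row["suicides_no"])
        population[year] += int(row["population"])
    return {
        year: 100_000 * suicides[year] / population[year]
        for year in sorted(suicides)
    }

yearly_rates = rate_per_100k_by_year(SAMPLE_CSV)
for year, rate in yearly_rates.items():
    print(year, round(rate, 2))
```

Summing counts and population before dividing weights each demographic slice by its size. Averaging per-row rates instead would give small groups the same weight as large ones.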

4. Teen Mental Health Dataset

Focused on adolescents, this dataset surveys factors influencing teen mental wellness — an often overlooked segment in research.

  • Data Highlights: Responses to mental health questions, coping mechanisms, social influences.
  • Ideal For: Behavioral analysis, education-focused applications, and survey-based research models.

🔗 Teen Mental Health

5. Anxiety and Depression Survey Data

This structured survey dataset includes detailed responses on symptoms, daily habits, and lifestyle factors that influence mental health.

  • Data Highlights: Multiple features spanning symptoms of anxiety and depression, daily routines, and triggers.
  • Ideal For: Predictive modeling, health tech applications, or demographic correlation studies.

🔗 Anxiety and Depression Survey

6. COVID-19 and Mental Health Dataset

This mental health dataset looks at how the pandemic has impacted psychological well-being across demographics.

  • Data Highlights: Stress levels, anxiety, isolation metrics, and coping strategies during COVID-19.
  • Ideal For: Time-based analysis, social science studies, and policy research.

🔗 COVID-19 Mental Health Impact

7. Mental Health Social Media Posts

A collection of anonymized Reddit posts from users dealing with mental health struggles, rich in natural language data.

  • Data Highlights: Thousands of posts categorized by mental health themes like anxiety, depression, and PTSD.
  • Ideal For: NLP sentiment analysis, building mental health chatbots, or language modeling.

🔗 Reddit Mental Health Posts

8. HappyDB: Emotion and Mood Dataset

While not strictly “mental health,” this dataset contains labeled moments of happiness, useful for studying emotion classification or positive reinforcement models.

  • Data Highlights: 100k+ crowd-sourced entries on what makes people happy.
  • Ideal For: Mood prediction models, reinforcement learning, or emotion-based recommendation engines.

🔗 HappyDB

9. Student Mental Health Survey

An anonymized mental health dataset examining how academic and social pressure affects student well-being.

  • Data Highlights: Stressors, academic performance, gender-based differences, and access to support systems.
  • Ideal For: Academic research, intervention strategy modeling, or student mental health prediction.

🔗 Student Mental Health

10. Emotion Classification Dataset (Text-based)

This multi-class emotion dataset supports finer-grained classification across a spectrum of emotions, beyond just negative or positive.

  • Data Highlights: Emotion-labeled text samples across seven categories (joy, anger, fear, etc.).
  • Ideal For: Advanced NLP projects like multi-class emotion classification and real-time emotion detection tools.

🔗 Emotion Text Dataset
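To make the multi-class setup concrete (each text gets exactly one emotion label), here is a minimal sketch of a bag-of-words Naive Bayes classifier in plain Python. The training sentences are made up for illustration and are not drawn from the dataset; a real project would likely use a library such as scikit-learn and a proper train/test split.

```python
import math
from collections import Counter, defaultdict

# Made-up training examples for illustration only (not from the dataset).
TRAIN = [
    ("i am so happy today", "joy"),
    ("what a wonderful happy surprise", "joy"),
    ("this makes me furious", "anger"),
    ("i am furious and annoyed", "anger"),
    ("i am scared of the dark", "fear"),
    ("that noise scared me", "fear"),
]

def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict(text, model):
    """Return the single most likely label for `text`."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, doc_count in label_counts.items():
        score = math.log(doc_count / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in text.split():
            if word not in vocab:
                continue  # skip words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train_naive_bayes(TRAIN)
print(predict("so happy", model))
```

Because `predict` returns only the top-scoring label, this is multi-class rather than multi-label classification; assigning several emotions to one text would need a different formulation, such as one binary classifier per emotion.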

Why These Datasets Matter — Even If You’re Not in Mental Health

Even if your focus isn’t directly on psychology or public health, these mental health datasets offer tremendous value:

  • Train your NLP and classification skills with real-world data
  • Learn how to handle sensitive, high-impact topics with care
  • Explore time-series, demographic, and correlation analyses in new contexts
  • Prototype responsible AI tools that prioritize well-being

Now, Are You Ready to Take It Further?

Want to build smarter solutions faster? Join us at ODSC East to deepen your skills and learn from top minds in AI, ML, and data science. With sessions ranging from responsible AI to NLP in healthcare, it’s a great place to see how real-world data (like these datasets) can drive meaningful change.

Register now for ODSC East — your next breakthrough starts here. Use this link for an additional 10% off all ticket types.

Written by ODSC - Open Data Science

Our passion is bringing thousands of the best and brightest data scientists together under one roof for an incredible learning and networking experience.