12 Excellent Datasets for Data Visualization in 2022

ODSC - Open Data Science
Mar 2, 2022

Data visualization requires quality data just as much as any other project. Finding data visualization datasets can be frustrating, but these datasets offer excellent resources to support visualization projects of all kinds. Let’s explore the best data visualization datasets for 2022.

A Quick Word on Data Visualization

A search on Indeed revealed over 67,000 jobs listed just for data visualization. That doesn’t even include the general need for data scientists. Visualization skills help businesses build rapport and gain real insight from their data.

Whether you’re a seasoned data scientist or new to the field, you can always practice visualization. These datasets offer the perfect chance to manage projects and build experience.

FiveThirtyEight

FiveThirtyEight is a journalism site that makes the datasets behind its stories available to the public. These provide researched data suitable for visualization and include sets such as airline safety, election predictions, and U.S. weather history. The sets are easily searchable, and the site is updated continually.

BuzzFeed

BuzzFeed also makes data available to the public through its GitHub page. Users can find data analyses, libraries, and guides, all open source. Example datasets include FCC comments and data breaches, fake news sites, and figure skating scores, among many others. Although BuzzFeed has a reputation for writing simple articles, these datasets come from its investigative journalism sections.

The U.S. Census Bureau

The Census Bureau offers a wide variety of datasets on everything from population to foreign trade. These sets are free, and researchers can access them through a simple data search. The site includes maps, tables, statistics, and data profiles. These datasets span decades of information and could offer excellent infographics or other visualizations.

AWS Covid Job Impacts

For those looking for Covid-specific visualization data, AWS offers this look at how Covid has impacted jobs since March 1, 2020. According to the landing page, the dataset updates daily, and researchers are free to use it under a Creative Commons license. Data comes from online job listings, and each filter segment includes the average of new job listings over a seven-day period.
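To get a feel for what a seven-day average does to a daily series, here is a minimal sketch in plain Python. The numbers are made up for illustration, and the trailing (rather than centered) window is an assumption; the landing page only says the average covers a seven-day period.

```python
from collections import deque


def trailing_average(values, window=7):
    """Return the mean of the most recent `window` values at each position.

    Positions that do not yet have a full window yield None.
    """
    buf = deque(maxlen=window)
    total = 0.0
    averages = []
    for value in values:
        if len(buf) == window:
            # The oldest value is about to be pushed out of the window.
            total -= buf[0]
        buf.append(value)
        total += value
        averages.append(total / window if len(buf) == window else None)
    return averages


# Hypothetical daily counts of new job listings (not taken from the dataset).
daily_new_listings = [120, 130, 125, 90, 80, 140, 150, 160, 155]
smoothed = trailing_average(daily_new_listings)
print(smoothed)
```

Plotting the smoothed series next to the raw one is a quick way to see how the average hides weekday dips while keeping the overall trend.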

Twitter Edge Nodes

This dataset allows users to build geographical representations using the 11 million nodes and 85 million edges in the set. It lives on Kaggle and is free for users to download and explore. Researchers can explore relationships between Twitter users, one of the largest sets of social media interactions available.
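A network dataset like this is usually a long list of edges, each linking two nodes. As a rough sketch of a first step before plotting, the snippet below counts connections per node from a tiny, invented edge list. Treating each edge as a directed (follower, followed) pair is an assumption for illustration, not something the dataset description confirms.

```python
from collections import Counter

# Hypothetical directed edges (follower, followed); not taken from the dataset.
edges = [
    ("a", "b"),
    ("a", "c"),
    ("b", "c"),
    ("d", "c"),
    ("c", "a"),
]


def degree_counts(edge_list):
    """Count incoming and outgoing edges for every node in the edge list."""
    in_degree = Counter()
    out_degree = Counter()
    for source, target in edge_list:
        out_degree[source] += 1
        in_degree[target] += 1
    return in_degree, out_degree


in_degree, out_degree = degree_counts(edges)

# The node with the most incoming edges is a natural one to highlight in a chart.
most_followed = in_degree.most_common(1)[0]
print(most_followed)
```

Degree counts like these are often used to size or color nodes, or to pick a manageable subset of a very large graph to draw.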

Earth Data

Earth Data offers science-related datasets for researchers in open access formats. Information comes from NASA data repositories, and users can explore everything from climate data to specific regions like oceans, to environmental challenges like wildfires. The site also includes tutorials and webinars, as well as articles. The rich data offers environmental visualizations and contains data from scientific partners as well.

Urban Atlas, European Environment Agency

Listed on the United Nations' UN-SPIDER Knowledge Portal, this dataset offers spatial data on land use and land cover. The data covers large urban zones with more than 100,000 inhabitants. Users can explore data through the interactive map, and data comes from sources such as web GIS or real-time monitoring.

The GDELT Project

The Global Database of Events, Language, and Tone collects events at a global scale. It is one of the largest data repositories on human society. Researchers can explore people, locations, themes, organizations, and other types of subjects. Data is free, and users can also download raw datasets for unique use cases. The site also offers a variety of tools for users with less experience building their own visualizations.

The Open Data Institute

The Open Data Institute offers datasets covering subjects like precipitation data, electricity usage, or air quality. Researchers can explore these datasets as part of an open data project with information taken from various Italian institutions. The Node Trentino projects can offer researchers real-life utility data for visualizations and other relevant projects.

Hotel Booking Demand Data

This dataset offers the opportunity to explore questions about travel through visualization. Because it is about two years old, it is best suited to practicing question-driven visualization rather than tracking current trends. Users can find it on Kaggle, and it includes booking information for a city hotel and a resort hotel, including dates, times, who stayed, and other relevant information.

ProPublica

The news site ProPublica makes datasets available to the public covering subjects like education, the environment, or the military. The site includes both free and premium datasets, and users can sign up for notifications of newly uploaded datasets. Some of the information comes from older reports and research, but the site offers valuable resources for practice or real research.

Singapore Public Data

Another civic source of data comes from the Singapore government, which makes these datasets available for research and exploration. Users can browse by subject through the navigation bar or enter their own search terms. Datasets cover subjects like the environment, education, infrastructure, and transport.

Leveraging Visualization for Data Insights

Visualization is a valuable skill for new data scientists to master. Even seasoned data scientists can always use practice to level up their visualization skills. These datasets offer a range of information on a variety of subjects, perfect for launching your 2022 projects.

Learn more about Data Viz and Data Visualization Datasets at ODSC East 2022

There’s a lot to learn about data viz and data visualization datasets. What tools do you use? What frameworks are the best for your data visualization needs? Can knowing data viz help your career? By attending ODSC East 2022 this April 19th-21st and checking out the Data Visualization focus area, you can answer these questions and more. Here are a few of the sessions, with more to be added every week:

  • Network Analysis Made Simple: Eric Ma, PhD | Author of nxviz Package
  • Data Visualization with ggplot2: Martin Frigaard | Senior Clinical Programmer | BioMarin
  • Beyond the Basics: Data Visualization in Python: Stefanie Molin | Data Scientist, Software Engineer, Author of Hands-On Data Analysis with Pandas | Bloomberg

