Creating Multiple Visualizations in a Single Python Notebook
For a data scientist without an eye for design, creating visualizations from scratch might be a difficult task. But as is the case with most problems, a solution awaits thanks to Python.
Those drawn to using Python for data analysis have been spoiled: mature libraries have turned once-irksome tasks, like converting a comma-separated values (CSV) file into an organized dataset, into non-issues. Visualization in Python has had a few helping hands as well, with libraries like matplotlib letting you see your data as you analyze it. Now, creating interactive, web-accessible visualizations is just as easy. Here’s how you can get started making an interactive map and a line chart in a Python notebook using Plotly.
After obtaining your Plotly API key and installing the required packages, import them into the notebook and register your credentials.
import numpy as np
import plotly
import plotly.plotly as py
import pandas as pd
import scipy as sp
import plotly.figure_factory as ff
import plotly.graph_objs as go

# Save your Plotly username and API key so later calls to the API can authenticate
plotly.tools.set_credentials_file(username='user', api_key='your_key')
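The imports above match Plotly releases before version 4. In later releases the online plotting module was split out into the separate chart-studio package, so importing plotly.plotly no longer works there. Below is a minimal sketch of a version-tolerant import, assuming either package may or may not be installed. The names online_py, online_tools, and register_credentials are my own placeholders, not part of Plotly.

```python
import importlib

# Look for the online plotting module in the newer chart-studio package
# first, then in the pre-4.0 plotly package. If neither import works,
# both names stay None.
online_py = None
online_tools = None
for package in ("chart_studio", "plotly"):
    try:
        online_py = importlib.import_module(package + ".plotly")
        online_tools = importlib.import_module(package + ".tools")
        break
    except ImportError:
        online_py = None
        online_tools = None


def register_credentials(username, api_key):
    """Save credentials if an online plotting module was found.

    Returns True when the credentials were saved, False otherwise.
    """
    if online_tools is None:
        return False
    online_tools.set_credentials_file(username=username, api_key=api_key)
    return True
```

With this in place, online_py.iplot would stand in for py.iplot in the rest of the walkthrough, provided one of the two packages is available.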
For this example, I’m using Boston weather data that I downloaded from the National Centers for Environmental Information and loaded with pandas into a dataframe named df, with NAME, LATITUDE, LONGITUDE, and PRCP columns. Feel free to use whichever dataset you like, or to build one from any list.
# This line creates the text that is displayed when hovering over data points on the map.
# I've included the name and average precipitation of each city in the dataset.
# Plotly hover text uses '<br>' for a line break rather than '\n'.
df['text'] = df['NAME'].astype(str) + '<br>' + 'Precipitation: ' + df['PRCP'].astype(str)
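The loading step itself isn’t shown above, so here is a minimal sketch of it. It assumes a CSV export with NAME, LATITUDE, LONGITUDE, and PRCP columns; the two rows are made-up placeholders rather than real measurements, and load_stations is a helper name of my own. Rows with no precipitation reading are dropped so they don’t end up as uncolored points.

```python
import io

import pandas as pd

# Placeholder rows for illustration only; these are not real NCEI values.
SAMPLE_CSV = """NAME,LATITUDE,LONGITUDE,PRCP
EXAMPLE STATION A,42.36,-71.01,0.12
EXAMPLE STATION B,42.39,-71.03,
"""


def load_stations(source):
    """Read station rows, drop those without PRCP, and add hover text."""
    df = pd.read_csv(source)
    df = df.dropna(subset=["PRCP"]).reset_index(drop=True)
    # '<br>' is the line break Plotly understands in hover text
    df["text"] = (
        df["NAME"].astype(str) + "<br>Precipitation: " + df["PRCP"].astype(str)
    )
    return df


# In a real notebook, pass the path of your downloaded file instead.
df_example = load_stations(io.StringIO(SAMPLE_CSV))
```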
Here’s the fun part: we’ll create a scatterplot from our data and place each city on the map by its coordinates. Plotly positions each point from the longitude and latitude we pass in, so no separate geocoding step is needed. We’ll also change the appearance of each point using the marker parameters. If you aren’t familiar with the attributes I’m using here, Plotly’s documentation explains each one.
data = [go.Scattergeo(
    locationmode = 'USA-states',
    lon = df['LONGITUDE'],
    lat = df['LATITUDE'],
    text = df['text'],
    mode = 'markers',
    marker = dict(
        size = 8,
        opacity = 0.8,
        reversescale = False,
        autocolorscale = True,
        symbol = 'circle',
        line = dict(
            width = 0.8,
            color = 'rgb(102, 102, 102)'  # three channels, so rgb rather than rgba
        ),
        cmin = 0,
        color = df['PRCP'],
        cmax = df['PRCP'].max(),
        colorbar = dict(
            title = 'Precipitation Avg. Jan. 2019 (in)'
        )
    )
)]
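The cmin and cmax settings above define the range of PRCP values that the colorscale spans. Conceptually, each value is rescaled to a position between 0 and 1 before a color is picked. The helper below is a rough sketch of that idea, assuming linear scaling with out-of-range values pinned to the ends of the scale. colorscale_position is a name of my own, not a Plotly function; Plotly does this work internally.

```python
def colorscale_position(value, cmin, cmax):
    """Return where `value` falls on a colorscale, as a number in [0, 1]."""
    if cmax <= cmin:
        raise ValueError("cmax must be greater than cmin")
    fraction = (value - cmin) / (cmax - cmin)
    # Values below cmin or above cmax take the color at the nearest end
    return min(1.0, max(0.0, fraction))


# With cmin = 0 and cmax = 4, a reading of 2 lands in the middle of the scale
middle = colorscale_position(2, 0, 4)
```

This is why setting cmax to the largest PRCP value makes the wettest city take the color at the top of the colorbar.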
Lastly, we’ll set our layout, which defines the title and the base map on which our data points will appear, including its scope, projection, and colors. Again, a little refresher on this section of the docs might be helpful.
layout = dict(
    title = 'Average Precipitation in Greater Boston Area (Jan. 2019)',
    geo = dict(
        scope = 'usa',
        projection = dict(type = 'albers usa'),
        showland = True,
        landcolor = 'rgb(250, 250, 250)',
        subunitcolor = 'rgb(217, 217, 217)',
        countrycolor = 'rgb(217, 217, 217)',
        countrywidth = 0.5,
        subunitwidth = 0.5
    ),
)
All that’s left to do is create the figure from the data and layout we’ve defined and make a call to the API, which will output an interactive map in the notebook:
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='Precipitation')
Let’s try another type of graph, this time using a line chart to visualize monthly precipitation averages in two places over the course of a year.
First, we’ll create the two lines we want to plot as separate traces. As with the scatterplot, we’ll build each one with a graph object and set its parameters. Each line uses its own dataframe: dfBos holds precipitation levels for Boston, and dfCh holds those for Chelsea. We’ll set the x-axis to the date values and the y-axis to precipitation. To assemble the data for the figure, create a list containing our two line objects.
trace1 = go.Scatter(
    x = dfBos['DATE'],
    y = dfBos['PRCP'],
    name = 'Precipitation in Boston',
    line = dict(
        color = 'blue',
        width = 3)
)
trace2 = go.Scatter(
    x = dfCh['DATE'],
    y = dfCh['PRCP'],
    name = 'Precipitation in Chelsea',
    line = dict(
        color = 'red',
        width = 3)
)
data = [trace1, trace2]
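The traces above assume dfBos and dfCh already hold one row per month. If your download contains daily readings instead, one way to get monthly averages is to resample by month with pandas. This is a sketch under that assumption; the sample values are made up, and monthly_mean_prcp is a helper name of my own.

```python
import pandas as pd


def monthly_mean_prcp(daily):
    """Average daily PRCP readings per calendar month.

    Expects DATE and PRCP columns; returns one row per month, with DATE
    set to the first day of that month.
    """
    out = daily.copy()
    out["DATE"] = pd.to_datetime(out["DATE"])
    return out.set_index("DATE")["PRCP"].resample("MS").mean().reset_index()


# Made-up daily readings, for illustration only
daily_example = pd.DataFrame({
    "DATE": ["2019-01-01", "2019-01-02", "2019-02-01"],
    "PRCP": [0.0, 0.5, 0.2],
})
monthly_example = monthly_mean_prcp(daily_example)
```

The result has the same DATE and PRCP columns the traces read from, so it could be used in place of dfBos or dfCh.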
Here, the layout is less complex. We just need to label the x- and y-axes and title the graph.
layout = dict(
    title = 'Average Precipitation in Suffolk County, MA',
    xaxis = dict(title = 'Month'),
    yaxis = dict(title = 'Precipitation (in)'),
)
Once you create the figure and call the API, you should have a neat line chart with interactive hover labels:
fig = dict(data=data, layout=layout)
py.iplot(fig, filename='Precipitation in Suffolk County')
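Since the figure above is just a dict with data and layout keys, the repeated trace definitions can be generated by a small helper. Here is a sketch that uses plain Python dicts in place of go.Scatter objects. The function name and the sample numbers are illustrative, and I’m assuming the result can be handed to py.iplot the same way as the dict figure above.

```python
import json


def make_line_figure(series, title, x_title, y_title):
    """Build a figure dict with one line trace per (name, color, xs, ys) entry."""
    traces = []
    for name, color, xs, ys in series:
        traces.append(dict(
            type='scatter',
            mode='lines',
            x=list(xs),
            y=list(ys),
            name=name,
            line=dict(color=color, width=3),
        ))
    layout = dict(
        title=title,
        xaxis=dict(title=x_title),
        yaxis=dict(title=y_title),
    )
    return dict(data=traces, layout=layout)


# Made-up monthly values, for illustration only
months = ['2019-01', '2019-02', '2019-03']
fig_example = make_line_figure(
    [
        ('Precipitation in Boston', 'blue', months, [0.1, 0.2, 0.3]),
        ('Precipitation in Chelsea', 'red', months, [0.3, 0.2, 0.1]),
    ],
    title='Average Precipitation in Suffolk County, MA',
    x_title='Month',
    y_title='Precipitation (in)',
)

# The whole figure is ordinary data, so it serializes cleanly to JSON
fig_json = json.dumps(fig_example)
```

Adding a third location then only takes one more tuple in the list rather than another copied trace block.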
Now that you have the basics of using Plotly with Python, you can try these concepts out with different data and build on them along the way.