Below we'll plot the location of Beijing subway stops over time, half of which have been built since 2015. Locations for subway stops come from Wikipedia and OpenStreetMap. This is not a rigorous study, so some subway stops may be missing.
First, we can use the Pandas library to download our data. You're likely already familiar with it: Pandas is a very popular Python library for filtering, aggregating, and joining data.
import pandas as pd
import pydeck as pdk
# First, let's use Pandas to download our data
URL = 'https://raw.githubusercontent.com/ajduberstein/data_sets/master/beijing_subway_station.csv'
df = pd.read_csv(URL)
df.head()
Next, we need to do some data housekeeping. The CSV encodes the [R, G, B, A] color values as a str, and literal_eval lets us convert that string to a list.
from ast import literal_eval
# The CSV stores each [R, G, B, A] color as a string,
# so parse every value into a Python list
df['color'] = df['color'].apply(literal_eval)
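To see what literal_eval does in isolation, here is a minimal sketch on a single made-up color string (the value is illustrative and not taken from the CSV). It also shows that literal_eval only accepts Python literals and rejects anything else, which is why it is a safer choice than eval for parsing a column like this.

```python
from ast import literal_eval

# Illustrative value; the real CSV stores one such string per station
raw_color = '[255, 0, 0, 255]'
parsed_color = literal_eval(raw_color)

# The string becomes a real list of integers
print(type(raw_color).__name__, '->', type(parsed_color).__name__)
print(parsed_color)

# Anything that is not a plain literal (here, a function call) is refused
try:
    literal_eval("__import__('os')")
    rejected = False
except ValueError:
    rejected = True
print('non-literal input rejected:', rejected)
```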
pydeck features some utilities for visualizing data, like an automatic zoom using data_utils.compute_view for 2D data sets.
We'll render the viewport, as well, just to verify that the visualization looks sensible.
# Use pydeck's data_utils module to fit a viewport to the central 90% of the data
viewport = pdk.data_utils.compute_view(points=df[['lng', 'lat']], view_proportion=0.9)
auto_zoom_map = pdk.Deck(layers=[], initial_view_state=viewport)
auto_zoom_map.show()
Sure enough, we're centered on Beijing.
We'll render the data and use some Jupyter notebook functionality to provide a header with a year.
It's worth spending some time on each line, if you haven't seen the Layer object yet:
scatterplot = pdk.Layer(
    'ScatterplotLayer',
    df,
    get_radius=500,
    get_fill_color='color',
    get_position=['lng', 'lat'])
We can specify the layer type as the first argument, the data as the second, and the layer arguments as keywords. ScatterplotLayer is one of a list of layers available in the deck.gl core library. We'll also provide a header to list the year using some built-in Jupyter notebook tools.
For a list of other layers, see the deck.gl documentation. Remember that deck.gl is a JavaScript library and not a Python one, so the documentation may differ for some kinds of terminology and functionality (e.g., pydeck doesn't support passing functions as arguments but this is a common occurrence within deck.gl).
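Because functions can't be passed as layer arguments, one common workaround is to compute the value ahead of time in Pandas and refer to the resulting column by name, the same way get_fill_color='color' refers to the color column. The sketch below is only an illustration: the toy rows, the radius_for helper, the cutoff year, and the assumption that opening_date is a string beginning with a four-digit year are all hypothetical and not part of the original notebook.

```python
import pandas as pd

# Toy rows standing in for the real CSV; values are illustrative only
toy = pd.DataFrame({
    'lng': [116.39, 116.41],
    'lat': [39.90, 39.93],
    'opening_date': ['1971-01-15', '2015-12-26'],
})


def radius_for(opening_date, cutoff_year=2000):
    """Hypothetical rule: draw older stations with a larger radius."""
    year = int(opening_date[:4])
    return 800 if year < cutoff_year else 400


# Precompute the per-row value instead of handing pydeck a function
toy['radius'] = toy['opening_date'].apply(radius_for)
print(toy[['opening_date', 'radius']])

# The layer could then reference the column by name, for example:
# pdk.Layer('ScatterplotLayer', toy,
#           get_position=['lng', 'lat'], get_radius='radius')
```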
from IPython.display import display
import ipywidgets
year = 2019
scatterplot = pdk.Layer(
    'ScatterplotLayer',
    df,
    get_position=['lng', 'lat'],
    get_radius=500,
    get_fill_color='color')
r = pdk.Deck(layers=[scatterplot], initial_view_state=viewport)
# Create an HTML header to display the year
display_el = ipywidgets.HTML('<h1>{}</h1>'.format(year))
display(display_el)
# Show the current visualization
r.show()
Finally, we can loop through the data and see the dramatic development in Beijing since 1971, as demonstrated by subway stop opening dates.
import time
for y in range(1971, 2020):
    # Keep only the stations whose opening date sorts at or before this year
    scatterplot.data = df[df['opening_date'] <= str(y)]
    year = y
    # Reset the header to display the year
    display_el.value = '<h1>{}</h1>'.format(year)
    r.update()
    time.sleep(0.2)
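One detail of the filter is worth checking by hand: it compares strings, not dates. If opening_date holds full dates such as '1971-01-15' (an assumption about the CSV, not something verified here), then a station opened in a given year only passes the test from the following year onward, because '1971-01-15' sorts after '1971'. The sketch below uses made-up rows and two hypothetical helpers, opened_by and opened_through, to show the difference between comparing the whole string and comparing only its year prefix.

```python
import pandas as pd

# Made-up rows; station names and dates are illustrative only
toy = pd.DataFrame({
    'station': ['A', 'B', 'C'],
    'opening_date': ['1971-01-15', '1984-09-20', '2015-12-26'],
})


def opened_by(frame, year):
    """Same comparison as the loop: whole date string against the year."""
    return frame[frame['opening_date'] <= str(year)]


def opened_through(frame, year):
    """Alternative: compare only the four-character year prefix."""
    return frame[frame['opening_date'].str[:4] <= str(year)]


print(len(opened_by(toy, 1971)))                  # station A is not yet included
print(list(opened_through(toy, 1971)['station'])) # station A is included
print(list(opened_by(toy, 1972)['station']))      # station A appears a year later
```

For an animation spanning five decades, a one-year lag is barely visible, so either comparison may be acceptable depending on how precisely the header year should match the map.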