Introduction

This demonstration shows how to use a choropleth map built with the MapLibre mapping library in an Observable Notebook (link to the Notebook). Although the original map displays data for Maine county subdivisions, it can be modified to visualize data for any specified region. Observable Notebooks use JavaScript as their primary language, but Python users can interact with and modify existing Observable visualizations using the Observable-Jupyter library. The demonstration covers how to set up the computing environment, load data in a form Observable can read, and embed the data and specifications into the map. Throughout, the instructions refer to Python cells that can be found in this Google Colab.

Setting Up

The demonstration will be conducted in Python using a Jupyter Notebook, a web-based interactive computing platform. First, using the pip package manager, install the pandas and Observable-Jupyter packages. The pandas package will be used for reading in data, while Observable-Jupyter will enable interaction with an Observable notebook.

!pip install observable_jupyter

!pip install pandas

Also, be sure to include the following import statements. The embed function is imported from Observable-Jupyter to embed and modify Observable visualizations. The json package is imported so that the quantitative data can be formatted as a JSON object. The requests package is used to load the JSON file containing the boundary information.

from observable_jupyter import embed
import pandas as pd
import json
import requests

Loading the Data

To update the choropleth map with new information, the input data must be readable by Observable. The two cells below demonstrate how to do so. The to_json() method, with its orient parameter set to "records", converts the pandas data frame to a JSON string in which each row is a set of column-value pairs. The json package is then used to add indentation and return the data as a JSON object (a list of Python dictionaries). The get() function from requests downloads the boundary file, and resp.json() parses it so that the geographical data keeps its GeoJSON structure.

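# Read the state temperature table from the demonstration repository
temp_df = pd.read_csv('https://raw.githubusercontent.com/derekgmuse/choropleth/main/Demonstration%20Files/StateTemp.csv')
# Serialize the rows as a JSON string of column-value records
result = temp_df.to_json(orient="records")
# Parse the string into a list of dictionaries
parsed = json.loads(result)
# Re-serialize with indentation, then parse again into Python objects
data = json.dumps(parsed, indent=4)
Formatted_Data = json.loads(data)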
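
The round trip through the json package can be checked in isolation. The following is a minimal sketch that uses a short, made-up string in place of the output of to_json() (the values are illustrative, not read from the CSV file). It suggests that the indentation step changes only how the text looks, not the resulting object.

```python
import json

# Made-up stand-in for the string returned by temp_df.to_json(orient="records")
result = '[{"State": "Alabama", "Avg °F": 62.8}, {"State": "Alaska", "Avg °F": 26.6}]'

# Parse the JSON string into a list of dictionaries
parsed = json.loads(result)

# Re-serialize with indentation, then parse again
pretty = json.dumps(parsed, indent=4)
round_tripped = json.loads(pretty)

print(parsed[0]["State"])       # name stored under the "State" key
print(round_tripped == parsed)  # the indented round trip yields an equal object
```

Under this assumption, the list produced by the first json.loads() call already has the shape passed to the map as "quantitative_data".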

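url = "https://raw.githubusercontent.com/derekgmuse/choropleth/main/Demonstration%20Files/gz_2010_us_040_00_20m.json"
resp = requests.get(url)
# Stop early if the download failed instead of parsing an error page
resp.raise_for_status()
# Parse the response body as JSON, preserving the GeoJSON structure
Boundary_Data = resp.json()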
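
Before passing the boundary data to the map, it can help to confirm that it has the expected GeoJSON shape. The sketch below assumes the file is a GeoJSON FeatureCollection, which is consistent with how the data is indexed later in this demonstration. The helper name looks_like_feature_collection and the sample dictionary are hypothetical and are not part of the Observable notebook.

```python
def looks_like_feature_collection(geojson):
    """Return True if the object has the basic shape of a GeoJSON FeatureCollection."""
    if not isinstance(geojson, dict):
        return False
    if geojson.get("type") != "FeatureCollection":
        return False
    features = geojson.get("features")
    if not isinstance(features, list) or not features:
        return False
    # Every feature should carry both a geometry and a properties member
    return all(
        isinstance(f, dict) and "geometry" in f and "properties" in f
        for f in features
    )

# Tiny made-up sample standing in for Boundary_Data
sample = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"NAME": "Arizona"},
            "geometry": {"type": "Point", "coordinates": [-111.9, 34.2]},
        }
    ],
}

print(looks_like_feature_collection(sample))
print(looks_like_feature_collection({"type": "Feature"}))
```

In the notebook, the same check could be applied to Boundary_Data directly.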

Linking Data with Boundary Information

To create a choropleth visualization, the data for each region must be connected with its boundary coordinates. A function within the Observable notebook takes care of this, but it needs specific inputs to map the data successfully. This section explains how to determine the "regional_name", "regional_data" and "commonality" inputs used in the next section. The following two cells show examples of how the data is structured. Three strings must be extracted from their output: the key associated with the name of each region, the key associated with the data for each region, and the property associated with the name of each boundary. In this example, "State" is the key for the name of each state ("regional_name"), "Avg °F" is the key for the data ("regional_data"), and "NAME" is the property for the name of each boundary ("commonality"). Each string must match exactly when entered as an input in the next step.

Formatted_Data[0]
{'State': 'Alabama', 'Avg °F': 62.8, 'Avg °C': 17.1, 'Rank': 7}

Boundary_Data["features"][0]["properties"]
{'GEO_ID': '0400000US04',
'STATE': '04',
'NAME': 'Arizona',
'LSAD': '',
'CENSUSAREA': 113594.084}
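
Because the link depends on region names matching boundary names exactly, a quick comparison can surface mismatches before the map is drawn. The following is a minimal sketch, not the notebook's own joining function. The helper unmatched_regions and the sample records are hypothetical, and the sketch assumes the join is an exact string match between the two name fields.

```python
def unmatched_regions(records, features, regional_name, commonality):
    """List region names in the data that have no boundary with the same name."""
    boundary_names = {f["properties"][commonality] for f in features}
    return sorted(
        r[regional_name] for r in records if r[regional_name] not in boundary_names
    )

# Made-up samples shaped like Formatted_Data and Boundary_Data["features"]
records = [
    {"State": "Alabama", "Avg °F": 62.8},
    {"State": "Arizona", "Avg °F": 60.3},
    {"State": "Washington DC", "Avg °F": 58.5},
]
features = [
    {"properties": {"NAME": "Alabama"}},
    {"properties": {"NAME": "Arizona"}},
    {"properties": {"NAME": "District of Columbia"}},
]

unmatched = unmatched_regions(records, features, "State", "NAME")
print(unmatched)
```

An empty list would indicate that every region in the data has a boundary with the same name.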

Embedding Data and Specifications

The final step is to embed the Observable Notebook into a Jupyter cell and supply the data and specifications desired for the visualization. The embedded cells are the map and the MapLibre-specific CSS stylesheet. The following inputs allow the user to customize the map to their requirements.

"map_data": The formatted data containing the geographical boundaries of each region

"quantitative_data": The formatted quantitative data

"regional_name": A string set to the key associated with the name of each region from "quantitative_data"

"regional_data": A string set to the key associated with the data from each region from "quantitative_data"

"commonality": A string set to the property associated with the name of each region from "map_data"

"toolTip_title": A list of two strings used for the map's popups. The first string names the type of data being used, and the second string names the units of each value

"Levels": A list of 7 numerical values ordered from smallest to largest that partition the data into 7 sections, each associated with a color. Each value is the upper bound of its section, so a section holds the data values less than or equal to that value. Ensure no data value is larger than the 7th number.

"center": A list representing the coordinates [longitude, latitude] of the center of the map

"zoom": A value used to set the initial zoom of the map (accepted values between 0 and 24)

"Colors": A list of 7 CSS color codes, one for each of the 7 sections of data

embed("@derekgmuse/maine-county-subdivisions-choropleth",
    cells = ["map", "style"],
    inputs = { 
        "map_data" : Boundary_Data,
        "quantitative_data" : Formatted_Data,
        "regional_name" : "State",
        "regional_data" : "Avg °F",
        "commonality" : "NAME",
        "toolTip_title" : ["Average Temperature", "°F"],
        "Levels" : [25, 33, 41, 49, 57, 65, 73],
        "center" : [-98, 40],
        "zoom" : 3,
        "Colors" : ['#045a8d','#7fcdbb', '#c7e9b4','#fec44f', '#fe9929', '#d95f0e', '#993404']

    }
)
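
To see how the "Levels" and "Colors" lists in the call above relate to each other, the partitioning can be sketched in plain Python. This is an illustration based on the description of "Levels" as upper bounds. The notebook's actual implementation may differ, and the helper section_index is hypothetical.

```python
from bisect import bisect_left

levels = [25, 33, 41, 49, 57, 65, 73]
colors = ['#045a8d', '#7fcdbb', '#c7e9b4', '#fec44f', '#fe9929', '#d95f0e', '#993404']

def section_index(value, levels):
    """Return the index of the first level that is greater than or equal to value."""
    index = bisect_left(levels, value)
    if index == len(levels):
        # No section can hold a value above the last level
        raise ValueError(f"{value} is larger than the last level ({levels[-1]})")
    return index

# Alabama's 62.8 °F is above 57 and at most 65, so it lands in the sixth section
print(colors[section_index(62.8, levels)])
```

Under this reading, a value above 73 has no section, which is why no data value should exceed the 7th number.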

The following example demonstrates a case in which the Maine county subdivision boundaries are desired. Here, embedding the data is more straightforward because the geographical boundaries are already loaded into the Observable Notebook. The "map_data", "commonality", "center" and "zoom" inputs can be omitted.

pop_df = pd.read_csv('https://raw.githubusercontent.com/derekgmuse/choropleth/main/Demonstration%20Files/MainePopulation.csv')
result = pop_df.to_json(orient="records")
parsed = json.loads(result)
data = json.dumps(parsed, indent=4)
Formatted_Data2 = json.loads(data)

embed("@derekgmuse/maine-county-subdivisions-choropleth",
    cells = ["map", "style"],
    inputs = { 
        "quantitative_data" : Formatted_Data2,
        "regional_name" : "Place",
        "regional_data" : "2020 Total Population",
        "toolTip_title" : ["Population", "people"],
        "Levels" : [100, 1000, 5000, 10000, 15000, 25000, 70000],
        "Colors" : ['#045a8d','#7fcdbb', '#c7e9b4','#fec44f', '#fe9929', '#d95f0e', '#993404']
    }
)
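
Both calls above rely on the constraints described for the inputs, so it may be worth checking a dictionary of inputs before passing it to embed. The sketch below is one possible pre-flight check. The helper check_inputs is hypothetical, is not part of Observable-Jupyter, and only covers the constraints stated in this demonstration.

```python
def check_inputs(inputs):
    """Return a list of problems found in an inputs dictionary (empty if none)."""
    problems = []
    levels = inputs.get("Levels", [])
    colors = inputs.get("Colors", [])
    if len(levels) != 7:
        problems.append("Levels must contain exactly 7 numbers")
    if list(levels) != sorted(levels):
        problems.append("Levels must be ordered from smallest to largest")
    if len(colors) != 7:
        problems.append("Colors must contain exactly 7 color codes")
    # No data value may exceed the last level
    records = inputs.get("quantitative_data", [])
    key = inputs.get("regional_data")
    if levels and key is not None and any(r[key] > levels[-1] for r in records):
        problems.append("a data value is larger than the last level")
    # center and zoom are optional when the built-in boundaries are used
    if "zoom" in inputs and not 0 <= inputs["zoom"] <= 24:
        problems.append("zoom must be between 0 and 24")
    if "center" in inputs and len(inputs["center"]) != 2:
        problems.append("center must be [longitude, latitude]")
    return problems

# Made-up inputs: one that satisfies the constraints and one that does not
good_inputs = {
    "quantitative_data": [{"Place": "Example Town", "2020 Total Population": 1200}],
    "regional_data": "2020 Total Population",
    "Levels": [100, 1000, 5000, 10000, 15000, 25000, 70000],
    "Colors": ['#045a8d', '#7fcdbb', '#c7e9b4', '#fec44f', '#fe9929', '#d95f0e', '#993404'],
}
bad_inputs = {
    "quantitative_data": [{"v": 500}],
    "regional_data": "v",
    "Levels": [100, 50, 10],
    "Colors": ['#045a8d', '#7fcdbb', '#c7e9b4', '#fec44f', '#fe9929', '#d95f0e', '#993404'],
    "zoom": 30,
}

print(check_inputs(good_inputs))
for problem in check_inputs(bad_inputs):
    print(problem)
```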

Reference

This demonstration is adapted from the Leaflet Choropleth demonstration in the Observable-Jupyter documentation, which can be found here.