Quick tip: Using Uber's H3 to visualise British Transport Police crime data

Akmal Chaudhri - Oct 4 '22 - Dev Community

Abstract

Creating visualisations can often be a great way to present data. In this short article, we'll apply Uber's H3 Hexagonal Hierarchical Spatial Index to British Transport Police (BTP) crime data, and then visualise the results. We'll use a local Jupyter installation as our development environment.

The notebook file used in this article is available on GitHub.

Introduction

In a previous article, we discussed how to map crimes and visualise hot routes. In this article, we'll extend that work by obtaining the latest BTP crime data for the UK and using Uber's H3 library.

Obtain BTP crime data

The file we need is 2024-08-btp-street.csv. This can be generated from the Data Downloads page. On that page, we'll select the following:

  • Date range: August 2024 to August 2024.
  • Forces: Check (✔) British Transport Police.
  • Data sets: Check (✔) Include crime data.
  • Click Generate file.

The download will be a zip file, and the CSV file we need can be extracted from that.
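If you prefer to do the extraction from Python, the following is a minimal sketch using the standard zipfile module. The helper name extract_csv and the archive name in the usage comment are hypothetical, and the sketch assumes the CSV may sit inside a sub-folder of the archive.

```python
import zipfile
from pathlib import Path


def extract_csv(zip_path, csv_name, dest_dir="."):
    """Extract the first archive member whose name ends with csv_name.

    extract_csv is a hypothetical helper written for this article.
    """
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            if member.endswith(csv_name):
                # Write the file straight into dest_dir, dropping any folder prefix.
                target = Path(dest_dir) / Path(member).name
                target.write_bytes(archive.read(member))
                return target
    raise FileNotFoundError(f"{csv_name} not found in {zip_path}")


# Example usage (the zip file name here is an assumption; use your own download):
# extract_csv("police-data.zip", "2024-08-btp-street.csv")
```

This places the CSV file in the current directory, which is where the notebook code below expects to find it.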

Notebook

Let's now start to fill out our notebook.

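First, we'll need to install the following. The code in this article uses the h3 version 3 function names (geo_to_h3 and h3_to_geo_boundary), which were renamed in version 4, so we'll pin h3 below version 4:

!pip install geopandas "h3<4" pandas --quiet --no-warn-script-location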

Next, we'll import some libraries:

import geopandas as gpd
import pandas as pd

from h3 import h3
from shapely.geometry import Polygon

Next, we'll read the CSV file into a pandas DataFrame, keep only the column we need, and create a GeoPandas GeoDataFrame with the correct coordinate system.

df = pd.read_csv("2024-08-btp-street.csv")

crimes = gpd.GeoDataFrame(
    df["Crime type"],
    geometry = gpd.points_from_xy(df.Longitude, df.Latitude),
    crs = "EPSG:4326"
)

crimes.head(5)

The output should be similar to the following:

                     Crime type                   geometry
0                 Bicycle theft  POINT (-0.32764 50.83438)
1                 Vehicle crime  POINT (-0.32764 50.83438)
2                   Other theft  POINT (-0.23643 50.83255)
3  Violence and sexual offences  POINT (-0.23643 50.83255)
4                   Other theft  POINT (-3.55862 54.64503)
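A row without coordinates cannot be placed in a hexagon. If your download contains rows with a blank Longitude or Latitude, one option is to drop them before building the geometry. The following is a minimal sketch on a small inline sample; the blank row is made up for illustration, and the column names are the ones used above.

```python
import io

import pandas as pd

# Illustrative rows only; the second row deliberately has no coordinates.
sample_csv = io.StringIO(
    "Crime type,Longitude,Latitude\n"
    "Bicycle theft,-0.32764,50.83438\n"
    "Other theft,,\n"
    "Vehicle crime,-0.23643,50.83255\n"
)

sample = pd.read_csv(sample_csv)

# Keep only the rows where both coordinates are present.
located = sample.dropna(subset=["Longitude", "Latitude"])

print(f"{len(sample)} rows read, {len(located)} rows with coordinates")
```

The same dropna call could be applied to df straight after pd.read_csv, before the GeoDataFrame is created.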

We'll now convert our geometry to H3 using code from an excellent article. Initially, we'll set h3_level (the H3 resolution) to 5, and then we'll try a smaller value, which produces larger hexagons.

h3_level = 5

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

def lat_lng_to_h3(row):
    return h3.geo_to_h3(
        row.geometry.y, row.geometry.x, h3_level
    )

crimes["h3"] = crimes.apply(lat_lng_to_h3, axis = 1)

crimes.head(5)

The output should be similar to the following:

                     Crime type                   geometry               h3
0                 Bicycle theft  POINT (-0.32764 50.83438)  85194a73fffffff
1                 Vehicle crime  POINT (-0.32764 50.83438)  85194a73fffffff
2                   Other theft  POINT (-0.23643 50.83255)  85194a73fffffff
3  Violence and sexual offences  POINT (-0.23643 50.83255)  85194a73fffffff
4                   Other theft  POINT (-3.55862 54.64503)  85195097fffffff
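To get a feel for what h3_level controls, the sketch below estimates the average cell size at a given level without using the h3 library. It relies on the fact that H3 has 2 + 120 * 7^r cells at resolution r, and it divides an approximate figure for the Earth's surface area by that count. The function names are hypothetical and the areas are averages, not the exact area of any particular hexagon.

```python
# Approximate surface area of the Earth in square kilometres (an assumption
# that is good enough for a rough estimate).
EARTH_SURFACE_KM2 = 510_065_622


def h3_cell_count(resolution):
    """Total number of H3 cells at a resolution: 2 + 120 * 7**resolution."""
    return 2 + 120 * 7 ** resolution


def average_cell_area_km2(resolution):
    """Rough average cell area, assuming cells share the surface equally."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)


for level in (3, 5):
    print(level, h3_cell_count(level), round(average_cell_area_km2(level)))
```

Each step down in level makes the average cell about seven times larger, so level 3 cells cover roughly 49 times the area of level 5 cells.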

Next, we'll aggregate the number of crimes:

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

counts = (crimes.groupby(["h3"])
                .h3.agg("count")
                .to_frame("count")
                .reset_index()
)

counts.head(5)

The output should be similar to the following:

                h3  count
0  851870d3fffffff      1
1  851870dbfffffff      1
2  85187433fffffff      1
3  85187463fffffff      1
4  8518746bfffffff      1
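The first five rows all show a count of 1 because groupby orders the result by cell id, not by count. To look at the busiest hexagons instead, the result can be sorted. The following is a minimal sketch on a small stand-in for counts; the cell ids are copied from the outputs above, but the counts are made up for illustration.

```python
import pandas as pd

# A stand-in for the counts DataFrame; the count values are illustrative only.
counts = pd.DataFrame({
    "h3": ["851870d3fffffff", "85194a73fffffff", "85195097fffffff"],
    "count": [1, 4, 1],
})

# Sort so that the hexagons with the most crimes come first.
busiest = counts.sort_values("count", ascending=False).head(2)

print(busiest)
```

The same sort_values call works on the real counts DataFrame.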

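Now, we'll convert the H3 cell ids to polygons that can be visualised. The second argument to h3_to_geo_boundary (geo_json) is set to True so that each boundary point is returned as (longitude, latitude), which is the x, y order that Shapely expects: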

# https://spatialthoughts.com/2020/07/01/point-in-polygon-h3-geopandas/

def add_geometry(row):
    points = h3.h3_to_geo_boundary(
        row["h3"], True
    )
    return Polygon(points)

counts["geometry"] = counts.apply(add_geometry, axis = 1)

counts.head(5)

The output should be similar to the following:

                h3  count                                           geometry
0  851870d3fffffff      1  POLYGON ((-5.3895450275705175 50.2383686006280...
1  851870dbfffffff      1  POLYGON ((-5.618770916484385 50.21578200467119...
2  85187433fffffff      1  POLYGON ((-4.402362230572691 50.46432248133249...
3  85187463fffffff      1  POLYGON ((-5.0895660767735675 50.4018208682358...
4  8518746bfffffff      1  POLYGON ((-5.319050950902007 50.38000939979234...

We'll also wrap the result in a GeoDataFrame so that it carries the correct coordinate system (EPSG:4326):

crimes_h3 = gpd.GeoDataFrame(counts, crs = "EPSG:4326")

crimes_h3.head(5)

The output should be similar to the following:

                h3  count                                           geometry
0  851870d3fffffff      1  POLYGON ((-5.38955 50.23837, -5.48931 50.18359...
1  851870dbfffffff      1  POLYGON ((-5.61877 50.21578, -5.71845 50.16074...
2  85187433fffffff      1  POLYGON ((-4.40236 50.46432, -4.5027 50.41076,...
3  85187463fffffff      1  POLYGON ((-5.08957 50.40182, -5.18967 50.34749...
4  8518746bfffffff      1  POLYGON ((-5.31905 50.38001, -5.41907 50.32542...

Finally, we'll plot the data:

btp_crimes = crimes_h3.plot(
    column = "count",
    cmap = "OrRd",
    edgecolor = "black",
    figsize = (7, 7),
    legend = True,
    legend_kwds = {
        "label" : "Number of crimes",
        "orientation" : "vertical"
    }
)

btp_crimes.set_axis_off()

h3_level set to 5 will render the chart shown in Figure 1.

Figure 1. h3_level = 5.

Changing the value of h3_level to 3 and re-running the code will render the chart shown in Figure 2.

Figure 2. h3_level = 3.

London and the South East have higher numbers of BTP-recorded crimes than other parts of the United Kingdom.

Summary

Using Uber's H3, we have been able to create some useful charts. H3 could be used in many different application domains. Feel free to experiment with different h3_level settings and also try your own dataset.
