H3 intro
It looks like a futuristic model of a soccer ball designed for the World Cup, doesn’t it? In reality, it’s planet Earth with its continents visible under an H3 grid, which is the subject of today’s discussion.
The simplest definition of H3 is: a geospatial indexing system that partitions the world into hexagonal cells. H3 is a hierarchical system (every hexagonal cell has 7 children one level below it) with 16 resolution levels, numbered from 0 to 15. The system was developed by Uber and its first version was released in 2018.
The idea behind the hexagonal system is presented below.
We give up the discrete representation of a phenomenon in favor of a continuous, cell-based one. You decide which resolution fits your specific scenario (the average cell size ranges from ~4.3 million square kilometers down to just under 1 square meter). Details for each resolution are listed in the table below. Note that at every resolution the grid contains exactly 12 pentagons (a sphere cannot be covered with hexagons alone).
Res | Number of cells | Average hexagon area (km²) |
---|---|---|
0 | 122 | 4,357,449.416078381 |
1 | 842 | 609,788.441794133 |
2 | 5,882 | 86,801.780398997 |
3 | 41,162 | 12,393.434655088 |
4 | 288,122 | 1,770.347654491 |
5 | 2,016,842 | 252.903858182 |
6 | 14,117,882 | 36.129062164 |
7 | 98,825,162 | 5.161293360 |
8 | 691,776,122 | 0.737327598 |
9 | 4,842,432,842 | 0.105332513 |
10 | 33,897,029,882 | 0.015047502 |
11 | 237,279,209,162 | 0.002149643 |
12 | 1,660,954,464,122 | 0.000307092 |
13 | 11,626,681,248,842 | 0.000043870 |
14 | 81,386,768,741,882 | 0.000006267 |
15 | 569,707,381,193,162 | 0.000000895 |
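The cell counts in the table follow a simple pattern, which can be checked with a few lines of Python. This is a minimal sketch; the function names are my own and not part of the H3 API. It relies on the fact that each hexagon has 7 children while each of the 12 pentagons has only 6 (one pentagon and five hexagons), which leads to the closed form 2 + 120 · 7^res.

```python
PENTAGONS = 12  # present at every resolution


def cell_count(res: int) -> int:
    """Total number of cells (hexagons + pentagons) at a given resolution."""
    if not 0 <= res <= 15:
        raise ValueError("H3 resolutions run from 0 to 15")
    return 2 + 120 * 7 ** res


def next_count(count: int) -> int:
    """Cell count one resolution finer, derived from the children per cell."""
    hexagons = count - PENTAGONS
    return 7 * hexagons + 6 * PENTAGONS


# Reproduce the "Number of cells" column of the table.
for res in range(16):
    print(res, f"{cell_count(res):,}")
```

Both functions agree with each other at every level, so the "7 children" rule and the 12 pentagons are enough to explain the whole column.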
Let’s consider a pseudo-random point with the following coordinates (somewhere in Warsaw, the capital of Poland):
Using the H3 API we can easily obtain the H3 index of the cell that contains this point (the API is originally written in C, but bindings for many languages are available). An H3 index is the unique identifier of a cell. It is traditionally written as a hexadecimal number, but in some implementations (ClickHouse, for example) the index is represented as a 64-bit integer (a BigInt in JS).
Res | H3 index (hexadecimal) | H3 index (BigInt) |
---|---|---|
0 | 801ffffffffffff | 577023702256844799 |
1 | 811f7ffffffffff | 581518505791193087 |
2 | 821f57fffffffff | 586019356639494143 |
3 | 831f53fffffffff | 590522681388957695 |
4 | 841f53dffffffff | 595026272426393599 |
5 | 851f53cbfffffff | 599529866685054975 |
6 | 861f53c97ffffff | 604033465641336831 |
7 | 871f53c91ffffff | 608537065168044031 |
8 | 881f53c91dfffff | 613040664793317375 |
9 | 891f53c91d3ffff | 617544264419901439 |
10 | 8a1f53c91d0ffff | 622047864047075327 |
11 | 8b1f53c91d0bfff | 626551463674429439 |
12 | 8c1f53c91d095ff | 631055063301789183 |
13 | 8d1f53c91d094bf | 635558662929159359 |
14 | 8e1f53c91d094b7 | 640062262556529847 |
15 | 8f1f53c91d094b0 | 644565862183900336 |
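The two columns are the same number written in two ways, and the hexadecimal form makes the hierarchy visible. The sketch below assumes the cell index layout described in the H3 documentation (1 reserved bit, 4 mode bits, 3 reserved bits, 4 resolution bits, 7 base-cell bits, then 15 three-bit digits); `decode` is a hypothetical helper written for this article, not a function of any H3 binding.

```python
def decode(index_hex: str) -> dict:
    """Split an H3 cell index into the fields of its 64-bit layout."""
    h = int(index_hex, 16)        # hexadecimal string -> 64-bit integer
    res = (h >> 52) & 0xF         # 4 bits: resolution (0-15)
    base_cell = (h >> 45) & 0x7F  # 7 bits: which of the 122 resolution-0 cells
    # 15 slots of 3 bits each; only the first `res` slots are used,
    # the remaining ones are filled with 7 (binary 111).
    digits = [(h >> (3 * (15 - level))) & 0x7 for level in range(1, res + 1)]
    return {
        "int": h,
        "mode": (h >> 59) & 0xF,  # 1 means "cell"
        "res": res,
        "base_cell": base_cell,
        "digits": digits,
    }


info = decode("8a1f53c91d0ffff")  # the resolution 10 row of the table
print(info)
print(format(info["int"], "x"))   # back to the hexadecimal form
```

Each digit selects one of the children of the cell above it, which is why consecutive rows of the table share a growing common prefix.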
Alright, let’s get to the point. What are the practical applications of the H3 system? Consider the images below.
All three images show the boundaries of my hometown. The first (from the left) is pure GeoJSON, which is the most accurate representation of the boundaries. The other two are H3-based approximations (note how close the resolution 10 approximation is to the original). The total number of hexagons is:
- resolution 8: 465
- resolution 10: 22727
And now the problem: check whether a point belongs to a given city (or, more generally, whether a point lies inside a polygon). As we can read on Wikipedia, this is not a simple task at all. With H3 we can convert the point into a cell and check whether that cell is present in a hash map of the city’s cells, which reduces the lookup to O(1) on average, at the price of a small approximation error along the boundary. A similar approach can be used for country boundaries or even house boundaries (remember that the smallest hexagon covers around 1 square meter).
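The lookup itself is nothing more than a set membership test. Here is a minimal sketch, assuming the city’s cells have already been exported as hexadecimal indexes (one per line) and the point has already been converted to a cell of the same resolution by an H3 binding. The one-element cell list reuses the resolution 10 index from the table above purely as a stand-in for the 22727 real cells, and `load_cells` and `in_city` are hypothetical helpers.

```python
def load_cells(lines) -> set:
    """Parse hexadecimal H3 indexes (one per line) into a set of integers."""
    return {int(line.strip(), 16) for line in lines if line.strip()}


def in_city(point_cell_hex: str, city_cells: set) -> bool:
    """One hash lookup, independent of how complex the boundary is."""
    return int(point_cell_hex, 16) in city_cells


# Stand-in data: in practice this would be the full exported cell list.
city_cells = load_cells(["8a1f53c91d0ffff\n"])

print(in_city("8a1f53c91d0ffff", city_cells))
# A different resolution 10 cell (last digit changed from 1 to 2).
print(in_city("8a1f53c91d17fff", city_cells))
```

Storing the indexes as integers rather than strings keeps the set compact, which matters once it holds tens of thousands of cells.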
By the way, the images come from a web app I wrote some time ago, called hexifier. It lets you convert GeoJSON into cells (at a resolution you choose) and then export them as CSV or plain text. If you want to understand the H3 system better, I recommend playing around with this tool.
Another example. At work (I work for a company called MobilePhysics) we collect a lot of environmental data (tens of millions of points per day). The simplest (and also the fastest) way to visualize it is to use H3 cells. Below you can see average PM2.5 values across the globe, presented as H3 cells of resolution 3. This is where the hierarchical structure works like a charm: the API lets us look up the parent of a cell at any coarser resolution, which makes averaging simple.
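To show why averaging is simple, here is a rough sketch of grouping readings by their resolution 3 parent. In real code you would call the parent function of your H3 binding; to keep the example free of dependencies, `cell_to_parent` below is my own bit-level re-implementation that assumes the documented index layout (resolution in bits 52-55, fifteen 3-bit digits in the lowest 45 bits). The PM2.5 values are made up, and the two indexes come from the table above.

```python
from collections import defaultdict


def cell_to_parent(cell: int, parent_res: int) -> int:
    """Return the ancestor of `cell` at the coarser resolution `parent_res`."""
    res = (cell >> 52) & 0xF
    if not 0 <= parent_res <= res:
        raise ValueError("parent resolution must not be finer than the cell's")
    # Overwrite the resolution field...
    cell = (cell & ~(0xF << 52)) | (parent_res << 52)
    # ...and mark every digit below the parent resolution as unused (7).
    return cell | ((1 << (3 * (15 - parent_res))) - 1)


def average_by_parent(readings, parent_res: int = 3) -> dict:
    """Average (cell, value) readings per parent cell."""
    sums = defaultdict(lambda: [0.0, 0])  # parent -> [running total, count]
    for cell, value in readings:
        acc = sums[cell_to_parent(cell, parent_res)]
        acc[0] += value
        acc[1] += 1
    return {parent: total / n for parent, (total, n) in sums.items()}


# Made-up PM2.5 values attached to the resolution 15 and 12 indexes.
readings = [(0x8F1F53C91D094B0, 10.0), (0x8C1F53C91D095FF, 20.0)]
for parent, mean in average_by_parent(readings).items():
    print(format(parent, "x"), mean)
```

Readings taken at different resolutions fall into the same bucket as long as they share an ancestor, so no geometry is needed at aggregation time.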
Do you feel the potential hidden in the H3 geospatial indexing system? If not, give it a chance and try it on your own scenario! There are plenty of functions that help you use H3 in the way that suits you best.
I hope you enjoyed the examples. That’s all for today, but I have good news: I’ve already started working on the next article, my annual technology summary. See you soon!