Cubist artists analyzed and reassembled objects in an abstracted form that was complex but insightful. We can use a multi-dimensional, clustering-based approach for mapping geographic time series that draws a close parallel to Cubism and may do a better job of depicting major trends than conventional techniques.
Cubism was a highly abstracted, early-20th-century visual art style. “Analytical Cubism” refers to the initial phase of this movement.
Cubist artists rejected the notion that art should aim to perfectly capture nature. Classical notions of perspective were left behind as instead the 2-dimensionality of the canvas was emphasized and embraced. As Wikipedia contributors describe, “In Cubist artwork, objects are analyzed, broken up and reassembled in an abstracted form; instead of depicting objects from a single viewpoint, the artist depicts the subject from a multitude of viewpoints to represent the subject in a greater context.”
One painting by Marcel Duchamp depicts a human figure walking down a set of stairs and serves effectively as a static representation of change through time. Cubism excels in visually and statically depicting complex objects and multi-dimensional concepts.
Geographic Time Series
Another sort of complex object is the geographic time series dataset. These datasets are collections of geographic features where each feature has some sort of value or measurement that changes between moments of time. I’ll use taxi cab dropoffs in New York City as an example, where my features will be hexagonal grid cells overlaid on the city with measurements of the number of dropoffs that occurred in that grid cell during each hour of the day, from midnight to midnight, over a two-year timespan.
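To make the shape of this data concrete, here is a minimal sketch of how dropoff events could be rolled up into one 24-value profile per grid cell. The `(cell_id, hour)` input format, the cell names, and the function name are assumptions made for illustration, not the structure of the actual taxi dataset.

```python
from collections import defaultdict

HOURS_PER_DAY = 24


def hourly_profiles(dropoffs):
    """Count dropoffs per grid cell for each hour of the day.

    `dropoffs` is an iterable of (cell_id, hour) pairs with hour in 0..23.
    Returns {cell_id: [count_for_hour_0, ..., count_for_hour_23]}.
    """
    profiles = defaultdict(lambda: [0] * HOURS_PER_DAY)
    for cell_id, hour in dropoffs:
        profiles[cell_id][hour] += 1
    return dict(profiles)


# Made-up events: (hypothetical hexagon id, hour of day).
events = [("hex_a", 8), ("hex_a", 8), ("hex_a", 17), ("hex_b", 23)]
profiles = hourly_profiles(events)
```

Each feature ends up as a row of 24 measurements, which is the form the rest of the techniques in this post operate on.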
The most typical and effective way to show a time series is with a simple line chart. However, line charts quickly turn into jumbled messes of spaghetti when trying to depict hundreds or thousands of individual features. Line charts also fail to provide geographic context when history, environment, demographics, or spatial relationships can be critical to understanding temporal trends in a region or place.
So how can we communicate both temporal and spatial patterns?
Existing methods fall on a spectrum from lossy to lossless, meaning they either generalize away much of the data’s information or they present the data in complete detail. Let’s examine the strengths and shortcomings of these approaches.
Aggregating measurements to a mean, maximum, minimum, or some other measure over the full range of time is clear and easy to read. However, this method fails to provide any information about hourly measurements. It’s completely lossy.
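A small sketch shows how much an aggregate can hide. The two profiles below are made-up values, shortened to six time steps for readability, and the cell names are hypothetical.

```python
from statistics import mean

# Made-up dropoff counts for two hypothetical grid cells.
profiles = {
    "hex_commuter": [0, 0, 5, 20, 5, 0],  # one sharp peak
    "hex_steady": [5, 5, 5, 5, 5, 5],     # flat all day
}


def summarize(profile):
    """Collapse a whole time series into three summary numbers."""
    return {"mean": mean(profile), "max": max(profile), "min": min(profile)}


summaries = {cell: summarize(values) for cell, values in profiles.items()}
```

Both cells share the same mean even though their days look nothing alike, so a map of means would color them identically.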
Another choice may be to show the absolute change between moments of time. These types of maps are straightforward, but they still miss out on most of the available temporal information. A map depicting change in number of taxi dropoffs between 6:00 AM and 12:00 PM may answer one question, but it leaves complete uncertainty for the other 22 hours of the day. This approach is still super lossy.
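As a minimal sketch of the same problem for change maps, the two 24-hour profiles below (made-up numbers) have an identical 6:00 AM to 12:00 PM change while behaving very differently the rest of the day.

```python
def absolute_change(profile, start_hour, end_hour):
    """Difference between two hourly measurements of one grid cell."""
    return profile[end_hour] - profile[start_hour]


# Made-up hourly counts, index 0 = midnight, index 23 = 11:00 PM.
gradual_rise = [10] * 7 + [20, 30, 40, 40, 40] + [40] * 12
morning_spike = [10] * 7 + [90, 80, 70, 60, 50] + [40] + [0] * 11

change_a = absolute_change(gradual_rise, 6, 12)
change_b = absolute_change(morning_spike, 6, 12)
```

A change map would give both cells the same symbol, even though one spikes right after 6:00 AM and goes quiet in the afternoon.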
To capture the full range of moments, small multiples are common. These are usually fascinating displays, and because they’re lossless, small multiples make a great resource for poring over all the information. However, they are limited by display space, and teasing out patterns requires lots of mental and visual work from the reader. There’s no guarantee that everyone walks away having seen the same patterns.
The most obvious way to show a geographic time series might be animation. With animation, the passage of time is represented in the most literal way possible. This is a lossless approach very similar to small multiples. However, animation places a massive cognitive load on the map reader, forcing them to simultaneously:
- remember the past
- watch for changes
- track time
- understand the map symbology
The invisible gorilla is a good example of how we can be confident in our visual attention, but actually miss the real story. What might we be missing when we watch animated maps?
Is there another option that falls somewhere between these extremes of straightforward-yet-lossy and lossless-yet-burdensome?
Treating a geographic time series as a multi-dimensional dataset, where each measurement is a dimension, opens the possibility of using clustering techniques to tease out major trends. Below are the four major trends from the New York City taxi dropoffs dataset. This approach makes use of tried-and-true line charts, which work well when depicting a small number of trends, like the four clusters here.
Here, each grid cell feature is assigned to one of four clusters based on normalized rates of between-measurement change. K-means clustering looks at each feature and its hourly measurements and assigns it to a group of features that are most similar. The chart below, an annotated version of the blue trend in the figure above, is representative of the first cluster.
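The clustering step can be sketched in plain Python. This is an illustration under stated assumptions, not the code behind the map: the normalization (first differences scaled by the largest absolute step), the farthest-point initialization, the short six-step profiles, and the cell names are all choices made for this example.

```python
def normalized_change(profile):
    """Between-measurement change, scaled so the largest step has size 1."""
    diffs = [later - earlier for earlier, later in zip(profile, profile[1:])]
    scale = max(abs(d) for d in diffs) or 1  # avoid dividing by zero on flat profiles
    return [d / scale for d in diffs]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: squared_distance(point, centroids[i]))


def kmeans(points, k, max_iterations=100):
    """Plain k-means (Lloyd's algorithm) with deterministic farthest-point seeding."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    for _ in range(max_iterations):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)] if members else centroids[i])
        if updated == centroids:
            break
        centroids = updated
    return [nearest(p, centroids) for p in points], centroids


# Made-up profiles: two early-peaking cells and two late-peaking cells
# with very different volumes.
profiles = {
    "hex_a": [0, 10, 20, 10, 0, 0],
    "hex_b": [0, 100, 200, 100, 0, 0],
    "hex_c": [0, 0, 0, 5, 10, 5],
    "hex_d": [0, 0, 0, 50, 100, 50],
}
cells = list(profiles)
points = [normalized_change(profiles[cell]) for cell in cells]
labels, centroids = kmeans(points, k=2)
clusters = dict(zip(cells, labels))
```

Because the profiles are normalized before clustering, a quiet cell and a busy cell with the same daily rhythm land in the same group. The cluster centroids are themselves short time series, which is what makes them easy to draw as line charts.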
The real magic happens when grid cells on the map are colored based on the cluster they were assigned to. The spatial correlation is striking.
You may also notice the very distinctive red cluster and that two clusters are nearly the same color green. In assigning colors to the clusters, I’ve reprojected the cluster centroids into a perceptual color space so that similar clusters have similar colors and outlying clusters stand out. In this case, the bright red grid cells correspond directly with the unique traffic dropoff patterns of JFK and LaGuardia Airports, while the similar green cells mix throughout Brooklyn, Queens, the Bronx, and northern Manhattan.
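One way to get this effect, as a rough sketch rather than the exact method used for the map above: place each centroid on the a*/b* plane of CIELAB according to where it sits relative to the two most different centroids, then convert to sRGB. The landmark projection here is a crude stand-in for a proper dimensionality reduction such as PCA or MDS, and the function names, lightness, and radius values are assumptions.

```python
import math


def lab_to_srgb(L, a, b):
    """Convert CIELAB (D65 white) to an sRGB hex string, clamping out-of-gamut values."""
    def f_inv(t):
        delta = 6 / 29
        return t ** 3 if t > delta else 3 * delta ** 2 * (t - 4 / 29)

    fy = (L + 16) / 116
    x = 0.95047 * f_inv(fy + a / 500)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fy - b / 200)
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def gamma(c):
        c = min(max(c, 0.0), 1.0)
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return "#" + "".join(f"{round(gamma(c) * 255):02x}" for c in linear)


def centroid_colors(centroids, lightness=65, radius=70):
    """Give similar centroids similar colors and distant centroids distinct ones."""
    # Landmarks: the two centroids that are farthest apart.
    p, q = max(((c1, c2) for c1 in centroids for c2 in centroids),
               key=lambda pair: math.dist(*pair))
    span = math.dist(p, q) or 1.0
    colors = []
    for c in centroids:
        # Signed distance along the line from p to q.
        along = sum((ci - pi) * (qi - pi) for ci, pi, qi in zip(c, p, q)) / span
        # Distance away from that line.
        off = math.sqrt(max(math.dist(c, p) ** 2 - along ** 2, 0.0))
        a_star = (2 * along / span - 1) * radius
        b_star = (2 * off / span - 1) * radius
        colors.append(lab_to_srgb(lightness, a_star, b_star))
    return colors


# Made-up 2-D centroids: two near-identical clusters and one outlier.
colors = centroid_colors([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
```

Working in a perceptual space matters because equal distances in CIELAB look roughly equally different to the eye, which is not true of raw RGB values.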
To take things a step further, I’ll also adjust the opacity (alpha) of each grid cell based on how well it fits into its assigned cluster. Now, the most prominent hexagons are the ones that are most representative of their cluster.
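A minimal sketch of one possible fit measure: take each feature's distance to its cluster centroid and scale it within the cluster, so the best-fitting cell is fully opaque and the worst-fitting cell fades to a floor value. The distance-based measure, the `min_alpha` floor, and the function name are assumptions for illustration.

```python
import math


def fit_alphas(points, labels, centroids, min_alpha=0.2):
    """Opacity per feature: 1.0 at the centroid, min_alpha at the cluster's worst fit."""
    distances = [math.dist(p, centroids[label]) for p, label in zip(points, labels)]
    worst = {}
    for d, label in zip(distances, labels):
        worst[label] = max(worst.get(label, 0.0), d)
    return [
        1.0 if worst[label] == 0 else 1.0 - (1.0 - min_alpha) * d / worst[label]
        for d, label in zip(distances, labels)
    ]


# Made-up features in a single cluster centered on the origin.
points = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
alphas = fit_alphas(points, labels=[0, 0, 0], centroids=[[0.0, 0.0]])
```

Scaling within each cluster, rather than across the whole map, keeps a tight cluster from looking uniformly bold next to a loose one.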
Remember those Cubist paintings? They analyzed and reassembled objects in an abstracted form that was complex but insightful. This major-trends approach for mapping geographic time series draws a close parallel. The simplicity of absolute change maps or the realism of map animations is tempting, but we can also look to the Cubist art movement for ideas about creating abstracted views of geographic time series that provide a fuller context than conventional methods.
Major trends in United States population between 1950 and 2010:
- Check out geo-time-series on GitHub for replicating these techniques in a web browser
- Ping me with your questions and thoughts: @aaronpdennis
- This is a recap of my talk from NACIS 2016