tl;dr

Cubist artists analyzed and reassembled objects in an abstracted form that was complex but insightful. We can map geographic time series with a multi-dimensional, clustering-based approach that draws a close parallel to cubism and may do a better job of depicting major trends than conventional techniques.

Analytical Cubism

Cubism was a highly abstracted, early 20th century visual art style. “Analytical Cubism” refers to the initial phase of this movement.


Girl with a Mandolin, Pablo Picasso, 1910



Cubist artists rejected the notion that art should aim to perfectly capture nature. They left classical notions of perspective behind and instead emphasized and embraced the two-dimensionality of the canvas. As Wikipedia contributors describe, “In Cubist artwork, objects are analyzed, broken up and reassembled in an abstracted form—instead of depicting objects from a single viewpoint, the artist depicts the subject from a multitude of viewpoints to represent the subject in a greater context.”


Violon (Violin), Pablo Picasso, 1911-12



One painting by Marcel Duchamp depicts a human figure walking down a set of stairs and serves effectively as a static representation of change through time. Cubism excels in visually and statically depicting complex objects and multi-dimensional concepts.


Nude Descending a Staircase, Marcel Duchamp, 1912



Geographic Time Series

Geographic time series datasets are another sort of complex object. These datasets are collections of geographic features where each feature has some sort of value or measurement that changes between moments of time. I’ll use taxi cab dropoffs in New York City as an example, where my features will be hexagonal grid cells overlaid on the city with measurements of the number of dropoffs that occurred in that grid cell during each hour of the day, from midnight to midnight, over a two-year timespan.
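
To make the shape of such a dataset concrete, here is a minimal sketch of how individual dropoffs could be rolled up into one 24-value series per grid cell. The record layout, the cell ids, and the dates are hypothetical, and the step that finds which hexagon a dropoff falls in is assumed to have happened already.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical records: (grid cell id, dropoff timestamp). In practice the
# cell id would come from a point-in-hexagon lookup, which is omitted here.
dropoffs = [
    ("cell_a", datetime(2015, 3, 2, 8, 15)),
    ("cell_a", datetime(2015, 3, 2, 8, 50)),
    ("cell_a", datetime(2015, 3, 3, 18, 5)),
    ("cell_b", datetime(2015, 3, 2, 23, 40)),
]


def hourly_counts(records):
    """Return {cell_id: [count for hour 0, ..., count for hour 23]}."""
    counts = defaultdict(lambda: [0] * 24)
    for cell_id, timestamp in records:
        # Dropoffs from every day are pooled by hour of day.
        counts[cell_id][timestamp.hour] += 1
    return dict(counts)


series = hourly_counts(dropoffs)
print(series["cell_a"][8])  # dropoffs in cell_a between 8:00 and 8:59
```

Each grid cell ends up as a list of 24 numbers, which is the form the rest of the examples in this post assume.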


New York City hexagonal grid for counting taxi cab dropoffs



The most typical and effective way to show a time series is with a simple line chart. However, line charts quickly turn into jumbled messes of spaghetti when trying to depict hundreds or thousands of individual features. Line charts also fail to provide geographic context when history, environment, demographics, or spatial relationships can be critical to understanding temporal trends in a region or place.

So how can we communicate both temporal and spatial patterns?

Existing methods fall on a spectrum from lossy to lossless, meaning they either generalize away much of the data’s information or they provide the data in complete detail. Let’s examine the strengths and shortcomings of these approaches.


Current methods of displaying geographic time series

Current methods of displaying geographic time series ranging from lossy to lossless.


Aggregating measurements to a mean, maximum, minimum, or some other measure over the full range of time is clear and easy to read. However, this method fails to provide any information about hourly measurements. It’s completely lossy.
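
As a rough illustration of how much this throws away, here is a small sketch of the mean aggregation. The counts are made up, only four time steps are shown, and comparing each cell against the mean of all cells is an assumption about how the above/below split is drawn.

```python
from statistics import mean

# Hypothetical counts per grid cell (only four time steps for brevity).
series = {
    "cell_a": [2, 10, 30, 18],
    "cell_b": [1, 3, 4, 4],
    "cell_c": [0, 1, 2, 1],
}

# One number per cell: the mean over all time steps.
cell_means = {cell: mean(counts) for cell, counts in series.items()}

# Threshold for the map: the mean of the per-cell means.
overall_mean = mean(list(cell_means.values()))

# Each cell is reduced to a single category, regardless of its hourly shape.
above_or_below = {
    cell: "above" if value > overall_mean else "below"
    for cell, value in cell_means.items()
}
print(above_or_below)
```

Every cell collapses to one label, so two cells with completely different daily rhythms can look identical on the map.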


Aggregate to mean

Aggregating to above or below the mean number of dropoffs.


Another choice may be to show the absolute change between moments of time. These types of maps are straightforward, but they still miss out on most of the available temporal information. A map depicting change in number of taxi dropoffs between 6:00 AM and 12:00 PM may answer one question, but it leaves complete uncertainty for the other 22 hours of the day. This approach is still super lossy.
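
A change map boils down to one subtraction per grid cell. The sketch below uses invented 24-hour series and treats index 6 as 6:00 AM and index 12 as 12:00 PM; those conventions are assumptions for illustration.

```python
# Hypothetical 24-value series per grid cell, indexed by hour of day.
series = {
    "cell_a": [5] * 6 + [20] * 6 + [60] * 12,
    "cell_b": [30] * 12 + [10] * 12,
}


def absolute_change(counts, start_hour, end_hour):
    """Difference in dropoffs between two hours of the day."""
    return counts[end_hour] - counts[start_hour]


# Change between 6:00 AM and 12:00 PM; the other 22 values are never read.
change = {cell: absolute_change(counts, 6, 12) for cell, counts in series.items()}
print(change)
```

Only two of the 24 measurements per cell ever reach the map.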


Absolute change

Showing the change in number of dropoffs between two moments in time.


To capture the full range of moments, small multiples are common. These are usually fascinating displays and because they’re lossless, small multiples make a great resource for poring over all the information. However, they are limited by display space, and teasing out patterns requires lots of mental and visual work from the reader. There’s no guarantee that everyone walks away having seen the same patterns.


Small multiples

Showing the number of dropoffs every hour in a small multiple display.


The most obvious way to show a geographic time series might be animation. With animation, the passage of time is represented in the most literal way possible. This is a lossless approach very similar to small multiples. However, animation places a massive cognitive load on the map reader, forcing them to simultaneously:

  • remember the past
  • watch for changes
  • track time
  • understand the map symbology

The invisible gorilla is a good example of how we can be confident in our visual attention, but actually miss the real story. What might we be missing when we watch animated maps?


Animation

Animation showing the number of dropoffs every hour.


Is there another option that falls somewhere between these extremes of straightforward-yet-lossy and lossless-yet-burdensome?


Another option



Treating a geographic time series as a multi-dimensional dataset, where each measurement is a dimension, opens the possibility of using clustering techniques to tease out major trends. Below are the four major trends from the New York City taxi dropoffs dataset. This approach makes use of tried-and-true line charts, which work well when depicting a small number of trends, like the four clusters here.


Four major trends

Four major trends in taxi cab dropoffs based on multidimensional clustering of the geographic time series.


Here, each grid cell feature is assigned to one of four clusters based on normalized rates of between-measurement change. K-means clustering looks at each feature and its hourly measurements and assigns it to a group of features that are most similar. The chart below, an annotated version of the blue trend in the figure above, is representative of the first cluster.
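
The idea can be sketched in a few lines of plain Python. This is not the exact code behind the maps: the normalization (scaling each series by its own peak before differencing), the farthest-point initialization, and the tiny four-step series are all assumptions chosen to keep the example small and deterministic. A real project would more likely reach for a library implementation of k-means.

```python
import math


def change_features(counts):
    """Scale a series by its peak, then difference consecutive steps."""
    peak = max(counts) or 1  # avoid dividing by zero for empty cells
    scaled = [c / peak for c in counts]
    return [b - a for a, b in zip(scaled, scaled[1:])]


def farthest_point_init(points, k):
    """Deterministic start: each new center is the point farthest from the rest."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm: assign to nearest center, then recompute means."""
    centers = farthest_point_init(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points
        ]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical series: two rising cells and two falling cells of different size.
series = {
    "cell_a": [10, 20, 40, 80],
    "cell_b": [1, 2, 4, 8],
    "cell_c": [80, 40, 20, 10],
    "cell_d": [8, 4, 2, 1],
}

cells = list(series)
points = [change_features(series[cell]) for cell in cells]
labels, centers = kmeans(points, k=2)
assignment = dict(zip(cells, labels))
print(assignment)
```

Because the features describe the shape of change rather than raw volume, a busy cell and a quiet cell with the same daily rhythm land in the same cluster.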


Annotated chart

The first trend determined by a multidimensional clustering of the time series shows few dropoffs in the early morning, more during the daytime, and most occurring after the work day.


The real magic happens when grid cells on the map are colored based on the cluster they were assigned. The spatial correlation is striking.


Trends mapped

Major trends mapped


You may also notice the very distinctive red cluster and that two clusters are nearly the same color green. In assigning colors to the clusters, I’ve reprojected the cluster centroids into a perceptual color space so that similar clusters have similar colors and outlying clusters stand out. In this case, the bright red grid cells correspond directly with the unique traffic dropoff patterns of JFK and LaGuardia Airports, while the similar green cells mix throughout Brooklyn, Queens, the Bronx, and northern Manhattan.


Geographic context

Some labels for geographic context


And to take things a step further, I’ll also adjust the transparency (the alpha, or opacity) of each grid cell based on how well it fits into its assigned cluster. Now, the most prominent hexagons are the ones that are most representative of their cluster.
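
One simple way to turn "how well it fits" into an opacity is to compare each cell's distance from its cluster center with the worst distance in that cluster. The sketch below does that; the linear scaling, the `min_alpha` floor, and the sample points and centers are assumptions for illustration, not necessarily the measure used for the map.

```python
import math


def fit_alphas(points, labels, centers, min_alpha=0.2):
    """Opacity per point: 1.0 at the cluster center, min_alpha at its worst outlier."""
    dists = [math.dist(p, centers[lab]) for p, lab in zip(points, labels)]

    # Largest distance to the center within each cluster.
    worst = {}
    for d, lab in zip(dists, labels):
        worst[lab] = max(worst.get(lab, 0.0), d)

    alphas = []
    for d, lab in zip(dists, labels):
        fit = 1.0 if worst[lab] == 0 else 1.0 - d / worst[lab]
        alphas.append(min_alpha + (1.0 - min_alpha) * fit)
    return alphas


# Hypothetical feature vectors, cluster labels, and cluster centers.
points = [[0.0, 0.0], [1.0, 0.0], [4.0, 0.0], [10.0, 10.0]]
labels = [0, 0, 0, 1]
centers = [[0.0, 0.0], [10.0, 10.0]]

alphas = fit_alphas(points, labels, centers)
print(alphas)
```

Keeping a floor above zero means even the poorest-fitting hexagons stay faintly visible instead of disappearing from the map.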


Transparency adjusted

Grid cells made more transparent if they were outliers within their assigned cluster


Cubist Maps

Remember those Cubist paintings? They analyzed and reassembled objects in an abstracted form that was complex but insightful. This major-trends approach for mapping geographic time series draws a close parallel. The simplicity of absolute change maps and the realism of map animations are tempting, but we can also look to the Cubist movement for ideas about creating abstracted views of geographic time series that provide a fuller context than conventional methods.

Major trends in United States population between 1950 and 2010:


Trends in US population

Major trends in United States population from 1950 to 2010. Source: https://www.nhgis.org/


Notes

Further Reading

Jonathan Schroeder’s urban core trend maps