
The Data-Visualization Revolution

Virtual “telescopes” for big data make it possible to see through the deluge

Galileo was the first person to discover that Jupiter was not alone but was accompanied by other celestial objects that revolved around it. Ultimately, his finding provided the strongest evidence against the geocentric view of the universe, demoting Earth from the center of the universe to a mere planet orbiting the sun.

We would argue that our ability to understand and visualize large sets of data is entering a stage of evolution similar to that of 17th-century astronomy. Like Galileo centuries ago, we now have primitive versions of tools that have the potential to become powerful ones. These tools allow us to explore the fluid landscape of bits, instead of the rigidity of atoms, giving rise to a new medium that is helping us comprehend the complex while simultaneously providing a new means of artistic expression.

As data visualizations leave behind the rigidity of traditional graphic design, and ink is replaced by pixels, we encounter the fluidity of working with designs that are not fully specified a priori. In its more modern incarnation, data visualization has generated a new form of graphic design where visual attributes such as lines, shapes and colors become nothing more than the corporeal reality of graphic objects whose soul is made of data. The new graphic designer no longer creates visualizations by choosing a rigid collection of shapes, positions and colors but rather by choosing the rules needed for data to breathe form into geometric abstractions.



These new visualizations have properties that were absent in their ink-based predecessors. These properties allow the emergence of a richer relationship between the visualization and the reader, who is now less of a spectator and more of an explorer. So the revolution is not just one of form but also of function.

Going back to our friend Galileo and his early telescope, what we now have is the power to hand out telescopes to anyone whose curiosity is piqued and who wants to learn more. Instead of studying single snapshots of topical issues or events in ever-greater detail, however, we can create “datascopes” that can be used to zoom in and out of large data sets in search of new understanding.

Public records are an obvious place on which to begin training our datascopes. Even when made public by law, these large collections of information are often inaccessible in practice. This lack of accessibility is mostly a matter of inadequate technology, because the data are already digitized—and in some cases even made available on unfriendly government Web sites.

Take DataViva as an example of a datascope that can help us democratize access to public records. Recently released by the Brazilian state of Minas Gerais, DataViva has opened up data for the entire formal sector of the Brazilian economy. DataViva is not built around links to files but around more than 100 million interactive visualizations that are organized into eight different “apps.” Thanks to DataViva, people anywhere in the world can now point their browsers to Brazil’s public data and explore the Brazilian economy at an unprecedented resolution. People interested in comparing the salaries paid in Rio and Minas, or looking to understand the industrial structure of Belo Horizonte and its opportunities, can now quickly and with relative ease bring these public records from Brazil to their minds.

 


Another example of a datascope is Pantheon, which makes available data collected to quantify global cultural development. Pantheon is similar to DataViva in that it visualizes data on human capacities. But instead of focusing on the capacities expressed in industries, such as motorcycle manufacturing, Pantheon focuses on the capacities expressed in human accomplishments, such as Newton’s theories or the songs of Elvis Presley.

Pantheon allows us to visualize the historical cultural production of the U.K. similarly to the way that DataViva allows us to explore the industrial structure of Belo Horizonte. Yet the true value of these datascopes does not reside in their ability to create one-off visualizations but in their ability to provide the frameworks needed to weave together stories that can be accurately told only from multiple points of view.

When using Pantheon to observe the evolution of cultural domains, one is reminded of Elizabeth Eisenstein’s book The Printing Press as an Agent of Change. In it, Eisenstein argues that the printing press changed not only the number of books published but also who was published, what was published and who became publishers. Eisenstein’s argument, similar to Marshall McLuhan’s “the medium is the message,” is that changes in media trigger profound changes in society by shifting who receives attention and hence bringing prominence to new cultural forms. In the case of the printing press, the forms that increased in popularity were the arts and sciences.

So let’s try to use Pantheon to quickly construct a simplified version of Eisenstein’s story. If we look at the world’s cultural production before the year A.D. 1300, we will notice that it is composed mostly of religious figures and political leaders. The arts and the sciences are conspicuously absent. Yet when we turn our gaze to the next 400 years, those spanning to 1700, we find a large increase in the cultural prominence of the arts. The next period, that between 1700 and 1900, includes the peak of science, which becomes the second-largest cultural domain in the 19th century. Together, these observations quickly confirm (although they certainly don’t prove) Eisenstein’s theory.


Although Pantheon’s findings are consistent with Eisenstein’s story, to really hit the nail on the head we need to move beyond the printing press and look at other changes in communication technology. Consider the first half of the 20th century: with the rise of the radio and the silver screen, actors and singers became some of the most popular cultural products. This extends Eisenstein’s original ideas to new broadcasting technologies. Yet there is more. The second half of the 20th century, and the rise of television, introduced yet another change in communication technology, one that was accompanied by the rise of a new cultural icon: the sports figure, who for the first time in history rose to the status of global celebrity.

 

César A. Hidalgo directs the Center for Collective Learning at the Artificial and Natural Intelligence Institute (ANITI) of the University of Toulouse. He also holds appointments at the Harvard School of Engineering and Applied Sciences (SEAS), at the Alliance Manchester Business School (AMBS) of the University of Manchester, at the Toulouse School of Economics (TSE) and at the Institute of Advanced Study in Toulouse (IAST). He is the author of three books, including How Humans Judge Machines (MIT Press, 2021).
