2
KNOWLEDGE INFRASTRUCTURES UNDER SIEGE
Climate data as memory, truce, and target

Paul N. Edwards

Introduction

Data politics are not really about data per se. Instead, the ultimate stakes of how data are collected, stored, shared, altered, and destroyed lie in the quality of the knowledge they help to produce. The value of knowledge depends on trust in the infrastructures that create it, which themselves depend on trusted data. "Trust" and "truth" are closely related English words, and both are strongly connected to the "truce" of my title. Indeed, etymologically "true" and "truce" are in fact the same word.1 This chapter argues that both trust in climate knowledge infrastructures and the truth they deliver descend from the truces necessary to share and maintain climate data—all of them heavily contested in the contemporary United States.

The partial success of climate change denialism stems, in large part, from the recent arrival of what I call "hypertransparency": open data, open code, commodity software tools, and alternative publication venues have quite suddenly upended truces painstakingly built over multiple centuries. Not only climate knowledge, but virtually all scientific knowledge of public concern is affected by this transformation.

My title derives from Nelson and Winter's theory of organizational routines in An Evolutionary Theory of Economic Change (1982). In large organizations, Nelson and Winter posited, routines serve three critical purposes. First, they hold the organization's knowledge, storing organizational memory even as individual employees come and go. Argote (1999) later argued that automating routines transfers organizational knowledge to machines, a line of reasoning readily extended to many kinds of algorithms. Second, routines and technologies constitute a de facto truce among individuals and organizational elements whose goals and interests conflict. Routines, machines, and algorithms may incorporate workarounds and exception-handling procedures, and organizational culture can evolve to tolerate failures, delays, and protests (up to a point). Finally, routines, machines, and algorithms function as targets, in two senses. First, they embody the organization's aims and goals. Second, they serve as patterns or templates for managing new systems.

Nelson and Winter's paradigm cases were factories and corporations, but their theory fits most organizational contexts. It can be extended to encompass routines that cross organizational boundaries, as is typical in the case of infrastructures (Edwards et al. 2007). In the first part of this chapter, I show how the routines, machines, and algorithms of climate data systems fit Nelson and Winter's paradigm of memory, truce, and target. Against this background, I then engage a much less benign sense of the word "target"—namely, "an object aimed at in shooting"—in the context of aggressive assaults on environmental knowledge infrastructures in the contemporary United States. These attacks seek to dismantle the very sources of climate knowledge by defunding crucial instruments, satellites, and data analysis programs.

"Long data" and environmental knowledge

Some kinds of environmental knowledge, such as the monitoring of weather, water quality, air pollution, or seismic activity, concern what is happening in the present, often with a view to short-term prediction (hours to days). Many such systems collect "big data," e.g. from satellites or large sensor networks.
As environmental data make their way into archives, they become "long data" (Arbesman 2013). Long data enable scientists to track and understand environmental change.

Memory: environmental data and data models

Enduring, carefully curated collections of long data—scientific memory—are therefore among the most valuable resources of modern environmental science. Such collections require continual maintenance. Over time, all kinds of things change about how data are collected. Instruments and sensors get out of calibration, or are replaced with newer models with different characteristics. New human observers, new standards, new organizations, and new methods replace predecessors. National boundaries and political systems also change, sometimes with effects on data-collecting practices (Edwards 2010; Gitelman 2013). Such changes nearly always affect the relationship of today's data to those recorded in the past.

If a new digital thermometer (say) systematically reads 0.2°C higher than a previous model, this is because the instruments differ, not because the air temperature has actually changed. To keep the long-term record consistent, scientists must adjust data to take account of these shifts. In the simple case just mentioned, when graphing temperature change over time, they might add 0.2°C to all readings taken by the older instrument. Far from falsifying data, this practice actually "truthifies" the long-term record. Adjusting data was once a manual routine; today it is usually handled by computerized algorithms. I often call such algorithms "data models," because they encode models of how different instruments or data sources relate to each other.

In addition, many environmental sensing instruments, such as those flown on satellites, produce data that require interpretation by algorithms before they can be used for most purposes. Since data analysis algorithms change—often but not always for the better—data interpreted with older algorithms must be continually re-interpreted using newer ones. For example, many major satellite data sets are "versioned," or re-issued after reprocessing with revised algorithms. At this writing, the 35-year record of tropospheric temperatures from satellite-mounted microwave sounding units is on version 6, with multiple minor releases between major ones (Spencer, Christy, and Braswell 2017).

Routines for homogenizing data—whether enacted by people, machines, or algorithms—embody scientific organizations' memory of how their data were previously created, just as Nelson and Winter argued. Perhaps counterintuitively, this ongoing adjustment—and revisions of the techniques used to make those adjustments—proves critical to maintaining and even improving the integrity of the long-term record. Different versions of data sets often, though not always, converge on similar results as analysis techniques improve. Further improvement—i.e., further revision, potentially reversing previous convergence—is almost always possible, and no data set is ever entirely definitive. In A Vast Machine (2010), I called this phenomenon "shimmering data."
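The instrument-change adjustment described above is simple enough to sketch in a few lines of code. The following Python fragment is a minimal, hypothetical illustration, not any real homogenization algorithm: the station readings, the swap date, and the 0.2°C offset are invented to match the example in the text. It simply shifts readings taken by the older instrument so that the whole series is expressed on the newer instrument's baseline.

```python
from datetime import date

# Hypothetical "data model": reconciling a temperature series across an
# instrument change. As in the example above, the new digital thermometer
# reads 0.2 degrees C higher than its predecessor, so readings from the
# older instrument are raised by 0.2 to keep the record consistent.
# All dates and values below are invented for illustration.

INSTRUMENT_SWAP_DATE = date(2005, 1, 1)   # assumed date of the instrument change
OLD_INSTRUMENT_OFFSET = 0.2               # degrees C, from the example above

raw_series = [
    (date(2004, 12, 30), 11.4),   # old thermometer
    (date(2004, 12, 31), 11.6),   # old thermometer
    (date(2005, 1, 1), 12.1),     # new thermometer
    (date(2005, 1, 2), 11.9),     # new thermometer
]

def homogenize(series, swap_date, offset):
    """Return the series adjusted to the newer instrument's baseline."""
    return [
        (day, temp + offset if day < swap_date else temp)
        for day, temp in series
    ]

adjusted_series = homogenize(raw_series, INSTRUMENT_SWAP_DATE, OLD_INSTRUMENT_OFFSET)
for day, temp in adjusted_series:
    print(day.isoformat(), f"{temp:.1f}")
```

Real homogenization routines are of course far more elaborate (for instance, detecting undocumented breakpoints statistically rather than relying on a known swap date), but the principle is the same: the adjustment encodes a model of how two instruments relate, which is what the term "data model" is meant to capture.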
Truce: keeping things running

Nelson and Winter's idea of routines as truces among conflicting groups and goals appears very clearly in the record of many environmental data systems. An extreme example of one such truce: even during the Cuban Missile Crisis in 1962, the weather services of Cuba, the USSR, and the United States continued to exchange weather data (Schatz 1978). With only a few exceptions, from the late 19th century to the present only actual war interrupted the routine flow of meteorological data among national weather services—even those of pariah states such as North Korea.

When a scientific field depends on data from many sources, such as the weather services of the world's 190-plus nations, conflicts among the various contributors are almost inevitable. In meteorology, such conflicts have occurred over observing hours, temperature and pressure scales, instrument placement, instrument housings, methods of calculating averages, reporting deadlines, and many other details of exactly when, where, and how observational data are taken and recorded. To share data in any meaningful way, these differences must be reconciled (Bowker 2000; 2005). Before computers, this was done primarily by standard-setting and laborious post hoc calculation; after computers, algorithms took over the calculations.

The dramatic advance of the environmental and geosciences resulted in large part from routines-as-truces surrounding data. In meteorology, merchant ships from many nations began keeping standardized weather logs in the 1850s; these logs represented a truce among nations that used different temperature scales and other norms. The International Meteorological Organization, formed in the 1870s, promoted common standards and data sharing (Daniel 1973). The World Data Centers established during the 1957–58 International Geophysical Year, at the height of the Cold War, today remain critical institutions supporting virtually all of the geosciences: soil, solid Earth, climate, oceanography, and others (Aronova 2017; International Council of Scientific Unions 2016). Examples are easily multiplied.

Target: routines and technologies as patterns for new systems

By representing the new and unfamiliar in terms of something old and well understood, the page, file folder, and wastebasket icons of modern computers helped enable the transition from paper-based office information systems. Similarly, Nelson and Winter argued that when organizations confront new needs, they use their existing routines as "targets" for the design of new ones. In just this way, existing data handling routines serve as targets when new data sources become available. For example, satellite radiometers can measure atmospheric temperature, but they do so in a very different way from instruments that work by direct contact with the atmosphere, such as the thermometers carried on weather