Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning

Journal of Machine Learning Research 21 (2020) 1-43
Submitted 4/20; Revised 10/20; Published 11/20

Peter Henderson, Stanford University, Stanford, CA, USA
Jieru Hu, Facebook, Menlo Park, CA, USA
Joshua Romoff, Mila, McGill University, Montreal, QC, Canada
Emma Brunskill, Stanford University, Stanford, CA, USA
Dan Jurafsky, Stanford University, Stanford, CA, USA
Joelle Pineau, Facebook AI Research, Mila, McGill University, Montreal, QC, Canada

Editor: David Sontag

Abstract

Accurate reporting of energy and carbon usage is essential for understanding the potential climate impacts of machine learning research. We introduce a framework that makes this easier by providing a simple interface for tracking realtime energy consumption and carbon emissions, as well as generating standardized online appendices. Utilizing this framework, we create a leaderboard for energy efficient reinforcement learning algorithms to incentivize responsible research in this area as an example for other areas of machine learning. Finally, based on case studies using our framework, we propose strategies for mitigation of carbon emissions and reduction of energy consumption. By making accounting easier, we hope to further the sustainable development of machine learning experiments and spur more research into energy efficient algorithms.

Keywords: energy efficiency, green computing, reinforcement learning, deep learning, climate change

1. Introduction

Global climate change is a scientifically well-recognized phenomenon and appears to be accelerated due to greenhouse gas (GHG) emissions such as carbon dioxide or equivalents (CO2eq) (Crowley, 2000; IPCC, 2018). The harmful health and safety impacts of global climate change are projected to “fall disproportionately on the poor and vulnerable” (IPCC, 2018).
Energy production remains a large factor in GHG emissions, contributing about 25% of GHG emissions in 2010 (IPCC, 2018). With the compute and energy demands of many modern machine learning (ML) methods growing exponentially (Amodei and Hernandez, 2018), ML systems have the potential to significantly contribute to carbon emissions. Recent work has demonstrated these potential impacts through case studies and suggested various mitigating strategies (Strubell et al., 2019; Schwartz et al., 2019).

©2020 Peter Henderson, Jieru Hu, Joshua Romoff, Emma Brunskill, Dan Jurafsky, Joelle Pineau. License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/20-312.html.

Systematic and accurate measurements are needed to better estimate the broader energy and carbon footprints of ML, in both research and production settings. Accurate accounting of carbon and energy impacts aligns incentives with energy efficiency (Schwartz et al., 2019), raises awareness, and drives mitigation efforts (Sundar et al., 2018; LaRiviere et al., 2016), among other benefits.¹ Yet, most ML research papers do not regularly report energy or carbon emissions metrics.²

We hypothesize that much research does not report energy and carbon metrics in part because collecting them is complex. Collecting carbon emission metrics requires understanding emissions from energy grids, recording power outputs from GPUs and CPUs, and navigating among different tools to accomplish these tasks. To reduce this overhead, we present experiment-impact-tracker³, a lightweight framework for consistent, easy, and more accurate reporting of energy, compute, and carbon impacts of ML systems. In Section 4, we introduce the design and capabilities of our framework and the accounting issues we aim to solve with it.
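The accounting steps named above (recording power draw over time, then applying a grid emission factor) can be sketched in a few lines of Python. This is an illustrative sketch only and is not the experiment-impact-tracker API: the names `PowerSample`, `energy_kwh`, and `co2eq_grams` are hypothetical, the power readings are assumed to come from some external sampler, and the carbon intensity passed in is a placeholder rather than a measured value for any real grid.

```python
from dataclasses import dataclass
from typing import Sequence

JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


@dataclass(frozen=True)
class PowerSample:
    """One power reading (hypothetical type, for illustration only)."""
    t_seconds: float  # time of the reading, in seconds
    watts: float      # instantaneous power draw, in watts


def energy_kwh(samples: Sequence[PowerSample]) -> float:
    """Integrate power over time with the trapezoidal rule; return kWh."""
    joules = 0.0
    for prev, cur in zip(samples, samples[1:]):
        dt = cur.t_seconds - prev.t_seconds
        if dt < 0:
            raise ValueError("samples must be ordered by time")
        # Average of the two readings times the elapsed time.
        joules += 0.5 * (prev.watts + cur.watts) * dt
    return joules / JOULES_PER_KWH


def co2eq_grams(kwh: float, intensity_g_per_kwh: float) -> float:
    """Convert energy to emissions given a grid carbon intensity (gCO2eq/kWh)."""
    if kwh < 0 or intensity_g_per_kwh < 0:
        raise ValueError("energy and carbon intensity must be non-negative")
    return kwh * intensity_g_per_kwh


if __name__ == "__main__":
    # Placeholder readings: a constant 1000 W draw sampled over one hour.
    readings = [PowerSample(0.0, 1000.0), PowerSample(3600.0, 1000.0)]
    used = energy_kwh(readings)
    # 500 gCO2eq/kWh is an arbitrary placeholder, not data for a real region.
    print(used, co2eq_grams(used, 500.0))
```

Keeping the energy integration separate from the emissions conversion means the same energy measurement can be re-evaluated under different grid intensities without re-running the experiment.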
Section 5 expands on the challenges of using existing accounting methods and discusses what we learned from analyzing experiments with experiment-impact-tracker. For example, in an empirical case study on image classification algorithms, we demonstrate that floating point operations (FPOs), a common measure of efficiency, are often uncorrelated with energy consumption as measured by experiment-impact-tracker.

In Section 6, we focus on recommendations for promoting energy-efficient research and mitigation strategies for carbon emissions. Using our framework, we present a Reinforcement Learning Energy Leaderboard in Section 6.1.1 to encourage development of energy efficient algorithms. We also present a case study in machine translation to show how regional energy grid differences can result in large variations in CO2eq emissions. Emissions can be reduced by up to 30x just by running experiments in locations powered by more renewable energy sources (Section 6.2). Finally, we suggest systemic and immediate changes based on our findings:

• incentivizing energy-efficient research through leaderboards (Section 6.1)
• running experiments in carbon-friendly regions (Section 6.2)
• reducing overheads for utilizing efficient algorithms and resources (Section 7.1)
• considering energy-performance trade-offs before deploying energy hungry models (Section 7.2)
• selecting efficient test environments, especially in RL (Section 7.3)
• ensuring reproducibility to reduce energy consumption from replication difficulties (Section 7.4)
• consistently reporting energy and carbon metrics (Section 7.5)

1. See Section 4.1 for an extended discussion on the importance of accounting.
2. See Section 3 and Appendix B for more information.
3. https://github.com/Breakend/experiment-impact-tracker

2. Related Work

Estimating GHG emissions and their downstream consequences is important for setting regulatory standards (U.S. Environment Protection Agency, 2013) and encouraging self-regulation (Byerly et al., 2018). In particular, these estimates are used to set carbon emissions reduction targets and in turn set carbon prices for taxes or emissions trading systems.⁴

A large body of work has examined modeling and accounting of carbon emissions⁵ at different levels of granularity: at the global scale (IPCC, 2018); using country-specific estimates (Ricke et al., 2018); targeting a particular industrial sector like Information and Communication Technologies, for example, modeled by Malmodin et al. (2013); or even targeting a particular application like bitcoin mining, for example, modeled by Mora et al. (2018). At the application level, some work has already modeled carbon impacts specifically in computationally intensive settings like bitcoin mining (Krause and Tolaymat, 2018; Stoll et al., 2019; Zade et al., 2019; Mora et al., 2018). Such application-specific efforts are important for prioritizing emissions mitigation strategies: without understanding projected impacts, policy decisions could focus on ineffective regulation. However, with large amounts of heterogeneity and endogeneity in the underlying data, it can be difficult to model all aspects of an application’s usage. For example, one study suggested that “bitcoin emissions alone could push global warming above 2 °C” (Mora et al., 2018). But Masanet et al. (2019), Houy (2019), and others criticized the underlying modeling assumptions which led to such large estimates of carbon emissions. This shows that it is vital that these models provide accurate measurements if they are to be used for informed decision making.

With ML models getting more computationally intensive (Amodei and Hernandez, 2018), we want to better understand how machine learning in research and industry impacts climate change.
However, estimating aggregate climate change impacts of ML research and applications would require many assumptions due to a current lack of reporting and accounting. Instead, we aim to emphasize and aid systematic reporting strategies such that accurate field-wide estimates can be conducted in the future.

Some recent work specifically investigates climate impacts of machine learning research. Strubell et al. (2019) demonstrate the issue of carbon and energy impacts of large NLP models by evaluating estimated power usage and carbon emissions for a set of case studies. The authors suggest that: “authors should report training time and sensitivity to hyperparameters”, “academic researchers need equitable access to computation resources”, and “researchers should prioritize computationally efficient hardware and algorithms”. Schwartz et al. (2019) provide similar proposals, suggesting floating point operations (FPOs) as a guiding efficiency metric. Lacoste et al. (2019) recently provided a website for estimating carbon emissions based on GPU type, experiment length, and cloud provider. In Section 5, we discuss how while the estimation methods of these works provide some understanding of carbon and energy impacts,

4. An emissions trading system is a cap on total allowed carbon emissions for a company with permits issued. When a company emits a certain amount of carbon, it trades in a permit, creating a market for emissions permits. This is a market-based approach to incentivize emission reductions. See Ramstein et al. (2019) for a description of such carbon pricing efforts across different countries.
5. See also assorted examinations on carbon accounting, standardized reporting, and policy recommendations (Stechemesser and Guenther, 2012; Dayarathna et al., 2015; IPCC, 2018; Ajani
