Arxiv:1903.05192V2 [Physics.Space-Ph] 3 Apr 2019 Orsodn Uhr .Camporeale, E
Total Page:16
File Type:pdf, Size:1020Kb
manuscript submitted to Space Weather The Challenge of Machine Learning in Space Weather Nowcasting and Forecasting E. Camporeale CIRES, University of Colorado, Boulder, CO, USA Centrum Wiskunde & Informatica, Amsterdam, The Netherlands Key Points: • A review on the use of machine learning in space weather. • A gentle introduction to machine learning concepts tailored for the space weather community. • A selection of challenges on how to beneficially combine machine learning and space weather. arXiv:1903.05192v2 [physics.space-ph] 3 Apr 2019 Corresponding author: E. Camporeale, [email protected] –1– manuscript submitted to Space Weather Abstract The numerous recent breakthroughs in machine learning make imperative to carefully ponder how the scientific community can benefit from a technology that, although not necessarily new, is today living its golden age. This Grand Challenge review paper is focused on the present and future role of machine learning in space weather. The pur- pose is twofold. On one hand, we will discuss previous works that use machine learning for space weather forecasting, focusing in particular on the few areas that have seen most activity: the forecasting of geomagnetic indices, of relativistic electrons at geosynchronous orbits, of solar flares occurrence, of coronal mass ejection propagation time, and of so- lar wind speed. On the other hand, this paper serves as a gentle introduction to the field of machine learning tailored to the space weather community and as a pointer to a num- ber of open challenges that we believe the community should undertake in the next decade. The recurring themes throughout the review are the need to shift our forecasting paradigm to a probabilistic approach focused on the reliable assessment of uncertainties, and the combination of physics-based and machine learning approaches, known as gray-box. 1 Artificial Intelligence: is this time for real? The history of Artificial Intelligence (AI) has been characterized by an almost cycli- cal repetition of springs and winters: periods of high, often unjustified, expectations, large investments and hype in the media, followed by times of disillusionment, pessimism, and cutback in funding. Such a cyclical trend is not atypical for a potentially disruptive tech- nology, and it is very instructive to try to learn lessons from (in)famous AI predictions of the past (Armstrong, Sotala, & O´ hEigeartaigh,´ 2014), especially now that the debate about the danger of Artificial General Intelligence (AGI, that is AI pushed to the level of human ability) is in full swing (S. Russell & Bohannon, 2015; S. J. Russell & Norvig, 2016). Indeed, it is unfortunate that most of the AI research of the past has been plagued by overconfidence and that many hyperbolic statements had very little scientific basis. Even the initial Dartmouth workshop (1956), credited with the invention of AI, had un- derestimated the difficulty of understanding language processing. At the time of writing some experts believe that we are experiencing a new AI spring (e.g. Bughin & Hazan, 2017; Olhede & Wolfe, 2018), that possibly started as early as 2010. This might or might not be followed by yet another winter. Still, many reckon that this time is different, for the very simple reason that AI has finally entered industrial pro- duction, with several of our everyday technologies being powered by AI algorithms. In fact, one might not realize that, for instance, most of the time we use an app on our smart- phone, we are using a machine learning algorithm. The range of applications is indeed very vast: fraud detection (Aleskerov, Freisleben, & Rao, 1997), online product recom- mendation (Pazzani & Billsus, 2007; Ye, Zhang, & Law, 2009), speech recognition (Hin- ton et al., 2012), language translation (Cho et al., 2014), image recognition (Krizhevsky, Sutskever, & Hinton, 2012), journey planning (Vanajakshi & Rilett, 2007), and many others. Leaving aside futuristic arguments about when, if ever, robotic systems will replace scientists (Hall, 2013), I think this is an excellent time to think about AI for a space weather scientist, and to try formulating (hopefully realistic) expectations on what our commu- nity can learn from embracing AI in a more systematic way. Other branches of physics have definitely been more responsive to the latest developments in Machine Learning. Notable examples in our neighbor field of astronomy and astrophysics are the automatic identification of exoplanets from the Kepler catalog (Kielty et al., 2018; Pearson, Palafox, & Griffith, 2017; Shallue & Vanderburg, 2018), the analysis of stellar spectra from Gaia (Fabbro et al., 2017; X.-R. Li, Pan, & Duan, 2017), and the detection of gravitational waves in LIGO signals (George & Huerta, 2018). Each generation has its own list of science fiction books and movies that have made young kids fantasize about what the future will look like after AGI will finally be achieved. –2– manuscript submitted to Space Weather Without digressing too much, I would just like to mention one such iconic movie for my generation, the Terminator saga. In the second movie, a scene is shown from the cyborg point of view. The cyborg performs what is today called a segmentation problem, that is identifying single, even partially hidden, objects from a complex image (specifically, the movie’s hero is intent in choosing the best motorcycle to steal). The reason I am men- tioning this particular scene is that, about 30 years later, a landmark paper has been pub- lished showing that solving a segmentation problem is not science fiction anymore (see Figure 1) (He, Gkioxari, Doll´ar, & Girshick, 2017). Not many other technologies can claim to have made fiction come true, and in a such short time frame! Figure 1. Top: scene from the Terminator 2 movie (1991). Bottom: examples of segmenta- tion problems as solved by Mask R-CNN (2018) (He et al., 2017). 2 The Machine Learning renaissance One of the reasons why the current AI spring might be very different from all the previous ones and in fact never revert to a winter is the unique combination of three fac- tors that have never been simultaneously experienced in our history. First, as we all know, we live in the time of big data. The precise meaning of what constitutes big data depends on specific applications. In many fields the data is securely guarded as the gold mine on which a company’s wealth is based (even more than proprietary, but imitable, algorithms). Luckily, in the field of Space Weather most of the data and associated software is released to the public.(National Academies of Sciences & Medicine, 2018) The second factor is the recent advancement in Graphics Processing Units (GPU) computing. In the early 2000s GPU producers (notably, Nvidia) were trying to extend their market to the scientific community by depicting GPUs as accelerators for high per- formance computing (HPC), hence advocating a shift in parallel computing where CPU –3– manuscript submitted to Space Weather clusters would be replaced by heterogeneous, general-purpose, GPU-CPU architectures. Even though many such machines exist today, especially in large HPC labs worldwide, I would think that the typical HPC user has not been persuaded to fully embrace GPU computing (at least in space physics), possibly because of the steep learning curve re- quired to proficiently write GPU codes. More recently, during the last decade, it has be- come clear that a much larger number of users (with respect to the small niche of HPC experts) was ready to enter the GPU market: machine learning practitioners (along with bitcoin miners!). And this is why GPU companies are now branding themselves as en- ablers of the machine learning revolution. It is certainly true that none of the pioneering advancements in machine learning would have been possible without GPUs. As a figure of merit, the neural network NAS- net, that delivers state-of-the-art results on classification tasks of ImageNet and CIFAR- 10 datasets, required using 500 GPUs for 4 days (including search of optimal architec- ture) (Zoph, Vasudevan, Shlens, & Le, 2017). Hence, a virtuous circle, based on a larger and larger number of users and customers has fueled the faster than Moore’s law increase in GPU speed witnessed in the last several years. The largest difference between the two group of GPU users targeted by the industry, i.e. HPC experts and machine learning practitioners (not necessarily experts) is in their learning curve. While a careful design and a deep knowledge of the intricacies of GPU architectures is needed to successfully accelerate an HPC code on GPUs, it is often sufficient to switch a flag for a machine learn- ing code to train on GPUs. This fundamental difference leads us to the third enabling factor of the machine learning renaissance: the huge money investments from IT companies, that has started yet another virtuous circle in software development. Indeed, companies like Google or Facebook own an unmatchable size of data to train their machine learning algorithms. By realizing the profitability of machine learning applications, they have largely contributed to the advancement of machine learning, especially making their own software open-source and relatively easy to use (see, e.g., Abadi et al., 2016). Arguably, the most successful applications of machine learning are in the field of computer vision. Maybe because im- age recognition and automatic captioning are tasks that are very easy to understand for the general public, this is the field where large IT companies have advertised their suc- cesses to the non-experts and attempted to capitalize them. Classical examples are the Microsoft bot that guesses somebody’s age (https://www.how-old.net), that got 50 mil- lion users in one week, and the remarkably good captioning bot www.captionbot.ai (see Figure 2 from Donahue et al. (2015) for a state-of-the-art captioning example).