
Creating Synthetic Radar Imagery Using Convolutional Neural Networks

MARK S. VEILLETTE, ERIC P. HASSEY, CHRISTOPHER J. MATTIOLI, HAIG ISKENDERIAN, AND PATRICK M. LAMEY

Lincoln Laboratory, Massachusetts Institute of Technology, Lexington, Massachusetts

(Manuscript received 24 January 2018, in final form 2 October 2018)

ABSTRACT

In this work deep convolutional neural networks (CNNs) are shown to be an effective model for fusing heterogeneous geospatial data to create radar-like analyses of precipitation intensity (i.e., synthetic radar). The CNN trained in this work has a directed acyclic graph (DAG) structure that takes inputs from multiple data sources with varying spatial resolutions. These data sources include geostationary satellite imagery (a 1-km visible band and four 4-km infrared bands), lightning flash density from Earth Networks' Total Lightning Network, and numerical model data from NOAA's 13-km Rapid Refresh model. A regression is performed in the final layer of the network using NEXRAD-derived data mapped onto a 1-km grid as a target variable. The outputs of the CNN are fused with analyses from NEXRAD to create seamless radar mosaics that extend to offshore sectors and beyond. The model is calibrated and validated using both NEXRAD and spaceborne radar from NASA's Global Precipitation Measurement (GPM) Mission's Core Observatory satellite. The advantages over a random forest-based approach used in previous works are discussed.

1. Introduction

Depictions of storm location and intensity obtained from weather radar are extremely important for public safety, transportation, agriculture, tourism, and several other areas. The United States is covered by a network of 159 Weather Surveillance Radar-1988 Dopplers (WSR-88Ds or, more commonly, NEXRAD[1]; Crum and Alberty 1993), which are long-range S-band radars that provide frequent and detailed analyses of reflectivity (dBZ), radial velocity, and a number of polarimetric variables. Precipitation mosaics constructed from the NEXRAD network enable meteorologists and operational forecasters to track and anticipate impactful weather events, such as precipitation, thunderstorms, hail, snow, tornadoes, and other forms of hazardous weather.

Weather radar is particularly important in air transportation (Evans et al. 2006). In the United States, air traffic controllers (ATC) and air traffic managers (ATM) employed by the Federal Aviation Administration (FAA) rely heavily on weather radar to track storms that can impact safety and efficiency in the National Airspace System (NAS). However, the NAS includes areas both inside and outside of land-based radar range, leaving many offshore and oceanic controllers without direct access to the weather information required for proper air traffic management. The left panel of Fig. 1 demonstrates this shortcoming by showing storm activity (indicated by lightning detection) outside radar range that is not depicted in the NEXRAD mosaic. This lack of adequate situational awareness may be detrimental to aviation safety and can lead to inefficiencies in the NAS.

This deficiency motivated the development of the Offshore Precipitation Capability (OPC; Veillette et al. 2016, 2015; Veillette and DeLaura 2016; Ryan 2016), a system that fills in gaps outside the coverage of weather radar with synthetic radar: a radar-like depiction of precipitation created by combining data from multiple nonradar sources that provide coverage in areas where there is no weather radar. These nonradar data include lightning flash detections, geostationary satellite imagery, and numerical weather prediction (NWP) model output.

[1] At the time of this writing, the San Juan NEXRAD has been out of service since September 2017, when it was disabled by Hurricane Maria.

Denotes content that is immediately available upon publication as open access.

Corresponding author: Mark S. Veillette, [email protected]

DOI: 10.1175/JTECH-D-18-0010.1

© 2018 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses).

FIG. 1. (left) NEXRAD depiction of precipitation intensity (VIL) off the U.S. East Coast shown on a six-level scale. Lightning flashes offshore (white plus symbols) indicate storm activity outside the coverage of weather radar. (right) The same analysis augmented using the OPC described in this paper. OPC fills in storms outside radar range by fusing various nonradar sources using a convolutional neural network. This image is taken at 1330 UTC 2 Feb 2017.

Individually, these datasets do not provide depictions as descriptive as radar; however, OPC effectively combines these quantities and provides radar-like mosaics familiar to air traffic controllers and air traffic managers. The right panel of Fig. 1 shows an example of the NEXRAD mosaic augmented with the version of OPC described in this paper.

More formally, let R ∈ R^(M×N) be a 2D image[2] generated from weather radar data. In general, R can represent a number of meteorological quantities; however, our focus lies with images derived from radar reflectivity (dBZ) that represent precipitation intensity, for example, base reflectivity, composite reflectivity, echo top, vertically integrated liquid (VIL), and rainfall rate.[3] The image R is only valid in range of weather radar and is otherwise assigned a missing value. By "radar like" or synthetic radar, we mean an image R̃ that depicts the same meteorological quantity as R but is estimated from nonradar datasets. Let I_1, I_2, ..., I_n be a collection of images obtained from a set of nonradar sources. In OPC, R̃ is constructed using the conditional expectation given these additional data:

    R̃ = E[R | I_1, I_2, ..., I_n].    (1)

Typically, these nonradar data have a greater coverage umbrella than radar and thus are able to fill regions outside the radar range with radar-like estimates. With this method, radar mosaics with far greater coverage (potentially global) can be constructed.

[2] The term image here actually means a raster, which is an image embedded in some spatial reference system that maps between geocoordinates (latitude, longitude) and pixel coordinates through a predefined map projection M: S^2 → R^2. With this notion, we can discuss the distance between pixels in units of actual distance (e.g., km).

[3] Even though VIL and rainfall rate are not strictly "radar quantities," well-known methods exist that map reflectivity to estimates of liquid water content and rainfall.

The initial version of OPC utilized a random forest (OPC-RF; Veillette et al. 2016) to estimate the expectation in (1). This model was trained using NEXRAD as a target variable and generated imagery depicting VIL and echo tops,[4] which were chosen because of their relevance to air traffic control. The OPC-RF used a feature vector made up of roughly 100 subjectively determined features constructed from patches of input satellite, lightning, and numerical model images extracted around each pixel. While this approach provided adequate radar-like estimates of offshore and oceanic storms, this subjective feature extraction methodology is computationally expensive and raised the question of whether the constructed feature vectors were "optimal" for estimating radar-derived quantities.

[4] Echo tops are defined as the height of the 18-dBZ layer of reflectivity.

In this paper we show that convolutional neural networks (CNNs; LeCun et al. 1998; Zhang et al. 2016; Szegedy et al. 2015) are an effective class of models for generating radar-like depictions from nonradar data. We present a CNN-based OPC (OPC-CNN) that outperforms its predecessor (OPC-RF). CNNs have provided state-of-the-art results in many areas, including computer vision, remote sensing, natural language processing, and others (Krizhevsky et al. 2012; Zhang et al. 2015).

In many of these applications, CNNs are able to learn compact feature representations that can efficiently transform highly detailed pixel-level information into higher-order features relevant for a particular objective. These features can then be used for classification, segmentation, or, in the case of OPC, regression. With CNNs it is no longer necessary to subjectively engineer features from input images (as is done in OPC-RF); instead, the image data can be passed directly to the model. As we will show, this offers many advantages in accuracy and efficiency over the random forest method.

This paper is organized as follows. Section 2 provides a brief overview of related work. Section 3 describes the data used to train OPC-CNN, as well as the sampling and preprocessing steps involved in constructing the training set. The CNN architecture created for OPC is discussed in section 4, and the results of training and additional verification are presented in section 5. Methods to generalize the model outside the training domain are discussed in section 6.

2. Related work

Besides OPC-RF, a number of systems exist in both industry and academia that create a form of synthetic radar imagery. However, details of these systems are often proprietary or remain largely unpublished. In Iskenderian (2008), a probability matching method is used to map cloud-to-ground lightning densities to an estimate of VIL; this method was a major source of inspiration for OPC-RF. Commercial products that correlate cloud-to-ground and intercloud lightning detections to radar reflectivity are also available, for example, PulseRad (Earth Networks 2017) and Thunderstorm Manager (Vaisala 2017). Methods that rely more on satellite data and/or numerical model data include NowRad (2017). Based on publicly available descriptions of these products, none fuse together as much data as OPC (namely, lightning, geostationary satellite imagery, and numerical weather prediction model output) to provide a radar-like depiction of storm intensity.

A related problem that has been studied for decades is that of estimating surface precipitation (e.g., rainfall rate) from data sources other than rain gauges (Kidd and Huffman 2011; Tapiador et al. 2012). Since geostationary satellites (GEO) provide imagery at semiregular intervals, multichannel rainfall estimates derived from infrared (IR) satellite imagery have been developed (Ba and Gruber 2001; Behrangi et al. 2009; Hsu et al. 1999). Other methods are the CPC morphing technique (CMORPH; Joyce et al. 2004) and the Integrated Multisatellite Retrievals for GPM (IMERG; Huffman et al. 2017). These methods combine visible (VIS) and IR imagery from GEO satellites with the limited coverage of passive microwave (PMW) sounders, which are capable of more accurate precipitation estimates than IR sensors. Some work has also been done to combine data across multiple sensors, for example, combining lightning with satellite data (Grecu et al. 2000). While machine learning is used throughout these works, these other models often invoke much simpler methods and shallower models than those described here, for example, lookup tables, linear regression, and shallow, fully connected neural networks that do not generalize well to the multiple-input framework inherent in OPC.

3. Data

This section describes the data used to train and validate the OPC-CNN model as well as the preprocessing involved in preparing the training set. Four major data sources are used in training the model: (i) cloud-to-ground lightning flashes, (ii) VIS and IR imagery from geostationary satellite, (iii) six parameters from a numerical weather prediction model, and (iv) VIL obtained from weather radar data (the target variable). Two additional derived products, cloud-top height and solar zenith angle, are also created from these data and are used as additional training inputs. Validation data include GPM Dual-Frequency Precipitation Radar (DPR) reflectivity values that are preprocessed into VIL.

Incoming data are mapped onto a common map projection (e.g., Lambert equal area) using bilinear interpolation. The output horizontal resolution of this remapping is chosen based on the native resolution of each data source (specified below). From here on, the resolution of each data source will refer to units of this map projection (even though technically the true resolutions may vary spatially, as is also the case with geosynchronous footprints). After data are mapped onto a common set of grids, a patch selection methodology (explained below) is used to subsample input images and to create a training dataset for the CNN. Each sample in this training dataset consists of a time stamp t_i, latitude θ_i and longitude φ_i, and the set of image "patches" obtained from input sources that match this time and location. The following sections provide more detail on the data sources and the construction of this dataset.

a. Input data descriptions

1) LIGHTNING


Global lightning (LGHT) datasets are provided through Earth Networks' Total Lightning Network (Earth Networks 2017, unpublished data; Heckman 2014). The data consist of latitude-longitude, a time stamp, amplitude, multiplicity, and height information for all detected flashes, and are received every second. To be used in OPC, cloud-to-ground detections are geographically binned into grids with a resolution of 2 km. Three images of this type are constructed using 10-, 20-, and 30-min histories of lightning flash data. The binned lightning data are smoothed with three Gaussian kernels having standard deviations σ = 6, 12, and 18 km, respectively. This creates a set of nine lightning-derived images. As a final step, a logarithmic transform is applied to the density images to make the distributions less concentrated near zero.

2) VIS AND IR GEO SATELLITE

In this work satellite data for OPC are taken from the Geostationary Operational Environmental Satellite-13 (GOES-13). All imager channels provided by the satellite are utilized, including the VIS (0.65 μm) channel at 1-km horizontal resolution and four IR channels (3.9, 6.7, 10.7, and 13.3 μm), each at 4-km horizontal resolution. Data arrival times depend on the current GOES-13 schedule set by the Office of Satellite and Product Operations (OSPO). During routine operations, continental U.S. (CONUS) scans arrive roughly every 10-15 min, whereas the larger Northern Hemisphere scans arrive approximately every 30-45 min, and full disk scans arrive every 3 h. For any given time, the most recently available satellite images are selected for training. Archives of the GOES-13 data used in this work can be obtained from NOAA's Comprehensive Large Array-Data Stewardship System (CLASS) website (NOAA 2018).

3) NWP MODEL (MOD)

NWP models are capable of simulating a large number of meteorological quantities in regions not covered by traditional sensors, including 3D soundings of temperature, pressure, and humidity, as well as relevant 2D fields, such as precipitable water, convective available potential energy, and temperature and pressure slices at various layers of the atmosphere (e.g., surface, tropopause). In this paper, model fields are sampled from the National Oceanic and Atmospheric Administration (NOAA) 13-km Rapid Refresh (RAP) model (Earth System Research Laboratory 2017, https://rapidrefresh.noaa.gov/, unpublished data; Benjamin et al. 2016), which covers a large portion of North America, including the oceanic regions of interest. It is executed every hour and provides hourly forecasts out to 22 h. At any given target time t_i, to account for forecast latency, OPC uses the most recent 2-h forecast product valid at t_i. For convenience, these data are upsampled onto a 4-km grid using a bilinear interpolation scheme.

Although NWP models with much better resolution and update rates exist, such as the High-Resolution Rapid Refresh (HRRR), at the time of writing these models do not extend as far into the oceanic regions of interest. However, once higher-resolution models become available over these regions, the framework set out in the remainder of the paper is adaptable to include this information. A global model, such as the Global Forecast System (GFS; https://www.ncdc.noaa.gov/), can also be used as a lower-resolution alternative to RAP.

4) WEATHER RADAR

Radar data are used as the target variable for training our model. The main source of radar data used here is VIL mosaic images obtained from the Corridor Integrated Weather System (CIWS; MIT Lincoln Laboratory 2017, https://ciws.wx.ll.mit.edu/, unpublished data; Evans and Ducot 2006; Klingle-Wilson and Evans 2005). The CIWS uses NEXRAD data to generate VIL images at a 1-km resolution every 2.5 min that cover the CONUS. These data are plentiful; however, CIWS is available only over the United States and southern Canada, which may impact model skill when generalized to other parts of the world. When discussing VIL in the context of aviation applications, it is common to bin VIL into six intensity categories, or levels, defined using the thresholds 0.16 (level 1), 0.78 (level 2), 3.5 (level 3), 6.2 (level 4), 12 (level 5), and 32 kg m^-2 (level 6).

Radar data from the GPM DPR (NASA 2018) are used for model validation over areas not covered by NEXRAD. Though these data are not used to train the model, they are invaluable for assessing model performance in its intended domain. The data used for the assessments in this paper are version 06 level 2A DPR reflectivity (dBZ) and for brevity are referred to as "GPM DPR data." These reflectivity data are converted into VIL for use in validating the model.

5) DERIVED INPUTS

In addition to the aforementioned sources, two additional images are derived from the previous inputs. The first is an image of solar zenith angles (SZA), which represent the angle made between the sun and the zenith at each pixel. This angle is a function of location, time of day (particularly day/night), and time of year (Woolf 1968). Not only is SZA a useful proxy for time of day, but it also has a drastic influence on satellite data, particularly VIS data, which lose brightness as the SZA approaches 90° and become completely unavailable at night (SZA > 90°).
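For illustration, a minimal SZA computation is sketched below. It uses the standard spherical-astronomy relation cos(SZA) = sin(lat) sin(δ) + cos(lat) cos(δ) cos(h), with a low-order declination approximation and no equation-of-time correction; these simplifications are assumptions of the sketch, not the Woolf (1968) formulation cited above.

```python
import numpy as np

def solar_zenith_deg(lat_deg, lon_deg, day_of_year, utc_hours):
    """Approximate solar zenith angle (degrees) at a pixel."""
    lat = np.radians(lat_deg)
    # Cooper's low-order approximation of solar declination (radians)
    decl = np.radians(23.44) * np.sin(2 * np.pi * (284 + day_of_year) / 365.0)
    # local solar time and hour angle, ignoring the equation of time
    solar_time = utc_hours + lon_deg / 15.0
    hour_angle = np.radians(15.0 * (solar_time - 12.0))
    cos_sza = (np.sin(lat) * np.sin(decl)
               + np.cos(lat) * np.cos(decl) * np.cos(hour_angle))
    return np.degrees(np.arccos(np.clip(cos_sza, -1.0, 1.0)))
```

Evaluated on the image grid, values above 90° flag night pixels where the VIS channel carries no information.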

The other derived input is an estimate of cloud-top height (CTH), obtained by combining IR 10.7-μm brightness temperature with RAP model forecasts of temperature and pressure. In a method similar to that found in Donovan et al. (2008), the height of the cloud top is estimated by matching the satellite brightness temperature to a height from the vertical profile of atmospheric temperature generated from NWP output. CTH is computed on a 4-km grid, similar to the IR satellite data.

b. Training domain and patch selection

The data sources listed above generate far more data than are actually needed for training. Not only are the images they produce quite large, but the majority of pixels in any given image will likely contain little or no precipitation that would be detected by a radar. Moreover, not all sources provide the same coverage (radar being the most limited). To mitigate these issues, a training domain is used to limit where data are sampled from, and image patches are subsampled from these regions in a way that creates a balanced training set containing roughly equal portions of zero-intensity, low-intensity, and high-intensity VIL. This balancing is necessary so that the training set is not dominated by cases with low or no radar return. In an unbalanced training set sampled uniformly at random, high-intensity cases would be relatively sparse, and their effect on model weights would be minor. The resulting model would show a low area bias for high-intensity storms because the fit necessarily favors low intensity. By raising the frequency of high-intensity storms in the training data, the algorithm becomes more sensitive to these cases. While this balancing achieves the desired effect of raising the detection rate for high-intensity storms, it also increases the tendency to predict low- and medium-intensity storms and leads to a high area bias in these regimes. This effect is accounted for in postprocessing with a calibration described in section 5.

A training domain is defined as a pair D = (G, T) that contains a geographic component G and a temporal component T. The G component consists of a region that is adequately covered by all the input data sources listed above and is large enough to contain a range of possible weather scenarios that should be captured by the model. The T component consists of a time span from which to sample training data. This time span should also be chosen to match the time stamps for which the model is being applied (for instance, a model trained using data sampled in July may not perform well if applied in January).

In this paper our training domain is selected to be the eastern half of the United States that is in range of NEXRAD, and our temporal domain includes July-September of 2015. The eastern U.S. training domain is ideal because of the overlapping coverage of both the GOES-13 satellite and NEXRAD. Note that a "fully operational" version of OPC that provides data year-round for various sections of the NAS will likely need a larger training domain (as well as additional data sources); however, we choose to limit the domain here for the purposes of demonstration. Generalizing the model to different regions (particularly to those far outside the range of NEXRAD) will be addressed in a later section.

A set of target times {t_i}, i = 1, ..., N_T, in T from which to sample input data sources was selected based on the scan schedule of the GOES-13 satellite.[5] For the 3-month window considered, this resulted in approximately N_T = 9000 target times. For each time t_i, images from each input source were gathered and trimmed to the region G. Since input data are created on different schedules and take different lengths of time to refresh, the closest time stamp to t_i within a maximum time offset was selected from each source for inclusion in the training data. The maximum time offsets are sensor dependent and were chosen based on sensor refresh rate, which was 5 min for radar and 1 h for the NWP forecasts.

[5] http://cimss.ssec.wisc.edu/goes/blog/archives/13001

A set of patch locations (θ^i_j, φ^i_j), j = 1, ..., M_i, denoting latitude and longitude, was then randomly sampled within G. To ensure that the training is not dominated by cases without precipitation, locations with greater VIL values were sampled with higher probability, such that the final training set contains roughly equal proportions of no VIL (= 0 kg m^-2), mild to moderate VIL (0-0.77 kg m^-2), and strong to severe VIL (> 0.77 kg m^-2). Additionally, areas of degraded NEXRAD coverage, such as beam-blocked areas, were not selected to be a part of the training set, as this would affect model performance. To limit oversampling of certain regions, the number of points sampled per time, M_i, was limited to 200, or less than 0.01% of the pixels in G (and was limited even further if there was no severe weather present). Around each point, a patch of size 64 × 64 km^2 was extracted from each input image and saved for training. This resulted in approximately 1 million training samples for the 3-month period, which was then further randomly subsampled down to approximately 150 000 samples, which was found to be sufficient for training. A minimal sketch of this class-balanced sampling follows.
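The sketch below assumes the VIL class edges quoted above and a uniform draw within each class; the fixed per-class count and variable names are illustrative, not the authors' implementation.

```python
import numpy as np

def sample_patch_centers(vil, n_per_class=200, rng=None):
    """Draw patch-center pixels so the three VIL regimes are balanced.

    vil: 2D array of radar VIL (kg m^-2) on the common grid.
    Returns an array of (row, col) centers.
    """
    rng = rng if rng is not None else np.random.default_rng()
    masks = [vil == 0.0,                     # no VIL
             (vil > 0.0) & (vil <= 0.77),    # mild to moderate
             vil > 0.77]                     # strong to severe
    centers = []
    for mask in masks:
        rows, cols = np.nonzero(mask)
        if rows.size == 0:                   # e.g., no severe weather present
            continue
        take = min(n_per_class, rows.size)
        idx = rng.choice(rows.size, size=take, replace=False)
        centers.append(np.column_stack((rows[idx], cols[idx])))
    return np.concatenate(centers, axis=0)
```

In practice one would also exclude centers too close to the image border for a 64 × 64 km patch and mask out degraded-coverage regions before sampling.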


c. Handling image misalignment

For optimal training, input data sources should be aligned spatially and temporally as accurately as possible. While samples are chosen to be close to a common target time, there will still be some difference between the target time and the exact time the data are measured. This, together with other issues such as sensor latency and satellite parallax, will lead to image features not being perfectly aligned across patches taken from different sources (Lindstrom 2006). This can be partially improved by limiting the difference between sensor valid time and target time, or by shifting the center of patches taken from satellite data to account for a parallax shift (which is height, location, and sensor dependent). Ongoing improvements in sensor technology, for example, rapid updates on the GOES-R series of geostationary satellites, will also help reduce this problem in the future; however, perfect alignment remains a challenge in this type of approach and will limit the overall accuracy of the final output.

To help alleviate this problem, we introduce some smoothing in our target output variable (VIL) and use these smoothed images in the loss layers of the CNN. The output VIL image is smoothed with a mean kernel of size 4 km × 4 km (a hyperparameter) and downsampled to 4-km resolution. This smearing was found to dull the intensity of severe storms during training, so to increase the probability of detection, a histogram matching procedure is applied to match the smoothed 4-km VIL to the original 1-km VIL; this new set of patches is named S-VIL. The implication of using S-VIL to train OPC-CNN is that the output may suffer from some storm location error related to the degree of misalignment of the structure on the 4-km grid to the original structure at 1 km. Also, storms may be enlarged, since histogram matching spreads out high values in a 1-km grid over a 4-km pixel. In practice, this results in a lower-resolution or "softer" output, where the radar-like output visually appears smoother than true weather radar imagery. This effect should be noted in aviation applications of OPC; for example, storm sizes may be overestimated by a factor related to this applied smoothing and image misalignment. Examples of this effect will be observed in section 5.

4. Network architecture

This section presents the network architecture used in the OPC-CNN. We start by giving a brief overview of CNNs in general and then describe the network constructed for our problem. To define the notation below, suppose that at each time, an input data source S provides D_S images (or patches) of size L_S × W_S. We can represent this collection of images as a 3D array, which we express as [X_1; X_2; ...; X_{D_S}], where the layers X_i ∈ R^(L_S × W_S) are matrices representing the input images. A collection, or minibatch, of N observations from S will therefore make up a 4D array X ∈ R^(N × L_S × W_S × D_S). The training dataset for OPC-CNN consists of a set of 4D tensors {X_1, X_2, ..., X_{N_S}} corresponding to each of the N_S input sources listed in Table 1.

We start with a short review of components that make up a CNN and then move into the specific architecture used in OPC-CNN. A more thorough review of CNNs can be found in Goodfellow et al. (2016).

a. Review of CNNs

CNNs were first introduced in LeCun et al. (1989). In contrast to classical feed-forward neural networks with fully connected layers, nodes in a convolutional layer are connected to only a subset of nodes in the previous layer. This local connectivity is useful when dealing with images, whose dimensionality is often too large for classical neural networks. For images, this subset of nodes corresponds to a rectangular slice of the image known as a receptive field. This local connectivity makes for an efficient feed-forward operation, and layers contain a much smaller number of parameters to be learned compared to classical (fully connected) networks.

A CNN is made up of several types of layers that can be described by a directed acyclic graph (DAG). These layers can be categorized as either input layers, image processing layers, or loss layers. Input layers X_S are 4D arrays consisting of a batch of N samples from data source S and are provided from the training data. Image processing layers are functions that transform one or more 4D arrays into a new 4D array. In OPC-CNN, the following types of image processing layers are utilized:

Convolutional layers (conv): These are the main image processing layers of the CNN. A convolutional layer is parameterized by a bias vector b = (b_1, b_2, ..., b_H) and a 3D array of weights [W_1; W_2; ...; W_H] that are learned during training. The hyperparameter H is the depth of the layer, and the W_i represent weight matrices (or filters) of size l × w that are typically much smaller than the input image size. Given an input array X ∈ R^(N × D × L × W), a convolutional layer outputs another 4D array Y ∈ R^(N × H × L′ × W′) by convolving the layers of X with the weight matrices, that is, Y_{k,i} = Σ_{j=1}^{D} W_i * X_{k,j} + b_i for i = 1, ..., H, where k = 1, ..., N is the batch index. Note that the size L′ × W′ of the Y_{k,i} depends on the definition of the convolution operator. In OPC-CNN, convolutional layers use a stride of 1 and use the "valid" convention that does not add any image padding to X. This results in a reduction in patch size computed by L′ = L − l + 1 and W′ = W − w + 1, a rule checked numerically in the sketch below.
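As a quick numeric check of that shape rule, the following sketch implements the layer equation directly with scipy. As in most deep learning libraries, the "convolution" is computed as a cross-correlation; the filter values are arbitrary placeholders.

```python
import numpy as np
from scipy.signal import correlate2d

def conv_valid(x, weights, bias):
    """x: (D, L, W); weights: (H, D, l, w); bias: (H,) -> (H, L-l+1, W-w+1)."""
    out = []
    for h in range(weights.shape[0]):
        # sum the per-channel "valid" correlations, then add the bias
        acc = sum(correlate2d(x[d], weights[h, d], mode="valid")
                  for d in range(x.shape[0]))
        out.append(acc + bias[h])
    return np.stack(out)

x = np.random.rand(4, 16, 16)                  # a 4-channel 16 x 16 patch
y = conv_valid(x, np.random.rand(32, 4, 3, 3), np.zeros(32))
assert y.shape == (32, 14, 14)                 # 16 - 3 + 1 = 14 per axis
```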


TABLE 1. Description of image patches extracted from a set of input sources. A set of these patches exists for each target time t_i and patch location (θ^i_j, φ^i_j).

Input   Input source   Image resolution (km)   Patch size (pixels)   No. of channels   Input or output
VIS     Satellite      1                       64 × 64               1                 Input
IR      Satellite      4                       16 × 16               4                 Input
CTH     Derived        4                       16 × 16               1                 Input
LGHT    Lightning      2                       32 × 32               9                 Input
SZA     Derived        4                       16 × 16               1                 Input
MOD     Model          4 (originally 13)       16 × 16               6                 Input
VIL     Radar          1                       64 × 64               1                 Output
S-VIL   Radar          4                       16 × 16               1                 Output

Nonlinearity layers: Also known as activation layers, these are elementwise computations that typically follow convolutional layers. They simply apply a function y = f(x) to each pixel of their input. Common choices for f include the rectified linear unit (relu) f(x) = max(0, x), a sigmoid f(x) = tanh(x), and the identity function f(x) = x.

Pooling layers (pool): Pooling layers reduce the dimensionality of the input image by summarizing groups of neighboring pixels. The groups of pixels in the input array are formed by subregions of size d spaced s pixels apart. A common choice is the max pooling layer, which computes the maximum of all pixels in a subregion. By setting the spacing of the subregions to be half of d, max pooling layers effectively decrease the resolution of an image but retain the same key features. In OPC-CNN, 2 × 2 max pooling layers are used to downsample images by a factor of 2.

Dropout layers (drop): Dropout layers (Srivastava et al. 2014) attempt to prevent overfitting of the training data. During training, these layers randomly "turn off" (set to 0) pixels in an input array with some probability p. By doing so, the network is forced not to rely on a small number of layers, which helps with robustness.

Batch normalization layers (batch norm): Many machine learning algorithms are more effective when inputs are properly normalized (e.g., have a mean of 0 and a variance of 1). Often in neural networks with several layers, the distribution of hidden layers over the training set can differ from this ideal, so batch normalization layers (Ioffe and Szegedy 2015) rescale data from the previous layer to have a more desirable mean and variance.

Concatenation layers (concat): Concatenation layers combine or "stack" multiple 4D input arrays. These layers receive multiple input arrays X_k, k = 1, ..., n, with sizes N × D_k × L_k × W_k and combine them into a single array Y of size N × ΣD_k × min_k(L_k) × min_k(W_k) by concatenating each along the second dimension (image depth D). Note that if image sizes do not match across inputs, then cropping of the larger images is necessary.

The aforementioned layers are commonly organized into groups or "blocks." A typical block consists of a convolutional layer, a nonlinearity, a pooling layer, and possibly a dropout layer. When describing networks, it is often simpler to explain the structure of a network in terms of blocks than individual layers.

The third category of layer used in OPC-CNN is a loss layer. A loss layer receives output from an image processing layer and compares it to some form of ground truth provided by an input layer. The loss layer outputs a scalar-valued score that represents the quality of the network's output. The objective of training the CNN is to minimize the output loss through backpropagation. The type of loss function used in OPC-CNN is a mean-squared loss layer, which averages the squared error across all pixels and layers of an image, that is,

    MSE(X, Y) = (1 / (LW)) Σ_{i,j,k} (X_{ijk} − Y_{ijk})^2.

Loss layers are minimized by tuning the weights and biases in the convolutional layers through stochastic gradient descent (Bishop 2006), in which a minibatch of samples from the training set is used to estimate the gradient ∂L/∂w of the loss function through backpropagation. In the case of multiple loss layers L_1, L_2, ..., L_n, the gradient descent update uses a weighted average across all loss layers, Σ_k ω_k (∂L_k/∂w), where the ω_k are a set of weights that typically sum to 1.
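A compact numeric version of these two loss computations, assuming equal-shaped output/target arrays and weights summing to 1 (illustrative, not the training code):

```python
import numpy as np

def mse(x, y):
    """Mean-squared error averaged over all pixels and layers of an image."""
    return np.mean((x - y) ** 2)

def combined_loss(outputs, targets, omega):
    """Weighted combination across multiple loss layers."""
    assert np.isclose(sum(omega), 1.0)       # the omega_k sum to 1
    return sum(w * mse(o, t) for w, o, t in zip(omega, outputs, targets))
```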


b. Architecture in OPC-CNN

An overview of the network architecture used in OPC-CNN is shown in Fig. 2. The network consists of 51 layers that are grouped into processing blocks for clarity. The lettered blocks (Table 2) represent convolutional layers followed by some combination of a nonlinearity, dropout layer, and/or a pooling layer. Conceptually, this network can be decomposed into two parts: a set of feature extraction layers that process individual input sources into features, and a set of fusion layers that combine these features and map them to an output image that is compared to a true radar image in a final loss layer.

Multiple variations of the chosen architecture were attempted, and the best performing on a holdout set was selected. Most network designs that were tested were motivated by popular CNNs in the literature, for example, Szegedy et al. (2015) and Krizhevsky et al. (2012). This particular architecture was also heavily influenced by the following factors:

1) Output resolution: For this work, 4 km was chosen as the output resolution. To achieve this, inputs of different resolutions were mapped to this common resolution via pooling layers or upsampling layers.

2) Concatenation: After images are mapped to a common resolution, it is necessary to combine all images into a single tensor for further processing. This layer is added after all layers are mapped to the common output resolution.

3) Regularization: Dropout layers are utilized in this work to regularize the network and to avoid overfitting. These are typically added after convolution-relu combinations, with the exception of later layers. Dropout layer rates were chosen via experimentation.

4) Convolution depth: As convolutional layers are the main image processing layers of the CNN, they should appear in both the feature extraction and fusion sections of the network. The number of convolutional layers in each section was also chosen via experimentation.

In the feature extraction layers, processing blocks are trained from individual data sources. Blocks A1 and A2 are applied to only VIS; blocks B1 and B2 are applied to IR/CTH; and C1 is applied to only LGHT inputs. To enable better learning in the early layers of the network, additional output (O1, O2, O3) and mean-square-error (MSE) loss layers were added to the early stages of the network. It was found in this case that networks relying on only one final loss layer were unable to propagate error back effectively to the initial layers. This led to noisy (untrained) feature extraction layers and less accuracy overall. By shortening path lengths (e.g., VIS -> A1 -> A2 -> O3 -> output), meaningful weight matrices were successfully updated in the early stages (see the results section for more details). Convolutional blocks were not used for MOD and SZA because their lower spatial resolution made them unnecessary.

Outputs of the feature extraction layers are concatenated into one 4D array. The combined features undergo batch normalization and are further processed by three convolutional blocks. The output from the last processing block, O4, has one layer representing the radar-like estimate and is passed to a final MSE loss layer, where it is compared to S-VIL. A skeletal implementation of this layout is sketched below.
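The following is a minimal PyTorch sketch of the layout in Fig. 2 and Table 2. It is an illustration under stated assumptions, not the authors' implementation: for simplicity it uses padding="same" so branch outputs align without cropping (the paper uses "valid" convolutions with cropping at concatenation); the dropout rate and the pairing of heads O1-O3 with specific branches are assumed; and the concatenated channel count here (87) is simply the sum of the sketched branch widths, whereas Table 2 lists 128 for the concat block.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def block(cin, cout, k, pool=False, drop=0.25):      # drop rate is assumed
    layers = [nn.Conv2d(cin, cout, k, padding="same"),
              nn.ReLU(),
              nn.Dropout2d(drop)]
    if pool:
        layers.append(nn.MaxPool2d(2))               # halve the resolution
    return nn.Sequential(*layers)

class OPCCNNSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.a1 = block(1, 16, 7, pool=True)         # VIS: 1 km -> 2 km
        self.a2 = block(16, 32, 5, pool=True)        # 2 km -> 4 km
        self.b1 = block(5, 32, 3)                    # IR (4 ch) + CTH (1 ch)
        self.b2 = block(32, 32, 3)
        self.c1 = block(9, 16, 3, pool=True)         # LGHT: 2 km -> 4 km
        # one auxiliary 1-channel head per branch for the early MSE losses
        self.o1 = nn.Conv2d(32, 1, 3, padding="same")
        self.o2 = nn.Conv2d(32, 1, 3, padding="same")
        self.o3 = nn.Conv2d(16, 1, 3, padding="same")
        feats = 32 + 32 + 16 + 6 + 1                 # VIS, IR/CTH, LGHT, MOD, SZA
        self.norm = nn.BatchNorm2d(feats)
        self.d1 = nn.Sequential(nn.Conv2d(feats, 64, 3, padding="same"),
                                nn.Dropout2d(0.25))
        self.d2 = nn.Sequential(nn.Conv2d(64, 32, 3, padding="same"),
                                nn.Dropout2d(0.25))
        self.o4 = nn.Conv2d(32, 1, 3, padding="same")

    def forward(self, vis, ircth, lght, mod, sza):
        fa = self.a2(self.a1(vis))                   # (N, 32, 16, 16)
        fb = self.b2(self.b1(ircth))                 # (N, 32, 16, 16)
        fc = self.c1(lght)                           # (N, 16, 16, 16)
        fm = F.interpolate(mod, size=fb.shape[-2:],  # bilinear "upsamp" block
                           mode="bilinear", align_corners=False)
        fused = self.norm(torch.cat([fa, fb, fc, fm, sza], dim=1))
        out = self.o4(self.d2(self.d1(fused)))       # radar-like estimate
        # early outputs shorten the gradient paths to the extraction layers
        return out, self.o1(fa), self.o2(fb), self.o3(fc)
```

For 64 × 64 km training patches, this takes VIS as (N, 1, 64, 64), IR/CTH as (N, 5, 16, 16), LGHT as (N, 9, 32, 32), MOD as (N, 6, 16, 16), and SZA as (N, 1, 16, 16); all four returned outputs would be scored against S-VIL with the weighted MSE above. Because no fully connected layers appear, the same module accepts arbitrarily sized inputs at test time.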

FIG. 2. OPC-CNN architecture. Input layers are in gray, image processing blocks are in blue, and loss layers are in green. Details of the image processing layers are found in section 4.


TABLE 2. Description of the processing blocks shown in Fig. 2.

Block name   Layers                   Input resolution (km)   Convolution filter size   No. of output layers   Output resolution (km)
A1           conv, relu, drop, pool   1                       7 × 7                     16                     2
A2           conv, relu, drop, pool   2                       5 × 5                     32                     4
B1           conv, relu, drop         4                       3 × 3                     32                     4
B2           conv, relu, drop         4                       3 × 3                     32                     4
C1           conv, relu, drop, pool   2                       3 × 3                     16                     4
O1           conv                     4                       3 × 3                     1                      4
O2           conv                     4                       3 × 3                     1                      4
O3           conv                     4                       3 × 3                     1                      4
upsamp       upsampling               13                      -                         6                      4
concat       concatenation            4                       -                         128                    4
norm         batch norm               4                       -                         128                    4
D1           conv, drop               4                       3 × 3                     64                     4
D2           conv, drop               4                       3 × 3                     32                     4
O4           conv                     4                       3 × 3                     1                      4

Network training is done using the input patches described in Table 1; however, a network that only works on small patches would not be very practical in an operational setting. It is preferable that the trained network be applicable to an arbitrarily sized domain. This property is achieved in OPC-CNN by avoiding fully connected layers in the network architecture, which flatten images created by hidden layers of the neural network into 1D vectors of fixed length. Not including fully connected layers maintains the image structure of the data throughout the network and allows the network to accept input images of sizes different from those used during training. CNNs that have this property are referred to as fully convolutional neural networks and have been applied in other areas, such as image segmentation (Long et al. 2015). Because these types of convolutions trim off the outer edges of an image, the output of the network will be of a slightly smaller size. This can be compensated for in test mode by adding appropriate image padding to full-size input images.

Another important consideration in the network design is image resolution. Input images arrive in differing resolutions of 1, 2, 4, and 13 km. To ensure that images are properly aligned spatially in the concatenation layers of the network, features must be mapped into a common resolution prior to concatenation. To achieve this, VIS (1 km) and LGHT (2 km) features are downsampled to a 4-km grid using max pooling layers of 2 × 2 pixels. MOD patches are upscaled to 4 km using a bilinear upsampling (the upsamp block in Fig. 2), and SZA is computed natively on a 4-km grid and thus does not require any additional processing.

c. Model training

Network weights were initialized randomly using a Gaussian distribution with zero mean and a standard deviation of 10^-3. Biases were initialized to 0. The training dataset was split into training, validation, and testing components using a 70%, 15%, 15% split, respectively. The training component was used for stochastic gradient descent, the validation set was used for stopping criteria and model selection, and the test set was used to estimate final network performance and for hyperparameter tuning.

As described in section 4b, the network is trained by minimizing the MSE criteria over multiple loss layers. Three of these output layers depend exclusively on the VIS, IR, and LGHT inputs, respectively, and the final output layer depends on the output of the fusion layers. The influence that these output layers have on the final objective depends on weights ω_i, i = 1, 2, 3, 4, which are hyperparameters. These weights are initially set to be equal to ensure that the feature extraction layers are adequately trained, but as the number of epochs increases, the weights corresponding to the VIS, IR, and LGHT inputs decay exponentially, while the weight associated with the final output approaches 1 (one possible realization of this schedule is sketched at the end of this section). The network was trained for 300 epochs. To choose the final network, the fusion loss layer was evaluated for each training epoch on a separate holdout set, and a median filter was applied to smooth this error curve over epochs. The epoch that provided the minimum error was chosen as the final network.

Training was performed using the Lincoln Laboratory Supercomputing Center (Byun et al. 2016). The processing node used in training had 64 GB of RAM and access to a Tesla K80 graphics processing unit (GPU) with 4 GB of memory. Training for 300 epochs with this setup took approximately 30 min for the chosen dataset.
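The paper does not give the exact decay schedule for the loss weights, but one plausible realization, with an assumed decay constant, is:

```python
import numpy as np

def loss_weights(epoch, n_early=3, w0=0.25, decay=0.02):
    """omega for [VIS head, IR head, LGHT head, fusion output] at an epoch.

    The decay constant (0.02) is an assumption of this sketch.
    """
    early = w0 * np.exp(-decay * epoch) * np.ones(n_early)
    return np.append(early, 1.0 - early.sum())   # fusion weight -> 1

print(loss_weights(0))     # [0.25 0.25 0.25 0.25]: equal at the start
print(loss_weights(300))   # early weights ~6e-4; fusion weight ~0.998
```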


FIG. 3. Sample outputs from OPC-CNN that show VIL binned into six intensity levels. For each case, the left panels show synthetic radar VIL images generated from OPC-CNN using satellite, lightning, and numerical model output, and the right panels show the radar-derived VIL (target variable). The gray shading in the right panels depicts the quality of radar coverage categorized by good coverage (G), degraded coverage (D), and no coverage (N). Note that these images include the calibration described in section 5b. Case time stamps: (a) 1740 UTC 23 Jul 2015, (b) 1830 UTC 22 Sep 2015, (c) 1730 UTC 29 Jul 2015, and (d) 2345 UTC 24 Jul 2015.

5. Results

This section provides visualizations and performance estimates of the trained network applied to full-size images (i.e., not limited to the patches used during training). Postprocessing steps, including calibration and the incorporation of actual radar data, are also discussed.

a. Application to full-size inputs

The OPC-CNN was trained to output radar estimates on small patches of input data. However, in practice, the trained layers can be applied to inputs of arbitrary size. Inputs are first remapped to the same patch resolution used during training and then cropped so each input tensor covers an identical geographic region. After normalization, the trained network is applied to these inputs. In this case, the output of the network is also a full-size image of nearly equal size to the original data, except for a small loss of data around the edges as a result of the convolutional layers (recall that no padding was used; a sketch of the compensating padding appears at the end of this subsection). Because input images can be quite large, inputs may need to be divided into smaller geographic blocks and processed individually before being reconstructed on output. A reverse parallax shift is applied to the output image, using the same vectors as used previously, to align the output to approximate the view that would come from ground-based radar. Finally, a calibration, explained later in section 5b, is applied.

Figures 3a-d show examples of OPC-CNN applied to portions of the southeast United States. The left panel of each case shows output from OPC-CNN based on only satellite, lightning, and numerical model inputs. In the right panel of each case, the target variable (NEXRAD VIL) is shown along with an indication of radar coverage (gray shading). Qualitatively, in each case OPC-CNN is able to capture the size, intensity, and orientation of the storms, albeit at a slightly lower resolution. Small isolated convective cells, typically those that generate cloud-to-ground lightning, are very well depicted by OPC-CNN, as can be seen in the isolated convection over the Florida Peninsula in Fig. 3a and the thunderstorms over Alabama in Fig. 3c. Nonconvective storms that do not produce lightning can sometimes be missed or underrepresented, for example, the line of precipitation south of the Florida Panhandle in Fig. 3b. Overall, the number of false alarms in OPC-CNN is low; in very few instances does OPC-CNN depict storm cells (which are defined for aviation purposes as VIL greater than 3.5 kg m^-2, represented by yellow shading in Fig. 3) in radar coverage that do not match an observed storm. This suggests that storms outside radar range that are depicted by OPC-CNN are reliable indications of hazardous weather, for example, the cluster of storms in the Gulf of Mexico in Fig. 3d.
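The compensating padding mentioned above can be sketched as follows. This bookkeeping assumes a single-resolution chain of stride-1 "valid" convolutions; the pooled multiresolution branches in OPC-CNN complicate the exact count, and both the kernel list and the reflect-padding mode are illustrative assumptions.

```python
import numpy as np

def pad_for_valid_convs(image, kernel_sizes=(3, 3, 3)):
    """Pad a full-size input so 'valid' convolutions return the input grid."""
    trim = sum(k - 1 for k in kernel_sizes)   # total pixels lost per axis
    lo, hi = trim // 2, trim - trim // 2
    return np.pad(image, ((lo, hi), (lo, hi)), mode="reflect")
```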


b. Calibration

Recall from section 3b that the training set was constructed using roughly equal portions of no VIL, moderate VIL, and severe VIL. This distribution is far from the true relative frequencies of these regimes observed in nature and thus may lead to area biases when the model is applied to actual nonbalanced data. The left panel of Fig. 4 shows a comparison between the distribution function of VIL measured from NEXRAD and that of the OPC-CNN over the training domain. It is clear that the OPC-CNN is overestimating the lower levels of VIL and slightly underestimating the higher levels (note the logarithmic scale).

FIG. 4. (left) Comparison of the distribution of uncalibrated OPC-CNN VIL vs NEXRAD VIL generated over the training domain. The OPC-CNN is slightly overestimating VIL in the lower levels (L1 and L2 marked on the axis) and underestimating at the higher levels (L5 and L6). (right) The result of the histogram matching procedure. This curve shows the calibration that maps OPC-CNN-generated VIL to new values in such a way that its distribution, shown on the left in blue, will shift to approximately match the NEXRAD curve in red.

To address these area biases, a histogram matching procedure is applied to the output of the final layer. This procedure modifies pixel intensities of the output in such a way that the distribution function of the resulting images over the training domain matches that of a target collection of images (NEXRAD VIL in this case). In other words, if F_NEXRAD(x) = Prob(NEXRAD VIL ≥ x) and F_OPC(x) = Prob(OPC-CNN VIL ≥ x) represent the distribution functions of NEXRAD VIL and OPC-CNN VIL, respectively, then histogram matching computes the mapping g := F_NEXRAD^(-1) ∘ F_OPC. The mapping g(x) can be found by minimizing the square of the log ratio of the distribution functions:

    g(x) = argmin_y [ log( F_OPC(x) / F_NEXRAD(y) ) ]^2.    (2)

The left panel of Fig. 4 shows F_NEXRAD(x) and F_OPC(x) estimated within the training domain on even Julian days. The right panel of Fig. 4 shows the resulting g(x) corresponding to six VIL levels (L1-L6) along the y axis, with a piecewise linear interpolation applied in between these levels. As expected, this function suppresses lower VIL levels while enhancing higher levels.

c. Model validation

In this section we validate the OPC-CNN on full-size inputs. Standard performance metrics commonly used for validating weather forecasts (Stanski et al. 1989) are used: probability of detection (POD), false alarm rate (FAR), bias (BIAS), and critical success index (CSI). To define these metrics, assume R̃ is a radar-like image created by OPC-CNN and R is the associated radar-derived target variable. These two images are compared in regions of adequate radar coverage. To avoid overfitting from the histogram matching procedure, these metrics are computed only on odd Julian days.

BIAS computes the ratio of the area coverage of storm intensity greater than a threshold T for both model and target in regions with valid radar coverage:

    BIAS_T = #|R̃ ≥ T| / #|R ≥ T|,

where # indicates pixel counts over the training domain. Ideally, BIAS should be close to 1.

The other metrics are POD, FAR, and CSI. For a given T, these metrics compare the ratio of pixels correctly classified or misclassified as greater than T to those that were not. More precisely, define hits H, misses M, and false alarms FA for T as

    H_T = #|R̃ ≥ T and R ≥ T|,
    M_T = #|R̃ < T and R ≥ T|,
    FA_T = #|R̃ ≥ T and R < T|.

Then POD, FAR, and CSI are defined as

    POD_T = H_T / (H_T + M_T),
    FAR_T = FA_T / (H_T + FA_T),
    CSI_T = H_T / (H_T + M_T + FA_T).
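These definitions transcribe directly into code; the sketch below assumes the input arrays have already been restricted to pixels with valid radar coverage (and that at least one pixel exceeds the threshold, so the divisions are defined).

```python
import numpy as np

def scores(r_syn, r, threshold):
    """BIAS, POD, FAR, and CSI for one threshold over valid-radar pixels."""
    hit = np.sum((r_syn >= threshold) & (r >= threshold))
    miss = np.sum((r_syn < threshold) & (r >= threshold))
    fa = np.sum((r_syn >= threshold) & (r < threshold))
    return {"BIAS": np.sum(r_syn >= threshold) / np.sum(r >= threshold),
            "POD": hit / (hit + miss),
            "FAR": fa / (hit + fa),
            "CSI": hit / (hit + miss + fa)}
```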


Because this training domain is the same as that used to create the patches used to train the CNN, there is a small amount of overlap between this validation and the data used to train OPC-CNN. However, this overlap represents an extremely small proportion of all pixels used for scoring in this section, and separate experiments revealed that scores were not sensitive to this overlap.

For a baseline reference, the results of the OPC-CNN are compared to the OPC-RF that was running operationally at the Massachusetts Institute of Technology (MIT) Lincoln Laboratory during the summer of 2015. Six scoring thresholds corresponding to the six VIL intensity levels shown in Fig. 1 were used when computing CSI and BIAS. OPC-CNN scores were computed for both calibrated (OPC-CNN-Cal) and uncalibrated (OPC-CNN-NoCal) outputs and compared to the output from the legacy random forest algorithm[6] (OPC-RF). The results of this scoring are summarized in the performance diagram shown in Fig. 5. This diagram shows POD and 1 − FAR along the axes, BIAS along rays emanating from the origin, and contours of constant CSI. For CSI, OPC-CNN outperforms both OPC-RF and the uncalibrated results in all six VIL levels. The most significant boosts are seen in the lower VIL levels, with a >12% CSI increase for level 1 and a 13% increase for level 2. These results also show the importance of calibration, as some scores increased dramatically after calibration was applied. The calibrated OPC-CNN had a BIAS closer to 1 than OPC-RF in four out of six levels. The calibrated OPC-CNN shows a high area bias for level 6 VIL (>4), which is likely a result of the histogram matching procedure overfitting to a small set of cases in the even Julian days. This area bias is likely to improve with additional data.

[6] The OPC-RF results scored here also include a histogram matching-based calibration.

d. Radar stitching

As a final postprocessing step, calibrated output from the OPC-CNN is stitched, or blended, with actual NEXRAD returns in regions with adequate radar coverage. The stitching procedure used here is similar to the method described in Veillette et al. (2015), which computes a weighted average of the actual radar R with the OPC-CNN output R̃, where the weights are based on the distance to the closest functioning NEXRAD. To ensure that strong radar returns far from the radar are not dulled in this average, the maximum of the average field and R is then taken. More precisely, the stitched radar field R_S is given by

    R_S = max{ R, (1 − ω_d) R + ω_d R̃ },

where the weights ω_d depend on the distance d (km) to the closest radar via

    ω_d = 0                                   for d < d_min,
    ω_d = σ( (d − d_min) / (d_max − d_min) )  for d_min ≤ d ≤ d_max,
    ω_d = 1                                   for d ≥ d_max,

and σ(x) is the logistic sigmoid function. In this formula, d_min represents the radius of what is considered to be good radar coverage (where no OPC will be merged in), and d_max represents the maximum range of radar, outside of which only OPC will be used for the final output. In this application, d_min and d_max were chosen to be 230 and 450 km, respectively. The right panel of Fig. 1 shows an example of radar stitching applied.
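A minimal sketch of this stitching rule follows. The paper does not specify how the logistic sigmoid is scaled over [d_min, d_max], so the steepness and centering below are assumptions.

```python
import numpy as np

def stitch(r, r_syn, d_km, d_min=230.0, d_max=450.0, steepness=8.0):
    """Blend actual radar r and synthetic radar r_syn by distance to radar."""
    t = (d_km - d_min) / (d_max - d_min)             # 0..1 across the ramp
    w = 1.0 / (1.0 + np.exp(-steepness * (t - 0.5)))  # logistic sigmoid
    w = np.where(d_km < d_min, 0.0, np.where(d_km > d_max, 1.0, w))
    blended = (1.0 - w) * r + w * r_syn
    return np.maximum(r, blended)        # keep strong returns far from radar
```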
6. Expansion outside the training domain

Much of the data and results up to this point have relied on land-based NEXRAD. NEXRAD was chosen for training because it provides a high-quality and plentiful source of "truth"; however, its coverage is limited mainly to the continental United States. Using a training set sampled from a relatively small training domain calls into question how well the resulting model will generalize to different global climates (the Pacific Ocean, Alaska, desert regions, etc.). This problem can be partially alleviated by including data obtained from other radar networks (in Europe, China, India, etc.), but these data will still not provide truly global coverage.

Despite these concerns, nothing prevents applying the current model in other parts of the world and testing the results. Initial testing outside the United States has been performed (Veillette and DeLaura 2016). In that work, the OPC-RF algorithm trained over the eastern CONUS was tested in regions of the Gulf of Mexico and the western Atlantic Ocean. The output was compared to both NEXRAD VIL and GPM DPR-derived VIL. These results showed that although an area bias existed for intense storms, the model still provided a useful radar-like depiction in areas lacking any weather radar. Qualitative studies have also shown that the network can provide reasonable radar-like depictions in other regions of the world, but a quantitative study has not yet been performed.


FIG. 5. Performance diagram comparing NEXRAD (target variable) to three versions of the OPC algorithm: (a) OPC-RF, (b) OPC-CNN-NoCal, and (c) OPC-CNN-Cal. Scores are computed for six precipitation intensity levels marked 1-6. Each of these points represents the mean CSI, BIAS, POD, and FAR measured over the set of validation cases described in the text. For each intensity level, the calibrated OPC-CNN outscores its predecessor OPC-RF in CSI, especially for the lower intensity levels 1 and 2. The calibration helps to simultaneously decrease the BIAS for lower intensity levels and increase the BIAS for higher intensity levels.

This issue should be addressed more rigorously if the model is to be applied far outside the training domain used in this work. For global training, a valuable resource of radar data is the DPR on board the GPM satellite described in section 3a; however, its footprint is limited to relatively small swaths along the satellite's orbit. The size of this footprint limits the amount of training data that can be collected. Such a training set will require a larger temporal component in the training domain to account for the limited spatial coverage of the GPM footprint. This type of approach may be considered in future work.

A simpler option for domain adaptation is to modify the calibration procedure described in section 5b to have regional dependence. This approach utilizes the same OPC-CNN trained in the current training domain, but it ensures that the output has some degree of statistical similarity to precipitation intensity outside the training domain by using a regional calibration. This proposed approach also has the benefit of not requiring vast amounts of data to train a CNN from scratch.

To incorporate this regional dependence, the objective function (2) is modified in two ways: 1) OPC-CNN VIL intensity is compared to GPM-derived VIL intensity rather than NEXRAD VIL, and 2) a regularization term is added to prevent overfitting in data-sparse situations. Because OPC-CNN is trained using NEXRAD as a target variable, modification 1 also requires the introduction of a GPM correction that addresses the difference between NEXRAD VIL and GPM-derived VIL. To motivate this correction, let F_GPM be the distribution function for GPM-derived VIL and notice that the quotient inside the logarithm in (2) can be rewritten as

    F_OPC(x) / F_NEXRAD(y) = [F_OPC(x) / F_GPM(y)] [F_GPM(y) / F_NEXRAD(y)] = [F_OPC(x) / F_GPM(y)] B(y),    (3)

where B(y) := F_GPM(y) / F_NEXRAD(y) is estimated within NEXRAD range.

Combining (2) with (3) and a regularization term, the regionally dependent histogram matching procedure can be expressed in terms of the following updated objective:

    g_R(x) = argmin_y { [ log( (F_OPC(x) / F_GPM(y)) B(y) ) ]^2 + (λ_x / N_x)(x − y)^2 }.    (4)

The term λ_x is used to prevent overfitting in situations where the sample size N_x = #|OPC VIL > x| + #|NEXRAD VIL > x| is small. For multiple domains R_i, i = 1, ..., m, g_{R_i} can be computed separately for each region and interpolated between regions to avoid edge effects.
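As a sketch, the objective in (4) can be minimized by brute force over a discrete grid of candidate output values. The callables F_opc, F_gpm, and B stand in for empirical exceedance functions and are assumptions of this illustration, as are the grid and the default regularization weight.

```python
import numpy as np

def g_regional(x, y_grid, F_opc, F_gpm, B, N_x, lam=1.0):
    """Calibrated VIL for one input value x, per the objective in (4).

    F_opc, F_gpm: vectorized empirical exceedance functions of VIL;
    B: the GPM/NEXRAD correction ratio estimated within radar range;
    y_grid: candidate output VIL values to search over.
    """
    obj = (np.log(F_opc(x) / F_gpm(y_grid) * B(y_grid)) ** 2
           + (lam / N_x) * (x - y_grid) ** 2)
    return y_grid[np.argmin(obj)]
```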


FIG. 6. Regional calibration of OPC. (left) Five regions used to calibrate OPC. The U.S. Northeast and Southeast regions use NEXRAD VIL as a calibration source. The Atlantic, Gulf of Mexico, and Caribbean regions use GPM VIL as a calibration source. (right) Calibration curves generated using (4) for each calibration region. Note that all the curves follow a somewhat similar shape, which suggests the model trained over land retains skill when applied directly outside its training domain.

The calibration functions were then computed for each region and plotted in Fig. 6 using a regularization of λ_x = 1. Overall, the calibration curves are similar, with minor differences observed in low-level and severe precipitation. This result suggests that a model trained over the CONUS (Northeast and Southeast United States) retains skill when applied directly outside the training domain.

7. Concluding remarks

Air traffic controllers rely heavily on NEXRAD to provide safe, effective, and reliable transportation; however, these data are lacking in many regions of the world without weather radar. This paper provides a methodology for creating radar-like precipitation mosaics in areas lacking weather radar. Multiple forms of nonradar data, including geostationary satellite imagery, lightning detections, numerical weather prediction model output, and sun position, are used as input layers to a convolutional neural network. A training database was constructed using VIL mosaics taken from the Corridor Integrated Weather System (CIWS) as the target variable in a regression. The OPC-CNN has a DAG structure that takes inputs at their native resolutions and combines them into estimates of radar-derived quantities. These OPC-CNN-generated images are combined with actual radar data in areas with radar coverage to create seamless precipitation mosaics that extend far outside the range of radar.

The model was verified over the summer of 2015 and compared to a prior version of the model trained using a random forest algorithm. The CNN version showed higher skill scores for both CSI and BIAS. A calibration methodology was developed to help generalize the model to regions outside the training domain. This was done by calibrating the output of the trained model using data from the GPM DPR as validation in regions outside radar coverage.

During the summer of 2017, a research prototype of OPC-CNN was transitioned into an operational demonstration at five key air traffic control facilities: Miami, Florida; San Juan, Puerto Rico; Houston, Texas; New York, New York; and the Air Traffic Control System Command Center. This capability will provide an interim solution while OPC is transitioned into future FAA weather systems and will provide valuable feedback that will influence future improvements to the capability.

Moving forward, advanced sensors and increasing amounts of training data will allow better model performance and greater spatial coverage of the OPC model. The GOES-R series of geostationary satellites, the first of which launched in November 2016, will provide higher-resolution and faster-updating image data, as well as lightning data from the Geostationary Lightning Mapper, that can be leveraged to improve the OPC algorithm. Methods like OPC can be used to take full advantage of these rich datasets to generate effective data-driven meteorological tools.

Acknowledgments. This material is based upon work supported by the Federal Aviation Administration under Air Force Contract FA8721-05-C-0002 and/or FA8702-15-D-0001 under Interagency Agreement DTFAWA-11-X-80007. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Federal Aviation Administration. We wish to recognize the late Leonard Story of the Federal Aviation Administration for his advocacy and guidance in the early stages of this work. We also wish to thank Earth Networks for providing the global lightning data to us, and two anonymous reviewers for their help with improving the article.

REFERENCES

Ba, M. B., and A. Gruber, 2001: GOES multispectral rainfall algorithm (GMSRA). J. Appl. Meteor., 40, 1500–1514, https://doi.org/10.1175/1520-0450(2001)040<1500:GMRAG>2.0.CO;2.

Behrangi, A., K.-L. Hsu, B. Imam, S. Sorooshian, and R. J. Kuligowski, 2009: Evaluating the utility of multispectral information in delineating the areal extent of precipitation. J. Hydrometeor., 10, 684–700, https://doi.org/10.1175/2009JHM1077.1.

Benjamin, S. G., and Coauthors, 2016: A North American hourly assimilation and model forecast cycle: The Rapid Refresh. Mon. Wea. Rev., 144, 1669–1694, https://doi.org/10.1175/MWR-D-15-0242.1.

Bishop, C. M., 2006: Pattern Recognition and Machine Learning. Information Science and Statistics, Springer-Verlag, 738 pp.

Byun, C., and Coauthors, 2016: LLMapReduce: Multi-level map-reduce for high performance data analysis. Proc. 2016 IEEE High Performance Extreme Computing Conf. (HPEC '16), Waltham, MA, IEEE, 8 pp., https://doi.org/10.1109/HPEC.2016.7761618.

Crum, T. D., and R. L. Alberty, 1993: The WSR-88D and the WSR-88D Operational Support Facility. Bull. Amer. Meteor. Soc., 74, 1669–1687, https://doi.org/10.1175/1520-0477(1993)074<1669:TWATWO>2.0.CO;2.

Donovan, M. F., E. R. Williams, C. Kessinger, G. Blackburn, P. H. Herzegh, R. L. Bankert, S. Miller, and F. R. Mosher, 2008: The identification and verification of hazardous convective cells over oceans using visible and infrared satellite observations. J. Appl. Meteor. Climatol., 47, 164–184, https://doi.org/10.1175/2007JAMC1471.1.

Earth Networks, 2017: PulseRad solution: Overview. Earth Networks Doc., 2 pp., https://www.earthnetworks.com/wp-content/uploads/2017/01/PS_PulseRad_EarthNetworks.pdf.

Evans, J. E., and E. R. Ducot, 2006: Corridor Integrated Weather System. Lincoln Lab. J., 16, 59–80.

——, M. E. Weber, and W. R. Moser, 2006: Integrating advanced weather forecast technologies into air traffic management decision support. Lincoln Lab. J., 16, 81–96.

Goodfellow, I., Y. Bengio, and A. Courville, 2016: Deep Learning. Adaptive Computation and Machine Learning Series, MIT Press, 800 pp.

Grecu, M., E. N. Anagnostou, and R. F. Adler, 2000: Assessment of the use of lightning information in satellite infrared rainfall estimation. J. Hydrometeor., 1, 211–221, https://doi.org/10.1175/1525-7541(2000)001<0211:AOTUOL>2.0.CO;2.

Heckman, S., 2014: ENTLN status update. Proc. 15th Int. Conf. on Atmospheric Electricity (ICAE 2014), Norman, OK, IUGG and IAMAS, https://www.nssl.noaa.gov/users/mansell/icae2014/preprints/Heckman_103.pdf.

Hsu, K. L., H. V. Gupta, X. Gao, and S. Sorooshian, 1999: Estimation of physical variables from multichannel remotely sensed imagery using a neural network: Application to rainfall estimation. Water Resour. Res., 35, 1605–1618, https://doi.org/10.1029/1999WR900032.

Huffman, G. J., and Coauthors, 2017: NASA Global Precipitation Measurement (GPM) Integrated Multi-Satellite Retrievals for GPM (IMERG). Version 4.6, NASA Algorithm Theoretical Basis Doc., 28 pp., https://pmm.nasa.gov/sites/default/files/document_files/IMERG_ATBD_V4.6.pdf.

Ioffe, S., and C. Szegedy, 2015: Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proc. 32nd Int. Conf. on Machine Learning (ICML 2015), Lille, France, PMLR, 448–456.

Iskenderian, H., 2008: Cloud-to-ground lightning as a proxy for nowcasts of VIL and echo tops. 13th Conf. on Aviation, Range and Aerospace Meteorology, New Orleans, LA, Amer. Meteor. Soc., P1.23, https://ams.confex.com/ams/88Annual/webprogram/Paper132935.html.

Joyce, R. J., J. E. Janowiak, P. A. Arkin, and P. Xie, 2004: CMORPH: A method that produces global precipitation estimates from passive microwave and infrared data at high spatial and temporal resolution. J. Hydrometeor., 5, 487–503, https://doi.org/10.1175/1525-7541(2004)005<0487:CAMTPG>2.0.CO;2.

Kidd, C., and G. Huffman, 2011: Global precipitation measurement. Meteor. Appl., 18, 334–353, https://doi.org/10.1002/met.284.

Klingle-Wilson, D., and J. Evans, 2005: Description of the Corridor Integrated Weather System (CIWS) weather products. MIT Lincoln Laboratory Project Rep. ATC-317, 107 pp.

Krizhevsky, A., I. Sutskever, and G. E. Hinton, 2012: ImageNet classification with deep convolutional neural networks. Advances in Neural Information Processing Systems 25: 26th Annual Conference on Neural Information Processing Systems 2012, F. Pereira et al., Eds., Vol. 1, Neural Information Processing Systems, 1097–1105.

LeCun, Y., B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, 1989: Backpropagation applied to handwritten zip code recognition. Neural Comput., 1, 541–551, https://doi.org/10.1162/neco.1989.1.4.541.

——, L. Bottou, Y. Bengio, and P. Haffner, 1998: Gradient-based learning applied to document recognition. Proc. IEEE, 86, 2278–2324, https://doi.org/10.1109/5.726791.

Lindstrom, S., 2006: The problem of parallax. Accessed 31 May 2017, https://cimss.ssec.wisc.edu/goes/blog/archives/217.

Long, J., E. Shelhamer, and T. Darrell, 2015: Fully convolutional networks for semantic segmentation. Proc. Computer Vision and Pattern Recognition 2015 (CVPR 2015), Boston, MA, Computer Vision Foundation, 10 pp., https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf.

NASA, 2018: GPM data downloads. NASA Precipitation Measurement Missions, https://pmm.nasa.gov/data-access/downloads/gpm.

NOAA, 2018: Comprehensive Large Array-Data Stewardship System (CLASS). NOAA, www.class.noaa.gov.

Ryan, D., 2016: Expanding air traffic controllers' view of offshore weather. MIT Lincoln Laboratory, https://www.ll.mit.edu/news/expanding-air-traffic-controllers-view-offshore-weather.

Srivastava, N., G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, 2014: Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res., 15, 1929–1958.

Stanski, H. R., L. Wilson, and W. R. Burrows, 1989: Survey of common verification methods in meteorology. 2nd ed. WMO World Weather Watch Tech. Rep. 8, WMO/TD-358, MSRB 89-5, 81 pp.


Szegedy, C., and Coauthors, 2015: Going deeper with convolutions. Proc. 2015 IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, MA, IEEE, 1–9, https://doi.org/10.1109/CVPR.2015.7298594.

Tapiador, F. J., and Coauthors, 2012: Global precipitation measurement: Methods, datasets and applications. Atmos. Res., 104–105, 70–97, https://doi.org/10.1016/j.atmosres.2011.10.021.

The Weather Company, 2017: Aviation weather data services. Accessed January 2017, https://business.weather.com/products/data-services.

Vaisala, 2017: Thunderstorm Manager. Accessed January 2017, http://www.vaisala.com/en/products/thunderstormandlightningdetectionsystems/Pages/Vaisala-Thunderstorm-Manager.aspx.

Veillette, M. S., and R. DeLaura, 2016: Identification of convective hazards in New York oceanic airspace. Proc. 16th AIAA Aviation Technology, Integration, and Operations Conf., Washington, DC, AIAA, AIAA 2016-3762, https://doi.org/10.2514/6.2016-3762.

——, H. Iskenderian, C. J. Mattioli, and E. P. Hassey, 2015: The Offshore Precipitation Capability. 13th Conf. on Artificial Intelligence, Phoenix, AZ, Amer. Meteor. Soc., 1.2, https://ams.confex.com/ams/95Annual/webprogram/Paper259616.html.

——, ——, M. Wolfson, C. Mattioli, E. Hassey, and P. Lamey, 2016: The Offshore Precipitation Capability. MIT Lincoln Laboratory Project Rep. ATC-430, 24 pp., https://www.ll.mit.edu/sites/default/files/publication/doc/2018-05/Veillette_2016_ATC-430.pdf.

Woolf, H. M., 1968: On the computation of solar elevation angles and the determination of sunrise and sunset times. NASA Tech. Memo. X-1646, 19 pp.

Zhang, L., L. Zhang, and B. Du, 2016: Deep learning for remote sensing data: A technical tutorial on the state of the art. IEEE Geosci. Remote Sens. Mag., 4, 22–40, https://doi.org/10.1109/MGRS.2016.2540798.

Zhang, X., J. Zhao, and Y. LeCun, 2015: Character-level convolutional networks for text classification. Advances in Neural Information Processing Systems 28 (NIPS 2015): 29th Annual Conference on Neural Information Processing Systems 2015, C. Cortes et al., Eds., Neural Information Processing Systems, 649–657.
