OCTOBER 2020 M C N E E L Y E T A L . 1671

Unlocking GOES: A Statistical Framework for Quantifying the Evolution of Convective Structure in Tropical Cyclones

TREY MCNEELY AND ANN B. LEE Department of Statistics and Data Science, Carnegie Mellon University, Pittsburgh, Pennsylvania

KIMBERLY M. WOOD Department of Geosciences, Mississippi State University, Mississippi State, Mississippi

DORIT HAMMERLING Department of Applied Mathematics and Statistics, Colorado School of Mines, Golden, Colorado

(Manuscript received 2 December 2019, in final form 31 July 2020)

ABSTRACT: Tropical cyclones (TCs) rank among the most costly natural disasters in the United States, and accurate forecasts of track and intensity are critical for emergency response. Intensity guidance has improved steadily but slowly, as processes that drive intensity change are not fully understood. Because most TCs develop far from land-based observing networks, geostationary satellite imagery is critical to monitor these storms. However, these complex data can be chal- lenging to analyze in real time, and off-the-shelf machine-learning algorithms have limited applicability on this front be- cause of their ‘‘black box’’ structure. This study presents analytic tools that quantify convective structure patterns in infrared satellite imagery for overocean TCs, yielding lower-dimensional but rich representations that support analysis and visu- alization of how these patterns evolve during rapid intensity change. The proposed feature suite targets the global orga- nization, radial structure, and bulk morphology (ORB) of TCs. By combining ORB and empirical orthogonal functions, we arrive at an interpretable and rich representation of convective structure patterns that serve as inputs to machine-learning methods. This study uses the logistic lasso, a penalized generalized linear model, to relate predictors to rapid intensity change. Using ORB alone, binary classifiers identifying the presence (vs absence) of such intensity-change events can achieve accuracy comparable to classifiers using environmental predictors alone, with a combined predictor set improving classification accuracy in some settings. More complex nonlinear machine-learning methods did not perform better than the linear logistic lasso model for current data. KEYWORDS: Tropical cyclones; Satellite observations; Classification; Statistics; Model interpretation and visualization

1. Introduction Scheme (SHIPS; DeMaria and Kaplan 1999) uses large-scale, Tropical cyclones (TCs) can be long lived, powerful, and numerical weather prediction (NWP)-derived diagnostics of spatially extensive; these three factors place them among the environmental factors known to affect TC intensity change, most damaging and costly natural disasters in the United States including vertical wind shear and relative humidity. SHIPS also (Klotzbach et al. 2018). Our TC guidance has improved over relies on observations such as infrared (IR) imagery and sea the past few decades, though reductions in track forecast error surface temperature (SST). Environments in which TCs pref- have outpaced reductions in intensity forecast error (DeMaria erentially undergo rapid intensification (RI; operationally de- 2 et al. 2014). Rapid intensity change [$30-kt (1 kt ’ 0.51 m s 1) fined as an increase of $30 kt in 24 h) tend to have higher SSTs, increase or decrease in 24 h] continues to challenge forecasters, greater atmospheric humidity, and lower vertical wind shear and the need to better understand the processes that drive (e.g., Kaplan and DeMaria 2003). To target RI events, a cor- these changes spurred the Hurricane Forecast Improvement responding subset of SHIPS predictors was used to develop the Project (HFIP; http://www.hfip.org/) with specific goals to SHIPS rapid intensification index (SHIPS-RII; Kaplan et al. further reduce both track and intensity forecast errors. 2010). SHIPS predictors can also provide insight into the Because the near-storm environment influences TC inten- characteristics of rapid weakening events (RW; defined as a sity, understanding and predicting the evolution of that envi- decrease of $30 kt in 24 h; Wood and Ritchie 2015). ronment is a key component in forecasting intensity change. Although NWP-derived predictors are included in SHIPS, The operational Statistical Hurricane Intensity Prediction global NWP models such as the Global Forecast System (GFS) cannot resolve subgrid-scale processes like . In situ TC observations aid in capturing small-scale storm features, Supplemental information related to this paper is available at but such aircraft reconnaissance is infrequent and expensive. the Journals Online website: https://doi.org/10.1175/JAMC-D-19- Because most TCs develop and strengthen far from dense, 0286.s1. land-based observing networks, analysts have long relied on geostationary satellite observations to monitor TCs. Since Corresponding author: Trey McNeely, [email protected] colder cloud tops imply deeper convection and thus stronger

DOI: 10.1175/JAMC-D-19-0286.1 Ó 2020 American Meteorological Society. For information regarding reuse of this content and general copyright information, consult the AMS Copyright Policy (www.ametsoc.org/PUBSReuseLicenses). Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1672 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

FIG. 1. DAV(r) as an ORB function for organization: DAV as an ORB function of the threshold r and deviation angles for each FIG. 2. SIZE and SKEW as ORB statistics for bulk morphology: image in Fig. 8, i.e., Edouard (2014) (solid line and left inset) and (left) Example level set for a threshold of c 52208C on Edouard’s Nicole (2016) (dashed line and right inset). Tb (Fig. 8). (right) Level sets for {08, 2208, 2408, 2608C} (in de- scending order, with 2208C repeated). SIZE(c) is the area covered updrafts, IR observations of cloud-top brightness temperatures by a level set; here, the size is 400 000 km2, or 20% of the stamp

Tb provide a proxy for convective strength. In addition, as TCs area. SKEW(c) is the displacement of a level set’s center of mass intensify, the distribution of convection becomes more sym- (yellow; 34 km from center) normalized by the average displace- metric around the storm center (i.e., axisymmetric). The rela- ment of points in the set (green; 92 km from center); here, the skew tionship between convective organization and the maximum is 0.37 west-northwest. intensity of TCs was used to develop the original Dvorak technique (Dvorak 1975), a subjective method relating pat- (ii) radial structure, and (iii) bulk morphology. Each category terns in IR imagery to TC intensity. targets one aspect of convective structure by mapping the

Today, the Advanced Baseline Imager (ABI; Schmit et al. complex Tb image data to summary statistics computed over a 2017) on Geostationary Operational Environmental Satellite range of temperature or radial distance thresholds. We will (GOES)-16 and GOES-17 observes the North Atlantic (NAL) hereafter refer to the continuous treatment of ORB statistics as and eastern North Pacific (ENP) every 10 min in 16 bands at ORB functions. As demonstrated in Figs. 1 and 2, these func- spatial resolutions from 2 km for infrared channels to 0.5 km tions transition from threshold-based statistics (e.g., coverage for the visible band 2. The current version of SHIPS incor- of Tb below 2308C) to continuous functions of thresholds (e.g., porates ABI IR observations as percent coverage of Tb coverage of Tb below a continuously varying temperature). below 2308C and standard deviation of Tb, each computed Furthermore, we develop an empirical orthogonal function within a 50–200-km annulus centered on the TC (Kaplan et al. (EOF) representation of each ORB function to capture its 2015). However, single-value representations produced every main modes of variability. The EOFs are computed separately 6 h cannot fully utilize the information provided by these finer for the NAL and ENP basins via principal component analysis spatial and temporal resolutions. How, then, do we objectively (PCA); see Fig. 3 and appendix A for details. ‘‘unlock’’ the increasingly rich information contained in geo- Our final representation of TC convective structure is stationary IR imagery that may highlight physical processes physically interpretable (Figs. 4, 5) and supports both analysis associated with short-term (#24 h) TC intensity change? and visualization of evolving convective structure patterns in From a practical point of view, there are two challenges: near–real time (section 3b). The ORB functions capture some (i) forecasters have access to increasingly detailed data but IR patterns analogous to the advanced Dvorak technique have the same limited time (6 h) to prepare each operational (ADT; Olander and Velden 2007, 2019) scene types as well as forecast, and (ii) although machine-learning algorithms theo- novel approaches to potentially informative convective pat- retically can analyze sequences of high-resolution images in an terns. For example, the EOFs for radial profiles and bulk automated way, the extracted features can be difficult to in- morphology can mimic the ADT’s cloud region and region terpret. However, if we can extract meaningful features that scene type scores, whereas global organization provides an

are both informative inputs to machine-learning methods and alternate approach to assessing Tb axisymmetry versus the interpretable by forecasters, then we can reduce forecaster cloud symmetry value included in the ADT. The ORB effort and leverage some of the predictive power of modern framework is also extensible to any thresholded feature, as statistical methods. defined in section 3; this enables further characterization of To address these challenges, we develop a rich and scien- both additional ADT scene types and novel patterns in an tifically motivated family of IR imagery descriptors through objective, interpretable fashion. our global organization, radial structure, and bulk morphology This study focuses on applying the ORB framework to de-

(ORB) feature suite for Tb. These features identify patterns in velop interpretable and rich descriptions of Tb structure. To Tb structure in three broad categories: (i) global organization, demonstrate that this suite of features retains information

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1673

FIG. 3. EOFs and mean ORB functions: EOFs for all 6 ORB functions in the (top) NAL and (middle) ENP, with (bottom) sample means with 95% pointwise confidence intervals estimated by a stationary bootstrap (Politis and Romano 1994). The computed EOFs differ significantly only in the sign of the second ORB coefficient for eccentricity (ECC2), which merely flips the direction of interpretation. The difference between basins is largely contained in the sample means. Despite these differences, the same qualitative structures persist between basins.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1674 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

FIG. 4. EOFs for radial profiles: Results from PCA of NAL TCs. The first three EOFs capture 90% of the variability in the ORB FIG. 5. Hourly SIZE functions for Hurricane Katrina (2005). functions for radial structure and describe the three primary or- Lighter colors indicate higher intensities. The first two EOFs in the thogonal shapes present in their profiles; see section 3b for details. inset capture 94% of the variance in the size functions. relevant to short-term intensity change, we build statistical acquisition difficulties, this study uses the period 1998–2016; models to diagnose the current state of TC intensification. We future work will use the full period 1994–present. We refer to construct separate binary classifiers for RI and RW events that the TC-centered field at a given time as a ‘‘stamp.’’ This term is ask, ‘‘Is this TC currently in the midst of an RI event (or RW borrowed from observational astronomy, where small digital event)?’’ To predict the probability that the current state meets images of the sky centered around an astronomical object are RI or RW criteria, we use logistic lasso, a penalized logistic sometimes referred to as postage stamps. The earliest versions regression that automatically selects a set of relevant predic- tors by shrinking the regression coefficients of the other (i.e., irrelevant) predictors to zero. For the predictors included in this study, we find that several harder-to-interpret, nonlinear machine-learning methods result in similar performance as lasso, and thus we focus on the latter given its interpretability. Our analysis indicates that a combination of ORB features and the lasso can provide a powerful observational tool for relating short-term intensity change to convective structure patterns in overocean TCs. These features (i) enable rapidly digested summary and visualization of evolving convective structures while (ii) capturing the complexity of those struc- tures across many thresholds in a rich and extensible frame- work. We find that binary classification methods restricted to GOES-derived ORB predictors achieve an accuracy at least comparable to classification methods restricted to SHIPS en- vironmental predictors when distinguishing rapid change from nonrapid change events. Adding the ORB suite to the envi- ronmental predictor set can improve the results beyond indi- vidual predictor sets, with RI/non-RI in the NAL basin showing significant promise (Fig. 6; Table 1). ORB also allows us to interpret regression coefficients in a way that enables physical insight into the models (Fig. 7).

2. Data

The GOES-East/West satellites provide Tb imagery with high and consistent spatial and temporal resolutions. NOAA’s Gridded Satellite (GridSat)-GOES database contains histori- cal, hourly GOES channel 4 (IR band) T remapped to an even FIG. 6. Binary classification by logistic lasso and nonlinear clas- b sifiers: The area under the ROC curve (AUC; white markers) for 0.048 grid over the full GOES domain (758Nto758S and 1508E 8 test data, with bootstrap-estimated 95% confidence intervals to 5 W) in the Western Hemisphere through GOES-15 (up to (colored intervals) computed for each of the 16 event–basin– 2017 at the time of this study; Knapp and Wilkins 2018). predictor type combinations using the classifiers (top) LASSO, Though these resolutions are coarser than the ABI, GridSat- (middle) RF, and(bottom) GBCT. For all settings, ORB-only has a GOES covers the period 1994–2017 and thus ensures we have performance on par with SHIPS-only. Qualitative results persist a reasonably large dataset for analysis. Due to early data across different classifiers. See text for details.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1675

TABLE 1. The P values for testing for differences in AUC rele- (Rhome et al. 2006). These eight NWP-derived atmospheric vant to the goals of the paper, estimated as in appendix C; P values predictors and two oceanic predictors are used alongside our significant at the 0.05 level for each individual model are shown in Tb-derived predictors to assess the value of combining envi- boldface. Rows 1–3 correspond to tests of ORB-only AUCs lagging ronmental predictors with TC structure information (Table 2) significantly behind SHIPS-only, and rows 4–6 correspond to tests in predicting the current rate of TC intensity change. of SHIPS 1 ORB improving on SHIPS-only. In addition to a minimum intensity of 50 kt, we further re- RI/non-RI RW/non-RW strict our sample to overwater TCs to reduce the influence of land on TC convective structure. For a TC center to be con- NAL ENP NAL ENP sidered over water, it must (i) be at least 250 km from land and LASSO 0.886 0.152 0.233 0.828 (ii) not move within 250 km of land during the 24-h window RF 0.999 0.404 0.639 0.450 under consideration. Since rapid intensity changes of at least GBCT 0.952 0.080 0.066 0.324 30 kt in 24 h are rare events, we ensure our study relies on a H : AUC 5 AUC 0 ORB-only SHIPS-only reasonable sample size by defining rapid change events as a 24-h Ha: AUCORB-only , AUCSHIPS-only intensity change of at least 25 kt; this lower threshold is frequently LASSO 0.027 0.176 0.192 0.049 examined in operational forecasts such as SHIPS-RII as well as RF 0.010 0.157 0.231 0.100 RI/RW literature (Kaplan et al. 2010, 2015; Wood and Ritchie GBCT 0.106 0.492 0.596 0.292 2015). We examine the evolution of all TCs that meet our criteria H : AUC 1 5 AUC 0 SHIPS ORB SHIPS-only between 1998 and 2016 in the NAL and ENP basins, and we H : AUC 1 . AUC a SHIPS ORB SHIPS-only analyze both intensification and weakening events. Finally, we only include stamps with less than 5% of pixels missing. The above criteria applied to 1998–2016 produce a dataset of of ORB were inspired by nonparametric approaches to quan- 2811 6-h observations (Table 3), including 174 distinct RI events tifying galaxy morphology (e.g., Conselice 2014). and 162 distinct RW events—that is, nonoverlapping runs of We use the National Hurricane Center’s ‘‘best track’’ hur- consecutive 6-h time steps during which a TC is undergoing RI ricane database (HURDAT2; Landsea and Franklin 2013)to or RW. While the above sample is used to analyze RI and RW limit our TC sample to tropical time steps with intensities of at events, we will compute the features described in section 3 on least 50 kt to ensure identifiable structure in Tb and exclude ex- nearly 25 000 hourly stamps (14 470 NAL; 9818 ENP) to track tratropical times. From these 6-h samples, we compute 24-h in- the evolution of TCs at finer temporal resolution; that is, we tensity change to identify rapid versus nonrapid change events develop our ORB feature suite using all stamps from the time and then interpolate each track to hourly resolution to extract the the TC first breaches the 50-kt threshold until the last time it

associated GridSat-GOES Tb stamp of radius 800 km (Fig. 8). drops below 50 kt, regardless of proximity to land. Due to the The SHIPS developmental database provides historical values relaxed restrictions on the included stamps, this set of hourly of observed and NWP-derived predictors used to generate SHIPS stamps is more than 6 times as large as the restricted, 6-h set used forecasts. In this study, we use SHIPS predictors for reasons of for the modeling of section 4. availability and known correlation with TC intensity change, not for the purpose of comparing our statistical model with SHIPS 3. ORB features for convective structure operational models. Our goal is to classify the current (rather than ORB seeks to support statistical analysis that includes inter- the future) state of the TC as RI/non-RI (or, equivalently, pretable Tb features derived from IR imagery. This section de- RW/non-RW). We restrict our analysis to the 0-h SHIPS values; scribes the development of ‘‘ORB coefficients’’ (which later serve that is, we only use values valid at each time step rather than the as inputs to our statistical models) from a suite of ORB statistics. forecast values valid at subsequent times. We are then able to Throughout this section, we denote the brightness temperature at demonstrate the merits of the ORB feature suite in interpretable location s as Tb(s). Each ORB statistic is based on a single Tb analyses when used alongside selected 0-h SHIPS predictors. stamp and is parameterized by a threshold value for either the We include two observed oceanic SHIPS predictors: ocean maximum cloud-top temperature considered c, or the radial dis- heat content (OHC) and Reynolds tance from the TC center r. Though all of these statistics are

(RSST). We also select eight NWP-derived atmospheric fields: functions of both Tb and either c or r, we will for notational 200-hPa zonal wind (U200), relative humidity at three different convenience simply denote them by f(c)andf(r). The values of levels (RHLO, RHMD, and RHHI), three measures of vertical c and r can be fixed, or they can be allowed to vary continuously, wind shear (SHRD, SHDC, and SHRS), and maximum po- resulting in ‘‘fixed threshold’’ ORB statistics or ‘‘continuous’’ tential intensity (VMPI) (DeMaria 2018). The predictors in ORB functions, respectively. this initial set are selected for their (i) relationship to TC convection, (ii) suspected relevance to current intensity given a. From ORB statistics to ORB functions our study’s goal of classifying current intensity change, and ORB statistics quantify the spatial structure of a Tb stamp by (iii) interpretability. For example, 200–850-hPa vertical capturing (i) global organization by using departures of the wind shear calculated over a 200–800-km annulus (SHRD) image gradient from perfect symmetry about the center of the and over a 0–500-km annulus (SHDC) are part of the SHIPS-RII storm, (ii) radial structure by using azimuthal averages of Tb predictor suite (Kaplan et al. 2015), and 500–850-hPa vertical about the center, and (iii) the bulk morphology of the storm by wind shear (SHRS) also correlates with 24-h intensity change using descriptions of Tb level sets. These ORB statistics strike

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1676 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

FIG. 7. Interpreting regression coefficients in logistic lasso with the SHIPS 1 ORB predictor set (NAL): Regression coefficients for SHIPS 1 ORB lasso models for (left) RI events and (center) RW events, as well as (right) the difference between the RI and RW regression coefficients in the NAL basin. Variables are ordered vertically by RI 2 RW such that an increase in the variable with the largest RI 2 RW coefficient, holding all other variables constant, is associated with an increase in the probability of RI much more strongly than an increase in the probability of RW. See the text for more discussion. a balance in the trade-off between interpretability and descrip- TC intensity (Fig. 9). However, it is unclear which choice of tiveness. More specifically, they attempt to capture Tb struc- radial threshold would best describe this structure at all time ture using a handful of numbers that correspond to intuitive steps. Many features of these curves can vary from TC to TC or aspects of TC structure (interpretability) while retaining as within a single TC, while others may not even appear at all time

much of the information contained in the original Tb image as steps. For example, the radius at which the global minimum possible (descriptiveness). occurs is not fixed in Fig. 9, while an eye (high cloud-top Rather than describing properties of a stamp for a fixed temperatures near r 5 0) is only occasionally present. Though threshold (as in a single ORB statistic), we can consider the we can account for many of these subtleties, generating ad hoc whole range of thresholds simultaneously. This is the basic idea features from T(r) will sacrifice the cohesiveness of a single behind ORB functions. For example, the deviation angle var- ORB function. [We documented our attempt at such an ap- iance [DAV; defined in section 3a(1)] is typically computed proach for T(r)inMcNeely et al. 2019.] for a fixed radius (e.g., 250 km) or over a few different radii. In this study, we consider the entire range of thresholds c or However, we can choose any arbitrarily high number of sample distances r simultaneously. This approach ensures that we do thresholds. In the limit, we have a dense sampling that results in not miss structure in the feature, as selecting just a few an approximation of a continuous function, the ORB function. thresholds is likely to. The density of our sampling scheme is For Hurricane Nicole (2016), the radial profile functions determined by the precision of the data; here we sample at the [T(r); defined in section 3a(2)] exhibit structure that follows 0.048 spatial resolution of GridSat-GOES for ORB functions

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1677

FIG. 8. GOES Tb stamps for (left) Edouard at 1800 UTC 16 Sep 2014 (showing an eye–eyewall structure at an intensity of 95 kt) and (right) Nicole at 0100 UTC 9 Oct 2016 (showing an asymmetric at an intensity of ;47 kt).

f(r) of radius r and every 18C for ORB functions f(c) of tem- compute the 2D image gradient of Tb at each position s 5 (x, y) perature threshold c. [denoted =Tb(s)]. Consider the core structure of Edouard (2014) in Fig. 8: the image gradient of a perfectly axisymmetric 1) GLOBAL ORGANIZATION TC is expected to point either directly toward or away from the

The characterization of global organization utilizes the de- TC center. The deviation angle compares the direction of =Tb(s) viation angle variance technique (Pineros et al. 2008). We first to the gradient directions expected of such an idealized storm.

TABLE 2. Predictors in each predictor set, and their descriptions. Lags for persistence (D6V and D12V) are consistent with SHIPS persistence predictor ‘‘INCV,’’ the incremental 6-h changes in intensity. Note that the use of three different lags will multiply the total number of predictors by a factor of 4for each of SHIPS-only and ORB-only.

SHIPS 1 Persistence (d 5 3 1 48 5 51) SHIPS 1 ORB (d 5 48 1 68 5 116) Persistence (d 5 3) SHIPS-only (d 5 12 3 4 5 48) ORB-only (d 5 17 3 4 5 68) V Current intensity SHRD 850–200-hPa shear magnitude DAV 3 ORB coefs, section 3a(1) (200–800 km)

D6V 6-h intensity change SHDC 850–200-hPa vortex-removed RAD 3 ORB coefs, section 3a(2) shear magnitude (0–500 km)

D12V 12- to 6-h intensity change SHRS 850–500-hPa shear magnitude SIZE 2 ORB coefs, section 3a(3) OHC Ocean heat content from altimetry SKEW 3 ORB coefs, section 3a(3) where available, from SST anomaly otherwise RSST Reynolds SST SHAPE 3 ORB coefficients, section 3a(3) RHLO 850–700-hPa relative humidity ECC 3 ORB coefs, section 3a(3) (200–800 km)

RHMD 700–500-hPa relative humidity D6() 6-h change in the above predictors (200–800 km)

RHHI 500–300-hPa relative humidity D12() 12-h change in the above predictors (200–800 km)

VMPI Maximum potential intensity from D24() 24-h change in the above predictors Kerry Emanuel equation U200 200-hPa zonal wind, (200–800 km) LAT Latitude, 8N from equator LON Longitude, 8E from prime meridian

D6() 6-h change in the above predictors D12() 12-h change in the above predictors D24() 24-h change in the above predictors

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1678 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

TABLE 3. Observations used in our study; the first seven rows of the table refer to numbers of stamps. The ‘‘Unique TCs’’ row in- dicates the number of TCs in the sample. The bottom two rows indicate the number of RI/RW events, which each consist of mul- tiple temporally adjacent stamps. The statistical models in section 4 are based on the final GOES 1 SHIPS available sample (boldface). The sample size in the NAL basin is reduced because of more frequent landfall than in the ENP basin.

NAL ENP Total 6-h HURDAT2 entries 8438 8080 16 518 $50-kt HURDAT2 entries 4111 3400 7511 24-h history available 4017 3339 7356 $250 km from land 2225 2627 4852 GOES 1 SHIPS available 1236 1575 2811 RI obs 361 602 963 RW obs 221 587 808 Unique TCs 154 206 360 FIG. 9. Radial profile T(r) as an ORB function for radial struc- ture: Radial profiles of Edouard’s Tb (Fig. 8; 1800 UTC 16 Sep RI events 71 103 174 2014) are shown both centered on the best track (solid line) and RW events 56 106 162 centered on the eye (dashed line). If we use the best track center as the origin about which T(r) is computed, the temperature of the eye (near r 5 0) is deflated; we start including the eyewall at r 5 0 We denote the deviation angle of the image gradient of Tb at a because the center of the image is not the center of the eye. point s as c(s); values near c(s) 5 0 indicate local axisymmetry. We summarize the level of organization by taking the vari- ance of c(s) over circular regions centered on the TC. We will features from general ORB functions; these radial profiles denote the ORB statistic for global organization over a region serve as ORB functions for radial structure. of fixed radius r as Figure 10 demonstrates the relationship between the radial profile and TC intensity. Radial profiles discard angular in- DAV(r) 5 Var[c(s)jjsj # r]. (1) formation, making them informative primarily when bulk symmetry is high, as determined from bulk morphology ORB We will consider the value of DAV over a range of radii from functions defined in the next section. In such cases, radial r 5 r 5 ORB functions for global 50 km to 400 km, resulting in profile temperatures can help discriminate between eye– organization. Figure 1 shows DAV(r) as ORB functions for eyewall structures and uniform central dense overcasts. In an the two infrared images of Fig. 8. eye–eyewall structure, the temperature at the center r 5 0 will

be high, while the minimum of the radial profile, minrT(r), will 2) RADIAL STRUCTURE be low; this typically indicates a strong TC. Conversely, a The characterization of radial structures is based on radial uniform central dense overcast is marked by low temperatures profiles (e.g., Sanabia et al. 2014), the angular averages of

Tb(s). At a fixed distance r from the TC center, define the ORB statistic for radial structure as ð 1 2p T(r) 5 T (r, u) du. (2) p b 2 0

The primary complication comes from centering the coordi- nate system (i.e., defining the location of r 5 0). Sanabia et al. (2014) solve this by manually identifying the center of each image, whereas we automate image centering by maximizing T(r) at low r when an eye is present (McNeely et al. 2019). In this work, we use the best track center when an eye is not de- tected. This automated image centering is also used in the computation of DAV(r). Our treatment of the radial profile also differs from earlier work in its use of functional features, defined in section 3b, rather than designed features along the curve. Sanabia et al. FIG. 10. Hourly radial profiles: Example hourly eye-centered (2014) identify several critical points along this curve for use as radial profiles for Hurricane Nicole (2016). Lighter colors indicate features, such as the radial location of the minimum Tb within higher intensities. The inset depicts Nicole’s intensity trajectory. 200 km. In lieu of such designed features on T(r), section 3b The high-intensity phase of the TC’s evolution corresponds to the details a method for the recovery of data-driven functional clearest eye–eyewall structure.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1679

5 5 a 1 a 1 1 a 1 near the center r 0 due to the absence of a clear, warm eye f (x) 1f1(x) 2f2(x) ifi(x) , (4) and generally indicates a TC is below hurricane strength

(,64 kt). The slope of the profile also provides insight into the where fi(x) denotes the ith basis function and ai 5 hf, fii is the extent of the storm’s convection: gentle slopes indicate deep scalar orthogonal projection of f(x) onto the ith basis function, convection sustained over large regions, and sharp slopes in- representing how much of the shape of fi(x) is present in f(x). dicate more localized deep convection most often associated We call the value of fi(x) the loading of the basis function at x. with an eyewall. For continuous functions, the number of elements in a basis is infinite. However, provided that a proper basis is chosen, one 3) BULK MORPHOLOGY can reasonably approximate functions using a finite linear

The characterization of the bulk morphology of Tb utilizes combination of basis functions. In this work, we will use a data- sublevel sets. A sublevel set (hereinafter ‘‘level set’’) on Tb(s) driven approach called principal component analysis to find a identifies the points s 5 (x, y) for which Tb is at or below a low-dimensional basis representation of our ORB functions. certain threshold. Formally, define a level set as PCA returns an ordered set of orthogonal and normalized vectors whose linear combinations can fully reconstruct the L 5 # (c) fsjTb(s) cg. (3) original vectors—that is, the densely sampled ORB functions. Henceforth we will refer to the basis functions f1, f2, ... from As the temperature threshold c is reduced, the level set shrinks PCA as empirical orthogonal functions, to emphasize that to a region of lower temperatures and typically stronger con- these basis functions are empirical (data driven) and orthog-

vection. This definition of L (c) gives rise to a variety of ORB onal. We compute EOFs {fi} separately for each basin and each statistics for bulk morphology. ORB feature. The EOFs are constant in time, whereas the

ORB currently uses four summaries for the structure of L (c); projections ai of the stamps onto respective EOFs fi are scalars see McNeely et al. (2019) for the full mathematical definitions. varying with time. We refer to the latter ai-values as ORB SIZE(c), a function of the level set L (c), gives the area cov- coefficients; these coefficients serve as a time-varying repre- ered by L (c) in kilometers squared (the black region in Fig. 2), sentation of TC convective structure. providing insight into the coverage of various levels of con- The first few EOFs provide the most information, and in our vection in a TC. Similarly, SKEW(c) is a function of the level case, these tend to capture larger-scale structures of an ORB set L (c) and gives the displacement of the center of mass of function. We can project an observed ORB function (a single L (c), normalized by the mean radius of points in L (c), where function computed from a stamp) onto the first K EOFs of that ‘‘radius’’ is the distance from the TC center. This reveals feature to obtain the best K-dimensional approximation. whether the TC cloud pattern is biased in a particular di- [Figure A1, described in more detail in appendix A, demon- rection (Fig. 2). The remaining two are measures of the strates such a reconstruction for radial (RAD); appendix A raggedness (SHAPE) and stretching or eccentricity (ECC) also includes details on PCA.] In our study, we use the first K

of the level set L (c). Note that any functions of level sets coefficients {a1, a2, ..., aK} of each ORB function at each time (such as our SIZE, SKEW, SHAPE, and ECC) can them- stamp as both inputs (predictors) to machine-learning methods selves be regarded as functions of the level set threshold c. as well as a means to visualize the evolution of TC convective Hence, we refer to their continuous approximations as ORB structure with time. functions for bulk morphology. 2) ANALYSIS OF TC STRUCTURE AND EVOLUTION b. Analysis of TC structure via ORB and PCA We compute basin-specific EOFs separately on 14 470 NAL stamps and 9818 ENP stamps (section 2). It only takes K 5 2or

1) DIMENSIONALITY REDUCTION VIA PCA 3 EOFs fi(x) to explain most of the variability in the six current When we densely sample the ORB statistics, we obtain ap- ORB functions (i.e., DAV, RAD, SIZE, SKEW, SHAPE, and proximations of continuous ORB functions. Now, we will ECC). While objective methods (such as cross validation) exist switch to a different representation (coordinate system) in for selecting the number of basis functions, we here use a which we can reduce the dimensionality of the ORB functions heuristic 90% variance threshold. A larger set may prove while retaining most of the information in the original func- useful in forecasting-only settings, but some of the higher- tions. (Here, ‘‘dimension’’ refers to the number of values we order basis functions lack the interpretability of the first two– use to represent each function; thus, the original dimension of three EOFs. the ORB functions would be the number of threshold/sampling Reconstruction using the first three EOFs of radial structure values.) A more compact representation simplifies the analysis explains 90% of the variance of observed radial profiles and enables us to visualize the evolution of TC convective (Fig. 4). EOF 1 (solid purple) roughly measures the overall structure over time in a low-dimensional phase diagram de- temperature anomaly of the profile, with higher loadings in the

fined by the new representation. TC interior; f1(r) is positive, so generally the corresponding Given a suitable orthogonal basis, we can write any contin- ORB coefficient is positive (a1 . 0) when the radial profile uous function, such as the radial temperature profiles in T(r) has warmer-than-average cloud tops. EOF 2 (dotted blue) Eq. (2), as a linear combination of basis functions. More spe- measures the core temperature relative to outer-band tem- cifically, after subtracting the average over the basin, the peratures. Since f2(r) changes sign from positive to negative function f(x) describing deviations from the basin average is around 200 km, generally a2 . 0 when the radial profile

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1680 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59 shows a warmer core; hence, a2 , 0 could indicate a uniform central dense overcast or an obscured eye. EOF 3 (dashed green) indicates the presence of an eye–eyewall structure,

where f3(r) . 0 for r , 50 km and r . 300 km and f3(r) , 0 elsewhere; hence generally a3 . 0 indicates an IR-visible eye. These three EOFs alone can construct close approximations to the wide range of the profiles apparent in Fig. 10. In Fig. 5, we apply the same ORB framework to the SIZE function for Hurricane Katrina (2005). For SIZE, we only need K 5 2 EOFs to capture 90% of the variance in either basin. Because EOF 1 (solid purple) of SIZE measures the rate at which L (c) accumulates area with increasing temperature

thresholds c, a1 for SIZE indicates colder overall Tb. EOF 2 (dashed green) flattens the middle of the SIZE function when

a2 . 0, resulting in more coverage at high temperatures (the exposed sea surface) and low temperatures (deep convection). Although this study’s classification methods use 6-h values to match predictors from the SHIPS developmental database

(section 4), one benefit of Tb observations is high temporal resolution. It is straightforward to track the structural evolu-

tion of TCs with ORB coefficients. The EOFs fi(x) for each ORB feature are constant within a basin, but the ORB coef-

ficients are implicitly functions of time; that is, ai 5 ai(t). One temporal phenomenon that can be observed in TCs is the di- urnal cycle (e.g., Dunion et al. 2014), where deeper convection occurs near the TC core around local sunset and then spreads radially outward through the following afternoon. In Fig 11, the time evolution of the first two ORB coefficients for SIZE exhibits a 24-h periodic clockwise oscillation for Hurricane Katrina (2005). Though SIZE is insensitive to the location of convection and thus cannot measure the outward motion of such pulses, the observed oscillations in this phase space may be a manifestation of the TC diurnal cycle. In the phase dia- gram, we see two primary deviations from this cycle: when the TC turns north and accelerates [near point D, when it becomes a major hurricane (100 kt)] and when the TC begins to interact with land (the rightward translation toward point E). c. Summary of the ORB framework ORB quantifies convective structure in TCs as revealed by

Tb imagery in a way that is both descriptive enough to support statistical analyses and interpretable enough to support scien- tific inquiry into and clear visualization of temporal evolution of rapidly changing TCs. Figure 12 gives an example of the ORB framework as applied to Hurricane Katrina. The framework under which ORB is developed allows extension to FIG. 11. Visualizing SIZE evolution with ORB: (a) The spatial other aspects of convective structure as well as other obser- trajectory shows the path of Hurricane Katrina (2005) through the vational bands. The main steps in the construction of an ORB Gulf of Mexico from 1800 UTC 25 Aug through 0000 UTC 30 Aug, representation are as follows: with the labels marking the locations of displayed ORB functions. 1) Develop a summary of T structure reliant on a single (b) At six points (A–F, labeled) in time, we plot the SIZE function b (black) and the sample mean SIZE function for the NAL basin threshold, an ORB statistic. For example, the average T b (gray). (c) The trajectory of the ORB coefficients of the SIZE can be computed at a single distance from the TC center, function in phase space during the storm’s path. The clockwise such as the mapping T(r) [see subsection 3a and McNeely cyclic path in (c) matches the local night/day cycle as well as the et al. (2019)]. storm’s path in (a); see the text for details. 2) Compute the summary for a dense set of threshold values. View the dense sampling as an approximation of a contin- uous function, the ORB function (see Figs. 1, 2, and 9 and section 3a).

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1681

FIG. 12. Full ORB evolution of Katrina (2005) with selected stamps: (top left) A map of the trajectory of Katrina with select points (A–H) labeled. (top right) The intensity over time, with the same points labeled. (middle) GOES imagery for the eight labeled points. (bottom) The full suite of ORB coefficients as a function of time, with the original time series (blue) smoothed by an exponential weighted 2 moving average (black) with decay chosen to achieve a roughness [average value of jd2f(t)/dt2j] of 0.2s h 2 across the basin. Where no blue curve is visible, the smoothed time series is nearly equal to the original time series.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1682 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

3) Compute the ORB functions for all stamps from the same These predictor sets have been chosen to demonstrate the TC basin (here, NAL or ENP). Apply PCA to the resulting value of adding ORB coefficients to SHIPS environmental set of ORB functions; the derived empirical orthogonal predictors. The fourth predictor set provides a baseline for functions reflect the basin-specific variability of each ORB other predictor sets or a best-case scenario of classification function. We refer to the projection of ORB functions onto performance. Because RI and RW events are defined using the EOFs as ORB coefficients. The ORB coefficients are the information partially contained in these persistence terms, one time-varying inputs to statistical models, as in section 4 (see could argue that including such a term only makes sense in Figs. 4, 5, and 10 and section 3b). prognosis, but not when the goal (as in our case) is diagnosis 4) Interpret the EOFs. This interpretation relies on the design and relating the current RI (or RW) state to meaningful of the original ORB statistics (step 1). For example, PCA of predictors. the average temperature as a function of radius detects the presence of an eye–eyewall structure in its third EOF (see a. Methods: Probabilistic classification via logistic lasso Figs. Figs. 4, 5, and 11 and section 3b). We would like to answer the following question: ‘‘Is the storm currently undergoing or entering a rapid change event, One can use the ORB coefficients to visualize or analyze TC and what convective features are associated with such an convective evolution. In the next section, we use the ORB event?’’ We proceed by constructing a statistical model—the coefficients as inputs to machine-learning methods that can sift logistic lasso—that uses the d components of a predictor set at through a large number of predictors and relate rapid change time I (which we denote by a d-dimensional vector of predic- events to a subset of relevant predictors. The interpretability of tors, x ) to classify the binary state Y of the storm at that time. the EOFs and associated ORB coefficients allows one to di- i i (For example, with the SHIPS-only predictor set in Table 2, rectly tie the results of such an analysis to meaningful aspects of we have a total of d 5 48 predictors for 10 SHIPS variables convective structure. plus latitude and longitude and their lagged changes, whereas d 5 68 for the ORB-only predictor set.) Following the previ- 4. Modeling rapid intensity change ously described labeling scheme, we say that Y 5 1 if the ith The ultimate goals of ORB are to (i) advance scientific un- i observation of the TC occurs during a RI (or RW) event, derstanding of TC intensity change and (ii) support future whereas Y 5 0 corresponds to the observed absence of improvements in TC intensity forecasting by providing a frame- i RI (or RW). work for analysis of IR imagery. To evaluate the effectiveness of We will use TCs prior to 2010 to learn the probability of an the ORB framework, we here consider the problem of diagnos- ongoing rapid change event, p [ P(Y 5 1jx ) and then use the ing the state of rapid change in a TC (where RI and RW events i i i learned statistical model to predict the probability of rapid are treated separately). The problem of obtaining the state Y t change for TCs that occur during and after 2010; that is, our given data X collected up to and including time t is sometimes 1:t training set spans 1998–2009, whereas our test set spans 2010– referred to as filtering (Cressie and Wikle 2015), to be distin- 16.1 These probabilities can be converted to binary (RI/non-RI guished from forecasting or predicting Y based on data X 2 . t 1:t 1 or RW/non-RW) classes. Below, we describe the details of the Following the operational convention of 24 h, we label every logistic lasso regression model and how to properly fit and observation in our sample as either RI or non-RI (or analo- assess such a model. gously as either RW or non-RW). If a given 6-hourly obser- vation lies within any 24-h moving window that was classified as 1) LOGISTIC LASSO RI or RW according to our 25-kt definition, then that 6-hourly Logistic regression is similar to traditional linear regression observation is labeled as RI or RW. All other 6-hourly ob- but bounds the mean response E(Yjx) 5 P(Y5 1jx), or servations are labeled as non-RI/non-RW. equivalently the probabilities p(x) [ P(Y 5 1jx), to values We build a probabilistic binary classifier (see section 4a for between 0 and 1. We also assume a different distribution for details) that predicts the current state of the storm with respect the response, in this case the binary response or ‘‘labels’’ Y , to rapid change events, using four different predictor sets or i than in traditional linear regression. Rather than directly fitting combinations of inputs. Y using linear regression with independent and identically The four predictor sets in our analysis are as follows: i distributed (iid) errors «i from a normal distribution, we in- (i) The SHIPS-only predictor set contains only the 10 SHIPS stead assume that the labels Yi (conditional on xi) are iid ob- predictors we selected for this study, as well as TC latitude servations from a Bernoulli distribution with parameter pi.We and longitude (see section 2 and Table 2). then fit a linear function to logit transformations of pi; this (ii) The ORB-only predictor set contains only ORB coeffi- results in the generalized linear model

cients derived from Tb imagery. (iii) The SHIPS 1 ORB predictor set adds ORB coefficients to the SHIPS-only predictor set but does not add explicit 1 While the statistical models use train-test data splitting, the interactions between the two. EOFs are computed for the full dataset. Such semisupervised 1 (iv) The SHIPS Persistence set includes SHIPS predictors learning approaches, which use all data to learn a basis for di- and intensity persistence (defined by SHIPS as the current mension reduction but then predict on test data only, are common intensity, the 6-h change in intensity, and the 12-to-6-h in statistical machine learning. See, for example, Belkin and Niyogi change in intensity). (2004) for classification on nonlinear manifolds.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1683 p binary classes C , for which we have a ground truth. Here, C 5 logit(p ) [ log i 5 b 1 b x 1 1 b x , (5) i i i 2 0 1 i1 d id 5 1 pi 1 denotes an ongoing RI (or RW) event and Ci 0 denotes the absence of such an event. We perform this conversion by where x represents the jth predictor or the jth component of w ij choosing a probability cutoff p ; for observations xi for which + the input vector xi. As in standard linear regression, we can use ^ . the probability pi p , the classifier predicts onset of or on- maximum likelihood estimation (MLE) to find the best-fitting 5 ^ , + going rapid change (Ci 1); if pi p , the classifier predicts b 5 (b , ... , b ) coefficients. Note that our data come from 1 d that there is no rapid change (Ci 5 0). The cutoff is often fixed w time series of intensities; due to temporal correlations, the to a default value of p 5 0.5, which makes sense in the case of assumption of conditional independence Yijxi may not hold for balanced data (i.e., when the two classes occur with the same events close in time. Nevertheless, our logistic lasso model can proportions) but not when one class represents rare events such w still provide useful diagnosis. as RI and RW. As we change the value of p in the range [0, 1], The advantage of a linear model, or in this case a generalized we can make the resulting classifier more or less sensitive to the w linear model, is that it is easy to directly relate predictors to the event, trading between higher true positive rates at lower p E mean response (Yijxi). Recall that the coefficients in a tradi- (TPR; the fraction of events correctly identified as such) and 5 b 1 b 1 1 b 1 « w tional linear regression model Yi 0 1xi1 dxid i higher true negative rates at higher p (TNR; the fraction of « with normally distributed errors i reflect how important a pre- nonevents correctly identified as such). A so-called receiver dictor is relative to the other variables in the model. In a linear operating characteristic (ROC) curve describes the properties b model, the regression coefficient j tells us that an increase in the of a classifier by showing its TPR and TNR values at different w jth variable xij by one unit, while holding all the other d–1 var- cutoffs p (Fig. 13). An ideal classifier would have a ROC b . iables constant, is on average associated with an increase (if j curve hugging the top-left corner, which represents a 100% b , b 0) or decrease (if j 0) in the response Yi by j jj units. The true positive and true negative rate. (The trivial SHIPS 1 linearity of logistic regression admits a similarly straightforward Persistence model tends to perform best, as expected). A poor interpretation that relates the probabilities pi to the predictors. classifier would be near the gray diagonal, which corresponds From Eq. (5) we have that an increase of xij by one unit, holding to no better than chance. Hence, a common way of quantifying all other variables constant, is associated with a change of the log the performance of a classifier for all possible choices of 2 5 odds [i.e., the logarithm of the odds pi/(1 pi)] of the event Yi thresholds is to compute the area under the ROC curve b b 1by j, which is equivalent to multiplying the odds by exp( j). (AUC). In our study, we include both ROC curves and AUC Unfortunately, when d is large (i.e., when we have many values. In addition, to address the issue of unbalanced data, we b^ w predictors), the estimate is highly variable. The statistical choose the threshold p of the binary classifiers in section 4b model also becomes difficult to interpret. To reduce the vari- to maximize the so-called balanced accuracy (equivalent to ance of the statistical model and to help interpretability, one half of the Peirce score) defined as BA 5 (TPR 1 TNR)/2 typically reduces the size of the model by excluding variables (Brodersen et al. 2010). that contribute little to the fit from the predictor set. The lo- gistic lasso is able to sift through a large number of predictors b. Results: Logistic lasso by maximizing a penalized MLE problem " # 1) CLASSIFICATION BY LOGISTIC LASSO d b^ b^ 5 b D 2 lå b We fit the lasso coefficients in Eq. (5) for all four predictor argmax L( ; ) j jj , (6) b j51 sets in Table 2. Figure 13 and Fig. 6 (top) summarize the final lasso classification results on test data by type of rapid change where L(b; D ) denotes the likelihood of the coefficients b 5 (RI or RW) and by basin (ENP or NAL); with four settings and

(b1, ..., bd) when observing data D 5 {(x1, Y1), ...,{(xN, YN)}, four predictor sets there are 16 different models. In Fig. 6, the l d b l $ the term åj51j jj is the lasso penalty, and 0 is a tuning square markers represent the AUC on TCs between 2010 and parameter. Adding this penalty for l . 0 discourages large 2016 (the test sample) for statistical models fitted on TCs be- (positive or negative) coefficient values and results in all co- tween 1998 and 2009 (the train sample). We assess the uncer- efficients shrinking toward 0 as l increases. To penalize all tainty in the AUC estimates (conditional on the train sample) predictors equally, the predictors x are scaled (to standard by resampling the test sample 250 times with replacement. The deviation 1) and centered (to mean 0). For sufficiently large l, figure shows a 95% pivotal bootstrap confidence interval of small regression coefficients are set to 0. Increasing l leads to the AUC for each model fitted with logistic regression or one of smaller statistical models with fewer predictors. We fit our the nonlinear classifiers discussed in section 4c. logistic lasso model to TCs between 1998 and 2009 via 10-fold According to a permutation test2 of differences in AUC, the cross validation; see appendix B for details. Note that we later statistical models based on the satellite-derived ORB-only in sections 4b and 4c assess the final statistical models (with predictor set achieve a classification accuracy comparable to tuned parameters) on an independent test set not used in the the models based on the SHIPS-only environmental predictor cross validation; in our case, all TCs from 2010 to 2016. set. This result suggests that ORB is able to adequately assess

2) ASSESSING CLASSIFICATION PERFORMANCE To evaluate the performance of the fitted statistical models ^ 2 in an objective way, we convert the predicted probabilities pi to See appendix C for details on the permutation testing.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1684 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59

FIG. 13. Classification by logistic lasso: ROC curve comparison across all four predictor sets, by basin and by type of rapid change. See Fig. 6 and Table 1 for AUC metrics with bootstrap confidence intervals and significance tests of differences in AUC between statistical models. aspects of evolving convective structure relevant to intensity appear robust across widely different classification methods (see change by directly capturing the structure of clouds (rather section 4c and Fig. 6). than the environment in which those structures evolve). Furthermore, ORB predictors used as a complement to tra- 2) REGRESSION COEFFICIENTS IN THE FITTED ditional environmental predictors, as in the SHIPS 1 ORB LOGISTIC LASSO predictor set, may further improve accuracy. The significance As mentioned, the logistic lasso model [Eq. (5)] allows for tests for RI/non-RI in the NAL basin in particular underscore fast variable selection and straightforward interpretation of the the potential for the ORB framework to complement envi- regression coefficients b^. Figure 7 shows the fitted nonzero ronmental predictors. lasso coefficients (20 for RI/non-RI and 37 for RW/non-RW) In summary, our analysis implies that comprehensive study in the NAL basin; the corresponding ‘‘selected’’ variables form of GOES IR can provide a high-resolution view of the evolution a subset of the 116 SHIPS 1 ORB predictors. of TC internal structure, and that information complements tra- The regression coefficient for RAD3 in this model is ditional NWP-derived environmental predictors. These results 20.53, meaning that a 1s increase in RAD3 (the clarity of

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1685 the eye–eyewall structure) is associated with a 0.53 decrease in winds and shear). RW encompasses a wider range of regression the predicted log odds of the rapid change event, or a decrease coefficients, including global measures such as SIZE, vari- of a factor of exp(20.53) 5 0.59 in the predicted odds, given ous DAV predictors, and change in overall Tb (D24RAD1), that all other predictors are held constant. Likewise, the co- alongside the core structure measurements. efficient for U200 in that same fit is 20.75; the same 1s increase The logistic lasso models fitted for the ENP basin (see the in U200 is associated with a decrease in the log odds by 0.75, online supplemental material) show similar results, with dif- and the odds by a factor of exp(20.75) 5 0.47. We center the ferences largely explained by the narrow region of TC activity. predictors (in addition to scaling to s 5 1) so that a predictor Outside of this region, SSTs decrease and become hostile to value of zero corresponds to the sample mean value in TCs. The logistic lasso for RI fitted under the SHIPS 1 ORB that basin. predictor set predicts RI when (i) SST is high (RSST), (ii) shear

When predictors with high collinearity are present, a tradi- is low (SHDC), (iii) the TC is farther south (LAT), (iv) Tb is tional multiple regression will suffer from inflated coefficient low and dropping, particularly in the core (RAD2; RAD1; variance. Lasso suffers from a different version of the problem, D24RAD1), (v) the eye–eyewall structure is intensifying since it will often select one variable out of a group with (D24RAD3), and (vi) the storm has access to ample upper- strongly correlated variables, returning regression coefficients level moisture (RHHI). of 0 for the rest. For this reason, the recovered support (the set of predictors with nonzero regression coefficients in the fit) of c. Results: Nonlinear classifiers the final fitted classifier may exclude predictors expected to To determine the robustness of our results across classifi- appear; for example, OHC does not appear in the logistic lasso cation methods as well as to benchmark the logistic lasso, we for ENP RI, but that does not mean it is not important to the implemented two additional, nonlinear classification methods: intensity-change process. Instead, other variables (such as random forests (RF) and gradient-boosted classification trees RSST) may contain similar information with respect to inten- (GBCT). Both RF and GBCT can capture complex relation- sity change as OHC. This effect is reversed for ENP RW, ships between variables, such as interactions between predic- where RSST is dropped from the model while OHC remains. tors and nonlinear relationships between predictors and rapid A positive regression coefficient means that above-average change. First, the ranger implementation of random forests values of a predictor are associated with increased log odds of (Breiman 2001; Wright and Ziegler 2015) tallies class ‘‘votes’’ rapid intensity change, while negative coefficients mean that from many deeply grown classification trees with variables above-average values of a predictor are associated with re- chosen from a random subset at each branch, sampling the duced log odds. Hence, rapid intensity change is more likely training data with replacement for each tree.3 Second, the when the sign of the predictor matches the sign of the coeffi- Xgboost implementation of gradient-boosted classification cient. In the left panel, the classifier for RI tends to predict trees (Friedman 2002; Chen and Guestrin 2016) iteratively ongoing RI events when (i) current eye–eyewall structure is builds trees of a fixed size, fitting the error remaining after the weak (negative RAD3 regression coefficient), (ii) eye–eyewall addition of the previous tree.4 structure is strengthening (positive D24RAD3 regression co- More complex machine-learning methods (such as RF, efficient), (iii) current 200-hPa zonal winds are low (negative GBCT, and neural networks) have a higher capacity to fit data U200 regression coefficient), (iv) current wind shear is low well but require larger training sample sizes to achieve a better (negative SHDC regression coefficient), and (v) current mid- fit compared to simpler linear models (such as the logistic level humidity is elevated (positive RHMD regression coeffi- lasso). Indeed, in our settings the logistic lasso either matches cient). In the center panel, the RW classifier uses a wider range or outperforms RF and GBCT (Fig. 6; Table 1). The perfor- of weaker effects, predicting RW for cases of (i) higher current mance of the two higher-capacity classifiers here appears to be interior symmetry than exterior symmetry (negative SHAPE2 limited by the small size of the historical sample and the few regression coefficient), (ii) recent decay of interior symmetry occurrences of rapid changes within that sample.

(positive D6SHAPE2 and D12SHAPE2 regression coefficients), With regard to robustness of results across methods: The (iii) decaying eye–eyewall structure (negative D24RAD3 re- relative performance of the 16 statistical models fitted for gression coefficient), and (iv) strong current eye–eyewall different rapid change types, basins, and predictor sets is con- structure (high RAD3 regression coefficient). sistent across all three classification methods (shown as rows in The SHIPS 1 ORB results suggest a more complex rela- Fig. 6). In addition, Figs 7 and 8 in the online supplemental tionship between structure, environment, and intensity in the material demonstrate that the variable selection and ranking of case of RW than in the case of RI (via a large number of small predictor importances to remain stable across these rather to medium-sized regression coefficients as opposed to a few different machine-learning methods; for example, the Gini dominant effects). RI is thought to be strongly driven by in- importance in RF tends to agree with the magnitude of the ternal processes, while RW results from a more complex re- lasso coefficients. Note that RF and GBCT to some extent lationship between the TC and its environment. This is reflected in the logistic lasso fitted under the ORB-only setting (shown in the online supplemental material). RI is dominated 3 The RF models use Gini impurity for node splitting, with d1/2 by radial profile predictors (for core structure), SHAPE2 predictors sampled at each node. Each forest contains 10 000 trees. (which reflects the core symmetry), and SKEW1 (for overall 4 The GBCT models use logistic loss and a learning rate of 0.001 skew of the cloud tops, which is influenced by upper-level on depth-2 trees. The number of trees varies for each model.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1686 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59 account for correlation between predictors with different cri- study the relationship between onset of or ongoing rapid teria for variable selection. Our analysis indicates that our change events and environmental and structural predictors. main results are model agnostic. We conclude that logistic lasso is a good statistical model for rapid change in our setting be- b. Future work cause of its statistical performance for the data at hand, fast The ORB framework is algorithmic, making it easily ex- variable selection, robustness of qualitative results, and added tensible to other aspects of convective structure in IR imagery interpretability. and to other bands such as water vapor (WV) or derived products such as differenced IR-WV imagery (Olander and 5. Conclusions and future work Velden 2009). Such extensions may better probe smaller-scale, transient structures in high-resolution IR imagery and identify a. Conclusions spatial and/or temporal patterns in overshooting tops revealed High spatial and temporal resolution images from GOES by differenced IR-WV imagery. We expect this framework to platforms hold promise to support improvements in under- be able to extract similar features from the higher-resolution standing and forecasting TC intensity change in combination ABI observations, and we will investigate ORB’s performance with statistical tools that can ‘‘unlock’’ the rich information once more data become available. they contain. The ORB framework provides a method to Our logistic lasso approach has demonstrated the value of generate a set of ORB coefficients that target aspects of TC GOES-derived ORB coefficients in a simplified setting: clas- spatial structure in IR imagery that are meaningful to scientists sifying binary RI versus non-RI (or RW vs non-RW) events and forecasters and enable both application and interpretation under the assumption of independence of nearby observations of powerful statistical methods. In this work, we use the logistic in time. Future work will relax these assumptions by (i) treating lasso—a generalized linear regression model—to capture the intensity change as a continuous quantity rather than dis- relationship between IR imagery and rapid change events. The cretizing it into RI/non-RI or RW/non-RW and (ii) accounting logistic lasso can accept a large number of explanatory vari- for temporal dependence between observations by creating a ables (including the ORB coefficients from PCA) as inputs and time series model that accounts for persistence in environ- then automatically select a subset of variables relevant to rapid mental and GOES-derived predictors alike. Such statistical change with fitted regression coefficients that give a measure of models could augment existing intensity guidance schemes by the relative importance of the selected inputs. identifying spatial–temporal markers of upcoming intensity We apply our ORB framework to GridSat-GOES IR imagery change in IR imagery, and provide the means of adding GOES- and base the computations on novel (bulk morphology) as well derived evolutionary history to NWP-derived predictions of a as existing (radial profiles and organization) ORB statistics. Our TC’s future environment. approach recovers a low-dimensional phase space for each Finally, this work utilizes a linear model without interaction structural feature, enabling visualization of the trajectory of terms. Rapid intensity-change events are driven by the inter- evolving TC convective structure over time as well as the ap- play of internal TC processes and the TC environment; future plication of statistical analysis methods such as the logistic lasso. analyses will examine the effects of such interactions, partic- The ORB coefficients show promise in classifying rapid ularly between ORB coefficients and environmental predic- intensity-change events on their own (as in the ORB-only tors, by adding interaction terms to the statistical model. As the predictor set); however, for best results these features should suite of predictors grows larger and more complex, the logistic be used alongside SHIPS predictors. SHIPS and ORB pre- lasso can be modified (such as via the group lasso penalty of dictors have complementary strengths: SHIPS values provide Yuan and Lin 2006) to consider groups of predictors rather current estimates of the TC environment, and ORB features than individual predictors, resulting in more intuitive models. capture the history and current structure of the TC itself, which likely contain important markers of ongoing and future Acknowledgments. Authors Lee and McNeely were sup- changes in TC intensity. In addition, the extended historical ported in part by the National Science Foundation under record available for both sets of predictors will support future DMS-1520786, and author Wood was supported by the studies and guidance tools in analyzing and leveraging the re- Mississippi State University Office of Research and Economic lationship between the environment and TC structure as Development. We also thank Dr. David John Gagne for his revealed by ORB. helpful comments on the paper and the anonymous reviewers Though nonlinear machine-learning techniques are capable whose comments greatly improved the organization and clarity of fitting more complex relationships, they do not seem to of this paper. return better predictions for these data. There is a trade-off between higher-capacity, more flexible methods, and the need APPENDIX A for larger training samples. The number of unique RI and RW events in our study is of the same order as the number of Details on the Calculation of EOFs via PCA predictors (Table 3). Linear methods place more structural We first center our data so that the sample mean of each n 5 assumptions on the statistical model and tend to perform better ORB function is zero: (1/n)åi51zi 0, where a densely sampled in a low-sample-size setting. This property, in combination ORB function evaluated at d threshold values is here repre- with variable selection and straightforward interpretation of sented by the (random) vector z, and n denotes the number regression coefficients, makes the logistic lasso an ideal tool to of observations of z in the basin of interest. The centered

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1687

and D is a d 3 d diagonal matrix with diagonal elements a1 $ a2 $ $ ad $ 0 known as the singular values of Z. The first K EOFs are given by the first K columns of V. Because ZV 5 UD, the PC embedding of the ith data point in K dimensions—that

is, the set of ORB coefficients (a1, ..., aK)—is given by the first K elements of the ith row of UD.

APPENDIX B

Fitting the Logistic Lasso We fit the logistic lasso model on TCs prior to 2010 using the ‘‘glmnet’’ package for R (Friedman et al. 2009); we select l via tenfold cross validation (CV). This procedure fits b on 9 of 10 so-called folds, holding out a rotating 10th so-called validation fold. The tuning parameter l is chosen so as to maximize the fit on the held-out validation folds (see Hastie et al. 2005 for details). CV prevents ‘‘overfitting’’ (fitting too closely to training data and hence performing poorly on future observations) or ‘‘underfitting’’ (inadequately capturing gener- FIG. A1. Reconstructing an ORB function: The radial profile for alizable features in the training data). In our setting, CV results Edouard (Fig. 9) is projected onto the first three EOFs for radial in a different l value for each of the 16 fits. When choosing CV structure (Fig. 4), resulting in K 5 3 ORB coefficients (224.2, 255.7, folds, we randomly sample TCs rather than observations in or- and 102.1). (top) The contribution of each EOF to the reconstruction der to account for temporal dependence within a single TC’s of T(r), i.e., the weighted EOFs aifi(r)fori 5 1, 2, and 3. (bottom) To approximate the original radial profile (‘‘Original’’; thick solid black lifespan. Note that we verify the final statistical models (with line) we add these three contributions to the basin mean (‘‘Mean’’; tuned parameters) on an independent test set not used in the solid gray line). With all K 5 3 contributions (‘‘Mean111213’’; green cross validation; in our case, this test set consists of all TCs from dashed line), we find a close approximation to the original function, 2010 to 2016. capturing the overall structure of the observed ORB function T(r). APPENDIX C observations z1, z2, ..., zn then form the rows of an n 3 d matrix Z. Note that the sample covariance matrix of z is given by the Testing for Differences in AUC for Fitted d 3 d matrix S 5 (1/n)ZTZ. Statistical Models In PCA, one computes the eigenvectors of the covariance matrix Z. These vectors, which we refer to as EOFs, form a a. Hypotheses d basis for the original data. Let v1, ... , vK in R denote the We formally test two of the conjectures stated in the intro- eigenvectors that correspond to the K , d largest eigenvalues. duction to this work. In this work, we refer to y as the ith EOF, f (x). In a principal i i (i) Test 1: Our first conjecture is that ORB-only does at least as component (PC) map, the projections of the data onto these well as SHIPS-only. In words, we ask, ‘‘Is there evidence vectors are used as new coordinates; that is, the PC map of the that ORB-only would do worse than SHIPS-only?’’ That observation z is given by i is, test z 1 (z v , ..., z v ) 5 (a , ..., a ), i i 1 i K 1 K 5 H0: AUCORB-only AUCSHIPS-only versus where the dot represents a scalar product of vectors. In , Ha: AUCORB only AUCSHIPS only . (C1) the ORB framework, the scalar projections (a1, ... , aK) are - - our ORB coefficients. We can reconstruct the approximate ORB function with ORB coefficients and EOFs according to (ii) Test 2: Our second conjecture is that SHIPS 1 ORB may ’ 1 a 1 1 a f (x) f (x) 1f1(x) KfK(x), improve upon SHIPS-only. In words, we ask, ‘‘Is there evidence SHIPS 1 ORB improves upon SHIPS-only?’’ as demonstrated in Fig. A1. The PC map with ORB coefficients That is, test (a1, ... , aK) acts as a low-dimensional ‘‘phase space’’ for an aspect of convective structure; see Fig. 11 for an example. 5 H0: AUCSHIPS1ORB AUCSHIPS-only versus Algorithmically, the PC map is easy to compute using a H : AUC . AUC . (C2) singular value decomposition (SVD) of Z: a SHIPS1ORB SHIPS-only

Z 5 UDVT . b. Test statistics

Here U is an n 3 d orthogonal matrix, V is a d 3 d orthogonal Let X1, ... , XN be the probabilistic predictions from the matrix (where the columns are eigenvectors v1, ... , vd of S), ORB-only model for test data, and let Y1, ... , YN be the

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC 1688 JOURNAL OF APPLIED METEOROLOGY AND CLIMATOLOGY VOLUME 59 predictions from the SHIPS-only model for the same N test eastern North Pacific basins. Wea. Forecasting, 14, 326–337, points. Our test statistic is the difference in AUC values be- https://doi.org/10.1175/1520-0434(1999)014,0326:AUSHIP. tween the two statistical models, 2.0.CO;2. ——, C. R. Sampson, J. A. Knaff, and K. D. Musgrave, 2014: Is ...... intensity guidance improving? Bull. Amer. T1(X1, , XN , Y1, , YN ) Meteor. Soc., 95, 387–398, https://doi.org/10.1175/BAMS-D- 5 ... 2 ... AUC(X1, , XN ) AUC(Y1, , Yn). (C3) 12-00240.1. Dunion, J. P., C. D. Thorncroft, and C. S. Velden, 2014: The ... Likewise, for the second test, let X1, , XN be the probabi- tropical cyclone diurnal cycle of mature hurricanes. Mon. 1 listic predictions from the SHIPS ORB model for the same N Wea. Rev., 142, 3900–3919, https://doi.org/10.1175/MWR- ... test data and Y1, , YN be the predictions for SHIPS-only as D-13-00191.1. before. Then, define the second test statistic as Dvorak, V. F., 1975: Tropical cyclone intensity analysis and fore- casting from satellite imagery. Mon. Wea. Rev., 103, 420–430, ...... T2(X1, , XN , Y1, , YN ) https://doi.org/10.1175/1520-0493(1975)103,0420:TCIAAF. 5 AUC(X , ..., X ) 2 AUC(Y , ...., Y ). (C4) 2.0.CO;2. 1 N 1 N Friedman, J. H., 2002: Stochastic gradient boosting. Comput. Stat. Data Anal., 38, 367–378, https://doi.org/10.1016/S0167- c. Permutation tests 9473(01)00065-2. To assess the significance of an observed difference in AUC, ——, T. Hastie, and R. Tibshirani, 2009: glmnet: Lasso and elastic- net regularized generalized linear models, version 1.1-4. R we estimate the distribution of Ti under the null by forming 5 5 5 package, http://CRAN.R-project.org/package glmnet. B 1000 permutations of the sample points. That is, for b Hastie, T., R. Tibshirani, J. Friedman, and J. Franklin, 2005: The ... ~b ... ~b ... 1, , B, randomly sample X1 , , XN from {X1, , XN, Elements of Statistical Learning: Data Mining, Inference and ... Y1, , YN} without replacement, and assign the remaining N Prediction. 2nd ed. Springer, 533 pp. ~b ... ~b predictions to Y1 , , YN . For each of the B permutations, Kaplan, J., and M. DeMaria, 2003: Large-scale characteristics ~b 5 ~b ... ~b ~b ... ~b compute Ti Ti(X1 , , XN , Y1 , , YN ). Our estimated of rapidly intensifying tropical cyclones in the North p values for the two tests are given by Atlantic basin. Wea. Forecasting, 18, 1093–1108, https:// " # doi.org/10.1175/1520-0434(2003)018,1093:LCORIT.2.0. B CO;2. ^ 5 1 å I ~b , = 1 p1 1 (T1 T1) (B 1) and ——, ——, and J. A. Knaff, 2010: A revised tropical cyclone rapid b51 " # intensification index for the Atlantic and eastern North Pacific B basins. Wea. Forecasting, 25, 220–241, https://doi.org/10.1175/ ^ 5 1 å I ~b . = 1 p2 1 (T2 T2) (B 1), (C5) 2009WAF2222280.1. b51 ——, and Coauthors, 2015: Evaluating environmental impacts on I tropical cyclone rapid intensification predictability utilizing where () is the indicator function, taking value 1 when the statistical models. Wea. Forecasting, 30, 1374–1396, https:// statement is true and value 0 when it is false. doi.org/10.1175/WAF-D-15-0032.1. Klotzbach,P.J.,S.G.Bowen,R.Pielke,andM.Bell,2018: REFERENCES Continental U.S. hurricane landfall frequency and associated Belkin, M., and P. Niyogi, 2004: Semi-supervised learning on damage: Observations and future risks. Bull. Amer. Meteor. Soc., Riemannian manifolds. Mach. Learn., 56, 209–239, https:// 99, 1359–1376, https://doi.org/10.1175/BAMS-D-17-0184.1. doi.org/10.1023/B:MACH.0000033120.25363.1e. Knapp, K. R., and S. L. Wilkins, 2018: Gridded satellite (GridSat) Breiman, L., 2001: Random forests. Mach. Learn., 45, 5–32, https:// GOES and CONUS data. Earth Syst. Sci. Data, 10, 1417–1425, doi.org/10.1023/A:1010933404324. https://doi.org/10.5194/essd-10-1417-2018. Brodersen, K. H., C. S. Ong, K. E. Stephan, and J. M. Buhmann, Landsea, C. W., and J. L. Franklin, 2013: Atlantic hurricane data- 2010: The balanced accuracy and its posterior distribution. base uncertainty and presentation of a new database format. 2010 20th Int. Conf. on Pattern Recognition, Istanbul, Turkey, Mon. Wea. Rev., 141, 3576–3592, https://doi.org/10.1175/ IEEE, 3121–3124, https://doi.org/10.1109/ICPR.2010.764. MWR-D-12-00254.1. Chen, T., and C. Guestrin, 2016: Xgboost: A scalable tree boosting McNeely, T., A. B. Lee, D. Hammerling, and K. Wood, 2019: system. Proc. 22nd ACM SIGKDD Int. Conf. on Knowledge Quantifying the spatial structure of tropical cyclone imagery. Discovery and Data Mining, San Francisco, CA, ACM, 785– NCAR Tech. Note NCAR/TN-5571STR, 18 pp., https:// 794, https://doi.org/10.1145/2939672.2939785. doi.org/10.5065/5frb-ws04. Conselice, C. J., 2014: The evolution of galaxy structure over cos- Olander, T. L., and C. S. Velden, 2007: The advanced Dvorak mic time. Annu. Rev. Astron. Astrophys., 52, 291–337, https:// technique: Continued development of an objective scheme to doi.org/10.1146/annurev-astro-081913-040037. estimate tropical cyclone intensity using geostationary infra- Cressie, N., and C. K. Wikle, 2015: Statistics for Spatio-Temporal red satellite imagery. Wea. Forecasting, 22, 287–298, https:// Data. John Wiley and Sons, 624 pp. doi.org/10.1175/WAF975.1. DeMaria, M., 2018: SHIPS developmental database file format and ——, and ——, 2009: Tropical cyclone convection and intensity predictor descriptions. Colorado State University Tech. Rep., 6 pp., analysis using differenced infrared and water vapor imagery. http://rammb.cira.colostate.edu/research/tropical_cyclones/ Wea. Forecasting, 24, 1558–1572, https://doi.org/10.1175/ ships/docs/ships_predictor_file_2018.doc. 2009WAF2222284.1. ——, and J. Kaplan, 1999: An updated Statistical Hurricane ——, and ——, 2019: The advanced Dvorak technique (ADT) Intensity Prediction Scheme (SHIPS) for the Atlantic and for estimating tropical cyclone intensity: Update and new

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC OCTOBER 2020 M C N E E L Y E T A L . 1689

capabilities. Wea. Forecasting, 34, 905–922, https://doi.org/ determined by radial profiles of inner-core infrared brightness 10.1175/WAF-D-19-0007.1. temperature. Mon. Wea. Rev., 142, 4581–4599, https://doi.org/ Pineros, M. F., E. A. Ritchie, and J. S. Tyo, 2008: Objective mea- 10.1175/MWR-D-13-00336.1. sures of tropical cyclone structure and intensity change from Schmit, T. J., P. Griffith, M. M. Gunshor, J. M. Daniels, S. J. remotely sensed infrared image data. IEEE Trans. Geosci. Goodman, and W. J. Lebair, 2017: A closer look at the ABI on Remote Sens., 46, 3574–3580, https://doi.org/10.1109/TGRS. the GOES-R series. Bull. Amer. Meteor. Soc., 98, 681–698, 2008.2000819. https://doi.org/10.1175/BAMS-D-15-00230.1. Politis, D. N., and J. P. Romano, 1994: The stationary bootstrap. Wood, K. M., and E. A. Ritchie, 2015: A definition for rapid J. Amer. Stat. Assoc., 89, 1303–1313, https://doi.org/10.1080/ weakening of North Atlantic and eastern North Pacific trop- 01621459.1994.10476870. ical cyclones. Geophys. Res. Lett., 42, 10 091–10 097, https:// Rhome, J. R., C. A. Sisko, and R. D. Knabb, 2006: On the calcu- doi.org/10.1002/2015GL066697. lation of vertical shear: An operational perspective. 27th Conf. Wright, M. N., and A. Ziegler, 2015: ranger: A fast implementation on Hurricanes and Tropical Meteorology, Monterey, CA, of random forests for high dimensional data in C11 and R. Amer. Meteor. Soc., 14A.4, https://ams.confex.com/ams/ J. Stat. Software, 77, https://doi.org/10.18637/jss.v077.i01. 27Hurricanes/techprogram/paper_108724.htm. Yuan, M., and Y. Lin, 2006: Model selection and estimation in Sanabia, E. R., B. S. Barrett, and C. M. Fine, 2014: Relationships regression with grouped variables. J. Roy. Stat. Soc., 68B, 49– between tropical cyclone intensity and eyewall structure as 67, https://doi.org/10.1111/j.1467-9868.2005.00532.x.

Unauthenticated | Downloaded 10/03/21 04:03 AM UTC