Bulletin of the Seismological Society of America, Vol. XX, No. XX, pp. –, – 2018, doi: 10.1785/0120170223 Ⓔ Turing-Style Tests for UCERF3 Synthetic Catalogs by Morgan T. Page and Nicholas J. van der Elst
Abstract Epidemic-type aftershock sequence (ETAS) catalogs generated from the third Uniform California Earthquake Rupture Forecast (UCERF3) model are unique in that they are the first to combine a complex, fault-based long-term forecast with short-term earthquake clustering statistics. We present Turing-style tests to examine whether these synthetic catalogs can successfully imitate observed earthquake behav- ior in California. We find that UCERF3-ETAS is more spatially diffuse than the ob- served historic catalog in California and that it lacks quiet periods that are present in the real catalog. Although mean aftershock productivity of the observed catalog is matched closely by UCERF3-ETAS, the real catalog has more intersequence produc- tivity variability and small mainshocks have more foreshocks. In sum, we find that UCERF3-ETAS differs from the observed catalog in ways that are foreseeable from its modeling simplifications. The tests we present here can be used on any model that produces suites of synthetic catalogs; as such, in addition to providing avenues for future improvements to the model, they could be incorporated into testing platforms such as Collaboratory for the Study of Earthquake Predictability (CSEP).
Electronic Supplement: Five 1000-yr third Uniform California Earthquake Rup- ture Forecast epidemic-type aftershock sequence (UCERF3-ETAS) synthetic catalogs, and figures showing aftershock decay rates for M 5.5–6.5 mainshocks, mean fore- shock and aftershock productivity as a function of differential magnitude, and results from the clustering analysis.
Introduction Traditionally, earthquake probability models have the model, the long-term time-dependent model (UCERF3- largely been in two separate time domains. Probabilistic haz- TD) (Field et al., 2015), includes elastic rebound effects ard analysis employing fault-based information and elastic related to the time of the last event on major faults. Finally, rebound has operated on time scales of decades to centuries, short-term time dependence is added to this model in the whereas empirical statistical models incorporating clustering form of epidemic-type aftershock sequence (ETAS) modeling due to aftershocks have operated on the scale of days to (Ogata, 1988) to make UCERF3-ETAS (Field et al., 2017). years. Bringing these different models into alignment re- UCERF3-ETAS differs from conventional ETAS due to quires bridging the “medium-term forecasting gap” (Jordan, the explicit modeling of faults. In particular, although 2012) that separates them as well as reconciling inherent stat- conventional ETAS employs the Gutenberg–Richter (G-R) istical and spatial differences. magnitude distribution (Gutenberg and Richter, 1944)at The third Uniform California Earthquake Rupture the smallest spatial scales, the magnitude distributions in Forecast (UCERF3) model is the first unified earthquake fore- UCERF3 are more variable. UCERF3 is constrained to fol- casting model to include elastic rebound effects, fault-based low G-R scaling in large regions (northern California and ruptures, and clustering driven by Omori–Utsu statistics southern California), but in smaller regions it deviates from (Omori, 1895; Utsu, 1972). The base of the model is the time- this scaling. On average, the magnitude distributions on the independent model (UCERF3-TI) (Field et al., 2014), which major modeled faults are characteristic (Wesnousky et al., usesaninversionmethod(Page et al.,2014) to relax segmen- 1983; Schwartz and Coppersmith, 1984), meaning that the tation and solve for the rates of hundreds of thousands of rates of M ⪆ 6:5 earthquakes are higher than a G-R extrapo- potential ruptures on an interconnected network of faults. To lation from small earthquake rates would predict. However, these fault-based ruptures, off-fault seismicity is added from there is considerable variation—in addition to faults with smoothed historical seismicity data. The next progression of varying degrees of characteristic magnitude behavior, some
BSSA Early Edition / 1 2 M. T. Page and N. J. van der Elst faults have magnitude distributions that are approximately Early in the UCERF3-ETAS catalogs, these spontaneous G-R and still others are somewhat anticharacteristic. events account for 30% of all events; after 1000 yrs, they The variable magnitude distributions in the UCERF3 account for 20%. Thus, it is possible for the catalogs to have model have a substantial effect on the triggering probabilities different statistical behavior as they evolve. Also, because all for large earthquakes. For potential foreshocks that occur in synthetic catalogs include aftershocks from the same histori- close proximity to characteristic faults, the probability of cal events, we expect the catalogs to be more similar at the triggering a large earthquake is elevated relative to the fore- start of 1000-yr simulations than at the end. For these rea- shock probabilities in conventional ETAS; similarly, the sons, in many of our tests we compare the 28-yr observed model predicts that earthquakes occurring near anticharacter- catalog to 28-yr snapshots taken from both the beginning istic faults have a lower probability of triggering a large and end of 1000-yr synthetic catalogs. (M ⪆ 6:5) earthquake. This effect is in agreement with a sim- ilar effect seen in the Agnew and Jones (1991) model that is also due to the adoption of characteristic magnitude distribu- Seismological Turing Tests tions (Michael, 2012). A fault with a sufficiently character- Seismicity Rate istic magnitude distribution, could, in theory, cause a runaway sequence of aftershocks for parameters that would The mean seismicity rate in UCERF3 is 8.3 M ≥ 5 be subcritical in conventional ETAS; however, in UCERF3- earthquakes per year. By design, this is a bit higher than the ETAS this effect is mitigated by elastic rebound, which historical (1850–2011) seismicity rate in California of 7.9 effectively removes many large earthquakes from the pool of M ≥ 5 earthquakes per year, and higher still than the recent potential aftershocks following rupture of a fault. (1984–2011) seismicity rate of 5.8 M ≥ 5 earthquakes per By modeling faults, elastic rebound, and short-term year (Felzer, 2013b). Although there are several logic-tree clustering, UCERF3-ETAS marries previously disparate branches with different seismicity rates in UCERF3, more types of earthquake forecasting and bridges the medium- weight was given to high-seismicity rates in the model be- term forecasting gap. Given both the complexity and novelty cause the true long-term seismicity rate in California is likely of the model, testing is required to ensure that results are con- to be higher than the historical rates due to the lack of M ≥ 8 sistent with observed seismicity. To this end, we devise a events in the post-1850 era (Felzer, 2013d). The difference in series of seismological Turing tests. The original Turing test mean seismicity rate is evident in Figure 1, which shows the was proposed by mathematician Alan Turing to sidestep range of small earthquake rates for 10-day windows in both what he saw as the poorly defined question of whether or not the real and synthetic UCERF3-ETAS catalogs. However, machines could truly think. Instead, he devised a game to even when the rates are normalized by the mean rate (Fig. 1c), determine whether a machine could successfully imitate the the distributions are still different. The real California catalog language behavior of a human (Turing, 1950). In our seis- from 1984 to 2012 (Felzer, 2013a) has more quiet periods mological Turing tests, we ask whether synthetic catalogs than synthetic catalogs. This difference is significant given produced by UCERF3-ETAS can successfully imitate the the scatter seen in shorter UCERF3-ETAS catalogs as well statistical behavior seen in the real California earthquake (Fig. 1c). This may be because of the spontaneous back- catalog. Although the observed California catalog is quite ground events in UCERF3-ETAS, which account for limited in duration, we have many long synthetic catalogs. ∼30% of the events early in synthetic catalogs and 20% We can thus construct many synthetic-catalog snapshots that of events 1000 yrs into a synthetic catalog (the fraction of match the duration of the real catalog. These snapshots can events that are spontaneous diminishes as the catalogs are be used to construct error bounds for various metrics; run for longer, because there are more previous events UCERF3-ETAS fails the Turing test if the observed data producing aftershocks). These spontaneous events are Pois- are outside of those bounds. sonian in time, but in reality many of them represent or- We compare the statistics of one hundred 1000-yr phaned aftershocks of parent earthquakes that occur prior synthetic UCERF3-ETAS catalogs to the observed catalog to the start, outside the spatial bounds of the catalog, or be- in California for a 28-yr period from 1984 to 2011 (Felzer, low the minimum magnitude used (van der Elst, 2017). Thus, 2013a). These tests are fully retrospective, because we com- they may be more temporally clustered than is assumed in pare UCERF3-ETAS to an observational catalog that was the modeling. At the high end of seismicity rates, UCERF3- available at the time of its construction. Both observed and ETAS and the observed catalog agree (Fig. 1c, inset). synthetic catalogs contain earthquakes down to M 2.5. The spatial distribution of small (M ≥ 2:5) earthquakes UCERF3-ETAS catalogs are generated with direct Omori for real and synthetic catalogs is shown in Figure 2. parameters from Hardebeck (2013). All synthetic catalogs UCERF3-ETAS catalogs are more spatially diffuse than the start in 2012 and include aftershocks from the UCERF3 his- real catalog. The spontaneous background events from the torical catalog (Felzer, 2013a). Rates of spontaneous back- UCERF-ETAS catalog are shown in Figure 2c, and they are ground events in synthetic catalogs, which in the ETAS also more diffuse than the real catalog, which shows that the model are a proxy for aftershocks of earthquakes outside of source of the discrepancy is not due (or only due) to the dis- the spatial and temporal bounds of the model, vary with time. tance kernel used for aftershock triggering. Small earthquake BSSA Early Edition Turing-Style Tests for UCERF3 Synthetic Catalogs 3
(a) 0.03 (c) 1 UCERF3-ETAS 0.9 0.02 0.8 0.01 0.7
Probability density 0 0.6 0 50 100 150 UCERF3-ETAS Seismicity rate 0.5 Observed catalog 0.4 1 (b) 0.03 Observed catalog 0.3 0.99 0.02 Cumulative fraction 0.2 0.98
0.01 0.1 0.97 0204060 0
Probability density 0 0 50 100 150 0 0.5 1 1.5 2 Seismicity rate Normalized seismicity rate
Figure 1. Seismicity rate (M ≥ 2:5) for 10-day windows for (a) one hundred 1000-yr third Uniform California Earthquake Rupture Forecast epidemic-type aftershock sequence (UCERF3-ETAS) catalogs and (b) the real California catalog. UCERF3-ETAS catalogs lack quiet periods present in the real catalog. Vertical dashed lines show the mean seismicity rate. (c) Two hundred 28-yr snapshots of UCERF3- ETAS catalogs (half of the snapshots used are from the beginning of 1000-yr synthetic catalogs and half are from the end) have fewer quiet periods than the observed catalog, even when the mean rate difference is removed by normalization. The inset shows that for the high end of the seismicity-rate distribution, the observed catalog falls within the spread defined by the UCERF3-ETAS catalogs. The color version of this figure is available only in the electronic edition. rates in UCERF3 are from the smoothed seismicity model Magnitude Distribution (Felzer, 2013c), which is based on the adaptive algorithm The yearly magnitude distribution for UCERF3-ETAS is of Helmstetter et al. (2007). This adaptive smoothing tech- compared with the observed catalog in Figure 3a. In this nique performed the best in a 5-yr prospective test of 17 fore- figure, the higher mean seismicity rate in UCERF3-ETAS is cast models in California (Zechar et al., 2013). Although this visible, as well as more severe magnitude rounding. The ob- smoothing technique produces a mean forecast that does well M ≥ 6:4 in testing, it does not produce synthetic catalogs with the spa- served catalog has a surplus in the rate of events tial behavior of the real catalog. Ideally, individual synthetic relative to UCERF3-ETAS, but this is not statistically signifi- catalogs would have similar spatial statistics as the real cata- cant given the small numbers of these larger earthquakes. b log, whereas the mean of many synthetic catalogs would be Figure 3b shows -value estimates versus minimum diffuse enough to perform well as a probability map. This magnitude for each catalog, which allows us to focus on the could be accomplished by applying a stochastic curdling magnitude probability distribution rather than differences in b algorithm to individual catalogs to make them less diffuse. rate. The -value estimates are computed using maximum- The idea would be to model the observed spatial clustering likelihood (Aki, 1965) and are corrected for magnitude statistics in individual catalogs, whereas appropriately cap- rounding. The estimates assume no cutoff in maximum mag- turing the uncertainty as to where future seismicity clusters nitude. This has the effect of inflating the b-value estimate might be with intercatalog variability. for higher magnitudes, as shown in the b-value estimation for We quantify the difference between the synthetic and a pure G-R distribution with a magnitude truncation of M 8:4 observed spatial distributions by calculating the 2D intere- max (thin line, Fig. 3b). The UCERF3-ETAS catalog vent distance (ignoring depth) between each pair of earth- has a slower fall-off above M 8 than this example. Also quakes in each catalog, respectively, following work by visible are rising b-values relative to b 1 at intermediate Kagan (2007). The number of pairs within a given interevent magnitudes (M 4–5.7); these are consistent with a probability distance R approximately follows a power-law distribution drop relative to G-R scaling at around M 7. The b-value estimates between the synthetic and observed catalogs are N R ∝ Rδ; EQ-TARGET;temp:intralink-;df1;55;171 2 1 consistent at 95% confidence for the M 5.65 bin and above. as shown in Figure 2f. We estimate the power-law exponent δ The observed California catalog has b-values consistent for M ≥ 3 events between interevent distances of 1 and with 1.0 for most minimum magnitudes; however, there are de- 100 km. For the observed catalog, we fit δ 1:16; for viations for several bins at small magnitudes. Although these twenty 28-yr synthetic catalogs, δ ranges between 1.31 and deviations are statistically significant and preclude a single 1.47. The lower slope in the observed California catalog is b-value at 95% confidence, we caution that the catalog is not consistent with the tighter spatial clustering seen in the a perfect representation of earthquakes that occur. M 2.5 is a map-based plots (Fig. 2a–d). reasonable completeness threshold for this time period in most BSSA Early Edition 4 M. T. Page and N. J. van der Elst
(a)42 3 (b) 42 3 UCERF3-ETAS 41 41 UCERF3-ETAS (beginning of 2.5 (end of catalog) 2.5 40 catalog) 40 39 39 2 log 2 log 10
38 38 10 ( ( N N
37 1.5 ≥ 37 1.5 ≥ M M 2.5)
36 36 2.5)
1 1 35 35
34 34 0.5 0.5
33 33
32 0 32 0 –124 –123 –122 –121 –120 –119 –118 –117 –116 –115 –124 –123 –122 –121 –120 –119 –118 –117 –116 –115
3 (c) 42 (d) 42 3 UCERF3-ETAS Observed catalog 41 41 background 2.5 (1984-2012) 2.5 40 40
39 39
2 log 2 log 10 38 38 10 ( ( N N 1.5
37 ≥ 1.5
37 ≥ M M 2.5) 36 36 2.5)
1 1 35 35
34 34 0.5 0.5
33 33
32 0 32 0 –124 –123 –122 –121 –120 –119 –118 –117 –116 –115 –124 –123 –122 –121 –120 –119 –118 –117 –116 –115
42 2 (f) 0 (e) Ratio 10 41 (UCERF3-ETAS / observed) 1.8
1.6 40 10–1
39 1.4
–2 38 1.2 10
37 1 10–3 36 0.8
35 0.6 10–4 UCERF3-ETAS 34 0.4 Observed catalog 33 0.2 –5 Fraction of pairs within distance 10 10–1 100 101 102 103 32 0 –125 –124 –123 –122 –121 –120 –119 –118 –117 –116 –115 Interevent distance (km)
Figure 2. 28-yr snapshots of UCERF3-ETAS at (a) the start and (b) the end of synthetic 1000-yr catalogs. (c) The background (spontaneous) events from the catalog shown in (a). (d) The real catalog (1984–2012) is considerably less diffuse than the synthetic UCERF3-ETAS catalogs, and also less diffuse than the synthetic background events. (e) Event-rate ratio between one 1000-yr UCERF3-ETAS catalog and the observed catalog, for bins with a mean rate of at least 5 M ≥ 2:5 earthquakes in either the synthetic or observed catalog in a 28-yr period (note that the color scale is clipped). (f) The fraction of M ≥ 3 earthquake pairs within a given interevent distance, for 20 UCERF3-ETAS 28-yr snapshots (at the beginning of synthetic catalogs) and the observed catalog. The color version of this figure is available only in the electronic edition. of the state (Felzer, 2013b), but may not be adequate immedi- UCERF3-ETAS also has b-value nonuniformity at small ately following large earthquakes. In addition, there is catalog magnitudes. At M ≥ 2:5, the b-value estimate is 0.98. It nonuniformity in this time period in the definition of local mag- continues to rise for higher minimum magnitudes, reaching b 1 M 3:3 nitude (Tormann et al.,2010; Uhrhammer et al., 2011). at min . UCERF3 was intended to strictly BSSA Early Edition Turing-Style Tests for UCERF3 Synthetic Catalogs 5
104 1.5 UCERF3-ETAS UCERF3-ETAS 1.4 California catalog (1984-2011) California catalog (1984-2011) M
≥ G-R distribution ( b =1, Mmax = 8.4) 1.3 102 1.2
1.1 b=1 100 1 0.9 -value estimate b
L 0.8 10-2 M 0.7
Number of earthquakes per year 0.6
10-4 0.5 345678 3 3.5 4 4.5 5 5.5 6 6.5 7 Magnitude (M) Minimum magnitude
Figure 3. (Left) The cumulative magnitude–frequency distribution for one hundred 1000-yr UCERF3-ETAS and the recent observed California catalog. (Right) Maximum-likelihood b-value estimates (Aki, 1965) for UCERF3-ETAS and the observed catalog, corrected for magnitude rounding, with 95% confidence bounds. b-value estimates for a Gutenberg–Richter distribution with an upper magnitude cutoff at M 8.4 are shown for comparison. The color version of this figure is available only in the electronic edition. match a b-value of 1.0 at low magnitudes; the discrepancy is modeled faults in UCERF-ETAS have a slightly smaller not due to either magnitude rounding or the rolloff at high mean depth. This may occur because modeled faults have a (M ≥ 8) magnitudes. Rather, it represents a bug that will be depth limited by their lower seismogenic depth, which for fixed in future updates of the model. 80% of faults is between 9.0 and 16.1 km. Even small earth- quakes are affected by this modeling assumption because most aftershocks occur near to their parents; thus, after- Depth shocks of large fault-based earthquakes will be affected by this lower seismogenic depth. Off-fault spontaneous seismic- By construction, the UCERF3-ETAS model matches the ity, however, has depths drawn from a continuous probability mean observed depth distribution observed in California. distribution extending to 24 km. However, this depth distribution is (intended to be) uniformly applied throughout the state. Thus, the UCERF3-ETAS model does not have spatial variations in depth that are ob- Foreshock and Aftershock Productivity served in the real catalog (Fig. 4). There is, however, a small In contrast to conventional ETAS, in UCERF3-ETAS, amount of spatial heterogeneity in mean event depth visible after a large on-fault earthquake, probabilities of subsequent in the UCERF3-ETAS catalogs. Earthquakes along the major large earthquakes on the fault that just rup- tured are reduced by elastic rebound. An- other difference in UCERF3-ETAS comes 20 from local magnitude distributions that may Real catalog UCERF3-ETAS depth (km) Mean differ from G-R scaling. In the case of char-
15 acteristic magnitude distributions, this en- hances the probability of triggering large (M ≥ 6:5) earthquakes following a potential 10 foreshock. We measure aftershock rates by fitting 5 sequences to the modified Omori Law, as- λ (only bins with suming the aftershock rate scales as 20+ earthquakes shown) 0 ab M −M −p λ t10 main min t c ; EQ-TARGET;temp:intralink-;df2;385;150 2 Figure 4. Mean depths for bins with more than 20 earthquakes for UCERF3-ETAS (one hundred 1000-yr catalogs) and the real catalog. By construction, the UCERF3- in which t is the time elapsed from the ETAS model does not have depth variation that is present in the real catalog. However, M mean depths are not constant in UCERF3-ETAS as visible differences can be seen in the mainshock, main is the mainshock mag- M mean depth of on-fault earthquakes. The color version of this figure is available only in nitude, min is the minimum magnitude of the electronic edition. aftershocks under consideration, a, c, and BSSA Early Edition 6 M. T. Page and N. J. van der Elst
Figure 5. Interevent rate for subsequent aftershock pairs for M ≥ 6:5 mainshocks, for (a) the observed catalog and (b) five 1000-yr synthetic UCERF3-ETAS catalogs. Aftershock rates are scaled to equivalent rates for a single M 6 mainshock, assuming productivity scales Mmain as 10 , in which Mmain is the mainshock magnitude. A maximum-likelihood fit to the temporal decay of the aftershock rate is shown for the real catalog (dotted line, shown in both panels for comparison with synthetic) and for the synthetic catalogs (solid line, b). The color version of this figure is available only in the electronic edition. p are constants, and the G-R b-value is assumed to be 1 cal. Fits of M ≥ 3 aftershocks of M 5.5–6.5 mainshocks, (Omori, 1895; Utsu, 1961; Reasenberg and Jones, 1989). shown in Ⓔ Figure S1 (available in the electronic supple- The a-, p-, and c-values here are indirect Omori parameters ment to this article) show a similar trend to the M ≥ 6:5 and differ from the direct ETAS parameters that are applied mainshock data. Omori a-values and p-values are slightly to each generation of aftershocks in the modeling (Harde- higher and lower, respectively, in the observed catalog rela- beck, 2013). We stack all sequences relative to time since the tive to synthetic catalogs. Ⓔ Figure S1c shows that the mainshock and solve for a, c, and p by maximizing the log higher observed productivity at 10 days postmainshock is at likelihood, which is given by the high end of 200 synthetic UCERF3-ETAS catalogs the same length as the observed catalog. This high-rate observed Z p 0:03 XN T productivity is borderline statistically significant ( ) L λ t − λ tdt; EQ-TARGET;temp:intralink-;df3;55;352 log log i 3 given the range of fits seen in short synthetic catalogs. i1 0 Although on average aftershock productivity of UCERF3-ETAS is close to what is observed in California, in which N is the number of aftershocks that occur within T more stark differences emerge on a sequence-by-sequence days of their mainshocks and ti is the occurrence time of the basis. The observed California catalog contains more inter- ith aftershock. sequence variability in aftershock productivity—both more On average, UCERF3-ETAS aftershock productivity quiet sequences and more productive sequences—than and decay of M ≥ 4 aftershocks following M ≥ 6:5 main- UCERF3-ETAS catalogs, as shown in Figure 6. This differ- shocks (as defined by the exclusion criteria of Page et al., ence is evident after the first day, and is therefore not due to 2016, which includes aftershocks within T 10 days of incompleteness problems in the observed catalog. UCERF3- mainshocks) are very similar to the observed catalog (Fig. 5). ETAS does not include sequence-specific parameters— The Omori c-value, which is a time constant that regularizes rather, the direct Omori a-, p-, and c-values used are constant the rate function at small times after the mainshock, is from sequence to sequence, although stochasticity of the smaller in UCERF3-ETAS—in fact, it is slightly negative ETAS algorithm leads to some intersequence variability in (which is unphysical) in the maximum-likelihood fit (the di- productivity. For large (M ≥ 6:5) mainshocks, UCERF3- rect Omori c-value used in UCERF3-ETAS is quite small: ETAS results only diverge from the limited observational 1:78 × 10−5 days). The UCERF3-ETAS model currently data for quiet sequences, which, again, are more numerous does not make any attempt to model short-term aftershock in the observed data. For the most productive sequences of incompleteness that is present in the observed catalog fol- M ≥ 6:5 mainshocks, the observed results fall within the lowing moderate and large earthquakes. This detection bias range given by short (28-yr) UCERF3-ETAS catalogs. It in the observed catalog inflates the observed c-value, so this may be there is not enough data in a short catalog to see difference is (at least, partly) observational rather than physi- a difference in productivity for mainshocks as large as BSSA Early Edition Turing-Style Tests for UCERF3 Synthetic Catalogs 7
Figure 6. Number of M ≥ 3 aftershocks per M 5.5–6.5 mainshock, plotted cumulatively, for observed and synthetic catalogs in (a) the first day following the mainshock, and (b) 1–10 days following the mainshock. Number of M ≥ 4 aftershocks of M ≥ 6:5 mainshocks are shown for (c) the first day following the mainshock and (d) 1–10 days following the mainshock. Each line in the plot represents the dis- tribution of the number of aftershocks per sequence for one 28-yr catalog. For synthetic catalogs, we show two hundred 28-yr snapshots, half from the beginning of 1000-yr synthetic catalogs and half from the end. The color version of this figure is available only in the electronic edition.
M 6.5, or it may be that enhanced productivity differences tivity of the observed catalog is higher. Brodsky (2011) first due to large aftershocks drawn from characteristic magnitude pointed out that the observed catalog has a factor of 2–3 more distributions in UCERF3-ETAS bring the results into align- foreshocks than is predicted by scale-free ETAS models. ment for larger mainshocks. This result was verified by Shearer (2012), who observed We also compare foreshock and aftershock productivity that the effect was independent of differential magnitude, as as a function of differential magnitude, following Shearer we reproduce here. We find, however, that this effect disap- (2012). Here, we again use the Page et al. (2016) exclusion pears for higher mainshock magnitudes, for which both the criteria, but apply it to foreshocks as well as aftershocks. In foreshock and aftershock curves at different mainshock the case of foreshocks, we include events 10 days prior to magnitudes are approximately coincident (Ⓔ Fig. S2). mainshocks, and as above, mainshocks are defined as the largest event in the sequence. Differential magnitude is Clustering Analysis defined as the difference between the aftershock or foreshock magnitude and the mainshock magnitude. Observed fore- To compare spatiotemporal patterns in the UCERF3- shock and aftershock productivity agrees with the range pro- ETAS model, we employ a time–space–magnitude clustering duced by UCERF3-ETAS catalogs the same length as the analysis (Baiesi and Paczuski, 2004; Zaliapin et al., 2008). observed catalog (Fig. 7), with the exception of foreshocks In this approach, the pairwise distance between events is of M 4.5–5.5 mainshocks, for which the foreshock produc- defined as BSSA Early Edition 8 M. T. Page and N. J. van der Elst