Natural Hazards and Earth System Sciences PI forecast with or without de-clustering : an experiment for the Sichuan-Yunnan region

Pattern Informatics (PI) algorithm uses earthquake catalogues for estimating the increase of the probability of strong earthquakes. The main measure in the algorithm is the number of earthquakes above a threshold magnitude. Since aftershocks occupy a significant proportion of the total number of earthquakes, whether de-clustering affects the performance of the forecast is one of the concerns in the application of this algorithm. This problem is of special interest after a great earthquake, when aftershocks become predominant in regional seismic activity. To investigate this problem, the PI forecasts are systematically analyzed for the Sichuan-Yunnan region of southwest China. In this region there have occurred some earthquakes larger than MS 7.0, including the 2008 Wenchuan earthquake. In the analysis, the epidemic-type aftershock sequences (ETAS) model was used for de-clustering. The PI algorithm was revised to consider de-clustering, by replacing the number of earthquakes by the sum of the ETAS-assessed probability for an event to be a “background event” or a “clustering event”. Case studies indicate that when an intense aftershock sequence is included in the “sliding time window”, the hotspot picture may vary, and the variation lasts for about one year. PI forecasts seem to be affected by the aftershock sequence included in the “anomaly identifying window”, and the PI forecast using “background events” seems to have a better performance.


Introduction
The Pattern Informatics (PI) algorithm, which has been developed in recent years and has been successfully applied to California (Rundle et al., 2000(Rundle et al., , 2003;;Tiampo et al., 2002), central Japan (Nanjo et al., 2006), Taiwan (Chen et al., 2005), south-west China (Jiang and Wu, 2008) and other regions, Correspondence to: Z. L. Wu (wuzl@cea-igp.ac.cn) uses earthquakes catalogues to identify the increase of probability of strong earthquakes.The main measure of this algorithm is the number of earthquakes above a threshold magnitude.In the perspective of nonlinear dynamics, the PI algorithm tests whether the seismicity before a strong earthquake is different from what it "should be" in the usual time.The null hypothesis for this test is that the seismicity is stationary, or more strongly, being Poissonian.For counts of mainshocks, generally this assumption has few problems; but for an earthquake sequence which mixes mainshocks and aftershocks, this may not be the case.It is well-known that the occurrence of aftershocks does not observe a Poissonian process, rather it observes the Omori-Utsu law.As a matter of fact, in some of the analysis of aftershocks, to identify the "abnormal change" of seismicity above the "normal" background, transformation is used so that the aftershock sequence becomes Poissonian along the "transformed time" axis (Ogata, 2007).In other studies, de-clustering, that is, removing aftershocks from the earthquake catalogue so that a mainshock series can be considered, is necessary for estimation of strong earthquake probability (Keilis-Borok and Rotwain, 1990).
One of the scientific problems related is: whether declustering, or the removal of aftershocks, affects the result of the PI forecast.This problem becomes more important after a great earthquake, such like the 12 May 2008, Wenchuan M S 8.0 earthquake, since after this large earthquake, aftershocks became predominant in regional seismic activity.To investigate this problem, this study considers Sichuan-Yunnan and the surrounding regions of southwest China (hereafter referred to as Sichuan-Yunnan region).Previous works have shown that the PI algorithm performs well in this region (Jiang andWu, 2008, 2010).One of the concerns is whether the PI algorithm was still valid after the 2008 Wenchuan earthquake, or practically, whether de-clustering is needed to avoid the influence of the aftershocks to the performance of the PI algorithm.(b) Frequency-magnitude distribution of the earthquake catalogue.(c) Temporal distribution of earthquakes, with the vertical dash lines to the right representing t 1 the starting time of the "anomaly identifying window", t 2 the ending time of the "anomaly identifying window" and the starting time of the "forecast window", and t 3 the ending time of the "forecast window".For the "sliding window" in this case, the catalogue for PI calculation is selected to start from t 0 .
Tectonic settings of the study region, including the distribution of active faults and historical earthquakes as well as the geodynamic background, mainly based on the reference of Yi et al. (2002) and Xu et al. (2005), have been summarized in previous work (Jiang andWu, 2008, 2010) as background information for using the PI algorithm.In Jiang andWu (2008, 2010), information of the regional seismological observation system and the temporal evolution has also been provided.Directly related to the PI algorithm, completeness magnitude for this region was given using different approaches (Su et al., 2003;Jiang and Wu, 2008).The results indicate that generally in the study region and for the period of study, the magnitude 3.0 can be selected as the completeness magnitude.Related to de-clustering, Jiang and Zhuang (2010) estimated the background seismicity and potential source zones of strong earthquakes in the Sichuan-Yunan region by using the space-time ETAS model.

Data used
Parameter settings have been described in detail in previous works (Jiang andWu, 2008, 2010) and are used in the same form in this study: The region under study is between latitudes 20.8 • ∼ 34.0 • N and longitudes 97.3 • ∼ 107.0 • E, i.e., Sichuan and Yunnan Provinces and their surrounding regions, as shown in Fig. 1.It is observed that in this region, strong earthquakes have a significant component of stochastic clustering, with a complicated recurrence process that does not demonstrate simple periodicity and does not seem to be well-described by the time-predictable model or magnitude-predictable model (Yi et al., 2002;Xu et al., 2005).The calculation uses the Monthly Earthquake Catalogue from 1 January 1970 to 12 May 2008 provided by the China Earthquake Networks Center (CENC), with completeness down to M L 3.0 (Su et al., 2003;Jiang and Wu, 2008).The "target earthquakes" was selected from the catalogue of the China National Seismograph Network (with data available at http://www.csndmc.ac.cn/newweb/data.htm#).Surface wave magnitude was used for the definition of "target earthquakes" to avoid the bias caused by the difference between different local magnitudes.

De-clustering of the earthquake catalogue
De-clustering earthquakes requires discrimination between mainshocks and aftershocks, which is intrinsically difficult in physics (e.g., Helmstetter and Sornette, 2003).As an alternative scheme to "deterministic" identification of aftershocks (e.g., Gardner and Knopoff, 1974), Zhuang et al. (2002) and Zhuang and Ogata (2006) proposed a stochastic de-clustering scheme in which it is no longer determined whether an earthquake is a "background event" or if it is triggered by another.In the algorithm, each event has a probability of being either a "background event" or a direct offspring triggered by others.Calculation of the probabilities is through the "epidemic-type aftershock sequences (ETAS)" model, based on the consideration of a stochastic point process, in which each earthquake has some magnitude-dependent capability to trigger its own Omori-law type aftershocks (Ogata, 1988;Helmstetter and Sornette, 2002).Being an extension of the classical Omori's law (Omori, 1894), the space-time ETAS algorithm has been described by Zhuang et al. (2002) in detail.One set of the outputs of the ETAS de-clustering include: (a) total seismic activity m(x,y), (b) "background seismic activity" μ(x,y), and (c) "clustering seismic activity" Ĉ(x,y), with unit counts/year/degree 2 , being stationary in time but inhomogeneous in space.Software can be found on-line via the SASeis2006 software package (http://bemlar.ism.ac.jp/www2/SASeisUpCollection/SASeis2006/). Figure 2 shows an example of the ETAS de-clustering result.In Figs.2a and b, taking the 1970 Tonghai earthquake sequence as an example, it is shown that a probability µ is assigned to each event.Figure 2c and d shows the result of de-clustering for the whole Sichuan-Yunnan region under study, in which "background events" or "clustering events" are discriminated by the criteria whether µ is larger than a threshold value.The threshold value is calculated from the long-time average of seismic activity.Although apparently arbitrary in this figure, it can be seen that stochastic clustering can assess the importance of clustering for a given earthquake catalogue.From the figure it can also be deduced that for the region under consideration, there were three intense aftershock sequences related to the occurrence of the 1976 Longling (Yunnan) earthquake, the 1976 Songpan (Sichuan) earthquake, and the 1988 Lancang-Gengma (Yunnan) earthquake, respectively.
3 The PI algorithm and its revision considering de-clustering

The PI algorithm
In the PI algorithm, the whole region under study is binned into boxes or "pixels" with size D ×D centered at a point x i .
Each point x i is associated with a time series N i (t), where N i (t) is the time-dependent average rate of earthquakes with magnitude greater than the cutoff magnitude M c in box i and its Moore neighborhood.N i (t) is calculated for box i within a period starting from time t b to time t (t > t b ).The "seismic activity intensity" function of box i is defined as the average rate of occurrence of earthquakes: The probability of a future strong earthquake in box i, P i (t 0 ,t 1 ,t 2 ), is defined as the square of the average intensity fluctuation: in which t 1 is the starting time of the "anomaly identification window", t 2 the ending time of the "anomaly identification window" and the starting time of the "forecast window".The "sliding window" for PI calculation is selected to start from t 0 .Subtracting the mean probability over all boxes and denoting this change as the probability-increase of future earthquakes via the "hotspots" are defined to be the boxes where P i (t 0 ,t 1 ,t 2 ) is positive, or the probability function P i (t 0 ,t 1 ,t 2 ) is larger than the background level.Physically, since the probability function has quadratic form, either "seismic activation" or seismic quiescence can be reflected by the PI "hotspot" map, which is one of the reasons why PI forecasts outperform the "relative intensity" (RI) forecasts.

Parameter settings
The code used for the calculations in this experiment is the modified Matlab version of the PI algorithm provided by the Rundle group.Details of the algorithm are given by Rundle et al. (2002).Taking the works of Nanjo et al. (2006), Holliday et al. (2005Holliday et al. ( , 2006Holliday et al. ( , 2007)), Chen et al. (2005), and Wu et al. (2008) as references and remaining consistent with the parameter settings of Jiang andWu (2008, 2010), parameters for the calculation are set as follow: "target magnitude" M S 5.5, cutoff magnitude M L 3.0, "sliding time window" 15 years, "anomaly identification time window" and "forecast time window" being both 5 years, and spatial grid D = 0.2 • .Following previous works (Keilis-Borok and Rotwain, 1990;Holliday et al., 2007;Jiang and Wu, 2008), only shallow earthquakes are considered, without the specification of depths.Logarithm amplitudes log( P / P max ) are used to represent the relative-probability-increase for strong earthquakes.Following Chen et al. (2005), only the top 30% values are considered as the "hotspots" and shown in the PI forecast map.

The revision of the algorithm considering de-clustering
In this study, considering de-clustering is implemented through the revision of Eq. ( 1).In the space-time ETAS model described by Zhuang et al. (2002), an output of the ETAS de-clustering is that each event can get a probability µ for being a "background event", or a probability ρ = 1 − µ for being a "clustering event".As a straightforward consideration, in this study, for revising Eq. ( 1), N i (t) is simply replaced by the sum of µ for "background events", or by the sum of ρ for "clustering events".That is, different from the "classical" PI algorithm in which each event is assigned an integer 1 in the summation, in the revised PI algorithm considering de-clustering, each event is assigned a decimal µ (for "background events") or ρ (for "clustering events").If all the events are considered, then (µ + ρ) = 1 is assigned to each event, "degenerating" to the "classical" PI algorithm.In this case, the above-mentioned shortcoming, that the discrimination between "background events" and "clustering events" is somehow arbitrary, does not affect the revised PI algorithm.For each event, it is not necessary to identify whether it belongs to the "background" or the "clustering" group -what is needed is only the probability.

Overall comparison
To systematically investigate the performance of the revised PI algorithm, t 2 is slid from 1 January 1988 to 1 January 2003, with each sliding step being 0.5 year.As Rundle et al. (2000Rundle et al. ( , 2003) ) and Tiampo et al. (2002) did in the evaluation of the performance of forecasts, receiver operating characteristic (ROC) test (Swets, 1973;Molchan, 1997) was conducted by systematically changing the "alarm threshold" of the "forecast region" and counting the "hit rate" and "false alarm rate" relative to real earthquake activity, in which "hit  rate" means the number of "forecasted" events divided by the total number of "target" events, and "hit" or "forecasted" is defined as the case that the "target event" occurs within any "alarmed cell" or one of the nearest neighbors.The PI forecast is also compared with random guess.Figure 3a gives the ROC test result for the PI forecasts using "background events", in which the gray zone delimitates the range of all the ROC curves, showing the overall performance of the PI forecast.From the figure it can be seen that despite the variation of the performance with time, the PI forecast is much better than random guessing.Figure 3b compares the forecasts of the PI algorithm using "background events" and the RI algorithm using the whole event set.It can be seen that the PI algorithm using "background events" also outperforms RI.In Fig. 4, the PI forecast is conducted using "clustering events".From the comparison between the PI forecast using "clustering events" and the RI algorithm using whole event sets, it may be seen that these two cases are comparable to each other in terms of forecast performance, being understandable in physics since the RI algorithm basically uses the clustering properties of seismic activity.Figure 5a shows that PI forecasts using "background events" are comparable with the PI forecasts using whole event sets, indicating that in general, de-clustering has little effect on the performance of the PI forecasts.This is also understandable to a greater much extent since Fig. 2c and d shows that clustering events occupy only a small portion of time duration for the whole time period since 1970.Figure 5b shows that the PI forecasts Difference between PI forecasts using "background events" and those using the whole event set.(b) Difference between PI forecasts using "background events" and those using "clustering events".The gray line and black line represent the results of the first and the last sliding, respectively.The color bar indicates the stacking of the areas encompassed by the G curves and the horizontal zero-line.
Fig. 6. "Hotspot overlap ratio", defined as the hotspot pixels overlapped with those of the previous half year (i.e., the last sliding step), represented as the relative value comparing to the whole hotspot pixels, compared with the M − t 0 plot for events larger than M S 6.5.Fig. 7. "Differential ROC area", compared with the M − t 1 plot for events larger than M S 6.5.See text for details.
using "background events" are better than those PI forecasts using "clustering events", being equivalent to the fact that the PI forecasts outperform the RI forecasts in this region.

The effect of intense aftershock sequences: case and case-only analysis
Large earthquakes are the main contributor of aftershocks.
After large earthquakes, aftershock activity becomes predominant in regional seismic activity.Whether such sequences affect the PI forecast is one of the concerns in this study.The practical motive is that after the 2008 Wenchuan earthquake, the question arose of whether the PI algorithm, which has been shown to have a good performance for the Sichuan-Yunnan region (Jiang and Wu, 2008), was still valid for the assessment of regional earthquake hazard.If the aftershock sequence does have some effect on the forecast, how long is needed to overcome such an effect.Although samples of large earthquakes with intense aftershock sequence are too limited to get definite conclusions, it may be seen from Fig. 6 that whether an intense aftershock sequence is included or not in the "sliding time window" did actually affect the picture of the "hotspots".The figure plots the "hotspot overlap ratio", defined as the "hotspot" pixels overlapped with those of the previous half year (i.e., the last sliding step), represented as the relative value comparing to the whole "hotspot" pixels.This measure is to reflect the variation of the forecast -the less this value is, the more variable the hotspot picture is.The "hotspot overlap ratio" for PI forecasts using "background events" and "clustering events", respectively, are shown in the figure together with those using the whole event set, against time t 0 .Crossing of t 0 by an aftershock sequence is equivalent to the change of whether the sequence is included in the "sliding window", or "learning window".The figure also shows the M − t 0 plot of regional seismic activity, in which the intense aftershock sequence is related to the 1976 Longling, Yunnan earthquakes (29 May 1976 M S 7.3 and M S 7.4) and 1976 Songpan, Sichuan, earthquakes (16 August 1976 M S 7.2 and 23 August 1976 M S 7.2).From the figure it can be seen that the hotspot picture changes signif-icantly after an intense aftershock sequence is included into the "learning window": the variation increases significantly for a period of about one year, being consistent with the time scale of an aftershock sequence decay.
The difficulty is how such a change affects the performance of the PI forecast.For studying this, the "differential ROC area" can be defined as the difference between the area under the ROC curve above the zero line and the triangle of the random guess.It is well-known that the larger the area under the ROC curve compared to the area of the triangle for random guess, the better the performance of the forecast.Therefore this measure is to reflect the performance of the forecast changing with the sliding time window.Figure 7 plots the "differential ROC area" for "background events", "clustering events", and the whole event set, against the time ticker t 1 .The M − t 1 plot of regional seismicity is also shown, in which it may be seen that the intense aftershock sequence is related to the 1988 Lancang-Gengma, Yunnan, earthquakes.Crossing of t 1 by an aftershock sequence is equivalent to the change of whether or not the sequence is included in the "anomaly identifying window".It may be seen that after the intense aftershock sequence is included, the forecast capability decreases.Because there were more than one aftershock sequences in the same region and in the same period, it is hard to get a definite estimate of how long such an effect lasts.From the figure it can be deduced that the effect seemed to last for 5 years, the characteristic length of the PI learning window.At the same time, using "background events" seems better due to the shorter time for restoring to the "normal" state.As specific examples, Figs. 8 and  9 show the distribution of PI "hotspots" using "background" events and "clustering" events, respectively.To show the effect of the aftershock sequence, the hotspot distribution for two successive years (with an aftershock sequence crossing the "training window") is provided.From the figures it can be seen that using "background" events and "clustering" events results in significant differences of hotspot distribution.When encountering an aftershock sequence, using "background" events may lead to more evident variation of the hotspot distribution, being the same as in Fig. 6.

Conclusions and discussion
Investigating whether de-clustering may affect the forecasting performance of the PI algorithm, the earthquakes in Sichuan-Yunnan region from 1970 to 2008 were systematically analyzed.The PI calculation used the same parameter settings as in previous works (Jiang andWu, 2008, 2010), but for "background events" and "clustering events", respectively.Using the ETAS model (Zhuang et al., 2002), even if without the definite information of whether an earthquake is a "background event" or a "clustering event", the PI algorithm can be applied to "background events" and "clustering events", respectively.A retrospective test showed that, for the Sichuan-Yunnan region, after a large earthquake occurred or more exactly, after an intense aftershock sequence was included, the PI forecast was affected as shown by the difference in the hotspot pictures.However, the time for such effect lasted for about one year.Therefore, generally, if an aftershock sequence is not intense or the time since the last large earthquake has been longer than one year, declustering has little effect on the PI calculation.Not surprisingly, in general for the Sichuan-Yunnan region, using "background events" in the PI forecast seems to improve peformance slightly.On the other hand, caution has to be taken that the above conclusions may only be valid for the Sichuan-Yunnan region.It is not sure whether such preliminary conclusions are region-dependent, and further work for more places is needed.

Fig. 1 .
Fig. 1.(a) Sichuan-Yunnan region, with earthquakes larger than M S 5.5 since 1970 shown by black dots.Tectonic faults are shown by gray lines.Star and circles indicate the epicenters of the 2008 Wenchuan mainshock and its aftershocks, respectively.Gray dots show earthquakes from 1970 with a magnitude larger than M L 3.0.To the bottom right is the indexing figure showing the region under study.(b)Frequency-magnitude distribution of the earthquake catalogue.(c) Temporal distribution of earthquakes, with the vertical dash lines to the right representing t 1 the starting time of the "anomaly identifying window", t 2 the ending time of the "anomaly identifying window" and the starting time of the "forecast window", and t 3 the ending time of the "forecast window".For the "sliding window" in this case, the catalogue for PI calculation is selected to start from t 0 .

Fig. 2 .
Fig. 2. ETAS de-clustering of earthquake catalogues for the Sichuan-Yunnan region: an example.(a) The M-t plot of the 1970 Tonghai M S 7.8 earthquake and its aftershock sequence.(b) The probability µ for each event to be a "background event", for the same earthquake sequence as shown in panel (a).(c) Background events in the Sichuan-Yunnan region since 1970, shown in the coordinate system of time and latitude.(d) Clustering events.

Fig. 3 .
Fig. 3. ROC test of the PI algorithm using "background events"."Anomaly identifying window" and "forecast window" are taken both as 5 years.The "sliding window" is taken as 15 years."Forecast window" slides from t 2 = 1 January 1988 to t 2 = 1 January 2003, with sliding step being 0.5 year.(a) ROC curve.Gray zone delimits the range of all the ROC curves, with gray line and black line representing the results of the first and the last sliding, respectively.(b) Difference between the hit rate of the PI algorithm and RI algorithm.The gray line and black line represent the results of the first and the last sliding, respectively.The color bar indicates the stacking of the areas encompassed by the G curves and the horizontal zero-line.

Fig. 4 .
Fig. 4. ROC test of the PI algorithm using "clustering events".Parameters for the PI calculation and figure captions are the same as in Fig. 3.

Fig. 5 .
Fig. 5. (a)Difference between PI forecasts using "background events" and those using the whole event set.(b) Difference between PI forecasts using "background events" and those using "clustering events".The gray line and black line represent the results of the first and the last sliding, respectively.The color bar indicates the stacking of the areas encompassed by the G curves and the horizontal zero-line.

Fig. 8 .
Fig. 8. Hotspot maps of the PI algorithm using "background" events.(a) Hotspot map with t 2 = 1 January 1991.Color-coded hotspots highlight the relative probability increase for earthquakes above M S 5.5, with spatial grid size 0.2 • .Blue circles stand for the earthquakes above M S 5.5 occurring within the "forecast window", while gray reverse triangles show the earthquakes above M S 5.5 occurring within the "anomaly training window".(b) Background seismicity rate for events in the sliding window used in Fig. 8a, shown by background seismic activity μ(x,y).Blue circles show the earthquakes above M S 5.5.(c) Hotspot map with t 2 = 1 January 1992.(d) Background seismicity rate for events used in Fig. 8c.

Fig. 9 .
Fig. 9. Hotspot maps of the PI algorithm using "clustering" events.(a) Hotspot map with t 2 = 1 January 1991.(b) Clustering seismicity rate for events used in panel (a), shown by clustering seismic activity Ĉ(x,y).(c) Hotspot map with t 2 = 1 January 1992.(d) Clustering seismicity rate for events in the sliding window used in panel (c).