Hazard estimates for El Chich́on volcano , Chiapas , Ḿexico : a statistical approach for complex eruptive histories

The El Chich́on volcano (Chiapas, Ḿexico) most recent eruption occurred in 1982 causing the worst volcanic disaster in the recorded history of Mexico. Prior to the eruption, El Chich́on volcano was not considered a very hazardous volcano, a perception mostly caused by the low eruption rate of the past eruptions. The correct assessment of volcanic hazard is the first step to prevent a disaster. In this paper, we analyze two periods of the reported eruptive history of El Chich́on volcano during the Holocene, searching for the eruption rates of different VEI magnitude categories and testing their time dependence. One period accounting the eruptions of the last 3707 years before the last eruption (BLE) is assumed to be complete, with no missing relevant events. More scarce information of a period extending to 7772 years BLE is then added. We then apply the Non-Homogeneous Generalized Pareto-Poisson Process (NHGPPP), and the Mixture of Exponentials Distribution (MOED) methods to estimate the volcanic hazard of El Chich́on considering both periods. The results are compared with the probabilities obtained from the homogeneous Poisson and Weibull distributions. In this case the MOED and the Weibull distribution are rather insensitive to the inclusion of the extended period. In contrast, the NHGPPP is strongly influenced by the extended period.


Introduction
Currently, millions of people worldwide are at risk by volcanic eruptions.The average annual death quota remains high due to more people living in close proximity to active volcanoes.The increasing exposure of a larger population is in some cases derived from an inadequate assessment of Correspondence to: A. T. Mendoza-Rosas (ateresa@geofisica.unam.mx) the volcanic risk, allowing the land use of vulnerable areas.A correct assessment thus represents an essential component for the management of risk and disaster prevention.The statistical methods to evaluate the volcanic risk are an emergent and rapidly growing area aimed to provide more objective land use criteria, and to support decision making that may help preventing a disaster.The first step in this assessment is the appropriate estimation of volcanic hazard, i.e., the probability that a specific type of volcanic eruption occurs in a given area within a given interval of time.The quality of this estimate depends mainly on the capacity of the chosen statistical model and the quality of the available data base.
In this paper, we compare different methods to calculate the hazard of El Chichón volcano (Chiapas, México).One, the Non-Homogeneous Generalized Pareto-Poisson process (NHGPPP), is a relatively complex method that permits precise hazard estimates even on non-stationary and incomplete data bases of volcanoes, having a short list of major eruptions separated by long repose periods with little information on smaller activity during those intervals, as is the case of El Chichón volcano.This method is discussed in detail elsewhere (Mendoza-Rosas and De la Cruz-Reyna, 2008).Other method uses the Mixture of Exponentials Distribution (MOED), also called the hyperexponential distribution (Mendoza-Rosas and De la Cruz-Reyna, 2009).This method is much simpler and needs less calculation.The MOED has been applied to other volcanoes of Mexico (Mendoza-Rosas and De la Cruz-Reyna, 2009) and Chile (Dzierma and Wehrmann, 2010), and compared with other distributions such as the Poisson, Weibull, exponential, NHGPPP and log-logistic distributions, showing satisfactory estimates.Although in principle the MOED requires completeness of the data set, the results obtained here show that the MOED also provides acceptable estimates for extended and probably incomplete data set such as the Holocenic El Chichón eruptive series.To increase the comparison perspective, we again weigh these methods against A. T. Mendoza-Rosas and S. De la Cruz-Reyna: El Chichón volcano hazard estimates other well known and commonly used distributions, mostly to analyze the effects of completeness and non-homogeneity, i.e. non-stationarity.The quality of the volcanic hazard estimates obtained from the different methods is evaluated using the Anderson-Darling, Cramer-von-Mises and Kolmogorov-Smirnov goodness of fit tests, as well as the AIC (Akaike Information Criterion), against the distribution of repose times obtained from the reported eruptive history.

El Chichón Volcano
El Chichón volcano (17.36 • N, 92.23 • W) is located in the Chiapas Volcanic Arc (CVA), a structure associated to the subduction of the Cocos plate under the North American plate in a complex way due to the changing subduction angle and the close interaction with the Caribbean plate (Damon and Montesinos, 1978;Mora et al., 2007;Layer et al., 2009).Tephrostratigraphy, 14 C-dating and palynology to estimate the timing and magnitude of past volcanic eruptions of El Chichón volcano have been used by Nooren et al. (2009).El Chichón volcano has an altitude of 1100 m a.s.l., and a 1 km wide, 140 m deep crater in its summit formed during the most recent eruption, beginning on 28 March 1982 (Espíndola et al., 2000;Macías et al., 2007Macías et al., , 2008)).This week-long eruption (VEI 5) produced planetary scale volcanic gas clouds (Kruger, 1983), extensive ash fall, and pyroclastic surges and flows that resulted in the worst volcanic disaster in the recorded history of Mexico devastating a radius of about 10 km around the volcano and covering southeastern Mexico with ash fall (Macías et al., 2008).It caused about 2000 fatalities, displaced thousands, and produced severe economic loss (De la Cruz-Reyna and Martin-Del Pozzo, 2009).
Recent studies on the stratigraphy of the volcano and new radiocarbon ages show that at least other 11 major eruptions prior to the 1982 eruption have occurred at El Chichón in the Holocene, and more precisely, in the past 8000 years, with most of the repose intervals lasting between 100 to 600 years (Tilling et al., 1984;Espíndola et al., 2000;Macías et al., 2007Macías et al., , 2008;;Layer et al., 2009).Table 1 summarizes the historical and Holocenic geological records.Given the limited available geological data of recognizable deposits with approximate geochronological dating of the eruptive history of El Chichón, the completeness, i.e., the certainty that all the eruptions regardless of their magnitudes have been accounted in the eruptive data base cannot be fully sustained, especially for the older events.We thus analyze the eruptive records using a scaling logarithmic relationship between the magnitude VEI (Volcanic Explosivity Index; Newhall and Self, 1982) and the eruption occurrence rate for each magnitude category VEI.This relationship (De la Cruz-Reyna, 1991, 1996;De la Cruz-Reyna and Carrasco-Núñez, 2002) provides a criterion to estimate the most probable magnitude of eruptions to which no VEI has been assigned.Although  (Duffield et al., 1984;Tilling et al., 1984;Espíndola et al., 2000;Macías et al., 1993Macías et al., , 2003Macías et al., , 2007Macías et al., , 2008)).The dates are averages of the reported radiometric age in years before the last (1982) eruption.The dash indicates an unknown VEI value.For the eruption of 1270 years BLE we adopted a VEI 5 based on personal communications by J. L. Macías and J. M. Espíndola who are carrying out most of the field work on El Chichón deposits, and consider that such eruption was at least as large as the 1982 one.For the eruptions in the range 2∼3, we adopted the higher value on the premise that deposits of eruptions older than 1000 years that remain recognizable in a tropical humid climate more probably correspond to the upper end of the range.this criterion is non-unique, different models of the eruption rate-VEI relationship may be constructed and the best model may be chosen by optimizing the fit with the assumed complete catalogue of higher magnitudes (Mendoza-Rosas and De la Cruz-Reyna, 2008).To analyze the eruptive history of El Chichón volcano, we used the available data for the last 7772 years before the last eruption (BLE) and searched for the best fit of the scaling law to estimate the eruption rates of the probably incomplete lower-range magnitudes.

Statistical methods
To make a comparative analysis of the MOED and the NHGPPP methods aimed to the search of precise and simple estimates of the volcanic hazard of El Chichón volcano, we consider the eruption sequence as a time-dependent point processes of independent events developing along the time axis, and study the distributions of the eruptive occurrences and repose times between eruptions for each VEI magnitude category.
The NHGPPP is a procedure based on the use of a nonhomogenous Poisson process on the eruptive time series, with a Generalized Pareto Distribution (GPD) as intensity function (Coles, 2001).The GPD is described by a shape parameter β, a scale parameter θ, and a location parameter u (threshold), and has the following cumulative distribution function: where y=x −u is a realization of an excess (Brabson and Palutikof, 2000;Lin, 2003).Since we are assuming that the completeness (i.e. a portion in which no significant eruption data are missing) of the eruptive time series improves as the magnitude of the eruptions increases, this approach is strongly influenced by the few data that are represented by the right tail of the repose-time distribution.The GPD intensity function of the NHGPPP permits modeling extreme values, such as the very high-magnitude eruptions, allowing for a better fit of the whole distribution.Additionally, it is less sensitive to the incompleteness of the data base and possible time dependence of the large-magnitude eruption sequence, since it only considers the number of exceedances over a threshold (the exceedances are the events with a magnitude higher than a reference threshold magnitude), of a series that may be homogeneous or not.A more detailed description of this statistical method and its applications to other four volcanoes may be found in Mendoza-Rosas and De la Cruz-Reyna (2008).
A second, and simpler, method to assess the volcanic hazard is based on a mixture of exponentials distribution (MOED), also called a hyperexponential distribution (Mendoza-Rosas and De la Cruz-Reyna, 2009).The MOED uses a sum of exponential distributions that may be quickly evaluated and interpreted.This method is particularly useful when the eruptive time series develops as a succession of eruptive regimes, each having a characteristic eruption rate.Usually, these regimes may be readily identified in the cumulative series of events.The MOED permits a good fitting to the eruption data using an a priori calculated distribution parameters, directly obtained from the identified occurrence rates of the regimes.The involved parameters are the rates of the individual exponential distributions, namely the number of occurring events per duration of each regime; and the coefficients or weighting factors, calculated as the normalized complement of the corresponding proportions of the duration of regimes (Eq.6 in Mendoza-Rosas and De la Cruz-Reyna, 2009).The MOED's weighting factors are defined in an interval [0, 1] and their sum should equal 1.The MOED is assuming that the eruptive process is non-stationary, and that the time dependence is expressed as a succession of regimes with high and low eruption rates.Therefore, there is a mean eruption rate representing an average of the eruption rate regimes.This requires that the duration of the high regimes (many eruptions in a shorter time) is shorter than the duration of the slow regimes (few eruptions over a longer time).The shorter periods containing more eruptions (thus representing a higher hazard) should then be weighted by the complements of their durations.MOED is also useful in modeling long-tailed data without some of the mathematical complications of other distributions such as the Pareto and Weibull probability distributions (Johnson and Kotz, 1953;Bebbington and Lai, 1996).
Both methods, NHGPPP and MOED are used here to estimate the likelihood of at least one eruption of El Chichón volcano in a given VEI category at a specified time in the future, i.e., the volcanic hazard, and compared with the results of standard distributions such as Poisson and Weibull.The quality of the estimates is evaluated through the goodness of fit with the available eruptive history data: occurrence rates and distribution of repose times in specific magnitude categories.

Goodness of fit tests
Suppose that x 1 ,...,x n are identical independent distributed observations from an unknown distribution F .We wish to use x 1 ,...,x n to test whether F coincides with a fullyspecified distribution F * .The goodness-of-fit approach to this problem consists of testing under the null hypothesis H 0 : F =F * against H 1 : F =F * , and a number of distribution-free test procedures are available.In order to have a broader criterion to test the quality of the distributions treated here, we use the Cramer-von Mises, the Kolmogorov-Smirnov and the Anderson-Darling tests.
To test the goodness-of-fit for a hypothesized distribution F * , we can use the discrepancy between the empirical F and the hypothesized F * as a statistical test.The Kolmogorov-Smirnov test (Gibbons, 1976), uses a maximum distance or separation criterion based on the statistic In some cases, the Kolmogorov-Smirnov test is perceived as providing a poor estimate of the quality of fit, particularly when the significant separation criterion occurs only at a single point of the data set (we use in this work the central point of each repose period to measure the separation).The null hypothesis may thus be rejected if F and F * significantly differs at one single data point, even if the overall fit is reasonably good.Therefore, some alternative tests have been proposed in the literature (Jesse, 2009), such as the Cramér-von-Mises and the Anderson-Darling tests (Anderson andDarling, 1952, 1954;Anderson, 1962).
A test which does not involve a subjective grouping of the data is the Cramér-von-Mises criterion, which is based on a "quadratic distance" to judge the goodness of fit of a probability distribution F * compared to a given empirical distribution function F .It is defined as where x 1 ,...,x n are the observed values, in increasing order.The statistic is expressed as Anderson and Darling, 1954;Anderson, 1962).If a measure CvM is adopted, the hypothesis is rejected for those samples for which CvM>z.The number z, namely the asymptotic significance point is chosen in such a way that the probability of rejection of a true hypothesis is some specified number (for example, 0.01 or 0.05).The asymptotic distribution of the statistic CvM can be found in Anderson and Darling (1952).
To consider a more convenient measure of the separation or "distance" between two distribution functions, Anderson and Darling (1952) incorporate a weighting function to allow more flexibility in the Kolmogorov-Smirnov and Cramervon-Mises tests.Let F n (x) = i n , if i observations are ≤x for i = 0,1,...,n. and where ψ(u), 0 ≤ u ≤ 1, and u = F * (x), is a weighting function, which is chosen by the statistician so as to weight the deviations according to the importance attached to various portions of the distribution function.The selection of ψ(u) = 1 yields nw 2 , the criterion of von-Mises for W 2 n , and K n for the criterion of Kolmogorov-Smirnov.The criterion W 2 n is an average of the squared discrepancy weighted by ψ (F * (x)) and normalized by n.For a given value of x, F n (x) is a binomial variable; it is distributed in the same way as the proportion of successes in n trials, where the probability of success is H (x). Thus, E [F n (x)] = H (x), and under the null hypothesis and Darling, 1954).This is equivalent to dispersing the sampling error over the entire range of x by weighting the deviation with the reciprocal of the standard deviation under the null hypothesis, i.e., using ψ(u) = 1 u(1−u) as a weighting function, with u = F * (x).This weighting function strongly emphasizes the effect of the distribution tails.Then, letting x 1 ≤ x 2 ≤ ... ≤ x n be the n ordered observations in the sample, the Anderson-Darling statistics is given by A 2 n = −n − s n , where The asymptotic distribution of this statistics, and approximate values of the significance points z s can be found in Anderson and Darling (1954).If the data produce a value of A 2 n larger than the asymptotic significance point z, the hypothesis may be rejected.This test uses the actual observations without grouping, and is more sensitive to discrepancies at the tails of the distribution rather than near the mean.An alternative method is the Akaike Information Criterion (AIC) proposed by Akaike (1973), which is not an absolute hypothesis test; it rather provides a relative comparison between different models, in which the lowest AIC value indicates the best fit.If all the models in the set assume normally distributed errors with a constant variance, the AIC can be computed as where n is the number of data points, k the number of free parameters, and y * i are the points of the model used to fit the data points y i .In the case of a small number of data points n, a correction needs to be applied (Burnham and Anderson, 1998), The Akaike Information Criterion penalises the misfit and the number of parameters used in the tested distributions (Akaike, 1973;Bebbington, 2007;Turner et al., 2008;Dzierma and Wehrmann, 2010).

Stationarity tests
The choice of the method to assess the volcanic hazard also depends on the time dependence of the process.Such dependence or non-stationarity may adopt diverse forms.Of special relevance is the existence of regimes with significantly different eruption rates in the eruptive time-series.This means that the process may keep a constant eruption rate for some time, and abruptly change it.If these changes form a succession of regimes around a mean overall regime characterizing the whole eruptive series, the calculated probabilities of eruptions based on the overall regime may be underestimated if the current process has a rate higher than the mean, and overestimated otherwise.Precise estimates of those probabilities thus require knowing if the fluctuations of the eruption rates around a mean are caused by the actual existence of eruptive regimes, or are just random variations of a natural stationary process.
There are different tests to observe the homogeneity or stationary character of the data, as for example the moving average test (Klein, 1982) or the dispersion test (Cox and Lewis, 1966).However, these tests require a subjective grouping of the data.The results may thus be affected by the choice of the number of groups or the intervals length.Additionally, if completeness of the eruptive record cannot be assured, application of these methods requires great caution.
When the presence of regimes is suspected from the inspection of the cumulative number of eruptions versus time (as is the case in Fig. 1), one may verify their existence using a test similar to the procedure used by Mulargia et al. (1987).We thus apply a simple t student test on the regime's and the global eruption rates to test the hypothesis of whether the regime's rates and the global rate belong to the same population.Call λ i the rate of each regime i, i.e., the number of eruptions in the given magnitude category occurring during that regime, and λ global the global rate (total number of eruptions in the given magnitude category during the whole sampled period).The value of each λ i may be represented by the slope of the best-fitting line to the cumulative number of eruptions for each regime, if the rate remains constant along the regime.Figure 1 shows the regime (solid lines) and global (dashes line) lines of different periods.The dotted lines represent 95% confidence bounds for the global rate lines.We then compare the slopes defined by pairs of successive eruptions with the λ i of the assumed regime to obtain a standard deviation for each regime.Finally, we apply a t student test to the null hypothesis that the λ i and λ global belong to the same population with the following criterion:

Applications to El Chichón Volcano
The eruptive sequence of El Chichón volcano includes 12 distinct events with VEI greater or equal than 3 occurring in the last 7772 years BLE, as shown in Table 1. Figure 1a shows the cumulative eruptive number of the last 3707 years BLE, that include a sample of 11 eruptions assumed to be complete in the specified magnitude range.Figure 1b shows the cumulative eruptive number including the oldest dated Holocenic eruption of El Chichón volcano.Completeness of the above VEI range of this longer period is not assumed.
The eruptive rates can be recognized as the slopes of the lines in Fig. 1.The mean rates λ i and their standard deviations over the 7772 and 3707 years BLE period are plotted in Fig. 2, showing the significant difference between the eruptive rates.The plots in Fig. 1 suggest that the eruptive activity of El Chichón volcano for the magnitude range VEI ≥3 in both periods 3707 and 7772 years BLE had different degrees of non-stationary behavior characterized by well defined regimes with different eruption rates.The rate of  (1987), we include the 95% confidence bounds, and we see that most of the observed points fall out of the bounds in the 7772 years BLE extended period; on the contrary, for the 3707 y BLE period only one single point falls out of the bounds at the change point between 2064-1271 and 3707-2066 years BLE regimes.To further test the stationarity of these periods, we also apply two additional tests.First the Kolmogorov-Smirnov criterion was applied to the global rate for the periods 3707 and 7772 years BLE: the null hypothesis (the eruptive history is stationary) is rejected at the 95% level of confidence for both periods.Secondly, we apply a t student test on the regime's and global rates (λ global =0.002698 and λ global =0.001415 eruptions per year for the time periods 3707-0 and 7772-0 years BLE, respectively).The t-statistics from the 2064-1271 years BLE period compared with the global rates of the periods 3707 and 7772 years BLE are 2.4345 and 2.0086 (with 12 and 13 degrees of freedom), respectively, and the null hypothesis, that all rates belong to the same population, should be rejected for a two-tailed test at the 90% level of confidence.
We thus take for granted the non-stationary character of the eruptive sequence, and mark the transitions between regimes by the vertical lines shown in Fig. 1.

Non-Homogeneous Generalized Pareto-Poisson Process (NHGPPP) method
The available information on the past activity of El Chichón volcano is not sufficient to assign precise VEI values to all of the eruptions.We thus estimate the most likely VEI values of the past eruptions in terms of the eruption occurrence rate Similarly, other two models based on the volcanic data for the last 7772 years BLE are listed in Table 3.These models assign possible VEI values to the eruptions with unknown magnitudes using the implicit condition expressed in Eq. (1) that eruptive rates and magnitudes are inversely related.The order of the assignment of VEI values for each eruption unknown does not affect the eruption rate values.
The VEI of the eruption of 3107 years BLE in which no volume or intensity data were available, was estimated testing the best fit to the VEI values of the other eruptions based on Eq. ( 1), and then selecting the model which best fitted the eruption rates determined by the scaling law.Tables 4  and 5 show the eruption rates λ Mvei for each VEI class, the slope −b from the loglinear relationships (1), and the regression coefficients for each of the models listed in Tables 2  and 3.The regression coefficients indicate that the best fits are for the models "Chichón 2" in Table 4 and "Chichón B" in Table 5.Although the volcanic hazard of El Chichón volcano has been previously estimated (Mendoza-Rosas and De la Cruz-Reyna, 2008), we have recalculated it here using the direct radiometric datings and errors, listed in Table 1, rather than the published "rounded" values (Espíndola et al.,  .The VEI values have also been revised using a different correction on the fact that the VEI magnitude scale is not open, as explained below.Additionally, some magnitudes have been reassigned on the basis of recent field work providing strong evidence that the eruption of 1270 years BLE was at least as large as the 1982 eruption (J.L. Macías, J. M. Espíndola, personal communications, 2010).For that reason, we are using VEI 5 rather that the previously published VEI 4 for that event as indicated in Table 1.Using the "Chichón 2" model (Table 2), Eq. ( 1) may be written as logλ Mvei =−0.239M vei −2.096.We now infer the number of eruptions that have exceeded a threshold (VEI=2), and calculate the excess and exceedance means (the exceedances are the events with magnitude higher than a threshold magnitude u, and an excess is the difference between the magnitude of the exceedance and the threshold u) to obtain the shape and scale parameters of the GPD, which result to be 0.290 and 3.462, respectively.The same procedure applied to the "Chichón B" model yield logλ Mvei =−0.272M vei −2.261, and 0.247 and 3.121 as the respective GPD shape and scale parameters.Since the VEI scale is not defined for values greater than 8, and the exceedances method assumes that the scale measuring the phenomena is open, we subtract the probability of exceeding the VEI 8 magnitude from the probabilities of exceeding VEI's lower than 8 obtained with the GPD intensity function, and not from the occurrence probabilities as in the previous work (Mendoza-Rosas and De la Cruz-Reyna, 2008).

The Mixture of Exponentials (MOED) method
The first question to address in the application of the MOED method refers to the completeness of the eruption data base.Figure 1 shows that three and four regimes may be recognized from the cumulative plot of El Chichón eruptions VEI≥3, considering the eruptive history to 3707 and 7772 years BLE, respectively.The regimes are distinctly differentiated from the overall mean regime; the difference among them is larger in the case of the extended eruptive history shown in Fig. 1b.In fact, the global rate eruption for the 7772 years BLE period is not even similar to the eruption rates of the regimes (Table 6).We speculate that this may be a reflection of the lack of completeness of the extended eruptive history in the specified VEI range, rather than a very different eruption rate of the longer period.Although the MOED method requires completeness, we nevertheless estimate the volcanic hazard using both, the eruptive history to 3707 years BLE, and the extended history to 7772 years BLE (Fig. 3), and compare the results with those obtained form the NHGPPP, Poisson and Weibull distributions for the same periods.The MOED parameters calculated from the eruptive history of El Chichón volcano are shown in Table 6.The resulting probabilities of future eruptions are discussed next.

Discussion
The probabilities of occurrence of at least one eruption exceeding a VEI magnitude in a given time interval by the MOED, NHGPPP, Poisson and Weibull distributions from the "Chichón 2" and "Chichón B" are plotted in Fig. 3.
To obtain an objective measure of the quality of the fit between each distribution and the eruptive history data, we performed three non-parametric goodness-of-fit tests (Cramervon-Mises, Kolmogorov-Smirnov and Anderson-Darling), and the AIC method described in a previous section.
In the 3707 years BLE period (Fig. 3a), the NHGPPP, MOED and Poisson distributions behave similarly.The MOED shows probabilities between the NHGPPP and the Poisson probabilities, because the weighting criterion emphasizes the importance of the short-duration high-rate regimes not considered by the Poisson distribution.The NHGPPP's probabilities yield slightly higher values in the intermediate-high time range i.e. somewhat higher occurrence probabilities for longer intervals.The fact that the NHGPPP values are not very different from the other distributions, mainly for the short periods, is consistent with the assumed completeness of the 3707 years BLE eruptive period.The best fit in the shorter periods is obtained by the Weibull distribution for its ductility resulting from the empirical adjustment of the shape and scale parameters and the AIC method confirms the best fit to the Weibull distribution.Despite those differences, all of the distributions pass the goodness of fit tests with the eruption data, and each of them may be accepted at a 0.05 significance level (Anderson anddarling, 1952, 1954).The tests results are listed in Table 7.The acceptable results of the Poisson distribution are a consequence of a mean rate that provides a good average of the high and low regimes.However, it yields slightly lower probabilities, reflecting the influence of the longer duration of the low regimes (Fig. 3a).This may underestimate the probability that an eruption ended a low regime and the next may correspond to a high regime, an effect that is accounted by the MOED.Although the MOED probabilities also are the averages of a renewal process, they result to be more sensitive to the existence of previous regimes than the Poisson estimates.It must be remarked that this does not mean that the MOED estimates are "regime variables" as those obtained by Bebbington (2007), who uses Hidden Markov (HM) models to identify regime changes, and thus estimate the probability of being in a regime.The renewal processes are series of events in which the times between events are independently and identically distributed (Cox and Lewis, 1966) as is the case of the models used here.In contrast, the HM models may require some degree of periodicity or structure of the eruptive record, or at least of some of its sections.
Figure 3b shows the calculated probabilities and the observed distribution for the period of 7772 years BLE.Now, the Poisson probabilities separate significantly, as may be expected from the inaccuracy of the global rate of eruption as a single parameter describing the whole eruptive process (Fig. 1b).Although the Weibull distribution shows a good fit in the short repose periods, it "saturates" at periods of about 100 decades revealing some inability to deal with the longtail of the distribution.
Unlike the previous case of Fig. 3a, now the NHGPPP yields probabilities significantly lower than the MOED.This is a consequence of decreased eruption annual rates when using an extended period containing only one eruption (Tables 4 and 5).Contrastingly, the weighting criterion of the MOED emphasizes the importance of the short-duration  {0, 635, 905, 1270, 1524, 1657, 1857, 2065, 2590, 3107, 3707} 0.4037 {0, 635, 905, 1270, 1591, 1857, 2065, 2590, 3107, 3707} 0.1657 {0, 635, 905, 1270, 1524, 1757, 2065, 2590, 3107, 3707} 0.2908 {0, 635, 905, 1270, 1524, 1657, 1961, 2590, 3107, 3707} 0.3037 {0, 635, 905, 1270, 1524, 1657, 1857, 2065, 2590, 3107, 3707, 7772} 0.4832 {0, 635, 905, 1270, 1591, 1857, 2065, 2590, 3107, 3707, 7772} 0.4571 {0, 635, 905, 1270, 1524, 1757, 2065, 2590, 3107, 3707, 7772} 0.4727 {0, 635, 905, 1270, 1524, 1657, 1961, 2590, 3107, 3707, 7772} 0.4116 high-rate regimes at the expense of the low rate regimes.Therefore, its probability estimates do not change much between the 3707 and the 7772 periods (Fig. 3).Although one may expect that the MOED probabilities would be affected by the possible incompleteness of the longer period data, the probabilities for both periods are very similar.Despite the differences among the probabilities calculated from the distributions, all of them pass the goodness of fit tests and each of them may be accepted at a 0.05 significance level (Table 7) (Anderson anddarling, 1952, 1954).How sensitive are these methods to possible errors in the sampling of the eruptive history?We may illustrate the answer to this question through an example that involves an inherent difficulty in the construction of eruptive records: the identification of pairs of past eruptions as single events.The point process hypothesis stated in Sect. 3 requires that the events are independent.The independence of successive repose periods may be difficult to determine due to the uncertainty in the recognition and dating of geological deposits.A measure of the independence among events is the serial correlation between successive repose periods, T i vs. T i+1 (Cox and Lewis, 1966).In the present case, the serial correlation coefficients between eruptions over the periods 3707 and 7772 years BLE are 0.40 and 0.48, respectively (Table 8).Independence requires serial correlation coefficients near zero.Although the above values are not statistically significant, they point to a possible weak serial correlation of the repose times.Assuming that some events of the eruptive series of El Chichón volcano may be correlated, we addressed the possibility that two successive, near in time events may be a single eruption, considering the overlapping of the dating error ranges (Table 1).This possibility is reinforced by the absence of paleosoils between the deposits of assumedly different eruptions (Espíndola et al., 2000), as is the case of the 5th to the 8th eruption (1524( , 1657( , 1857( , and 2065 years BLE) years BLE).Table 8 shows the serial correlation coefficients of different possible cases.The eruptive sequence with the lowest serial correlation coefficient (0.1657), is {0,635,905,1270,1591,1857,2065,2590,3107,3707}.We consider that the drop of the serial correlation coefficient by a factor of 2.5 after merging those adjacent eruptions is a relevant indicator for selecting them.The date 1591 years BLE is the average between the reported 1524 and 1657 years BLE events.It is thus possible that the reported eruptions of 1524 and 1657 years BLE may be a single event occurring near 1591.Assuming the above eruption sequence as true, the occurrence rate of regime (2064-1271 years BLE) decrease and the eruptive series would approach a stationary behavior (Fig. 4a).In this case, the best model would be the Chichón 2C in Table 9. Notwithstanding, the probabilities obtained with the NHGPPP, Poisson, MOED and Weibull distributions from this model are not so different from the estimates of "Chichón 2" model (Figs.3a and 4b).The MOED yields almost the same probabilities that Poisson distribution due to the stationary behavior.The AIC proves the best fit to the Weibull distribution and all of the distributions may be accepted at a 0.01 significance level with different hypothesis tests.It is important to emphasize that if the interevent times or repose periods are not independent, the methods applied in these paper are not appropriate because they would not satisfy the renewal process definition.

Conclusions
In the present case, all of the distributions pass the goodness of fit tests with the El Chichón eruption data.There are, however differences among the results that provide important information about the reaches of the distributions.The homogeneous Poisson distribution gives results similar to the other distributions in the 3707 years period.However, in the extended period has a poorer performance, compared with the other distributions.This result confirms the difficulties of the Poisson distribution to describe strongly non-stationary, incomplete series, yet it stills provides acceptable results for weak non-stationarities using the mean rates.The Weibull distribution has an overall good performance, also passing all the goodness of fit tests.However, in the present case the fitting is best only with the probabilities for short waiting times (Fig. 3).At longer waiting times it saturates before any of the other distributions, and no information of the long term behavior may be obtained.In contrast, the NHGPPP distribution shows a better fit in the tail of the distribution, and a significant difference between the probabilities calculated for both periods of study.This is a consequence of the ability of this distribution to comprise the different characteristic of the extended period: a higher degree of non-stationarity and a probable incompleteness of eruptions with VEI below 4.
The MOED proves to be a straightforward and simple method that provides reliable hazard estimates for nonstationary eruptive histories, yielding similar results to the other methods.The MOED parameters can be simply and directly calculated from the inspection of the cumulative distribution of eruption occurrences in specific VEI classes and thus contain information of the process involved in the non-stationarity of the eruption time series, reflected as a succession of eruptive regimes.In the present case, the MOED shows little change between the probabilities calculated for both periods.This is a consequence of the way the MOED weights the contribution of the regimes: inversely to their duration.Therefore, using extended periods containing incomplete eruptive histories may produce results that must be interpreted with caution, because it may underestimate or overestimate the probabilities depending on the length and eruptive rate of the extended regimes.
The NHGPPP probabilities are also affected by the duration of the extended period in a different way.Unlike the MOED, the NHGPPP probabilities are calculated from the number of VEI values exceeding a threshold for the whole period, independently of the regimes.Therefore, in the present case, the rate of excesses is drastically reduced after adding only one eruption (7772 years BLE) exceeding the VEI 3 threshold.The NHGPPP distribution thus provides the best estimates when the low-rate extended period is included.
The above arguments are supported by the goodness of fit tests.MOED and NHGPPP fit almost equally well the data of the 3707 years BLE period.On the contrary, in the extended period NHGPPP fits better than MOED (Table 7).The Akaike Information Criterion produces similar results yielding essentially the same AIC value for MOED and NHGPPP in the 3707 years BLE period, and a much lower AIC value for NHGPPP that for MOED in the extended period.However, an important issue regarding the AIC -and other parametric tests -must be considered here.When parametric tests are used, the relatively large number of parameters required by the MOED modifies the degrees of freedom and hence the significance levels of the tests, penalizing the number of distribution parameters.To account for this, in the present case we have also relied on non-parametric tests.
Except for the Weibull distribution, all of the other distributions tend to produce higher probabilities of at least one eruption of El Chichón occurring in the short-period range (less than 40 decades).This reflects the nature of the distributions.In the present case, Weibull shows a remarkable capacity to adapt its shape to the observed data in the shortperiod range, but it does not fit well the data of El Chichón long tail.Attempts to adjust the distribution parameters to fit the tail data would reduce the quality of fit at the short periods.This was observed by Dzierma and Wehrmann (2010), in their study of different eruptive histories of Chilean volcanoes.They noted that varying the scale parameter of the Weibull distribution improved the fit to longer repose times, but at the expense of the quality of fit to the short repose times.
Increased reliability in the assessment of the volcanic hazard from long, non-stationary and probably incomplete eruptive histories may be gained testing the data with different statistical distributions.For the eruptive series of El Chichón volcano we may conclude that the probabilities of at least one eruption occurring in the relatively short time range estimated from most of the tested distributions (except Weibull) yield greater values than expected from the observed distribution.This is a consequence of the absence of short repose times (less than 50 decades) in the Holocenic eruptive history of El Chichón, a fact that may be related to the apparent absence of low-magnitude eruptions.Although the Weibull distribution fits well the eruptive record in the short-time range, we believe that the lower Weibull probabilities resulting from the absence of short repose periods in the small population of Holocenic eruptions may lead to underrating the volcanic hazard in the short-period range.We thus prefer the seemingly overestimated probabilities of the other distributions, rather than the low Weibull probabilities.
For the long-term probabilities the NHGPPP shows an additional ability to incorporate the effect of scarce data of major past eruptions.If an extended eruptive history is available, as in the present case, using simplicity as a ranking criterion, we would recommend the MOED for estimating the probabilities of future eruptions in the short time range and the NHGPPP for the long time range.

Fig. 1 .
Fig. 1.Cumulative number of eruptions of El Chichón volcano reported for the periods (a) 3707 and (b) 7772 years before of the last 1982 eruption in the magnitude class VEI≥3, and the regimes that can be recognized.The slopes of the solid lines represent the eruption rates of each regime, the slope of the dashed line represents the eruption rate of the whole series and the dotted lines are their 95% confidence bounds (λ global =0.026976 eruptions per decade in 3707 years BLE, and λ global =0.014153 eruptions per decade in 7772 years BLE).The vertical lines mark the points of regime change.

Fig. 2 .
Fig. 2. Rates and standard deviation for each regimen (vertical axis) versus the time of the central point of each regimen (horizontal axis).

Fig. 3 .
Fig. 3. Probabilities calculated by NHGPPP, MOED, Poisson and Weibull distributions of at least one eruption, with VEI≥3 in a given time interval from (a) "El Chichón 2" over 3707 years BLE.The horizontal and vertical lines show the different probabilities of at least one eruption in 20 decades with VEI≥3 (NHGPPP: 0.4724, MOED: 0.4582, the Weibull distribution: 0.2460 and Poisson distribution: 0.4170), and (b) "Chichón B" over 7772 years BLE eruptive models.This is equivalent to the probability of observing a repose time less or equal than T decades.The horizontal and vertical lines show the different probabilities of at least one eruption in 20 decades with VEI≥3 (NHGPPP: 0.2777, MOED: 0.4371, the Weibull distribution: 0.2219 and Poisson distribution: 0.2465).
Fig. 4. (a) Cumulative number of eruptions of El Chichón volcano for the period 3707 years BLE in the magnitude class VEI≥3, assuming that the eruptions 1524 and 1657 years BLE are a single event occurred near 1591.The slopes of the dashed line represent the eruption rate of the whole series.(λ global =0.0243 eruptions per decade).(b) Probabilities calculated by NHGPPP, Poisson and Weibull distributions of at least one eruption, with VEI≥3 in a given time interval from "El Chichón 2C" over 3707 years BLE.The horizontal and vertical lines show the different probabilities of at least one eruption in 20 decades with VEI≥3 (NHGPPP: 0.4499, MOED: 0.3657, the Weibull distribution: 0.1452 and Poisson distribution: 0.3847).

Table 1 .
Volcanic Explosivity Indexes and ages of known eruptions with magnitude VEI≥3 of El Chichón volcano from different sources

Table 2 .
Two possible models of VEI magnitude distributions for the VEI≥3 eruptions of El Chichón volcano in 3707 years BLE.The second column reproduces the published VEI values listed in Table 1.

Table 3 .
Two possible models of the VEI magnitude distributions for eruptions with magnitude VEI≥3 of El Chichón volcano during a period of 7772 years BLE.The second column reproduces the reported VEI values listed in Table 1.
Mvei = a − bM vei .(1)Thisrelationhasbeen used on groups of volcanoes and on individual volcanoes to estimate the eruption rates for different VEI magnitudes (De la Cruz-Reyna, 1991, 1993; De la Cruz-Reyna and Carrasco-Núñez, 2002; De la Cruz-Reyna and Tilling, 2008; Mendoza-Rosas and De la Cruz-Reyna,2008)using the available eruption records to obtain selfconsistent series.We constructed two eruptive history models, shown in Table2, for the eruptive series of El Chichón volcano with the data available for the last 3707 years BLE.

Table 4 .
Eruption rates λ Mvei for each VEI class.The slope-b of the loglinear relationships given by Eq. (1), and the regression coefficients for each model of El Chichón volcano listed in the Table2for the period 3707 years BLE are included in the lower rows.

Table 5 .
Eruption rates λ Mvei for each VEI class, the slope-b of the loglinear relationships from Eq. (1), and the regression coefficients for the two models listed in the Table3for El Chichón volcano in the period 7772 years BLE.

Table 6 .
Eruptive regimes and calculated parameters of the MOED for the eruptive sequence of El Chichón volcano (VEI≥3).The global rates for the 3707 and 7772 years BLE periods are 0.026976 and 0.014153 eruptions per decade respectively.The regime rates are also in eruptions per decade.

Table 7 .
Statistics of goodness of fit tests for the different statistical methods.

Table 8 .
Serial correlation coefficients of different sets of eruptions.The sets are formed with different combinations of the average from the 5th to the 8th eruption (Table1).