Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios

Muhammad, Ario; Goda, Katsuichiro; Werner, Maximilian J.

doi:https://doi.org/10.5194/nhess-2022-59

Preprints

24 Feb 2022

Status: this preprint was under review for the journal NHESS. A final paper is not foreseen.

Abstract. We develop a novel framework of time-dependent probabilistic tsunami hazard analysis (PTHA) and apply it to Western Sumatra, Indonesia, where future tsunamigenic events are anticipated in the Mentawai region of the Sunda subduction zone. An earthquake rupture model taking into account the spatiotemporal interaction of major megathrust segments is used to simulate future tsunamigenic earthquakes. The earthquake rupture process of the segments is characterized by a multivariate Bernoulli model with interarrival times following a Brownian passage‐time distribution and the dependency between segments specified by a spatial correlation function. We calibrate this model with historical ruptures of the Mentawai thrust in the last 450 years. A total of ≥ 100,000 time-dependent earthquake rupture cases are then coupled with a stochastic tsunami simulation method to evaluate tsunami hazards. We generate a total of 6,300 stochastic tsunami source models from six magnitude scenarios between M 7.75 and M 9.0 and obtain time-dependent PTHA results for seven different periods (1, 5, 10, 20, 30, 50 and 450 years). We further compare the time-dependent PTHA results with a time-independent PTHA approach to investigate the influence of the spatiotemporal earthquake rupture model. The space-time interaction model successfully generates annual seismic moment rates consistent with the observations. Moreover, the model can capture the uncertainty of future time-dependent tsunami hazards. On the other hand, the time-independent approach produces slightly higher hazard estimates than the time-dependent model for long-term hazard assessments (> 450 years).

This preprint has been withdrawn.

Received: 18 Feb 2022 – Discussion started: 24 Feb 2022

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this preprint. The responsibility to include appropriate place names lies with the authors.

Interactive discussion

Status: closed

RC1:
'Comment on nhess-2022-59', Anonymous Referee #1, 18 Mar 2022

The manuscript “Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios” by Muhammad et al. presents an important application of time-dependent probabilistic tsunami hazard analysis (PTHA) to the central Sunda subduction zone. The method involves several novel components, such as stochastic tsunami simulation and space-time interactions among earthquakes, developed in previous publications but integrated in this applied study. The time-dependent component may be particularly important for regions that have recently had a large magnitude earthquake (see comment 1, however) and for short design exposure times. Several minor comments are indicated below, primarily related to unstated assumptions and parameter uncertainty. Upon revision, this paper should be a valuable contribution to Natural Hazards and Earth System Sciences.

General Comments

(1) The study is based on the idea that a BPT or other time-dependent rupture model more accurately represents earthquake behavior along the Sunda subduction zone. Given numerous papers refuting the seismic gap hypothesis for subduction zones in general (e.g., Rong et al., 2003, who cite Matthews, 2002), it seems that a logical first step for any study region is to falsify a Poisson null hypothesis.

(2) Although the definition of fault segments is based on 450 years of earthquake occurrence, this record still might not be sufficient to determine whether these segment boundaries are persistent (cf. Jackson et al., 2011).

(3) The earthquake occurrence model is based on a 1D (along strike) representation of the subduction zone. For the Sunda subduction zone, as with other subduction zones with a broad shelf, however, tsunami generation is critically dependent on the dip extent of rupture as was notably observed in comparisons of the 2004 and 2005 earthquakes (e.g., Geist et al., 2006). The limitation of the 1D approach should be mentioned.

(4) It seems that it would be straightforward to estimate uncertainties in mu, alpha, and gamma from the posterior distributions (confidence intervals). These uncertainties could then be used as part of the probabilistic calculations.

(5) My impression is that the maximum magnitude earthquake considered is from the 450-year record and essentially is an event that spans segments 1-6. Even though the tsunami from an Mmax event would have a low probability, such an event may pose a more significant component of the aggregate hazard for longer exposure times than considered in this study. It should be clarified how Mmax is determined and whether a penultimate event could extend beyond the study region.

(6) Tsunami heights seem to “saturate” at nearly 10m (Figure 13). Is this dependent on the largest magnitude earthquake or is this caused by a hydrodynamic effect?
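Regarding comment (1), one simple way to confront a Poisson null hypothesis is a Monte Carlo test on the coefficient of variation (CV) of interevent times. The sketch below is only an illustration under stated assumptions: the interevent times are hypothetical (not the Mentawai record), the test statistic (CV) and the one-sided alternative (quasi-periodic recurrence, i.e. CV lower than expected) are choices made here, and the function names are invented for this example.

```python
import math
import random


def coefficient_of_variation(intervals):
    """Sample coefficient of variation (standard deviation / mean)."""
    n = len(intervals)
    mean = sum(intervals) / n
    variance = sum((x - mean) ** 2 for x in intervals) / (n - 1)
    return math.sqrt(variance) / mean


def poisson_null_p_value(intervals, n_sim=10000, seed=0):
    """Monte Carlo p-value: how often is a Poisson catalogue at least as regular?"""
    rng = random.Random(seed)
    observed_cv = coefficient_of_variation(intervals)
    n = len(intervals)
    at_least_as_regular = 0
    for _ in range(n_sim):
        # The CV does not depend on the rate, so a unit rate is enough.
        simulated = [rng.expovariate(1.0) for _ in range(n)]
        if coefficient_of_variation(simulated) <= observed_cv:
            at_least_as_regular += 1
    return observed_cv, at_least_as_regular / n_sim


# Hypothetical interevent times in years (NOT the Mentawai record).
example_intervals = [95.0, 110.0, 80.0, 120.0]
cv, p_value = poisson_null_p_value(example_intervals)
print(f"CV = {cv:.3f}, one-sided p-value under the Poisson null = {p_value:.4f}")
```

With only a handful of intervals per segment, such a test has little power, which is itself relevant to how strongly a time-dependent model can be justified.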

In-line comments

L42: Vere-Jones’ stress release model (cf., Bebbington and Harte, 2001) could also be mentioned—more relevant to this study.

Eqn. 3 is a cumulative distribution function, not a frequency-magnitude distribution.

L141: I couldn’t find in the manuscript where the specific magnitude-area relation used was mentioned. Since this is often a contentious choice, especially for subduction zone earthquakes, the specific relation and its justification should be indicated.

L257: How is distance D determined?

L316: Same variable D used for slip here and distance in L257.

Fig. 3: “occurred scenarios” is awkward. Could just say “scenarios”.

References

Bebbington, M., and D. S. Harte (2001), On the Statistics of the Linked Stress Release Model, , , 176-187.

Geist, E. L., S. L. Bilek, D. Arcas, and V. V. Titov (2006), Differences in tsunami generation between the December 26, 2004 and March 28, 2005 Sumatra earthquakes, , , 185-193.

Jackson, D. D., Y. Y. Kagan, and H. Gupta (2011), Characteristic earthquakes and seismic gaps, , , 1539.

Rong, Y., D. D. Jackson, and Y. Y. Kagan (2003), Seismic gaps and earthquakes, Journal of Geophysical Research: Solid Earth, 108, ESE 6-1 - 6-14.

Citation: https://doi.org/10.5194/nhess-2022-59-RC1
- AC1: 'Reply on RC1', Ario Muhammad, 19 Jul 2022
  
  We would like to thank the reviewer for providing significant comments to improve our work.
  However, since the responses to the reviewer's comments are extensive, including several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 1.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC1
RC2:
'Comment on nhess-2022-59', Anonymous Referee #2, 21 Mar 2022

The paper "Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios" by Muhammad et al develops PTHA for western Sumatra using two alternative magnitude-frequency models; one that is time-dependent (and relatively novel), and another that is time-independent (and more conventional). The paper details the methodologies used to fit each model, and compares their results.

The subject of this paper is of interest for tsunami hazard science and the readership of NHESS. The presentation is mostly clear, although it could be improved in places (see the detailed comments below). But as far as I can tell, there are a number of weaknesses in this paper that will require major revisions to address before it is suitable for publication. These involve problems with the statistical methods, including: using very different data to fit each model; not treating issues of time-varying completeness in the long-term historical data; and use of questionable methods to set Bayesian priors. The paper also neglects key parameter uncertainties controlling the frequencies of earthquakes, which will probably have a large impact on the results (i.e. greater than the current differences between the time-dependent and time-independent models). The authors also need to adjust the introduction to better reflect the controversy in the literature regarding the performance of time-dependent models.

I hope the authors can consider the comments below and either fix their analyses, or (where I am mistaken) edit the manuscript to make their approaches clearer and more obviously defensible. While this will take significant work, I think it is certainly doable, and will make for a useful contribution to NHESS.

# HIGH-LEVEL COMMENTS

- Generally the paper argues that time-dependent modelling is more accurate, but doesn't provide strong justification for this. To my knowledge there are contrasting views on this in the literature, which should be represented in this paper. The language should be softened, and uncertainties better discussed (see detailed comments).

- The tsunami hazard results do not seem to account for uncertainties in the scenario-frequency model parameters (e.g. b, rate of earthquakes, maximum-magnitude, and other parameters). This is true for both the time-dependent and time-independent models, although details of their parameters are different. Variation of the model parameters within the statistical uncertainties will likely have a substantial impact on the results, especially given only 10 events have been used to constrain the time-dependent model (which has many parameters). When these uncertainties are accounted for, I expect they will be larger than the current difference between the time-dependent and time-independent results. Given the 'many synthetic catalogues' approach used in this paper, the uncertainties could be accounted for by randomly drawing different model parameters for each synthetic catalogue.

- The maximum-magnitude is set to Mw9. In reality maximum magnitudes are quite uncertain (justified further below) and yet very impactful for the results. Again I expect they are likely more important than the effect of time-dependence in the current modelling. Many other PTHA studies treat this as an uncertain parameter (details below), and I suggest that issue is also addressed in this paper.

- In so far as I can tell, there are a number of technical weaknesses in the scenario frequency modelling that should be addressed or clarified.

+ The long-term data (10 events over the last 450 years) likely has a time-varying magnitude of completeness; the earliest 8 events all have Mw>=8.3, while only the most recent 2 events (2007+) have Mw<=8. This is not surprising - a-priori we certainly expect that it would be harder to detect smaller earthquakes in the paleo data. But the statistical methods seem to ignore this issue. This could have a large impact on the fit of the time-dependent model.

+ The time-independent model is fit to different data than the time-dependent model, and the time-independent fit is dominated by small earthquakes (mostly having magnitudes well below the Mw 7.65+ that are of interest in this study). Even if there were no differences between the models, the use of such different data would lead to differences in their results. This makes it hard to determine the significance of the time-dependent model structure for the PTHA results. To remedy this, the long-term data should be used to constrain the time-independent model for larger magnitudes. This may require accounting for time-variations in the catalogue completeness (citations below), and placing less weight on low magnitude earthquakes (so they don't dominate the fit at higher magnitudes, which currently doesn't agree especially well with the data). This should help reduce the under-estimation of the earthquake frequencies with Mw>=8.3 (currently about three times less common in the time-independent model, vs the long term data).

+ There appear to be some anomalies in the Bayesian fit of the time-dependent model. The priors for some parameters seem to be set using the same data used for fitting, which should not be done with Bayesian statistics. Also, the figures show differences between the priors and posteriors that suggest a poor specification of the priors (details in comments below).

+ The fit of the time-dependent model seems to ignore the change in completeness magnitude of the long-term data.

# DETAILED COMMENTS

Near L25: 'Over the next decades, major tsunamigenic events are anticipated in .... '. It sounds like "we expect large tsunamis in each of these subduction zones within a few decades". I don't think this is well justified. Do the references really back up the 'major events in the next-few-decades' claim? Historically, time-dependent predictions over these kinds of timescales have not performed well for subduction zones (e.g. Rong et al., 2003).

Near L40: "...assuming a lack of memory between major earthquake occurrences is often viewed as a first approximation .." -- I think there are contrasting views on this in the literature, that should be represented in this part of the paper. For example Rong et al. (2003) are quite critical of assumed quasi-periodic earthquake recurrence (on empirical grounds). OTOH there is empirical evidence that large earthquakes tend to be weakly periodic, but without correlation between successive inter-event times (Griffin et al., 2020).

L53-54: "Recent work has also used high-resolution spatial grids .... to produce more accurate tsunami hazard results (e.g. < 90m, ...)". I don't think we should describe "< 90m" as high-resolution for onshore work, that is quite coarse. I might describe resolutions of 10m or less as high-resolution (e.g. Gibbons et al., 2020).

Near L67: "Since time-dependent hazard estimation leads to more realistic short-term results" -- this really needs justification, or removal. To my knowledge this point has not been demonstrated in general, and it may-or-may-not be true. I suppose for aftershock modelling there would be lots of evidence, but this study is using quasi-periodic modelling for large events, and I believe there is less evidence on this matter. In the Paleo record, some sites look more time-dependent than others, e.g. Griffin et al. 2020.

Near L74: "A uniform-slip was used, which may underestimate the hazard..." -- I believe Horspool et al. (2014) used a log-normal distribution to predict the (uncertain) heights at the coast from the uniform-slip scenarios, as a way of accounting for uncertainties due to the slip model and uncertain geometry. In principle this is supposed to compensate for the lack of slip heterogeneity. In practice it could either underestimate, or overestimate, the variability of natural earthquake-tsunamis. If their sigma were sufficiently large, it may even predict greater hazard than your model (I haven't checked whether it does, just clarifying the principle).

L86: Please add a statement about why you use segments to define the rupture extents (I think it is related to the space-time modelling?).

L115: "magnitude-frequency distribution" -- Should this be "probability density function"? I think the MFD would include the factor lambda_i.

L117: "frequency-magnitude distribution" -- I think this should be "Cumulative Distribution Function"? Furthermore I think you need to say that f_i(M) is the derivative of F(M) (and consider whether you need a subscript _i for F).
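To illustrate the distinction raised in the two comments above, the following sketch checks numerically that a density f(M) is the derivative of its cumulative distribution function F(M). It assumes a doubly truncated Gutenberg-Richter form, which may differ from the manuscript's Equation 3; the b-value is illustrative, and the bounds of 7.6 and 9.0 are taken from the magnitude range discussed in these reviews. A magnitude-frequency distribution in the usual sense would additionally multiply f(M) by the source rate lambda_i.

```python
import math


def gr_cdf(m, b, m_min, m_max):
    """Doubly truncated Gutenberg-Richter CDF, F(M), for m_min <= m <= m_max."""
    numerator = 1.0 - 10.0 ** (-b * (m - m_min))
    denominator = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return numerator / denominator


def gr_pdf(m, b, m_min, m_max):
    """Probability density f(M) = dF/dM of the same distribution."""
    denominator = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return b * math.log(10.0) * 10.0 ** (-b * (m - m_min)) / denominator


# Illustrative parameters: b is an assumption, not a fitted value.
b_value, m_lower, m_upper = 1.0, 7.6, 9.0
m_test, step = 8.3, 1e-5

# Central finite difference of the CDF should reproduce the density.
finite_difference = (
    gr_cdf(m_test + step, b_value, m_lower, m_upper)
    - gr_cdf(m_test - step, b_value, m_lower, m_upper)
) / (2.0 * step)
print("dF/dM (numerical):", finite_difference)
print("f(M)  (analytical):", gr_pdf(m_test, b_value, m_lower, m_upper))
```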

L125: In Equation 4, the subscript '_i' might be confused with the same subscript used to denote the source in Equation 2. Also, I think Eq 4 should use 'j' for consistency with notation in the paragraph just before Equation 4?

L127: Here I am concerned that you are not using the long-term paleo data to fit the GR model. Why not? The longer term data suggests a high rate of Mw >= 8.3 (8 events in 450 years, rate around 0.018), quite a bit more frequent than suggested by your time-independent model (visually seems ~ 0.006 in Figure 1C, or one-third the frequency -- noting this fit is dominated by low-magnitude earthquakes, below magnitudes of practical interest for this study). As well as taking the opportunity to improve the model accuracy, this would be good because the long-term data is used for fitting the time-dependent model. The use of very different data to fit the two models allows for a substantial 'arbitrary' difference between their results, which is not related to their structure (temporal/non-temporal). I am concerned that this may dominate the differences in your results. I would suggest you fit the time-independent GR model using both the long-term and catalogue data (there are various approaches to treating the varying completeness magnitude, e.g. Weichert, 1980), while removing the instrumental events from the long-term data. Also, you might want to use fewer low-magnitude earthquakes to constrain the fit (to reduce the influence of low-magnitude earthquakes on the fit, and better represent the data at magnitudes that matter for this PTHA).
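As a rough sketch of the kind of fit suggested above, the following implements a maximum-likelihood estimate for binned magnitudes with unequal observation periods, in the spirit of Weichert (1980). It is not the authors' method. The bin centres, counts and completeness periods are hypothetical placeholders, and the function name is invented here. For reference, the rate quoted above follows from 8 / 450 ≈ 0.018 events per year.

```python
import math


def weichert_fit(mags, counts, years, tol=1e-10):
    """Estimate beta (= b * ln 10) and the total annual rate over all bins.

    mags   : bin-centre magnitudes
    counts : number of events observed in each bin
    years  : length of the complete observation period for each bin
    Assumes the solution for beta lies between 1e-6 and 20.
    """
    n_total = sum(counts)
    mean_mag = sum(n * m for n, m in zip(counts, mags)) / n_total
    m_ref = min(mags)  # offset for numerical stability; ratios are unchanged

    def weighted_mean(beta):
        weights = [t * math.exp(-beta * (m - m_ref)) for t, m in zip(years, mags)]
        return sum(w * m for w, m in zip(weights, mags)) / sum(weights)

    # weighted_mean decreases with beta, so bisection finds the root.
    low, high = 1e-6, 20.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if weighted_mean(mid) > mean_mag:
            low = mid
        else:
            high = mid
    beta = 0.5 * (low + high)

    decay = [math.exp(-beta * (m - m_ref)) for m in mags]
    rate = n_total * sum(decay) / sum(t * d for t, d in zip(years, decay))
    return beta, rate


# Hypothetical inputs: smaller events are complete only for a short recent period.
bin_centres = [7.75, 8.25, 8.75]
event_counts = [2, 5, 3]
complete_years = [20.0, 450.0, 450.0]

beta_hat, rate_hat = weichert_fit(bin_centres, event_counts, complete_years)
print("b-value:", beta_hat / math.log(10.0))
print("annual rate of events in the lowest bin and above:", rate_hat)
```

The point of the unequal periods is that the two smallest events only count against the short interval in which such events could have been detected, rather than against the full 450 years.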

L135: Around here, could you please explicitly state that the time-dependent model does not have Mw-frequency curves that follow the GR distribution, over any time-scale. I didn't realise this initially, and it is obviously a very important point for the subsequent analysis. Perhaps a sentence highlighting that instead the Mw-frequency distribution will reflect correlations between rupture on different segments, which is parameterised by the model itself.

L150: It looks like the magnitudes only go up to 9? I think this is neglecting the large uncertainties in Mw-max. Neglect of those uncertainties may have a strong impact on the results. A few relevant points: Berryman et al. (2015) suggested uncertain Mw-max values in this region ranging from 9.0 - 9.6 based on scaling relations and the historical record. Such highly uncertain Mw-max values have been represented in PTHAs (e.g. Davies et al., 2017; Davies and Griffin, 2020). Horspool et al. (2014) allowed Mw-max on Sumatra to vary in 9.3 - 9.7. We know the nearby 2004 event had a magnitude exceeding 9 (around 9.2). From Tohoku we also know that Mw 9.1 can occur in relatively compact regions, smaller than the extents of your study. On this basis I don't think we can exclude the possibility of higher magnitude earthquakes.

L153: "..for each of those 21 rupture scenarios" -- suggest to add "geometrical" before "rupture scenarios", to be consistent with previous sentences. Here there are a few interacting concepts: "geometrical rupture scenarios (seems to be a magnitude plus a set of segments?)", "scenarios", "events" (is this the same as "scenarios"). I suggest you pick one term for each concept, and then use it consistently throughout the paper.

L 164: "(one height for one simulation catalog)" -- does the height vary with space, or are we looking at the 'maximum height anywhere in the model'?

L 172: "The results confirm that N_{sim} = 100,000 catalogues are sufficient to produce a stable result" -- stable in terms of what? The mean over all catalogues? Please make this clear, as I suppose individual catalogues must vary greatly.

L175-179: This section is confusing me. Above I understood that you used N_{sim} = 100,000 to get a stable result. But now it is suggested that many more catalogues were required for 1-50 years. Please edit to make this clearer. [NOTE: Some sentences from the 'Results' section may help in this regard, mentioned below.].

L197-199: "This number is consistent with the GR model". In my judgement they are "not very consistent", with the model under-predicting the frequency of large events (as discussed above, the GR model has a substantially lower frequency of Mw>=8.3). Note the 450 year record contains 10 events (Mw 7.8-8.9), but the first 8 events have Mw>=8.3, and the last two events are from the recent instrumental period. This suggests changes in the magnitude of completeness of the 450 year catalogue over time. A priori we expect this would happen because Paleo records find it more difficult to detect small events. This issue should be accounted for when comparing the GR model with the long term data (and above I suggest that the long-term data should also be used to fit the time-independent GR model -- doing that will probably lead to significant increases in the modelled frequency of large earthquakes).

L205: "see Figure and Figure 5" -- missing Figure number.

L217: Suggest you use a word other than "scenarios" to denote the 21 "magnitude + set-of-segments" combinations.

L250-ish: Above I argued that the long-term data (10 events, 450 years) is likely subject to a varying completeness magnitude, noting the only two events with Mw<8 are recent instrumental events, and all others have Mw>=8.3. From what I can see, this 'changing completeness magnitude' is not accounted for in the statistical fit of the time-varying model (Section 2.2.2). I expect this would have a large effect on time-varying model fit - for example, overestimating the conditional probability of multi-segment rupture (which also affects the frequency of high Mw events), and affecting the BPT model parameters.

L250: "The prior median of mu for each segment is different, namely ....... These values represent the median interarrival time of earthquake rupture on each segment over the last 450 years". It sounds like you are using the same data both to specify the priors, and then to fit the model (?). In Bayesian statistics, the priors should be specified in a way that doesn't use the fitting data, or at least doesn't use it in important ways. Another potential problem with the methodology is suggested by Figure 9, where we see the prior and posterior for 'mu' are very different on some segments -- the posterior is more diffuse and often has a very different average (e.g. Panels A, I, K). This suggests the priors have been overly constrained in the analysis. Typically priors would be set either using data different from that used for fitting, or given weakly informative values.

L310: This source zone has some history of "tsunami-earthquakes", with waves much larger than might be expected from the magnitude (e.g. Mentawai 2010). Can the current model produce similar large waves for scenarios with magnitude below 8, using the rigidity of 40 GPa? I would be surprised if it can, although that will also depend on how concentrated the slip is allowed to be. Please add a comment on the capacity for the model to make 'tsunami-earthquake' type scenarios.
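The sensitivity to rigidity can be made concrete with the standard relation M0 = rigidity × area × mean slip. The sketch below is only illustrative: the rupture area and the lower rigidity value are hypothetical, and the moment-magnitude constant (9.05 for M0 in N·m) is one common convention that may differ slightly from the one used in the manuscript.

```python
def seismic_moment(mw):
    """Seismic moment in N·m from moment magnitude (log10 M0 = 1.5 Mw + 9.05)."""
    return 10.0 ** (1.5 * mw + 9.05)


def mean_slip(mw, area_km2, rigidity_gpa):
    """Mean slip in metres implied by M0 = rigidity * area * slip."""
    area_m2 = area_km2 * 1.0e6
    rigidity_pa = rigidity_gpa * 1.0e9
    return seismic_moment(mw) / (rigidity_pa * area_m2)


# Hypothetical Mw 7.8 rupture of 100 km x 40 km.
area = 100.0 * 40.0
slip_high_rigidity = mean_slip(7.8, area, rigidity_gpa=40.0)
slip_low_rigidity = mean_slip(7.8, area, rigidity_gpa=10.0)  # assumed shallow value
print("mean slip at 40 GPa (m):", slip_high_rigidity)
print("mean slip at 10 GPa (m):", slip_low_rigidity)
```

For this hypothetical geometry the mean slip rises from about 3.5 m at 40 GPa to about 14 m at 10 GPa for the same magnitude, which is why a constant, relatively high rigidity may suppress tsunami-earthquake type scenarios.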

L317 "... 300 stochastic models are sufficient to simulate stable and consistent tsunami heights and depths" -- I think this must depend on the model region, and what you are interested in. For instance it would not give an accurate representation of the 99.5th percentile. Also for a model where only a very small part of the source-zone could affect the site of interest, one might need to generate many scenarios to get enough relevant scenarios. In summary, I don't think you can refer to stability tests from another study to provide justification for using 300 models in this study. Instead, can you report on a test that is specific to this case?

L347: "... the final parameter estimates are taken from the maximum a posterior". It would be better to account for the model uncertainty (also in Mw-max, b, etc), which should be substantial given the limited data available to fit the model, and will probably have a substantial impact on the hazard. One way to do this would be to draw a different parameter set for each of the large number of synthetic catalogues that are simulated.
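A minimal sketch of that suggestion, for a single segment, is given below. It assumes posterior samples of the BPT parameters (mu, alpha) are available; the values used are hypothetical, the function names are invented for this example, and spatial interaction between segments is ignored. BPT interarrival times are drawn with the Michael-Schucany-Haas method for the inverse Gaussian distribution, using shape parameter mu / alpha².

```python
import math
import random
import statistics


def sample_bpt(mu, alpha, rng):
    """Draw one BPT (inverse Gaussian) interarrival time with mean mu and CV alpha."""
    lam = mu / alpha ** 2
    y = rng.gauss(0.0, 1.0) ** 2
    x = mu + mu * mu * y / (2.0 * lam) - (mu / (2.0 * lam)) * math.sqrt(
        4.0 * mu * lam * y + mu * mu * y * y
    )
    if rng.random() <= mu / (mu + x):
        return x
    return mu * mu / x


def count_events(mu, alpha, duration, rng):
    """Number of ruptures within `duration` years, starting just after an event."""
    elapsed, n_events = 0.0, 0
    while True:
        elapsed += sample_bpt(mu, alpha, rng)
        if elapsed > duration:
            return n_events
        n_events += 1


def simulate_counts(parameter_sets, n_catalogues, duration, seed=0):
    """Simulate catalogues, drawing a fresh parameter set for each one."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_catalogues):
        mu, alpha = rng.choice(parameter_sets)
        counts.append(count_events(mu, alpha, duration, rng))
    return counts


# Hypothetical posterior samples (mu in years, alpha); not the manuscript's values.
posterior_samples = [(150.0, 0.5), (200.0, 0.8), (250.0, 1.0), (300.0, 0.7)]
point_estimate = [(200.0, 0.8)]

fixed = simulate_counts(point_estimate, n_catalogues=2000, duration=450.0, seed=1)
varied = simulate_counts(posterior_samples, n_catalogues=2000, duration=450.0, seed=1)
print("point estimate only: mean", statistics.mean(fixed), "stdev", statistics.stdev(fixed))
print("posterior draws:     mean", statistics.mean(varied), "stdev", statistics.stdev(varied))
```

Comparing the spread of the two sets of catalogues gives a first impression of how much of the hazard uncertainty is hidden when only a single parameter estimate is used.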

Section 3.1: As discussed earlier, please comment on why the 'mu-priors' for some segments are so different to the posteriors (little overlap for Fig 9 panels A, I, K). This is surprising, especially considering that the priors were apparently constructed using the same data used for fitting. To me it suggests weaknesses in how the priors were constructed, or some other problem.

L356-360: This is a very clear description of how the catalogue duration was defined. I suggest you move this to the earlier methods section (where I expressed confusion about the method).

L360 and Figure 10: Regarding the validation of the annual seismic moment release: Considering that the data was used to fit the model, I don't think the observations/model are particularly consistent on segments 3 and 4. In both cases the observed data exceeds the 90th percentile of the model. Again this seems to suggest some under-estimation in the model, as discussed repeatedly above. Please check that this is all correct following revisions, and if it is, add a comment explaining why this is nonetheless reasonably consistent.

L379: Figure 11C is not a strong basis for making a point about which segments rupture more or less, because it is only 1 catalogue. Can you please make a figure that better justifies the points made in this paragraph?

Line 447 and Figure 12: The conditional probability of Mw9.0 (if an earthquake occurs) is larger in the time independent case. But I doubt that these results will be robust to parameter uncertainties in the time-dependent model, considering that limited data (10 events, that likely has time-varying completeness) was available to fit its many parameters. This further suggests the importance of considering model parameter uncertainties in the PTHA.

Line 451: One factor neglected in this discussion is the effect of using different datasets to fit the 2 models, which could cause differences in the results even if there was no other difference between the two kinds of models. I think the calculations in this paper should be revised so that the time-independent model is informed by the long-term data, and that parameter uncertainties in both the time-independent and time-dependent models are accounted for. In my judgment it is likely that the parameter uncertainties could lead to differences in the results that are substantially larger than differences between the current time-dependent and time-independent models.

# TECHNICAL CORRECTIONS

None for now.

# REFERENCES

Berryman, K.; Wallace, L.; Hayes, G.; Bird, P.; Wang, K.; Basili, R.; Lay, T.; Pagani, M.; Stein, R.; Sagiya, T.; Rubin, C.; Barreintos, S.; Kreemer, C.; Litchfield, N.; Stirling, M.; Gledhill, K.; Haller, K. & Costa, C. The GEM Faulted Earth Subduction Interface Characterisation Project: Version 2.0 – April 2015 GEM, GEM, 2015

Davies, G.; Griffin, J.; Løvholt, F.; Glimsdal, S.; Harbitz, C.; Thio, H. K.; Lorito, S.; Basili, R.; Selva, J.; Geist, E. & Baptista, M. A. A global probabilistic tsunami hazard assessment from earthquake sources Geological Society, London, Special Publications, Geological Society of London, 2017

Davies, G. & Griffin, J. Sensitivity of Probabilistic Tsunami Hazard Assessment to Far-Field Earthquake Slip Complexity and Rigidity Depth-Dependence: Case Study of Australia Pure and Applied Geophysics, 2020, 177, 1521–1548

Gibbons, S. J.; Lorito, S.; Macías, J.; Løvholt, F.; Selva, J.; Volpe, M.; Sánchez-Linares, C.; Babeyko, A.; Brizuela, B.; Cirella, A.; Castro, M. J.; de la Asunción, M.; Lanucara, P.; Glimsdal, S.; Lorenzino, M. C.; Nazaria, M.; Pizzimenti, L.; Romano, F.; Scala, A.; Tonini, R.; Manuel González Vida, J. & Vöge, M. Probabilistic Tsunami Hazard Analysis: High Performance Computing for Massive Scale Inundation Simulations. Frontiers in Earth Science, 2020, 8, 623

Griffin, J. D.; Stirling, M. W. & Wang, T. Periodicity and Clustering in the Long-Term Earthquake Record Geophysical Research Letters, American Geophysical Union (AGU), 2020, 47

Horspool, N.; Pranantyo, I.; Griffin, J.; Latief, H.; Natawidjaja, D. H.; Kongko, W.; Cipta, A.; Bustamam, B.; Anugrah, S. D. & Thio, H. K. A probabilistic tsunami hazard assessment for Indonesia Natural Hazards and Earth System Sciences, 2014, 14, 3105-3122

Rong, Y.; Jackson, D. D. & Kagan, Y. Y. Seismic gaps and earthquakes Journal of Geophysical Research: Solid Earth, Wiley-Blackwell, 2003, 108

Weichert, D. H. Estimation of the earthquake recurrence parameters for unequal observation periods for different magnitudes Bulletin of the Seismological Society of America, 1980, 70, 1337-1346

Citation: https://doi.org/10.5194/nhess-2022-59-RC2
- AC2: 'Reply on RC2', Ario Muhammad, 19 Jul 2022
  
  We thank the reviewer for providing significant comments to improve our work.
  However, since the responses to the reviewer's comments are extensive, including several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 2.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC2
RC3:
'Comment on nhess-2022-59', Anonymous Referee #3, 23 Mar 2022
Review of Muhammad et al

This paper presents a methodology for time-dependent probabilistic tsunami hazard analysis with stochastic earthquake rupture modelling, using the Mentawai region of the Sunda Subduction Zone as a case study. This is a novel and ambitious approach, and it is exciting to see the efforts made by the authors. In my view, the complexity of the model does however pose some challenges, and I think there are a number of points that require further justification and/or consideration of the choices made in the model. I expect this will require some effort to revise the model.

Major comments

Justification of the choice of a time-dependent approach. A number of recent studies of global paleoearthquake records (Williams et al 2019; Griffin et al 2020; Moernaut 2020) have, to varying degrees, provided empirical support for weakly quasiperiodic earthquake recurrence as a general model, which can be used to justify the use of renewal models for hazard assessment. That said, the Mentawai record of Philibosian et al (2017) looks to be more random than quasiperiodic in the analysis presented by Griffin et al (2020), although perhaps a different result might be obtained using the segmentation model presented here. The posterior BPT parameter estimates given for each segment are also relevant – some give values of alpha ~1 (segments 2, 3 and 4), implying random recurrence (i.e. Poisson), while others are ~0.6 (segments 1, 5 and 6), implying moderately quasiperiodic recurrence. So, I think some comment needs to be made here that:

- At a global scale there is empirical support for weakly quasiperiodic earthquake recurrence as a general model (see Griffin et al 2020);

- Excluding the hypothesis at the individual fault level is difficult, particularly for short records (Williams et al 2019; Griffin et al 2020);

- The data from Philibosian et al (2017) is somewhat equivocal about whether earthquake recurrence here is truly time-dependent, and the Poisson hypothesis cannot be confidently excluded using these data. But the global studies mentioned above suggest it is not unreasonable to assume time-dependence as a hypothesis.

The discussion section of the paper could then discuss the implications of this assumption in light of the different values of alpha obtained for each segment.
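To make the practical meaning of the different alpha values more tangible, the sketch below compares the conditional rupture probability from a BPT renewal model with the Poisson probability for the same mean recurrence time. The alpha values of 0.6 and 1.0 echo the posterior estimates mentioned above; the mean recurrence time, elapsed time and forecast window are hypothetical, and the function names are invented here. Note that a BPT model with alpha = 1 shares the coefficient of variation of a Poisson process but not its exact distribution.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bpt_cdf(t, mu, alpha):
    """CDF of the Brownian passage-time distribution (mean mu, aperiodicity alpha).

    The exp(2 / alpha**2) factor overflows for very small alpha (below about 0.06).
    """
    if t <= 0.0:
        return 0.0
    u1 = (math.sqrt(t / mu) - math.sqrt(mu / t)) / alpha
    u2 = (math.sqrt(t / mu) + math.sqrt(mu / t)) / alpha
    return std_normal_cdf(u1) + math.exp(2.0 / alpha ** 2) * std_normal_cdf(-u2)


def bpt_conditional_probability(elapsed, window, mu, alpha):
    """P(rupture within `window` years | no rupture in the last `elapsed` years)."""
    f_now = bpt_cdf(elapsed, mu, alpha)
    f_later = bpt_cdf(elapsed + window, mu, alpha)
    return (f_later - f_now) / (1.0 - f_now)


def poisson_probability(window, mu):
    """Time-independent probability of at least one event within `window` years."""
    return 1.0 - math.exp(-window / mu)


# Hypothetical segment: mean recurrence 200 years, 30-year forecast window.
mean_recurrence, forecast_window = 200.0, 30.0
for elapsed_years in (10.0, 200.0):
    for aperiodicity in (0.6, 1.0):
        p = bpt_conditional_probability(
            elapsed_years, forecast_window, mean_recurrence, aperiodicity
        )
        print(f"elapsed {elapsed_years:5.0f} yr, alpha {aperiodicity}: BPT {p:.3f}")
print(f"Poisson: {poisson_probability(forecast_window, mean_recurrence):.3f}")
```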

In estimating parameters for the BPT distribution, the authors use the data to estimate the prior distribution of mu, before then using the same data to calculate the posterior probability distribution of mu. This is incorrect. I would suggest using an uninformative prior (e.g. as used by Fitzenz et al 2010). An alternative approach could be to use an informative prior for mu based on the slip rate (e.g. as determined from geodesy), but this may become complex (e.g. due to having to estimate coupling of the fault). The 450 year long record is short for accurately estimating model parameters. This is, of course, what a Bayesian approach should be helping with, but it needs more care in the choice of priors.
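As an illustration of what a prior that does not reuse the fitting data could look like, the sketch below computes a grid posterior for the BPT mean mu under a log-uniform prior, p(mu) ∝ 1/mu. This is one common weakly informative choice and is not necessarily the prior of Fitzenz et al (2010). The interevent times and the fixed alpha are hypothetical, the open interval since the last event is ignored for brevity, and the function names are invented here.

```python
import math


def bpt_log_pdf(t, mu, alpha):
    """Log density of the Brownian passage-time distribution at t > 0."""
    return 0.5 * math.log(mu / (2.0 * math.pi * alpha ** 2 * t ** 3)) - (
        (t - mu) ** 2 / (2.0 * mu * alpha ** 2 * t)
    )


def grid_posterior_mu(intervals, alpha, mu_grid):
    """Posterior weights for mu on an evenly spaced grid, with prior p(mu) ∝ 1/mu."""
    log_posterior = []
    for mu in mu_grid:
        log_likelihood = sum(bpt_log_pdf(t, mu, alpha) for t in intervals)
        log_posterior.append(log_likelihood - math.log(mu))
    peak = max(log_posterior)  # subtract the peak to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_posterior]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical interevent times (years) and aperiodicity; not the manuscript's data.
example_intervals = [100.0, 100.0, 100.0, 100.0, 100.0]
mu_values = [50.0 + k for k in range(451)]  # 50 to 500 years in 1-year steps
posterior = grid_posterior_mu(example_intervals, alpha=0.6, mu_grid=mu_values)

mode_mu = mu_values[posterior.index(max(posterior))]
print("posterior mode of mu (years):", mode_mu)
```

With so few intervals the posterior stays broad, which is the behaviour one would hope to carry through to the hazard results instead of a single point estimate.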

I am also concerned that fitting the model parameters to each segment individually is problematic. Later you consider multi-segment ruptures, and it is not clear how all this fits together. Do the recurrence statistics obtained from the sum of all synthetic ruptures across all segments match the recurrence statistics from the sum of all historic/paleo ruptures in your data? Checking this could be a good test for your model.

Also related to parameter estimation, some of the posterior histograms seem a bit spiky; does this improve if the number of samples is increased beyond 10,000?

Spatio-temporal completeness of the paleo record compared with the instrumental record is an issue that I think could lead to biases in the parameter estimates. It is very unlikely that events similar to the Mw 7.8 2010 Mentawai event would be visible in the coral record; this event occurred near the trench and caused <4 cm subsidence on the Mentawai Islands as measured with GPS (Hill et al 2012). Related to the above, the Mmin of 7.6 (L129), while reasonable from a tsunami hazard assessment perspective, would mean that you are modelling events that are unlikely to be present in the paleoearthquake record. I am unsure of how the frequency of these events could be determined in the time-dependent approach. It therefore seems likely that, in your current approach, smaller events are missing from the paleoearthquake record, which would affect the recurrence model parameters.

The 1D rupture segmentation is a problem for tsunami hazard assessment, as the resulting tsunami size depends so significantly on the depth of rupture. Compare the 2007 Bengkulu earthquakes (Mw 8.4 and 7.9), which ruptured down-dip of the trench and did not generate a significant tsunami, with the 2010 Mentawai earthquake (Mw 7.8), which occurred near the trench and did generate a significant tsunami. It is not clear whether such events are discriminated by the stochastic modelling approach with 1D segmentation – it seems they probably aren’t, but I may not be understanding correctly. A related problem is low rigidity near the trench and its tsunamigenic potential, as in the 2010 Mentawai tsunami. How might the assumption of constant (and relatively high) rigidity (L309-310) bias your tsunami hazard results?

The maximum magnitude of 9.0 seems too low, which seems related to the segmentation model. If the potential for ruptures connecting with other segments of the Sunda Subduction Zone is considered, then larger Mmax values are justified. Significantly larger Mmax’s were used in Horspool et al (2014). Even if the paleoearthquake record for the past 450 years suggests events haven’t exceeded Mw 9.0, we also don’t expect these magnitude events to occur all that often. So allow for the possibility that they are missing from the record.

Some areas of the coast of Padang show zero probability of inundation (Figure 17), while in others the potential inundation extent extends quite a way inland. This raises some significant concerns for me about the quality of the inundation modelling and/or the elevation data used, given how low-lying the coast is in this area. If only SRTM data was used, this could significantly underestimate inundation extent (see Griffin et al 2015, Figure 8). Are buildings included in the elevation model?

Detailed comments

L15: Suggest change ‘A total of >’ to ‘More than’

L18: Forecast periods begin in what year?

L136: Choice of BPT is fine, but hasn’t really been justified here. Why is this chosen over lognormal, Weibull or Gamma? Some of your justification seems to be presented later in Section 2.2.

L174: Several thousand years

L185: Perhaps rephrase as ‘reflects the expectations of elastic rebound theory’, or similar.

L192: Should probably cite others who’ve used Bayesian approaches to fitting time-dependent models to earthquake records, in particular Rhoades et al (1994) and Fitzenz et al (2010).

L197 and Table 1: These should not be referred to as tsunamigenic. For half of them we have no information on whether a tsunami was generated; coseismic deformation on the Mentawai Islands observed in coral paleogeodetic records suggests they probably were, but we don’t actually know.

L324. Please give a link or citation for DEM5 and Bathy5.

L332: Might be a typo here – Griffin et al (2016) used a Manning’s roughness of 0.036 as a conservative minimum for land (grassland; for the Mentawai Islands). For the urban context here, 0.06 may be reasonable, e.g. Griffin et al (2015) suggested a Manning’s roughness of 0.08 for the city of Padang. See also Kaiser et al (2011) for a discussion of choice of Mannings n.

L501-502: The time-independent model has too low an Mmax (9.0) to be considered worst-case. See earlier comments about choice of Mmax.

Table 1: Change Shieh to Sieh.

Figure 3, and also in the text. I do not think the term ‘occurred’ scenarios is the best terminology. These are modelled scenarios that have not actually occurred.

References:

Fitzenz, D. D., Ferry, M. A., & Jalobeanu, A. (2010). Long-term slip history discriminates among occurrence models for seismic hazard assessment. , (20), 1–5. https://doi.org/10.1029/2010GL044071

Griffin, J., Latief, H., Kongko, W., Harig, S., Horspool, N., Hanung, R., Rojali, A., Maher, N., Fuchs, A., Hossen, J., & others. (2015). An evaluation of onshore digital elevation models for modeling tsunami inundation zones. , (32).

Griffin, J. D., Stirling, M. W., & Wang, T. (2020). Periodicity and Clustering in the Long-Term Earthquake Record. Geophysical Research Letters, 47(22). https://doi.org/10.1029/2020GL089272

Horspool, N., Pranantyo, I., Griffin, J., Latief, H., Natawidjaja, D. H., Kongko, W., Cipta, A., Bustaman, B., Anugrah, S. D., & Thio, H. K. (2014). A probabilistic tsunami hazard assessment for Indonesia. Natural Hazards and Earth System Sciences, 14(11), 3105–3122.

Hill, E. M., Borrero, J. C., Huang, Z., Qiu, Q., Banerjee, P., Natawidjaja, D. H., Elosegui, P., Fritz, H. M., Suwargadi, B. W., Pranantyo, I. R., Li, L. L., Macpherson, K. A., Skanavis, V., Synolakis, C. E., & Sieh, K. (2012). The 2010 Mw 7.8 Mentawai earthquake: Very shallow source of a rare tsunami earthquake determined from tsunami field survey and near-field GPS data. , (6), 1–21. https://doi.org/10.1029/2012JB009159

Kaiser, G., Scheele, L., Kortenhaus, A., Løvholt, F., Römer, H. and Leschka, S., 2011. The influence of land cover roughness on the results of high resolution tsunami inundation modeling. , (9), pp.2521-2540.

Moernaut, J. (2020, November 1). Time-dependent recurrence of strong earthquake shaking near plate boundaries: A lake sediment perspective. . Elsevier B.V. https://doi.org/10.1016/j.earscirev.2020.103344

Rhoades, D. A., & Van Dissen, R. J. (2003). Estimates of the time-varying hazard of rupture of the Alpine Fault, New Zealand, allowing for uncertainties. , (4), 479–488. https://doi.org/10.1080/00288306.2003.9515023

Williams, R. T., Davis, J. R., & Goodwin, L. B. (2019). Do Large Earthquakes Occur at Regular Intervals Through Time? A Perspective From the Geologic Record. , (14), 8074–8081. https://doi.org/10.1029/2019GL083291
Citation: https://doi.org/10.5194/nhess-2022-59-RC3
- AC3: 'Reply on RC3', Ario Muhammad, 19 Jul 2022
  
  We thank the reviewer for providing significant comments to improve our work.
  However, because our responses to the reviewer's comments are extensive, including several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 3.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC3

Interactive discussion

Status: closed

RC1:
'Comment on nhess-2022-59', Anonymous Referee #1, 18 Mar 2022

The manuscript “Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios” by Muhammad et al. presents an important application of time-dependent probabilistic tsunami hazard analysis (PTHA) to the central Sunda subduction zone. The method involves several novel components, such as stochastic tsunami simulation and space-time interactions among earthquakes, developed in previous publications but integrated in this applied study. The time-dependent component may be particularly important for regions that have recently had a large magnitude earthquake (see comment 1, however) and for short design exposure times. Several minor comments are indicated below, primarily related to unstated assumptions and parameter uncertainty. Upon revision, this paper should be a valuable contribution to Natural Hazards and Earth System Sciences.

General Comments

(1) The study is based on the idea that a BPT or other time dependent rupture model more accurately represents earthquake behavior along the Sunda subduction zone. Given numerous papers refuting the seismic gap hypothesis for subduction zones in general (e.g., Rong et al., 2003 who cite Matthews, 2002), it seems that a logical first step for any study region is to falsify a Poisson null hypothesis.

(2) Although the definition of fault segments is based on 450 years of earthquake occurrence, there still might not be sufficient data to determine if these segment boundaries are persistent (cf., Jackson et al, 2011).

(3) The earthquake occurrence model is based on a 1D (along strike) representation of the subduction zone. For the Sunda subduction zone, as with other subduction zones with a broad shelf, however, tsunami generation is critically dependent on the dip extent of rupture as was notably observed in comparisons of the 2004 and 2005 earthquakes (e.g., Geist et al., 2006). The limitation of the 1D approach should be mentioned.

(4) It seems that it would be straightforward to estimate uncertainties in mu, alpha, and gamma from the posterior distributions (confidence intervals). These uncertainties could then be used as part of the probabilistic calculations.

(5) My impression is that the maximum magnitude earthquake considered is from the 450-year record and essentially is an event that spans segments 1-6. Even though the tsunami from an Mmax event would have a low probability, such an event may pose a more significant component of the aggregate hazard for longer exposure times than considered in this study. It should be clarified how Mmax is determined and whether a penultimate event could extend beyond the study region.

(6) Tsunami heights seem to “saturate” at nearly 10m (Figure 13). Is this dependent on the largest magnitude earthquake or is this caused by a hydrodynamic effect?
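
Comment (4) above can be made concrete with a short sketch of an equal-tailed credible interval computed directly from posterior samples. This is only an illustration under stated assumptions: the posterior samples here are synthetic draws from a normal distribution and stand in for the MCMC samples of mu, alpha or gamma, and the function names are hypothetical.

```python
import random


def quantile(sorted_values, p):
    """Linearly interpolated quantile of an already sorted list, 0 <= p <= 1."""
    pos = p * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1.0 - frac) + sorted_values[upper] * frac


def credible_interval(samples, level=0.90):
    """Equal-tailed credible interval from posterior samples."""
    ordered = sorted(samples)
    tail = (1.0 - level) / 2.0
    return quantile(ordered, tail), quantile(ordered, 1.0 - tail)


# Synthetic stand-in for posterior samples of mu (years); not real results.
rng = random.Random(42)
mu_samples = [rng.gauss(200.0, 40.0) for _ in range(10_000)]
low, high = credible_interval(mu_samples, level=0.90)
print(f"90% credible interval for mu: {low:.0f} to {high:.0f} years")
```

The same function could be applied to each parameter in turn, and the resulting intervals reported alongside the point estimates.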

In-line comments

L42: Vere-Jones’ stress release model (cf., Bebbington and Harte, 2001) could also be mentioned—more relevant to this study.

Eqn. 3 is a cumulative distribution function, not a frequency-magnitude distribution.

L141: I couldn’t find in the manuscript where the specific magnitude-area relation used was mentioned. Since this is often a contentious choice, especially for subduction zone earthquakes, the specific relation and its justification should be indicated.

L257: How is distance D determined?

L316: Same variable D used for slip here and distance in L257.

Fig. 3: “occurred scenarios” is awkward. Could just say “scenarios”.

References

Bebbington, M., and D. S. Harte (2001), On the Statistics of the Linked Stress Release Model, , , 176-187.

Geist, E. L., S. L. Bilek, D. Arcas, and V. V. Titov (2006), Differences in tsunami generation between the December 26, 2004 and March 28, 2005 Sumatra earthquakes, , , 185-193.

Jackson, D. D., Y. Y. Kagan, and H. Gupta (2011), Characteristic earthquakes and seismic gaps, , , 1539.

Rong, Y., D. D. Jackson, and Y. Y. Kagan (2003), Seismic gaps and earthquakes, Journal of Geophysical Research: Solid Earth, 108, ESE 6-1 - 6-14.

Citation: https://doi.org/10.5194/nhess-2022-59-RC1
- AC1: 'Reply on RC1', Ario Muhammad, 19 Jul 2022
  
  We would like to thank the reviewer for providing significant comments to improve our work.
  However, because our responses to the reviewer's comments are extensive, including several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 1.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC1
RC2:
'Comment on nhess-2022-59', Anonymous Referee #2, 21 Mar 2022

The paper "Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios" by Muhammad et al develops PTHA for western Sumatra using two alternative magnitude-frequency models; one that is time-dependent (and relatively novel), and another that is time-independent (and more conventional). The paper details the methodologies used to fit each model, and compares their results.

The subject of this paper is of interest for tsunami hazard science and the readership of NHESS. The presentation is mostly clear although could be improved in places (mentioned in details below). But as far as I can tell, there are a number of weaknesses in this paper that will require major revisions to address, before it is suitable for publication. These involve problems with the statistical methods, including: using very different data to fit each model; not treating issues of time-varying completeness in the long-term historical data; use of questionable methods to set Bayesian priors. The paper also neglects key parameter uncertainties controlling the frequencies of earthquakes, that will probably have a large impact on the results (i.e. greater than the current differences between the time-dependent and time-independent models). The authors also need to adjust the introduction to better reflect controversy regarding the performance of time-dependent models (in the literature).

I hope the authors can consider the comments below and either fix their analyses, or (where I am mistaken) edit the manuscript to make their approaches clearer and more obviously defensible. While this will take significant work, I think it is certainly doable, and will make for a useful contribution to NHESS.

# HIGH-LEVEL COMMENTS

- Generally the paper argues that time-dependent modelling is more accurate, but doesn't provide strong justification for this. To my knowledge there are contrasting views on this in the literature, which should be represented in this paper. The language should be softened, and uncertainties better discussed (see detailed comments).

- The tsunami hazard results do not seem to account for uncertainties in the scenario-frequency model parameters (e.g. b, rate of earthquakes, maximum-magnitude, and other parameters). This is true for both the time-dependent and time-independent models, although details of their parameters are different. Variation of the model parameters within the statistical uncertainties will likely have a substantial impact on the results, especially given only 10 events have been used to constrain the time-dependent model (which has many parameters). When these uncertainties are accounted for, I expect they will be larger than the current difference between the time-dependent and time-independent results. Given the 'many synthetic catalogues' approach used in this paper, the uncertainties could be accounted for by randomly drawing different model parameters for each synthetic catalogue.

- The maximum-magnitude is set to Mw9. In reality maximum magnitudes are quite uncertain (justified further below) and yet very impactful for the results. Again I expect they are likely more important than the effect of time-dependence in the current modelling. Many other PTHA studies treat this as an uncertain parameter (details below), and I suggest that issue is also addressed in this paper.

- In so far as I can tell, there are a number of technical weaknesses in the scenario frequency modelling that should be addressed or clarified.

+ The long-term data (10 events over the last 450 years) likely has a time-varying magnitude of completeness; the earliest 8 events all have Mw>=8.3, while only the most recent 2 events (2007+) have Mw<=8. This is not surprising - a-priori we certainly expect that it would be harder to detect smaller earthquakes in the paleo data. But the statistical methods seem to ignore this issue. This could have a large impact on the fit of the time-dependent model.

+ The time-independent model is fit to different data than the time-dependent model, and the time-independent fit is dominated by small earthquakes (mostly having magnitudes well below the Mw 7.65+ that are of interest in this study). Even if there were no differences between the models, the use of such different data would lead to differences in their results. This makes it hard to determine the significance of the time-dependent model structure for the PTHA results. To remedy this, the long-term data should be used to constrain the time-independent model for larger magnitudes. This may require accounting for time-variations in the catalogue completeness (citations below), and placing less weight on low magnitude earthquakes (so they don't dominate the fit at higher magnitudes, which currently doesn't agree especially well with the data). This should help reduce the under-estimation of the earthquake frequencies with Mw>=8.3 (currently about three times less common in the time-independent model, vs the long term data).

+ There appear to be some anomalies in the Bayesian fit of the time-dependent model. The priors for some parameters seem to be set using the same data used for fitting, which should not be done with Bayesian statistics. Also, the figures show differences between the priors and posteriors that suggest a poor specification of the priors (details in comments below).

+ The fit of the time-dependent model seems to ignore the change in completeness magnitude of the long-term data.

# DETAILED COMMENTS

Near L25: 'Over the next decades, major tsunamigenic events are anticipated in .... '. It sounds like "we expect large tsunamis in each of these subduction zones within a few decades". I don't think this is well justified. Do the references really back up the 'major events in the next-few-decades' claim? Historically, time-dependent predictions over these kinds of timescales have not performed well for subduction zones (e.g. Rong et al., 2003).

Near L40: "...assuming a lack of memory between major earthquake occurrences is often viewed as a first approximation .." -- I think there are contrasting views on this in the literature, that should be represented in this part of the paper. For example Rong et al. (2003) are quite critical of assumed quasi-periodic earthquake recurrence (on empirical grounds). On the other hand, there is empirical evidence that large earthquakes tend to be weakly periodic, but without correlation between successive inter-event times (Griffin et al., 2020).

L53-54: "Recent work has also used high-resolution spatial grids .... to produce more accurate tsunami hazard results (e.g. < 90m, ...)". I don't think we should describe "< 90m" as high-resolution for onshore work, that is quite coarse. I might describe resolutions of 10m or less as high-resolution (e.g. Gibbons et al., 2020).

Near L67: "Since time-dependent hazard estimation leads to more realistic short-term results" -- this really needs justification, or removal. To my knowledge this point has not been demonstrated in general, and it may-or-may-not be true. I suppose for aftershock modelling there would be lots of evidence, but this study is using quasi-periodic modelling for large events, and I believe there is less evidence on this matter. In the Paleo record, some sites look more time-dependent than others, e.g. Griffin et al. 2020.

Near L74: "A uniform-slip was used, which may underestimate the hazard..." -- I believe Horspool et al. (2014) used a log-normal distribution to predict the (uncertain) heights at the coast from the uniform-slip scenarios, as a way of accounting for uncertainties due to the slip model and uncertain geometry. In principle this is supposed to compensate for the lack of slip heterogeneity. In practice it could either underestimate, or overestimate, the variability of natural earthquake-tsunamis. If their sigma were sufficiently large, it may even predict greater hazard than your model (I haven't checked whether it does, just clarifying the principle).

L86: Please add a statement about why you use segments to define the rupture extents (I think it is related to the space-time modelling?).

L115: "magnitude-frequency distribution" -- Should this be "probability density function"? I think the MFD would include the factor lambda_i.

L117: "frequency-magnitude distribution" -- I think this should be "Cumulative Distribution Function"? Furthermore I think you need to say that f_i(M) is the derivative of F(M) (and consider whether you need a subscript _i for F).
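
The distinction raised in the two comments above (probability density function versus cumulative distribution function) can be sketched numerically. The block below assumes the standard doubly truncated Gutenberg-Richter form, which may differ in detail from the manuscript's Equations 2 and 3; b = 1.0 is an assumed value, and the magnitude bounds of 7.6 and 9.0 are those mentioned elsewhere in this discussion.

```python
import math


def gr_cdf(m, b, m_min, m_max):
    """CDF F(M) of a doubly truncated Gutenberg-Richter distribution."""
    if m <= m_min:
        return 0.0
    if m >= m_max:
        return 1.0
    norm = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return (1.0 - 10.0 ** (-b * (m - m_min))) / norm


def gr_pdf(m, b, m_min, m_max):
    """PDF f(M) = dF/dM of the same distribution."""
    if m < m_min or m > m_max:
        return 0.0
    norm = 1.0 - 10.0 ** (-b * (m_max - m_min))
    return b * math.log(10.0) * 10.0 ** (-b * (m - m_min)) / norm


def annual_rate_exceeding(m, rate_m_min, b, m_min, m_max):
    """Magnitude-frequency curve: annual rate of events with magnitude >= m,
    given the annual rate of events with magnitude >= m_min."""
    return rate_m_min * (1.0 - gr_cdf(m, b, m_min, m_max))


# Check numerically that the PDF is the derivative of the CDF.
B, M_MIN, M_MAX = 1.0, 7.6, 9.0   # b is an assumed value
m, h = 8.0, 1e-5
numerical = (gr_cdf(m + h, B, M_MIN, M_MAX) - gr_cdf(m - h, B, M_MIN, M_MAX)) / (2 * h)
print(f"f(8.0) = {gr_pdf(m, B, M_MIN, M_MAX):.4f}, dF/dM = {numerical:.4f}")
```

In this form the magnitude-frequency distribution is the rate-scaled complement of F(M), which is why it carries the rate factor that the PDF and CDF do not.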

L125: In Equation 4, the subscript '_i' might be confused with the same subscript used to denote the source in Equation 2. Also, I think Eq 4 should use 'j' for consistency with notation in the paragraph just before Equation 4?

L127: Here I am concerned that you are not using the long-term paleo data to fit the GR model. Why not? The longer term data suggests a high rate of Mw >= 8.3 (8 events in 450 years, rate around 0.018), quite a bit more frequent than suggested by your time-independent model (visually seems ~ 0.006 in Figure 1C, or one-third the frequency -- noting this fit is dominated by low-magnitude earthquakes, below magnitudes of practical interest for this study). As well as taking the opportunity to improve the model accuracy, this would be good because the long-term data is used for fitting the time-dependent model. The use of very different data to fit the two models allows for a substantial 'arbitrary' difference between their results, which is not related to their structure (temporal/non-temporal). I am concerned that this may dominate the differences in your results. I would suggest you fit the time-independent GR model using both the long-term and catalogue data (there are various approaches to treating the varying completeness magnitude, e.g. Weichert, 1980), while removing the instrumental events from the long-term data. Also, you might want to use fewer low-magnitude earthquakes to constrain the fit (to reduce the influence of low-magnitude earthquakes on the fit, and better represent the data at magnitudes that matter for this PTHA).
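
The rate comparison in the comment above can be reproduced with a few lines of arithmetic. The event count, record length and the approximate model rate are the values quoted in the comment; the square-root counting uncertainty is a rough Poisson approximation added here for context, not something stated by the referee.

```python
import math

n_events = 8           # Mw >= 8.3 events in the long-term record (quoted above)
record_years = 450.0   # length of the long-term record (quoted above)
gr_model_rate = 0.006  # approximate annual rate read from Figure 1C (quoted above)

observed_rate = n_events / record_years            # events per year
return_period = 1.0 / observed_rate                # years between events
ratio = observed_rate / gr_model_rate              # observed vs modelled rate
rate_sigma = math.sqrt(n_events) / record_years    # rough 1-sigma counting error

print(f"Observed rate: {observed_rate:.4f} per year (about 1 in {return_period:.0f} years)")
print(f"Observed / modelled rate: {ratio:.1f}")
print(f"Rough 1-sigma uncertainty on observed rate: {rate_sigma:.4f} per year")
```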

L135: Around here, could you please explicitly state that the time-dependent model does not have Mw-frequency curves that follow the GR distribution, over any time-scale. I didn't realise this initially, and it is obviously a very important point for the subsequent analysis. Perhaps a sentence highlighting that instead the Mw-frequency distribution will reflect correlations between rupture on different segments, which is parameterised by the model itself.

L150: It looks like the magnitudes only go up to 9? I think this is neglecting the large uncertainties in Mw-max. Neglect of those uncertainties may have a strong impact on the results. A few relevant points: Berryman et al. (2015) suggested uncertain Mw-max values in this region ranging from 9.0 - 9.6 based on scaling relations and the historical record. Such highly uncertain Mw-max values have been represented in PTHAs (e.g. Davies et al., 2017; Davies and Griffin, 2020). Horspool et al. (2014) allowed Mw-max on Sumatra to vary in 9.3 - 9.7. We know the nearby 2004 event had a magnitude exceeding 9 (around 9.2). From Tohoku we also know that Mw 9.1 can occur in relatively compact regions, smaller than the extents of your study. On this basis I don't think we can exclude the possibility of higher magnitude earthquakes.

L153: "..for each of those 21 rupture scenarios" -- suggest to add "geometrical" before "rupture scenarios", to be consistent with previous sentences. Here there are a few interacting concepts: "geometrical rupture scenarios (seems to be a magnitude plus a set of segments?)", "scenarios", "events" (is this the same as "scenarios"). I suggest you pick one term for each concept, and then use it consistently throughout the paper.

L 164: "(one height for one simulation catalog)" -- does the height vary with space, or are we looking at the 'maximum height anywhere in the model'?

L 172: "The results confirm that N_{sim} = 100,000 catalogues are sufficient to produce a stable result" -- stable in terms of what? The mean over all catalogues? Please make this clear, as I suppose individual catalogues must vary greatly.

L175-179: This section is confusing me. Above I understood that you used N_{sim} = 100,000 to get a stable result. But now it is suggested that many more catalogues were required for 1-50 years. Please edit to make this clearer. [NOTE: Some sentences from the 'Results' section may help in this regard, mentioned below.].

L197-199: "This number is consistent with the GR model". In my judgement they are "not very consistent", with the model under-predicting the frequency of large events (as discussed above, the GR model has a substantially lower frequency of Mw>=8.3). Note the 450 year record contains 10 events (Mw 7.8-8.9), but the first 8 events have Mw>=8.3, and the last two events are from the recent instrumental period. This suggests changes in the magnitude of completeness of the 450 year catalogue over time. A priori we expect this would happen because Paleo records find it more difficult to detect small events. This issue should be accounted for when comparing the GR model with the long term data (and above I suggest that the long-term data should also be used to fit the time-independent GR model -- doing that will probably lead to significant increases in the modelled frequency of large earthquakes).

L205: "see Figure and Figure 5" -- missing Figure number.

L217: Suggest you use a word other than "scenarios" to denote the 21 "magnitude + set-of-segments" combinations.

L250-ish: Above I argued that the long-term data (10 events, 450 years) is likely subject to a varying completeness magnitude, noting the only two events with Mw<8 events are recent instrumental events, and all others have Mw>=8.3. From what I can see, this 'changing completeness magnitude' is not accounted for in the statistical fit of the time-varying model (Section 2.2.2). I expect this would have a large effect on time-varying model fit - for example, overestimating the conditional probability of multi-segment rupture (which also affects the frequency of high Mw events), and affecting the BPT model parameters.

L250: "The prior median of mu for each segment is different, namely ....... These values represent the median interarrival time of earthquake rupture on each segment over the last 450 years". It sounds like you are using the same data both to specify the priors, and then to fit the model (?). In Bayesian statistics, the priors should be specified in a way that doesn't use the fitting data, or at least doesn't use it in important ways. Another potential problem with the methodology is suggested by Figure 9, where we see the prior and posterior for 'mu' are very different on some segments -- the posterior is more diffuse and often has a very different average (e.g. Panels A, I, K). This suggests the priors have been overly constrained in the analysis. Typically priors would be set either using data different from that used for fitting, or given weakly informative values.

L310: This source zone has some history of "tsunami-earthquakes", with waves much larger than might be expected from the magnitude (e.g. Mentawai 2010). Can the current model produce similar large waves for scenarios with magnitude below 8, using the rigidity of 40 GPa? I would be surprised if it can, although that will also depend on how concentrated the slip is allowed to be. Please add a comment on the capacity for the model to make 'tsunami-earthquake' type scenarios.
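
The rigidity question can be made concrete with the standard relation between seismic moment, rigidity, rupture area and mean slip. The sketch below is a rough illustration, not the manuscript's source model: the rupture area and the 10 GPa low-rigidity value are hypothetical, and the moment-magnitude conversion assumes the common form log10(M0) = 1.5 Mw + 9.1 with M0 in N m.

```python
def seismic_moment(mw):
    """Seismic moment in N*m from moment magnitude,
    assuming log10(M0) = 1.5 * Mw + 9.1."""
    return 10.0 ** (1.5 * mw + 9.1)


def mean_slip(mw, area_m2, rigidity_pa):
    """Mean slip in metres from M0 = rigidity * area * slip."""
    return seismic_moment(mw) / (rigidity_pa * area_m2)


MW = 7.8
AREA_M2 = 100e3 * 50e3   # hypothetical 100 km x 50 km rupture
RIGIDITY_HIGH = 40e9     # 40 GPa, the constant value discussed above
RIGIDITY_LOW = 10e9      # hypothetical shallow, near-trench value

slip_high = mean_slip(MW, AREA_M2, RIGIDITY_HIGH)
slip_low = mean_slip(MW, AREA_M2, RIGIDITY_LOW)
print(f"Mean slip at 40 GPa: {slip_high:.1f} m")
print(f"Mean slip at 10 GPa: {slip_low:.1f} m")
```

For a fixed magnitude and rupture area, mean slip scales inversely with rigidity, so the choice of a constant, relatively high rigidity directly limits the slip available to near-trench scenarios.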

L317 "... 300 stochastic models are sufficient to simulate stable and consistent tsunami heights and depths" -- I think this must depend on the model region, and what you are interested in. For instance it would not give an accurate representation of the 99.5th percentile. Also for a model where only a very small part of the source-zone could affect the site of interest, one might need to generate many scenarios to get enough relevant scenarios. In summary, I don't think you can refer to stability tests from another study to provide justification for using 300 models in this study. Instead, can you report on a test that is specific to this case?

L347: "... the final parameter estimates are taken from the maximum a posterior". It would be better to account for the model uncertainty (also in Mw-max, b, etc), which should be substantial given the limited data available to fit the model, and will probably have a substantial impact on the hazard. One way to do this would be to draw a different parameter set for each of the large number of synthetic catalogues that are simulated.
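
The suggestion above, drawing a different parameter set for each synthetic catalogue, could look roughly like the following. This is a minimal single-segment sketch and not the authors' multi-segment model: the list of posterior draws is hypothetical, each catalogue is assumed to start immediately after a rupture, and the BPT sampler uses the Michael-Schucany-Haas transformation for the inverse Gaussian distribution.

```python
import random


def sample_bpt_interval(mu, alpha, rng):
    """Draw one BPT (inverse Gaussian) interarrival time with mean mu and
    aperiodicity alpha."""
    lam = mu / alpha**2                      # inverse Gaussian shape parameter
    y = rng.gauss(0.0, 1.0) ** 2
    root = (4.0 * mu * lam * y + (mu * y) ** 2) ** 0.5
    x = mu + (mu**2 * y) / (2.0 * lam) - (mu / (2.0 * lam)) * root
    if rng.random() <= mu / (mu + x):
        return x
    return mu**2 / x


def count_events(mu, alpha, duration, rng):
    """Count ruptures in (0, duration] for a renewal process that starts
    immediately after an event at t = 0 (a simplifying assumption)."""
    t, n = 0.0, 0
    while True:
        t += sample_bpt_interval(mu, alpha, rng)
        if t > duration:
            return n
        n += 1


def simulate_catalogues(posterior_draws, n_catalogues, duration, seed=0):
    """Simulate catalogues, drawing one (mu, alpha) pair per catalogue so that
    parameter uncertainty propagates into the event counts."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_catalogues):
        mu, alpha = rng.choice(posterior_draws)
        counts.append(count_events(mu, alpha, duration, rng))
    return counts


# Hypothetical posterior draws of (mu in years, alpha); not real results.
draws = [(180.0, 0.6), (200.0, 0.9), (220.0, 0.7), (250.0, 1.0)]
counts = simulate_catalogues(draws, n_catalogues=1000, duration=450.0, seed=1)
print(f"Mean number of ruptures in 450 years: {sum(counts) / len(counts):.2f}")
```

Using only the maximum a posteriori estimate corresponds to passing a single (mu, alpha) pair, so the spread in event counts between the two cases indicates how much the parameter uncertainty contributes.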

Section 3.1: As discussed earlier, please comment on why the 'mu-priors' for some segments are so different to the posteriors (little overlap for Fig 9 panels A, I, K). This is surprising, especially considering that the priors were apparently constructed using the same data used for fitting. To me it suggests weaknesses in how the priors were constructed, or some other problem.

L356-360: This is a very clear description of how the catalogue duration was defined. I suggest you move this to the earlier methods section (where I expressed confusion about the method).

L360 and Figure 10: Regarding the validation of the annual seismic moment release: Considering that the data was used to fit the model, I don't think the observations/model are particularly consistent on segments 3 and 4. In both cases the observed data exceeds the 90th percentile of the model. Again this seems to suggest some under-estimation in the model, as discussed repeatedly above. Please check that this is all correct following revisions, and if it is, add a comment explaining why this is nonetheless reasonably consistent.

L379: Figure 11C is not a strong basis for making a point about which segments rupture more or less, because it is only 1 catalogue. Can you please make a figure that better justifies the points made in this paragraph?

Line 447 and Figure 12: The conditional probability of Mw9.0 (if an earthquake occurs) is larger in the time independent case. But I doubt that these results will be robust to parameter uncertainties in the time-dependent model, considering that limited data (10 events, that likely has time-varying completeness) was available to fit its many parameters. This further suggests the importance of considering model parameter uncertainties in the PTHA.

Line 451: One factor neglected in this discussion is the effect of using different datasets to fit the 2 models, which could cause differences in the results even if there was no other difference between the two kinds of models. I think the calculations in this paper should be revised so that the time-independent model is informed by the long-term data, and that parameter uncertainties in both the time-independent and time-dependent models are accounted for. In my judgment it is likely that the parameter uncertainties could lead to differences in the results that are substantially larger than differences between the current time-dependent and time-independent models.

# TECHNICAL CORRECTIONS

None for now.

# REFERENCES

Berryman, K.; Wallace, L.; Hayes, G.; Bird, P.; Wang, K.; Basili, R.; Lay, T.; Pagani, M.; Stein, R.; Sagiya, T.; Rubin, C.; Barrientos, S.; Kreemer, C.; Litchfield, N.; Stirling, M.; Gledhill, K.; Haller, K. & Costa, C. The GEM Faulted Earth Subduction Interface Characterisation Project: Version 2.0 – April 2015 GEM, GEM, 2015

Davies, G.; Griffin, J.; Løvholt, F.; Glimsdal, S.; Harbitz, C.; Thio, H. K.; Lorito, S.; Basili, R.; Selva, J.; Geist, E. & Baptista, M. A. A global probabilistic tsunami hazard assessment from earthquake sources Geological Society, London, Special Publications, Geological Society of London, 2017

Davies, G. & Griffin, J. Sensitivity of Probabilistic Tsunami Hazard Assessment to Far-Field Earthquake Slip Complexity and Rigidity Depth-Dependence: Case Study of Australia Pure and Applied Geophysics, 2020, 177, 1521–1548

Gibbons, S. J.; Lorito, S.; Macías, J.; Løvholt, F.; Selva, J.; Volpe, M.; Sánchez-Linares, C.; Babeyko, A.; Brizuela, B.; Cirella, A.; Castro, M. J.; de la Asunción, M.; Lanucara, P.; Glimsdal, S.; Lorenzino, M. C.; Nazaria, M.; Pizzimenti, L.; Romano, F.; Scala, A.; Tonini, R.; Manuel González Vida, J. & Vöge, M. Probabilistic Tsunami Hazard Analysis: High Performance Computing for Massive Scale Inundation Simulations. Frontiers in Earth Science, 2020, 8, 623

Griffin, J. D.; Stirling, M. W. & Wang, T. Periodicity and Clustering in the Long-Term Earthquake Record Geophysical Research Letters, American Geophysical Union (AGU), 2020, 47

Horspool, N.; Pranantyo, I.; Griffin, J.; Latief, H.; Natawidjaja, D. H.; Kongko, W.; Cipta, A.; Bustamam, B.; Anugrah, S. D. & Thio, H. K. A probabilistic tsunami hazard assessment for Indonesia Natural Hazards and Earth System Sciences, 2014, 14, 3105-3122

Rong, Y.; Jackson, D. D. & Kagan, Y. Y. Seismic gaps and earthquakes Journal of Geophysical Research: Solid Earth, Wiley-Blackwell, 2003, 108

Weichert, D. H. Estimation of the earthquake recurrence parameters for unequal observation periods for different magnitudes Bulletin of the Seismological Society of America, 1980, 70, 1337-1346

Citation: https://doi.org/10.5194/nhess-2022-59-RC2
- AC2: 'Reply on RC2', Ario Muhammad, 19 Jul 2022
  
  We thank the reviewer for the substantial comments, which help us improve our work.
  Because our responses are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from Reviewer 2.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC2
RC3: 'Comment on nhess-2022-59', Anonymous Referee #3, 23 Mar 2022
Review of Muhammad et al

This paper presents a methodology for time-dependent probabilistic tsunami hazard analysis with stochastic earthquake rupture modelling, using the Mentawai region of the Sunda Subduction Zone as a case study. This is a novel and ambitious approach, and it is exciting to see the efforts made by the authors. In my view, the complexity of the model does however pose some challenges, and I think there are a number of points that require further justification and/or consideration of the choices made in the model. I expect this will require some effort to revise the model.

Major comments

Justification of the choice of a time-dependent approach. A number of recent studies of global paleoearthquake records (Williams et al 2019; Griffin et al 2020; Moernaut 2020) have, to varying degrees, provided empirical support for weakly quasiperiodic earthquake recurrence as a general model, which can be used to justify the use of renewal models for hazard assessment. That said, the Mentawai record of Philibosian et al (2017) looks to be more random than quasiperiodic in the analysis presented by Griffin et al (2020), although perhaps a different result might be obtained using the segmentation model presented here. The posterior BPT parameter estimates given for each segment are also relevant – some give values of alpha ~1 (segments 2, 3 and 4), implying random recurrence (i.e. Poisson), while others are ~0.6 (segments 1, 5 and 6), implying moderately quasiperiodic recurrence. So, I think some comment needs to be made here that:

- At a global scale there is empirical support for weakly quasiperiodic earthquake recurrence as a general model (see Griffin et al 2020);
- Excluding the hypothesis at the individual fault level is difficult, particularly for short records (Williams et al 2019; Griffin et al 2020);
- The data from Philibosian et al (2017) is somewhat equivocal about whether earthquake recurrence here is truly time-dependent, and the Poisson hypothesis cannot be confidently excluded using these data. But the global studies mentioned above suggest it is not unreasonable to assume time-dependence as a hypothesis.

The discussion section of the paper could then discuss the implications of this assumption in light of the different values of alpha obtained for each segment.
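
To make the role of alpha concrete, the following minimal sketch compares the conditional rupture probability under a BPT renewal model with the Poisson probability for the same mean recurrence interval. The mean interval, forecast window and elapsed times are hypothetical values chosen for illustration, not parameters from the paper.

```python
import math

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bpt_cdf(t, mu, alpha):
    """CDF of the Brownian passage-time (inverse Gaussian) distribution
    with mean mu and aperiodicity (coefficient of variation) alpha."""
    if t <= 0.0:
        return 0.0
    u1 = (math.sqrt(t / mu) - math.sqrt(mu / t)) / alpha
    u2 = (math.sqrt(t / mu) + math.sqrt(mu / t)) / alpha
    # exp(2 / alpha**2) grows quickly as alpha shrinks; fine for alpha >= ~0.3.
    return std_normal_cdf(u1) + math.exp(2.0 / alpha**2) * std_normal_cdf(-u2)

def bpt_conditional_prob(elapsed, window, mu, alpha):
    """P(rupture within `window` | no rupture during `elapsed`)."""
    f0 = bpt_cdf(elapsed, mu, alpha)
    f1 = bpt_cdf(elapsed + window, mu, alpha)
    return (f1 - f0) / (1.0 - f0)

def poisson_prob(window, mu):
    """Time-independent probability of at least one event in `window`."""
    return 1.0 - math.exp(-window / mu)

MU = 200.0     # hypothetical mean recurrence interval (years)
WINDOW = 30.0  # hypothetical forecast window (years)
for alpha in (0.6, 1.0):
    for elapsed in (40.0, 200.0):
        p = bpt_conditional_prob(elapsed, WINDOW, MU, alpha)
        print(f"alpha={alpha}, elapsed={elapsed:.0f} yr: "
              f"BPT={p:.3f}, Poisson={poisson_prob(WINDOW, MU):.3f}")
```

In this sketch, alpha = 0.6 gives a probability well below the Poisson value early in the cycle and above it once the elapsed time reaches the mean interval, whereas alpha = 1 gives a probability that depends far less on elapsed time. Note that a BPT model with alpha = 1 shares the coefficient of variation of a Poisson process, but its hazard rate is not exactly constant.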

In estimating parameters for the BPT distribution, the authors use the data to estimate the prior distribution of mu, before then using the same data to calculate the posterior probability distribution of mu. This is incorrect. I would suggest using an uninformative prior (e.g. as used by Fitzenz et al 2010). An alternative approach could be to use an informative prior for mu based on the slip rate (e.g. as determined from geodesy), but this may become complex (e.g. due to having to estimate coupling of the fault). The 450-year-long record is too short to estimate model parameters accurately. This is, of course, what a Bayesian approach should be helping with, but it needs more care in the choice of priors.
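
The effect of using the data twice can be sketched on a grid. The example below computes the posterior of mu under a BPT likelihood for a flat prior and for a Gaussian prior whose mean and standard deviation are taken from the same intervals. The inter-event times, the fixed alpha, the grid range and the Gaussian form of the data-derived prior are all assumptions for illustration; the flat prior is a simple stand-in for an uninformative prior, not the specific prior of Fitzenz et al (2010).

```python
import math

def bpt_log_pdf(t, mu, alpha):
    """Log-density of the Brownian passage-time distribution."""
    return (0.5 * math.log(mu / (2.0 * math.pi * alpha**2 * t**3))
            - (t - mu) ** 2 / (2.0 * mu * alpha**2 * t))

def grid_posterior(intervals, alpha, grid, log_prior):
    """Normalised posterior weights for mu on a regular grid."""
    log_post = [log_prior(mu) + sum(bpt_log_pdf(t, mu, alpha) for t in intervals)
                for mu in grid]
    peak = max(log_post)  # subtract the peak to avoid underflow
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def mean_and_sd(grid, probs):
    mean = sum(m * p for m, p in zip(grid, probs))
    var = sum((m - mean) ** 2 * p for m, p in zip(grid, probs))
    return mean, math.sqrt(var)

intervals = [90.0, 210.0, 150.0]       # hypothetical inter-event times (years)
ALPHA = 0.7                            # aperiodicity held fixed for simplicity
grid = [20.0 + i for i in range(981)]  # mu from 20 to 1000 years

sample_mean = sum(intervals) / len(intervals)
sample_sd = math.sqrt(sum((t - sample_mean) ** 2 for t in intervals)
                      / (len(intervals) - 1))

# Flat prior on the grid versus a prior built from the same data.
flat = grid_posterior(intervals, ALPHA, grid, lambda mu: 0.0)
reused = grid_posterior(intervals, ALPHA, grid,
                        lambda mu: -0.5 * ((mu - sample_mean) / sample_sd) ** 2)

print("flat prior:         mean=%.0f sd=%.0f" % mean_and_sd(grid, flat))
print("data-derived prior: mean=%.0f sd=%.0f" % mean_and_sd(grid, reused))
```

The data-derived prior yields a narrower posterior than the flat prior even though no new information has been added, which is the overconfidence that counting the same observations twice would be expected to produce.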

I am also concerned that fitting the model parameters to each segment individually is problematic. Later you consider multi-segment ruptures, and it is not clear how all this fits together. Do the recurrence statistics obtained from the sum of all synthetic ruptures across all segments match the recurrence statistics from the sum of all historic/paleo ruptures in your data? Checking this could be a good test for your model.
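
One way to run such a check is sketched below: per-segment event years are pooled into a single catalogue, a multi-segment rupture is counted once, and the mean and coefficient of variation of the pooled inter-event times are compared between observed and synthetic catalogues. The segment labels and event years are hypothetical placeholders, and in practice the comparison would be repeated over many synthetic catalogues.

```python
import math

def interevent_stats(event_years):
    """Mean and coefficient of variation of inter-event times."""
    years = sorted(event_years)
    gaps = [b - a for a, b in zip(years, years[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events")
    mean = sum(gaps) / len(gaps)
    var = sum((g - mean) ** 2 for g in gaps) / (len(gaps) - 1)
    return mean, math.sqrt(var) / mean

def pooled_catalogue(segment_events):
    """Merge per-segment event years, counting a multi-segment rupture once."""
    return sorted({year for events in segment_events.values() for year in events})

# Hypothetical event years (relative to the start of the record);
# a year shared by several segments marks a multi-segment rupture.
observed = {"A": [10, 210, 400], "B": [120, 210], "C": [300, 400]}
synthetic = {"A": [30, 190, 410], "B": [100, 190], "C": [320, 410]}

obs_mean, obs_cov = interevent_stats(pooled_catalogue(observed))
syn_mean, syn_cov = interevent_stats(pooled_catalogue(synthetic))
print(f"observed:  mean={obs_mean:.1f} yr, CoV={obs_cov:.2f}")
print(f"synthetic: mean={syn_mean:.1f} yr, CoV={syn_cov:.2f}")
```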

Also related to parameter estimation, some of the posterior histograms seem a bit spiky; does this improve if the number of samples is increased beyond 10,000?
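
As a back-of-the-envelope guide to how much smoothing more samples would buy, the sketch below gives the relative standard deviation of a histogram bin count, assuming independent samples and a bin that holds a fraction p of the probability mass. The bin fraction of 0.02 is a hypothetical choice.

```python
import math

def bin_relative_error(n_samples, bin_probability):
    """Relative standard deviation of a histogram bin count,
    assuming independent samples (binomial counting noise)."""
    p = bin_probability
    return math.sqrt((1.0 - p) / (n_samples * p))

for n in (10_000, 100_000, 1_000_000):
    print(n, round(bin_relative_error(n, 0.02), 3))
```

Under these assumptions a bin holding 2% of the mass fluctuates by about 7% at 10,000 samples, and the noise shrinks only with the square root of the sample count. If the samples are correlated, as in a Markov chain sampler, the effective sample size is smaller and the histogram is noisier than this estimate suggests.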

Spatio-temporal completeness of the paleo record compared with the instrumental record is an issue that I think could lead to biases in the parameter estimates. It is very unlikely that events similar to the Mw 7.8 2010 Mentawai event would be visible in the coral record; this event occurred near the trench and caused <4 cm subsidence on the Mentawai Islands as measured with GPS (Hill et al 2012). Related to the above, the Mmin of 7.6 (L129), while reasonable from a tsunami hazard assessment perspective, would mean that you are modelling events that are unlikely to be present in the paleoearthquake record. I am unsure of how the frequency of these events could be determined in the time-dependent approach. It therefore seems likely that, in your current approach, smaller events are missed in the paleoearthquake record, which would affect the recurrence model parameters.

The 1D rupture segmentation is a problem for tsunami hazard assessment, as the resulting tsunami size depends so significantly on the depth of rupture. Compare the 2007 Bengkulu earthquakes (Mw 8.4 and 7.9), which were down-dip of the trench and did not generate a significant tsunami, with the 2010 Mentawai earthquake (Mw 7.8), which occurred near the trench and did generate a significant tsunami. It is not clear whether such events are discriminated by the stochastic modelling approach with 1D segmentation – it seems they probably aren’t, but I may not be understanding correctly. A related problem is low rigidity near the trench and its tsunamigenic potential, as in the 2010 Mentawai tsunami. How might the assumption of constant (and relatively high) rigidity (L309-310) bias your tsunami hazard results?
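
The sensitivity to rigidity follows from the definition of seismic moment, M0 = rigidity × area × slip: for a fixed magnitude and rupture area, the average slip scales inversely with rigidity. The sketch below illustrates this with a hypothetical rupture area and two hypothetical rigidity values; neither is taken from the paper.

```python
def seismic_moment(mw):
    """Seismic moment in N*m from moment magnitude, Mw = (2/3)(log10 M0 - 9.1)."""
    return 10.0 ** (1.5 * mw + 9.1)

def mean_slip(mw, area_km2, rigidity_gpa):
    """Average slip (m) from M0 = rigidity * area * slip."""
    area_m2 = area_km2 * 1.0e6
    rigidity_pa = rigidity_gpa * 1.0e9
    return seismic_moment(mw) / (rigidity_pa * area_m2)

AREA_KM2 = 120.0 * 50.0          # hypothetical rupture area (length x width)
for rigidity in (40.0, 10.0):    # hypothetical deep vs near-trench rigidity (GPa)
    slip = mean_slip(7.8, AREA_KM2, rigidity)
    print(f"rigidity={rigidity:.0f} GPa -> mean slip={slip:.1f} m")
```

Reducing the rigidity by a factor of four raises the average slip by the same factor for an identical magnitude, so a constant, relatively high rigidity would tend to understate slip, and hence seafloor displacement, for shallow near-trench ruptures.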

The maximum magnitude of 9.0 seems too low, which seems related to the segmentation model. If the potential for ruptures connecting with other segments of the Sunda Subduction Zone is considered, then larger Mmax values are justified. Significantly larger Mmax’s were used in Horspool et al (2014). Even if the paleoearthquake record for the past 450 years suggests events haven’t exceeded Mw 9.0, we also don’t expect these magnitude events to occur all that often. So allow for the possibility that they are missing from the record.

Some areas of the coast of Padang show zero probability of inundation (Figure 17), while in others the potential inundation extent extends quite a way inland. This raises some significant concerns for me about the quality of the inundation modelling and/or the elevation data used, given how low-lying the coast is in this area. If only SRTM data was used, this could significantly underestimate inundation extent (see Griffin et al 2015, Figure 8). Are buildings included in the elevation model?

Detailed comments

L15: Suggest change ‘A total of >’ to ‘More than’

L18: Forecast periods begin in what year?

L136: Choice of BPT is fine, but hasn’t really been justified here. Why is this chosen over lognormal, Weibull or Gamma? Some of your justification seems to be presented later in Section 2.2.

L174: Several thousand years

L185: Perhaps rephrase as ‘reflects the expectations of elastic rebound theory’, or similar.

L192: Should probably cite others who’ve used Bayesian approaches to fitting time-dependent models to earthquake records, in particular Rhoades et al (1994) and Fitzenz et al (2010).

L197 and Table 1: These should not be referred to as tsunamigenic. For half of them we have no information on whether a tsunami was generated; coseismic deformation on the Mentawai Islands observed in coral paleogeodetic records suggests they probably were, but we don’t actually know.

L324: Please give a link or citation for DEM5 and Bathy5.

L332: Might be a typo here – Griffin et al (2016) used a Manning’s roughness of 0.036 as a conservative minimum for land (grassland; for the Mentawai Islands). For the urban context here, 0.06 may be reasonable, e.g. Griffin et al (2015) suggested a Manning’s roughness of 0.08 for the city of Padang. See also Kaiser et al (2011) for a discussion of choice of Mannings n.

L501-502: The time-independent model has too low an Mmax (9.0) to be considered worst-case. See earlier comments about choice of Mmax.

Table 1: Change Shieh to Sieh.

Figure 3, and also in the text. I do not think the term ‘occurred’ scenarios is the best terminology. These are modelled scenarios that have not actually occurred.
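
Regarding the comment on L332 above, the choice of Manning's roughness matters because bottom friction in shallow-water models scales with n squared. The sketch below evaluates the Manning friction slope for the three roughness values mentioned in that comment; the flow speed and depth are hypothetical.

```python
def friction_slope(n, speed, depth):
    """Manning friction slope S_f = n^2 u^2 / h^(4/3) for wide, shallow flow (SI units)."""
    return n**2 * speed**2 / depth ** (4.0 / 3.0)

SPEED = 3.0  # hypothetical onshore flow speed (m/s)
DEPTH = 1.5  # hypothetical flow depth (m)
for n in (0.036, 0.06, 0.08):
    print(f"n={n}: S_f={friction_slope(n, SPEED, DEPTH):.4f}")
```

Moving from n = 0.036 to n = 0.08 increases the friction slope by a factor of (0.08/0.036)^2, close to five, for the same flow, which can be expected to reduce modelled inundation extent noticeably.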

References:

Fitzenz, D. D., Ferry, M. A., & Jalobeanu, A. (2010). Long-term slip history discriminates among occurrence models for seismic hazard assessment. , (20), 1–5. https://doi.org/10.1029/2010GL044071

Griffin, J., Latief, H., Kongko, W., Harig, S., Horspool, N., Hanung, R., Rojali, A., Maher, N., Fuchs, A., Hossen, J., & others. (2015). An evaluation of onshore digital elevation models for modeling tsunami inundation zones. , (32).

Griffin, J. D., Stirling, M. W., & Wang, T. (2020). Periodicity and Clustering in the Long-Term Earthquake Record. Geophysical Research Letters, 47(22). https://doi.org/10.1029/2020GL089272

Horspool, N., Pranantyo, I., Griffin, J., Latief, H., Natawidjaja, D. H., Kongko, W., Cipta, A., Bustaman, B., Anugrah, S. D., & Thio, H. K. (2014). A probabilistic tsunami hazard assessment for Indonesia. Natural Hazards and Earth System Sciences, 14(11), 3105–3122.

Hill, E. M., Borrero, J. C., Huang, Z., Qiu, Q., Banerjee, P., Natawidjaja, D. H., Elosegui, P., Fritz, H. M., Suwargadi, B. W., Pranantyo, I. R., Li, L. L., Macpherson, K. A., Skanavis, V., Synolakis, C. E., & Sieh, K. (2012). The 2010 Mw 7.8 Mentawai earthquake: Very shallow source of a rare tsunami earthquake determined from tsunami field survey and near-field GPS data. , (6), 1–21. https://doi.org/10.1029/2012JB009159

Kaiser, G., Scheele, L., Kortenhaus, A., Løvholt, F., Römer, H. and Leschka, S., 2011. The influence of land cover roughness on the results of high resolution tsunami inundation modeling. , (9), pp.2521-2540.

Moernaut, J. (2020, November 1). Time-dependent recurrence of strong earthquake shaking near plate boundaries: A lake sediment perspective. . Elsevier B.V. https://doi.org/10.1016/j.earscirev.2020.103344

Rhoades, D. A., & Van Dissen, R. J. (2003). Estimates of the time-varying hazard of rupture of the Alpine Fault, New Zealand, allowing for uncertainties. , (4), 479–488. https://doi.org/10.1080/00288306.2003.9515023

Williams, R. T., Davis, J. R., & Goodwin, L. B. (2019). Do Large Earthquakes Occur at Regular Intervals Through Time? A Perspective From the Geologic Record. , (14), 8074–8081. https://doi.org/10.1029/2019GL083291
Citation: https://doi.org/10.5194/nhess-2022-59-RC3
- AC3: 'Reply on RC3', Ario Muhammad, 19 Jul 2022
  
  We thank the reviewer for the substantial comments, which help us improve our work.
  Because our responses are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from Reviewer 3.
  
  Citation: https://doi.org/10.5194/nhess-2022-59-AC3

This preprint has been withdrawn.

Short summary

This study develops a novel framework of time-dependent (TD) probabilistic tsunami hazard analysis (PTHA) combining a total of ≥ 100,000 spatiotemporal earthquake (EQ) rupture models and 6,300 probabilistic tsunami simulations to evaluate the tsunami hazards and compare them with the time-independent (TI) PTHA results. The proposed model can capture the uncertainty of future TD tsunami hazards and produces slightly higher hazard estimates than the TI model for short-term periods (< 30 years).

