the Creative Commons Attribution 4.0 License.
Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios
Abstract. We develop a novel framework of time-dependent probabilistic tsunami hazard analysis (PTHA) and apply it to Western Sumatra, Indonesia, where future tsunamigenic events are anticipated in the Mentawai region of the Sunda subduction zone. An earthquake rupture model taking into account the spatiotemporal interaction of major megathrust segments is used to simulate future tsunamigenic earthquakes. The earthquake rupture process of the segments is characterized by a multivariate Bernoulli model with inter-arrival times following a Brownian passage-time distribution and the dependency between segments specified by a spatial correlation function. We calibrate this model with historical ruptures of the Mentawai thrust in the last 450 years. A total of ≥ 100,000 time-dependent earthquake rupture cases are then coupled with a stochastic tsunami simulation method to evaluate tsunami hazards. We generate a total of 6,300 stochastic tsunami source models from six magnitude scenarios between M 7.75 and M 9.0 and obtain time-dependent PTHA results for seven different periods (1, 5, 10, 20, 30, 50 and 450 years). We further compare the time-dependent PTHA results with a time-independent PTHA approach to investigate the influence of the spatiotemporal earthquake rupture model. The space-time interaction model successfully generates annual seismic moment rates consistent with the observations. Moreover, the model can capture the uncertainty of future time-dependent tsunami hazards. On the other hand, the time-independent approach produces slightly higher hazard estimates than the time-dependent model for long-term hazard assessments (> 450 years).
This preprint has been withdrawn.


Interactive discussion
Status: closed

RC1: 'Comment on nhess-2022-59', Anonymous Referee #1, 18 Mar 2022
The manuscript “Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios”
by Muhammad et al. presents an important application of time-dependent probabilistic tsunami hazard analysis (PTHA) to the central Sunda subduction zone. The method involves several novel components, such as stochastic tsunami simulation and space-time interactions among earthquakes, developed in previous publications but integrated in this applied study. The time-dependent component may be particularly important for regions that have recently had a large-magnitude earthquake (see comment 1, however) and for short design exposure times. Several minor comments are indicated below, primarily related to unstated assumptions and parameter uncertainty. Upon revision, this paper should be a valuable contribution to Natural Hazards and Earth System Sciences.
General Comments
(1) The study is based on the idea that a BPT or other time-dependent rupture model more accurately represents earthquake behavior along the Sunda subduction zone. Given numerous papers refuting the seismic gap hypothesis for subduction zones in general (e.g., Rong et al., 2003, who cite Matthews, 2002), it seems that a logical first step for any study region is to falsify a Poisson null hypothesis.
(2) Although the definition of fault segments is based on 450 years of earthquake occurrence, that record still might not be sufficient to determine whether these segment boundaries are persistent (cf. Jackson et al., 2011).
(3) The earthquake occurrence model is based on a 1D (along strike) representation of the subduction zone. For the Sunda subduction zone, as with other subduction zones with a broad shelf, however, tsunami generation is critically dependent on the dip extent of rupture as was notably observed in comparisons of the 2004 and 2005 earthquakes (e.g., Geist et al., 2006). The limitation of the 1D approach should be mentioned.
(4) It seems that it would be straightforward to estimate uncertainties in mu, alpha, and gamma from the posterior distributions (confidence intervals). These uncertainties could then be used as part of the probabilistic calculations.
(5) My impression is that the maximum magnitude earthquake considered is from the 450-year record and essentially is an event that spans segments 1–6. Even though the tsunami from an Mmax event would have a low probability, such an event may pose a more significant component of the aggregate hazard for longer exposure times than considered in this study. It should be clarified how Mmax is determined and whether a penultimate event could extend beyond the study region.
(6) Tsunami heights seem to “saturate” at nearly 10 m (Figure 13). Is this dependent on the largest magnitude earthquake or is this caused by a hydrodynamic effect?
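One possible way to carry out the check suggested in comment (1) is sketched below: a Monte Carlo test of the coefficient of variation (CV) of inter-event times against an exponential (Poisson) null. This is only an illustration under stated assumptions. The function names and the inter-event times are hypothetical and are not taken from the manuscript or from the Mentawai record.

```python
import random
import statistics


def coefficient_of_variation(intervals):
    """Sample standard deviation divided by the mean of the inter-event times."""
    return statistics.stdev(intervals) / statistics.mean(intervals)


def poisson_null_p_value(intervals, n_sim=5000, seed=0):
    """One-sided Monte Carlo p-value for 'more regular than Poisson'.

    Under a Poisson process the inter-event times are exponential, and the
    CV of an exponential sample does not depend on the rate, so a unit rate
    is used for the simulated samples.
    """
    rng = random.Random(seed)
    n = len(intervals)
    observed_cv = coefficient_of_variation(intervals)
    as_regular = 0
    for _ in range(n_sim):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        if coefficient_of_variation(sample) <= observed_cv:
            as_regular += 1
    return observed_cv, as_regular / n_sim


# Hypothetical inter-event times in years (illustrative only).
hypothetical_intervals = [180.0, 210.0, 150.0, 230.0, 190.0]
cv, p_value = poisson_null_p_value(hypothetical_intervals)
print(f"CV = {cv:.3f}, p-value = {p_value:.4f}")
```

With only a handful of intervals per segment such a test has little power, so failing to reject the Poisson null would not by itself demonstrate time-independence.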
Inline comments
L42: Vere-Jones’ stress release model (cf. Bebbington and Harte, 2001) could also be mentioned, as it is more relevant to this study.
Eqn. 3 is a cumulative distribution function, not a frequency-magnitude distribution.
L141: I couldn’t find in the manuscript where the specific magnitude-area relation used was mentioned. Since this is often a contentious choice, especially for subduction zone earthquakes, the specific relation and its justification should be indicated.
L257: How is distance D determined?
L316: Same variable D used for slip here and distance in L257.
Fig. 3: “occurred scenarios” is awkward. Could just say “scenarios”.
References
Bebbington, M., and D. S. Harte (2001), On the Statistics of the Linked Stress Release Model, Journal of Applied Probability, 38, 176–187.
Geist, E. L., S. L. Bilek, D. Arcas, and V. V. Titov (2006), Differences in tsunami generation between the December 26, 2004 and March 28, 2005 Sumatra earthquakes, Earth Planets Space, 58, 185–193.
Jackson, D. D., Y. Y. Kagan, and H. Gupta (2011), Characteristic earthquakes and seismic gaps, Encyclopedia of solid earth geophysics, 5, 1539.
Rong, Y., D. D. Jackson, and Y. Y. Kagan (2003), Seismic gaps and earthquakes, J. Geophys. Res., 108, ESE 6-1–6-14.
Citation: https://doi.org/10.5194/nhess-2022-59-RC1
AC1: 'Reply on RC1', Ario Muhammad, 19 Jul 2022
We would like to thank the reviewer for providing significant comments to improve our work.
Because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 1.


RC2: 'Comment on nhess-2022-59', Anonymous Referee #2, 21 Mar 2022
The paper "Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios" by Muhammad et al. develops PTHA for western Sumatra using two alternative magnitude-frequency models: one that is time-dependent (and relatively novel), and another that is time-independent (and more conventional). The paper details the methodologies used to fit each model, and compares their results.
The subject of this paper is of interest for tsunami hazard science and the readership of NHESS. The presentation is mostly clear, although it could be improved in places (mentioned in detail below). But as far as I can tell, there are a number of weaknesses in this paper that will require major revisions to address before it is suitable for publication. These involve problems with the statistical methods, including: using very different data to fit each model; not treating issues of time-varying completeness in the long-term historical data; and the use of questionable methods to set Bayesian priors. The paper also neglects key parameter uncertainties controlling the frequencies of earthquakes, which will probably have a large impact on the results (i.e. greater than the current differences between the time-dependent and time-independent models). The authors also need to adjust the introduction to better reflect the controversy in the literature regarding the performance of time-dependent models.
I hope the authors can consider the comments below and either fix their analyses, or (where I am mistaken) edit the manuscript to make their approaches clearer and more obviously defensible. While this will take significant work, I think it is certainly doable, and will make for a useful contribution to NHESS.
# HIGH-LEVEL COMMENTS
- Generally the paper argues that time-dependent modelling is more accurate, but doesn't provide strong justification for this. To my knowledge there are contrasting views on this in the literature, which should be represented in this paper. The language should be softened, and uncertainties better discussed (see detailed comments).
- The tsunami hazard results do not seem to account for uncertainties in the scenario-frequency model parameters (e.g. b, rate of earthquakes, maximum magnitude, and other parameters). This is true for both the time-dependent and time-independent models, although details of their parameters are different. Variation of the model parameters within the statistical uncertainties will likely have a substantial impact on the results, especially given only 10 events have been used to constrain the time-dependent model (which has many parameters). When these uncertainties are accounted for, I expect they will be larger than the current difference between the time-dependent and time-independent results. Given the 'many synthetic catalogues' approach used in this paper, the uncertainties could be accounted for by randomly drawing different model parameters for each synthetic catalogue.
- The maximum magnitude is set to Mw 9. In reality maximum magnitudes are quite uncertain (justified further below) and yet very impactful for the results. Again I expect they are likely more important than the effect of time-dependence in the current modelling. Many other PTHA studies treat this as an uncertain parameter (details below), and I suggest that issue is also addressed in this paper.
- As far as I can tell, there are a number of technical weaknesses in the scenario-frequency modelling that should be addressed or clarified.
+ The long-term data (10 events over the last 450 years) likely has a time-varying magnitude of completeness; the earliest 8 events all have Mw>=8.3, while only the most recent 2 events (2007+) have Mw<=8. This is not surprising - a priori we certainly expect that it would be harder to detect smaller earthquakes in the paleo data. But the statistical methods seem to ignore this issue. This could have a large impact on the fit of the time-dependent model.
+ The time-independent model is fit to different data than the time-dependent model, and the time-independent fit is dominated by small earthquakes (mostly having magnitudes well below the Mw 7.65+ that are of interest in this study). Even if there were no differences between the models, the use of such different data would lead to differences in their results. This makes it hard to determine the significance of the time-dependent model structure for the PTHA results. To remedy this, the long-term data should be used to constrain the time-independent model for larger magnitudes. This may require accounting for time variations in the catalogue completeness (citations below), and placing less weight on low-magnitude earthquakes (so they don't dominate the fit at higher magnitudes, which currently doesn't agree especially well with the data). This should help reduce the underestimation of the earthquake frequencies with Mw>=8.3 (currently about three times less common in the time-independent model, vs the long-term data).
+ There appear to be some anomalies in the Bayesian fit of the time-dependent model. The priors for some parameters seem to be set using the same data used for fitting, which should not be done with Bayesian statistics. Also, the figures show differences between the priors and posteriors that suggest a poor specification of the priors (details in comments below).
+ The fit of the time-dependent model seems to ignore the change in completeness magnitude of the long-term data.
# DETAILED COMMENTS
Near L25: 'Over the next decades, major tsunamigenic events are anticipated in .... '. It sounds like "we expect large tsunamis in each of these subduction zones within a few decades". I don't think this is well justified. Do the references really back up the 'major events in the next few decades' claim? Historically, time-dependent predictions over these kinds of timescales have not performed well for subduction zones (e.g. Rong et al., 2003).
Near L40: "...assuming a lack of memory between major earthquake occurrences is often viewed as a first approximation .." - I think there are contrasting views on this in the literature that should be represented in this part of the paper. For example, Rong et al. (2003) are quite critical of assumed quasiperiodic earthquake recurrence (on empirical grounds). On the other hand, there is empirical evidence that large earthquakes tend to be weakly periodic, but without correlation between successive inter-event times (Griffin et al., 2020).
L53–54: "Recent work has also used high-resolution spatial grids .... to produce more accurate tsunami hazard results (e.g. < 90m, ...)". I don't think we should describe "< 90 m" as high-resolution for onshore work; that is quite coarse. I might describe resolutions of 10 m or less as high-resolution (e.g. Gibbons et al., 2020).
Near L67: "Since time-dependent hazard estimation leads to more realistic short-term results" - this really needs justification, or removal. To my knowledge this point has not been demonstrated in general, and it may or may not be true. I suppose for aftershock modelling there would be lots of evidence, but this study is using quasiperiodic modelling for large events, and I believe there is less evidence on this matter. In the paleo record, some sites look more time-dependent than others (e.g. Griffin et al., 2020).
Near L74: "A uniform slip was used, which may underestimate the hazard..." - I believe Horspool et al. (2014) used a lognormal distribution to predict the (uncertain) heights at the coast from the uniform-slip scenarios, as a way of accounting for uncertainties due to the slip model and uncertain geometry. In principle this is supposed to compensate for the lack of slip heterogeneity. In practice it could either underestimate or overestimate the variability of natural earthquake-tsunamis. If their sigma were sufficiently large, it may even predict greater hazard than your model (I haven't checked whether it does, just clarifying the principle).
L86: Please add a statement about why you use segments to define the rupture extents (I think it is related to the space-time modelling?).
L115: "magnitude-frequency distribution" - Should this be "probability density function"? I think the MFD would include the factor lambda_i.
L117: "frequency-magnitude distribution" - I think this should be "Cumulative Distribution Function"? Furthermore, I think you need to say that f_i(M) is the derivative of F(M) (and consider whether you need a subscript _i for F).
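To make the distinction in the two comments above concrete, the relations can be written out as follows. The manuscript's Equations 2 and 3 are not reproduced in this discussion, so the doubly truncated Gutenberg-Richter form used here is an assumption, and the symbol \nu_i for the exceedance rate is introduced only for illustration.

```latex
% Assumed doubly truncated Gutenberg-Richter form, with \beta = b \ln 10.
\begin{align}
  F_i(M) &= \frac{1 - e^{-\beta (M - M_{\min})}}{1 - e^{-\beta (M_{\max} - M_{\min})}},
           \qquad M_{\min} \le M \le M_{\max}
           && \text{(cumulative distribution function)} \\
  f_i(M) &= \frac{dF_i}{dM}
          = \frac{\beta \, e^{-\beta (M - M_{\min})}}{1 - e^{-\beta (M_{\max} - M_{\min})}}
           && \text{(probability density function)} \\
  \nu_i(M) &= \lambda_i \, \bigl[ 1 - F_i(M) \bigr]
           && \text{(annual rate of events with magnitude} \ge M \text{)}
\end{align}
```

In this notation only \nu_i(M), which carries the factor \lambda_i, is a magnitude-frequency distribution in the usual sense.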
L125: In Equation 4, the subscript '_i' might be confused with the same subscript used to denote the source in Equation 2. Also, I think Eq 4 should use 'j' for consistency with notation in the paragraph just before Equation 4?
L127: Here I am concerned that you are not using the long-term paleo data to fit the GR model. Why not? The longer-term data suggests a high rate of Mw >= 8.3 (8 events in 450 years, rate around 0.018 per year), quite a bit more frequent than suggested by your time-independent model (visually seems ~ 0.006 in Figure 1C, or one-third the frequency - noting this fit is dominated by low-magnitude earthquakes, below magnitudes of practical interest for this study). As well as taking the opportunity to improve the model accuracy, this would be good because the long-term data is used for fitting the time-dependent model. The use of very different data to fit the two models allows for a substantial 'arbitrary' difference between their results, which is not related to their structure (temporal/non-temporal). I am concerned that this may dominate the differences in your results. I would suggest you fit the time-independent GR model using both the long-term and catalogue data (there are various approaches to treating the varying completeness magnitude, e.g. Weichert, 1980), while removing the instrumental events from the long-term data. Also, you might want to use fewer low-magnitude earthquakes to constrain the fit (to reduce the influence of low-magnitude earthquakes on the fit, and better represent the data at magnitudes that matter for this PTHA).
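The size of the mismatch described in the comment above can be gauged with a short calculation. The sketch below assumes a stationary Poisson process and uses the two figures quoted in the comment (8 events in 450 years, and a model rate of about 0.006 per year read visually from Figure 1C); the function and variable names are illustrative.

```python
import math


def poisson_tail(k_min, mean):
    """P(N >= k_min) for N ~ Poisson(mean)."""
    cdf_below = sum(
        math.exp(-mean) * mean**k / math.factorial(k) for k in range(k_min)
    )
    return 1.0 - cdf_below


record_years = 450.0
observed_events = 8        # Mw >= 8.3 events in the long-term record
model_rate = 0.006         # per year; a visual reading, so only approximate

observed_rate = observed_events / record_years      # about 0.018 per year
expected_events = model_rate * record_years         # 2.7 events in 450 years
p_at_least_observed = poisson_tail(observed_events, expected_events)

print(f"observed rate           : {observed_rate:.4f} per year")
print(f"expected events (model) : {expected_events:.1f}")
print(f"P(N >= {observed_events} | model)      : {p_at_least_observed:.4f}")
```

A small tail probability would indicate that the fitted time-independent rate is hard to reconcile with the long-term record, although the calculation ignores the uncertainty in the visually read rate and in the paleo magnitudes.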
L135: Around here, could you please explicitly state that the time-dependent model does not have Mw-frequency curves that follow the GR distribution, over any timescale. I didn't realise this initially, and it is obviously a very important point for the subsequent analysis. Perhaps add a sentence highlighting that instead the Mw-frequency distribution will reflect correlations between rupture on different segments, which is parameterised by the model itself.
L150: It looks like the magnitudes only go up to 9? I think this is neglecting the large uncertainties in Mw-max. Neglect of those uncertainties may have a strong impact on the results. A few relevant points: Berryman et al. (2015) suggested uncertain Mw-max values in this region ranging from 9.0–9.6 based on scaling relations and the historical record. Such highly uncertain Mw-max values have been represented in PTHAs (e.g. Davies et al., 2017; Davies and Griffin, 2020). Horspool et al. (2014) allowed Mw-max on Sumatra to vary in 9.3–9.7. We know the nearby 2004 event had a magnitude exceeding 9 (around 9.2). From Tohoku we also know that Mw 9.1 can occur in relatively compact regions, smaller than the extents of your study. On this basis I don't think we can exclude the possibility of higher magnitude earthquakes.
L153: "..for each of those 21 rupture scenarios"  suggest to add "geometrical" before "rupture scenarios", to be consistent with previous sentences. Here there are a few interacting concepts: "geometrical rupture scenarios (seems to be a magnitude plus a set of segments?)", "scenarios", "events" (is this the same as "scenarios"). I suggest you pick one term for each concept, and then use it consistently throughout the paper.
L 164: "(one height for one simulation catalog)"  does the height vary with space, or are we looking at the 'maximum height anywhere in the model'?
L 172: "The results confirm that N_{sim} = 100,000 catalogues are sufficient to produce a stable result"  stable in terms of what? The mean over all catalogues? Please make this clear, as I suppose individual catalogues must vary greatly.
L175–179: This section is confusing me. Above I understood that you used N_{sim} = 100,000 to get a stable result. But now it is suggested that many more catalogues were required for 150 years. Please edit to make this clearer. [NOTE: Some sentences from the 'Results' section may help in this regard, mentioned below.]
L197–199: "This number is consistent with the GR model". In my judgement they are "not very consistent", with the model underpredicting the frequency of large events (as discussed above, the GR model has a substantially lower frequency of Mw>=8.3). Note the 450-year record contains 10 events (Mw 7.8–8.9), but the first 8 events have Mw>=8.3, and the last two events are from the recent instrumental period. This suggests changes in the magnitude of completeness of the 450-year catalogue over time. A priori we expect this would happen because paleo records find it more difficult to detect small events. This issue should be accounted for when comparing the GR model with the long-term data (and above I suggest that the long-term data should also be used to fit the time-independent GR model - doing that will probably lead to significant increases in the modelled frequency of large earthquakes).
L205: "see Figure and Figure 5"  missing Figure number.
L217: Suggest you use a word other than "scenarios" to denote the 21 "magnitude + set-of-segments" combinations.
L250ish: Above I argued that the long-term data (10 events, 450 years) is likely subject to a varying completeness magnitude, noting that the only two events with Mw<8 are recent instrumental events, and all others have Mw>=8.3. From what I can see, this 'changing completeness magnitude' is not accounted for in the statistical fit of the time-varying model (Section 2.2.2). I expect this would have a large effect on the time-varying model fit - for example, overestimating the conditional probability of multi-segment rupture (which also affects the frequency of high Mw events), and affecting the BPT model parameters.
L250: "The prior median of mu for each segment is different, namely ....... These values represent the median interarrival time of earthquake rupture on each segment over the last 450 years". It sounds like you are using the same data both to specify the priors, and then to fit the model (?). In Bayesian statistics, the priors should be specified in a way that doesn't use the fitting data, or at least doesn't use it in important ways. Another potential problem with the methodology is suggested by Figure 9, where we see the prior and posterior for 'mu' are very different on some segments - the posterior is more diffuse and often has a very different average (e.g. Panels A, I, K). This suggests the priors have been overly constrained in the analysis. Typically priors would be set either using data different from that used for fitting, or given weakly informative values.
L310: This source zone has some history of "tsunami earthquakes", with waves much larger than might be expected from the magnitude (e.g. Mentawai 2010). Can the current model produce similar large waves for scenarios with magnitude below 8, using the rigidity of 40 GPa? I would be surprised if it can, although that will also depend on how concentrated the slip is allowed to be. Please add a comment on the capacity for the model to make 'tsunami-earthquake' type scenarios.
L317 "... 300 stochastic models are sufficient to simulate stable and consistent tsunami heights and depths" - I think this must depend on the model region, and what you are interested in. For instance it would not give an accurate representation of the 99.5th percentile. Also for a model where only a very small part of the source zone could affect the site of interest, one might need to generate many scenarios to get enough relevant scenarios. In summary, I don't think you can refer to stability tests from another study to provide justification for using 300 models in this study. Instead, can you report on a test that is specific to this case?
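The point about the 99.5th percentile can be illustrated with simple counting. The sketch below assumes the 300 stochastic models are independent draws from one distribution of tsunami heights; the function names are hypothetical.

```python
def expected_exceedances(n_models, quantile):
    """Expected number of samples above the true quantile."""
    return n_models * (1.0 - quantile)


def probability_no_exceedance(n_models, quantile):
    """Chance that none of the n independent samples exceeds the true quantile."""
    return quantile ** n_models


n_models = 300
quantile = 0.995

n_expected = expected_exceedances(n_models, quantile)
p_none = probability_no_exceedance(n_models, quantile)

print(f"expected samples above the {quantile:.1%} level: {n_expected:.1f}")
print(f"probability that no sample exceeds it: {p_none:.3f}")
```

With so few samples in the upper tail, an empirical 99.5th percentile from 300 models rests on one or two realisations, which is why a case-specific stability test is requested.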
L347: "... the final parameter estimates are taken from the maximum a posterior". It would be better to account for the model uncertainty (also in Mw-max, b, etc.), which should be substantial given the limited data available to fit the model, and will probably have a substantial impact on the hazard. One way to do this would be to draw a different parameter set for each of the large number of synthetic catalogues that are simulated.
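A minimal sketch of the per-catalogue parameter draw suggested above is given below. It is not the authors' simulation: it treats a single segment with BPT (inverse Gaussian) inter-arrival times, ignores the interaction between segments, and uses hypothetical posterior samples of (mu, alpha). The sampler follows the standard transformation method for inverse Gaussian variates.

```python
import math
import random


def sample_bpt(mu, alpha, rng):
    """Draw one BPT inter-arrival time with mean mu and aperiodicity alpha."""
    lam = mu / alpha**2                      # inverse Gaussian shape parameter
    y = rng.gauss(0.0, 1.0) ** 2
    x = (mu + mu**2 * y / (2.0 * lam)
         - mu / (2.0 * lam) * math.sqrt(4.0 * mu * lam * y + mu**2 * y**2))
    return x if rng.random() <= mu / (mu + x) else mu**2 / x


def count_events(mu, alpha, duration, rng):
    """Ruptures in one synthetic catalogue that starts just after a rupture."""
    elapsed, n_events = 0.0, 0
    while True:
        elapsed += sample_bpt(mu, alpha, rng)
        if elapsed > duration:
            return n_events
        n_events += 1


def simulate_counts(posterior_samples, n_catalogues, duration, seed=0):
    """Draw a fresh (mu, alpha) pair for every synthetic catalogue."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_catalogues):
        mu, alpha = rng.choice(posterior_samples)
        counts.append(count_events(mu, alpha, duration, rng))
    return counts


# Hypothetical posterior samples (mu in years, alpha dimensionless).
hypothetical_posterior = [(150.0, 0.5), (200.0, 0.6), (260.0, 0.9), (320.0, 1.0)]

counts_with_uncertainty = simulate_counts(hypothetical_posterior, 2000, 450.0)
counts_point_estimate = simulate_counts([(200.0, 0.6)], 2000, 450.0)
```

Comparing the spread of the two sets of counts gives a rough indication of how much of the hazard variability a single point estimate leaves out.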
Section 3.1: As discussed earlier, please comment on why the 'mu-priors' for some segments are so different to the posteriors (little overlap for Fig 9 panels A, I, K). This is surprising, especially considering that the priors were apparently constructed using the same data used for fitting. To me it suggests weaknesses in how the priors were constructed, or some other problem.
L356–360: This is a very clear description of how the catalogue duration was defined. I suggest you move this to the earlier methods section (where I expressed confusion about the method).
L360 and Figure 10: Regarding the validation of the annual seismic moment release: Considering that the data was used to fit the model, I don't think the observations/model are particularly consistent on segments 3 and 4. In both cases the observed data exceeds the 90th percentile of the model. Again this seems to suggest some underestimation in the model, as discussed repeatedly above. Please check that this is all correct following revisions, and if it is, add a comment explaining why this is nonetheless reasonably consistent.
L379: Figure 11C is not a strong basis for making a point about which segments rupture more or less, because it is only 1 catalogue. Can you please make a figure that better justifies the points made in this paragraph?
Line 447 and Figure 12: The conditional probability of Mw 9.0 (if an earthquake occurs) is larger in the time-independent case. But I doubt that these results will be robust to parameter uncertainties in the time-dependent model, considering that limited data (10 events, likely with time-varying completeness) was available to fit its many parameters. This further suggests the importance of considering model parameter uncertainties in the PTHA.
Line 451: One factor neglected in this discussion is the effect of using different datasets to fit the 2 models, which could cause differences in the results even if there was no other difference between the two kinds of models. I think the calculations in this paper should be revised so that the time-independent model is informed by the long-term data, and that parameter uncertainties in both the time-independent and time-dependent models are accounted for. In my judgment it is likely that the parameter uncertainties could lead to differences in the results that are substantially larger than differences between the current time-dependent and time-independent models.
# TECHNICAL CORRECTIONS
None for now.
# REFERENCES
Berryman, K.; Wallace, L.; Hayes, G.; Bird, P.; Wang, K.; Basili, R.; Lay, T.; Pagani, M.; Stein, R.; Sagiya, T.; Rubin, C.; Barrientos, S.; Kreemer, C.; Litchfield, N.; Stirling, M.; Gledhill, K.; Haller, K. & Costa, C. The GEM Faulted Earth Subduction Interface Characterisation Project: Version 2.0 – April 2015 GEM, GEM, 2015
Davies, G.; Griffin, J.; Løvholt, F.; Glimsdal, S.; Harbitz, C.; Thio, H. K.; Lorito, S.; Basili, R.; Selva, J.; Geist, E. & Baptista, M. A. A global probabilistic tsunami hazard assessment from earthquake sources Geological Society, London, Special Publications, Geological Society of London, 2017
Davies, G. & Griffin, J. Sensitivity of Probabilistic Tsunami Hazard Assessment to Far-Field Earthquake Slip Complexity and Rigidity Depth-Dependence: Case Study of Australia Pure and Applied Geophysics, 2020, 177, 1521–1548
Gibbons, S. J.; Lorito, S.; Macías, J.; Løvholt, F.; Selva, J.; Volpe, M.; Sánchez-Linares, C.; Babeyko, A.; Brizuela, B.; Cirella, A.; Castro, M. J.; de la Asunción, M.; Lanucara, P.; Glimsdal, S.; Lorenzino, M. C.; Nazaria, M.; Pizzimenti, L.; Romano, F.; Scala, A.; Tonini, R.; Manuel González Vida, J. & Vöge, M. Probabilistic Tsunami Hazard Analysis: High Performance Computing for Massive Scale Inundation Simulations. Frontiers in Earth Science, 2020, 8, 623
Griffin, J. D.; Stirling, M. W. & Wang, T. Periodicity and Clustering in the Long-Term Earthquake Record Geophysical Research Letters, American Geophysical Union (AGU), 2020, 47
Horspool, N.; Pranantyo, I.; Griffin, J.; Latief, H.; Natawidjaja, D. H.; Kongko, W.; Cipta, A.; Bustamam, B.; Anugrah, S. D. & Thio, H. K. A probabilistic tsunami hazard assessment for Indonesia Natural Hazards and Earth System Sciences, 2014, 14, 3105–3122
Rong, Y.; Jackson, D. D. & Kagan, Y. Y. Seismic gaps and earthquakes Journal of Geophysical Research: Solid Earth, Wiley-Blackwell, 2003, 108
Weichert, D. H. Estimation of the earthquake recurrence parameters for unequal observation periods for different magnitudes Bulletin of the Seismological Society of America, 1980, 70, 1337–1346
Citation: https://doi.org/10.5194/nhess-2022-59-RC2
AC2: 'Reply on RC2', Ario Muhammad, 19 Jul 2022
We thank the reviewer for providing significant comments to improve our work.
Because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from reviewer 2.


RC3: 'Comment on nhess-2022-59', Anonymous Referee #3, 23 Mar 2022
Review of Muhammad et al.
This paper presents a methodology for time-dependent probabilistic tsunami hazard analysis with stochastic earthquake rupture modelling, using the Mentawai region of the Sunda Subduction Zone as a case study. This is a novel and ambitious approach, and it is exciting to see the efforts made by the authors. In my view, the complexity of the model does however pose some challenges, and I think there are a number of points that require further justification and/or consideration of the choices made in the model. I expect this will require some effort to revise the model.
Major comments
Justification of the choice of a time-dependent approach. A number of recent studies of global paleoearthquake records (Williams et al 2019; Griffin et al 2020; Moernaut 2020) have, to varying degrees, provided empirical support for weakly quasiperiodic earthquake recurrence as a general model, which can be used to justify the use of renewal models for hazard assessment. That said, the Mentawai record of Philibosian et al (2017) looks to be more random than quasiperiodic in the analysis presented by Griffin et al (2020), although perhaps a different result might be obtained using the segmentation model presented here. The posterior BPT parameter estimates given for each segment are also relevant – some give values of alpha ~1 (segments 2, 3 and 4), implying random recurrence (i.e. Poisson), while others are ~0.6 (segments 1, 5 and 6), implying moderately quasiperiodic recurrence. So, I think some comment needs to be made here that:
- At a global scale there is empirical support for weakly quasiperiodic earthquake recurrence as a general model (see Griffin et al 2020);
- Excluding the hypothesis at the individual fault level is difficult, particularly for short records (Williams et al 2019; Griffin et al 2020);
- The data from Philibosian et al (2017) is somewhat equivocal about whether earthquake recurrence here is truly time-dependent, and the Poisson hypothesis cannot be confidently excluded using these data. But the global studies mentioned above suggest it is not unreasonable to assume time-dependence as a hypothesis.
The discussion section of the paper could then discuss the implications of this assumption in light of the different values of alpha obtained for each segment.
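To illustrate how much the aperiodicity alpha matters for such a discussion, the sketch below evaluates the BPT conditional rupture probability for the two alpha values mentioned above and compares it with a Poisson probability at the same mean rate. The mean recurrence of 200 years, the 30-year window and the elapsed times are hypothetical values chosen only for illustration.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bpt_cdf(t, mu, alpha):
    """CDF of the Brownian passage-time distribution (mean mu, aperiodicity alpha)."""
    if t <= 0.0:
        return 0.0
    u1 = (math.sqrt(t / mu) - math.sqrt(mu / t)) / alpha
    u2 = (math.sqrt(t / mu) + math.sqrt(mu / t)) / alpha
    return std_normal_cdf(u1) + math.exp(2.0 / alpha**2) * std_normal_cdf(-u2)


def conditional_probability(elapsed, window, mu, alpha):
    """P(rupture within `window` years | no rupture for `elapsed` years)."""
    f_now = bpt_cdf(elapsed, mu, alpha)
    f_later = bpt_cdf(elapsed + window, mu, alpha)
    return (f_later - f_now) / (1.0 - f_now)


def poisson_probability(window, mu):
    """Time-independent probability of at least one rupture in `window` years."""
    return 1.0 - math.exp(-window / mu)


mu_years, window_years = 200.0, 30.0      # hypothetical values
for alpha in (0.6, 1.0):
    for elapsed in (20.0, 200.0):
        p = conditional_probability(elapsed, window_years, mu_years, alpha)
        print(f"alpha={alpha}, elapsed={elapsed:.0f} yr: P = {p:.3f}")
print(f"Poisson: P = {poisson_probability(window_years, mu_years):.3f}")
```

A BPT distribution with alpha = 1 is still not identical to an exponential distribution, so even segments with alpha near 1 retain some dependence on the elapsed time.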
In estimating parameters for the BPT distribution, the authors use the data to estimate the prior distribution of mu, before then using the same data to calculate the posterior probability distribution of mu. This is incorrect. I would suggest using an uninformative prior (e.g. as used by Fitzenz et al 2010). An alternative approach could be to use an informative prior for mu based on the slip rate (e.g. as determined from geodesy), but this may become complex (e.g. due to having to estimate coupling of the fault). The 450-year-long record is short for accurately estimating model parameters. This is, of course, what a Bayesian approach should be helping with, but it needs more care in the choice of priors.
I am also concerned that fitting the model parameters to each segment individually is problematic. Later you consider multisegment ruptures, and it is not clear how all this fits together. Do the recurrence statistics obtained from the sum of all synthetic ruptures across all segments match the recurrence statistics from the sum of all historic/paleo ruptures in your data? Checking this could be a good test for your model.
Also related to parameter estimation, some of the posterior histograms seem a bit spiky; does this improve if the number of samples is increased beyond 10,000?
Spatiotemporal completeness of the paleo record compared with the instrumental record is an issue that I think could lead to biases in the parameter estimates. It is very unlikely that events similar to the Mw 7.8 2010 Mentawai event would be visible in the coral record; this event occurred near the trench and caused <4 cm subsidence on the Mentawai Islands as measured with GPS (Hill et al 2012). Related to the above, the Mmin of 7.6 (L129), while reasonable from a tsunami hazard assessment perspective, would mean that you are modelling events that are unlikely to be present in the paleoearthquake record. I am unsure of how the frequency of these events could be determined in the time-dependent approach. It therefore seems likely that, in your current approach, smaller events are missed in the paleoearthquake record, which would affect the recurrence model parameters.
The 1D rupture segmentation is a problem for tsunami hazard assessment, as the resulting tsunami size depends so significantly on the depth of rupture. Compare the 2007 Bengkulu earthquakes (Mw 8.4 and 7.9), which were down-dip of the trench and did not generate a significant tsunami, with the 2010 Mentawai earthquake (Mw 7.8), which occurred near the trench and did generate a significant tsunami. It is not clear whether such events are discriminated by the stochastic modelling approach with 1D segmentation – it seems they probably aren’t, but I may not be understanding correctly. A related problem is low rigidity near the trench and its tsunamigenic potential, as in the 2010 Mentawai tsunami. How might the assumption of constant (and relatively high) rigidity (L309-310) bias your tsunami hazard results?
The maximum magnitude of 9.0 seems too low, which seems related to the segmentation model. If the potential for ruptures connecting with other segments of the Sunda Subduction Zone is considered, then larger Mmax values are justified. Significantly larger Mmax’s were used in Horspool et al (2014). Even if the paleoearthquake record for the past 450 years suggests events haven’t exceeded Mw 9.0, we also don’t expect these magnitude events to occur all that often. So allow for the possibility that they are missing from the record.
Some areas of the coast of Padang show zero probability of inundation (Figure 17), while in others the potential inundation extent extends quite a way inland. This raises some significant concerns for me about the quality of the inundation modelling and/or the elevation data used, given how low-lying the coast is in this area. If only SRTM data were used, this could significantly underestimate inundation extent (see Griffin et al 2015, Figure 8). Are buildings included in the elevation model?
Detailed comments
L15: Suggest change ‘A total of >’ to ‘More than’
L18: Forecast periods begin in what year?
L136: Choice of BPT is fine, but hasn’t really been justified here. Why is this chosen over lognormal, Weibull or Gamma? Some of your justification seems to be presented later in Section 2.2.
L174: Several thousand years
L185: Perhaps rephrase as ‘reflects the expectations of elastic rebound theory’, or similar.
L192: Should probably cite others who’ve used Bayesian approaches to fitting time-dependent models to earthquake records, in particular Rhoades et al (1994) and Fitzenz et al (2010).
L197 and Table 1: These should not be referred to as tsunamigenic. For half of them we have no information on whether a tsunami was generated; coseismic deformation on the Mentawai Islands observed in coral paleogeodetic records suggests they probably were, but we don’t actually know.
L324: Please give a link or citation for DEM5 and Bathy5.
L332: Might be a typo here – Griffin et al (2016) used a Manning’s roughness of 0.036 as a conservative minimum for land (grassland; for the Mentawai Islands). For the urban context here, 0.06 may be reasonable, e.g. Griffin et al (2015) suggested a Manning’s roughness of 0.08 for the city of Padang. See also Kaiser et al (2011) for a discussion of the choice of Manning’s n.
Lines 501-502: The time-independent model has too low an Mmax (9.0) to be considered worst-case. See earlier comments about choice of Mmax.
Table 1: Change Shieh to Sieh.
Figure 3, and also in the text. I do not think the term ‘occurred’ scenarios is the best terminology. These are modelled scenarios that have not actually occurred.
References:
Fitzenz, D. D., Ferry, M. A., & Jalobeanu, A. (2010). Long-term slip history discriminates among occurrence models for seismic hazard assessment. Geophysical Research Letters, 37(20), 1–5. https://doi.org/10.1029/2010GL044071
Griffin, J., Latief, H., Kongko, W., Harig, S., Horspool, N., Hanung, R., Rojali, A., Maher, N., Fuchs, A., Hossen, J., & others. (2015). An evaluation of onshore digital elevation models for modeling tsunami inundation zones. Frontiers in Earth Science, 3(32).
Griffin, J. D., Stirling, M. W., & Wang, T. (2020). Periodicity and Clustering in the Long-Term Earthquake Record. Geophysical Research Letters, 47(22). https://doi.org/10.1029/2020GL089272
Horspool, N., Pranantyo, I., Griffin, J., Latief, H., Natawidjaja, D. H., Kongko, W., Cipta, A., Bustaman, B., Anugrah, S. D., & Thio, H. K. (2014). A probabilistic tsunami hazard assessment for Indonesia. Natural Hazards and Earth System Science, 14(11), 3105–3122.
Hill, E. M., Borrero, J. C., Huang, Z., Qiu, Q., Banerjee, P., Natawidjaja, D. H., Elosegui, P., Fritz, H. M., Suwargadi, B. W., Pranantyo, I. R., Li, L. L., Macpherson, K. A., Skanavis, V., Synolakis, C. E., & Sieh, K. (2012). The 2010 Mw 7.8 Mentawai earthquake: Very shallow source of a rare tsunami earthquake determined from tsunami field survey and near-field GPS data. Journal of Geophysical Research: Solid Earth, 117(6), 1–21. https://doi.org/10.1029/2012JB009159
Kaiser, G., Scheele, L., Kortenhaus, A., Løvholt, F., Römer, H. and Leschka, S., 2011. The influence of land cover roughness on the results of high resolution tsunami inundation modeling. Natural Hazards and Earth System Sciences, 11(9), pp. 2521–2540.
Moernaut, J. (2020, November 1). Time-dependent recurrence of strong earthquake shaking near plate boundaries: A lake sediment perspective. Earth-Science Reviews. Elsevier B.V. https://doi.org/10.1016/j.earscirev.2020.103344
Rhoades, D. A., & Van Dissen, R. J. (2003). Estimates of the time-varying hazard of rupture of the Alpine Fault, New Zealand, allowing for uncertainties. New Zealand Journal of Geology and Geophysics, 46(4), 479–488. https://doi.org/10.1080/00288306.2003.9515023
Williams, R. T., Davis, J. R., & Goodwin, L. B. (2019). Do Large Earthquakes Occur at Regular Intervals Through Time? A Perspective From the Geologic Record. Geophysical Research Letters, 46(14), 8074–8081. https://doi.org/10.1029/2019GL083291
Citation: https://doi.org/10.5194/nhess202259RC3 
AC3: 'Reply on RC3', Ario Muhammad, 19 Jul 2022
We thank the reviewer for providing significant comments to improve our work.
Because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from Reviewer 3.
Interactive discussion
Status: closed

RC1: 'Comment on nhess202259', Anonymous Referee #1, 18 Mar 2022
The manuscript “Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios” by Muhammad et al. presents an important application of time-dependent probabilistic tsunami hazard analysis (PTHA) to the central Sunda subduction zone. The method involves several novel components, such as stochastic tsunami simulation and space-time interactions among earthquakes, developed in previous publications but integrated in this applied study. The time-dependent component may be particularly important for regions that have recently had a large magnitude earthquake (see comment 1, however) and for short design exposure times. Several minor comments are indicated below, primarily related to unstated assumptions and parameter uncertainty. Upon revision, this paper should be a valuable contribution to Natural Hazards and Earth System Sciences.
General Comments
(1) The study is based on the idea that a BPT or other time-dependent rupture model more accurately represents earthquake behavior along the Sunda subduction zone. Given numerous papers refuting the seismic gap hypothesis for subduction zones in general (e.g., Rong et al., 2003 who cite Matthews, 2002), it seems that a logical first step for any study region is to falsify a Poisson null hypothesis.
(2) Although the definition of fault segments is based on 450 years of earthquake occurrence, this record still might not be sufficient to determine whether these segment boundaries are persistent (cf., Jackson et al, 2011).
(3) The earthquake occurrence model is based on a 1D (along strike) representation of the subduction zone. For the Sunda subduction zone, as with other subduction zones with a broad shelf, however, tsunami generation is critically dependent on the dip extent of rupture as was notably observed in comparisons of the 2004 and 2005 earthquakes (e.g., Geist et al., 2006). The limitation of the 1D approach should be mentioned.
(4) It seems that it would be straightforward to estimate uncertainties in mu, alpha, and gamma from the posterior distributions (confidence intervals). These uncertainties could then be used as part of the probabilistic calculations.
(5) My impression is that the maximum magnitude earthquake considered is from the 450-year record and essentially is an event that spans segments 1-6. Even though the tsunami from an Mmax event would have a low probability, such an event may pose a more significant component of the aggregate hazard for longer exposure times than considered in this study. It should be clarified how Mmax is determined and whether a penultimate event could extend beyond the study region.
(6) Tsunami heights seem to “saturate” at nearly 10m (Figure 13). Is this dependent on the largest magnitude earthquake or is this caused by a hydrodynamic effect?
Inline comments
L42: Vere-Jones’ stress release model (cf., Bebbington and Harte, 2001) could also be mentioned, as it is more relevant to this study.
Eqn. 3 is a cumulative distribution function, not a frequency-magnitude distribution.
L141: I couldn’t find in the manuscript where the specific magnitude-area relation used was mentioned. Since this is often a contentious choice, especially for subduction zone earthquakes, the specific relation and its justification should be indicated.
L257: How is distance D determined?
L316: Same variable D used for slip here and distance in L257.
Fig. 3: “occurred scenarios” is awkward. Could just say “scenarios”.
References
Bebbington, M., and D. S. Harte (2001), On the Statistics of the Linked Stress Release Model, Journal of Applied Probability, 38, 176–187.
Geist, E. L., S. L. Bilek, D. Arcas, and V. V. Titov (2006), Differences in tsunami generation between the December 26, 2004 and March 28, 2005 Sumatra earthquakes, Earth Planets Space, 58, 185–193.
Jackson, D. D., Y. Y. Kagan, and H. Gupta (2011), Characteristic earthquakes and seismic gaps, Encyclopedia of solid earth geophysics, 5, 1539.
Rong, Y., D. D. Jackson, and Y. Y. Kagan (2003), Seismic gaps and earthquakes, J. Geophys. Res., 108, ESE 6-1–6-14.
Citation: https://doi.org/10.5194/nhess202259RC1 
AC1: 'Reply on RC1', Ario Muhammad, 19 Jul 2022
We would like to thank the reviewer for providing significant comments to improve our work.
Because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we attach a PDF supplement that responds to all of the general and detailed comments from Reviewer 1.

RC2: 'Comment on nhess202259', Anonymous Referee #2, 21 Mar 2022
The paper "Time-dependent Probabilistic Tsunami Hazard Analysis for Western Sumatra, Indonesia, Using Space-Time Earthquake Rupture Modelling and Stochastic Source Scenarios" by Muhammad et al develops PTHA for western Sumatra using two alternative magnitude-frequency models; one that is time-dependent (and relatively novel), and another that is time-independent (and more conventional). The paper details the methodologies used to fit each model, and compares their results.
The subject of this paper is of interest for tsunami hazard science and the readership of NHESS. The presentation is mostly clear although it could be improved in places (mentioned in the details below). But as far as I can tell, there are a number of weaknesses in this paper that will require major revisions to address before it is suitable for publication. These involve problems with the statistical methods, including: using very different data to fit each model; not treating issues of time-varying completeness in the long-term historical data; use of questionable methods to set Bayesian priors. The paper also neglects key parameter uncertainties controlling the frequencies of earthquakes, which will probably have a large impact on the results (i.e. greater than the current differences between the time-dependent and time-independent models). The authors also need to adjust the introduction to better reflect the controversy in the literature regarding the performance of time-dependent models.
I hope the authors can consider the comments below and either fix their analyses, or (where I am mistaken) edit the manuscript to make their approaches clearer and more obviously defensible. While this will take significant work, I think it is certainly doable, and will make for a useful contribution to NHESS.
# HIGH-LEVEL COMMENTS
 Generally the paper argues that time-dependent modelling is more accurate, but doesn't provide strong justification for this. To my knowledge there are contrasting views on this in the literature, which should be represented in this paper. The language should be softened, and uncertainties better discussed (see detailed comments).
 The tsunami hazard results do not seem to account for uncertainties in the scenario-frequency model parameters (e.g. b, rate of earthquakes, maximum-magnitude, and other parameters). This is true for both the time-dependent and time-independent models, although details of their parameters are different. Variation of the model parameters within the statistical uncertainties will likely have a substantial impact on the results, especially given only 10 events have been used to constrain the time-dependent model (which has many parameters). When these uncertainties are accounted for, I expect they will be larger than the current difference between the time-dependent and time-independent results. Given the 'many synthetic catalogues' approach used in this paper, the uncertainties could be accounted for by randomly drawing different model parameters for each synthetic catalogue.
 The maximum-magnitude is set to Mw 9. In reality maximum magnitudes are quite uncertain (justified further below) and yet very impactful for the results. Again I expect they are likely more important than the effect of time-dependence in the current modelling. Many other PTHA studies treat this as an uncertain parameter (details below), and I suggest that issue is also addressed in this paper.
 In so far as I can tell, there are a number of technical weaknesses in the scenario frequency modelling that should be addressed or clarified.
+ The long-term data (10 events over the last 450 years) likely has a time-varying magnitude of completeness; the earliest 8 events all have Mw>=8.3, while only the most recent 2 events (2007+) have Mw<=8. This is not surprising - a priori we certainly expect that it would be harder to detect smaller earthquakes in the paleo data. But the statistical methods seem to ignore this issue. This could have a large impact on the fit of the time-dependent model.
+ The time-independent model is fit to different data than the time-dependent model, and the time-independent fit is dominated by small earthquakes (mostly having magnitudes well below the Mw 7.65+ that are of interest in this study). Even if there were no differences between the models, the use of such different data would lead to differences in their results. This makes it hard to determine the significance of the time-dependent model structure for the PTHA results. To remedy this, the long-term data should be used to constrain the time-independent model for larger magnitudes. This may require accounting for time-variations in the catalogue completeness (citations below), and placing less weight on low magnitude earthquakes (so they don't dominate the fit at higher magnitudes, which currently doesn't agree especially well with the data). This should help reduce the underestimation of the earthquake frequencies with Mw>=8.3 (currently about three times less common in the time-independent model, vs the long term data).
+ There appear to be some anomalies in the Bayesian fit of the time-dependent model. The priors for some parameters seem to be set using the same data used for fitting, which should not be done with Bayesian statistics. Also, the figures show differences between the priors and posteriors that suggest a poor specification of the priors (details in comments below).
+ The fit of the time-dependent model seems to ignore the change in completeness magnitude of the long-term data.
# DETAILED COMMENTS
Near L25: 'Over the next decades, major tsunamigenic events are anticipated in .... '. It sounds like "we expect large tsunamis in each of these subduction zones within a few decades". I don't think this is well justified. Do the references really back up the 'major events in the next-few-decades' claim? Historically, time-dependent predictions over these kinds of timescales have not performed well for subduction zones (e.g. Rong et al., 2003).
Near L40: "...assuming a lack of memory between major earthquake occurrences is often viewed as a first approximation .." - I think there are contrasting views on this in the literature, that should be represented in this part of the paper. For example Rong et al. (2003) are quite critical of assumed quasiperiodic earthquake recurrence (on empirical grounds). On the other hand, there is empirical evidence that large earthquakes tend to be weakly periodic, but without correlation between successive inter-event times (Griffin et al., 2020).
L53-54: "Recent work has also used high-resolution spatial grids .... to produce more accurate tsunami hazard results (e.g. < 90m, ...)". I don't think we should describe "< 90m" as high-resolution for onshore work, that is quite coarse. I might describe resolutions of 10m or less as high-resolution (e.g. Gibbons et al., 2020).
Near L67: "Since time-dependent hazard estimation leads to more realistic short-term results" - this really needs justification, or removal. To my knowledge this point has not been demonstrated in general, and it may or may not be true. I suppose for aftershock modelling there would be lots of evidence, but this study is using quasiperiodic modelling for large events, and I believe there is less evidence on this matter. In the Paleo record, some sites look more time-dependent than others, e.g. Griffin et al. 2020.
Near L74: "A uniform-slip was used, which may underestimate the hazard..." - I believe Horspool et al. (2014) used a lognormal distribution to predict the (uncertain) heights at the coast from the uniform-slip scenarios, as a way of accounting for uncertainties due to the slip model and uncertain geometry. In principle this is supposed to compensate for the lack of slip heterogeneity. In practice it could either underestimate, or overestimate, the variability of natural earthquake-tsunamis. If their sigma were sufficiently large, it may even predict greater hazard than your model (I haven't checked whether it does, just clarifying the principle).
L86: Please add a statement about why you use segments to define the rupture extents (I think it is related to the space-time modelling?).
L115: "magnitude-frequency distribution" - Should this be "probability density function"? I think the MFD would include the factor lambda_i.
L117: "frequency-magnitude distribution" - I think this should be "Cumulative Distribution Function"? Furthermore I think you need to say that f_i(M) is the derivative of F(M) (and consider whether you need a subscript _i for F).
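For reference, the relationship between the density and the cumulative distribution can be written out explicitly. The form below assumes a doubly truncated Gutenberg-Richter model with parameters b, M_min and M_max; this is an illustrative assumption, since the manuscript's Eq. 3 is not reproduced in this discussion. Under that assumption the magnitude-frequency distribution for source i would be lambda_i f_i(M).

```latex
F_i(M) = \frac{1 - 10^{-b\,(M - M_{\min})}}{1 - 10^{-b\,(M_{\max} - M_{\min})}},
\qquad
f_i(M) = \frac{\mathrm{d}F_i}{\mathrm{d}M}
       = \frac{b \ln(10)\, 10^{-b\,(M - M_{\min})}}{1 - 10^{-b\,(M_{\max} - M_{\min})}},
\qquad M_{\min} \le M \le M_{\max}.
```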
L125: In Equation 4, the subscript '_i' might be confused with the same subscript used to denote the source in Equation 2. Also, I think Eq 4 should use 'j' for consistency with notation in the paragraph just before Equation 4?
L127: Here I am concerned that you are not using the long-term paleo data to fit the GR model. Why not? The longer term data suggests a high rate of Mw >= 8.3 (8 events in 450 years, rate around 0.018), quite a bit more frequent than suggested by your time-independent model (visually seems ~ 0.006 in Figure 1C, or one-third the frequency - noting this fit is dominated by low-magnitude earthquakes, below magnitudes of practical interest for this study). As well as taking the opportunity to improve the model accuracy, this would be good because the long-term data is used for fitting the time-dependent model. The use of very different data to fit the two models allows for a substantial 'arbitrary' difference between their results, which is not related to their structure (temporal/non-temporal). I am concerned that this may dominate the differences in your results. I would suggest you fit the time-independent GR model using both the long-term and catalogue data (there are various approaches to treating the varying completeness magnitude, e.g. Weichert, 1980), while removing the instrumental events from the long-term data. Also, you might want to use fewer low-magnitude earthquakes to constrain the fit (to reduce the influence of low-magnitude earthquakes on the fit, and better represent the data at magnitudes that matter for this PTHA).
L135: Around here, could you please explicitly state that the time-dependent model does not have Mw-frequency curves that follow the GR distribution, over any timescale. I didn't realise this initially, and it is obviously a very important point for the subsequent analysis. Perhaps a sentence highlighting that instead the Mw-frequency distribution will reflect correlations between rupture on different segments, which is parameterised by the model itself.
L150: It looks like the magnitudes only go up to 9? I think this is neglecting the large uncertainties in Mwmax. Neglect of those uncertainties may have a strong impact on the results. A few relevant points: Berryman et al. (2015) suggested uncertain Mwmax values in this region ranging from 9.0 - 9.6 based on scaling relations and the historical record. Such highly uncertain Mwmax values have been represented in PTHAs (e.g. Davies et al., 2017; Davies and Griffin, 2020). Horspool et al. (2014) allowed Mwmax on Sumatra to vary in 9.3 - 9.7. We know the nearby 2004 event had a magnitude exceeding 9 (around 9.2). From Tohoku we also know that Mw 9.1 can occur in relatively compact regions, smaller than the extents of your study. On this basis I don't think we can exclude the possibility of higher magnitude earthquakes.
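The rate comparison in the L127 comment can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in that comment (8 events in 450 years, and the visually estimated 0.006 per year); the Poisson tail probability is an added illustration that assumes a stationary Poisson process and ignores completeness issues, so it should be read as indicative only.

```python
import math

n_events, years = 8, 450.0
paleo_rate = n_events / years      # rate of Mw >= 8.3 implied by the long-term record
gr_rate = 0.006                    # referee's visual estimate from Figure 1C
ratio = paleo_rate / gr_rate       # how many times more frequent the record is

def poisson_sf(k, lam):
    """P(N >= k) for N ~ Poisson(lam)."""
    return 1.0 - sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))

# Chance of seeing at least 8 such events in 450 years if the
# time-independent rate of 0.006 per year were correct (assumption: Poisson).
p_at_least_8 = poisson_sf(n_events, gr_rate * years)

print(f"paleo rate = {paleo_rate:.4f} per year, ratio = {ratio:.2f}")
print(f"P(N >= {n_events}) under the GR rate = {p_at_least_8:.4f}")
```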
L153: "..for each of those 21 rupture scenarios"  suggest to add "geometrical" before "rupture scenarios", to be consistent with previous sentences. Here there are a few interacting concepts: "geometrical rupture scenarios (seems to be a magnitude plus a set of segments?)", "scenarios", "events" (is this the same as "scenarios"). I suggest you pick one term for each concept, and then use it consistently throughout the paper.
L 164: "(one height for one simulation catalog)"  does the height vary with space, or are we looking at the 'maximum height anywhere in the model'?
L 172: "The results confirm that N_{sim} = 100,000 catalogues are sufficient to produce a stable result"  stable in terms of what? The mean over all catalogues? Please make this clear, as I suppose individual catalogues must vary greatly.
L175-179: This section is confusing me. Above I understood that you used N_{sim} = 100,000 to get a stable result. But now it is suggested that many more catalogues were required for 1-50 years. Please edit to make this clearer. [NOTE: Some sentences from the 'Results' section may help in this regard, mentioned below.].
L197-199: "This number is consistent with the GR model". In my judgement they are "not very consistent", with the model underpredicting the frequency of large events (as discussed above, the GR model has a substantially lower frequency of Mw>=8.3). Note the 450 year record contains 10 events (Mw 7.8-8.9), but the first 8 events have Mw>=8.3, and the last two events are from the recent instrumental period. This suggests changes in the magnitude of completeness of the 450 year catalogue over time. A priori we expect this would happen because Paleo records find it more difficult to detect small events. This issue should be accounted for when comparing the GR model with the long term data (and above I suggest that the long-term data should also be used to fit the time-independent GR model - doing that will probably lead to significant increases in the modelled frequency of large earthquakes).
L205: "see Figure and Figure 5"  missing Figure number.
L217: Suggest you use a word other than "scenarios" to denote the 21 "magnitude + set-of-segments" combinations.
L250ish: Above I argued that the long-term data (10 events, 450 years) is likely subject to a varying completeness magnitude, noting that the only two events with Mw<8 are recent instrumental events, and all others have Mw>=8.3. From what I can see, this 'changing completeness magnitude' is not accounted for in the statistical fit of the time-varying model (Section 2.2.2). I expect this would have a large effect on the time-varying model fit - for example, overestimating the conditional probability of multi-segment rupture (which also affects the frequency of high Mw events), and affecting the BPT model parameters.
L250: "The prior median of mu for each segment is different, namely ....... These values represent the median interarrival time of earthquake rupture on each segment over the last 450 years". It sounds like you are using the same data both to specify the priors, and then to fit the model (?). In Bayesian statistics, the priors should be specified in a way that doesn't use the fitting data, or at least doesn't use it in important ways. Another potential problem with the methodology is suggested by Figure 9, where we see the prior and posterior for 'mu' are very different on some segments - the posterior is more diffuse and often has a very different average (e.g. Panels A, I, K). This suggests the priors have been overly constrained in the analysis. Typically priors would be set either using data different from that used for fitting, or given weakly informative values.
L310: This source zone has some history of "tsunami-earthquakes", with waves much larger than might be expected from the magnitude (e.g. Mentawai 2010). Can the current model produce similar large waves for scenarios with magnitude below 8, using the rigidity of 40 GPa? I would be surprised if it can, although that will also depend on how concentrated the slip is allowed to be. Please add a comment on the capacity for the model to make 'tsunami-earthquake' type scenarios.
L317 "... 300 stochastic models are sufficient to simulate stable and consistent tsunami heights and depths" - I think this must depend on the model region, and what you are interested in. For instance it would not give an accurate representation of the 99.5th percentile. Also for a model where only a very small part of the source-zone could affect the site of interest, one might need to generate many scenarios to get enough relevant scenarios. In summary, I don't think you can refer to stability tests from another study to provide justification for using 300 models in this study. Instead, can you report on a test that is specific to this case?
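A case-specific stability check of the kind requested in the L317 comment could be as simple as a bootstrap over the simulated scenarios. The sketch below is a hedged illustration: the 300 tsunami heights are synthetic lognormal draws standing in for real model output, and the function name and the choice of a 95% interval are assumptions made for the example.

```python
import numpy as np

def bootstrap_relative_width(values, q, n_boot=2000, seed=0):
    """Width of the 95% bootstrap interval of the q-th percentile,
    divided by the point estimate of that percentile."""
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Resample scenarios with replacement, one row per bootstrap replicate.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    stats = np.percentile(values[idx], q, axis=1)
    lo, hi = np.percentile(stats, [2.5, 97.5])
    return (hi - lo) / np.percentile(values, q)

# Synthetic stand-in for 300 simulated maximum tsunami heights (metres).
rng = np.random.default_rng(123)
heights = rng.lognormal(mean=1.0, sigma=0.8, size=300)

w_median = bootstrap_relative_width(heights, 50.0)
w_tail = bootstrap_relative_width(heights, 99.5)
print(f"relative 95% width: median = {w_median:.2f}, 99.5th pct = {w_tail:.2f}")
```

Repeating this for the actual height or depth samples at the sites of interest would show directly which statistics are stable at 300 models and which are not.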
L347: "... the final parameter estimates are taken from the maximum a posterior". It would be better to account for the model uncertainty (also in Mwmax, b, etc), which should be substantial given the limited data available to fit the model, and will probably have a substantial impact on the hazard. One way to do this would be to draw a different parameter set for each of the large number of synthetic catalogues that are simulated.
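The suggestion of drawing a different parameter set for each synthetic catalogue can be sketched as follows. Everything numerical here is hypothetical: the "posterior samples" are generated from arbitrary distributions purely to stand in for real MCMC output, a single segment is simulated, and the helper `count_events` is an invented name. NumPy's `Generator.wald(mean, scale)` is used for the BPT (inverse Gaussian) draws, with scale = mu / alpha**2.

```python
import numpy as np

def count_events(mu, alpha, horizon, rng):
    """Number of BPT renewal events within `horizon` years,
    starting the clock immediately after an event."""
    t, n = 0.0, 0
    while True:
        t += rng.wald(mu, mu / alpha**2)   # inverse Gaussian inter-event time
        if t > horizon:
            return n
        n += 1

rng = np.random.default_rng(1)
n_post, n_cat, horizon = 5000, 2000, 450.0

# Hypothetical posterior samples (placeholders for real MCMC output).
mu_post = rng.lognormal(np.log(150.0), 0.3, n_post)
alpha_post = rng.uniform(0.3, 1.0, n_post)

# One randomly chosen posterior sample per synthetic catalogue.
picks = rng.integers(0, n_post, n_cat)
counts_uncertain = np.array(
    [count_events(mu_post[k], alpha_post[k], horizon, rng) for k in picks]
)

# Same number of catalogues with a single fixed parameter set, for comparison.
mu_fix, alpha_fix = np.median(mu_post), np.median(alpha_post)
counts_fixed = np.array(
    [count_events(mu_fix, alpha_fix, horizon, rng) for _ in range(n_cat)]
)

print("variance of event counts, sampled parameters:", counts_uncertain.var())
print("variance of event counts, fixed parameters:  ", counts_fixed.var())
```

Comparing the two sets of counts (or the resulting hazard curves) gives a direct view of how much of the spread comes from parameter uncertainty rather than from the renewal process itself.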
Section 3.1: As discussed earlier, please comment on why the 'mu-priors' for some segments are so different to the posteriors (little overlap for Fig 9 panels A, I, K). This is surprising, especially considering that the priors were apparently constructed using the same data used for fitting. To me it suggests weaknesses in how the priors were constructed, or some other problem.
L356-360: This is a very clear description of how the catalogue duration was defined. I suggest you move this to the earlier methods section (where I expressed confusion about the method).
L360 and Figure 10: Regarding the validation of the annual seismic moment release: Considering that the data was used to fit the model, I don't think the observations/model are particularly consistent on segments 3 and 4. In both cases the observed data exceeds the 90th percentile of the model. Again this seems to suggest some underestimation in the model, as discussed repeatedly above. Please check that this is all correct following revisions, and if it is, add a comment explaining why this is nonetheless reasonably consistent.
L379: Figure 11C is not a strong basis for making a point about which segments rupture more or less, because it is only 1 catalogue. Can you please make a figure that better justifies the points made in this paragraph?
Line 447 and Figure 12: The conditional probability of Mw 9.0 (if an earthquake occurs) is larger in the time-independent case. But I doubt that these results will be robust to parameter uncertainties in the time-dependent model, considering that limited data (10 events, which likely have time-varying completeness) were available to fit its many parameters. This further suggests the importance of considering model parameter uncertainties in the PTHA.
Line 451: One factor neglected in this discussion is the effect of using different datasets to fit the 2 models, which could cause differences in the results even if there was no other difference between the two kinds of models. I think the calculations in this paper should be revised so that the time-independent model is informed by the long-term data, and that parameter uncertainties in both the time-independent and time-dependent models are accounted for. In my judgment it is likely that the parameter uncertainties could lead to differences in the results that are substantially larger than differences between the current time-dependent and time-independent models.
# TECHNICAL CORRECTIONS
None for now.
# REFERENCES
Berryman, K.; Wallace, L.; Hayes, G.; Bird, P.; Wang, K.; Basili, R.; Lay, T.; Pagani, M.; Stein, R.; Sagiya, T.; Rubin, C.; Barreintos, S.; Kreemer, C.; Litchfield, N.; Stirling, M.; Gledhill, K.; Haller, K. & Costa, C. The GEM Faulted Earth Subduction Interface Characterisation Project: Version 2.0 – April 2015 GEM, GEM, 2015
Davies, G.; Griffin, J.; Løvholt, F.; Glimsdal, S.; Harbitz, C.; Thio, H. K.; Lorito, S.; Basili, R.; Selva, J.; Geist, E. & Baptista, M. A. A global probabilistic tsunami hazard assessment from earthquake sources Geological Society, London, Special Publications, Geological Society of London, 2017
Davies, G. & Griffin, J. Sensitivity of Probabilistic Tsunami Hazard Assessment to Far-Field Earthquake Slip Complexity and Rigidity Depth-Dependence: Case Study of Australia Pure and Applied Geophysics, 2020, 177, 1521–1548
Gibbons, S. J.; Lorito, S.; Macías, J.; Løvholt, F.; Selva, J.; Volpe, M.; Sánchez-Linares, C.; Babeyko, A.; Brizuela, B.; Cirella, A.; Castro, M. J.; de la Asunción, M.; Lanucara, P.; Glimsdal, S.; Lorenzino, M. C.; Nazaria, M.; Pizzimenti, L.; Romano, F.; Scala, A.; Tonini, R.; Manuel González Vida, J. & Vöge, M. Probabilistic Tsunami Hazard Analysis: High Performance Computing for Massive Scale Inundation Simulations. Frontiers in Earth Science, 2020, 8, 623
Griffin, J. D.; Stirling, M. W. & Wang, T. Periodicity and Clustering in the Long-Term Earthquake Record Geophysical Research Letters, American Geophysical Union (AGU), 2020, 47
Horspool, N.; Pranantyo, I.; Griffin, J.; Latief, H.; Natawidjaja, D. H.; Kongko, W.; Cipta, A.; Bustamam, B.; Anugrah, S. D. & Thio, H. K. A probabilistic tsunami hazard assessment for Indonesia Natural Hazards and Earth System Sciences, 2014, 14, 3105–3122
Rong, Y.; Jackson, D. D. & Kagan, Y. Y. Seismic gaps and earthquakes Journal of Geophysical Research: Solid Earth, WileyBlackwell, 2003, 108
Weichert, D. H. Estimation of the earthquake recurrence parameters for unequal observation periods for different magnitudes Bulletin of the Seismological Society of America, 1980, 70, 13371346
Citation: https://doi.org/10.5194/nhess202259RC2 
AC2: 'Reply on RC2', Ario Muhammad, 19 Jul 2022
We thank the reviewer for providing significant comments to improve our work.
However, because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we have attached them as a PDF supplement that addresses all of the general and detailed comments from Reviewer 2.

RC3: 'Comment on nhess202259', Anonymous Referee #3, 23 Mar 2022
Review of Muhammad et al
This paper presents a methodology for time-dependent probabilistic tsunami hazard analysis with stochastic earthquake rupture modelling, using the Mentawai region of the Sunda Subduction Zone as a case study. This is a novel and ambitious approach, and it is exciting to see the efforts made by the authors. In my view, the complexity of the model does, however, pose some challenges, and I think there are a number of points that require further justification and/or consideration of the choices made in the model. I expect this will require some effort to revise the model.
Major comments
Justification of the choice of a time-dependent approach. A number of recent studies of global paleoearthquake records (Williams et al 2019; Griffin et al 2020; Moernaut 2020) have, to varying degrees, provided empirical support for weakly quasiperiodic earthquake recurrence as a general model, which can be used to justify the use of renewal models for hazard assessment. That said, the Mentawai record of Philibosian et al (2017) looks to be more random than quasiperiodic in the analysis presented by Griffin et al (2020), although perhaps a different result might be obtained using the segmentation model presented here. The posterior BPT parameter estimates given for each segment are also relevant – some give values of alpha ~1 (segments 2, 3 and 4), implying random recurrence (i.e. Poisson), while others are ~0.6 (segments 1, 5 and 6), implying moderately quasiperiodic recurrence. So, I think some comment needs to be made here that:
 At a global scale there is empirical support for weakly quasiperiodic earthquake recurrence as a general model (see Griffin et al 2020);
 Excluding the hypothesis at the individual fault level is difficult, particularly for short records (Williams et al 2019; Griffin et al 2020);
 The data from Philibosian et al (2017) are somewhat equivocal about whether earthquake recurrence here is truly time-dependent, and the Poisson hypothesis cannot be confidently excluded using these data. But the global studies mentioned above suggest it is not unreasonable to assume time-dependence as a hypothesis.
The discussion section of the paper could then discuss the implications of this assumption in light of the different values of alpha obtained for each segment.
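To illustrate why the value of alpha matters, the following minimal Python sketch compares the conditional rupture probability under a BPT renewal model with the time-independent Poisson probability. It uses the standard closed-form BPT (inverse Gaussian) cumulative distribution. The mean recurrence interval `MU`, the forecast `WINDOW` and the elapsed times are hypothetical values chosen only for illustration; they are not taken from the manuscript or from the Mentawai record.

```python
import math


def norm_cdf(x):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def bpt_cdf(t, mu, alpha):
    """CDF of the Brownian passage-time distribution (mean mu, aperiodicity alpha)."""
    if t <= 0.0:
        return 0.0
    u1 = (math.sqrt(t / mu) - math.sqrt(mu / t)) / alpha
    u2 = (math.sqrt(t / mu) + math.sqrt(mu / t)) / alpha
    return norm_cdf(u1) + math.exp(2.0 / alpha**2) * norm_cdf(-u2)


def bpt_conditional(elapsed, window, mu, alpha):
    """P(rupture within `window` years | no rupture during `elapsed` years)."""
    f0 = bpt_cdf(elapsed, mu, alpha)
    f1 = bpt_cdf(elapsed + window, mu, alpha)
    return (f1 - f0) / (1.0 - f0)


def poisson_probability(window, mu):
    """Time-independent probability of at least one event in `window` years."""
    return 1.0 - math.exp(-window / mu)


MU = 200.0     # hypothetical mean recurrence interval (years)
WINDOW = 30.0  # hypothetical forecast window (years)

for alpha in (0.6, 1.0):
    for elapsed in (50.0, 200.0, 400.0):
        p = bpt_conditional(elapsed, WINDOW, MU, alpha)
        print(f"alpha={alpha:.1f} elapsed={elapsed:5.0f} yr  P={p:.3f}")
print(f"Poisson reference: P={poisson_probability(WINDOW, MU):.3f}")
```

With these illustrative inputs, the alpha = 1 probabilities lie closer to the Poisson reference than the alpha = 0.6 probabilities once the elapsed time reaches the mean interval, which is the sense in which alpha ~1 behaves almost time-independently.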
In estimating parameters for the BPT distribution, the authors use the data to estimate the prior distribution of mu, before then using the same data to calculate the posterior probability distribution of mu. This is incorrect, because the data are then counted twice. I would suggest using an uninformative prior (e.g. as used by Fitzenz et al 2010). An alternative approach could be to use an informative prior for mu based on the slip rate (e.g. as determined from geodesy), but this may become complex (e.g. due to having to estimate coupling of the fault). The 450-year record is short for accurately estimating model parameters. This is, of course, what a Bayesian approach should be helping with, but it needs more care in the choice of priors.
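One way to keep the data out of the prior is sketched below: a grid-based posterior for the BPT parameters in which the prior is flat over a bounded grid, so the inter-event times enter only through the likelihood. This is an assumption-laden illustration, not the authors' procedure and not necessarily the prior used by Fitzenz et al (2010). The `INTERVALS` values are placeholders rather than the Mentawai record, the grid bounds are arbitrary, and the open interval since the last event is ignored for brevity.

```python
import math


def bpt_log_pdf(t, mu, alpha):
    """Log density of the BPT (inverse Gaussian) distribution at t > 0."""
    return (0.5 * math.log(mu / (2.0 * math.pi * alpha**2 * t**3))
            - (t - mu) ** 2 / (2.0 * mu * alpha**2 * t))


def grid_posterior(intervals, mu_grid, alpha_grid):
    """Posterior over (mu, alpha) on a grid, assuming a flat prior on the grid."""
    log_post = {
        (mu, alpha): sum(bpt_log_pdf(t, mu, alpha) for t in intervals)
        for mu in mu_grid
        for alpha in alpha_grid
    }
    peak = max(log_post.values())  # subtract the peak to avoid underflow
    weights = {key: math.exp(value - peak) for key, value in log_post.items()}
    total = sum(weights.values())
    return {key: weight / total for key, weight in weights.items()}


def marginal_mu(posterior):
    """Sum the joint posterior over alpha to get the marginal for mu."""
    marginal = {}
    for (mu, _alpha), prob in posterior.items():
        marginal[mu] = marginal.get(mu, 0.0) + prob
    return marginal


# Placeholder inter-event times in years (hypothetical, for illustration only).
INTERVALS = [180.0, 230.0, 150.0]
MU_GRID = [50.0 + 10.0 * i for i in range(96)]    # 50 to 1000 years
ALPHA_GRID = [0.1 + 0.05 * i for i in range(29)]  # 0.1 to 1.5

posterior = grid_posterior(INTERVALS, MU_GRID, ALPHA_GRID)
mu_marginal = marginal_mu(posterior)
mu_mean = sum(mu * prob for mu, prob in mu_marginal.items())
mu_mode, alpha_mode = max(posterior, key=posterior.get)
print(f"posterior mean of mu: {mu_mean:.0f} yr")
print(f"joint posterior mode: mu={mu_mode:.0f} yr, alpha={alpha_mode:.2f}")
```

An informative prior, for example one derived from slip rate, could be added in this sketch by adding its log density to each grid cell before normalisation.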
I am also concerned that fitting the model parameters to each segment individually is problematic. Later you consider multi-segment ruptures, and it is not clear how all this fits together. Do the recurrence statistics obtained from the sum of all synthetic ruptures across all segments match the recurrence statistics from the sum of all historic/paleo ruptures in your data? Checking this could be a good test for your model.
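The check suggested above could be sketched as follows. The catalogue layout (a dictionary mapping a segment id to a list of rupture years, with a multi-segment rupture listed under each segment it breaks) is an assumption made for this illustration, as are the example years; neither comes from the manuscript.

```python
import statistics


def event_times(catalogue):
    """Collapse a per-segment catalogue into sorted, unique event years.

    A multi-segment rupture appears under several segments with the same
    year and is therefore counted once.
    """
    return sorted({year for years in catalogue.values() for year in years})


def recurrence_stats(times):
    """Mean and coefficient of variation of the intervals between events."""
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    mean = statistics.mean(intervals)
    cov = statistics.stdev(intervals) / mean
    return mean, cov


# Hypothetical catalogues (illustrative years only).
example_record = {1: [100, 310], 2: [100, 250, 310], 3: [420]}
example_synthetic = {1: [90, 300], 2: [90, 260], 3: [260, 430]}

for label, catalogue in (("record", example_record),
                         ("synthetic", example_synthetic)):
    mean, cov = recurrence_stats(event_times(catalogue))
    print(f"{label}: mean interval={mean:.0f} yr, CoV={cov:.2f}")
```

In practice the synthetic statistics would be computed for each simulated catalogue, so that the values from the record can be placed within the simulated distribution rather than compared with a single number.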
Also related to parameter estimation, some of the posterior histograms seem a bit spiky; does this improve if the number of samples is increased beyond 10,000?
Spatiotemporal completeness of the paleo record compared with the instrumental record is an issue that I think could lead to biases in the parameter estimates. It is very unlikely that events similar to the Mw 7.8 2010 Mentawai event would be visible in the coral record; this event occurred near the trench and caused <4 cm subsidence on the Mentawai Islands as measured with GPS (Hill et al 2012). Related to the above, the Mmin of 7.6 (L129), while reasonable from a tsunami hazard assessment perspective, would mean that you are modelling events that are unlikely to be present in the paleoearthquake record. I am unsure how the frequency of these events could be determined in the time-dependent approach. It therefore seems likely that, in your current approach, smaller events are missing from the paleoearthquake record, which would affect the recurrence model parameters.
The 1D rupture segmentation is a problem for tsunami hazard assessment, as the resulting tsunami size depends so strongly on the depth of rupture. Compare the 2007 Bengkulu earthquakes (Mw 8.4 and 7.9), which were downdip of the trench and did not generate a significant tsunami, with the 2010 Mentawai earthquake (Mw 7.8), which occurred near the trench and did generate a significant tsunami. It is not clear whether such events are discriminated by the stochastic modelling approach with 1D segmentation – it seems they probably aren’t, but I may not be understanding correctly. A related problem is low rigidity near the trench and its tsunamigenic potential, as in the 2010 Mentawai tsunami. How might the assumption of constant (and relatively high) rigidity (L309–310) bias your tsunami hazard results?
The maximum magnitude of 9.0 seems too low, which seems related to the segmentation model. If the potential for ruptures connecting with other segments of the Sunda Subduction Zone is considered, then larger Mmax values are justified. Significantly larger Mmax values were used in Horspool et al (2014). Even if the paleoearthquake record for the past 450 years suggests events haven’t exceeded Mw 9.0, we also don’t expect events of this magnitude to occur all that often, so the model should allow for the possibility that they are missing from the record.
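For a sense of scale, the short sketch below shows how quickly seismic moment grows with Mmax, using the standard moment magnitude scaling in which log10 of the moment increases by 1.5 per magnitude unit. The alternative Mmax values are hypothetical examples, not values taken from Horspool et al (2014) or from the manuscript.

```python
def moment_ratio(mw_large, mw_small):
    """Ratio of seismic moments implied by two moment magnitudes.

    Follows from log10(M0) = 1.5 * Mw + constant, so the constant cancels.
    """
    return 10.0 ** (1.5 * (mw_large - mw_small))


# Hypothetical alternative Mmax values compared with the Mw 9.0 used in the paper.
for mmax in (9.2, 9.5):
    ratio = moment_ratio(mmax, 9.0)
    print(f"Mw {mmax} releases about {ratio:.1f} times the moment of Mw 9.0")
```

An increase of only 0.2 magnitude units roughly doubles the seismic moment, so the choice of Mmax has a large effect on the upper tail of the source scenarios.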
Some areas of the coast of Padang show zero probability of inundation (Figure 17), while in others the potential inundation extent extends quite a way inland. This raises some significant concerns for me about the quality of the inundation modelling and/or the elevation data used, given how low-lying the coast is in this area. If only SRTM data were used, this could significantly underestimate inundation extent (see Griffin et al 2015, Figure 8). Are buildings included in the elevation model?
Detailed comments
L15: Suggest changing ‘A total of >’ to ‘More than’
L18: Forecast periods begin in what year?
L136: Choice of BPT is fine, but hasn’t really been justified here. Why is this chosen over lognormal, Weibull or Gamma? Some of your justification seems to be presented later in Section 2.2.
L174: Several thousand years
L185: Perhaps rephrase as ‘reflects the expectations of elastic rebound theory’, or similar.
L192: Should probably cite others who’ve used Bayesian approaches to fitting time-dependent models to earthquake records, in particular Rhoades et al (1994) and Fitzenz et al (2010).
L197 and Table 1: These should not be referred to as tsunamigenic. For half of them we have no information on whether a tsunami was generated; coseismic deformation on the Mentawai Islands observed in coral paleogeodetic records suggests they probably were, but we don’t actually know.
L324. Please give a link or citation for DEM5 and Bathy5.
L332: Might be a typo here – Griffin et al (2016) used a Manning’s roughness of 0.036 as a conservative minimum for land (grassland; for the Mentawai Islands). For the urban context here, 0.06 may be reasonable, e.g. Griffin et al (2015) suggested a Manning’s roughness of 0.08 for the city of Padang. See also Kaiser et al (2011) for a discussion of choice of Manning’s n.
Lines 501–502: The time-independent model has too low an Mmax (9.0) to be considered worst-case. See earlier comments about choice of Mmax.
Table 1: Change Shieh to Sieh.
Figure 3, and also in the text. I do not think the term ‘occurred’ scenarios is the best terminology. These are modelled scenarios that have not actually occurred.
References:
Fitzenz, D. D., Ferry, M. A., & Jalobeanu, A. (2010). Long-term slip history discriminates among occurrence models for seismic hazard assessment. Geophysical Research Letters, 37(20), 1–5. https://doi.org/10.1029/2010GL044071
Griffin, J., Latief, H., Kongko, W., Harig, S., Horspool, N., Hanung, R., Rojali, A., Maher, N., Fuchs, A., Hossen, J., & others. (2015). An evaluation of onshore digital elevation models for modeling tsunami inundation zones. Frontiers in Earth Science, 3(32).
Griffin, J. D., Stirling, M. W., & Wang, T. (2020). Periodicity and Clustering in the Long-Term Earthquake Record. Geophysical Research Letters, 47(22). https://doi.org/10.1029/2020GL089272
Horspool, N., Pranantyo, I., Griffin, J., Latief, H., Natawidjaja, D. H., Kongko, W., Cipta, A., Bustaman, B., Anugrah, S. D., & Thio, H. K. (2014). A probabilistic tsunami hazard assessment for Indonesia. Natural Hazards and Earth System Science, 14(11), 3105–3122.
Hill, E. M., Borrero, J. C., Huang, Z., Qiu, Q., Banerjee, P., Natawidjaja, D. H., Elosegui, P., Fritz, H. M., Suwargadi, B. W., Pranantyo, I. R., Li, L. L., Macpherson, K. A., Skanavis, V., Synolakis, C. E., & Sieh, K. (2012). The 2010 Mw 7.8 Mentawai earthquake: Very shallow source of a rare tsunami earthquake determined from tsunami field survey and near-field GPS data. Journal of Geophysical Research: Solid Earth, 117(6), 1–21. https://doi.org/10.1029/2012JB009159
Kaiser, G., Scheele, L., Kortenhaus, A., Løvholt, F., Römer, H., & Leschka, S. (2011). The influence of land cover roughness on the results of high resolution tsunami inundation modeling. Natural Hazards and Earth System Sciences, 11(9), 2521–2540.
Moernaut, J. (2020, November 1). Time-dependent recurrence of strong earthquake shaking near plate boundaries: A lake sediment perspective. Earth-Science Reviews. Elsevier B.V. https://doi.org/10.1016/j.earscirev.2020.103344
Rhoades, D. A., & Van Dissen, R. J. (2003). Estimates of the time-varying hazard of rupture of the Alpine Fault, New Zealand, allowing for uncertainties. New Zealand Journal of Geology and Geophysics, 46(4), 479–488. https://doi.org/10.1080/00288306.2003.9515023
Williams, R. T., Davis, J. R., & Goodwin, L. B. (2019). Do Large Earthquakes Occur at Regular Intervals Through Time? A Perspective From the Geologic Record. Geophysical Research Letters, 46(14), 8074–8081. https://doi.org/10.1029/2019GL083291
Citation: https://doi.org/10.5194/nhess202259RC3 
AC3: 'Reply on RC3', Ario Muhammad, 19 Jul 2022
We thank the reviewer for providing significant comments to improve our work.
However, because our responses to the reviewer's comments are extensive and include several figures and long passages of text, we have attached them as a PDF supplement that addresses all of the general and detailed comments from Reviewer 3.