The present work proposes a simulation-based Bayesian method for parameter estimation and fragility model selection for mutually exclusive and collectively exhaustive (MECE) damage states. This method uses an adaptive Markov chain Monte Carlo simulation (MCMC) based on likelihood estimation using point-wise intensity values. It identifies the simplest model that fits the data best, among the set of viable fragility models considered. The proposed methodology is demonstrated for empirical fragility assessments for two different tsunami events and different classes of buildings with varying numbers of observed damage and flow depth data pairs. As case studies, observed pairs of data for flow depth and the corresponding damage level from the South Pacific tsunami on 29 September 2009 and the Sulawesi–Palu tsunami on 28 September 2018 are used. Damage data related to a total of five different building classes are analysed. It is shown that the proposed methodology is stable and efficient for data sets with a very low number of damage versus intensity data pairs and cases in which observed data are missing for some of the damage levels.

Fragility models express the probability of exceeding certain damage thresholds for a given level of intensity for a specific class of buildings or infrastructure. Empirical fragility curves are models derived from observed pairs of damage and intensity data for buildings and infrastructure usually collected, acquired and even partially simulated in the aftermath of disastrous events. Some examples of empirical fragility models are seismic fragility (Rota et al., 2009; Rosti et al., 2021), tsunami fragility (Koshimura et al., 2009a; Reese et al., 2011; a comprehensive review can be found in Charvet et al., 2017), flooding fragility (Wing et al., 2020) and debris flow fragility curves (Eidsvig et al., 2014). Empirical fragility modelling is greatly affected by how the damage and intensity parameters are defined. Mutually exclusive and collectively exhaustive (MECE, see next section for the definition) damage states are quite common in the literature as discrete physical damage states. The MECE condition is necessary for damage states in most probabilistic risk formulations, leading to the mean rate of exceeding loss (e.g. Behrens et al., 2021).

Tsunami fragility curves usually employ the tsunami flow depth as the measure of intensity, although some studies also use other measures such as current velocity (e.g. De Risi et al., 2017b; Charvet et al., 2015). Charvet et al. (2015) demonstrate that the flow depth may cease to be an appropriate measure of intensity for higher damage states, for which other parameters such as the current velocity, debris impact and scour can become increasingly important. De Risi et al. (2017b) developed bivariate tsunami fragilities, which account for the interaction between two intensity measures: tsunami flow depth and current velocity.

Early procedures for empirical tsunami fragility curves used data binning to represent the intensity. For example, Koshimura et al. (2009b) binned the observations by the intensity measure, i.e. the flow depth; however, the latest procedures have mostly used point-wise intensity estimates instead.

Fragility curves for MECE damage states are distinguished by their nicely “laminar” shape; in other words, the curves should not intersect. When fitting empirical fragility curves to observed damage data, this condition is not satisfied automatically. In practice, fragility curves are usually fitted for individual damage states separately and then filtered to remove the crossing fragility curves (e.g. Miano et al., 2020), or ordered (“parallel”) fragility models are used from the start (Charvet et al., 2014; Lahcene et al., 2021). Charvet et al. (2014) and De Risi et al. (2017a) also used partially ordered models to derive fragility curves for MECE damage states. They used the multinomial probability distribution to model the probability of being in any of the MECE damage states based on a binned intensity representation. De Risi et al. (2017a) used Bayesian inference to derive the model parameters for an ensemble of fragility curves.

Empirical tsunami fragility curves are usually constructed using generalised
linear models based on probit, logit or the complementary loglog link
functions (Charvet et al., 2014; Lahcene et al., 2021). As far as the
assessment of the goodness of fit, model comparison and selection are
concerned, approaches based on the likelihood ratio and Akaike information
criterion (e.g. Charvet et al., 2014; Lahcene et al., 2021) and on

This paper presents a simulation-based Bayesian method for inference and model class selection for the ensemble modelling of the tsunami fragility curves for MECE damage states for a given class of buildings. By fitting the (positive definite) fragility link function to the conditional probability of being in a certain damage state, given that the building is not in any of the preceding states, the method ensures that the fragility curves do not cross (i.e. they are “hierarchical” as in De Risi et al., 2017a). The method uses adaptive Markov chain Monte Carlo simulation (MCMC, Beck and Au, 2002), based on likelihood estimation using point-wise intensity values, to infer the ensemble of the fragility model parameters. Alternative link functions are compared based on log evidence, which considers both the average goodness of fit (based on log likelihood) and the model parsimony (based on relative entropy). This way, among the set of viable models considered, it identifies the simplest model that fits the data best. By “simplest model”, we mean the model having maximum relative entropy (measured using the Kullback–Leibler distance; Kullback and Leibler, 1951) with respect to the data. This usually means the model has a small number of parameters.
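To illustrate why modelling conditional probabilities prevents crossing, the following minimal sketch builds exceedance curves as running products of conditional probabilities. It is an illustration under stated assumptions, not the implementation used in this work: it parameterises the conditional probability of exceeding a threshold given that the previous threshold is exceeded (the complement of the conditional probability of remaining in the lower state), it assumes a probit link with a linear predictor on the logarithm of the flow depth, and the function names and parameter values are hypothetical.

```python
import math


def probit(z):
    """Standard normal CDF, used here as the inverse probit link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_exceedance(im, a0, a1):
    """P(threshold j exceeded | threshold j-1 exceeded, IM = im).

    The linear predictor on ln(IM) is an assumption of this sketch.
    """
    return probit(a0 + a1 * math.log(im))


def hierarchical_fragility(im, params):
    """Exceedance probabilities as running products of conditional terms."""
    curves, running = [], 1.0
    for a0, a1 in params:
        running *= conditional_exceedance(im, a0, a1)  # each factor is <= 1
        curves.append(running)
    return curves


# Hypothetical parameters for three damage thresholds (not fitted to any data).
PARAMS = [(0.5, 1.2), (-0.2, 0.9), (-0.8, 1.5)]
DEPTHS = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]  # flow depths in metres
TABLE = [hierarchical_fragility(h, PARAMS) for h in DEPTHS]

for h, row in zip(DEPTHS, TABLE):
    print(h, [round(p, 3) for p in row])
```

Because every factor lies between 0 and 1, the curve for a higher threshold can never rise above the curve for a lower one at any intensity, whatever the parameter values.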

The intensity measure, IM, (or simply “intensity”, e.g. the tsunami flow
depth) refers to a parameter used to convey information about an event from
the hazard level to the fragility level – it is an intermediate variable.
The damage parameter,

Graphical representation of damage levels

Damage states

The proposed methodology herein is also applicable to fragility assessment
in cases where observed damage data are not available for some of the damage
levels. Let

Generalised linear models (GLMs) are more suitable for empirical fragility assessment than standard regression models. This is mainly because the dependent variable in a GLM can be a Bernoulli binary variable (i.e. it takes only two possible values: 0 or 1). Bernoulli variables are well suited to recording whether a specific damage level is exceeded or not. In the following, fragility assessment based on GLMs is briefly described.
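As a hedged illustration of a Bernoulli GLM evaluated at point-wise intensity values, the sketch below defines the inverse probit, logit and complementary loglog links and the corresponding log likelihood. The linear predictor on the logarithm of the flow depth, the function names, the observations and the coarse grid search (a crude stand-in for a proper GLM fitting routine) are all assumptions made for illustration only.

```python
import math


def inv_probit(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inv_logit(z):
    return 1.0 / (1.0 + math.exp(-z))


def inv_cloglog(z):
    return 1.0 - math.exp(-math.exp(z))


INVERSE_LINKS = {"probit": inv_probit, "logit": inv_logit, "cloglog": inv_cloglog}


def bernoulli_log_likelihood(link, a0, a1, depths, exceeded):
    """Sum of log p or log(1 - p) over point-wise (depth, 0/1) pairs."""
    inv = INVERSE_LINKS[link]
    total = 0.0
    for h, y in zip(depths, exceeded):
        p = inv(a0 + a1 * math.log(h))
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if y else math.log(1.0 - p)
    return total


def grid_mle(link, depths, exceeded):
    """Coarse grid search over (a0, a1); illustrative only."""
    best = None
    for i in range(-40, 41):
        for j in range(0, 61):
            a0, a1 = i * 0.1, j * 0.1
            ll = bernoulli_log_likelihood(link, a0, a1, depths, exceeded)
            if best is None or ll > best[0]:
                best = (ll, a0, a1)
    return best


# Hypothetical observations: flow depth (m) and whether the threshold was exceeded.
DEPTHS = [0.3, 0.6, 0.9, 1.4, 2.0, 2.8, 3.5, 4.2]
EXCEEDED = [0, 0, 1, 0, 1, 1, 1, 1]

FITS = {name: grid_mle(name, DEPTHS, EXCEEDED) for name in INVERSE_LINKS}
for name, (ll, a0, a1) in FITS.items():
    print(name, round(ll, 3), round(a0, 1), round(a1, 1))
```

Each observation contributes individually to the likelihood, so no binning of the intensity is required.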

The term

For each damage threshold, fragility can be obtained for a desired building
class considering that the damage data provide Bernoulli variables (binary
values) of whether the considered damage level was exceeded or not for given IM levels. For damage threshold

In the following, we have referred to the general methodology of fitting
a fragility model to data – one damage state at a time – the “

Equation (2) for

We use the Bayesian model class selection (BMCS) herein to identify the best
link model to use in the generalised linear regression scheme. However, the
procedure is general and can be applied to a more diverse pool of candidate
fragility models. BMCS (or model comparison) is essentially Bayesian
updating at the model class level to make comparisons among candidate model
classes given the observed data (e.g. Beck and Yuen, 2004; Muto and Beck,
2008). Given a set of

The likelihood

For each realisation of the vector of model parameters

The central South Pacific region-wide tsunami was triggered by an
unprecedented earthquake doublet (

The classification of damage thresholds (the damage scale) used for the 2009 South Pacific tsunami (from Reese et al., 2011).

On Friday 28 September 2018, at 18:02 local time, a shallow strike-slip
earthquake of moment magnitude

The classification of damage thresholds (the damage scale) used for the 2018 Sulawesi–Palu tsunami (from Paulik et al., 2019).

Table 3 illustrates the building classes, for which fragility curves are
obtained based on the proposed procedure and based on the databases related
to the two tsunami events described above. The taxonomy used for describing
the building class matches the original description used in the raw
databases. The building classes considered include both classes with a
large number of data points, e.g.
brick masonry 1 storey (South Pacific) and non-engineered brick masonry 1 storey (Sulawesi), and classes with few data points, e.g. timber
residential (South Pacific) and non-engineered masonry 2 storeys and timber
(Sulawesi). The fifth column in the table shows the proportion of the
number of damage levels for which observed data are available

The building classes.

For each building class, we considered the set of candidate
models consisting of the fragility models resulting from the three
alternative link functions used in the generalised linear regression in
Eq. (5). That is,

The first step towards fragility assessment by employing the MLE method (see
Sect. 2.3) is to define the vector of model parameters

The vectors defining the MLE of the model parameters,

The model parameters

In the first step, the model parameters are estimated for each model class
separately. For each model class

The RF curves derived from the hierarchical fragility curves (see Sect. 2.5) and the corresponding plus/minus 1 standard deviation (

Building class 1 (brick masonry residential, 1 storey) of the South Pacific 2009 tsunami considering fragility model class

Building class 2 (timber residential) of the South Pacific 2009 tsunami considering fragility model class

Building class 1 (non-engineered masonry, unreinforced
with clay brick, 1 storey) of the Sulawesi–Palu 2018 tsunami considering
fragility model class

Building class 2 (non-engineered masonry, unreinforced with clay brick, 2 storey) of the Sulawesi–Palu 2018 tsunami considering fragility model class

Building class 3 (non-engineered light timber) of the Sulawesi–Palu 2018 tsunami considering fragility model class

The equivalent lognormal parameters and the epistemic
uncertainty in the RF assessment for all the building classes, damage
thresholds, and model classes

With reference to Eq. (12), the

Given the samples generated from the joint posterior PDFs

For instance, for Class 1 (masonry residential) for the South Pacific tsunami,
model class

Bayesian model class selection results for empirical fragility models. The best model for each building class is shown in bold.

In the basic method (see Sect. 2.2), the fragility

The model parameters

Figure 7 compares the fragility assessment obtained based on MLE-based
hierarchical fragility modelling (see also the MLE-based curves in Figs. 2b
to 6b) with the result of the fragility assessment by employing the
MLE-

Comparison between the fragility assessment by
MLE-based hierarchical fragility modelling and MLE-

Table 8 reports the fragility assessment parameters of the MLE and
MLE-

Comparison between fragility assessment based on MLE method (by hierarchical fragility modelling) and the MLE-

The results outlined in this section show a fragility assessment for two different data sets corresponding to damage observed in the aftermath of the South Pacific and Sulawesi–Palu tsunami events. We have demonstrated the versatility of the proposed workflow and tool for hierarchical fragility assessment both for cases in which a large number of data points are available (e.g. Class 1, brick masonry residential, South Pacific tsunami; Class 1, one-storey non-engineered masonry, Sulawesi–Palu tsunami) and cases where very few data points are available (e.g. Class 2, timber residential, South Pacific tsunami; Class 3, non-engineered light timber, Sulawesi–Palu tsunami). Moreover, we demonstrated how the proposed workflow avoids crossing fragility curves (e.g. Class 2, timber residential, South Pacific tsunami; Class 3, non-engineered light timber, Sulawesi–Palu tsunami). The results illustrated for the five building classes demonstrate that the proposed workflow for hierarchical fragility assessment can be applied in cases in which data points are not available for all the damage levels within the damage scale.

An integrated procedure based on Bayesian model class selection (BMCS) for
empirical hierarchical fragility modelling for a class of buildings or
infrastructure is presented. This procedure is applicable to fragility
modelling for any type of hazard as long as the damage scale consists of
mutually exclusive and collectively exhaustive (MECE) damage states, and the
observed damage data points are independent. This simulation-based procedure
can (1) perform hierarchical fragility modelling for MECE damage states, (2) estimate the confidence interval for the resulting fragility curves, and (3) select the simplest model that fits the data best (i.e. maximises log
evidence) amongst a suite of candidate fragility models (herein, alternative
link functions for generalised linear regression are considered). The
proposed procedure is demonstrated for empirical fragility assessment based
on observed damage data to masonry residential (1 storey) and timber
residential buildings due to the 2009 South Pacific tsunami in American
Samoa and the Samoan Islands and non-engineered masonry buildings (1 and 2 storeys) and non-engineered light timber buildings due to the 2018
Sulawesi–Palu tsunami. It is observed that

For each model class, the same set of simulation realisations is used to estimate the fragility parameters, the confidence band and the log evidence. The log evidence consists of two terms, one describing the goodness of fit and the other the information gain between the posterior distribution resulting from the observed data and the prior distribution. It is used to compare the candidate fragility models and to identify the model that maximises the evidence.

Hierarchical fragility assessment can also be performed using maximum likelihood estimation (MLE) and the available statistical toolboxes (e.g. MATLAB's generalised linear model). For each damage level, the reference domain should be the subset of data that exceeds the consecutive lower damage level, rather than the entire set of data points. Note that the basic fragility estimation (“MLE-

The procedure is also applicable to cases in which observed data are available only for a subset of the damage levels within the damage scale. The number of fragility curves equals the total number of damage levels for which data are available minus one. This means that, to obtain at least one fragility curve, data must be available for at least two damage levels.

Although the resulting fragility curves are not strictly lognormal, equivalent lognormal statistics summarise them quite well: the median and logarithmic dispersion describe each fragility curve, and a further logarithmic dispersion describes the corresponding epistemic uncertainty.

The proposed BMCS method and the one based on MLE lead to essentially the same set of parameter estimates for hierarchical fragility estimation. However, the latter does not readily lead to the confidence band and log evidence.

Using the basic method for fragility estimation (MLE-

The major improvement offered by this method is in providing a tool that can fit fragility curves to a set of hierarchical levels of damage or loss in an ensemble manner. This method, which starts from prescribed fragility models and explicitly ensures the hierarchical relation between the damage levels, is very robust in cases where few data points are available and/or where data are missing for some of the damage levels. This tool provides confidence bands for the fragility curves and performs model selection among a set of viable link functions for generalised regression. It is important to note that the proposed method is in general applicable to hierarchical vulnerability modelling for human or economic loss levels and to different types of hazards if (1) the defined levels are mutually exclusive and collectively exhaustive, and (2) a suitable intensity measure (IM) can be identified.
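The two-term structure of the log evidence used for model selection (average goodness of fit minus information gain from prior to posterior) can be checked numerically on a toy problem. The sketch below is an assumption-laden illustration rather than the fragility model itself: it uses a hypothetical single-parameter Bernoulli model with a uniform prior, for which the evidence is known in closed form, and evaluates both terms by midpoint-rule integration on a grid.

```python
import math


def log_evidence_terms(k, n, grid=20000):
    """Return (log evidence, mean log likelihood, information gain).

    Toy model: k exceedances in n independent Bernoulli trials with a
    uniform prior on the exceedance probability theta in (0, 1).
    """
    d = 1.0 / grid
    thetas = [(i + 0.5) * d for i in range(grid)]
    log_lik = [k * math.log(t) + (n - k) * math.log(1.0 - t) for t in thetas]
    # The prior density is 1, so the evidence is the integral of the likelihood.
    evidence = sum(math.exp(v) for v in log_lik) * d
    post = [math.exp(v) / evidence for v in log_lik]  # posterior density
    # Posterior mean of the log likelihood (goodness-of-fit term).
    fit = sum(p * v for p, v in zip(post, log_lik)) * d
    # Kullback-Leibler divergence of the posterior from the uniform prior.
    gain = sum(p * math.log(p) for p in post if p > 0.0) * d
    return math.log(evidence), fit, gain


LOG_EV, FIT, GAIN = log_evidence_terms(k=7, n=10)
# Closed-form log evidence for this toy model: log Beta(k + 1, n - k + 1).
CLOSED_FORM = math.lgamma(8) + math.lgamma(4) - math.lgamma(12)
print(round(LOG_EV, 6), round(FIT - GAIN, 6), round(CLOSED_FORM, 6))
```

The information-gain term is non-negative, so it acts as a penalty: a model whose posterior departs strongly from its prior needs a correspondingly better fit to achieve the same log evidence.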

The probability of being in damage state

The probability of being in damage state

From an information-based point of view, the logarithm of the evidence
(log-evidence), denoted as

Let us assume that the vector of parameters for the

The MCMC simulation scheme has a Markovian nature where the transition from the current state to a new state is done by using a conditional transition function that is conditioned on the current (last) state. Let us assume that the vector of parameters for the

Simulate a

Calculate the acceptance probability

Generate

If
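The steps above can be sketched as follows. This is a minimal illustration of the plain Metropolis–Hastings scheme with a symmetric Gaussian random-walk proposal, not the adaptive variant of Beck and Au (2002) and not the code used in this work. The target density (a toy two-dimensional Gaussian), the step size and all function names are hypothetical.

```python
import math
import random


def metropolis_hastings(log_target, start, step, n_samples, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = list(start)
    current_lp = log_target(current)
    samples, accepted = [], 0
    for _ in range(n_samples):
        # Simulate a candidate conditioned on the current state.
        candidate = [x + rng.gauss(0.0, step) for x in current]
        candidate_lp = log_target(candidate)
        # Symmetric proposal: the acceptance ratio reduces to the target
        # ratio, so any normalising constant of the target cancels.
        log_ratio = candidate_lp - current_lp
        # Generate a uniform number and accept if it falls below the ratio.
        if math.log(1.0 - rng.random()) < min(0.0, log_ratio):
            current, current_lp = candidate, candidate_lp
            accepted += 1
        samples.append(list(current))  # a rejected move repeats the state
    return samples, accepted / n_samples


def log_target(theta):
    """Unnormalised log density of a toy Gaussian centred at (1, -2)."""
    return -0.5 * ((theta[0] - 1.0) ** 2 + (theta[1] + 2.0) ** 2)


SAMPLES, RATE = metropolis_hastings(log_target, [0.0, 0.0], 1.0, 20000)
KEPT = SAMPLES[2000:]  # discard an initial burn-in portion
MEAN = [sum(s[i] for s in KEPT) / len(KEPT) for i in range(2)]
print([round(m, 2) for m in MEAN], round(RATE, 2))
```

Working with log densities avoids numerical underflow when the likelihood is a product over many observations.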

With reference to Eq. (E1), samples from the posterior can be drawn
based on the MH algorithm without any need to define the normalising

The choice of the proposal distribution

The adaptive MH algorithm (Beck and Au, 2002) introduces a sequence of
intermediate candidate evolutionary PDFs that increasingly resemble the
target PDF. Let

The adaptive MCMC procedure for drawing samples of the model parameters
from the joint posterior PDF

Figure F1 illustrates the histograms representing the drawn samples from the
joint posterior PDFs corresponding to the sampled model parameters

Distribution of the fragility model parameters

The code implementing the methodology and data used to produce the results in this article are available at the following URL:

FJ designed and coordinated this research. HE performed the simulations and developed the fragility functions. KT curated the availability of codes and software on the European Tsunami Risk Service (ETRiS). BB provided valuable insights on the damage data gathered for American Samoa and the Samoan Islands in the aftermath of the 2009 South Pacific tsunami (documented in Reese et al., 2011). All authors have contributed to the drafting of the paper. The first two authors contributed equally to the drafting of the paper.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the special issue “Tsunamis: from source processes to coastal hazard and warning”. It is not associated with a conference.

The authors would like to acknowledge partial support from Horizon Europe Project Geo-INQUIRE. Geo-INQUIRE is funded by the European Commission under project no. 101058518 within the HORIZON-INFRA-2021-SERV-01 call. The authors are grateful to the anonymous reviewer and Carmine Galasso for their insightful and constructive comments.

The authors gratefully acknowledge partial support by PRIN-2017 MATISSE (

This research has been partially supported by PRIN-2017 MATISSE (

This paper was edited by Animesh Gain and reviewed by Carmine Galasso and one anonymous referee.