Quantification of uncertainty in rapid estimation of earthquake fatalities based on scenario analysis

Xiaoxue Zhang, Hanping Zhao, Fangping Wang, Zezheng Yan, Sida Cai & Xiaowen Mei
1. Key Laboratory of Environmental Change and Natural Disaster, MOE, Faculty of Geographical Science, Beijing Normal University, Beijing, China;
2. Academy of Disaster Reduction and Emergency Management, Ministry of Civil Affairs & Ministry of Education, Faculty of Geographical Science, Beijing, China;
Correspondence to: Hanping Zhao (zhaohanping@bnu.edu.cn)

Abstract: The rapid estimation of earthquake fatalities using earthquake parameters is the core basis for emergency response. However, numerous factors affect earthquake fatalities, and it is impossible to obtain an accurate estimation result. The key to solving this problem is quantifying the uncertainty. In this paper, we propose a new method to estimate earthquake fatalities and quantify the uncertainty based on basic earthquake emergency scenarios. The accuracy of the model is verified using earthquakes that occurred in recent years. The preliminary analysis and comparison results show that the model is more effective and reasonable than the empirical models used for comparison and can also provide a theoretical basis for post-earthquake emergency response.


Introduction
The most important assessment after a destructive earthquake is the estimation of fatalities (Samardjieva, 2002). However, a field investigation often cannot be conducted quickly because of road damage and communication interruption (Kongar et al. 2015; Yuan and Wang, 2009).
Nevertheless, one can estimate earthquake fatalities within a few minutes using earthquake parameters (such as magnitude, intensity and initial time) (Frolova et al. 2011; Wald et al. 2008).
In addition, it is essential to study the uncertainty of the estimation because various uncontrollable factors enter the estimation process. In this sense, a preliminary estimation of earthquake fatalities with uncertainty analysis, using the available earthquake parameters, is a key step in starting the emergency response.
At present, the methods for estimating earthquake fatalities mainly include analytical, semi-analytical and empirical models. However, the calculations of analytical and semi-analytical models are based on building damage data, which makes them unsuitable for rapid estimation (Li et al. 2015; Weng et al. 2009). During recent years, the empirical model, which depends on statistical analysis of historical loss data, has been widely used in rapid estimation. The empirical model provides an important opportunity to assess the earthquake loss quickly and approximately. Japanese researchers were among the first to study empirical models. Kawasumi (1951) proposed a measure to estimate the danger and expectation of the maximum intensity of destructive earthquakes in Japan.
Similarly, Ohta et al. (1983) developed an empirical relationship for estimating the number of casualties from the number of completely destroyed houses. A more recent attempt was based on an analysis of strong global earthquakes during the twentieth century, which obtained a log-linear relationship for fatalities as a function of magnitude and population density (Samardjieva, 2002). On the basis of Samardjieva's study, Badal et al. (2005) put forward a quantitative earthquake fatality estimation model that considered the mortality rate. Similarly, Nichols and Beavers (2003) studied the earthquake loss catalog of the twentieth century and established a bounding function relating the fatality count to the magnitude. Chen et al. (2005) analyzed earthquake cases on mainland China and developed an empirical equation based on the population density and the relationship between the seismic fatalities and the magnitude. Jaiswal et al. (2009) established a mortality model based on population distribution according to reconstructed earthquake scenarios and studied regional earthquake cases (Jaiswal et al. 2010).

Generally speaking, the current empirical model for fatality estimation is derived from available historical data and relies on parameter regression analysis. Therefore, there are two problems with the empirical model. First, it tends to ignore extreme events when historical data are lacking. Second, most models consider only a few factors and do not account for the interaction between known factors and possible unknown factors. It is therefore essential to establish a new rapid estimation model of earthquake fatalities that can avoid these problems.
The data or processes used in the empirical model contain considerable uncertainty, and the uncertainty in these components is the source of inaccuracy or error in the estimation results (Gardi et al. 2011; Gall et al. 2009; Wirtz et al. 2014). During recent years, the study of uncertainty in the estimation of earthquake fatalities has mainly relied on qualitative description (Romão, 2016), and there is a relative lack of quantitative research. Qualitative description is the most widely used method to describe the uncertainty in disaster estimation (Van Asselt, 2000). However, describing uncertainty in words introduces linguistic uncertainties of vagueness and context, which can make a qualitative description inaccurate. Numerical quantification of uncertainty can support emergency decision making when the information available during estimation is partial or not directly quantifiable. It is imperative to construct a suitable model to quantify the uncertainty in the estimation of earthquake fatalities.
In this paper, we present a new approach to estimate the expectation of earthquake fatalities and quantify the uncertainty in the estimation; the number of fatalities is expressed as a function of the mortality rate and the number of victims. The basic scenarios are constructed using the magnitude, the initial time and the relationship between the epicentral intensity and the epicentral fortification intensity, and these scenarios consider combinations of parameters. This study not only departs from the traditional empirical model form but also quantifies the uncertainty in the estimation results.

Earthquake fatalities in mainland China
In general, historical earthquake fatality and exposure data provide a useful basis for future earthquake fatality estimation. We collected data on destructive earthquakes that occurred on mainland China from 1970 to 2017 as samples. The datasets mainly contain the earthquake parameters (e.g., magnitude, epicentral intensity, epicentral fortification intensity and initial time) and the disaster information (e.g., the number of fatalities and the number of victims); the distribution of the samples is shown in Figure 1. The disaster information was derived from EM-DAT (http://www.emdat.be/), and the earthquake parameters were obtained from PAGER (https://www.pager.com/).

Basic earthquake emergency scenarios
Scholars have discussed the factors that affect earthquake fatalities, which include magnitude, intensity, initial time, population exposure, housing fragility, and individual factors (Oike, 1991; Nichols, 2003). Moreover, scholars have considered as many factors as they can when modelling. However, some errors remain in each model, either because the assumed relational expression between the parameters and the number of fatalities is unsuitable or because some factors cannot yet be measured. Therefore, we hoped to identify the main influencing factors via the analysis of historical data. Basic earthquake emergency scenarios were constructed based on a combination of the main factors. A basic scenario combination can better express the relationship between the parameters and earthquake fatalities. Then, information diffusion theory was used to diffuse the sample data based on the basic scenarios, considering the temporarily non-measurable factors and the extreme event under each scenario.
We collected data on 219 destructive earthquakes that caused casualties in China from 1970 to 2017. Through qualitative analysis of the collected data, the main factors affecting earthquake fatalities were identified. There is an approximately linear relationship between the magnitude and the number of fatalities (Figure 2). As the magnitude increases, the number of fatalities increases. The relationship between the epicentral intensity and the number of fatalities is shown in Figure 3; the epicentral intensity is mapped to the number of fatalities. The relationship between the number of fatalities and the initial time is relatively vague, as shown in Figure 4. However, it is evident that the maximum number of fatalities occurred during the period 21:00-06:00. The initial time of the earthquake influences the in-building ratio, the population exposure and the speed of the escape reaction of indoor personnel (Chen, 1993; Yang et al. 2007).
After analysis, it was found that there was no ideal correspondence between the collapse area and the number of fatalities, as shown in Figure 5. Based on the aforementioned analysis, the magnitude, epicentral intensity and initial time were selected as the main parameters used to establish the basic earthquake emergency scenarios.

Magnitude can be expected to be the most essential factor in determining earthquake fatalities. The magnitude was divided into classes according to the principle of magnitude division in the earthquake emergency programming of China (The National Earthquake Emergency Plan, 2012). On the basis of the magnitude division, the relationship between the epicentral intensity and the fortification intensity was used to indirectly express the building damage information. The relationship between the magnitude ($M$) and the epicentral intensity ($I_0$) is as follows: $M = 0.58 I_0 + 1.5$ (GB/T17742). As the formula shows, when the magnitude is greater than 6, the empirical intensity is greater than 7.75. However, there are few historical earthquakes with a regional fortification intensity greater than 8 in China. Therefore, the basic earthquake emergency scenarios do not consider the scenario with an epicentral intensity less than the epicentral fortification intensity when the magnitude is greater than 6.

In addition, the initial time of the earthquake is an important factor affecting how people react. During early morning or night, most of the population is sleeping in residential buildings; thus, they cannot take protective measures. In contrast, during the day, most of the population is at work. Thus, the initial time was divided into two periods: day (06:00-20:59) and night (21:00-05:59). Finally, the basic earthquake emergency scenarios were constructed based on a combination of the magnitude, intensity, and initial time of the earthquake (Figure 6).
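The scenario construction described above can be illustrated with a minimal Python sketch. The intensity relation and the day/night boundaries come from the text; the function names, the label strings and the `magnitude_bounds` parameter are hypothetical, and the default bound of 6.0 is only the single threshold mentioned explicitly here, not the full magnitude division of the National Earthquake Emergency Plan.

```python
from bisect import bisect_right
from datetime import time


def empirical_epicentral_intensity(magnitude: float) -> float:
    """Invert M = 0.58 * I0 + 1.5 to obtain the empirical epicentral intensity I0."""
    return (magnitude - 1.5) / 0.58


def time_period(initial_time: time) -> str:
    """Day is 06:00-20:59 and night is 21:00-05:59, as stated in the text."""
    return "day" if time(6, 0) <= initial_time < time(21, 0) else "night"


def scenario_key(magnitude, epicentral_intensity, fortification_intensity,
                 initial_time, magnitude_bounds=(6.0,)):
    """Combine magnitude class, intensity relation and time period into a label.

    magnitude_bounds is an assumed placeholder for the class boundaries;
    replace it with the division actually used.
    """
    band = bisect_right(magnitude_bounds, magnitude)  # index of the magnitude class
    relation = ("below" if epicentral_intensity < fortification_intensity
                else "not_below")
    return band, relation, time_period(initial_time)


# Illustrative call with made-up values: an M 6.5 event at 02:30.
print(round(empirical_epicentral_intensity(6.0), 2))
print(scenario_key(6.5, 9, 7, time(2, 30)))
```

Keeping the magnitude boundaries as a parameter avoids hard-coding a division that this sketch cannot confirm.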

Methodology
We needed a functional form that describes the number of fatalities in terms of the number of victims and the mortality rate.
After the earthquake, the China Earthquake Administration rapidly publishes information on the earthquake, including the magnitude, the geographic coordinates of the epicentre, and the source mechanism solution (Wang et al. 2013). The intensity distribution is derived from the earthquake parameter information and the seismic intensity elliptical attenuation model (Wang et al. 2000; Wu et al. 2010). The number of victims is calculated from the area of each intensity zone and the population density. To derive a function for the rapid estimation of earthquake fatalities, one needs a statistical analysis of the mortality rate under each scenario using observations from past earthquakes. The outline of the approach is as follows:

$$D = E(S_k) \sum_{I \le I_{\max}} A_I \, \rho_I \, \alpha_I \qquad (1)$$

where $D$ is the number of fatalities; $E(S_k)$ is the mortality rate expectation of scenario $S_k$; $A_I$ is the affected area of intensity $I$; $I_{\max}$ is the maximum intensity for an earthquake; $\rho_I$ is the population density of intensity $I$; and the parameter $\alpha_I$ is the ratio of the population affected by the earthquake, as determined from the damage degree table provided by the National Disaster Reduction Center (Fan et al., 2008).
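As a rough illustration of this step, the following Python sketch multiplies a scenario's mortality rate expectation by the number of victims summed over intensity zones. It assumes that the victims in each zone are the product of area, population density and affected ratio; the class and function names and all numbers are hypothetical and are not taken from the paper's data.

```python
from dataclasses import dataclass


@dataclass
class IntensityZone:
    intensity: int             # seismic intensity of the zone
    area_km2: float            # affected area of this intensity
    population_density: float  # people per square kilometre
    affected_ratio: float      # share of the population affected (0 to 1)


def estimate_victims(zones):
    """Sum area * density * affected ratio over all intensity zones (assumed form)."""
    return sum(z.area_km2 * z.population_density * z.affected_ratio for z in zones)


def estimate_fatalities(mortality_rate_expectation, zones):
    """Fatalities = scenario mortality rate expectation * number of victims."""
    return mortality_rate_expectation * estimate_victims(zones)


# Made-up zones for illustration only.
zones = [
    IntensityZone(intensity=8, area_km2=10.0, population_density=100.0, affected_ratio=0.5),
    IntensityZone(intensity=7, area_km2=50.0, population_density=80.0, affected_ratio=0.25),
]
print(estimate_victims(zones))
print(estimate_fatalities(0.002, zones))
```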
To obtain the mortality rate function within the framework of the basic earthquake emergency scenarios, we needed to use the observed data of historical earthquakes to compile a mortality rate expectation under each scenario. However, when the samples are divided among the scenarios, the sample size of each scenario is small, and it is difficult to obtain the relation equation using traditional mathematical statistics. Therefore, this study took an indirect approach and used information diffusion theory to obtain the mortality rate. First, the actual observed values of the mortality rate under one scenario were set as the matrix $X = \{x_1, x_2, \ldots, x_m\}$, where $x_i$ is the actual observed value for one earthquake, and $m$ is the total number of earthquake events. At the same time, the actual recorded mortality rate and the historical extreme event (the earthquake event with an extreme mortality rate) under one scenario were considered to build the domain $U = \{u_1, u_2, u_3, \ldots, u_n\}$. Here, $u_j$ is an arbitrary discrete real value in the interval $[u_1, u_n]$, and $n$ is the total number of discrete points. Then, the sample value $x_i$ was diffused to the domain $U$ according to normal information diffusion. The normal information diffusion expression is as shown in Equation (2):

$$f_i(u_j) = \frac{1}{h\sqrt{2\pi}} \exp\left[-\frac{(x_i - u_j)^2}{2h^2}\right] \qquad (2)$$

where $h$ is the diffusion coefficient. The domain $U$ obtains the information from the mortality rate sample matrix $X$ through the normal diffusion. After this, the sample information is normalized, which yields the discretized information $p(u_j)$ at each domain point $u_j$.
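The diffusion and normalization steps can be sketched in Python as follows. This is a minimal illustration of standard normal information diffusion, not the authors' implementation: the diffusion coefficient `h` is an assumed input, the two-stage normalization (per sample, then over the domain) is the usual textbook form, and the sample values are invented.

```python
import math


def normal_diffusion(samples, domain, h):
    """Diffuse each sample onto the domain with a normal kernel and return
    normalized probabilities p(u_j) that sum to 1 over the domain."""
    q = [0.0] * len(domain)
    for x in samples:
        # Information that sample x contributes to every domain point.
        f = [math.exp(-((x - u) ** 2) / (2 * h * h)) / (h * math.sqrt(2 * math.pi))
             for u in domain]
        c = sum(f)
        if c == 0.0:
            raise ValueError("sample lies too far from the domain for this h")
        # Normalize so that each sample carries exactly one unit of information.
        for j, fj in enumerate(f):
            q[j] += fj / c
    total = sum(q)  # equals the number of samples
    return [qj / total for qj in q]


# Made-up mortality rates and an evenly spaced domain, for illustration only.
samples = [0.001, 0.002, 0.02]
domain = [i * 0.005 for i in range(6)]
print(normal_diffusion(samples, domain, h=0.005))
```

Because every sample is normalized before the contributions are summed, a single extreme event keeps the same weight as any other observation while still spreading probability onto the upper part of the domain.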
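Therefore, the mortality rate expectation $E(S_k)$ can be denoted as follows:

$$E(S_k) = \sum_{j=1}^{n} u_j \, p(u_j) \qquad (3)$$

where $u_j$ is a point of the domain, $p(u_j)$ is the normalized information at that point, $k$ is the order of the basic earthquake emergency scenario, and the number of scenarios is 8.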
The discretized domain under each scenario is evenly divided into six levels according to the classification of the type of disaster (emergency situation, crisis situation, minor disaster, moderate disaster, major disaster, catastrophe (Eshghi and Larson, 2008)). Hence, the uncertainty of the mortality rate can be expressed as the probability of each level of the mortality rate. The probability of each level can be denoted as follows:

$$P(a \le u_j < b) = \sum_{a \le u_j < b} p(u_j) \qquad (4)$$

where $P$ is the probability of the level (the interval in which $u_j$ is equal to or greater than $a$ and less than $b$), $a$ is the minimum value of the discrete level points, and $b$ is the maximum value of the discrete level points.
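Given a discretized distribution over the domain, the expectation and the six level probabilities can be computed as in the sketch below. The equal-width split and the choice to place the largest domain point in the top level are assumptions made for illustration, as are the function names and the uniform toy distribution.

```python
def expectation(domain, probs):
    """Mortality rate expectation: sum of domain point times its probability."""
    return sum(u * p for u, p in zip(domain, probs))


def level_probabilities(domain, probs, n_levels=6):
    """Split [min(domain), max(domain)] into n_levels equal-width levels and
    add up the probability of the domain points falling into each level."""
    lo, hi = min(domain), max(domain)
    width = (hi - lo) / n_levels
    levels = [0.0] * n_levels
    for u, p in zip(domain, probs):
        # Each level is [a, b); the maximum point is assigned to the last level.
        k = min(int((u - lo) / width), n_levels - 1) if width > 0 else 0
        levels[k] += p
    return levels


# Toy distribution: 12 evenly spaced points with equal probability.
domain = list(range(12))
probs = [1 / 12] * 12
print(expectation(domain, probs))
print(level_probabilities(domain, probs))
```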

Quantification of uncertainty in mortality rate estimation
The rapid estimation of earthquake fatalities is vital for emergency response during the early hours following the event. For historical earthquakes, both the actual record and the fatalities estimated by the empirical model are known. The differences among empirical models matter little as long as the model can answer critical questions, such as whether a particular earthquake requires a response, and if so, at what level (level 1, level 2, level 3, level 4). With the addition of a rapid estimation model based on scenario analysis, we have also proposed a fatality-based alert scale that provides an estimation of the likelihood of a range of fatalities caused by an earthquake. The overall dispersion of the model's predictions for past earthquakes in a country or region is used as a measure of the uncertainty associated with the model's future estimates. The estimation for the probability of each mortality rate range is shown in Figure 7. The empirical model has been verified using historical earthquakes. Out of a total of 219 earthquakes for which data were collected in this study, 44 (a random 20% of the samples under each scenario) were estimated using the rapid estimation model, and the results are shown in Table 4.
Incidentally, we assessed the accuracy of the model via a comparison between the recorded fatalities and the estimated fatalities. Among the outliers, the model underestimated the fatalities of the Jiangsu Liyang earthquake (M 6, 9 July 1979), which killed 41 people. At the same time, some fatalities were overestimated, such as for the earthquake in Hebei Zhangbei (M 6.2, 10 January 1998) and the earthquake in Sichuan Wenchuan (M 8, 12 May 2008). Among the remaining events, the preliminary estimates were within an order of magnitude of the recorded deaths. The number of fatalities calculated using the model was of the same order of magnitude as the actual recorded number for more than 95% of the events. An estimate of the same order of magnitude does not change the level of the emergency decision, which is very important for rapid post-earthquake rescue. The main purpose of the verification for the uncertainty was to optimize the estimation result.
Furthermore, the possible fatality interval was necessary to provide the basis for emergency decisions when indeterminate factors have to be considered during the process, particularly when the main factors for assessment were difficult to acquire. To verify the accuracy of the quantified results, we used a random selection of 20% of the samples under each scenario. The results (Table 5) show that under the same scenario, the frequency of events with a small mortality rate was higher, and the frequency of catastrophic events was lower. An advantage of the model is that the mortality rate distribution can cover all possible historical scenarios. To a certain extent, this compensates for the lack of extreme events during the fitting of the historical data.
The results were obtained in the form of interval probability statistics, which provide the basis for the subsequent emergency optimization.
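The per-scenario random selection of 20% of the samples used for verification can be sketched as a stratified hold-out split. The event identifiers, scenario labels, rounding rule and fixed seed below are illustrative assumptions, not details given in the paper.

```python
import random
from collections import defaultdict


def stratified_holdout(events, fraction=0.2, seed=0):
    """Split (event_id, scenario) pairs into modelling and verification sets,
    drawing the given fraction at random within every scenario."""
    rng = random.Random(seed)
    by_scenario = defaultdict(list)
    for event_id, scenario in events:
        by_scenario[scenario].append(event_id)

    modelling, verification = [], []
    for scenario in sorted(by_scenario):
        ids = by_scenario[scenario][:]
        rng.shuffle(ids)
        k = round(len(ids) * fraction)  # number of held-out events in this scenario
        verification.extend(ids[:k])
        modelling.extend(ids[k:])
    return modelling, verification


# Made-up catalogue: 10 events in each of two scenarios.
events = [(f"eq{i}", "S1" if i < 10 else "S2") for i in range(20)]
modelling, verification = stratified_holdout(events)
print(len(modelling), len(verification))
```

Sampling within each scenario, rather than over the whole catalogue, keeps sparsely populated scenarios represented in the verification set.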

Estimation for recent earthquakes
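With socio-economic changes, the previous analysis based on historical data may be inconsistent with recent data. Therefore, it is necessary to conduct further verification of the applicability and accuracy of the model using destructive earthquakes that have occurred during recent years. The results of the model calculation were compared to the recorded results. The result and error of the victim estimation are shown in Figure 8. The number of victims calculated via the model is of the same order of magnitude as the recorded number, and the error of the estimation results is less than 30%, which is in line with the requirements of the National Disaster Reduction Committee and the Ministry of Civil Affairs Disaster Reduction Center for the rapid estimation of a disaster.

The number of fatalities during each earthquake was estimated based on the estimation result for the victims. In addition, two models were chosen for comparison; because the performance of empirical models varies regionally, we selected two empirical models built from Chinese samples, but with different sample numbers and different forms. The comparison results are shown in Table 6. The first method was proposed by Liu et al. (2012); it sets the epicentral intensity as the main parameter and uses the magnitude and the average population density as auxiliary parameters. There is a large deviation in the estimation result of Yunnan Puer (2014). The reason for this may be that the auxiliary parameter is the average population density in the affected area rather than the unit statistics, so the population distribution is not considered. The second method was proposed by Xiao (1991). The overall evaluation result of this estimation model was good. However, there was a poor result for Yunnan Ludian (2014). The reason for this was that the samples used to build the model were rather old. The accuracy rate is defined as the number of events for which the estimation results are of the same grade as the actual records divided by the total number of events. The comparison shows that the rapid estimation model based on scenario analysis has a higher accuracy and is more suitable for rapid estimation. The estimation results of the Yunnan Ludian earthquake (2014) and the Xinjiang Tashikuergan earthquake (2017) were not of the same order of magnitude as the actual records.
These two events should be considered extreme events because of their mortality rates.
The fatality interval of Yunnan Ludian (2014) was estimated by the model as [582, 680], and the probability was 0.071. For the Xinjiang Tashikuergan earthquake, the fatality interval was [8, 10], and the probability was 0.026. The interval estimation of the fatalities in the model can account for extreme events with larger mortality rates but small probability.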
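The order-of-magnitude comparison used in the verification can be expressed as a short Python sketch. It assumes that two counts share an order of magnitude when the integer parts of their base-10 logarithms are equal, and that the accuracy rate is the share of events passing this check; both definitions and the numbers are illustrative assumptions rather than the paper's exact grading rule.

```python
import math


def same_order_of_magnitude(estimated, recorded):
    """True when both counts have the same integer part of log10.
    Zero counts match only each other."""
    if estimated <= 0 or recorded <= 0:
        return estimated == recorded
    return math.floor(math.log10(estimated)) == math.floor(math.log10(recorded))


def accuracy_rate(pairs):
    """Share of (estimated, recorded) pairs with the same order of magnitude."""
    hits = sum(same_order_of_magnitude(e, r) for e, r in pairs)
    return hits / len(pairs)


# Made-up (estimated, recorded) fatality counts for illustration only.
pairs = [(600, 650), (9, 12), (30, 41), (5, 3)]
print(accuracy_rate(pairs))
```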

Conclusion and discussion
Based on the study of earthquake data from mainland China (1970-2017), we proposed a new approach for rapidly estimating earthquake fatalities and quantifying the uncertainty. The main factors of the basic earthquake emergency scenarios were magnitude, intensity (the relationship between the epicentral intensity and the epicentral fortification intensity) and initial time, which were used to express the possible earthquake scenarios. To verify the model, we not only checked it against the recorded fatalities of historical earthquakes but also compared it with two existing empirical models for recent earthquakes. The fatality estimation results were mostly of the same order of magnitude as the actual record, and the accuracy of the results was higher than that of the compared empirical models. In addition, the mortality rate interval in the model can effectively cover both the high-probability mortality rates and extreme events. Based on the current study, the following aspects mainly need to be improved:
1. During the actual emergency process, on-site information about the earthquake is acquired as time progresses. Therefore, how to update the results with the updated information needs further study.
2. With the development of remote sensing and unmanned aerial vehicle (UAV) technology, images can be used after the earthquake for damage estimation. Real-time evaluation results of regional earthquake damage can be acquired, and relatively accurate information can be obtained for local regions. Thus, how to extrapolate the local information to estimate the global demand may need further study.
Nat. Hazards Earth Syst. Sci. Discuss., https://doi.org/10.5194/nhess-2018-187. Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Discussion started: 1 August 2018. © Author(s) 2018. CC BY 4.0 License.

Figure 1. Distribution of historical earthquakes on mainland China from 1970 to 2017

Figure 2. Relationship between the magnitude and the number of fatalities

Figure 4. Relationship between the initial time and the number of fatalities

Figure 6. Framework of basic earthquake emergency scenarios

The objective of the rapid estimation model of earthquake fatalities based on scenario analysis is to estimate the fatality expectation and the uncertainty in the fatality interval. The sample data were classified into each scenario based on the framework of the basic earthquake emergency scenarios. Then, the classified samples were divided into two sets (Table 1). One set

Figure 7. Probability of the mortality rate under each scenario

Figure 8. Estimation of the earthquake victims in recent years

Xiaoxue Zhang analyzed the historical data and also guided the model design and implementation. Hanping Zhao, Fangping Wang, Zezheng Yan, Sida Cai, Han Wang & Xiaowen Mei guided the model design and implementation.
Competing interests. The authors declare they have no conflicts of interest.

Table 2. Historical earthquakes on mainland China under scenario S1
Columns: Time (year-month-day; hour-min-second), Epicentral location, Magnitude, Number of fatalities, Number of victims, Mortality rate
According to the normal information diffusion (Equation (1)), the information carried by the mortality rate sample matrix X is spread to the domain U. Thereafter, the sample information