Reply on RC1

• The authors used various variables to explain the simulated degree of tsunami casualties. Because some of the explanatory variables seem to be correlated with each other, I have concerns about technical problems of the statistical methods, such as collinearity/multicollinearity. Since inappropriately constructed statistical models can lead to wrong results, I suggest additional validity checks of the statistical method. Additionally, justifications, including detailed explanations of the methods and of why they were chosen over similar regression models, can help readers' understanding.

• Different from the previous statistical models explaining tsunami casualties based on actual data, the statistical models were constructed only from the results of agent-based simulations. Although the method has the advantage that various data can be generated from simulations, the quality of the data and of the constructed statistical models depends entirely on the quality of the simulation. Therefore, inappropriate modelling or excessive speculation from the results can lead to inappropriate implications for actual evacuation preparedness. Along with confirmation of the validity of the method and the data itself, the discussions in this paper should be supported by additional validity checks using simulations, and the applicability of the results should be carefully discussed.
A: We thank the reviewer for his/her comment. We included a reference to a recently published paper where we validate our agent-based model using real-world data (section 2.2.1, starting on page 8, line 181 of the old manuscript). We agree that issues about the reliability and generalization of the results should be carefully discussed. For this purpose, we enhanced the Discussion section, starting on page 15, line 385 (old manuscript).
• p.1 Line 26 -29: The discussed integrated approach for tsunami disaster risk reduction is summarised well with its historical transitions in Koshimura & Shuto (2015), which would be useful to support the description. The paper can be found at https://doi.org/10.1098/rsta.2014.0373
A: We used this new reference to enhance the manuscript in page 1, line 27 (old manuscript).
• p.2 Line 31 -33: The authors claim "this is hard to achieve...", but the reason why is not well expressed. It is better to make it clear for readers from broader research fields.
A: We included further explanation about this topic in page 2, line 31 (old manuscript).
• p.2 Line 35: What is the basis for the value "15 min"? In my opinion, evacuation is always an effective way to save lives during tsunamis if there is sufficient lead time.
A: We agree with the reviewer. However, as our primary focus is on the Chilean case, we underline that the country's typical short arrival times put strong pressure on evacuation processes. We included further explanation and references about this topic in page 2, line 35 (old manuscript).
• p.2 Line 37 -39: In my understanding, "hazard" simply represents the intensity of an external force and is not a term for how existing conditions are affected. For example, this page (https://www.preventionweb.net/understanding-disasterrisk/component-risk/disaster-risk) explains hazard as "Hazard is defined as the probability of experiencing a certain intensity of hazard (eg. Earthquake, cyclone etc) at a specific location and is usually determined by an historical or user-defined scenario, probabilistic hazard assessment, or other method. Some hazard modules can include secondary perils (such as soil liquefaction or fires caused by earthquakes, or storm surge associated with a cyclone).", with the source GFDRR, 2014. Such terminology should be used consistently, referencing reliable official documents.
A: As suggested by the reviewer, we included further references from official sources to strengthen our concept definitions. Page 2, from line 36 (old document).
• p.2 Line 47 -50: In my view, some items are inappropriately categorized. For example, is "elevation" exposure? Again, the abovementioned page defined the exposure as "Exposure represents the stock of property and infrastructure exposed to a hazard, and it can include socioeconomic factors". It is better to categorise them with an exact criterion, referencing corresponding sources.
A: We included the new word "determinants" to underline that in this sentence we are referring to those factors that contribute to either increase or decrease the exposure characteristics. The paragraph was also enhanced with further words. Page 2, line 47 (old manuscript).
• p.2 Line 55: "Fragility function" is often used in this context.
A: We agree with the reviewer. We modified the text accordingly. Page 2, line 55 (old document).
• p.3 Line 67 -90: This part lines up the existing literature regarding fragility functions for tsunami casualties. Is there any criterion for the order of these studies? It starts from a study in 2018 and goes to 2020, but then suddenly goes back to 2009. Since these studies develop their methods, usually referencing older ones, it is better to present them so that readers can understand the trend of these studies. If the authors intend this order, the text flow should be modified to make it clear. Additionally, the review seems to lack some literature in the same line. Additional reviews would be useful to be more comprehensive.
A: We agree with the reviewer. We entirely modified this section of the manuscript, also including additional references. From page 3, line 67 (old document).
A: We modified the text to enhance it, including the suggested additional references. From page 3, line 95 (old document).
• p.4 Line 102 -110: Since there is a tremendous amount of literature regarding tsunami evacuations, the references here seem insufficient. A recent comprehensive review of tsunami evacuation behaviours would be useful for supporting the discussion here. The review paper, Makinoshima et al., 2020, can be found at https://doi.org/10.1016/j.pdisas.2020.100113
A: We entirely modified this section of the manuscript, also including additional references. From page 4, line 102 (old document).
• p.5 Figure 1: This can be moved to the next Methodology section because detailed explanation was made in the next section, and only the names of cities are described in the first section.
A: We agree with the reviewer. We moved Fig. 1 to the Methodology section.
• p.6 Line 146 -147: Here is a suitable place to present the figure 1.
A: We thank the reviewer for his/her suggestion. Indeed, we moved Fig. 1 to this suggested place.
A: We modified the text to include this change. Page 6, line 154 (old manuscript).
• p.7 Table 1: It is better to present the items in "Years of recorded destructive tsunamis" with their event names and references for their mechanisms, since easily accessible information on the events would be useful for readers. The table captions should be presented at the top.
A: We modified Table 1 to include references to the included earthquakes. The table caption was moved to the top, too. Page 7 (old manuscript).
• p.7 Line 161: I understand that this resolution "4x4 m" is based on the finest resolution of the tsunami simulation; however, this resolution might be too fine for counting tsunami casualties in agent-based simulations. An investigation with different resolutions is needed to ensure the validity of the result. If consistent important features are found at different resolutions, this supports the validity of the analysis method. Reliable coarser values can be generated by integrating finer values.
A: We understand the suggestion made by the reviewer. However, the modelling technique used the STOC software, which couples the tsunami and evacuation models and therefore uses the same grid with a single resolution. This feature makes it infeasible to run the agent-based model independently using a different grid size. Nevertheless, as we pointed out above, the evacuation model was validated with real-world data in a recently published paper.
• p.8 Line 178: Which part of the simulation was "enhanced" compared to the original model? It should be clear.
A: We included further explanation about how the source code was enhanced by us.
• p.8 Line 181 -182: This description is true for agent-based modelling that simulates detailed interactions among agents (e.g., social force model), and I think readers will expect that this study used such a model after reading this description; however, the detailed explanation of the model (p.9 Line 203 -215) explains that the model does not simulate such complex interactions (e.g., slowdown due to congestion, which arises in detailed 2D behaviour simulations). The text should be modified to express the model capability more clearly.
A: We modified the text according to the suggestion by the reviewer. Now, we hope to clearly express the model capability, focused on a macroscopic perspective.
• p.9 Line 204 -205: It is unclear whether "a mean time = 8 min" is the mean value of the resulting distribution or a parameter value for the probability distribution. If "8min" refers to the parameter sigma for Rayleigh distribution, the resulting mean value of the distribution does not match this value. Presenting the mathematical expression of the Rayleigh function with the parameter used in this study can avoid any confusion.
A: We included a new reference (Mas et al., 2012) and further text to avoid confusion. Page 9, line 204 (old manuscript).
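As an illustration of the reviewer's point (not taken from the manuscript), the Rayleigh density with scale parameter sigma is f(t; sigma) = (t / sigma^2) exp(-t^2 / (2 sigma^2)) for t >= 0, and its mean is sigma * sqrt(pi / 2). The minimal sketch below shows how far apart the two readings of "8 min" are. The function names are hypothetical, and which reading applies to the study is not assumed here.

```python
import math


def rayleigh_pdf(t: float, sigma: float) -> float:
    """Rayleigh probability density at time t for scale parameter sigma."""
    if t < 0:
        return 0.0
    return (t / sigma**2) * math.exp(-(t**2) / (2.0 * sigma**2))


def rayleigh_mean(sigma: float) -> float:
    """Mean of a Rayleigh distribution with scale parameter sigma."""
    return sigma * math.sqrt(math.pi / 2.0)


def rayleigh_sigma_from_mean(mean: float) -> float:
    """Scale parameter sigma that yields the requested mean."""
    return mean / math.sqrt(math.pi / 2.0)


# Reading 1: "8 min" is the scale parameter sigma.
mean_if_sigma_is_8 = rayleigh_mean(8.0)

# Reading 2: "8 min" is the mean of the departure-time distribution.
sigma_if_mean_is_8 = rayleigh_sigma_from_mean(8.0)

print(f"sigma = 8 min  -> mean  = {mean_if_sigma_is_8:.2f} min")
print(f"mean  = 8 min  -> sigma = {sigma_if_mean_is_8:.2f} min")
```

Under the first reading the mean departure time is about 10 min; under the second, sigma is about 6.4 min, so stating the parameterisation explicitly removes the ambiguity.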
• p.9 Line 211: In my understanding, the paper cited here is not the paper that first proposed the A* algorithm. Is there any reason to cite this paper here? For example, if the study used the implementation in the citing literature, the authors should explain so to avoid any ambiguity.
A: We agree with the reviewer. We edited the text and included further references focused on evacuation studies that apply the algorithm. Page 9, line 211 (old manuscript).
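For readers unfamiliar with the algorithm, a generic A* search on a small grid can be sketched as follows. This is a textbook-style illustration under stated assumptions (4-connected moves, unit step cost, Manhattan-distance heuristic, a hypothetical toy grid); it is not the implementation used in the evacuation model, whose details are given in the cited references.

```python
import heapq


def a_star(grid, start, goal):
    """Shortest 4-connected path on a grid of 0 (free) / 1 (blocked) cells.

    Returns the path as a list of (row, col) cells, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # (f = g + h, g, cell)
    best_cost = {start: 0}
    parent = {start: None}

    while open_heap:
        _, cost, cell = heapq.heappop(open_heap)
        if cell == goal:
            # Walk back through the parents to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        if cost > best_cost[cell]:
            continue  # stale heap entry; a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get((nr, nc), float("inf")):
                    best_cost[(nr, nc)] = new_cost
                    parent[(nr, nc)] = cell
                    heapq.heappush(
                        open_heap,
                        (new_cost + heuristic((nr, nc)), new_cost, (nr, nc)),
                    )
    return None


# Hypothetical street grid: an agent at (0, 0) heads for a shelter at (3, 0).
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = a_star(grid, (0, 0), (3, 0))
print(path)
```

In this toy grid the only opening in the second row forces a detour, so the route length (7 steps) is well above the straight-line distance, which is the kind of difference that route-based metrics capture.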
• p.10 Line 217 -218: The number of required simulation runs should be reported here instead of explaining "at least ten" because the information is useful for readers to know the simulation variance.
A: We ran 10 simulations for each case study. We edited the text to avoid ambiguity. Page 10, line 217 (old manuscript).
• p.7 Line 158 -p.10 269: Sections 2.x.x include both the methods to generate data and the metrics, and this mixed description can lower the readability of the manuscript. These sections could be restructured for better readability.
A: We appreciate the reviewer's comment. However, we prefer to keep the structure of these sections in its current shape.
• p.11 Line 251 -269: As I pointed out for p.7 Line 161, it is better to conduct the analysis with different resolution to see the effect of resolution and check the validity of the analysis.
A: Please see above our reply to this subject.
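For illustration, the reviewer's remark that coarser values can be generated by integrating finer values could be sketched as a post-processing step on the simulation output, without re-running the coupled model. The sketch below is hypothetical: it assumes per-cell counts of deaths and of agents are available and that the death ratio of a block is deaths divided by agents, which may differ from the definition used in the manuscript.

```python
def coarsen_death_ratio(deaths, agents, factor):
    """Aggregate fine-grid counts into factor x factor blocks.

    deaths, agents: 2D lists of per-cell counts on the fine grid.
    Returns a 2D list of block death ratios (None where a block has no agents).
    """
    rows, cols = len(deaths), len(deaths[0])
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            block_deaths = 0
            block_agents = 0
            # Sum the counts over all fine cells inside this block.
            for r in range(r0, min(r0 + factor, rows)):
                for c in range(c0, min(c0 + factor, cols)):
                    block_deaths += deaths[r][c]
                    block_agents += agents[r][c]
            ratio = block_deaths / block_agents if block_agents > 0 else None
            coarse_row.append(ratio)
        coarse.append(coarse_row)
    return coarse


# Hypothetical 4 x 4 fine grid, aggregated to 2 x 2 blocks.
deaths = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [2, 2, 0, 0],
    [0, 0, 0, 0],
]
agents = [
    [2, 2, 0, 0],
    [2, 2, 0, 0],
    [4, 4, 1, 1],
    [0, 0, 1, 1],
]
coarse_ratio = coarsen_death_ratio(deaths, agents, factor=2)
print(coarse_ratio)  # [[0.25, None], [0.5, 0.0]]
```

Summing counts before dividing, rather than averaging per-cell ratios, keeps sparsely populated cells from dominating the block value.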
• p.11 Line 275 -277: Although the random forest is a popular regression model, it is better to explain what it actually is with its brief theoretical explanation for completeness of this paper and better understanding of readers.
A: We included further references and new text to expand the explanation of the random forest methodology. Page 11, line 277 (old document).
• p.12 Line 287: I think brief theoretical explanations of SHAP and SHAP values are necessary because most of the readers in this field are not familiar with them. At least, readers have to know the logic used to estimate the importance of the explanatory variables in order to understand the results presented in the rest of this paper.
A: We included further references and new text to expand the explanation of the SHAP values methodology. Page 12, line 289 (old document).
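As a rough illustration of the logic behind SHAP values, the sketch below computes exact Shapley values for a toy model by enumerating all feature subsets. It is not the SHAP library and not the tree-based estimator typically used with random forests; the toy model, the feature meanings and the all-zero baseline are assumptions made only for this example, and "absent" features are simply replaced by their baseline value.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` take their real value, the rest the baseline.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):  # size of the coalition that feature i joins
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for s in combinations(others, k):
                with_i = value(set(s) | {i})
                without_i = value(set(s))
                total += weight * (with_i - without_i)
        phi.append(total)
    return phi


def toy_model(z):
    # Hypothetical response with one interaction term (features 0 and 2).
    return 0.5 * z[0] - 0.2 * z[1] + 0.1 * z[0] * z[2]


x = [2.0, 5.0, 10.0]        # hypothetical cell: three explanatory variables
baseline = [0.0, 0.0, 0.0]  # hypothetical reference point
phi = shapley_values(toy_model, x, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for x and the prediction for the baseline, and the interaction term is shared equally between the two features involved; these are the properties that make the values usable as per-variable importance.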
• p.13 Table 3 and Figure 2: The table and the figure simply display the mean values of the variables and are not accompanied by any detailed explanation or discussion, such as regional differences. As a result, the current text and materials carry almost no information for readers. The reported values can be improved by reporting variances so that readers know the distribution of the variables. Instead of tables and the current figure, a scatter plot matrix of the raw data may be useful for understanding the data. Along with the revised visual representation, a detailed description of the general tendency of the data is required in the revised manuscript.
A: We modified Table 3 to include, for every examined variable and death ratio threshold, the standard deviation. Also, we included a new Figure 4, showing data scatterplots with the distribution of the death ratios, in comparison to the nine examined independent variables. We also deleted the old Fig. 2. From page 12, line 301 (old document).
• p.14 Line 307 -312: Because some explanatory variables are potentially correlated (e.g., Maximum flood depth and Elevation; Straightness, Route length and Mean travel time), I wonder if the analysis suffers from problems such as collinearity/multicollinearity. The previously mentioned scatter plots of raw data can help readers to consider such potential problems in the data. Although such correlations may not affect the performance of the constructed model, I think they at least affect the importance values. Confirmation of the data and justification of the validity of the result are required. Additionally, this section simply presents figures, and no in-depth explanations are given. Broader implications of the results or relations to the previous literature can be presented in the following Discussion section; however, this section should at least describe the obtained results in detail.
A: As we pointed out above, we included a test to prevent the correlation between the death ratio (our dependent variable) and the other likely predictor (independent) variables, to avoid collinearity problems in the regression model. Overall, we enhanced the Results section to provide an in-depth explanation of our findings. From page 12, line 299 (old document).
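For illustration, a simple screen for the kind of pairwise correlation among explanatory variables raised by the reviewer could look like the sketch below. The variable names, the sample values and the 0.8 threshold are hypothetical, and this is not a description of the test reported in the manuscript.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def flag_collinear_pairs(columns, threshold=0.8):
    """Return (name_1, name_2, r) for predictor pairs with |r| >= threshold."""
    names = list(columns)
    flagged = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            r = pearson(columns[names[i]], columns[names[j]])
            if abs(r) >= threshold:
                flagged.append((names[i], names[j], r))
    return flagged


# Hypothetical per-cell predictor values (five cells).
predictors = {
    "elevation": [1.0, 2.0, 3.0, 4.0, 5.0],
    "max_flood_depth": [5.0, 4.0, 3.0, 2.0, 1.0],
    "route_length": [3.0, 1.0, 4.0, 1.0, 5.0],
}
flagged = flag_collinear_pairs(predictors)
for name_1, name_2, r in flagged:
    print(f"{name_1} vs {name_2}: r = {r:.2f}")
```

Pairs flagged in this way are candidates for dropping or combining one variable, or at least for a cautious reading of their individual importance values, since correlated predictors can share or trade importance.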
• p.15 Line 332 -341: This part explains counterintuitive results and their potential causes; however, in my view, these explanations need further validation because the data is synthesised using simulations, and the data might be generated from unintended behaviour of the simulation models. For example, a combination of very local errors in the elevation data and the hiking function may cause unrealistically slow evacuees. Because the observed tendency is generated from simulation data, the authors can validate their explanations by checking the simulation results in detail. The cause of patterns in the synthesised data can be clearly explained by the simulations, and should be.
A: We agree with the reviewer. We enhanced the Discussion section to address his/her comments. We included a new reference to our recently published paper, which validates our evacuation model with real-world data. We also added new text that discusses the limitations of our results. From page 14, line 317 (old manuscript).
• p.15 Line 338 -339: For example, this description should be supported by showing such simulation results.
A: We agree with the reviewer. However, rather than being on each case study's specific characteristics, the paper's focus is on the 530,091 examined cells.
• p.15 Line 343 -345, Line 345 -346, Line 353 -354: Since the simulations in this study do not include realistic evacuation processes and are based on various assumptions (e.g., a single evacuation departure distribution), it is hard to reach a general conclusion using this approach. Such limitations should be clearly expressed, and any extrapolation of the results may lead to proposing inappropriate guidelines.