the Creative Commons Attribution 4.0 License.
Application of machine learning for integrated flood risk assessment: Case study of Hurricane Harvey in Houston, Texas
Behrang Bidadian
Aaron E. Maxwell
Michael P. Strager
Abstract. Flood risk, encompassing hazard, exposure, and vulnerability, is defined in terms of potential losses. Machine learning techniques have gained traction among researchers to address the complexities of multi-variable flood risk assessment models and overcome issues associated with non-linear relationships. However, the focus has primarily been on flood hazard prediction rather than comprehensive risk assessment and damage estimations. Therefore, there is a need for experiments that combine risk elements using such methods. To address this need, this study utilized the Random Forest algorithm to analyze the correlations between the physical flood damage caused by Hurricane Harvey in 2017 in Houston, Texas and certain hazard, exposure, and vulnerability-related variables. The study identified poorly drained soils as the primary contributor to the losses, followed by population density and the ratio of developed lands with medium intensity. The study's findings also explored the reasons for the unexpectedly low importance of social vulnerability factors compared to the environmental justice concept. These findings and conclusions can provide insights to planners and stakeholders, enhancing their understanding of the underlying causes contributing to flood risk. Future research can expand upon this study's methodology and findings by incorporating additional factors related to climate change.
Status: open (until 30 Oct 2023)
RC1: 'Comment on nhess-2023-113', Anonymous Referee #1, 19 Sep 2023
This study explores the application of machine learning (ML) techniques in the context of multi-variable flood loss estimation, with the inclusion of environmental and socio-economic factors. The approach is tested for the event of Hurricane Harvey in Houston, Texas.
I would like to begin my review report with a general concern that has been increasingly relevant in studies like this employing ML approaches. Indeed, while ML has great capabilities in analyzing complex problems characterized by non-linearities between the different variables at hand, it is becoming increasingly common to present studies based on these approaches that actually do not significantly contribute to the advancement of knowledge. In this case, the research objective appeared quite interesting, i.e., to assess the feature importance of environmental and socio-economic characteristics in shaping flood risk in urban areas.
However, the variables selected here and the scale of analysis (census tract) may not be entirely suitable for obtaining meaningful insights. As evidenced by the reported results on variable importance (page 21), the outcomes appear rather obvious, with variables like the percentage of drainage soils, population density and the percentage of medium-density developed areas emerging as the most crucial factors. This somewhat diminishes the discussion’s focus on the significance of socio-economic and environmental aspects. This issue is also evident in the somewhat shallow and unsubstantial discussion section of the manuscript, which fails to provide significant insights into the obtained results, leaving the research questions posed in the introduction largely unanswered.
In my view, while ML is undoubtedly a valuable tool, its application should ideally lead to new findings that significantly contribute to the advancement of a specific field, rather than merely representing an application or exercise that ultimately yields "some results". In the case of the present paper, this concern is further compounded by the strong similarities with prior studies (e.g., Knighton et al. (2020)), both in terms of methodology and objectives, raising additional questions about the novelty and originality of the work.
From the methodological point of view, the corresponding section effectively outlines the division of data into training and validation sets, but it would be beneficial if the Authors could provide more details on whether cross-validation techniques were employed to ensure the robustness of the model. Furthermore, the definition and brief description of the evaluation metrics used to assess model performance should be included in the methodological section, rather than appearing abruptly in the results section.
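To make the request concrete, the following is a minimal sketch of k-fold cross-validation together with explicit definitions of two common regression metrics (RMSE and R²). It is not the manuscript's procedure: the helper names (`k_fold_indices`, `cross_validate`, `fit_line`) are hypothetical, the data are synthetic, and a one-variable least-squares line stands in for the Random Forest model so that the example runs with the Python standard library alone.

```python
import math
import random
from statistics import mean


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def rmse(y_true, y_pred):
    """Root mean squared error: typical size of a prediction error."""
    return math.sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (stand-in for the real model)."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    a = y_bar - b * x_bar
    return a, b


def cross_validate(x, y, k=5, seed=0):
    """Return per-fold (RMSE, R^2) scores computed on held-out folds only."""
    scores = []
    for held_out in k_fold_indices(len(x), k, seed):
        held = set(held_out)
        train = [i for i in range(len(x)) if i not in held]
        # Fit on the training folds, then score on the fold left out.
        a, b = fit_line([x[i] for i in train], [y[i] for i in train])
        y_true = [y[i] for i in held_out]
        y_pred = [a + b * x[i] for i in held_out]
        scores.append((rmse(y_true, y_pred), r_squared(y_true, y_pred)))
    return scores


# Synthetic data, invented for illustration only (not from the study).
rng = random.Random(42)
x = [rng.uniform(0, 10) for _ in range(100)]
y = [2.0 + 0.5 * xi + rng.gauss(0, 1) for xi in x]

fold_scores = cross_validate(x, y, k=5)
for fold, (err, r2) in enumerate(fold_scores, start=1):
    print(f"fold {fold}: RMSE={err:.3f}  R2={r2:.3f}")
```

Reporting the spread of the per-fold scores, rather than a single train/validation split, is one way to show how sensitive the reported performance is to the particular partition of census tracts.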
Finally, there is also room for improving the overall quality of the presentation and writing, by avoiding the repetition of certain concepts and information throughout the manuscript (e.g., explanations of what machine learning algorithms are and their capabilities) and by combining the several maps from Figures 4 to 10 into fewer figures.
Citation: https://doi.org/10.5194/nhess-2023-113-RC1
AC1: 'Reply on RC1', Behrang Bidadian, 28 Sep 2023
Dear Referee,
We appreciate your thoughtful review of our manuscript and the time you invested in providing valuable feedback. We have carefully considered your comments and would like to respond respectfully, addressing some of your concerns.
Contributions to the Advancement of Knowledge: We firmly believe that our paper significantly contributes to the advancement of knowledge in the field of flood risk assessment. By employing machine learning techniques to analyze the complex interplay of environmental and socio-economic factors in flood loss estimation, we expanded upon previous studies by providing a more comprehensive approach and presenting/analyzing the results.
Scale of Analysis and Variable Selection: We acknowledge your concern about the scale of analysis. The choice of a census tract level for analysis was primarily driven by the availability of detailed socioeconomic data from the Census Bureau. Data limitations compelled us to work at this level, and we have noted this constraint in our discussion section.
Regarding the variables selected, at the time of manuscript submission, our approach represented the most comprehensive and suitable set of variables at least for our study area. Scientific research often builds on existing knowledge, and our study aimed to refine and enhance the understanding of flood risk by considering these variables in a location-specific context.
Outcomes: While some of our results may seem intuitive or common-sensical, it is important to remember that scientific experiments are essential to validate and quantify these ideas, especially in a location- and time-specific context. Additionally, what might appear "obvious" can vary depending on the audience, and it is our duty as researchers to provide empirical evidence to support our findings. We also believe that our study sheds light on the less “obvious” role of development pattern, which may not be immediately apparent without rigorous analysis.
Discussion Section: You described the discussion section as "shallow and unsubstantial" without giving specific reasons. We tried to discuss the process limitations and the potential reasons for obtaining results not aligned with the expectations of the environmental justice notion. We believe this approach is neither shallow nor unsubstantial.
Research Question: Our research question was: What role do environmental and socioeconomic characteristics play in shaping flood risk in urban areas? We believe this question was addressed in our study, as we identified the significant impact of soil type, development pattern type, and population density on flood risk in the context of Hurricane Harvey, and the less significant role of the other factors.
Originality and Similarities with Prior Studies: We respectfully disagree with the assertion that our study lacks originality due to similarities with prior works such as Knighton et al. (2020). As indicated on page 7 of our manuscript, we explicitly state that our research expands upon previous studies like the one mentioned by incorporating a more comprehensive set of variables, including development patterns and additional demographic/socioeconomic factors. Furthermore, our study's application to a specific historical event, Hurricane Harvey, adds a unique dimension to the research landscape.
Presentation Quality: First, we appreciate your concern regarding the inclusion of multiple maps and will consider reducing them during the revision process. Second, it is important to note that while explaining fundamental concepts about machine learning algorithms and processes may appear unnecessary to experts in the field, our target journal does not exclusively focus on machine learning or AI. Therefore, these explanations are necessary to cater to a broader readership within the journal's audience.
In conclusion, we appreciate your feedback and will make the necessary revisions to improve the quality of our manuscript. We are confident that our study contributes to the field of flood risk assessment, and we hope that our responses address your concerns adequately. If you have any further comments or suggestions, please do not hesitate to share them with us. Your guidance is invaluable in helping us refine our research.
Citation: https://doi.org/10.5194/nhess-2023-113-AC1