the Creative Commons Attribution 4.0 License.
Elevating Flash Flood Prediction Accuracy: A Synergistic Approach with PSO and GA Optimization
Abstract. Flash floods are frequent and devastating natural disasters in small mountainous river basins worldwide, causing significant harm to people, infrastructure, and property. Flash flood susceptibility mapping is a crucial tool for damage prevention and reduction. This study focuses on creating flash flood susceptibility maps for a mountainous region in northern Vietnam. We enhanced the performance of robust machine learning models, including Support Vector Machines (SVM), Random Forests (RF), and XGBoost (XGB), by applying advanced optimization techniques such as Particle Swarm Optimization (PSO) and Genetic Algorithms (GA). These models were developed using 412 flood inventory points and 14 key factors: elevation, slope, aspect, curvature, topographic wetness index (TWI), stream power index (SPI), flow accumulation, river density, distance to the river, NDVI, land use/land cover (LULC), rainfall, soil type, and geology. Nine models were tested, including three standalone ML algorithms (SVM, RF, XGB), three ensemble models optimized with PSO (PSO-SVM, PSO-RF, PSO-XGB), and three optimized with GA (GA-SVM, GA-RF, GA-XGB). The results indicated that ensemble models outperformed standalone ones, with the PSO-XGB, GA-XGB, and GA-RF models exhibiting outstanding performance, achieving accuracy rates of 0.939, 0.927, and 0.933, along with remarkable AUC-ROC scores of 0.957, 0.968, and 0.977, respectively. This innovative study introduces a novel set of associative models, contributing significantly to the advancement of flood prediction techniques. The methodology is applicable to other regions with similar topographical and climatic attributes. Furthermore, more precise flood forecasting helps municipal authorities formulate strategies to mitigate prospective flood-related impacts.
Status: final response (author comments only)
RC1: 'Comment on nhess-2024-215', Anonymous Referee #1, 30 May 2025
After a detailed review of the manuscript titled "Elevating Flash Flood Prediction Accuracy: A Synergistic Approach with PSO and GA Optimization," I recommend rejection due to insufficient novelty in the methodology.
The manuscript presents a study on flash flood susceptibility mapping in the Song Ma district, northern Vietnam, employing machine learning models (Support Vector Machines, Random Forests, and Extreme Gradient Boosting) optimized with metaheuristic algorithms (Particle Swarm Optimization and Genetic Algorithms). While the work is well-executed and clearly presented, the approach of integrating PSO and GA with machine learning for flood mapping is not new. This methodology is well-documented in existing literature, and the manuscript does not offer significant innovations to distinguish it from prior studies.
Key Reasons for Rejection:
- Lack of Novelty: The combination of PSO and GA with machine learning models for flood susceptibility mapping is an established technique, not a novel contribution. Several papers already address flood mapping using hybrid AI and metaheuristic algorithms, such as:
- Bui, D. T., Ngo, P. T. T., Pham, T. D., Jaafari, A., Minh, N. Q., Hoa, P. V., & Samui, P. (2019). A novel hybrid approach based on a swarm intelligence optimized extreme learning machine for flash flood susceptibility mapping. Catena, 179, 184-196.
- Plataridis, K., & Mallios, Z. (2023). Flood susceptibility mapping using hybrid models optimized with Artificial Bee Colony. Journal of Hydrology, 624, 129961.
- Rezaie, F., Panahi, M., Bateni, S. M., Jun, C., Neale, C. M., & Lee, S. (2022). Novel hybrid models by coupling support vector regression (SVR) with meta-heuristic algorithms (WOA and GWO) for flood susceptibility mapping. Natural Hazards, 114(2), 1247-1283.
- Nguyen, H. D. (2022). GIS-based hybrid machine learning for flood susceptibility prediction in the Nhat Le–Kien Giang watershed, Vietnam. Earth Science Informatics, 15, 2369-2386. https://doi.org/10.1007/s12145-022-00825-4
- Dodangeh, E., Panahi, M., Rezaie, F., Lee, S., Bui, D. T., Lee, C. W., & Pradhan, B. (2020). Novel hybrid intelligence models for flood-susceptibility prediction: Meta optimization of the GMDH and SVR models with the genetic algorithm and harmony search. Journal of Hydrology, 590, 125423.
- Ngo, P. T. T., Pham, T. D., Hoang, N. D., Tran, D. A., Amiri, M., Le, T. T., ... & Bui, D. T. (2021). A new hybrid equilibrium optimized SysFor based geospatial data mining for tropical storm-induced flash flood susceptible mapping. Journal of Environmental Management, 280, 111858.
- Kaya, C. M., & Derin, L. (2023). Parameters and methods used in flood susceptibility mapping: a review. Journal of Water and Climate Change, 14(6), 1935-1960.
- Nguyen, H. D. (2022). Flood susceptibility assessment using hybrid machine learning and remote sensing in Quang Tri province, Vietnam. Transactions in GIS, 26(7), 2776-2801.
- Established Literature: Several studies have already explored similar methodologies, including:
- "Flood Mapping with PSO-GA": Utilizes PSO and GA with Support Vector Machines for flood mapping.
- "Metaheuristic Flood Assessment": Applies GA and other metaheuristics with ANFIS for flood zoning.
- "Remote Sensing Flood Mapping": Combines PSO, GA, and Harmony Search with machine learning for flood susceptibility.
These examples highlight that the manuscript’s approach aligns with a well-trodden path in flood prediction research. Although the application to a specific region is detailed, it does not advance the methodological framework beyond what is already known.
Additional Notes:
The manuscript is well-written, with a thorough methodology and robust analysis. However, these qualities do not compensate for the lack of originality, which is a critical factor for publication.
In summary, despite its technical competence, the manuscript does not meet the threshold of novelty required for acceptance. Therefore, I recommend rejection.
Citation: https://doi.org/10.5194/nhess-2024-215-RC1
CC1: 'Comment on nhess-2024-215', Yen-Yi Wu, 25 Jun 2025
This paper showcased how an advanced machine learning approach such as PSO-XGB could help with mapping flood vulnerability. The hybrid approach has become more popular, so it is great to see this work provide supporting evidence of its robustness.
The authors used flood inventory points to identify where floods occurred. However, floods have a spatial extent: one flash flood might affect a large region, while another may inundate only a small area. When the "non-flood" locations were randomly selected, how did the authors ensure that these points did not fall within the spatial extent of a flood event? Following from this question: what uncertainties and errors might this approach introduce?
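One common way to address this concern is to reject candidate non-flood points that lie too close to any recorded flood point. The sketch below is illustrative only and is not the authors' procedure: the function name `sample_non_flood_points`, the circular exclusion buffer, the `min_dist` value, and the toy coordinates are all assumptions, and real flood extents are rarely circular.

```python
import math
import random


def sample_non_flood_points(flood_points, bounds, n, min_dist, seed=0, max_tries=100_000):
    """Rejection-sample n points at least min_dist away from every flood point.

    Hypothetical helper: coordinates are assumed to be in a projected
    system so that Euclidean distance is meaningful.
    """
    xmin, ymin, xmax, ymax = bounds
    rng = random.Random(seed)
    samples = []
    tries = 0
    while len(samples) < n:
        tries += 1
        if tries > max_tries:
            raise RuntimeError("could not place enough points; relax min_dist or bounds")
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        # Keep the candidate only if it lies outside every exclusion buffer.
        if all(math.hypot(x - fx, y - fy) >= min_dist for fx, fy in flood_points):
            samples.append((x, y))
    return samples


# Toy data (assumed, not from the manuscript).
flood_points = [(2.0, 2.0), (5.0, 7.0), (8.0, 3.0)]
non_flood = sample_non_flood_points(
    flood_points, bounds=(0.0, 0.0, 10.0, 10.0), n=20, min_dist=1.5
)
```

A fixed buffer is a simplification: it reduces, but does not remove, the risk that a "non-flood" label falls inside an unmapped inundation area, so the resulting label noise would still need to be discussed as a source of uncertainty.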
Citation: https://doi.org/10.5194/nhess-2024-215-CC1
RC2: 'Comment on nhess-2024-215', Anonymous Referee #2, 14 Jul 2025
The paper presents a systematic comparison between machine learning models (SVM, RF, and XGBoost) and their optimized counterparts using Particle Swarm Optimization (PSO) and Genetic Algorithms (GA). The authors aim to identify the most effective hybrid configurations for producing high-resolution flash flood susceptibility maps in the Song Ma district of northern Vietnam.
The case study is well documented, and the authors utilize 14 environmental and topographic factors. However, the analysis does not consider any temporal or seasonal variability, which could limit the temporal validity and generalizability of the results over time.
A fundamental shortcoming of the article lies in its lack of reproducibility. The authors do not disclose the specific hyperparameters optimized for each machine learning algorithm, nor the search ranges explored during the optimization process. Most notably, there is no indication of which hyperparameters were selected for tuning in the first place—this omission severely limits the ability of other researchers to replicate or validate the results.
Furthermore, the study would have greatly benefited from a comparative analysis between metaheuristic optimization methods (PSO and GA) and a more conventional technique such as Grid Search, which remains a widely adopted and interpretable approach for hyperparameter optimization in machine learning.
An additional weakness is the absence of any investigation into the sensitivity of model performance to different hyperparameter configurations. Understanding how model accuracy and generalization vary with different hyperparameter settings is crucial, especially when dealing with high-dimensional or non-linear feature spaces.
Crucially, the article also fails to address the issue of overfitting, which is a well-known risk in hyperparameter optimization. Without safeguards such as regularization, cross-validation, or independent validation sets, optimized models may become over-specialized to the training data, reducing their real-world applicability.
Lastly, there is no information provided on the data shuffling or sampling strategy during model training. It remains unclear whether randomization techniques (e.g., data shuffling, stratified sampling, or k-fold cross-validation) were employed to ensure representative and unbiased training/testing splits. This further weakens the methodological transparency and robustness of the study’s claims.
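To make the sampling concern concrete, the following is a minimal sketch of a shuffled, stratified k-fold split written with the Python standard library only. It is an illustration under stated assumptions, not a description of what the authors did: the function `stratified_k_fold` is hypothetical, and the equal count of 412 non-flood points is assumed for the example (the abstract reports only the 412 flood points). In practice, scikit-learn's `StratifiedKFold` provides the same idea.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k, seed=0):
    """Yield (train_idx, test_idx) pairs that roughly preserve class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)  # randomize order within each class
        for j, idx in enumerate(indices):
            folds[j % k].append(idx)  # deal indices round-robin across folds
    for f in range(k):
        test_idx = sorted(folds[f])
        train_idx = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train_idx, test_idx


# Assumed data: 412 flood (1) and 412 non-flood (0) labels.
labels = [1] * 412 + [0] * 412
splits = list(stratified_k_fold(labels, k=5))
```

Reporting the seed, the number of folds, and the class ratio alongside the results would let other researchers reproduce the exact train/test partitions.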
Considering all these aspects, my suggestion is to not recommend the paper for publication in its current form.
Citation: https://doi.org/10.5194/nhess-2024-215-RC2
Viewed
HTML | PDF | XML | Total | BibTeX | EndNote |
---|---|---|---|---|---|
291 | 92 | 19 | 402 | 20 | 29 |