Understanding rockfalls along the national road G318 in China: from source area identification to hazard probability simulation

Rockfall hazards are frequent along the national road G318 in west Hubei, China. To understand the distribution of rockfalls and the potential hazard to road G318, this study combines the results of a 3-year engineering geological investigation, statistical modeling, and a kinematics-based method to identify risky road sections. Rockfall source area cells are preliminarily identified by slope angle threshold analysis and then refined by a susceptibility method (Random Forest model and multivariate logistic regression model), which yields the spatial probability. The temporal and size probabilities of source areas are calculated separately using the Poisson distribution and a power-law distribution. To obtain the reaching probabilities and potential influence area of released source areas, rockfall trajectories were simulated with the Flow-R tool. In this process, an important parameter (the reach angle) was determined by back analysis and then validated by field investigation. Rockfall hazard probability is finally calculated by integrating the spatial, temporal, size, and reaching probabilities of source areas. The results show a good fit with the field measurements. For return periods of 5, 20, and 50 years, potentially risky road sections are identified under two size scenarios (larger than 1 000 m3 and larger than 10 000 m3). This research helps the local government to understand rockfalls comprehensively, from source area existence to potential risk to roads.

et al., 2009; Žabota et al., 2019; Liu et al., 2020). This is also the reason why some researchers try to apply surveying techniques to identify source areas, such as Light Detection and Ranging (LiDAR) and terrestrial laser scanners (TLS) (Fanos et al., 2020). The existing research results show that the critical SAT values vary with rockfall types and

and has the advantage of being less demanding compared to other techniques such as discriminant analysis (Carrara, 1991; Baeza et al., 1996). RFM can achieve higher accuracy with the same data. However, different models result in different source area locations. Thus, it is important to know which model performs better in the area of interest.

Besides the rockfall source area, we also need to know the rock mass trajectory paths with the resulting intensity (e.g. velocity or kinetic energy) and the area that can be affected. To simulate the trajectories and energy, several 2D or 3D tools have been developed for regional-scale or site-specific rock slopes, such as CADMA by Azzoni et al. (1995); CONEFALL by Jaboyedoff et al. (2003); Flow-R (Horton et al., 2013). It is now widely applied and has achieved good results in different countries, for example, Michoud et al.

and rockfall propagation should be at the level of probability assessment. According to rockfall terminology, rockfall hazard refers to the probability of occurrence of an event (such as rockfall) of a given magnitude (such as volume) over a period of time within a given area (Varnes et al., 1984; Fell et al., 1994; Guzzetti et al., 1999). The main objective of this research is rockfall hazard probability assessment along the G318 national highway in China. Based on field investigation and satellite image interpretation, historical rockfall hazards were inventoried and analyzed for slope threshold determination. Considering possible rock source magnitude and rockfall event return period, hazard probability was simulated.

Lithology in the area is mainly purplish-red mudstone, with sandstone and shell stone as interlayers. It has been affected by physical weathering, so most rockfalls have taken place in these sections. National (G318) and provincial (X553) roads are the main traffic routes, forming an inverted Y across the area. Rockfalls occur frequently in the rainy season, causing damage to infrastructure as well as human casualties.
The north of Longju town is located in the Fangdoushan anticline and the Jianchang syncline. The central part comprises the Longju town anticline and the Matouchang syncline. The Jiannan anticline and the Jianzhuxi syncline are located in the south, so tectonic structures are well developed in the study area. The rock strata in the core of the Jianchang syncline are compressed and the lithology is dense. The Matouchang syncline is narrow and steep in the northwest and broad and gentle in the southeast, so the strata are nearly horizontal in the study area.
Rockfall is the main type of geological hazard in the area, especially in the Jurassic red beds (Middle Jurassic lithology) at the core or near the limbs of the Matouchang syncline. Two sets of discontinuities, in combination with the bedding planes, control the rock quality and stability. Due to these controlling rock structures, differential weathering of sandstone and siltstone increases the probability of rockfalls.
In the last 10 years, the government has promoted urbanization of the Longjuba area in the Three Gorges dam area. Accordingly, various construction works and the reconstruction of transportation facilities have increased. In addition, due to the construction of a new highway in the area, which involved cutting and filling slopes, the Longjuba area is becoming more and more hazardous, especially along the G318 (Figure.2a), where highway collapse causes vehicle damage (Figure.

Besides the rockfall inventory data, other datasets were collected as follows:
• A geological map (1:10 000) was used to extract geological spatial layers such as lithology, faults, and the slope structure map. The slope structure map was generated using the standard and stratigraphic attitude advocated by Cruden (1991).
• The joint density data were gathered in the field in 2015. Joint sets were measured at 108 rockfall source areas.
• The land-use map was generated from GaoFen-1 remote sensing data by applying the Spectral Angle Mapper classification method in ENVI software.
Specific data used are shown in Table 1.

Rockfall sources are preconditions of rockfall hazards and risks. We need to determine the potential rocky slopes that may become unstable. In this study, three steps are recommended.
Calculate the preliminary rockfall source area

Firstly, we need to select the preliminary rockfall areas. To make a fine quantitative analysis of the collapse source area, we need to digitize and resample the study area. The grid size is determined comprehensively according to the extent of the study area and the scale of the collapses. The preliminary source identification area of the collapse is constrained by the slope

Secondly, rockfall conditioning factors in the preliminary source areas are extracted and processed. The formation of collapses is controlled by topography, physical and chemical weathering, human engineering disturbance, and other factors. Therefore, we selected the factors that have the most serious impact on rock collapse in the study area. In addition to slope degree, the determination of the refined source area is also constrained by slope aspect, elevation, lithology, slope structure (the spatial relationship between bedding attitude and slope face), joint density, land-use type, etc. Among these factors, slope, aspect, elevation, joint density, and distance to roads are continuous factors. We use the minimal description length principle (MDLP) to discretize these continuous factors and thereby improve the predictive ability of the model. MDLP is a method of discretizing continuous attributes that requires less manual intervention and gives a better quantitative effect (Varnes et al., 1984) than methods such as equal frequency, equal width, and manual definition.

Finally, we classify the susceptibility value into five levels (very low, low, moderate, high, and very high) by the Natural Breaks method. In this method, class breaks are placed so that differences between groups are as large as possible and differences within groups are as small as possible.
The units in the highest class on the susceptibility map produced by the better-performing model are finalized as rockfall source areas.

In this study, the Poisson model is adopted for constructing temporal probability. It is the exceedance probability of rockfall occurrence during a given period as follows:

Where t is the return period, e.g., 5, 20, and 50 years; the recurrence interval (RI) is the historical mean recurrence interval for

to fit the size probability.

Where Pv is size probability; V is rockfall volume; is parameter primarily controlling power-law decay for medium and

The purpose of rockfall hazard assessment in this study is to know the possibility of rockfall fragments reaching the road with a certain magnitude under a certain return period. We multiply four probabilities to assess the hazard level (Eq.4). By overlaying the hazard probability map with the highway map, risky road sections can finally be identified.

H = Ps × Pt × Pv × Pr (4)

Where H is rockfall hazard probability; Ps is the spatial probability of rockfall sources introduced in Section 3.3.1; Pt is the temporal probability of rockfall sources; Pv is the size probability of rockfall sources; Pr is the reaching probability of rockfall sources to roads.

The preliminary rockfall sources are further classified by considering eight conditioning factors, such as slope, aspect, elevation, roughness, slope structure, lithology, etc. Table 2 lists these factors with the classes obtained using MDLP (Varnes et al., 1984).

Figure.6 shows susceptibility maps of the rockfall source area by MLRM and RFM. In terms of factor importance, distance from road and slope are the most important factors in both models (Figure.7). However, a big difference exists for lithology. RFM ranks lithology as a relatively insignificant predictor, whereas MLRM ranks it as the third most important factor. The land-use factor is the least significant in both models.
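The product in Eq.4 can be illustrated with a minimal Python sketch. The temporal term below assumes the standard Poisson exceedance form 1 - exp(-t/RI), which matches the symbols t and RI defined in the text but is an assumption here, not a confirmed transcription of the paper's equation. All numerical values (the recurrence interval and the spatial, size, and reaching probabilities) are hypothetical and serve only to show how the four terms combine.

```python
import math


def temporal_probability(t_years, recurrence_interval):
    """Probability of at least one event within t_years.

    ASSUMPTION: standard Poisson exceedance form 1 - exp(-t / RI).
    """
    if t_years < 0 or recurrence_interval <= 0:
        raise ValueError("t_years must be >= 0 and recurrence_interval > 0")
    return 1.0 - math.exp(-t_years / recurrence_interval)


def hazard_probability(p_spatial, p_temporal, p_size, p_reach):
    """Product of the four probabilities, H = Ps * Pt * Pv * Pr (Eq.4)."""
    for p in (p_spatial, p_temporal, p_size, p_reach):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return p_spatial * p_temporal * p_size * p_reach


if __name__ == "__main__":
    # Hypothetical inputs for a single source cell (not values from the study).
    ri = 10.0                             # mean recurrence interval, years
    ps, pv, pr = 0.8, 0.3, 0.5            # spatial, size, reaching probabilities
    for t in (5, 20, 50):                 # return periods used in the text
        pt = temporal_probability(t, ri)
        h = hazard_probability(ps, pt, pv, pr)
        print(f"t = {t:2d} yr  Pt = {pt:.3f}  H = {h:.3f}")
```

Because the terms are multiplied, a low value in any one of them (for example, a small size probability for the 10 000 m3 scenario) bounds the hazard for that cell, whatever the other three terms are.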

265
ROC curve analysis shows that the success rate of MLRM is 93%, while that of RFM is 5% higher (Figure.8). This indicates that RFM performs better than MLRM in the study area. The prediction performance of the two models was further evaluated and compared in the field (Table 3).
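The ROC comparison above can be sketched as an area-under-curve calculation. The following Python example computes AUC from its rank interpretation (the probability that a randomly chosen source cell scores higher than a randomly chosen non-source cell). The labels and scores are made up for illustration; they are not the study's data, and the names model_a and model_b are hypothetical stand-ins, not MLRM or RFM outputs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for observed source cells, 0 for non-source cells.
    scores: susceptibility values predicted by a model.
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Hypothetical validation cells and model outputs.
    labels = [1, 1, 1, 0, 0, 0]
    model_a = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
    model_b = [0.9, 0.8, 0.6, 0.5, 0.3, 0.2]
    print("model_a AUC:", round(roc_auc(labels, model_a), 3))
    print("model_b AUC:", round(roc_auc(labels, model_b), 3))
```

The pairwise form is quadratic in the number of cells, so for a full raster a sorted, rank-based implementation would be preferable; the version here is kept short for readability.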
Four typical slopes along the G318 road were selected for validation, including steep slope with sandstone inter-bedding with

The integrity of the rock mass is good, but there are blocks piled up on the slope.
It is a high-medium class.
The result from MLRM is accurate.

No.30
The slope surface is gentle, and there is no rockfall accumulation, which is a middle-class-prone area.
The result from RFM is accurate.

No.31
The vegetation coverage rate is high and the slope surface is gentle. It is a low-class prone area.
The result from RFM is accurate.

No.32
The vegetation coverage rate is very high.
There are no exposed rock blocks. It is a medium-low class prone area.
The result from RFM is accurate.

(Table 7). Because of the lower size probability of 10 000 m3, the maximum hazard probabilities are generally half of the values under the 1 000 m3 size scenario.
If the above results are overlaid on the national road G318, the risky sections can be identified, together with the detailed probability of rockfall fragments impacting the road. Table 8
In understanding and analyzing rockfall hazard risk, it is very important to identify the source areas and to predict the temporal, size, and reaching probabilities.

The identification of the source area is the first step. The fineness of source area identification has an important impact on the following steps, such as the fragment trajectory and rockfall size analysis. However, the source area in historical rockfall hazard data is often missing or mixed with the rock debris accumulation, so it is difficult to identify. In this study, the slope angle threshold was found to be 27°, based on the relationship between the historical data and the slope. The area above this angle is preliminarily selected as source area. After this preliminary screening with the SAT method, we conducted a secondary screening of the initial source results using different models. By comparing the multivariate logistic regression model and the random forest model, the final source areas were determined and showed good accuracy after validation. Importantly, with our approach the efficiency of the subsequent trajectory simulation can be improved by a factor of 40, without losing any rockfalls identified from historical records or field survey.
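The preliminary screening by slope angle threshold can be sketched as a simple raster mask. The Python example below is only an illustration: the slope grid is made up, the function names are hypothetical, and treating cells exactly at 27° as sources (>= rather than >) is an assumption, since the text only says that the area above the threshold is selected.

```python
SLOPE_THRESHOLD_DEG = 27.0  # slope angle threshold reported in the text


def preliminary_sources(slope_grid, threshold=SLOPE_THRESHOLD_DEG):
    """Return a boolean mask: True where a cell is a preliminary source.

    ASSUMPTION: cells exactly at the threshold are kept (>=).
    """
    return [[cell >= threshold for cell in row] for row in slope_grid]


def retained_fraction(mask):
    """Fraction of cells passed on to the susceptibility models."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return kept / total


if __name__ == "__main__":
    # Hypothetical 3 x 3 slope raster in degrees.
    slope = [
        [10.0, 30.0, 45.0],
        [26.9, 27.0, 5.0],
        [50.0, 12.0, 28.0],
    ]
    mask = preliminary_sources(slope)
    for row in mask:
        print(row)
    print("retained fraction:", round(retained_fraction(mask), 3))
```

Only the retained cells need to be classified by the susceptibility models and released in the trajectory simulation, which is why this screening step reduces the simulation cost.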
Due to the special topography and geological conditions, there are a large number of multi-stage scarps in the study area (as shown in Figure.15), and more accurate source area identification is required. In the future, more detailed work will be focused on

This paper adopted energy balance theory, GIS spatial statistical functions, and flow theory to simulate the influence area of rock fragments. The parameters in the simulation were calibrated and validated against historical records collected by field investigation. The results indicate that the accuracy of the quantitative analysis is very high. However, collapses fail in various ways, which the Flow-R simulation ignores. There are multiple failure modes of collapse, such as toppling, falling, and sliding. The simulation procedure simplifies the laws governing rock-mass failure and block propagation.
Compared with STONE, Rockyfor3D, RAMMS, and DDA, Flow-R can simulate the motion of multiple collapse sources at the regional scale with less time and cost. However, failure modes cannot be considered with Flow-R. In the future, we will optimize the simulation by considering rock source volume, block shape, failure modes, and mechanical parameters, and achieve a three-dimensional dynamic display of the collapse process at the regional scale.

The simulation of multistage scarps should consider the energy transfer caused by the collision between the scarps or the
The national road G318 in west Hubei, China, is prone to high-frequency rockfall hazards. In this paper, rockfall hazard and its probability are quantitatively assessed. Rockfall source areas are first identified by the slope angle threshold method and then refined using the susceptibility mapping method. A slope of 27° is determined as the threshold angle of rockfalls in the study area. The multivariate logistic regression model and the random forest model are compared in terms of model performance. Source area cells selected by the random forest model are finally chosen and applied for the rockfall reaching probability assessment. Compared to the slope angle threshold method alone, the source areas determined by our approach are more accurate when geological data are available. Meanwhile, trajectory simulation becomes markedly more efficient without losing any rockfalls identified from historical records or field survey. In addition, the size probability and temporal probability of rockfall sources are calculated for two size scenarios (1 000 m3 and 10 000 m3) and three return periods (5, 20, and 50 years).
Parameter selection is very important for rockfall trajectory simulation. The smallest reach angle controls the farthest horizontal distance and hence the reaching probability. In this paper, 25° is determined as the smallest reach angle. The horizontal distance is then simulated by Flow-R and validated against historical rockfalls with field-measured records. In the future, we will optimize the simulation by considering rock source volume, block shape, failure modes, and mechanical parameters, and achieve a three-dimensional dynamic display of the collapse process at the regional scale.
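The link between the smallest reach angle and the farthest horizontal distance can be illustrated with the geometric definition of the reach angle: the angle of the straight line joining the source to the farthest deposit, so that tan(angle) = drop height / horizontal distance. The Python sketch below uses only this geometric relation. It is not the Flow-R propagation algorithm, and the 100 m drop height is a hypothetical value.

```python
import math


def max_runout_distance(drop_height_m, reach_angle_deg):
    """Farthest horizontal distance implied by a reach angle.

    Uses tan(reach angle) = drop height / horizontal distance.
    """
    if drop_height_m < 0:
        raise ValueError("drop height must be non-negative")
    if not 0.0 < reach_angle_deg < 90.0:
        raise ValueError("reach angle must be between 0 and 90 degrees")
    return drop_height_m / math.tan(math.radians(reach_angle_deg))


def is_reached(drop_height_m, horizontal_distance_m, reach_angle_deg=25.0):
    """True if a point at the given distance lies within the runout limit.

    The default of 25 degrees is the smallest reach angle reported in the text.
    """
    return horizontal_distance_m <= max_runout_distance(drop_height_m, reach_angle_deg)


if __name__ == "__main__":
    height = 100.0  # hypothetical vertical drop from source to road, metres
    for angle in (25.0, 30.0, 35.0):
        limit = max_runout_distance(height, angle)
        print(f"reach angle {angle:.0f} deg -> runout limit {limit:.1f} m")
    print("road at 200 m reached:", is_reached(height, 200.0))
```

A smaller reach angle gives a longer runout limit, which is why the smallest back-analysed angle sets the outer boundary of the potential influence area.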
Rockfall hazard probability is finally obtained by integrating the spatial, temporal, and size probabilities of source areas and the reaching probability of rock fragments. For the rainfall return periods of 5 and 20 years, there is no highly hazardous road section, but