New Global Characterization of Landslide Exposure

Landslides triggered by intense rainfall are hazards that impact people and infrastructure across the world, but comprehensively quantifying exposure to these hazards remains challenging. Unlike earthquakes or flooding, which affect large areas, landslides primarily occur in highly susceptible parts of a landscape affected by intense rainfall or seismic shaking, which may not intersect human settlement or infrastructure. Existing global landslide inventories generally include only those landslides reported to have caused impacts, leading to significant biases toward both locations where impacts are common and areas with higher reporting capacity. To address the limits of report-based inventories, we have combined a globally homogenous landslide hazard proxy derived from satellite data with open-source datasets on population, roads and infrastructure to consistently estimate global exposure to landslide hazards. These exposure models compare favourably with existing datasets of rainfall-triggered landslide fatalities, while filling in major gaps in inventory-based estimates in parts of the world with lower reporting capacity. Our findings also, for the first time, distinguish relative levels of landslide hazard mitigation between different countries.

rainfall. The LHASA model does not consider snow avalanches. The effects of this change should be minimal in the tropical and temperate zones previously studied.
Finally, leveraging the new IMERG rainfall product, we recompute the thresholds above which landslide activity is anticipated at each pixel, based on the 95th percentile of a 7-day ARI weighted rainfall accumulation. The model is then reprocessed from 2000 to present, and we build a nearly 20-year record of landslide Nowcasts around the world. Averaging the Nowcasts by month, we construct a Nowcast climatology, or average landslide Nowcast rate for each pixel. We also compute annual Nowcast rates. This provides a globally consistent proxy for landslide hazard over the course of the year in each location. We term this 'Nowcast density', and it represents a proxy for intensity of landslide activity. We can then combine this with data on population and infrastructure to assess the relative exposure to landslides. The result is a raster dataset at approximately 1km resolution for each month of the years in the IMERG record. We compute additional metrics such as the inter-annual variability in Nowcast frequency and standard deviations of Nowcast frequency.
This information is incorporated into the annual exposure estimates to provide a measure of the variability. This uncertainty analysis is discussed in more detail below.
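The thresholding and monthly averaging described above can be illustrated for a single pixel with a minimal sketch. It assumes the 7-day ARI weighted rainfall index has already been computed for each day (the weighting itself is not reproduced here), and it ignores the susceptibility component of LHASA. The function names, the input layout, and the choice of linear-interpolation percentile are illustrative assumptions, not the LHASA implementation.

```python
from collections import defaultdict


def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def monthly_nowcast_rate(ari_by_day, q=95.0):
    """Hypothetical helper for one pixel.

    ari_by_day: list of (month, ari_value) pairs, one per day of record.
    Returns (threshold, {month: fraction of days exceeding the threshold}).
    """
    # Threshold is the q-th percentile of the whole record at this pixel.
    threshold = percentile([value for _, value in ari_by_day], q)
    days = defaultdict(int)
    hits = defaultdict(int)
    for month, value in ari_by_day:
        days[month] += 1
        if value > threshold:  # a "Nowcast" day in this simplified sketch
            hits[month] += 1
    return threshold, {month: hits[month] / days[month] for month in days}
```

Applying the same function to every pixel, and grouping by year instead of month, would give the annual rates used for the exposure estimates.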

Exposure datasets & integration with hazard
We have overlaid the hazard footprints derived from the LHASA-based Nowcast climatology on top of publicly available datasets of population and infrastructure globally to map the exposure of these elements to landslide hazard. We have additionally aggregated these data at a national scale to compare with existing studies. Below, we first describe the datasets used, and then the approach taken to combine them with the hazard outputs.
We use population data from the Gridded Population of the World version 4 dataset (Doxsey-Whitfield et al., 2015), adjusted to the UN WPP Population Density for 2015. Use of this dataset is in line with other studies of population exposure to global hazards (Carrao, Naumann, & Barbosa, 2016; Dilley et al., 2005; Kleinen & Petschel-Held, 2007). The resolution of this dataset is the same as that of the LHASA Nowcast output (approximately 1km), and thus it can be directly mapped onto the hazard data.
The definition of critical infrastructure can differ depending on the relevant stakeholder or location. The UN Global Assessment Report 2015 incorporates schools, hospitals and residential areas (De Bono & Chatenoux, 2014), and we use this as an initial basis for our estimates. We incorporate roads as defined in the Global Roads Inventory Project (Meijer et al., 2018), and amenities including hospitals, schools, fuel stations and power facilities as defined by OpenStreetMap. Both catalogs have a global extent and are updated regularly. Additionally, they offer a consistent set of data that can be compared across the world. While there are some caveats to this comparison, which are discussed below, we suggest that these two datasets are likely the best datasets with global coverage, open access, and recent updates.
https://doi.org/10.5194/nhess-2019-434 Preprint. Discussion started: 22 January 2020 © Author(s) 2020. CC BY 4.0 License.
The GRIP roads dataset harmonises nearly 60 datasets describing road infrastructure into a single, consistent dataset covering 222 countries (Meijer et al. 2018). GRIP incorporates roads derived from OSM as well as other data sources, and is considered to be a harmonised global road catalog. The daily updates for OSM are not incorporated into GRIP, but we consider the globally harmonised nature to be more important than a frequently updated catalog for the purposes of our study. This dataset is a shapefile of linear features, which is not directly compatible with the 1km resolution landslide hazard outputs. To connect the linear road dataset with the pixel-based Nowcast density data, we have used the Line Density tool in ArcGIS to calculate the density of roads at 1km resolution, producing a road density map with units of km/km². Although the GRIP database classifies roads into one of five classes depending on size and importance (e.g. primary highway, residential road), we have not distinguished between these classes in our analysis. While economic impacts vary based on the type of road, our analysis is meant to highlight the total potential exposed length for all types of roads.
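To make the conversion from linear features to a gridded density concrete, the following is a simplified sketch. It is not the ArcGIS Line Density algorithm (which uses a search neighbourhood around each cell); instead it assumes planar coordinates in kilometres, cuts each segment into short pieces, and credits each piece's length to the grid cell containing its midpoint. The function name and parameters are hypothetical.

```python
import math
from collections import defaultdict


def road_density(segments, cell_km=1.0, step_km=0.01):
    """Approximate road density per grid cell, in km of road per km^2.

    segments: iterable of ((x0, y0), (x1, y1)) end points in planar km.
    Returns {(col, row): density}; cells with no roads are omitted.
    """
    length = defaultdict(float)
    for (x0, y0), (x1, y1) in segments:
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len == 0:
            continue
        n = max(1, math.ceil(seg_len / step_km))
        piece = seg_len / n
        for i in range(n):
            t = (i + 0.5) / n  # midpoint of the i-th piece
            col = math.floor((x0 + t * (x1 - x0)) / cell_km)
            row = math.floor((y0 + t * (y1 - y0)) / cell_km)
            length[(col, row)] += piece
    area = cell_km * cell_km
    return {cell: km / area for cell, km in length.items()}
```

Because all road classes are pooled in the analysis, the sketch takes no road-class attribute; a per-class density would only require keying the accumulator by class as well as by cell.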
OpenStreetMap (OSM) is a continually updated global map of infrastructure, roads, settlement and land uses (OpenStreetMap contributors 2015). The updates are contributed by members of the public and the data is openly available in shapefile and XML format. While differing levels of input from different parts of the world mean that the completeness of the map varies by region (Barrington-Leigh and Millard-Ball 2017), the specificity of the data makes it an excellent source for infrastructure information. The detailed classification of features in the map allows us to isolate specific types of infrastructure, such as medical amenities or power stations. In addition, the open-source nature of OSM means this approach is highly replicable. We have used the OSM Planet data file (a single XML document of approximately 1TB, containing the information for every mapped feature in the OSM map) and parsed the XML data using a Python-based script to obtain the density of critical amenities at a 1km resolution. We define critical amenities as those labelled 'School', 'Hospital', 'Fuel Station', 'Power Station' and other 'Power' nodes (including substations and transformers), based on the OSM feature definitions. The OSM Planet file was downloaded on June 24th 2019. The script used to parse this file is available in the supplementary material.
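As an illustration of this kind of parsing (not the script from the supplementary material), the sketch below streams an OSM XML file and counts critical amenity nodes per grid cell. It assumes the standard OSM XML layout of `node` elements with `lat`/`lon` attributes and child `tag` elements, and it assumes the mapping of the paper's categories to the OSM tags `amenity=school`, `amenity=hospital`, `amenity=fuel` and any `power=*` tag. Only nodes are considered, and the cell size in degrees is an illustrative placeholder.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Assumed mapping of the paper's amenity categories to OSM tag values.
CRITICAL_AMENITIES = {"school", "hospital", "fuel"}


def is_critical(tags):
    """True if a node's tags mark it as a critical amenity (assumed rule)."""
    return tags.get("amenity") in CRITICAL_AMENITIES or "power" in tags


def amenity_counts(osm_source, cell_deg=0.01):
    """Count critical amenity nodes per grid cell.

    osm_source: path or binary file object containing OSM XML.
    Returns a Counter keyed by (col, row) indices on a lon/lat grid.
    """
    counts = Counter()
    # iterparse streams the document instead of loading it all at once.
    for _, elem in ET.iterparse(osm_source, events=("end",)):
        if elem.tag != "node":
            continue
        tags = {t.get("k"): t.get("v") for t in elem.findall("tag")}
        if is_critical(tags):
            col = math.floor(float(elem.get("lon")) / cell_deg)
            row = math.floor(float(elem.get("lat")) / cell_deg)
            counts[(col, row)] += 1
        elem.clear()  # drop the node's children once it has been processed
    return counts
```

For a file the size of the full Planet dump, a production script would also need to release processed elements from the root and would probably use a dedicated OSM reader; the sketch only shows the filtering and binning logic.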
To combine the roads datasets and OSM-derived critical infrastructure with the hazard outputs, we have multiplied each by the Nowcast density for each full year in the IMERG archive (2000-2018) and taken the mean value and standard deviation. The resulting datasets on exposure for population, roads, and critical infrastructure are all calculated at approximately 1km resolution. We have also generated month-by-month exposure rasters to estimate the climatology of exposure for the same exposed elements. Since these outputs are based upon the LHASA Nowcast output, it is important to clarify the units in which our estimates of exposure are expressed. Table 1 provides a summary of the units and the terms used in the study.
In Table 1, the units for each of the exposure outputs are also explained. We use the shorthand Popexp, Roadexp, and Infrexp to denote population, road and infrastructure exposure, respectively.
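The multiplication of an exposed-element raster by the annual Nowcast density, followed by the mean and standard deviation across years, can be sketched as below. Rasters are represented as nested lists of equal shape for simplicity; the function name is hypothetical, and the use of the population (rather than sample) standard deviation is an assumption, since the text does not specify which is used.

```python
from statistics import mean, pstdev


def exposure_stats(element, nowcast_by_year):
    """Per-cell mean and standard deviation of annual exposure.

    element: 2D list (e.g. population count, road km/km^2, amenity count).
    nowcast_by_year: {year: 2D list of annual Nowcast density, same shape}.
    Returns (mean_exposure, std_exposure) as 2D lists.
    """
    rows, cols = len(element), len(element[0])
    mean_exp = [[0.0] * cols for _ in range(rows)]
    std_exp = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Exposure in one year = exposed elements x Nowcasts in that year.
            yearly = [element[r][c] * grid[r][c]
                      for grid in nowcast_by_year.values()]
            mean_exp[r][c] = mean(yearly)
            std_exp[r][c] = pstdev(yearly)
    return mean_exp, std_exp
```

With a population raster as `element`, the mean output carries units of person-Nowcasts per year, which are the units later used for Popexp.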

Error assessment
Kirschbaum and Stanley (2018)  To explore the relative variability in landslide activity, we estimate the standard deviation in annual Nowcast density at each point, based on the near-20 year IMERG rainfall input. We then propagate the error into the estimates for exposure for population, roads and critical infrastructure. The raster data for the standard deviations in error are available in the supplemental data. Estimating errors associated with OpenStreetMap data can be challenging, since the data quality is determined by volunteers who contribute to the map database. Broadly, we suggest it is appropriate to consider two distinct sources of error: the location accuracy of the individual points and infrastructure, and the completeness of the inventory. As discussed by Mooney and coauthors (2010), a lack of ground data across the world makes it challenging to assess the positional accuracy. However, in some locations, data can be compared with existing sources. In the UK, Haklay (2010) suggests that OSM data points offer positional accuracy comparable with the Ordnance Survey maps (the government standard). For the purposes of our study, where the maximum resolution available for the landslide hazard data is 1km, this positional accuracy is in excess of the requirements. However, completeness of the map is more problematic.
Barrington-Leigh and Millard-Ball (2017) assess the relative completeness of the OSM roads data on a country-by-country basis, finding that OSM data in many developed countries is near-complete, although this declines in some states with lower GDP. The completeness varies within individual countries, with the most complete mapping observed in the highest density cities as well as the most sparsely populated areas (reaching a low in moderately populated areas). We assume that the estimate of completeness presented by Barrington-Leigh and Millard-Ball (2017) for roads is applicable to other infrastructure; we are not aware of other global estimates of OSM completeness for specific infrastructure categories, so while this assumption may not fully hold we suggest it is more informative to use this completeness estimate than none at all. Applying this as an error systematically across our analyses is challenging; we can normalize national-level OSM based measurements by the completeness measure of Barrington-Leigh and Millard-Ball (2017), but at a pixel level we present the exposure 'as is', since we have no a priori concept of how to apply completeness estimates at this scale. To effectively normalise the exposure data at a country level, we provide the completeness measure derived from Barrington-Leigh and Millard-Ball (2017) in Supplementary Table 1. In the figures in supplementary material that show Infrexp aggregated at a national level, we normalise the exposed elements by the total number of critical infrastructure elements in each country, which serves to provide a useful intercomparison of the relative hazard, and does not require completeness metrics.
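The two national-level normalisations mentioned above can be written down as a short sketch. Interpreting 'normalize by the completeness measure' as division by a completeness fraction between 0 and 1 is our assumption for illustration; the function names and argument conventions are hypothetical.

```python
def completeness_adjusted(exposure, completeness):
    """Scale a national OSM-based exposure total by map completeness.

    completeness: fraction in (0, 1]; 1.0 means the map is assumed complete.
    Division is an assumed reading of 'normalise by completeness'.
    """
    if not 0 < completeness <= 1:
        raise ValueError("completeness must be in (0, 1]")
    return exposure / completeness


def exposure_per_element(exposure, total_elements):
    """Exposure divided by the number of mapped elements in the country.

    This ratio needs no completeness estimate, provided mapped elements are
    a representative sample of all elements (an assumption).
    """
    if total_elements <= 0:
        raise ValueError("total_elements must be positive")
    return exposure / total_elements
```

The second form is the one that matches the description of the supplementary figures, since it compares countries without requiring a completeness value for each.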
The GRIP roads database (Meijer et al. 2018) draws a significant part of the road inventory from OpenStreetMap, and so is subject to some of the same error constraints. In Europe, the roads are derived primarily from OSM, although completeness in this part of the world is near-perfect (Barrington-Leigh and Millard-Ball 2017). GRIP also uses OSM data in China, where there is a dearth of other freely available datasets. As such, completeness estimates in China are difficult to accurately characterize, and we do not attempt to do so. Elsewhere, GRIP incorporates other road datasets to supplement OSM. These input datasets are limited to those with positional accuracy better than 500m, which precludes significant positional errors that would affect our km-scale analysis. We are not aware of estimates of the completeness of the GRIP dataset; since it integrates datasets from all over the world, external validation datasets of completeness are unlikely to exist comprehensively. As such, while we note that there may be parts of the world where coverage is incomplete, we do not have strong constraints on this.

Results
Our analyses provide a global set of observations of landslide exposure, in both raster format and tabulated by country. The source data is available in the supplementary material associated with this study. Figure 1 shows the population exposure annually for each 1km pixel and Figure 2 shows the exposure of population, roads, and critical infrastructure at the same scale for a portion of Northern Italy and the Alps, to highlight the nature of the different datasets. As can be observed in Figure 2, population and roads are significantly more widely distributed than critical infrastructure. Infrastructure is instead concentrated primarily in urban centers, although power distribution infrastructure follows similar transportation corridors to road networks. In other parts of the world, there are significant levels of exposure of critical infrastructure to landslide hazard. The co-location of power distribution and road network exposure highlights the potential for complex post-landslide damage and multi-sector impacts.
For each country we have tabulated the aggregated values for Popexp, Roadexp, and Infrexp, and the average annual Nowcast density.
We also show the total population, total length of roads from GRIP, and total number of OSM critical infrastructure elements; this allows for calculation of the fraction of the total that is exposed for each of these aspects. To normalize the number of Nowcasts for each country, we divide by area in square decimal degrees rather than square kilometers, since the Nowcast data is output on a grid based on decimal degrees. The same aggregation approach could similarly be used at a sub-national level to assess relative impacts in different administrative areas. These data can be found in Supplementary Table 1, where all data necessary to replicate these results is available. We also list the OSM completeness estimates from Barrington-Leigh and Millard-Ball (2017), the fatalities per country due to non-seismic landslides assessed by Froude and Petley (2018), and the landslide-linked economic impacts assessed by Dilley et al. (2005). These datasets are, to our knowledge, the most current datasets that assess landslide impact in terms of economic cost and fatalities globally, and provide valuable points of comparison for our results. Comparison of calculated Popexp with recorded fatalities is shown in Figure 5, and comparison of Roadexp with economic impacts from Dilley et al. (2005) in Figure 6.
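The country-level aggregation described above amounts to a zonal sum. The sketch below assumes a raster of country identifiers aligned with the exposure and Nowcast rasters (nested lists, with `None` marking cells outside any country) and a square grid cell size in decimal degrees; these inputs and the function name are illustrative assumptions.

```python
from collections import defaultdict


def aggregate_by_country(country_ids, exposure, nowcast, cell_deg):
    """Sum exposure per country and normalise Nowcasts by area in square degrees.

    country_ids, exposure, nowcast: 2D lists of identical shape.
    cell_deg: grid cell size in decimal degrees.
    Returns {country: {"exposure": ..., "nowcasts_per_sq_deg": ...}}.
    """
    totals = defaultdict(lambda: {"exposure": 0.0, "nowcasts": 0.0, "cells": 0})
    for id_row, exp_row, now_row in zip(country_ids, exposure, nowcast):
        for cid, exp, now in zip(id_row, exp_row, now_row):
            if cid is None:  # ocean or unassigned cell
                continue
            entry = totals[cid]
            entry["exposure"] += exp
            entry["nowcasts"] += now
            entry["cells"] += 1
    cell_area = cell_deg * cell_deg  # area of one cell in square degrees
    return {
        cid: {
            "exposure": entry["exposure"],
            "nowcasts_per_sq_deg": entry["nowcasts"] / (entry["cells"] * cell_area),
        }
        for cid, entry in totals.items()
    }
```

Replacing the country raster with a raster of sub-national administrative units would give the sub-national aggregation mentioned in the text without other changes.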

Discussion
The most striking initial result of our study is that significantly larger proportions of the globe are exposed to rainfall-triggered landslide hazards than are often considered. Inventory-based assessments (e.g. Dilley et al. 2005) do not show significant levels of landslide hazard and exposure in sub-Saharan Africa or much of Asia and South America, while we find that many of these countries have significant proportions of the population and infrastructure exposed. It is perhaps not surprising that exposure to landslide hazard is elevated in the major mountain belts of the Andes and the Alpine-Himalayan Orogeny, but there are other key hotspots that may be less well known. These areas include much of Japan, the Rwenzori mountains in Africa, Central America and Mexico, and much of the Caribbean. We find specific hotspots for certain cities within or near mountain belts; this is particularly evident at the edges of large conurbations that abut mountainous areas, such as Taipei, Rio de Janeiro and the edges of Tokyo.
While the zones of densely packed critical infrastructure such as schools and hospitals are also in general associated with these urban areas, the impact of landslides on linear infrastructure is more widespread. Roads and power transmission facilities often follow similar linear corridors, and where those intersect areas of high landslide hazard the relative exposure can still be important. The localised impact of a single landslide on a densely populated urban zone may be very high, with several critical infrastructural elements affected. However, the likelihood of a landslide occurring somewhere along lengthy road or power transmission segments in regional-scale rainfall events is higher, and an interruption to linear infrastructure may impact lifelines that are relevant in disaster response. Thus the localised and distributed impacts should be considered alongside one another; we suggest that highlighting the most vulnerable corridors for power transmission and road traffic is an important subject for future work.
To explore these results against independent datasets of landslide hazard and risk, we have aggregated the data at a country level (Supplementary Table 1). We can then highlight those nations with the highest landslide impact both in absolute terms (total exposed people and infrastructure) and as a proportion of the overall population or infrastructure in that country.
As might be expected, countries with the largest population have the highest overall population exposure, although exposure in China exceeds that of India despite China having a smaller population. Exposure of roads is also greatest in China and the United States, which are both highly populated with good OSM coverage. These absolute values are important, but we suggest that more insight can be gained by assessing the relative exposure of population and infrastructure in each country, as well as by comparing the different relative values between nations.
Intercomparison of different countries can highlight those nations where the impact of landslides is greatest, and can draw attention to smaller, less developed nations where landslide statistics from report-based inventories may be lacking. The bulk of reported landslide events occur in larger nations where statistical variability of landsliding is likely damped over larger areas like Nepal, Taiwan, China and Japan. While we find high normalised hazard estimates in many of those states, our analysis also highlights smaller nations where the relative impact of landslides may be more significant on longer timescales.
Alongside the previously mentioned nations, we also find several smaller states with higher proportions of exposed population: Montenegro, Bosnia and Herzegovina, and Macedonia are notable in the Balkan area in particular. To test whether the Nowcast-exposure estimates are a useful predictor of landslide risk, we can compare them to existing datasets. In Figure 5, we plot the total exposure of population in each country (in units of person-Nowcasts per year) against the landslide fatality dataset assembled by Froude and Petley (2018). This dataset, collected from 2004-2016, consists of 4862 separate landslide events that resulted in fatalities, and is the most comprehensive dataset for landslides that have caused fatalities in the world. Figure 5 highlights that there is a relatively strong correlation, with countries in Asia, Central America and Africa generally exhibiting higher numbers of fatalities for a given population exposure than observations in Europe.
In Figure 6, we plot the total road exposure against a derived metric of GDP impact from Dilley et al. (2005) based on the EM-DAT landslide dataset. The EM-DAT based assessment divides the globe into 2.5 degree squares and does not present absolute values of total economic loss, but instead a relative decile (1-10 with increasing risk) ranking of grid cells based upon the calculated economic loss risks. While this metric is not a quantitative measure of economic risk, we suggest that it is possible to compare these relative loss rates against our results. As with the comparison between Popexp and fatalities, we see a relatively strong correlation. However, it is clear that the EM-DAT dataset is incomplete; the complete absence of data on costs associated with landslides in African countries limits how effectively we can compare this inventory with our model estimates. The absence of data does further highlight the value of our globally consistent approach.
Although there are countries without data in the EM-DAT derived database, it may be possible to derive these missing values based on the relationship between Roadexp and economic impact in the countries where EM-DAT data exists (points in Figure 6), i.e., to estimate the y-axis values from a known x-axis value. Extrapolation and validation of this relationship is beyond the scope of the current work, but we suggest it is an important topic for future research.
In order to learn which factors control the relationships between exposure and impact in different countries, we can combine the inventory data with our estimates and compare it with other variables. In Figure 7, we plot the number of fatalities recorded in the dataset of Froude and Petley (2018) divided by Popexp. This is subdivided by continent. We suggest that fatalities divided by exposure provides a proxy for the degree of hazard mitigation in a given country; lower values indicate that for a given level of population exposure, fewer fatalities are observed. We find high variability in each continent, although in general there are lower levels of fatalities per unit exposure in Europe when compared to Central America and the Caribbean, as well as South America. Germany and Hong Kong, highly developed countries, have proportionally low fatalities despite high levels of exposure, likely a result of extensive mitigation efforts.
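The fatalities-per-exposure ratio described above is straightforward to compute once fatalities and Popexp are tabulated by country. The sketch below groups the ratio by continent and skips countries with zero exposure, for which the ratio is undefined; the record layout and function name are illustrative assumptions rather than the format of Supplementary Table 1.

```python
from collections import defaultdict


def fatalities_per_exposure(records):
    """Group the fatalities / Popexp ratio by continent.

    records: iterable of dicts with keys 'country', 'continent',
             'fatalities' and 'popexp' (person-Nowcasts per year).
    Returns {continent: [(country, ratio), ...]} sorted by ascending ratio.
    """
    by_continent = defaultdict(list)
    for rec in records:
        if rec["popexp"] <= 0:  # ratio undefined without exposure
            continue
        ratio = rec["fatalities"] / rec["popexp"]
        by_continent[rec["continent"]].append((rec["country"], ratio))
    return {
        continent: sorted(pairs, key=lambda pair: pair[1])
        for continent, pairs in by_continent.items()
    }
```

Countries with very small Popexp will produce large ratios from only a few fatalities, so a minimum-exposure filter may be worth applying before interpreting the ranking.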

At the other end of the spectrum, some less developed countries exhibit higher fatalities for a given exposure; Sierra Leone, Burkina Faso, Haiti, Suriname, Bangladesh, Dominica and the Philippines have a significantly higher level of fatalities per unit of exposure. Some key outliers (Qatar and Bahrain) have high fatality per unit exposure, but these nations have very low overall exposure (see Supplementary Table 1), meaning that even a small number of fatalities increases the y-axis value in Figure 7 to a large degree. This analysis, while not comprehensive, may inform national-level landslide risk management and provide insight into relative vulnerability to a given level of exposure.
To explore whether the variability in fatalities divided by Popexp seen in Figure 7 is related to the level of development in each country, we have compared fatalities / Popexp with 2018 GDP values for each country (World Bank 2019). A priori, we would expect countries with greater GDP to be capable of mitigating hazard more effectively, and thus have fewer fatalities for a given level of exposure. However, while there is a small average decline in fatalities for a given exposure as GDP increases (Figure 8), with some high GDP countries showing the lowest fatality values (notably Germany and Hong Kong), there is a significant degree of variability, suggesting that the relationship is more complex.
We note that comparing the model-based estimates of exposure with the fatality inventory of Froude and Petley (2018) in this manner may lead to erroneous conclusions if not considered carefully. While it is likely that many, if not all, of the fatal landslides in developed countries are accurately recorded, this may not be the case in states where disaster management is less advanced. As such, the lack of a strong relationship between fatalities per unit exposure and GDP per capita observed in Figure 8 may represent gaps in the data in countries with lower GDP per capita, and thus a systematic bias within this analysis. Phrased differently, there may still be a relationship between GDP and fatalities for a given exposure level, but this may be masked by a lower reporting capacity in less-developed nations.

While these results provide an independent estimate of landslide hazard and exposure across the globe that does not rely on a specific inventory, there are still assumptions and limitations that should be considered to put these results in appropriate context.
The most important caveat associated with this data is that Nowcasts do not represent a guarantee of a landslide. The LHASA model Nowcasts (Kirschbaum and Stanley 2018) are issued when there is an increased likelihood of a rainfall-triggered landslide, meaning the estimates of exposure represent the relative likelihood of exposure to landslides, rather than the reported impacts. As such, Nowcast number is a proxy for landslide hazard, rather than a quantifiable landslide hazard. However, we suggest that this disadvantage is more than offset by the global homogeneity and comparability of the Nowcast output.
Additionally, since we do not have global data to quantify the vulnerability of settlements and infrastructure to landslide hazard, we cannot quantify the risk and impacts associated with landslide hazard. For example, data on fatalities associated with landsliding (Froude & Petley, 2018; Petley, 2012) quantifies the impacts, and while we can express our outputs in terms of relative proportion of population exposed to hazard, the lack of vulnerability data in our study represents an unconstrained source of variability if we compare those two datasets. Moreover, since the Nowcast output does not capture information about the size of a potential landslide in a given area, there may be differences in the severity of the landslide events that occur depending on local factors (e.g. topography). We note that we do not identify specific hospitals or schools as exposed to landslides. The resolution of our analysis remains coarse for individual points, and identifying specific locations could lead to overconfidence in exposure estimates. We acknowledge the importance of downscaling exposure estimates to those points, and suggest it is another important future direction for landslide exposure estimation.
The resolution of the Nowcast data also presents challenges to the interpretation. While a Nowcast estimate for a 1km x 1km grid cell provides an estimate of the landslide hazard therein, it does not provide information about where exactly a landslide may occur. Since infrastructure and population are unlikely to be evenly distributed within a grid cell (and are likely to be located further from areas of highest landslide susceptibility if risk mitigation measures have been adopted), elements that we describe as 'exposed to landslide hazard' may never actually be so. Given the resolution of our input hazard data, we suggest that it is challenging to provide a more finely resolved estimate. This does highlight the need for effective downscaling methods that can be applied to coarse resolution rainfall data to assess local landslide hazard. We hope to address this in future work.
In addition, while our analysis covers rainfall-triggered landslides, both anthropogenic and seismically triggered slope failures contribute significantly to global landslide impact. We suggest future work should seek to homogenise these diverse triggering factors.
The value of a homogenous global dataset is highlighted when comparing the relative exposure of population to landslide hazard based on our estimates with the GDP cost associated with landslides derived from Dilley et al. (2005). The prior study is based upon the EM-DAT inventory of damaging landslides, but the complete absence of data for countries in sub-Saharan Africa (see Supplementary Table 1) contrasts strongly with our results, which suggest that there is a significant proportion of the population in many sub-Saharan African countries exposed to landslide hazard.

Conclusions
By combining rainfall, topography and other satellite-derived data, we have developed a long-term estimate of landslide hazard across the globe, which we have used to estimate the exposure of population and infrastructure to rainfall-induced landslides. These estimates are globally consistent, and compare favourably with existing global datasets. When used in conjunction with datasets of landslide fatalities, we can provide a nuanced picture of where and when landslides are most impactful. Our data highlights a potentially higher prevalence of landslide hazards than previously documented in small, mountainous nations and islands; while the absolute numbers of fatalities may be smaller, these represent locations with extremely high hazard and exposure. Further work is necessary to test these results in a range of settings, to consider additional triggering factors such as earthquakes and human impact, and to explore how global estimates can be downscaled and compared to more local estimates.