Exploring inferred geomorphological sediment thickness as a new site proxy to predict ground-shaking ampliﬁcation at regional scale: application to Europe and eastern Türkiye

. To test whether a globally inferred sediment thickness value from geomorphological studies can be used as a proxy to predict earthquake site ampliﬁcation, we derive site-ampliﬁcation models from the relation between empirical ampliﬁcation for sites in Europe and Türkiye and the geomorphological sediment thickness. The new site-ampliﬁcation predictions are then compared to predictions from site-ampliﬁcation models derived using the traditional site proxies, V S30 inferred from slope, slope itself, and geological era and slope combined. The ability of each proxy to capture the site ampliﬁcation is evaluated based on the reduction in site-to-site variability caused by each proxy. The results show that the highest reduction is caused by geological era and slope combined, while the geomorphological sediment thickness shows a slightly larger or equal reduction in site-to-site variability as inferred V S30 and slope. We therefore argue that including geology and geomorphology in site-ampliﬁcation modelling on regional scale can give an important added value and that globally or regionally inferred models for soil and sediment thickness from ﬁelds beyond


Introduction
Local geological features can have a strong impact on earthquake ground shaking, especially at sites with mainly loose sediments, which have been observed to amplify the recorded ground motion.Knowing the soil and sediment composition of a site is therefore necessary for computing the possible earthquake site amplification for seismic hazard and risk assessments.For a single site and site-specific analysis, several site parameters for characterizing shallow site conditions (e.g.fundamental frequency f0, shear wave velocity profile, horizontal-to-vertical spectral ratio (HVSR), depth to bedrock) can be obtained from seismic and geotechnical investigations and used to predict local site amplification (e.g.Bergamo et al., 2021;Cultrera et al., 2021;Trifunac, 2016;Derras et al., 2017).For larger areas and regional site-amplification analysis, however, the site conditions must be derived from empirical relations between relevant proxies available through regional or global maps (e.g.Bergamo et al., 2022;Thompson et al., 2010;Weatherill et al., 2023).Currently, the common practice for characterizing site amplification in seismic hazard and risk assessment is using the average shear wave velocity of the upper 30 m of the soil column (V S30 ).For a single site the velocity profile and V S30 can be measured directly, but for larger areas and regions, however, V S30 must be inferred from other parameters.A much-used method to calculate V S30 uses slope from digital elevation models (DEMs), following Wald and Allen (2007).This method is based on the hypothesis that steep (high) slopes generally have less sediment and therefore higher shear-wave velocity (V S ), while flat (low) slopes are more likely to be basins filled with sediment and thus with lower V S .Wald and Allen (2007) used measured V S30 to derive a relation between V S30 and slope for active and stable tectonic regions separately and provided a global map of predicted V S30 values.However, inferring V S30 based on slope has several limitations.As already stated by Wald and Allen (2007), the assumption of correlation between V S30 and slope breaks down for continental glaciated terrains and nominally flat volcanic plateaus.In addition, Lemoine et al. (2012) have shown that other geological conditions, in particular narrow sedimentary basins and small topographic heterogeneity, have a poor correlation with the V S30 model based on slope.
Since the Wald and Allen (2007) model, several V S30 maps based on new methods and other geological proxies in addition to slope have been made, both on local and national level (e.g.Thompson et al., 2014;Vilanova et al., 2018;Foster et al., 2019;Mori et al., 2020;Li et al., 2022).However, as also argued by Weatherill et al. (2020Weatherill et al. ( , 2023)), the main purpose of V S30 is as a proxy to predict site amplification, and when inferring V S30 from other parameters, it thus becomes a proxy of a proxy.In fact, site amplification predicted by V S30 based on slope shows little improvement to the site-amplification models based directly on slope (Weatherill et al., 2020).Furthermore, it is important to keep in mind that inferred V S30 should not be used interchangeably with measured V S30 values without properly accounting for the additional uncertainty related to the V S30 calculations (Lemoine et al., 2012;Thompson and Wald, 2016;Weatherill et al., 2023).The variability of ground motion predictions can have great impact on the resulting probabilistic seismic hazard and risk assessments.Properly accounting for and separating between aleatory uncertainty, coming from natural randomness, and epistemic uncertainty, due to lack of knowledge, is therefore important.It has long been acknowledged that the site variability is a significant contributor to ground motion variability (e.g.Atkinson, 2006;Rodriguez-Marek et al., 2013).In particularly, using inferred site proxies in place of measured site parameters results in an increase in uncertainty.To account for this increase in uncertainty, Weatherill et al. (2023) derived separate site-amplification models for measured and inferred proxies and compared their impact on the final hazard and risk calculation.It was found that, although the median amplification predicted using inferred V S30 was notably lower than the median predicted amplification using measured V S30 , the resulting seismic hazard and risk curves from the different approaches were within the same range.This emphasizes how seismic hazard is controlled not only by the median amplification but also by the uncertainty.Indeed this increase in uncertainty, related to inferred proxies, compensates for the change in predicted median amplification in a probabilistic hazard and risk context.
In this study, we follow the approach of Weatherill et al. (2020Weatherill et al. ( , 2023)), to test the suitability of new site proxies as predictors for site amplification and investigate the effect on epistemic uncertainty when using different regional and globally available site proxies.We skip the step of deriving a site proxy ourselves and look beyond the field of engineering seismology for already available large-scale models of soil and sediment conditions that would allow for inference of soil amplification across a wide region.One such model is the Pelletier et al. (2016a) geomorphological model for sedimentary thickness.As the thickness of soil and sediments down to bedrock is an important factor for modelling amplification of earthquake ground shaking, the thickness of porous weathered material above unweathered bedrock is necessary for land surface modelling of, for example, the water and carbon cycle (Pelletier et al., 2016a).Pelletier et al. (2016a) therefore developed a global dataset of soil, intact regolith and sedimentary deposit thicknesses intended as input for hydrology and ecosystem models.The model is based on a combination of data including slope, lithology and stratigraphy, and water table depth, all of which correlate with geotechnical soil conditions known to yield seismic amplification.Because it is based on more robust geomorphological theories than traditional inferred site proxies, like V S30 based on slope or geology, we acknowledge the potential value of the Pelletier et al. (2016a) model and other similar largescale models of soil thickness derived from other fields than our own, as possible input in site-amplification modelling in large-scale seismic hazard and risk modelling.
To test the whether the geomorphological model can provide extra information and is suitable for ground-shaking prediction, we derive a simple site-amplification prediction model using site-to-site residuals (δS2S s ) from the European Engineering Strong-Motion (ESM) dataset (Lanzano et al., 2019;Luzi et al., 2020).The site-to-site residuals (δS2S s ) are derived from a simple ground motion model (GMM) following the method of Kotha et al. (2018Kotha et al. ( , 2020)).We compare the ability of the geomorphological model to predict site amplification to site-amplification models based on the traditional site proxies, V S30 derived from slope from Wald and Allen (2007), and slope alone, as well as a combination of slope and geological era.To better investigate the differences in the site-amplification prediction maps derived from the different proxies, we focus on eastern Türkiye and Syria, where the recent February 2023 Kahramanmaraş earthquake sequence occurred (Melgar et al., 2023;Petersen et al., 2023).

Site-to-site terms
The site response at sites where ground motion records are available is often derived as the standard spectral ratio (SSR) between a site and a nearby rock reference, or, in the rare cases it is available, a borehole reference at the same site.However, a nearby rock reference or borehole stations are not always available for ground motion recording stations.Instead, when a station has recorded several earthquakes, the repeatable site response can be separated from the source and path effect of the ground motion, using methods like generalized inversion technique (GIT, e.g.Nakano et al., 2015), empirical spectral modelling technique (Edwards et al., 2013) or empirical ground motion modelling (e.g.Kotha et al., 2017Kotha et al., , 2018;;Ktenidou et al., 2018).In this study we use the latter method to remove the source and path effect from the ground motions using a simple GMM.To derive the GMM we use robust mixed-effects regression (rlmm, Koller, 2016), where statistical outliers are down-weighted and hierarchical data are dealt with by distinguishing between fixed effects as explanatory variables and random effects as grouping factors (Bates et al., 2015).A GMM is typically composed of three main explanatory variables describing the source, path and site effects of the ground motion.In its most basic form, magnitude and distance are used to describe the source and path, while V S30 is usually used to characterize the site effects.Here, however, only the source and path effects are used as fixed effects, while the site is included as a random effect: where ln(µ) is the median ground motion prediction and f R,g (R JB ), f R,a (R JB ) and f M (M W ) are the fixed effects capturing the scaling of the ground motion with geometric spreading, anelastic attenuation and magnitude for Joyner-Boore distance R JB , hypocentral depth h D and magnitude M W .The reference values, R ref = 30 km, M h = 5.7 and h D = 4, 8 and 12 km, are frequency-independent and defined by Kotha et al. (2022).Following the notation of Al Atik et al. (2010), the between-event random effect δB e and site-to-site random effect δS2S s represent the systematic deviation of recorded ground motions from the GMM median predictions related to an event e and a site s, respectively, and δWS e,s is the "remaining" record-to-record variability (Kotha et al., 2018;Loviknes et al., 2021).If a site-proxy-dependent site term were included in the fixed effect, δS2S s would represent the systematic deviation of the observed amplification at site s from the median amplification predicted by the model using the site proxy (Al Atik et al., 2010).However, because no site-proxy-dependent site term is included in the GMM derived here, δS2S s captures all the site-specific response and thus can be used as an empirical site-amplification function describing the local amplification, or deamplification, of each station with respect to the median of all sites (Kotha et al., 2018).δS2S s is comparable to site amplification from GIT, as shown by recent studies (Bindi et al., 2017;Wang et al., 2023).The δS2S s is assumed to follow a frequencydependent normal distribution with standard deviation, φ s2s : In this study the GMM and corresponding δS2S s are derived in the Fourier amplitude spectra (FAS) from the ESM dataset (Lanzano et al., 2019;Luzi et al., 2020).While most GMMs are derived for response spectral amplitudes (SAs), representing the damped response of an elastic single-degree-offreedom oscillator, we here derive the GMM and δS2S s in FAS to better capture the physical effects that can be masked in the response spectra, in particularly at high frequencies (Kotha et al., 2022;Bora et al., 2019;Bayless and Abrahamson, 2019).
To derive the GMM we use the same data selection criteria and similar functional form as Kotha et al. (2022).The GMM of Kotha et al. (2022) is a regionally adaptable model in FAS for shallow crustal earthquakes in Europe and Mediterranean regions.The regionalization in these models is represented by including an earthquake locality-to-locality variability term and an attenuation region-to-region variability term to the random effects.In the GMM derived for this study, these terms are not included and only event and site are used as random effects.This is done to minimize the possibility that regional differences in site effects propagate into the region-to-region random effect.
Unlike traditional site-amplification factors, δS2S s is not relative to a reference rock condition but to δS2S s = 0, which is the centre of the distribution, median, of all the stations.The final δS2S s dataset contains site terms in the frequency range f = 0.460-9.903Hz for 1680 stations in Europe and the Middle East, as shown on the map in Fig. 1 at f = 0.529, 1.062 and 9.903 Hz.Although the site amplification shows a high variability and is mainly dominated by very local effects, some regional effects can be observed.For example for Italy, the amplification is mainly high (above the median, red) in the Po Plain and low (below the median, blue) in the Alps.
3 Site proxies: inferred V S30 , slope, geomorphological sediment thickness and geological era 3.1 Inferred V S30 from slope The V S30 dataset of Wald and Allen (2007)   in seismic hazard and risk studies (Silva et al., 2020).Wald and Allen (2007) used measured V S30 from several locations in the United States, Taiwan, Italy and Australia to derive a relation between V S30 and slope, separating between active and stable tectonic regions.The slope was calculated from global 30 arcsec DEMs from the Shuttle Radar Topography Mission (SRTM30).Here we use the inferred V S30 values for Europe directly from the global map published by Wald and Allen (2007).These values range from V S30 = 180 m s −1 to 900 m s −1 as shown in the map of Europe in Fig. 2a and the distribution plot in Fig. 3a.

Geomorphological sediment thickness (GST)
The Pelletier et al. (2016a) model provides a gridded global dataset of soil, intact regolith and sedimentary deposit thicknesses down to 50 m.In their model, Pelletier et al. (2016a) define bedrock as the unweathered bedrock below unconsolidated material, which in lowlands is mainly considered sedimentary deposits, and high porosity material, which in uplands can be divided into regolith and soil where soil is the material that sustains life and regolith is the porous weathered material below soil (Pelletier et al., 2016a;Holbrook et al., 2014).The model is therefore developed by partitioning the Earth's surface into uplands and lowlands, which are then separated into hillslopes and valley bottoms.Uplands and lowlands are defined as areas undergoing net erosion and net deposition, respectively, over geological timescales and are distinguished using geological maps and topographic analysis.Hillslopes and valley bottoms are identified using topographic curvature from DEMs, where hillslopes are areas of unconfined surface water flow, while valley bottoms are areas of confined surface water flow.This distinction is particularly important for uplands, and the regolith, soil and sediment thickness values are derived separately for the three landform types: upland hillslopes, upland valley bottoms and lowlands.The values are calculated using mathematical formulas specific to each landform, based on world climate data, water table depths, soil thickness databases and depth-tobedrock data, among others, as input.The final dataset provided by Pelletier et al. (2016b) includes several 30 arcsec  https://doi.org/10.5194/nhess-24-1223-2024 pixel grids covering 60°S-90°N and 180°W-180°E, separating between maximum upland regolith, average soil thickness for upland hillslope, average soil and sediment thickness for upland valley bottom and lowlands and average soil and sediment thickness across all areas.This study uses the grids for average soil and sediment thickness for all areas, hereafter referred to as geomorphological sedimentary thickness.The Pelletier et al. (2016a) model has been previously tested as a proxy for basin depth in Japan (Weatherill et al., 2020) and was included in the open-source site database of strong-motion stations in Japan by Zhu et al. (2021).However, both Weatherill et al. (2020) and Zhu et al. (2021) used the average soil and sediment thickness for upland valley bottom and lowlands grid, while here we use the average soil and sediment thickness for all areas, which is a combination of the grids for average soil thickness for upland hillslope and average soil and sediment thickness for upland valley bottom and lowlands, in order to access more values over a broader area.The geomorphological sedimentary thickness ranges between 0 and 50 m and is shown in the map of Europe in Fig. 2c and the distribution plot in Fig. 3c.

Geological era and slope for Europe
In the latest European seismic hazard and risk model (ESHM20, ESRM20), geological era and slope are used to derive the site-response model (Crowley et al., 2021;Weatherill et al., 2023).This approach is based on Vilanova et al. (2018), who made a V S30 map for Portugal from geological maps, and Weatherill et al. (2020), who compared several approaches for deriving site amplification from inferred proxies in Japan, including geology and slope.The harmonized surface geology map of Europe is a combination of three geological maps for Europe and Iceland and has a resolution of up to 1 : 1 500 000.Because several geological units, both following lithologic (nature) and stratigraphic (age) classification, contain too few stations, the geological units were grouped into the following seven geological eras: Holocene, Pleistocene, Cenozoic, Cretaceous, Jurassic-

Amplification predictions according to the different proxies
We evaluate the ability of the different proxies to predict site amplification by deriving a site-amplification model for each proxy using linear regression to capture the relation between the empirical amplification δS2S s (f ) and the proxies: where Y s (f, Proxy) is the predicted site amplification for a site s at frequency f using a proxy x Proxy , and a and b are the coefficients derived from the linear regression.
A log-normal distribution is generally assumed for V S30 and depth to bedrock (e.g.Boore et al., 2011;Vilanova et al., 2018), and the relation between site amplification and measured V S30 is often derived as log-linear (e.g.Seyhan and Stewart, 2014;Derras et al., 2017).Indeed, Fig. 4 shows the linear regression between the empirical amplification δS2S s (f ) at frequencies f = 0.529 and 1.062 Hz, with measured V S30 .The coefficient of determination r 2 shows that the correlation between δS2S s (f ) and ln(V S30 ) is higher than between δS2S s (f ) and V S30 on linear scale.We therefore, and following common practice, also assume a log-linear relation between the site amplification and the inferred site proxies.
As shown by the distribution of the inferred proxies (Fig. 3), the log-normal assumption is not fully fulfilled for inferred V S30 and especially geomorphological sediment thickness.This is because the proxies are limited to a certain range during their calculation process.In the case of geomorphological sediment thickness, Pelletier et al. (2016a) set the maximum value to 50 m, effectively meaning the thickness of the sediment layer is 50 m or more.Likewise, the inferred V S30 is limited to a maximum value of 900 m s −1 .
When dealing with such uneven distribution caused by censoring the data, so-called "censored data", tobit regression (Tobin, 1958) is a possibility.The tobit model is developed to estimate linear relationships when the dependent variable is censored and uses the likelihood function to deal with the uneven distribution (Amemiya, 1984).However, as can be seen in Fig. 5 (red line), the tobit regression strongly overestimates the slope of the relation between the site proxies and site amplification, which is also demonstrated by the non-parametric fit (dashed yellow lines, Fig. 5) and the low coefficient of determination r 2 for the tobit regression (red line Fig. 6) compared to the other regression models, as shown in Fig. 6.The slope from the tobit regression is especially overestimated at high frequencies, where a weak relation between the empirical site amplification and site proxies is expected due to the impact of small-scale heterogeneities at the site where even the location or housing of the strongmotion station can affect the amplification (Hollender et al., 2020) and the coarse spatial resolution (30 arcsec) of the siteamplification model.Interestingly, at very low frequencies (below 0.5 Hz), the coefficients of determination r 2 for the  regressions based on geomorphological sediment thickness are slightly reduced.This behaviour is not observed for r 2 based on the regressions with inferred V S30 and might be caused by the geomorphological sediment thickness being limited to 50 m depth.
Another alternative approach to deal with the censored data is to exclude the end values when running the regression.However, this step only excludes a high number of sites while not having a strong impact on the regression line (Fig. 5, green, purple, orange and yellow line).Instead, we only omit sites with geomorphological sediment thickness = 0 m, which is the value that causes the highest unevenness in the distribution, as well as sites with missing values for any of the proxies, leaving us with 1508 sites for the regression.To evaluate the dependency of the regression on the selection of sites, we run a 10-fold cross-validation test, which is a method to separate the data used for the regression (training set) and the data used to validate the regression (validation set) when dealing with small datasets (Bishop and Nasrabadi, 2006).Because our dataset consists of 1508 sites, we choose to separate the data into 10 parts in order to have a sufficiently large validation set (about 150 sites).In the 10-fold cross-validation test, the dataset is thus split into 10 equal parts and the model is derived on 10-1 parts and tested on the remaining 1 part of the dataset.This is done 10 times, and for each run a different subset of the data is used for validation.The distributions of the site proxies for each of the 10 cross-validation iterations are shown in Fig. A1.
Because the object of this study is not to find the best possible relation between each proxy and empirical site amplification, but rather to compare the ability of the proxies to predict site amplifications, we choose linear regression for simplicity.We do, however, acknowledge that site amplifi-cation is a complex phenomenon that the linear assumption cannot fully capture.Furthermore, because the models are based on the proxies' relation with δS2S s , the resulting amplification predictions are, as δS2S s , relative to the median prediction of the associated GMM used to obtain the δS2S s (Eqs.1-4).
We derive site-amplification models from linear regression between δS2S s for the frequency range f = 0.460-9.903Hz and the site proxies, V S30 from slope, slope alone, geomorphological sediment thickness and geological era combined with slope.The selected sites and the regression lines are shown for f = 0.529 Hz and f = 1.062Hz in Fig. 7 for inferred V S30 , slope and geomorphological sediment thickness.Although the δS2S s shows a high scatter with the site proxies, a general trend of higher amplification for low V S30 , low slope and high sediment thickness and low amplification for vice versa can be identified.For the regression over geological era and slope combined, we apply multiple linear regression, meaning with multiple independent variables where the categorical predictor, geological era, is transformed to dummy variables for each era and the regression then derives a constant coefficient for slope with a different intercept for each geological era (Fig. 8).The variation in prediction coefficients due to the alteration of the training set in the cross-validation process, shown as dotted red lines in Figs.7 and 8, is small for all the proxies except inferred V S30 and geological era.This indicates that the site-amplification models based on inferred V S30 and geological era and slope combined are more dependent on the data selection than the other proxies, causing a higher final uncertainty.The coefficients of determination for the linear regressions shown in Figs.7 and  5 Reduction in site-to-site variability After deriving an amplification model based on each proxy, we compare the predicted amplification δS2S s (f, Proxy) with the empirical amplification δS2S s (f ) at a site s and frequency f .We measure the ability of each proxy to capture the site amplification using the reduction in site-to-site variability φ s2s as an indicator of the efficiency of each proxy in predicting the amplification (Stewart et al., 2017;Zhu et al., 2022).We compute the corrected site term for each proxyspecific predicted amplification δS2S s,cor.(f, Proxy): where δS2S s,cor.(f, Proxy) represents the remaining site amplification that is not captured by the proxy-based amplification prediction δS2S s (f, Proxy), and φ s2s cor.(f,Proxy) is the site-to-site variability of δS2S s,cor.(f, Proxy).δS2S s,cor.(f, Proxy) with the four site proxies is shown for f = 0.529, 1.062 and 9.903 Hz in Fig. A3.If the site-amplification model were able to perfectly predict and capture the full range of the site amplification at a specific site, φ s2s cor.(f,Proxy) would be reduced to zero.However, such an ideal case is not realistic and conventional siteamplification models can only aim to reduce φ s2s cor.(f,Proxy) as much as possible.Hence, the greater the reduction in variability, meaning a lower φ s2s cor.(f,Proxy) , the better the ability of the proxy to capture the site amplification.Still, it is important to keep in mind that this measure and the correlation between δS2S s (f ) and the site proxies are purely statistical and do not give any insight into what is causing the amplification and its variability.
As described in Sect.4, the site-amplification models and corresponding reduction in φ s2s were derived and calculated 10 times following the 10-fold cross-validation technique.This means we are dealing with two sources of variability: the site-to-site variability φ s2s and the variability related to running the regression on different subsets of the data in the 10-fold cross-correlation, here called cc .φ s2s is a combination of the natural, random and irreducible (aleatory) variability of site response and the epistemic uncertainty related to the site proxies not being able to fully capture the site properties controlling the site amplification.cc is fully epistemic as it is related to the difference between the site-amplification models derived using different datasets.Figure 9 shows the mean φ s2s for all stations (black lines) and φ s2s cor.(Proxy) for each proxy (coloured lines) from the 10 cross-validation iterations derived on the training sets (Fig. 9a) and the validation sets (Fig. 9b).The φ s2s for each cross-validation iteration is shown in Fig. A4.In Fig. 9, the shaded areas around the means are the variance cc related to the crossvalidation process.cc is, as expected, higher for the validation set (Fig. 9b), which for each iteration corresponds to only 10 % of the dataset, than for the training set (90 % of https://doi.org/10.5194/nhess-24-1223-2024 Nat. Hazards Earth Syst.Sci., 24, 1223-1247, 2024 the data, Fig. 9a).Nonetheless, the general pattern is similar for both sets, where the highest reduction in site-to-site variability is caused by the site-amplification model based on geological era and slope.Inferred V S30 , slope and geomorphological sediment thickness show similar reductions in variability for both the training and validation sets, with around a 1 % difference for the training set.For the validation set the reduction caused by inferred V S30 , slope and geomorphological sediment thickness are within the same standard deviation cc and are not clearly distinguishable (Fig. 9b).For both the training and validation set, none of the site-amplification models are distinguishable for frequencies above 3 Hz, which might be caused by the low resolution (30 arcsec and 1 : 1 500 000) of all the proxies considered in this study.This poor correlation between site amplification and topography-and geology-based proxies at high frequencies (f > 3.0 Hz) has also been observed by other studies (Stewart et al., 2017;Zhu et al., 2022), also when higherresolution indirect proxies were used (Bergamo et al., 2022), indicating that inferred site proxies mainly capture the average and deep properties of the subsurface and not the finer local nuances of a site.The results shown in Fig. 9, indicate that the siteamplification model based on geological era and slope combined is better at capturing site amplification relative to the other proxies used in this study.These results are consistent with the findings of Weatherill et al. (2020), who also derived site-amplification models based on several inferred proxies for Japan.However, when applying the same model to Europe, Weatherill et al. (2023) found that the reduction from geological era and slope was not significantly lower than for inferred V S30 or slope alone.They speculate that this could be an artefact of the mixed-effects regression used to derive the model, where geological era and slope were included as a random effect, meaning the coefficient for slope changes with each geological era, which is different for the multiple linear regression applied in this study where the slope coefficient stays the same for each geological era.In addition, our study uses δS2S s (f ) from FAS, which is likely to affect the site-to-site variability.
Nevertheless, the case that geological era and slope combined give the highest reduction to the site-to-site variability shows the importance of including geology in siteamplification modelling.Furthermore, the new proxy geomorphological sedimentary thickness shows a similar or even slightly higher reduction in site-to-site variability as the traditional proxies inferred V S30 and slope, indicating the potential of geomorphological sedimentary thickness as an alternative site proxy for seismic hazard assessments for large areas or areas without measured site parameters.

Proxy-based site-amplification predictions and maps
To further evaluate the ability of the site proxies to predict site amplification, we compare the predicted site amplification with empirical site amplification for the entire frequency range (f = 0.460-9.903Hz).The object of this study is to test regionally or globally available site proxies as predictors for regional site amplification over a large area.It is therefore not necessarily meaningful to compare our siteamplification predictions to empirical site amplification at a single site.Instead, we compare the predicted amplification to empirical amplification grouped according to the Eurocode 8 classes (EC8 CEN, 2004) provided in the ESM database.Based on the station distribution with V S30 (Fig. 3) and the number of stations per class (Fig. 10a), we only use the stiffer classes, A, B and C, for the comparison.Selecting corresponding ranges of site properties for predicting the site amplification for each EC8 class using other proxies than inferred V S30 requires some attention.The correlation between measured V S30 and slope in Fig. 10b shows that slope generally increases with increasing V S30 .Although the relation has a high variability, especially for high V S30 , we select high slope (0.1-0.3 m m −1 ) for A, 0.05-0.1 m m −1 for B and 0.01-0.03m m −1 for C. Because the relation between geomorphological sediment thickness and measured V S30 has a high variability over the entire range (Fig. 10c), we use the newly proposed Eurocode 8 draft (Paolucci et al., 2021) with very shallow geomorphological sediment thickness at (< 5 m) for A, shallow (5-30 m) and intermediate ( 30 Table 1.The ranges of site properties selected to correspond to the Eurocode 8 classes: A, B and C.  10c, brown and green scatter points).The selected ranges of site properties are given in Table 1.
Figure 11 shows the mean of the empirical site amplification (solid black line) with the standard deviation (shaded black area) of the sites in each of the EC8 classes A, B and C. The variability of the empirical site amplification even within each class is very high, and the predicted site amplifications using any of the proxies are within the standard deviation of the three classes.For A, slope and geomorphological sediment thickness are continuously in the upper range of the empirical site-amplification standard deviation, while the predictions based on inferred V S30 and geological era with slope slightly underpredict relative to the mean empirical site amplification for low frequencies and overpredict for higher frequencies.For class B, inferred V S30 and slope are in the lower range of the empirical site-amplification standard deviation but generally follow the shape of the mean empiri-1235 Figure 11.Empirical mean (solid black line) site amplification with standard deviation (shaded black area) compared to predicted site amplification for the Eurocode classes (a) A, (b) B and (c) C, using inferred V S30 (solid blue lines), slope (solid orange lines), geomorphological sediment thickness (dashed-dotted green and olive lines), and geological era and slope (dotted magenta lines).cal site amplification, while the prediction based on shallow (5-30 m) and especially intermediate (30-50 m) geomorphological sediment thickness overpredicts with respect to the mean empirical amplification, The site-amplification models based on geological era and slope underpredict relative to the mean empirical site amplification, especially for low frequencies, but get closer to the mean empirical amplification at higher frequencies.For the softer soil class C, the different proxies have similar predictions and are close to the mean of the empirical amplification.At low frequencies, in particular the model predictions based on geomorphological sediment thickness and inferred V S30 are close to the mean empirical amplification.However, the differences between the predictions and empirical amplification could also be due to a poor correspondence between the selected ranges of inferred proxy values and the measured V S30 , particularly for geological era.
Finally, we use the proxy-based site-amplification models to predict the site amplification for the whole of Europe, as shown in Fig. 12 for f = 1.062Hz.The amplification maps at f = 0.529 Hz and f = 9.903 Hz are shown in Figs.A5  and A6.The site-amplification maps show the amplification (red) and deamplification (blue) relative to the median site amplification (white) predicted by the associated GMM defined in Eqs. ( 1)-( 4).This is different from conventional amplification maps used in probabilistic seismic hazard analysis, where the site amplification is relative to a rock reference.In this study, we keep the amplification relative to the median to avoid biasing the result with poorly constrained rock properties.Furthermore, the δS2S s used to develop the models is obtained from strong-motion stations mainly located in southern Europe and the Mediterranean region (Fig. 1); some regional bias is therefore likely to be present in the amplifi-cation predictions.The maps all show similar features, for example, high amplification in the soft-sediment basins of the Po Plain in northern Italy, the Danubian Plain of eastern Romania and the Great Hungarian Plain and strong deamplification in the Alps, the Carpathian Mountains and western Norway.However, clear differences are also evident in the different site-amplification maps, for example around the Rhine Valley, in the Baltic and eastern Türkiye.These differences show not only that the proxies are capturing different aspects of the site effects, but also that there is a need for characterizing the epistemic uncertainty related to the siteamplification predictions.

Justifications of the differences in the site-amplification predictions: a focus on eastern Türkiye
To investigate the differences in the predictions further and evaluate how the models perform with new data, we zoom in on eastern Türkiye, where the recent Kahramanmaraş earthquake sequence of February 2023 occurred (Melgar et al., 2023;Petersen et al., 2023).Figures 13 and 14 show the site proxies and predicted site amplification at f = 1.062Hz in eastern Türkiye and Syria.The main difference between the site proxies and their corresponding site amplification is in the area south of the southeastern Taurus Mountains, around the cities of Gaziantep and Aleppo, and by the border between Türkiye and Syria.In the map based on inferred V S30 (Fig. 13a) and slope (Fig. 13b), low values of V S30 and slope are present on both sides of the border and in the corresponding amplification maps (Fig. 14a and b); medium to high (0.2 < δS2S s < 0.6) amplification is predicted around the border with gradual lower amplification tohttps://doi.org/10.5194/nhess-24-1223-2024Nat.Hazards Earth Syst.Sci., 24, 1223-1247, 2024   wards the Taurus Mountains.The map of geomorphological sedimentary thickness (Fig. 13c) differs markedly from that of inferred V S30 and slope north of the Türkiye-Syria border with mainly shallow (< 10 m) sediments, and low amplification (−0.6 < δS2S s < 0) is predicted for that area (Fig. 14c).South of the Türkiye-Syria border, however, the geomorphological sedimentary thickness map shows deep (> 30 m) sediments and predicts high amplification (δS2S s > 0.6).This sharp difference at the Türkiye-Syria border in the geomorphological sedimentary thickness is likely due to the fact that Pelletier et al. (2016a), in the process of differentiating between upland and lowland, used separate geologic maps for Europe and the Arabian Peninsula.The site-amplification map based on geological era and slope combined (Fig. 14d) predicts less high amplification than inferred V S30 and slope and mainly in concentrated areas north of the Türkiye-Syria border (Fig. 14a and b).However, because the geological era maps were created for the European seismic hazard and risk model, the maps based on geological era and slope end at the border of Türkiye and do not include Syria (Fig. 13d).Despite these limitations, we chose to focus on this area because the recent events caused large damages on both sides of the borders, showing the urgent need for seismic hazard and risk models that cross both national and regional borders.We use ground motions from eastern Türkiye recorded in February and March 2023 to evaluate the performance of the models.The new dataset was retrieved from the ESM database (Luzi et al., 2020) and contains events recorded by 290 stations.The distribution and map of the events and stations are shown in Fig. 15.The predicted ground motion at each site is obtained using the four proxy-based siteamplification models (Y s,Proxy ) in combination with the median GMM prediction (µ median ): ln µ s,Proxy = ln exp µ median + Y s,Proxy (9) res = ln(FAS) − ln µ s,Proxy = δB e + δW s .
As the majority of the stations in the new dataset did not record a sufficient number of events (> 3) to derive the siteto-site residual δS2S s , we only use event as a random effect and evaluate the predictability of the four models using the within-event residual δW s .Figure 16 shows the sitecorrected δW s obtained from the GMM predictions of the Kahramanmaraş earthquake sequence and the four different site-amplification models at f = 1.062Hz.The δW s for f = 0.529 Hz and f = 9.903 Hz are shown in Figs.A7 and A8 in the Appendix.The variability of the site-corrected δW s is high for all the proxies, partly because the record-to-record variability δWS e,s is also included in the δW s , whereas the models only predict δS2S s .Nevertheless, the site-corrected δW s for all the proxies is consistently centred around zero with no visible strong trends, as shown by the binned mean (blue error bars).This shows that no significant biases are https://doi.org/10.5194/nhess-24-1223-2024Nat.Hazards Earth Syst.Sci., 24, 1223-1247, 2024  present in the models and that all four proxies are satisfactorily able to predict site amplification, even for new data.However, due to the limitations of the dataset, any further conclusions on the site effects caused by these events are beyond the scope of this study.Furthermore, because there are few stations with calculated δS2S s in this area, and particularly where the largest differences are located, it is difficult to make a solid argument for which proxy is more correct.For this purpose, one would either have to wait for more events to derive a valid δS2S s or perform a more detailed comparison with local values.However, as stated before, the aim of this study is not to re-create the exact site-specific amplification but to evaluate the ability of inferred proxies to predict the site amplification for larger areas and capture, using several models, the epistemic uncertainty of such regional amplification prediction.Instead, we compare the empirical amplification from the few stations in the area with the predicted amplification in a similar way as in Fig. 11.We only use the 14 stations available in the original δS2S s dataset and with measured V S30 values; these stations are marked in Fig. 14 and are described in Table 2.Because the measured V S30 of the considered stations only ranges between 300 and 700 m s −1 , the stations are separated into two ranges for medium-stiff Figure 17.Empirical site amplification (solid black line) compared to predicted site amplification using the stations in eastern Türkiye with measured V S30 values, separated into medium stiff soil sites with (a) V S30 = 300-500 m s −1 and (b) stiff sites V S30 = 500-700 m s −1 , using inferred V S30 (dashed blue lines), slope (solid orange lines), geomorphological sediment thickness (dashed green lines), and geological era and slope (dotted magenta lines) at the stations.
Table 2. Station name, location and site properties of the east Turkish stations with the measured V S30 shown in Fig. 14. soil sites (V S30 = 300-500 m s −1 , Fig. 17a) and stiffer sites (V S30 = 500-700 m s −1 , Fig. 17b).The comparison of empirical amplification and predicted amplification for the eastern Turkish stations in the two V S30 ranges shows that all the proxies significantly underpredict the site amplification for the medium-stiff soil sites and overpredict the stiffer sites.
Figure 17 further shows that predicting site amplification using proxies cannot reproduce the full range of amplifications on a local level and that regionalized models might be necessary.
In addition to assessing whether new site proxies can provide additional valuable information for estimating site effects at regional or global scales, an important motivation for this and previous studies (Weatherill et al., 2020(Weatherill et al., , 2023) ) was to discuss how to deal with epistemic uncertainty when characterizing ground motion at sites for further probabilistic seismic hazard and risk calculations.In order to properly account for the uncertainty, it is necessary to have a clear picture of where the uncertainty comes from.The results presented above have shown that using different site proxies to predict site amplification gives significantly different https://doi.org/10.5194/nhess-24-1223-2024 Nat. Hazards Earth Syst.Sci., 24, 1223-1247, 2024 results, which further emphasizes the importance of capturing the epistemic uncertainty associated with modelling site amplification when using inferred proxies.Furthermore, this epistemic uncertainty needs to be incorporated into the final risk calculation.To fully assess the impact of this epistemic uncertainty, risk and loss calculations should be performed using the different site-amplification models, but this is beyond the scope of this work.

Conclusions
To test whether the geomorphological model for sediment thickness derived by Pelletier et al. (2016a) can be used as an alternative site proxy, we have derived site-amplification models based on the geomorphological sediment thickness, as well as traditional site proxies like inferred V S30 , slope, and geological era and slope combined.Although the predicted site-amplification maps based on the different proxies show similar trends, there are also notable differences, indicating that the proxies capture different aspects of the site effects.Using only one proxy, for example, inferred V S30 , which is often the standard procedure, is therefore not sufficient to fully capture the range of possible amplifications.
The differences in the site-amplification predictions based on the different proxies thus contribute to the characterization of the epistemic uncertainty.In a probabilistic seismic hazard and risk context, this uncertainty needs to be included and properly accounted for.In this study, we calculate the site amplification for Europe and the Middle East but focus particularly on eastern Türkiye and Syria.As a measure of how well the site proxies capture the empirical site amplification, we used the reduction in site-to-site variability.The results show that the site-amplification predictions based on geological era and slope combined cause the highest reduction, while the prediction based on geomorphological sediment thickness causes a similar, but slightly larger, reduction in site-to-site variability than the traditional site proxies, inferred V S30 and slope.This result shows the value of including geology and geomorphology in prediction models for site amplification.Furthermore, the geological map used in this study is only available for Europe, while geomorphological sediment thickness is available globally and easily accessible.However, although the geomorphological sediment thickness has potential, further investigations and tests are needed before establishing it as an alternative to the much-used inferred V S30 model from Wald and Allen (2007), in particular in areas where inferred V S30 and slope are known to have a weak correlation with site amplification.Moreover, the correlation between the empirical site amplification and the site proxies all weakens above 3 Hz, which shows the need for models with higher resolution or including more local and shallow information.Our results therefore show the potential of not only the geomorphological sediment thickness model but also of other models for soil and sediment thickness from geomorphology and similar fields outside seismology and earthquake engineering.The outputs of this study have been made available (see the "Code and data availability" section below); however, it is important to state that the site-amplification models and maps developed in this study can only be interpreted as referenced to the associated GMM median prediction and should be considered as exploratory as they were developed for the purpose of testing the different site proxies and show the epistemic uncertainty related to using different proxies.Additionally, using inferred site proxies should only be done for regional seismic hazard studies of larger areas or when more detailed site parameters are missing.

Figure 1 .
Figure 1.Map of the site-amplification factor δS2S s for (a) f = 0.529 Hz, (b) f = 1.062Hz and (c) f = 9.903 Hz.The colour scale shows the amplification for each station, where red represents amplified ground motions with respect to the median of all the stations, and blue represents deamplified ground motions.

Figure 2 .
Figure 2. Map of the site proxies to be tested in this study.(a) V S30 from slope by Wald and Allen (2007), (b) slope calculated from digital elevation models, (c) geomorphological sedimentary thickness by Pelletier et al. (2016a) and (d) geological era used in the latest European seismic risk model (ESHM20, Crowley et al., 2021; Weatherill et al., 2023).

Figure 3 .
Figure 3.The distribution of the site proxies to be tested in this study.(a) V S30 from slope by Wald and Allen (2007), (b) slope calculated from digital elevation models, (c) geomorphological sedimentary thickness by Pelletier et al. (2016a) and (d) geological era used in the latest European seismic risk model (ESHM20,Crowley et al., 2021;Weatherill et al., 2023).

Figure 4 .
Figure 4.The linear (green line) and log-linear (red dotted line) relation between δS2S s and measured V S30 for the frequencies (a) f = 0.529 Hz and (b) f = 1.062Hz.

Figure 5 .
Figure 5. Inferred V S30 (a-c) and geomorphological sedimentary thickness (d-f) with the δS2S s (f ) of all the 1680 stations (black dots) in the ESM dataset for the frequencies f = 0.529 Hz (a, d), f = 1.062Hz (b, e) and f = 9.903 Hz (c, f).The regression lines are from the tobit regression (solid red line) and the linear regression on all the data (blue line), on the selected dataset without the maximum geomorphological sediment thickness values (50 m, green line), without the extreme values of geomorphological sediment thickness (0 and 50 m, dotted purple line), and the minimum geomorphological sediment thickness (0 m, orange line).The general trend of the data is shown by a non-parametric fit (dashed yellow line).

Figure 7 .
Figure 7. Linear regression (red lines) over the site proxies inferred V S30 (a, b), slope (c, d) and geomorphological sedimentary thickness (e, f), with the station δS2S s (black dots) for the frequencies f = 0.529 Hz (a, c, e) and f = 1.062Hz (b, d, f).
Figure 10.(a) Number of stations per Eurocode 8 class, (b) the correlation between measured V S30 and slope coloured by geological era, and (c) geomorphological sediment thickness (GST).
intermediate for C. When selecting corresponding geological eras, we use Holocene and Pleistocene for C (Fig. 10b, yellow and pink scatter points), the remaining geological eras (leaving out Cenozoic) Cretaceous, Jurassic-Triassic, Precambrian and Paleozoic for B (Fig. 10c, brown, green, blue and purple scatter points), and Cretaceous and Jurassic-Triassic for A (Fig.

Figure 12 .
Figure 12.Predicted site amplification at f = 1.062Hz for Europe using (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.The amplification maps are relative to the median prediction of associated GMM given in Eqs.(1)-(4).

Figure 13 .
Figure 13.Map of eastern Türkiye and Syria coloured by the site proxies (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness and (d) geological era.The epicentres for the two largest earthquakes of the Kahramanmaraş sequence of February 2023 are indicated as red stars on the map.

1237Figure 14 .
Figure 14.Predicted site amplification in eastern Türkiye and Syria at f = 1.062Hz using (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.The triangles represent the strong motion stations coloured by empirical site amplification δS2S s at f = 1.062Hz.The stations with measured V S30 values are indicated on the map.The amplification maps are relative to the median prediction of associated GMM given in Eqs.(1)-(4).

1238K.
Figure 15.(a) Distribution and (b) map of the ground motions from eastern Türkiye recorded in February and March 2023.

Figure 16 .
Figure 16.The within-event residuals δW s from predicted ground motions (in FAS) at f = 1.062Hz of the recent Kahramanmaraş earthquake sequence of February 2023 using the site-amplification models based on (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.

Figure A5 .
Figure A5.Predicted site amplification at f = 0.529 Hz for Europe using (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.The amplification maps are relative to the median prediction of the associated GMM given in Eqs.(1)-(4).

Figure A6 .
Figure A6.Predicted site amplification at f = 9.903 Hz for Europe using (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.The amplification maps are relative to the median prediction of the associated GMM given in Eqs.(1)-(4).

Figure A8 .
Figure A8.The within-event residuals δW s from the predicted ground motions (in FAS) at f = 9.903 Hz of the recent Kahramanmaraş earthquake sequence of February 2023 using the site-amplification models based on (a) inferred V S30 , (b) slope, (c) geomorphological sedimentary thickness, and (d) geological era and slope.
Triassic, Pre-Cambrian and Paleozoic.The map and station distribution of these eras are shown in Figs.2d and 3d.The slope used in this model was calculated from the 2014 General Bathymetric Chart of the Oceans grid (GEBCO 2014, https://www.gebco.net/,last access: 4 April 2022) and is shown in Figs.2b and 3b.In this study, we use the same slope and geological eras as Weatherill