PERL: a dataset of geotechnical, geophysical, and hydrogeological parameters for earthquake-induced hazards assessment in Terre del Reno (Emilia Romagna, Italy)

In 2012, the Emilia-Romagna region (Italy) was struck by a seismic crisis characterized by two main shocks (ML 5.9 and 5.8), which triggered significant liquefaction events. Terre del Reno is one of the municipalities that experienced the most extensive liquefaction effects due to its complex geo-stratigraphic and geo-morphological setting. The area is located in a floodplain characterized by lenticular fluvial channel bodies associated with crevasse and levee clay–sand alternations related to the paleo-Reno River. It was therefore chosen as the case study for the PERL project, which aims to define a new integrated methodology to assess liquefaction susceptibility in complex stratigraphic conditions through a multi-level approach. To this aim, about 1800 geotechnical, geophysical, and hydrogeological investigations from previous studies and newly performed surveys were collected and stored in the PERL dataset. This dataset is here publicly disclosed, and some possible applications are reported to highlight its potential.


Introduction
In the last few years, an increasing amount of source data has been publicly disclosed, allowing for wider access to research activities. Key examples are the huge amount of free satellite imagery (e.g., Sentinel, Landsat) provided by the main space agencies and the cutting-edge tools and procedures integrated in widely known Earth observation (EO) platforms such as Google Earth Engine. A multitude of algorithms and codes are available for all the fields of knowledge concerning natural hazards, while their application is made easier by the increasing number of open-access inventories of natural phenomena (e.g., Martino et al., 2014; Guarino et al., 2018; Tanyaş et al., 2022). However, only a few datasets of in situ investigations and related parameters are publicly disclosed, and this is a gap to be filled. With regard to macro types of investigations (i.e., geological, geophysical, geotechnical, hydrogeological, etc.), some databases are currently available worldwide (e.g., Orgiazzi et al., 2017; Kmoch et al., 2021; Geyin et al., 2021; Minarelli et al., 2022), as well as for the Italian national territory. An example is provided by Vannocci et al. (2022), whose database includes geotechnical and hydrological soil parameters for shallow landslide modeling.
C. Varone et al.: PERL: a multiparametric dataset for earthquake-induced hazards assessment (Italy)
However, there are only a few examples of freely available products that integrate different macro typologies of in situ investigations in a single database, especially with reference to the Italian territory (Gaudiosi et al., 2021). In light of the above, the aim of the authors is to make freely available a dataset of about 1800 geological, geophysical, geotechnical, and hydrogeological in situ investigations and related parameters collected in the Terre del Reno municipality (Emilia-Romagna region, Italy). The study area is affected by severe seismic hazards and prone to seismically induced effects, as extensively documented by the 2012 seismic sequence, which was characterized by more than 2000 earthquakes (Facciorusso et al., 2016). Two main shocks were recorded during the crisis: the first one on 20 May with ML 5.9 and the epicenter in Finale Emilia and the second one on 29 May with ML 5.8 and the epicenter in Medolla, both in Modena Province.
As widely reported in the literature, the propagation of seismic waves through the upper portion of the soil can be modified by local site conditions (e.g., Bozzano et al., 2017; Fabozzi et al., 2021; Falcone et al., 2020, 2021; Gautam, 2017; Luo et al., 2020; Meza-Fajardo et al., 2019) and can trigger earthquake-induced effects at the ground surface (e.g., Forte et al., 2021; Martino et al., 2017, 2019; Giannini et al., 2022; Paolella et al., 2022). In Terre del Reno, these earthquakes triggered several earthquake-induced effects (Chini et al., 2015; Papathanassiou et al., 2015), among which linear and punctual liquefaction effects were the most prominent. These effects may occur when saturated granular deposits are shaken by seismic action, and their magnitude depends on the combination of earthquake intensity and soil conditions. The literature reports many liquefaction events that occurred in complex geological conditions and were triggered by earthquakes of various magnitudes, such as the Gorkha, Nepal (Gautam et al., 2017); Christchurch, New Zealand (Maurer et al., 2019); Urayasu, Japan (Baris et al., 2021); 2008 Wenchuan, China (Zhou et al., 2020); and 2019 Durrës, Albania (Mavroulis et al., 2021) earthquakes. Liquefaction effects in Terre del Reno were mainly related to the complex sedimentological and stratigraphic setting of the area (e.g., Stefani et al., 2018; Tentori et al., 2022), characterized by multiple alternating sandy and silty-sandy packing, hosting local (shallow) and regional (deep) aquifers (Regione Emilia-Romagna, 1998). Several authors (e.g., Ecemis, 2021; Jain et al., 2022) have highlighted that the presence of thin alternations of silt and sand seems to influence liquefaction occurrence, while other studies focused on the role of silty sands and soil packing conditions in liquefaction triggering (e.g., Naeini and Baziar, 2004; Stamatopoulos, 2010; Gobbi et al., 2022a).
To overcome the difficulties related to heterogeneous and complex soil conditions, integrated approaches that combine numerical and experimental methods are applied to predict the occurrence of liquefaction (Gobbi et al., 2022b; Rios et al., 2022; Paolella et al., 2022). Further steps in this direction were made for the Terre del Reno case study by pursuing the two main objectives of the PERL project: (i) define a new integrated methodology to assess liquefaction susceptibility in complex stratigraphic settings through a multi-level approach and (ii) perform the seismic microzonation of the municipality for land and civil protection planning purposes. This project allowed for the collection and analysis of the abovementioned in situ investigations and the elaboration of thousands of related parameters, which were stored in a harmonized and standardized dataset (named PERL) conceived to guarantee interoperability with existing ICT (information and communication technologies) solutions and data models. The availability of such a dataset of surveys, catalogued and processed according to shared standards, makes Terre del Reno one of the best-characterized municipalities in Italy in terms of seismic hazard and earthquake-induced effects. This flexible dataset can be manipulated and combined to tackle different problems and represents a powerful resource for the scientific community, particularly for those who cannot set up and manage a living laboratory or directly perform on-site investigations.
For these reasons, the authors provide complete access to the dataset through the supplementary materials and present two different applications herein used as references to highlight the potential of the PERL dataset.

Structural and stratigraphic setting
The study area is located within the southern portion of the Po alluvial plain, which represents the sedimentary cover of the Po Basin infill (Fig. 1). The geological substrate of the study area, which lies along the northern sectors of the Apennine chain, shows complex fold and thrust structures with arcuate geometry associated with strongly asymmetrical foredeep basins. Although the Po Basin represents both the Alpine retro-foreland basin and the Apennine foredeep, its Cenozoic structural evolution was mainly driven by the northeast migration of the external front of the northern Apennines, which consists of four arcuate fold-and-thrust systems: the Monferrato Arc, the Emilia Arc, the Ferrara Arc, and the Adriatic Arc (Pieri and Groppi, 1981; Royden et al., 1987; Scrocca et al., 2007). These systems, which are buried beneath the present Po plain, have been active since the late Miocene (Fig. 1) and are still considered seismogenic (Boccaletti et al., 2011; Ghielmi et al., 2013). In particular, the movement of a segment of the Ferrara Arc thrust system (i.e., the Mirandola thrust system) was responsible for the 2012 Emilia seismic events (ISIDe Working Group, 2010), which triggered numerous co-seismic effects associated with liquefaction phenomena in the provinces of Ferrara, Modena, and Bologna. In the study area, the shallowest Quaternary sedimentary fill consists of marine deposits (marine Quaternary in Fig. 1b) and 100 kyr spaced transgressive-regressive cycles constituted by nearshore sands and alluvial deposits, formed during interglacial and glacial periods, respectively (continental Quaternary in Fig. 1b). The stratigraphic framework of the topmost late Pleistocene to Holocene Po Basin succession (at 0-40 m depth from the ground surface) documents a succession of tabular-shaped fluvial sands (i.e., glacial) overlain up-section by the Holocene poorly drained, mud-rich floodplain, swamp, and marsh succession with subordinate lenticular fluvial sandy channel bodies associated with crevasse and levee clay-sand alternations fed by the paleo-Reno River (i.e., interglacial) (Bruno et al., 2021; Stefani et al., 2018; Tentori et al., 2022). The modern drainage basin of the Reno River extends for about 2500 km² in the northern Apennines. Owing to the low topographic gradients in the area, the paleo-Reno River experienced fast aggradation and frequent avulsion episodes during recent and historical times (see Tentori et al., 2022, and references therein).

Hydrostratigraphic setting
The hydrostratigraphic architecture reflects the depositional and tectonic evolution of the southern Po sedimentary basin from the Pleistocene to the Holocene (Molinari et al., 2007; Emilia-Romagna Region and ENI-AGIP, 1998). The aquifers of the most superficial hydrostratigraphic group (i.e., Group A) consist of six lower-order hydrostratigraphic units belonging to the Quaternary fluvio-deltaic and alluvial depositional systems. In the study area, Group A aquifers consist of the sandy fluvial bodies deposited during glacial periods, separated by the mud-dominated intervals of transgressive alluvial facies (aquitards) deposited during interglacial periods. The shallowest composite aquifer system, named A0 by Molinari et al. (2007), consists of two sand-dominated aquifer units hosted within the late Pleistocene-Holocene channelized bodies and encased by alluvial floodplain muds.
Based on the piezometric level dating back to the summer of 2012, Calabrese et al. (2012) placed the groundwater level of the shallower semi-confined aquifer at about 3-4 m depth below the levee and about 1-2 m in the floodplain.

Materials and methods
The PERL dataset was obtained by merging three databases provided by different institutions. In addition, 17 geotechnical investigations were specifically performed in the framework of the PERL project.
The three existing databases are the MUDE, RER, and SM databases (Fig. 2).
The first problem faced when merging these databases was the presence of duplicate information. To avoid duplicates, a methodology to discern and verify the uniqueness of an investigation was elaborated. This methodology is based on a series of multiple, progressive true/false (TF) controls applied to various control parameters (CP) relative to all the investigations included in the pertinence area. The latter was defined as a circle with a radius of 200 m centered on the considered investigation. The progressively considered CP (Fig. 2) are (CP1) the absence of another investigation within the area of pertinence, (CP2) unmatching of the investigation typology, (CP3) unmatching of the date of the survey, and (CP4) matching of the maximum depth reached by the investigation. Each CPm (m = 1, 2, 3, 4) is checked in a dedicated TF test (TFn with n = 1, 2, 3, 4). Starting from TF1, an investigation that verifies CP1 is moved to TF2 for CP2 verification, and so on up to TF4. Each time a CPm in a TFn is not verified, the investigation is defined as "unique". If an investigation verifies all the control parameters, it is defined as "redundant" and removed from the database. The application of this methodology allowed us to identify and remove 32 % of the investigations, obtaining a final dataset composed of 1805 unique investigations (Fig. 2).
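The progressive TF controls described above can be illustrated with a minimal Python sketch. It follows one possible reading of the procedure, namely that an investigation is flagged as redundant only when another investigation lies within the 200 m pertinence area and also shares its typology, survey date, and maximum depth. The record fields, the depth tolerance, and the sample records are hypothetical and are not taken from the PERL dataset; coordinates are assumed to be projected (in meters), so that a Euclidean distance is meaningful.

```python
from dataclasses import dataclass
from math import hypot

PERTINENCE_RADIUS_M = 200.0  # radius of the pertinence area stated in the text


@dataclass(frozen=True)
class Investigation:
    # Hypothetical field names; x and y are assumed to be projected coordinates in meters.
    inv_id: str
    x: float
    y: float
    typology: str
    date: str
    max_depth: float


def is_redundant(candidate, kept, depth_tol=0.0):
    """Return True if `candidate` duplicates an investigation already in `kept`."""
    for other in kept:
        # TF1: is the other investigation inside the pertinence area?
        if hypot(candidate.x - other.x, candidate.y - other.y) > PERTINENCE_RADIUS_M:
            continue
        # TF2: same investigation typology?
        if candidate.typology != other.typology:
            continue
        # TF3: same survey date?
        if candidate.date != other.date:
            continue
        # TF4: same maximum depth (within an assumed tolerance)?
        if abs(candidate.max_depth - other.max_depth) > depth_tol:
            continue
        return True  # every control matched: treated as redundant
    return False


def merge_unique(investigations):
    """Keep the first occurrence of each investigation and drop redundant ones."""
    kept = []
    for inv in investigations:
        if not is_redundant(inv, kept):
            kept.append(inv)
    return kept


# Made-up records, for illustration only.
records = [
    Investigation("A1", 0.0, 0.0, "CPT", "2012-06-01", 30.0),
    Investigation("B7", 50.0, 20.0, "CPT", "2012-06-01", 30.0),       # same test stored twice
    Investigation("B8", 60.0, 10.0, "borehole", "2012-06-01", 30.0),  # different typology
    Investigation("C3", 900.0, 0.0, "CPT", "2012-06-01", 30.0),       # outside the 200 m circle
]
unique_ids = [inv.inv_id for inv in merge_unique(records)]
```

In this sketch the first record encountered is the one that is retained; which copy of a duplicated investigation should be kept in practice is a design choice that the text does not specify.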

Data description
The PERL dataset consists of two shapefiles implemented into a GIS system and an associated geodatabase. The two shapefiles are named ind_pc and ind_ln and correspond to punctual and linear investigations, respectively (EPSG:32633). The associated attribute tables contain the main information of each investigation:
- ID. Unique identification number for each investigation.
The complete set of investigations and the related measured parameters are reported in an Excel file with the following structure:
- ID. Unique identification number of each investigation.
- Depth_top. Depth (m) of the layer top to which the parameter refers.
- Depth_bottom. Depth (m) of the layer bottom to which the parameter refers.
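As an illustration of how a layer-based table of this kind can be queried, the following Python sketch looks up the value of a parameter at a given depth. Only ID, Depth_top, and Depth_bottom are field names reported above; the "Parameter" and "Value" columns, as well as all the numbers, are hypothetical placeholders.

```python
from typing import Optional

# Rows mimic the ID / Depth_top / Depth_bottom structure described above.
# "Parameter" and "Value" are hypothetical column names; all values are made up.
rows = [
    {"ID": 101, "Depth_top": 0.0, "Depth_bottom": 2.5, "Parameter": "Vs", "Value": 150.0},
    {"ID": 101, "Depth_top": 2.5, "Depth_bottom": 6.0, "Parameter": "Vs", "Value": 180.0},
    {"ID": 102, "Depth_top": 0.0, "Depth_bottom": 4.0, "Parameter": "Vs", "Value": 165.0},
]


def value_at_depth(table, inv_id, parameter, depth) -> Optional[float]:
    """Return the value of `parameter` for the layer of `inv_id` containing `depth`."""
    for row in table:
        if (
            row["ID"] == inv_id
            and row["Parameter"] == parameter
            # Assumed convention: layer top included, layer bottom excluded.
            and row["Depth_top"] <= depth < row["Depth_bottom"]
        ):
            return row["Value"]
    return None  # depth not covered by this investigation


vs_shallow = value_at_depth(rows, 101, "Vs", 1.0)
vs_deeper = value_at_depth(rows, 101, "Vs", 3.0)
vs_missing = value_at_depth(rows, 102, "Vs", 10.0)
```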
Penetrometer tests, geognostic boreholes, trenches, and borehole geophysical tests are characterized by a depth of investigation ranging from a few meters to more than 100 m (maximum depth: 265 m) (Fig. 4b). About 90 % of them do not exceed a depth of investigation of 35 m. The most represented depth classes are 30-35 and 10-15 m, with 330 (21 %) and 310 (20 %) investigations, respectively. Penetrometer tests reach depths between 5 and 50 m, with the 30-35 m class being the most represented. Boreholes and trenches, by contrast, cover the entire depth range of the dataset. It is worth noting that about 60 boreholes and trenches reach depths greater than 55 m; for these investigation types, this class and the 10-15 m class are the most represented.
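The grouping into depth classes discussed above can be reproduced with a few lines of Python. This is only a sketch: the 5 m class width is inferred from class labels such as "10-15 m" and "30-35 m", the convention for depths falling exactly on a class boundary is an assumption, and the depth values are invented for the example.

```python
from collections import Counter

CLASS_WIDTH_M = 5  # class width inferred from labels such as "10-15 m" and "30-35 m"


def depth_class(max_depth, width=CLASS_WIDTH_M):
    """Label the class [lower, lower + width) that contains `max_depth`."""
    lower = int(max_depth // width) * width
    return f"{lower}-{lower + width} m"


# Made-up maximum depths (m), for illustration only.
max_depths = [12.0, 14.5, 31.0, 33.2, 34.9, 8.0, 265.0]

counts = Counter(depth_class(d) for d in max_depths)
shares = {label: 100.0 * n / len(max_depths) for label, n in counts.items()}
most_represented = counts.most_common(1)[0][0]
```

With this convention a depth of exactly 35 m would fall in the 35-40 m class; the opposite choice (upper bound included) is equally plausible and would need to be checked against the original classification.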

Examples of applications
To address some of the conceptual points discussed in the Introduction and to better highlight the uniqueness and potential of the PERL dataset, we present two different applications.In the first case study, we take advantage of the PERL database to represent the complex geology beneath the San Carlo alluvial plain.The second case history focuses on a statistical inference of the PERL geophysical data to obtain soil dynamics when experimental information is missing.

Stratigraphic reconstruction of liquefiable layer thickness in the San Carlo subsoil
The PERL database includes a large amount of sedimentological, geotechnical, geophysical, and hydrogeological data, which can be used to reconstruct the stratigraphic architecture of the Terre del Reno subsurface and provide a reliable geological framework for future studies devoted to earthquake-induced hazard mitigation. The position and the thickness of the liquefiable portion within the subsoil are key information for liquefaction risk assessment and mitigation. The automatic construction of three-dimensional subsoil models with advanced procedures is a current topic of applied technological research. Here, a combination of these two approaches is presented to spotlight the potential of the PERL dataset.
As an example, Fig. 5 shows the geostatistical interpolation, performed with ordinary kriging, of the cumulated thickness of the liquefiable layers (CTL) in the district of San Carlo. In particular, the CTL has been manually extracted from 33 boreholes and automatically obtained from 148 CPTs by applying the procedure proposed by Spacagna et al. (2022). The resulting map has been overlaid on the map of liquefaction evidence observed after the 2012 Emilia-Romagna seismic sequence, showing a good match between the liquefaction-induced surficial manifestations and the CTL distribution.
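To make the CTL definition concrete, the sketch below sums the thickness of the layers flagged as liquefiable in a single layered profile. It is not the procedure of Spacagna et al. (2022): the liquefiable flag is assumed to be already assigned to each layer (manually or by any classification procedure), the optional depth cut-off is a hypothetical convenience, and the profile is made up.

```python
def cumulated_liquefiable_thickness(layers, max_depth=None):
    """Sum the thickness (m) of the layers flagged as liquefiable.

    `layers` is a list of (top, bottom, liquefiable) tuples with depths in meters.
    If `max_depth` is given, only the portion above that depth is counted.
    """
    total = 0.0
    for top, bottom, liquefiable in layers:
        if not liquefiable:
            continue
        if max_depth is not None:
            bottom = min(bottom, max_depth)
        total += max(0.0, bottom - top)  # ignore layers entirely below the cut-off
    return total


# Made-up borehole log: (top, bottom, liquefiable flag).
borehole = [
    (0.0, 1.5, False),
    (1.5, 4.0, True),
    (4.0, 9.0, False),
    (9.0, 11.5, True),
    (11.5, 20.0, False),
]

ctl_full = cumulated_liquefiable_thickness(borehole)
ctl_top10 = cumulated_liquefiable_thickness(borehole, max_depth=10.0)
```

One CTL value per borehole or CPT, together with its coordinates, is the kind of point dataset that a geostatistical interpolator such as ordinary kriging takes as input.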

Statistical analysis of shear waves variability with depth
As widely documented in the literature, the number of available investigations progressively decreases with depth. Thus, the uncertainty in subsoil characterization increases from the ground surface to the deepest layers of the soil. A correlation analysis between the values of $V_s$ (m s$^{-1}$) and depth (m) was carried out to infer information when the depth of investigation does not guarantee a correct soil parameterization. Considering the high geological and stratigraphic complexity of the case study, a lithology-based statistical inference was performed. Specifically, for each lithology L, the scatter plot of $V_s$ (m s$^{-1}$) as a function of depth, $D$ (m), and the corresponding linear regression were calculated.
The linear regression model is defined by the following relationship:

$$V_s = a \cdot D + V_{s0},$$

where $a$ and $V_{s0}$ correspond to the slope and intercept of the model line, respectively.
The PERL database includes 164 $V_s$ profiles, mainly obtained from penetrometer tests (SCPT), down-hole (DH), MASW, ESAC_SPAC, SDMT, and cross-hole (CH) tests. Each of these $V_s$ profiles was discretized with a step size of 1 m in depth and, through an automated procedure, each meter of depth was associated with lithological (L) information extracted from proximal boreholes.
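The discretization and lithology assignment described above can be sketched as follows. The automated procedure actually used is not detailed in the text, so several choices here are assumptions: each 1 m slice is sampled at its midpoint, the proximal borehole is taken as already selected, and both the $V_s$ profile and the borehole log are invented.

```python
def discretize_profile(vs_layers, step=1.0):
    """Sample a layered Vs profile at the midpoint of each `step`-thick slice.

    `vs_layers` is a list of (top, bottom, vs) tuples with depths in meters.
    """
    samples = []
    profile_bottom = max(bottom for _, bottom, _ in vs_layers)
    for i in range(int(profile_bottom // step)):
        mid = (i + 0.5) * step
        for top, bottom, vs in vs_layers:
            if top <= mid < bottom:
                samples.append((mid, vs))
                break
    return samples


def lithology_at(depth, borehole_layers):
    """Return the lithology code of the borehole layer containing `depth`."""
    for top, bottom, lithology in borehole_layers:
        if top <= depth < bottom:
            return lithology
    return None  # depth not covered by the borehole


# Made-up Vs profile and log of the (assumed already selected) proximal borehole.
vs_profile = [(0.0, 3.0, 140.0), (3.0, 6.0, 190.0)]
borehole_log = [(0.0, 2.0, "MH"), (2.0, 6.0, "SP")]

tagged = [
    (depth, vs, lithology_at(depth, borehole_log))
    for depth, vs in discretize_profile(vs_profile)
]
```

Each element of `tagged` is a (depth, $V_s$, lithology) triple, which is the form of input a per-lithology regression would need.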
For each lithology L, a statistical analysis was performed, and the linear regression models were calculated. The results obtained for the MH and SP lithologies are presented here as examples (Fig. 6). The corresponding values of the coefficients $a$ and $V_{s0}$ and of the coefficient of determination $R^2$ are reported in Table 1. As expected, the slope $a$ is positive for both lithologies, meaning that the mean value of the dependent variable $V_s$ increases with depth. At the same time, these values allow us to quantify how much $V_s$ changes per meter of depth, showing a variation rate for SP greater than that characterizing MH. Moreover, the coefficients of determination $R^2$ obtained for the MH and SP lithologies are 0.59 and 0.79, respectively, supporting the reliability of the fit. When experimental data are lacking, the regression models allow us to obtain $V_s$ by interpolation for depths that fall within the range of the available data, while extrapolation can be applied for greater depths.
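A minimal sketch of the lithology-based regression is given below, using ordinary least squares to estimate the slope $a$, the intercept $V_{s0}$, and $R^2$ of the model $V_s = a \cdot D + V_{s0}$ for each lithology. The sample values are invented for illustration and are not PERL data, so the resulting coefficients have no physical meaning; the function names are hypothetical.

```python
from collections import defaultdict


def fit_line(depths, vs_values):
    """Ordinary least-squares fit of Vs = a * D + Vs0; returns (a, vs0, r2)."""
    n = len(depths)
    mean_d = sum(depths) / n
    mean_v = sum(vs_values) / n
    sxx = sum((d - mean_d) ** 2 for d in depths)
    sxy = sum((d - mean_d) * (v - mean_v) for d, v in zip(depths, vs_values))
    a = sxy / sxx
    vs0 = mean_v - a * mean_d
    ss_res = sum((v - (a * d + vs0)) ** 2 for d, v in zip(depths, vs_values))
    ss_tot = sum((v - mean_v) ** 2 for v in vs_values)
    return a, vs0, 1.0 - ss_res / ss_tot


def fit_by_lithology(samples):
    """Group (depth, vs, lithology) samples by lithology and fit one line per group."""
    groups = defaultdict(lambda: ([], []))
    for depth, vs, lithology in samples:
        groups[lithology][0].append(depth)
        groups[lithology][1].append(vs)
    return {lith: fit_line(d, v) for lith, (d, v) in groups.items()}


def predict_vs(depth, a, vs0):
    """Evaluate the fitted line; beyond the sampled depths this is an extrapolation."""
    return a * depth + vs0


# Made-up (depth in m, Vs in m/s, lithology) samples, for illustration only.
samples = [
    (1.0, 152.0, "SP"), (2.0, 158.0, "SP"), (3.0, 171.0, "SP"),
    (4.0, 173.0, "SP"), (5.0, 186.0, "SP"),
    (1.0, 120.0, "MH"), (2.0, 124.0, "MH"), (3.0, 131.0, "MH"), (4.0, 133.0, "MH"),
]

models = fit_by_lithology(samples)
a_sp, vs0_sp, r2_sp = models["SP"]
vs_sp_at_10m = predict_vs(10.0, a_sp, vs0_sp)  # deeper than any SP sample: extrapolated
```

Values predicted inside the sampled depth range are interpolations, whereas values such as `vs_sp_at_10m` lie outside it and inherit the additional uncertainty of any linear extrapolation.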
Results may be used in the future for comparison with other Italian estimates (Romagnoli et al., 2022) or combined with ambient vibration measurements to define the thickness of the resonant sedimentary layers (D'Amico et al., 2008;Giannini et al., 2021).

Conclusions
As part of the PERL project, a considerable number of investigations were collected in the Terre del Reno municipality, Emilia-Romagna region (Italy). This area experienced the most extensive liquefaction effects during the 2012 Emilia-Romagna seismic crisis and remains exposed to severe seismic hazards and seismically induced effects due to its complex geological setting.
Thanks to this study, complete and free access to the PERL dataset, which includes 1805 punctual and linear in situ investigations consisting of geological, geotechnical, geophysical, and hydrogeological data, is provided.The database is composed of 71 % of penetrometer tests, 16 % of boreholes and trenches, 12 % of geophysical investigations, and 1 % of laboratory and hydrogeological tests.
Two applications of the PERL dataset are presented to highlight its potential and to show that a large, high-quality dataset can be critical to infer information in areas and/or portions of the soil characterized by poor or sparse data. The first example points out the potential of the database in overcoming problems due to the uneven distribution of surveys across the territory, while the second one spotlights its capability to provide information at subsoil depths not reached by investigations.
Other major outcomes of the PERL project in the Terre del Reno municipality included (i) a detailed reconstruction of the subsoil geology to examine the stratigraphic control on earthquake-induced liquefaction (Tentori et al., 2022); (ii) data-driven and automatic subsoil characterization through the analysis of CPT-based soil behavior type (SBT) and soil behavior type indexes (Ic), combining geostatistical and artificial intelligence genetic approaches (Baris et al., 2022); and (iii) a third-level seismic microzonation study of the Terre del Reno municipal area (Varone et al., 2022) to mitigate seismic and seismically induced hazards through sensible urban planning.
Ongoing studies focus on the definition of a comprehensive methodology to quantify liquefaction susceptibility in areas dominated by complex geostratigraphic conditions by applying a multi-level approach relying on simplified models, and on promoting the identification of potentially liquefiable granular bodies, thus mitigating earthquake-induced hazards and allowing for a sustainable development of this urban area.
Figure 1. (a) Tectonic sketch map of the Po Plain (northern Italy) showing the main buried faults of the northern Apennines and southern Alps and epicenters of the two mainshocks from the 2012 Emilia sequence (red stars) (modified from Tentori et al., 2022, and Bruno et al., 2021). The black-lined rectangle encloses the study area. (b) Simplified stratigraphic cross section and major tectonic structures along the trace A-A in panel (a) (modified after Boccaletti et al., 2004). The tectonic structures are enucleated into Mesozoic to Paleogene carbonate successions and largely controlled the sedimentary evolution of the terrigenous basins during the Neogene (Ghielmi et al., 2013; Rossi et al., 2015; Ricci Lucchi, 1986). The Pliocene-Pleistocene boundary records the transition from turbiditic sedimentation to marine clay deposition, whereas the Quaternary sedimentary fill consists of marine deposits, nearshore sands, and alluvial deposits (see text for details).

Figure 2. Synthetic workflow of the method used to merge the MUDE, RER, and SM databases, as well as the newly performed investigations, into the PERL dataset.

Figure 3. Spatial distribution of the in situ punctual (a) and linear (b, c) investigations composing the PERL dataset. The digital elevation model (DEM) was retrieved from Regione Emilia-Romagna (2015).

Figure 4. PERL dataset characteristics.(a) Classes of in situ investigations and (b) depth reached by penetrometer tests, geophysical investigations (CH and DH), boreholes, and trenches.

Figure 5. Geostatistical maps of the cumulated thickness of the liquefiable layer.

Figure 6. Scatter plot and regression line model for lithology MH (a) and SP (b). Equations and $R^2$ are also reported.
Since the digital formats of these investigations were originally not available, geolocalization, key information, and measured parameters were obtained from the digital scans of technical and geological reports.
- Modello Unico Digitale per l'Edilizia (Unique Digital Model for Building; MUDE database). The MUDE database consists of 384 records including punctual and linear in situ investigations. Data were extracted from a series of technical reports produced to plan the reconstruction works of buildings that collapsed during the 2012 seismic crisis.
- Seismic Microzonation Studies (SM database). The SM database is composed of 1284 records, including punctual as well as linear in situ investigations. These investigations are geolocalized and organized in a standardized structure according to Commissione tecnica per la microzonazione sismica (2015). The key information (typology, date, coordinates, etc.) of each investigation is stored in a dedicated table, while all the measured parameters are reported in chained tables. This database is available at https://www.webms.it/ (last access: 12 September 2022).

Table 1. Coefficients ($a$ = slope, $V_{s0}$ = intercept) of the regression model for two lithologies (L) and related coefficient of determination ($R^2$).