A methodology to detect incompleteness of
macroseismic intensity data at the local scale is presented. In particular,
the probability that undocumented effects actually occurred at a site is
determined by considering intensity prediction equations (in their
probabilistic form) integrated with observations of known events
documented at surrounding sites. The outcomes of this analysis can be used
to investigate how well known and representative the seismic histories of
localities are (i.e., the list of documented effects through time). The proposed approach is applied to the Italian area. The analysis shows that, at most of
the considered sites, the effects of intensity

Extending the knowledge of the seismicity of an area as far back as possible in time is essential in seismological research, including seismotectonic investigations and seismic hazard assessment. For these purposes, earthquake catalogues spanning hundreds of years are an essential complement to instrumental data, which cover only the last few decades. The compilation of these catalogues relies on the analysis of the effects documented on the human and natural environment during past earthquakes, interpreted and standardized in terms of macroseismic intensity scales (e.g., MCS – Sieberg, 1923; MMI – Wood and Neumann, 1931; MSK – Medvedev et al., 1964; EMS-98 – Grünthal, 1998). Intensity data available at different localities for a given earthquake (intensity data points, hereafter IDPs) can be used to constrain its epicentral location and magnitude with a variety of methodological approaches (e.g., Bakun and Wentworth, 1997; Gasperini et al., 1999, 2010; Provost and Scotti, 2020).

Compared to other regions of the world, the knowledge of European seismicity is particularly detailed and extends far back in time (Albini et al., 2013; Locati et al., 2014; Rovida et al., 2020a, 2022a), and Italy stands out among European countries. The bulk of the current Italian Parametric Earthquake Catalogue CPTI15 version 4.0 (Rovida et al., 2020b, 2022b) derives from the harmonization and parametrization of intensity data contained in the Italian Macroseismic Database DBMI15 version 4.0 (Locati et al., 2022). In fact, the majority of the earthquakes contained in CPTI15, which spans from 1000 to 2020 CE, are supported by IDPs, in particular those in the pre-instrumental period, as a result of more than 45 years of research in the field of historical seismology, represented in the Italian Archive of Historical Earthquake Data (ASMI; Rovida et al., 2017). Despite the increase in the number of macroseismic studies with time, historical research has remained anchored to the long tradition of national and regional seismological compilations, based mainly on local historiography and summarized in the pioneering work of Baratta (1901), and was later influenced by projects commissioned for various purposes, such as the identification of sites for nuclear power plants in the 1980s (Stucchi, 1993; Camassi, 2004). As a consequence, many investigations focused on specific, sometimes limited, geographical areas. This influences the content of the CPTI15 earthquake catalogue, the completeness of which needs to be analyzed from both the time and space points of view to evaluate how representative it is of the actual seismicity (Albarello et al., 2001; Stucchi et al., 2004, 2011). In addition, earthquakes in a given completeness time span and area may show gaps in terms of documented effects at the sites. The assessment of earthquake parameters from intensity data is strictly connected to their reliability, number and spatial distribution.
In Italy, as in the rest of Europe, there are many earthquakes attested by very few IDPs, or even a single one, which, of course, do not constrain the earthquake location and magnitude (Albini and Rovida, 2018; Albini, 2020; Rovida et al., 2020a). Moreover, the size of an earthquake and the number of its IDPs are not related: low-magnitude earthquakes may be supported by many IDPs, whereas high-magnitude earthquakes may be represented by one or a few IDPs, which often correspond to the highest available intensities. This means that several places may not have documented the effects related to a given event, regardless of its size. Analyzing the undocumented earthquake effects and providing an estimate in terms of macroseismic intensity represent the basis for investigating the knowledge of the seismic history of a given site. Despite this, no such in-depth analysis is yet available at either the European or the regional scale.

The aim of this study is to propose a coherent probabilistic approach to detect sites where seismic effects of past earthquakes could be missing, for several reasons. Moreover, it also provides a deeper insight into the completeness level of data relative to historical earthquakes, and this may be useful to identify possible biases when incomplete macroseismic data are used in seismological analyses. An application of this approach is presented here, focusing on the Italian territory. In this area, a large number of macroseismic intensity data are available and have been extensively used for seismic hazard assessment and other seismological investigations (e.g., D'Amico and Albarello, 2008; Faenza and Michelini, 2010; Gomez-Capera et al., 2020; Oliveti et al., 2022).

Historical research in macroseismology over the last few decades
has led to a wealth of studies that present data on earthquakes in Italy and
surrounding areas in a variety of formats. These studies are
inventoried and gathered in the Italian Archive of Historical Earthquake
Data (ASMI;

The latest version of DBMI (DBMI15 v.4.0; Locati et al., 2022) makes
123 981 IDPs available, mostly expressed in the MCS macroseismic scale,
related to 3229 earthquakes in the time window 1000–2020 and referred to
15 343 Italian localities. These data are the result of 191 different studies, and most IDPs (i.e., 60 %) come from the recent (1980–2005) earthquakes provided by the Macroseismic Bulletin of the Istituto Nazionale di Geofisica e Vulcanologia (INGV) (e.g., Gasparini et al., 2003, 2011) and from the Catalogue of Strong
Earthquakes in Italy CFTI4Med (Guidoboni et al., 2007). The remaining part consists
of intensity data from different studies, dealing with a great number of
earthquakes (e.g., Molin et al., 2008; Camassi et al., 2011; Azzaro and
Castelli, 2015); scientific papers on single earthquakes, areas, or periods;
and macroseismic field surveys of recent earthquakes (e.g., Tertulliani et
al., 2012). The number of available data per earthquake and per locality is
extremely variable. In particular, 5650 out of the 15 343 sites contained in
DBMI15 (36.8 %) have only one intensity value, 2114 have two intensity
data (13.8 %) and 3207 have more than 10 intensity values (20.9 %). Data contained in DBMI15 are used for compiling the seismic history of Italian localities, that is, the list of earthquake effects observed in a place through time, and for assessing macroseismic parameters (epicentral location and magnitude) of the events listed in the CPTI15
(

Despite the enormous and unique amount of data, their homogeneity in time and space is affected by their diverse provenance, by possible gaps in the historical documentation and by its non-systematic investigation. Assessing the completeness of intensity data requires a deep knowledge of the local history of the investigated locality, the preservation of the related documentation and its thorough analysis. In Italy, such complex and time-consuming investigations were performed for 18 localities in the framework of dedicated projects (Albini et al., 2001, 2003). Stucchi et al. (2004) later extrapolated these results to the national scale for the assessment of the historical completeness of the Italian earthquake catalogue. Although the assessment of the historical completeness for all the localities in DBMI15 is hardly feasible, an overall picture is achievable through the identification and analysis of earthquake effects that potentially occurred but were not documented at a number of sites, as proposed in this work.

For this purpose, we applied the methodology developed by Antonucci et al. (2021) to a significant sample of Italian localities, selected as described in Sect. 4, and considered all the earthquakes in CPTI15 with two exceptions: volcanic events, because the attenuation of intensity in volcanic areas differs from that of the rest of the Italian territory (e.g., Carletti and Gasperini, 2003; Azzaro et al., 2006), and earthquakes with instrumental depth greater than 40 km, because they are generally only slightly felt at the surface and are thus likely absent from the historical records.
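
The two exclusion criteria described above can be sketched as a simple filter. This is only an illustration: the `Event` record and its field names are hypothetical and do not reproduce the actual CPTI15 data format. Events without an instrumental depth are assumed here to be retained.

```python
from dataclasses import dataclass
from typing import Optional

MAX_DEPTH_KM = 40.0  # depth limit quoted in the text


@dataclass
class Event:
    """Hypothetical catalogue record (not the real CPTI15 schema)."""
    event_id: str
    is_volcanic: bool
    depth_km: Optional[float]  # None when no instrumental depth exists


def keep_event(ev: Event) -> bool:
    """Return True if the event passes both exclusion criteria."""
    if ev.is_volcanic:
        return False  # attenuation in volcanic areas differs
    if ev.depth_km is not None and ev.depth_km > MAX_DEPTH_KM:
        return False  # deep events are only slightly felt at the surface
    return True


# Made-up records, for illustration only.
catalogue = [
    Event("A", False, None),   # historical event, no instrumental depth
    Event("B", True, 2.0),     # volcanic event
    Event("C", False, 55.0),   # deeper than 40 km
    Event("D", False, 40.0),   # exactly at the limit, retained
]
selected = [ev.event_id for ev in catalogue if keep_event(ev)]
```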

Geographical distribution of the 228 selected sites.

Intensity data can be calculated where the effects of a given earthquake with known location and magnitude are missing. A common methodology relies on the use of intensity prediction equations (IPEs) for computing an intensity value at a considered locality as a function of the source-to-site distance and the magnitude or epicentral intensity of an earthquake. The most recent IPE for the Italian area was published by Pasolini et al. (2008) and is based on a classical functional form, similar to that used for ground motion prediction equations (GMPEs), with the physical terms of geometric spreading and anelastic attenuation (Mak et al., 2015). This IPE, recently recalibrated by Lolli et al. (2019) using the data collected in the DBMI15 v.1.5 (Locati et al., 2016) and CPTI15 v.1.5 (Rovida et al., 2016, 2020b), was used in this study.
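
As an illustration of the kind of functional form described above, the following sketch computes an expected intensity from magnitude and epicentral distance, with one term linear in distance (anelastic attenuation) and one logarithmic term (geometric spreading). The exact form, the sign convention and all coefficient values are assumptions made for this example; they are placeholders and not the values published by Pasolini et al. (2008) or Lolli et al. (2019).

```python
import math

# Hypothetical placeholder coefficients, for illustration only.
A_ANELASTIC = 0.01     # intensity units per km (anelastic attenuation)
B_GEOMETRIC = 1.0      # intensity units per unit of ln(distance)
H_PSEUDO_DEPTH = 4.0   # pseudo-depth in km
C0, C1 = -5.0, 2.0     # intensity at the epicentre = C0 + C1 * Mw


def epicentral_intensity(mw: float) -> float:
    """Expected intensity at the epicentre for moment magnitude mw."""
    return C0 + C1 * mw


def expected_intensity(mw: float, r_epi_km: float) -> float:
    """Expected intensity at epicentral distance r_epi_km (assumed form)."""
    d = math.hypot(r_epi_km, H_PSEUDO_DEPTH)  # source-to-site distance
    return (
        epicentral_intensity(mw)
        - A_ANELASTIC * (d - H_PSEUDO_DEPTH)
        - B_GEOMETRIC * (math.log(d) - math.log(H_PSEUDO_DEPTH))
    )


# Expected intensity for a hypothetical Mw 6.0 event at increasing distances.
profile = [expected_intensity(6.0, r) for r in (0.0, 10.0, 50.0, 100.0)]
```

With this form, the value at zero epicentral distance coincides with the epicentral intensity, and the expected intensity decreases monotonically with distance.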

As extensively discussed in Albarello and D'Amico (2004) and Antonucci et
al. (2021), one way to express the uncertainties related to each intensity
estimate is to use a probabilistic approach. In particular, the intensity
value calculated at the site by the IPE is estimated through a normal
probability distribution using the average

Furthermore, when the intensity related to a given earthquake is not
documented at the considered site but is available at close sites, the value
estimated with the IPE can be constrained with such intensity values
(Albarello et al., 2007) observed in at least one neighboring (within 20 km)
locality. The distance of 20 km was selected through an analysis of more
than 15 000 Italian sites contained in DBMI15. We investigated the geographic
distribution of these localities by calculating the number of localities within
a set of possible distance thresholds for every site (Antonucci et al.,
2021). In particular, with a Bayesian approach, it is possible to estimate
the discrete probability density distribution

To analyze the number and severity of undocumented macroseismic effects that might have occurred in Italy in the past, a dataset of sample localities was defined. The localities were selected according to their geographical distribution and the number of associated macroseismic observations in DBMI15, exclusively on the basis of expert judgment and without automatic procedures. The sites had to be densely and homogeneously distributed over the Italian territory while striking a balance between main cities and small villages. Moreover, the selection considered both the differences in urbanization across Italy and the distance of 20 km among localities, which is adopted in the Bayesian procedure for estimating the intensities (Antonucci et al., 2021). The selected dataset includes 228 sites (Table S1 in the Supplement) distributed over the whole Italian territory, with the exception of localities in the very low-seismicity area of Sardinia, which have too few data. The sites were chosen among the localities with the highest number of intensity data in DBMI15, also taking into account their geographical distribution and the distances from one another. In particular, the seismic history of each of the 228 localities contains at least two intensity values greater than or equal to 5 MCS (Fig. 1), for a total of 10 323 macroseismic data ranging in intensity from 2 to 10–11 MCS. In addition, the 228 sites have 2201 data expressed with non-conventional descriptive codes (e.g., “HD” for heavy damage; see Locati et al., 2022).

Probability of undocumented effects with intensity

Focusing on the data with observed intensity

The severity of a given effect is computed at each selected site for each earthquake in CPTI15 when the respective IDP is lacking in DBMI15 (see Sect. 3). As a result, the intensities corresponding to undocumented effects are estimated in two ways: (i) from earthquake parameters through the adopted IPE, i.e., effects documented neither at the site nor at nearby sites but likely to have happened on the basis of the content of the earthquake catalogue, and (ii) by integrating the above information with observations available at other localities within 20 km from the considered site (see Eq. 1). It is assumed that in the latter case, the probability that the considered intensity threshold has actually been reached is better constrained than in the former case. Intensity data inferred from the IPE, either “corrected” with macroseismic observations available at nearby localities or not, can be considered “potentially lost” data because, although the locality likely experienced a given level of shaking as a consequence of a known earthquake, this is not documented.
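
The two estimation modes can be illustrated with a minimal sketch. It assumes that (a) the IPE provides a mean and a standard deviation, which are discretized into a probability mass function over integer MCS degrees by integrating a normal density over unit-wide bins, and (b) an observation at a neighbouring site updates that distribution through Bayes' theorem. The discretization rule, the likelihood function (a geometric decay in the intensity difference), the treatment of several observations as conditionally independent and all numerical values are assumptions made for this example; they are not the formulation or the empirical distributions of Antonucci et al. (2021).

```python
import math

INTENSITIES = list(range(1, 13))  # integer MCS degrees 1..12


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prior_pmf(mu: float, sigma: float) -> dict:
    """Mode (i): discretize N(mu, sigma) over the intensity degrees.

    Assumed rule: degree i collects the mass in [i - 0.5, i + 0.5);
    the lowest and highest degrees absorb the tails.
    """
    pmf = {}
    for i in INTENSITIES:
        lo = -math.inf if i == INTENSITIES[0] else i - 0.5
        hi = math.inf if i == INTENSITIES[-1] else i + 0.5
        pmf[i] = norm_cdf((hi - mu) / sigma) - norm_cdf((lo - mu) / sigma)
    return pmf


def likelihood(obs: int, site: int, decay: float = 0.5) -> float:
    """Hypothetical P(neighbour observes `obs` | site intensity `site`)."""
    weights = {j: decay ** abs(j - site) for j in INTENSITIES}
    return weights[obs] / sum(weights.values())


def posterior_pmf(prior: dict, neighbour_obs: list) -> dict:
    """Mode (ii): Bayesian update of the prior with nearby observations."""
    post = dict(prior)
    for obs in neighbour_obs:
        post = {i: p * likelihood(obs, i) for i, p in post.items()}
    total = sum(post.values())
    return {i: p / total for i, p in post.items()}


def exceedance(pmf: dict, threshold: int) -> float:
    """Probability that the intensity reaches or exceeds `threshold`."""
    return sum(p for i, p in pmf.items() if i >= threshold)


# Hypothetical IPE output at the site and one neighbouring observation.
prior = prior_pmf(mu=5.4, sigma=1.0)
posterior = posterior_pmf(prior, neighbour_obs=[7])
p6_prior = exceedance(prior, 6)
p6_posterior = exceedance(posterior, 6)
```

In this example a neighbouring observation of intensity 7 raises the probability of reaching or exceeding intensity 6 relative to the IPE-only estimate, which is the qualitative behaviour expected when nearby observations constrain the estimate.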

The earthquake effects at the selected sites can be analyzed on a site-by-site basis in order to evaluate (i) the number of undocumented effects at the considered sites and (ii) the probability that each of these effects might have reached a given intensity level. In other words, we estimated the probability of having an undocumented intensity value at each of the considered sites. Figure 2 shows, as an example, the results obtained at four sites in terms of the probability of reaching or exceeding intensity 6 MCS, estimated through the Bayesian approach described above. These sites (see Fig. 1 for location) were selected to represent geographical areas characterized by different levels of seismicity: (i) Susa in the western Alps, (ii) Modena in the Po Plain, (iii) Spoleto in the central Apennines and (iv) Roccadaspide in southern Italy.

Probability of undocumented effects with intensity

As shown in Fig. 2, the undocumented effects at the four sites are quite
different in terms of both the total number and the exceedance
probabilities. Regarding the total number of effects, the highest numbers
were estimated at Spoleto (central Apennines) and Roccadaspide (southern
Italy) with 93 and 45 effects with intensity

Number of undocumented effects at each selected site with
probabilities

The exceedance probabilities for higher intensity levels, i.e., greater than
or equal to 7, 8 and 9 MCS, were also calculated and analyzed. Figure 3
reports the results obtained for intensities greater than or equal to 8 MCS at
the same four sites of Fig. 2. In particular, Fig. 3a and b show that
the earthquakes that might have produced an intensity at least equal to 6 MCS (Fig. 2a and b) were not able to produce an intensity

Probability that at least one effect of intensity

As shown in the previous section, each undocumented effect estimated at a
site has different exceedance probabilities for different intensity levels.
In this respect, the number of effects potentially lost at the 228 selected
sites can be quantified by selecting a given probability threshold. For this
purpose, at each site, we counted the number of undocumented effects with
probabilities

Taking into account the different exceedance probabilities computed at each
locality for
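
The two site-level summaries discussed in this section can be sketched as follows: the number of undocumented effects whose exceedance probability reaches a chosen threshold, and the probability that at least one of them reached the considered intensity. The threshold value, the example probabilities and the assumption that the effects of different earthquakes are independent are illustrative choices made for this sketch and are not taken from the study.

```python
import math


def count_above(probabilities: list, threshold: float) -> int:
    """Number of undocumented effects with probability >= threshold."""
    return sum(1 for p in probabilities if p >= threshold)


def prob_at_least_one(probabilities: list) -> float:
    """P(at least one effect reached the intensity level).

    Assumes independence between earthquakes:
    1 - product over events of (1 - p_k).
    """
    return 1.0 - math.prod(1.0 - p for p in probabilities)


# Hypothetical exceedance probabilities of four undocumented effects
# at one site, for a single intensity level.
site_probs = [0.10, 0.50, 0.75, 0.90]
n_likely = count_above(site_probs, threshold=0.5)  # hypothetical threshold
p_any = prob_at_least_one(site_probs)
```

Even when each individual effect is uncertain, the probability that at least one of them occurred can be high, which is why the two summaries are best read together.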

This work provides a probabilistic methodology for the quantitative
estimation of the effects of past Italian earthquakes that can be considered
potentially lost at a sample set of sites and analyzes the results both at
the local (site by site) and national scale. The results show some gaps in
the macroseismic data contained in DBMI15, despite their quality and
quantity. Indeed, at least one damage effect with intensity

The joint analysis of macroseismic intensity data observed through time in a place and those calculated through an IPE, constrained with data observed at nearby localities with the Bayesian procedure described in Antonucci et al. (2021), provides a general methodology to investigate the knowledge of seismic histories and to estimate the level of representativeness of each site as a function of the seismicity of a given area. The results are given in a probabilistic form that allows considering both the uncertainties related to the assessment of intensity at a given site and the nature of macroseismic data (ordinal, discrete and range-limited). Such a procedure is repeatable and applicable to other regions and contexts. However, the outcomes are strictly dependent on the number of earthquakes and the reliability of the parameters contained in the input seismic catalogue as well as on the adopted IPE, with its specific functional form and parameters, and the associated uncertainties. In this regard, particular care should be given to the interpretation of the results, considering that the content of a catalogue is progressively less representative of the actual seismicity going back in time, especially for small events. This implies that changing either the considered catalogue or the IPE could considerably change the results in terms of calculated undocumented effects at the sites.

Regardless of these limitations, the analyses show that the intensity data documented at a given site may not be representative of the actual shaking experienced through time even with an enormous number of macroseismic data, such as in Italy. Consequently, the use of these data for several seismic analyses, such as intensity-based seismic hazard assessment at a local scale and testing of probabilistic seismic hazard models, should include a careful preliminary analysis of the representativeness and completeness of macroseismic data at the sites, regardless of the study area considered. For this purpose, the main future goal will be checking the consistency of these results with those obtained through an in-depth historical investigation, which is the only way of providing robust quantitative estimates of the temporal completeness of the seismic history of a site for different intensity levels.

DBMI15 is available at

The supplement related to this article is available online at:

AA edited most parts of the paper and performed the statistical analyses. AR, VD'A and DA contributed to the manuscript and supervised the research.

The contact author has declared that none of the authors has any competing interests.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The authors wish to thank Carlos Hector Caracciolo, Josep Batlló, Raffaele Azzaro and the editor Animesh Gain for their comments and suggestions. The authors wish to warmly thank Mario Locati for his constant support and Paola Albini for her comments and suggestions on this work and, in general, for her valuable and continuous teachings on historical seismology.

This paper was edited by Animesh Gain and reviewed by Carlos H. Caracciolo, Josep Batlló, and Raffaele Azzaro.