Modelling the seismic potential of active faults and the associated epistemic uncertainty is a fundamental step of probabilistic seismic hazard assessment (PSHA). We use SHERIFS (Seismic Hazard and Earthquake Rate In Fault Systems), an open-source code allowing us to build hazard models including earthquake ruptures involving several faults, to model the seismicity rates on the North Anatolian Fault (NAF) system in the Marmara Region. Through an iterative approach, SHERIFS converts the slip rate on the faults into earthquake rates that follow a magnitude frequency distribution (MFD) defined at the fault system level, allowing us to model complex multi-fault ruptures and off-fault seismicity while exploring the underlying epistemic uncertainties. In a logic tree, we explore uncertainties concerning the locking state of the NAF in the Sea of Marmara, the maximum possible rupture in the system, the shape of the MFD and the ratio of off-fault seismicity. The branches of the logic tree are weighted according to the match between the modelled earthquake rate and the earthquake rates calculated from the local data, earthquake catalogue and palaeoseismicity. In addition, we use the result of the physics-based earthquake simulator RSQSim to inform the logic tree and increase the weight on the hypotheses that are compatible with the result of the simulator. Using both the local data and the simulator to weight the logic tree branches, we are able to reduce the uncertainties affecting the earthquake rates in the Marmara Region. The weighted logic tree of models built in this study will be used in a following article to calculate the probability of collapse of a building in Istanbul.

The North Anatolian Fault system (NAFS) runs through the north of Turkey for a distance of more than 1500 km (Fig.

Regional tectonic setting. Modified from

Fault system and the earthquake catalogue used in this study (

Building on the findings of these studies, several earthquake rate forecast (ERF) models and hazard maps have been developed (

In this study, we use the recently developed SHERIFS (Seismic Hazard and Earthquake Rate In Fault Systems) code (

The geometry of the network could potentially host larger ruptures than the one observed in historical time. For example, the change in azimuth in the geometry of the fault in front of Princes' Island (Fig.

It has long been observed that the number of earthquakes in a region decreases with the magnitude of the earthquake.
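This observation is commonly summarized by the Gutenberg-Richter (GR) relation. It is given here in its standard textbook form as a reminder; the symbols $a$ and $b$ are the usual productivity and slope parameters, not values estimated in this study:

```latex
\log_{10} N(\geq M) = a - b\,M
```

Here $N(\geq M)$ is the annual number of earthquakes of magnitude $M$ or larger in the region. A truncated GR distribution applies the same relation only between a minimum and a maximum magnitude.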

The earthquake rates modelled with SHERIFS using each combination of uncertainties will be compared to the earthquake rates calculated from the earthquake catalogue and the palaeoseismic records in order to give a score to each model. However, for some hypotheses, the comparison with the data is not sufficient for rating one hypothesis against another. We tackle this issue by modelling a synthetic catalogue of the fault system using the earthquake simulator RSQSim (

The final goal of this study is to enhance the understanding of the seismic risk in Istanbul. The earthquake occurrence models developed in this study will be used in a later study to calculate the probability of collapse of a theoretical building in Istanbul. The impact of each uncertainty affecting the earthquake rates on the uncertainty in the estimates of the probability of collapse will be quantified and discussed.

Fault traces and some of the slip-rate estimates along the North Anatolian Fault system have been recently updated in

The parameters of the main faults used in this study are presented in Table

Model parameters of the faults of the Marmara Region, closer to Istanbul. The full parameter table for all the faults in the model is available in the electronic Supplement. See the text for details on slip-rate setting and definition of partial and deep creep.

The scientific community has been debating over the possibility of the NAF creeping in the western Marmara Region, along the Terkidag, central basin, Kumburgaz and Avcilar sections of the fault (Table

For the faults within the Armutlu peninsula (Fig.

In this study, we combined two catalogues: the earthquake catalogue from

Completeness time as a function of magnitudes used in this study.

Based on the analysis of the depth distribution of earthquakes in the instrumental catalogue (

Annual rate of

In this section, we describe two approaches for calculating earthquake rates in fault systems. The first approach is SHERIFS (

SHERIFS uses an iterative budget spending approach of the slip rate of the fault to calculate the annual rate of occurrence of each rupture of a predefined set of ruptures. In an iterative manner, SHERIFS randomly selects user-defined rupture scenarios for which faults involved have a slip-rate budget to spend. The random selection is done in order to ensure that the resulting system level MFD has the shape imposed as input (

SHERIFS takes as input the geometry and slip rate of the faults, the set of multi-fault ruptures that can be expected in the fault network, and the shape of the MFD defined at the fault system level. Before the calculation, the absolute rates of the MFD and the shape of the MFD of each individual fault are not known. They are deduced from the fault slip-rate budget and the other hypotheses. Depending on the combination of input hypotheses and fault parameters, SHERIFS can consider part of the slip-rate budget of some faults as non-main-shock (NMS) slip in order to respect the target MFD shape. An NMS of more than 30 % is most likely an indication that the combination of input hypotheses used does not agree with the fault parameters in the SHERIFS framework and that they should be reconsidered.
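The budget-spending idea can be illustrated with a toy sketch. It is not the SHERIFS implementation: the fault names, the integer slip units, the rupture list and the picking weights (a stand-in for the target MFD shape) are all hypothetical. The sketch only shows how slip that cannot be spent on any available rupture ends up counted as NMS slip.

```python
import random


def spend_budget(budgets, ruptures, weights, seed=0):
    """Toy slip-rate budget spending (illustrative only, not the SHERIFS code).

    budgets  : dict fault name -> slip budget in integer units (hypothetical)
    ruptures : list of tuples of fault names that rupture together
    weights  : dict rupture -> relative picking probability (stand-in for
               the target MFD shape)
    Returns (units of slip spent per rupture, NMS fraction per fault).
    """
    rng = random.Random(seed)
    remaining = dict(budgets)
    spent = {r: 0 for r in ruptures}
    while True:
        # A rupture can be picked only if every fault involved has budget left.
        available = [r for r in ruptures if all(remaining[f] > 0 for f in r)]
        if not available:
            break
        picked = rng.choices(available,
                             weights=[weights[r] for r in available])[0]
        for fault in picked:
            remaining[fault] -= 1  # spend one unit of slip on each fault
        spent[picked] += 1
    # Whatever could not be spent is treated as non-main-shock slip.
    nms = {f: remaining[f] / budgets[f] for f in budgets}
    return spent, nms
```

In this toy setting, if faults `A` and `B` can only rupture together and `A` has twice the budget of `B`, half of the budget of `A` cannot be spent and is reported as NMS slip.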

One uncertainty that SHERIFS allows us to explore is the proportion of seismicity that can occur in the background on faults that are unknown or not considered as active in the model. In most PSHAs, this is taken into account by a background zone with a GR MFD truncated at a given magnitude (

In SHERIFS, it is possible to define a priori the proportion of earthquakes that can be expected on the faults and the proportion in the background for each range of magnitude. In order to assess these proportions, we analyse the spatial distribution of earthquakes compared to the fault traces (Fig.

Most of the observed seismicity of the Marmara occurs close to the known faults; therefore, we chose to consider that a large proportion of the seismicity is on the known faults for a wide range of magnitudes (Table

Ratio of earthquakes assumed to be on the faults over the total number of earthquakes in the system for each background hypothesis.

Distribution of earthquakes between the background seismicity and the faults for each background hypothesis. In black is the earthquake rate for the entire fault system (faults

The annual rates of earthquakes obtained by analysis of the

We explored two alternative hypotheses for the target MFD: one in which the target MFD follows a GR truncated between a minimum magnitude and a maximum magnitude, and one in which the target MFD follows a shape tuned to that of the rates deduced from the catalogue. The tuned shape (TS) MFD is described by the following equation and is composed of two parts, both defined by a double truncated GR with a

The historical catalogue does not contain any earthquake with a magnitude larger than 7.5 since 1700. Based on the statistics of the earthquake catalogue,

Example of the six largest ruptures included in the models using Set 1

The resolution of the earthquake records diminishes as we consider older events, and it is possible that larger earthquakes occurred in the NAFS but were not observed either in historical times or in the palaeoseismic records. Furthermore, since most bends in the system could be crossed by a rupture without a large jump in distance or a large change in azimuth, we need to consider that larger earthquakes might be possible. This hypothesis has been considered in previous studies (

Since there are several hundred ruptures considered in each hypothesis, we chose to only illustrate the larger ones for each set in Fig.

As discussed earlier in the presentation of the NAFS, there is uncertainty concerning the locking condition of the NAF in the western Sea of Marmara. Four hypotheses of locking conditions ranging from fully creeping to fully locked are explored in a logic tree to represent the current state of knowledge. The values of slip rates used for each hypothesis are presented in Table

SHERIFS is run with the hypotheses of each branch of the logic tree (Fig.

Logic tree explored in this study. For each branch, the scaling law parameters and the slip-rate uncertainties are explored through 10 random samples.

In this study, we only modelled part of the NAF (inside the dotted box in Fig.

The rates modelled using SHERIFS are slightly higher than the ones of the catalogue in the central zone (Fig.

Comparison between the rates calculated from the earthquake catalogue (in red with uncertainties in grey) and the model rates (in green) for the central zone of the fault system, close to Istanbul (Fig.

In Fig.

Comparison of the modelled rupture rates with the rates calculated from the palaeo-earthquake record at each palaeo-earthquake site (Fig.

At the Ganos 1 site (Fig.

Comparison of the modelled rupture rates with the rates calculated from the palaeo-earthquake record at the Ganos 1 site (Fig.

The comparison between the modelled earthquake rates and the data allows us to reduce part of the uncertainties explored in the logic tree, but some uncertainties still remain, notably the uncertainties concerning the MFD shape and the maximum rupture size.

In the hope of reducing these uncertainties, we implemented our fault system in the physics-based simulator RSQSim (

RSQSim is a boundary element model that applies the rate and state equation (

RSQSim takes as input the same information as SHERIFS concerning the fault parameters but does not require a target MFD shape or a set of possible FtF ruptures. The MFD of the fault system and the ruptures will be deduced from the synthetic earthquake catalogue which results from the combination of the fault loading rates and the rate and state friction law as implemented in RSQSim.

In this study, we will use the same friction parameters

We are using triangular elements of 1 km size. For this reason, we will only discuss earthquakes larger than magnitude 6 that rupture a number of elements large enough (around 100) to be representative of an earthquake rupture.

Using the fault geometry and slip rate with the fully locked hypothesis (Table

Considering the MFD of the synthetic catalogue, we can reach two conclusions: RSQSim cannot reproduce a GR MFD with the given fault system, and the maximum magnitude that can be generated is closer to magnitude 7.7 than to magnitude 8.0. These two conclusions are not affected by uncertainties in

We also explored the impact of having a region of the fault creeping in the western Sea of Marmara. In a way similar to the deep creep model used in SHERIFS, in RSQSim, elements that are below 5 km depth were attributed a

The results are also compared with the hybrid loading method used in

The flexibility of SHERIFS for modelling the earthquake rates in the NAFS allowed us to explore a wide range of uncertainties in a logic tree framework. The SHERIFS approach does not contain any physical constraints; it can therefore accommodate the different input hypotheses that are being discussed by the scientific community but also can lead to a large range of uncertainties in the earthquake rates. Since SHERIFS only uses the fault data as input, it is possible to compare the modelled rates with rates calculated from the earthquake catalogue and palaeo-earthquakes which can be considered as independent data from the SHERIFS input.

Branch weights in a logic tree are usually based on the scientific value of each hypothesis explored in the logic tree but not on the capacity of the modelled earthquake rates in each branch to reproduce the data. The weight of an individual branch of the logic tree is simply the product of the weights of each hypothesis used for the branch. In this discussion, we propose a novel approach to set the weight of each branch of the logic tree accounting for both the input hypotheses used and the ability to reproduce the independent data.
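The conventional product rule described above can be sketched as follows. The level and option names follow the hypotheses explored in this study, but the uniform prior weights are placeholders chosen for illustration, not the weights used here.

```python
from itertools import product
from math import prod

# Placeholder tree: option names follow the text, uniform weights are assumed.
EXAMPLE_TREE = {
    "locking": {"fully creeping": 0.25, "partly creeping": 0.25,
                "deep creep": 0.25, "fully locked": 0.25},
    "mfd": {"GR": 0.5, "TS": 0.5},
    "background": {"background 1": 1 / 3, "background 2": 1 / 3,
                   "background 3": 1 / 3},
    "ruptures": {"Set 1": 0.5, "Set 2": 0.5},
}


def branch_weights(tree):
    """Return {branch: weight}, each weight being the product of the
    weights of the hypotheses selected along the branch."""
    levels = list(tree)
    weights = {}
    for combo in product(*(tree[level].items() for level in levels)):
        names = tuple(name for name, _ in combo)
        weights[names] = prod(w for _, w in combo)
    return weights
```

With four locking states, two MFD shapes, three background ratios and two rupture sets, this yields 48 branches whose weights sum to 1 whenever the weights of each level sum to 1.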

We set up a quantitative scoring system in order to set the weight of each individual branch of the logic tree. For each branch, four scores are calculated (Fig.

Scoring system allowing the individual weighting of each branch of the logic tree according to its capacity to spend the fault slip-rate budget in earthquake rates (S1), its capacity to reproduce the earthquake rates observed in the data (S2 and S3), and the agreement between the input hypotheses and the results of the discussion on the physics-based synthetic catalogue (S4).

We calculate the NMS value for each model and each fault section of the central zone. Since a large NMS value is likely linked to incompatibilities between input hypotheses, given the SHERIFS framework, a low score is attributed to models with high NMS slip value. For a given model, if the mean NMS value for the fault sections of the central zone is greater than 40 % or if the NMS value of one of the faults of the central region is greater than 50 %, the score is 0. The score is 1 if the average NMS of the sections of the central region is less than 20 %. Between the average values of 20 % and 40 %, the score linearly decreases from 1 to 0 as the NMS value increases. As a result, the average score of the fully creeping models is 0.27, the average score of the partly creeping models is 0.6, and the average scores of both the deep creep and fully locked models are 0.96.
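The NMS-based scoring rule described above can be written as a small function. This is a minimal sketch of the stated thresholds; the function name and the input layout (a mapping from fault section to NMS value in percent) are assumptions made for illustration.

```python
def nms_score(nms_by_section):
    """Score a model from the NMS values (in %) of the central-zone sections.

    0 if the mean NMS exceeds 40 % or any section exceeds 50 %,
    1 if the mean NMS is below 20 %,
    linear decrease from 1 to 0 for a mean NMS between 20 % and 40 %.
    """
    values = list(nms_by_section.values())
    mean_nms = sum(values) / len(values)
    if mean_nms > 40.0 or max(values) > 50.0:
        return 0.0
    if mean_nms < 20.0:
        return 1.0
    return (40.0 - mean_nms) / 20.0
```

For example, a model whose sections average 30 % NMS, with no section above 50 %, receives a score of 0.5.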

The modelling of the creep conditions assumed on the Terkidag section plays a predominant role in the modelling of earthquake rates on the adjacent Ganos section and hence in the scoring based on NMS. The models considering the Terkidag fault as completely creeping have difficulties reproducing the palaeo-earthquake rates estimated on the Ganos fault (Fig.

Figure

The average scores for models imposing a TS MFD are only slightly higher than those with a GR MFD as a target shape for the MFD of the fault system (0.52 versus 0.49). While both shapes show a good agreement with the rate of large earthquakes calculated from both the catalogue and the palaeoseismicity, the models with a GR MFD overestimate the rate of small to intermediate earthquakes, as seen in Fig.

It can be argued that the deviation of the MFD shape from the GR shape is an artefact of the observation period, which is too short for accurately capturing the full earthquake cycle. Such arguments have been brought forward in California in support of a GR MFD target shape even when the apparent shape of the earthquake catalogue close to the faults might differ from the GR shape (

However, the 10 000-year-long synthetic catalogue generated by RSQSim, complete and representative of the seismic cycle by definition, also diverges from the GR MFD shape (Fig.

According to the scaling laws (

Based on the comparison between the modelled rates with the rates of the earthquake catalogue and the results of RSQSim, and in consideration of the scientific debate around the MFD, we suggest a stronger score (

Based on the comparison between the modelled rates from SHERIFS and the rates calculated from the earthquake catalogue and the palaeoseismicity studies, it is not possible to weight differently the two branches of the logic tree exploring the uncertainty in the rupture scenarios. While the models using Set 2 of ruptures, which allows larger ruptures, fit the catalogue rates slightly better than the models using Set 1 of ruptures, both fits can be considered satisfactory (Fig.

Weighted logic tree established in this study. For each branch, the scaling law parameters and the slip-rate uncertainties are explored through 10 random samples. The weight of each branch is indicated by the bold number.

In the several 10 000-year-long catalogues simulated by RSQSim, we do not observe earthquakes larger than 7.7 (Fig.

In this study, we used RSQSim as an auxiliary tool to bring additional information for the weighting of the logic tree. As SHERIFS does not model the physics of earthquakes but relies on a statistical approach, a physics-based model is complementary. We recognize that further development could be made to the RSQSim model and that the exploration of the uncertainties in the input parameters done in this study is not exhaustive. However, we believe that the results presented above bring sufficient information to affect the balance of the weights of the logic tree, in particular in the cases when the results corroborate the observations made in the data. On the other hand, we do not believe these physics-based calculations are sufficiently exhaustive to bring the weight of a logic tree branch to zero and completely remove a hypothesis. Hypotheses could be removed from the logic tree only if additional modelling with alternative physics-based approaches is performed and leads to similar conclusions.

In the discussion, we established four types of scores for the logic tree branches: the weights established from the analysis of the RSQSim synthetic catalogues and those established from the comparison of the earthquake rates modelled using SHERIFS with the rates calculated from the earthquake catalogue and the palaeo-earthquake record. For each individual branch of the logic tree, these scores are convolved into a final weight unique to the branch (see equation in Fig.

The final weight of each hypothesis is calculated by summing all the branches using a given hypothesis (Fig.

The weights of the MFD hypothesis branches and the set of rupture scenario branches are strongly influenced by the scores imposed after the analysis of the RSQSim synthetic catalogue. While the fit to the catalogue is better with the TS MFD hypothesis, the fit to the palaeo-earthquake rates and the ratio of NMS is similar for both branches. Therefore, the comparison with the data affects the weight only marginally. The same can be said about the set of rupture scenario branches.
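The summation of branch weights into the weight of a hypothesis can be sketched as follows. The branch weights in the usage note are made-up numbers, and the way the individual scores are combined into a branch weight is not reproduced here; the sketch starts from final branch weights that are assumed to be given.

```python
from collections import defaultdict


def hypothesis_weights(final_branch_weights, level):
    """Sum normalized branch weights over all branches sharing a hypothesis.

    final_branch_weights : dict mapping a tuple of hypothesis names
                           (one per logic tree level) to the branch weight
    level                : index of the logic tree level to summarize
    """
    total = sum(final_branch_weights.values())
    summed = defaultdict(float)
    for branch, weight in final_branch_weights.items():
        summed[branch[level]] += weight / total
    return dict(summed)
```

For instance, with hypothetical weights `{("TS", "Set 1"): 0.4, ("TS", "Set 2"): 0.2, ("GR", "Set 1"): 0.3, ("GR", "Set 2"): 0.1}`, calling `hypothesis_weights(weights, 0)` gives the TS hypothesis about 0.6 and the GR hypothesis about 0.4.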

The weights of the background hypothesis branches are only affected by the scores depending on the comparison with the data. Overall, the background 1 hypothesis reproduces the earthquake rates better than the other two background hypotheses. In Fig.

The impact of the weights on the annual earthquake rates is presented in Fig.

Density function distribution of the cumulative earthquake rates in the logic tree for the central zone for different magnitudes. In black is the density function distribution when only the scoring according to the NMS is applied. In green is the density function distribution weighted according to the fit with the data (catalogue and palaeoseismicity). In purple is the final density function distribution of earthquake rates weighted according to the match with the data and the discussion based on RSQSim. The vertical bar shows the mean annual earthquake rate of each distribution. The horizontal bar extends between the 16th percentile and the 84th percentile of each distribution.

In this study, we calculated earthquake rates in the Marmara Region relying on the fault slip rate and geometry as primary information. We combined two innovative approaches: the SHERIFS approach that relies on statistical rules and the RSQSim approach that relies on physical rules.

With SHERIFS, we explored an extensive logic tree of uncertainties concerning the locking condition of the NAF in the Marmara Region, the shape of the MFD, the ratio of seismicity between the background and the faults, the largest possible rupture, and uncertainties in the slip rates and the maximum magnitude predicted by the scaling law. Rather than basing the weights of the branches of the logic tree only on expert judgement of each hypothesis, we take advantage of model performance by comparing the results with data (earthquake catalogue and palaeo-earthquakes) and weighting the branches accordingly.

In addition, the analysis of the synthetic catalogue simulated by RSQSim showed an MFD diverging from a GR and no earthquake of magnitude larger than 7.7. We explored different hypotheses on the input parameters of RSQSim and found these conclusions to be stable. We do not consider the findings from our models to be final, and additional modelling using alternative physics-based models is necessary; nonetheless, we can use these results as an additional source of information to weight the logic tree.

This allowed us to attribute a stronger weight to the branches of the logic tree showing similar features, leading to a final weighted distribution of modelled annual rates that properly represents the state of knowledge of the NAFS in the vicinity of Istanbul (Fig.

The logic tree of earthquake source models developed in this study will later be used in a seismic risk assessment study to evaluate the risk of collapse of a theoretical building in Istanbul.

All the data and input parameters used for the earthquake rate calculation with SHERIFS are available in the electronic Supplement. The SHERIFS code is available at

The supplement related to this article is available online at:

TC, OS and HLC were responsible for the collection and organization of the databases, construction of the SHERIFS and the physics-based models, and writing of the article. KRD, JHD and BES were responsible for the construction of the physics-based models.

The authors declare that they have no conflict of interest.

Publisher’s note: Copernicus Publications remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This project was partly funded by the AXA Research Fund. We thank Aurelien Boiselet from AXA for his involvement in the project. We also thank José A. Alvarez-Gómez and another anonymous reviewer for their valuable reviews and comments that contributed greatly to the improvement of this article.

This research has been supported by the AXA Research Fund (grant no. Joint Research Initiative “Earthquake geology and seismic hazard assessment: tools for risk-informed decision-making.”).

This paper was edited by Filippos Vallianatos and reviewed by José A. Alvarez-Gómez and one anonymous referee.