This paper selects fault source models of typical earthquakes across the globe and uses a volume extending 100 km horizontally from each mainshock rupture plane and 50 km vertically as the primary area of earthquake influence for calculation and analysis. A deep neural network is constructed to model the relationship between elastic stress tensor components and aftershock state at multiple timescales, and the model is evaluated. Based on the resulting aftershock hysteresis model, the aftershock hysteresis effects of the 2008 Wenchuan earthquake and the 2011 Tohoku earthquake are analyzed and compared at different depths. The correlation between the aftershock hysteresis effect and the Omori formula is also discussed. The constructed aftershock hysteresis model fits the data well and can predict the aftershock pattern at multiple timescales after a large earthquake. Compared with traditional aftershock spatial analysis methods, the model is more effective and accounts for the distribution of actual faults instead of treating the earthquake as a point source. The expansion rate of the aftershock pattern is negatively correlated with time, and the aftershock patterns at all timescales are roughly similar and anisotropic.

Strong earthquakes are often followed by a large number of aftershocks, which constitute the aftershock sequence. Aftershocks can cause new damage in the area affected by the main earthquake, so it is necessary to study them. Stein and Lisowski (1983) systematically discussed the influence of the static stress of the main earthquake on the spatial distribution of aftershocks. A large number of earthquake examples show that aftershocks are readily triggered where the change in Coulomb stress produced by the main earthquake is greater than 0.01 MPa (Harris, 1998; Toda, 2003; Ma et al., 2005). In addition to the Coulomb failure stress change method, deep learning is an emerging method that can address some questions of physical mechanism. The prediction of the aftershock sequence based on the stress state of the crustal medium remains a difficult problem and is a focus of source physics (Jordan and Mitchell, 2015; Lecun et al., 2015). A neural network behaves as a black box, which makes it possible to avoid modeling the complicated physical mechanisms explicitly when predicting the aftershock pattern (Bodri, 2001; Moustra et al., 2011). DeVries et al. (2018) proposed a deep neural network to study the spatial distribution of aftershocks following the main earthquake: a neural network classifier based on stress change was designed to estimate the probability of aftershocks at each location. This idea combines traditional physical analysis with data-driven machine learning, which can improve our understanding of the complex physical mechanisms of earthquakes. Kong et al. (2019) also analyzed the necessity of this combination.

The distribution of aftershocks is related not only to spatial changes but also to temporal changes (Kapetanidis et al., 2015; Papadimitriou et al., 2018), which may be related to the actual properties of the medium; i.e., a viscoelastic medium and a porous two-phase medium are closer to the actual geological medium than an elastic medium. The hysteresis effect of the viscoelastic medium on stress change, the effect of the readjustment of pore fluid on stress change and other time-dependent medium properties are equally important to post-earthquake stress change, an issue that is receiving increasing attention in research on post-earthquake effects. In the study of seismic wave propagation and focal mechanisms, the earth medium is assumed to be a completely elastic body. Prior to the main earthquake, the crustal medium is continuously deformed by the long-term, slow action of tectonic stress. During this stress accumulation (Kaviris et al., 2017, 2018), strain energy accumulates continuously and is stored in the crust in the form of elastic strain energy. When the stress exceeds the strength of the crust, the crust loses its stability. Where the crust is already discontinuous, displacement occurs along the existing fracture, producing an earthquake; sometimes new fracture surfaces are produced in locally continuous areas. The elastic strain energy stored in the crust is released in this process. After the main earthquake, the source body and its surrounding medium return to a steady state.
However, because the main earthquake causes a sudden change in the stress state of the medium, the accumulated elastic strain energy in the entire stress field cannot be released completely at once; it continues to accumulate in other areas and is ultimately released in the form of an aftershock sequence. Therefore, there is a hysteresis effect between the aftershocks and the main earthquake (Gu et al., 1979). Omori (1894) and Utsu (1961) proposed formulas for the temporal distribution of aftershocks. However, these formulas are statistical: they cannot reflect the underlying reason for the change in aftershock distribution over time and cannot describe the temporal change in aftershocks spatially. Many scholars have also analyzed the spatiotemporal distribution characteristics of aftershocks by building models, for example, the ETAS (epidemic-type aftershock sequence) model proposed by Ogata (1988), the Kagan–Jackson model (Kagan and Jackson, 1994) and the improved ETAS-based model of Ogata (1998). Wong and Schoenberg (2009) proposed a joint distribution model that parameterizes the aftershock location based on the distance and relative angle between aftershocks and mainshocks. All of the above spatiotemporal models of aftershocks are based on point source earthquakes, whereas actual earthquake sources are faults, so the distribution of the main fault zone should be considered when predicting the aftershock pattern. Some spatial models also ignore the relative angle or distance between the mainshock and aftershocks. These deficiencies are taken into account when building the new prediction model.

The structure of the DNN (deep neural network). The neural network is composed of an input layer, hidden layers, an output layer and the connections between the layers. The function of each hidden layer is to transform the features of the network input.

In this paper, a method based on deep neural networks is proposed to analyze the probability distribution of aftershocks following the main earthquake on multiple timescales, which indirectly reflects the hysteresis effect of aftershocks at different positions under the stress field of the main earthquake. The SRCMOD fault source model database and earthquake events are used as raw data (Mai and Thingbaijam, 2014). First, the analysis area of each main earthquake is gridded, and the aftershocks of each main earthquake are assigned to the grid cells. The DC3D displacement model is then used to calculate the components of the stress change tensor for each cell. The results of this calculation are used as the input to train the neural network, which yields the aftershock hysteresis model. The Wenchuan and Tohoku earthquakes serve as application cases for the model and are therefore not included in the training set or the validation set. Finally, the spatial distribution and expansion characteristics of the aftershock hysteresis model are obtained for both the horizontal and vertical directions. In addition, we focus on two important concepts, namely the “hysteresis effect” and the “aftershock pattern”. The hysteresis effect refers to the change in the spatial distribution of aftershocks with the change of timescale. The aftershock pattern refers to the spatial distribution of aftershocks at a certain time.

Two types of data are used in this paper: SRCMOD finite fault data and the ISC (International Seismological Centre) earthquake catalogue (Bondár and Storchak, 2011).

The inversion of finite fault source data facilitates a better understanding of the complexity of the earthquake rupture process. Although the spatial resolution of such a model is low, it can provide information on deep seismic slip and fault evolution over time. Therefore, the finite fault model is an important means of studying the mechanics and kinematics of the earthquake rupture process. The online SRCMOD database provides inversion results for many typical earthquakes from 1906 to the present. These results are obtained through inversion and uploaded by seismologists around the world after each main earthquake. Because the earth's crust is treated as an elastic medium in the calculation of coseismic displacement and stress, we do not consider the impact of the background of each earthquake. A total of 19 finite fault source models are used in this analysis: 15 as training data and 4 as validation data.

The aftershocks following each main earthquake are obtained from the International Seismological Centre (ISC). More precisely, all aftershock data are from the Reviewed ISC Bulletin, which is a subset of the ISC Bulletin that has been manually reviewed by ISC analysts. This includes all events that have been relocated by the ISC. For the mainshock cases in this paper, the aftershocks within 1, 30, 90, 180 and 365 d and within a volume extending 100 km horizontally from each mainshock rupture plane and 50 km vertically are used for analysis of the aftershock sequences.
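The selection of aftershocks by time window and their assignment to grid cells can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing code used in this paper: events are assumed to be given already as local Cartesian offsets (km) and elapsed time (d) relative to the mainshock, the study volume is simplified to a rectangular box, and the function name `label_cells` and the parameter `cell_km` are hypothetical.

```python
WINDOWS_D = (1, 30, 90, 180, 365)  # time windows used in the text, in days

def label_cells(events, bounds, cell_km):
    """Return {window_days: set of (i, j, k) cells that contain at least one aftershock}.

    events: iterable of (x_km, y_km, depth_km, days_after_mainshock)
    bounds: ((xmin, xmax), (ymin, ymax), (zmin, zmax)) of the study volume in km
    cell_km: edge length of a cubic grid cell in km (an illustrative parameter)
    """
    (xmin, xmax), (ymin, ymax), (zmin, zmax) = bounds
    labels = {w: set() for w in WINDOWS_D}
    for x, y, z, t in events:
        inside = xmin <= x < xmax and ymin <= y < ymax and zmin <= z < zmax
        if not inside or t < 0:
            continue  # skip events outside the volume or before the mainshock
        cell = (int((x - xmin) // cell_km),
                int((y - ymin) // cell_km),
                int((z - zmin) // cell_km))
        for w in WINDOWS_D:
            if t <= w:  # windows are cumulative: a 0.5 d event counts in all five
                labels[w].add(cell)
    return labels

# Illustrative events only (not catalogue data); the last one lies outside the box.
events = [(12.0, 3.0, 11.0, 0.5), (12.5, 4.0, 12.0, 45.0),
          (80.0, -20.0, 30.0, 200.0), (500.0, 0.0, 10.0, 2.0)]
bounds = ((-100.0, 100.0), (-100.0, 100.0), (0.0, 50.0))
labels = label_cells(events, bounds, cell_km=5.0)
print({w: len(cells) for w, cells in labels.items()})
```

Because the windows are cumulative, the set of labelled cells can only grow from one timescale to the next, which is what allows the expansion of the aftershock pattern to be compared across timescales.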

After acquiring the finite fault source data and aftershock sequence data,
it is necessary to process them to create the final data for analysis.
First, the volume extending 100 km horizontally from each mainshock rupture
plane and 50 km vertically is divided into a grid of 5 km

ROC (receiver operating characteristic) curve for multiple timescales. Panels

ROC curve of

The inversion analysis of seismogenic faults after earthquakes is a popular topic in seismology, while in the process of inversion, the application of dislocation theory and models is essential. The dislocation model was first used to analyze fault movement in 1958 (Steketee, 1958). Steketee introduced the dislocation theory into the study of seismic deformation fields and described the relationship between discontinuous displacement on the dislocation plane and the displacement field in an isotropic medium. Okada summarized the existing research in 1985 and proposed a formula for the calculation of displacement in an isotropic, uniform elastic half-space. This formula can be used to calculate the coseismic deformation caused by any fault in the elastic half-space (Okada, 1985, 1992). The Okada dislocation theory systematically summarizes the relationship between point source dislocation and surface deformation caused by rectangular dislocation. The crustal movement is typically slow, and the crustal medium generally shows viscosity and plasticity over a long timescale. At present, the Okada dislocation theory is the most widely used dislocation theory and is often used in combination with InSAR (interferometric synthetic aperture radar) technology. InSAR is used to monitor the surface coseismic deformation field, and the Okada theory is then used to conduct fault slip inversion (Shan et al., 2017; Wang et al., 2018; Cheng et al., 2019; Zhao, 2019).

Therefore, the Okada elastic dislocation theory is used to calculate the
coseismic stress and strain field of the main earthquake in this paper. The Okada
elastic dislocation model, which ignores the influence of stratification in
the earth's medium, is widely used in the study of coseismic deformation of
the seismic signal source. Okada gives the analytical expression of the
partial derivative

Multi-timescale aftershock depth distribution curves of

Structural background map of the Wenchuan earthquake. The red and green lines represent the fault structures in this area. The red line is the main fault zone of the Wenchuan earthquake, and the green lines represent other fault zones. The focal mechanisms of the main aftershocks are also shown. Ngawa Tibetan and Qiang: Ngawa Tibetan and Qiang Autonomous Prefecture. P, T, NF, NS, SS, TF, TS and U represent pressure axis, tension axis, normal fault, strike-slip normal fault, strike-slip fault, reverse fault, strike-slip reverse fault and unknown type fault, respectively.

To analyze the hysteresis effect of aftershocks, it is necessary to establish a model that can predict the damage patterns of aftershocks at multiple timescales. We constructed a fully connected deep neural network (DNN) to model the relationship between the change in the elastic stress tensor and aftershocks and to explain the hysteresis effect of aftershocks. The neural network is an extension of the perceptron, and a DNN can be understood as a neural network with many hidden layers; the terms multilayer neural network, deep neural network and multilayer perceptron (MLP) are often used interchangeably. The network established here has six hidden layers. The second hidden layer has 100 neurons, and each of the other five hidden layers has 50 neurons. The input layer dimension of the entire network is 12. The input features are the absolute values of the six independent components of the elastic stress change at the center of each subunit together with the negatives of these absolute values, for a total of 12 inputs.
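The architecture described above can be illustrated with a short forward-pass sketch in plain Python. Only the layer sizes and the construction of the 12 inputs come from the text; the tanh hidden activation, the sigmoid output unit, the random initialization and all function names are assumptions made for illustration, and the weights are untrained.

```python
import math
import random

HIDDEN_SIZES = (50, 100, 50, 50, 50, 50)  # six hidden layers; the second has 100 neurons

def make_features(stress6):
    """Build the 12 inputs: |component| for six components, then the negatives."""
    mags = [abs(s) for s in stress6]
    return mags + [-m for m in mags]

def init_network(n_in=12, hidden=HIDDEN_SIZES, seed=0):
    """Create random, untrained layers; each layer is (weight rows, biases)."""
    rng = random.Random(seed)
    sizes = (n_in, *hidden, 1)
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        scale = 1.0 / math.sqrt(fan_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(fan_in)]
                   for _ in range(fan_out)]
        layers.append((weights, [0.0] * fan_out))
    return layers

def forward(layers, x):
    """Forward pass with tanh hidden units and a sigmoid output (both assumed)."""
    for idx, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if idx < len(layers) - 1:
            x = [math.tanh(v) for v in z]
        else:
            x = [1.0 / (1.0 + math.exp(-v)) for v in z]
    return x[0]

net = init_network()
# Illustrative stress-change components, not values from the paper.
features = make_features([0.12, -0.03, 0.40, -0.25, 0.08, -0.01])
p = forward(net, features)
print(len(features), [len(biases) for _, biases in net], 0.0 < p < 1.0)
```

A sigmoid output is a natural choice here because the prediction is later interpreted as a value between 0 and 1, but the paper does not state which activation functions were used.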

Then we analyze the correlation between aftershocks and stress change, which
is closely related to the inputs of DNN. At present, the research on
aftershocks is primarily based on statistical methods, and the research
content primarily focuses on the distribution of aftershock strength and
time attenuation. The intensity distribution of aftershocks follows the G–R (Gutenberg–Richter)
relationship

Aftershock damage patterns of the Wenchuan earthquake at
multiple timescales. Panels

The study of time attenuation of aftershocks begins with the statistical
description of frequency attenuation characteristics of the aftershock
sequence using the Omori formula (Omori, 1894). Utsu (1961) observed
that the frequency attenuation rate of the actual aftershock
sequence is faster than that calculated by the Omori formula
and proposed the modified Omori formula

The stress change caused by the main earthquake can be described by the Coulomb failure stress change, which is also the most widely used analytical method at present. The change in Coulomb stress produced by the main earthquake can trigger subsequent aftershocks (Harris, 1998). Some seismologists believe that if the Coulomb failure stress change around the main earthquake is positive, it will promote fault movement and trigger aftershocks; if it is negative, it will inhibit fault movement and reduce the probability of triggering an aftershock (Harris, 1998; Han, 2003; Lin, 2004). According to DeVries et al. (2018), the Coulomb failure stress change is an inadequate explanation for aftershocks, and the relationship between the sign of the stress change and the triggering of aftershocks requires further exploration; they modeled the relationship between stress change and aftershock triggering by training a neural network. The Coulomb failure stress change depends on the geometric properties and coseismic dislocations of the source fault (King et al., 1994; Zhu and Wen, 2009). Therefore, the change in the stress tensor, which is closely related to the coseismic dislocation, can be used as the aftershock variable to build the model.
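For reference, the Coulomb failure stress change on a receiver fault is commonly written as ΔCFS = Δτ + μ′Δσ_n, where Δτ is the shear stress change in the slip direction, Δσ_n is the normal stress change (positive for unclamping) and μ′ is the effective friction coefficient. The sketch below resolves a stress-change tensor onto a receiver plane in this way. The tension-positive sign convention, the value μ′ = 0.4 and the function name are assumptions for illustration and are not taken from this paper.

```python
def coulomb_stress_change(dsigma, normal, slip, mu_eff=0.4):
    """Return dCFS = d_shear + mu_eff * d_normal on a receiver plane.

    dsigma: 3x3 symmetric stress-change tensor, tension positive (e.g. in MPa)
    normal: unit normal vector of the receiver fault plane
    slip:   unit slip direction lying in that plane
    mu_eff: effective friction coefficient (0.4 is only an illustrative value)
    """
    # Traction change acting on the plane: t_i = dsigma_ij * n_j
    traction = [sum(dsigma[i][j] * normal[j] for j in range(3)) for i in range(3)]
    d_normal = sum(traction[i] * normal[i] for i in range(3))  # > 0 means unclamping
    d_shear = sum(traction[i] * slip[i] for i in range(3))     # resolved on slip direction
    return d_shear + mu_eff * d_normal

# Illustrative tensor (MPa): 0.05 shear on the xy component, 0.02 tension on xx.
dsigma = [[0.02, 0.05, 0.0],
          [0.05, 0.00, 0.0],
          [0.00, 0.00, 0.0]]
dcfs = coulomb_stress_change(dsigma, normal=(1.0, 0.0, 0.0), slip=(0.0, 1.0, 0.0))
print(round(dcfs, 4), dcfs > 0.0)  # a positive value is taken to promote failure
```

The result depends on the orientation assumed for the receiver fault, which is one reason the paper feeds orientation-independent combinations of the stress tensor components to the network instead.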

In addition, Meade et al. tested many stress-related indicators in 2017 to
explain the influence of the coseismic stress field of the main earthquake
on the location of aftershocks. Their results show that the sum of the
absolute values of the six independent components of the stress tensor, the
von Mises yield criterion and the maximum shear stress produce the best
interpretation. These variables can be obtained by the combination of the
absolute values of the six independent components of the stress tensor and
the negative values of the absolute values. Therefore, these variables are
also used as the network input (Meade et al., 2017; Mignan and Broccardo, 2019). The input components are expressed as

Aftershock hysteresis effect of the Wenchuan earthquake. The aftershock hysteresis effect can be observed by combining the aftershock patterns of the Wenchuan earthquake at different timescales. The blue dots indicate the locations of the actual aftershocks over 1 year.

The ROC (receiver operating characteristic) curve and the AUC (area under curve) are used to evaluate the model. The ROC considers the
results obtained under a variety of different criteria. In this article, the
ROC curve can reflect the prediction results of the model under multiple
thresholds. The AUC is defined as the area enclosed by
the coordinate axis under the ROC curve, and the value of the area cannot be
greater than 1. Because the ROC curve is generally located above the
straight line

Aftershock hysteresis effect of the Wenchuan earthquake
in different depths. Panels

The aftershock hysteresis model under multiple timescales is obtained by
using the neural network to train the constructed training dataset. In this
paper, five submodels are trained, and the final hysteresis model is
composed of five submodels. The prediction result given by the model is the
approximate range of aftershocks, that is, the position of 5 km

In this paper, an evaluation method based on the ROC curve is used, and all possible thresholds are taken into account to evaluate both the hysteresis model and the physical model. According to the ROC curves of the two methods, the hysteresis model may perform poorly at some thresholds, but its AUC value is much greater than that of the physical model. Based on the trained aftershock hysteresis model, the aftershock patterns are predicted for the Wenchuan earthquake at multiple timescales, and the ROC curves are obtained for the different timescales. The AUC values of the five timescales are all above 0.8 in both the training and validation sets, and some are close to 0.9. At every timescale, the AUC value of the training set is higher than that of the validation set. For comparison, the neural network designed by DeVries et al. (2018) for aftershock prediction has an AUC value of 0.849 on the validation set (Fig. 2). The AUC value of each submodel on the validation set in this paper is similar to this result. Therefore, the model achieves good prediction results at different timescales.
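The threshold sweep behind the ROC curve and the AUC can be sketched as follows. This is a generic illustration of the metric, not the evaluation code of this paper; the cell labels and scores are made-up values, and the function names are hypothetical.

```python
def roc_curve(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping the threshold over all scores.

    labels: 1 if the cell contains an aftershock, 0 otherwise
    scores: predicted value for each cell
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        tp += label
        fp += 1 - label
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            fpr.append(fp / neg)
            tpr.append(tp / pos)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
               for i in range(1, len(fpr)))

labels = [1, 1, 0, 1, 0, 0]               # illustrative cell labels
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]   # illustrative predicted values
fpr, tpr = roc_curve(labels, scores)
print(round(auc(fpr, tpr), 3))  # 8 of the 9 positive-negative pairs are ranked correctly
```

The AUC equals the fraction of positive-negative cell pairs in which the cell with an aftershock receives the higher score, so a value of 0.5 corresponds to random ranking and 1 to perfect ranking.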

For comparison, we forecast the aftershock location based on the static
Coulomb failure stress change. Considering the influence of shear stress,
normal stress and friction coefficient on the active fault plane, Coulomb
failure stress change (

Aftershock distribution of the Tohoku earthquake. The blue points in the figure are the projected positions of the aftershocks within a depth of 50 km.

In order to verify the method and model in this article, we selected two typical historical earthquake cases, i.e., the Wenchuan earthquake and the Tohoku earthquake. These two earthquake cases are not included in the data used for model construction. They are characterized by a large magnitude and a large number of aftershocks.

In the Tohoku earthquake case, there were 15 062 aftershocks in the study area within 1 year after the mainshock (Table 1). In the finite fault model used in this paper, the focal depth is 20–25 km, whereas according to the depth distribution of aftershocks at multiple timescales, the number of aftershocks is largest at a depth of 35–40 km (Fig. 4a). In the Wenchuan earthquake case, there were 1455 aftershocks in the study area within 1 year after the mainshock (Table 1). In the finite fault model used in this paper, the focal depth is 10–15 km, and the number of aftershocks is also largest at 10–15 km depth (Fig. 4b). Aftershocks are therefore not necessarily most numerous at the focal depth of the mainshock.

According to the tectonic stress figure of the Wenchuan earthquake, the Wenchuan earthquake was located in the Longmenshan area in the border mountains east of the Qinghai–Tibetan Plateau. The geological structure in this area is complex. The main Longmenshan fault zone is composed of a series of roughly parallel thrust faults. It is divided into a front mountain zone and a back mountain zone with the Yingxiu–Beichuan central fault as the boundary. From northwest to southeast, the main fault zone consists of the back mountain fault, the central fault and the front mountain fault. The main fault forming the Wenchuan earthquake is the Yingxiu–Beichuan central fault. According to the beach ball plot of the focal-mechanism solution in Fig. 5, the strong aftershocks following the Wenchuan earthquake are mainly related to reverse or thrust faults under the action of compressive stress.

Aftershock damage patterns of the Tohoku earthquake at
multiple timescales. Panels

The number of aftershocks of typical historical earthquakes on multiple timescales.

Based on the aftershock hysteresis model, the failure patterns of aftershocks are predicted at different timescales, and a section at a depth of 12.5 km (essentially the same depth as the source) is examined. Combined with the focal-mechanism solutions of strong aftershocks around the main fault zone, the analysis indicates that the aftershocks in this area are mainly caused by NW–SE-trending crustal compressive stress (Fig. 6). The aftershock hysteresis pattern expands over time; it is generally distributed along the fault strike and extends along the trend line of the main fault. Within 1 d after the main earthquake at Wenchuan, aftershocks had already occurred over a wide area. The aftershocks are distributed along the fault zone and fall largely within the area predicted by the model.

Finally, the spatial results of the hysteresis effect of the Wenchuan earthquake are obtained by synthesizing the damage modes of the aftershocks at multiple timescales (Fig. 7). The location of the aftershocks is basically along the main fault, i.e., the Yingxiu–Beichuan central fault. The model predicts that aftershocks are mainly distributed in the cities of Chengdu, Mianyang, Deyang, Guangyuan and Ngawa, which is consistent with the actual location of the aftershocks. Over time, the area of aftershocks expands outwards, and the rate decreases gradually. Using the main aftershock sequence from the Wenchuan earthquake as an example, the aftershock hysteresis patterns at different timescales are similar, and the direction of outward expansion is basically perpendicular to the distribution direction of the previous timescale. Compared with the attenuation map of earthquake intensity, the spatial distribution map of aftershock attenuation can provide some reference for follow-up disaster prevention and mitigation work after a large earthquake. We can further understand the attenuation law of aftershocks and attempt to extend its time attenuation from a statistical perspective to a spatial perspective.

At different focal depths, the aftershock hysteresis patterns will also change. The focal-depth range of the aftershocks analyzed in this paper is 0–50 km. The aftershock hysteresis effect is analyzed by selecting sections with depths of 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5 and 42.5 km. Many previous studies have shown that the seismogenic layers in central and western China are located in the middle and upper layers of the crust at a depth of no more than 20 km (Zhao and Chen, 1995; Yang et al., 2003). The aftershocks with a focal depth within 20 km are widely distributed (Fig. 8). When the focal depth exceeds 20 km, the area where the aftershocks are generated suddenly decreases with increasing depth until no aftershocks are observed. The focal depth of the largest aftershock distribution range is 12.5 km, which is in the same range as the focal depth of the main earthquake. In the middle and upper layers of the earth's crust, the shapes of the aftershock hysteresis patterns are generally similar at different timescales. Over time, the shape of the aftershock hysteresis pattern generally expands outward in a similar pattern as the previous timescale. However, when the focal depth exceeds a certain value, the hysteresis pattern of the aftershocks substantially changes. In this case, when the focal depth is greater than 20 km, the area predicted for aftershocks significantly decreases, and the evolution of the hysteresis pattern is also changed. Although the overall expansion direction is consistent with the main fault, the pattern is less regular and more random.

Japan is located in the circum-Pacific seismic belt at the intersection of the Eurasian plate and the Pacific plate, an area where earthquakes occur frequently. Due to the collision between the two plates, the Pacific plate is subducted under the Eurasian plate, thus forming the Japan Trench and the Japanese island arc. In Fig. 9, “OK” represents the Okhotsk plate, which is part of the Eurasian plate; “PA” refers to the Pacific plate; and “PS” refers to the Philippine Sea plate, which is also part of the Eurasian plate (Bird, 2003). The epicenter of the earthquake was located in the subduction zone of the Japan Trench. The Tohoku earthquake occurred due to the subduction of the Pacific plate beneath the Eurasian plate. The aftershocks of the Tohoku earthquake mainly occurred near the junction of the Eurasian plate and the Pacific plate, and they are all interplate earthquakes. The plate offshore of Japan is mainly the Okhotsk plate. A total of 12 462 (about 82.7 %) aftershocks occurred in the Okhotsk plate, and 2576 (about 17.1 %) aftershocks occurred in the Pacific plate. Based on the aftershock hysteresis model, the aftershock patterns within 1, 30, 90, 180 and 365 d after the main earthquake are predicted, and the section at the focal depth of the main earthquake (22.5 km) is selected for analysis (Fig. 10). For the Tohoku earthquake, the greatest expansion of the aftershock distribution area is observed within 30 d. The shapes of the aftershock patterns are similar at all timescales. The actual aftershocks and the predicted aftershock patterns are distributed in an approximately north–south direction along the Japan Trench and plate boundary.

The aftershock hysteresis model of the Tohoku earthquake is obtained by synthesizing the aftershock patterns at different timescales (Fig. 11). Over time, the expansion rate of the aftershock pattern gradually decreases, and the expansion direction is basically perpendicular to the aftershock pattern at the previous timescale. Most of the aftershocks of this earthquake occurred in the sea east of Japan, and the area of concentrated terrestrial aftershocks was located in Fukushima.

Aftershock hysteresis effect of the Tohoku earthquake in Japan. The aftershock hysteresis effect can be observed by combining the aftershock patterns of the Tohoku earthquake at different timescales. The blue dots indicate the locations of the actual aftershocks over 1 year.

Aftershock hysteresis effect of the Tohoku earthquake at
different depths. Panels

Similar to the Wenchuan earthquake, the aftershock hysteresis pattern of the
Tohoku earthquake changes with the change in depth. The magnitude of the
earthquake was very large, reaching over

The modified Omori formula is

The modified Omori formula aftershock attenuation curve of the Tohoku earthquake.

The modified Omori formula aftershock attenuation curve of the Wenchuan earthquake.
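For reference, the modified Omori law (Utsu, 1961) is commonly written as n(t) = K / (t + c)^p, where n(t) is the aftershock rate at time t after the mainshock and K, c and p are constants. The sketch below evaluates this rate and its time integral, the expected cumulative number of aftershocks. The parameter values are placeholders chosen for illustration and are not the values fitted for the Wenchuan or Tohoku sequences in this paper.

```python
import math

def omori_rate(t, K, c, p):
    """Aftershock rate n(t) = K / (t + c)**p, with t in days."""
    return K / (t + c) ** p

def omori_cumulative(t, K, c, p):
    """Expected number of aftershocks in [0, t], the integral of n(t)."""
    if abs(p - 1.0) < 1e-12:
        return K * math.log((t + c) / c)  # p = 1 reduces to the original Omori law
    return K * (c ** (1.0 - p) - (t + c) ** (1.0 - p)) / (p - 1.0)

K, c, p = 100.0, 0.1, 1.1  # placeholder parameters, not fitted values
for days in (1, 30, 90, 180, 365):
    rate = omori_rate(days, K, c, p)
    total = omori_cumulative(days, K, c, p)
    print(f"{days:>3} d  rate = {rate:8.3f} per day  cumulative = {total:8.1f}")
```

The rate decreases monotonically with time while the cumulative count keeps growing ever more slowly, which is the temporal counterpart of the slowing spatial expansion discussed in this section.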

Compared with the Omori formula, the aftershock hysteresis effect analyzed
in this paper can be reflected by the correlation between the change of timescale and the region of aftershocks. Based on the discussion of focal-depth
sections of the main earthquake, within 1 d after the Wenchuan
earthquake, the number of subunits with aftershocks is 213; within 30 d, it
is 386, representing an increase of 81.2 %; within 90 d, it is 432,
representing an increase of 11.9 %; within 180 d, it is 466, representing
an increase of 7.9 %; and within 365 d, it is 488, representing an
increase of 4.7 %. Within 1 d after the Tohoku earthquake, the number
of subunits with aftershocks was 137; within 30 d, it was 595, representing
an increase of 334 %; within 90 d, it was 724, representing an increase of
21.7 %; within 180 d, it was 799, representing an increase of 10.4 %;
and within 365 d, it was 856, representing an increase of 7.1 %. The
aftershock pattern predicted by the model expands over time, but the
expansion speed of the aftershock pattern gradually decreases. The
expansion is most rapid within the first 30 d after the earthquake; from 30 to
90 d, the speed decreases significantly. The aftershock pattern of the
Wenchuan earthquake expanded at a speed of 28.7 units d

Modified Omori formula and derived function of typical earthquake cases

The curve of the aftershock hysteresis effect (actual
aftershocks). Panel

The curve of the aftershock hysteresis effect (predicted
aftershock pattern). Panel

The hysteresis model prediction result and the

Finally, a supplementary explanation is given to the phenomenon that the
area predicted by the model is larger than the actual aftershock location.
The prediction results of the hysteresis model are the likely locations of
aftershocks at different timescales after the mainshock. At each location,
the predicted value is a number between 0 and 1, which represents the
probability that aftershocks occur at that location. We take the
prediction threshold as 0.5 and consider that when the predicted value is
greater than 0.5, an earthquake is more likely to occur in the subcell with
a volume of 5 km

The hysteresis model prediction result and the

The widely used temporal-magnitude ETAS model was proposed by Ogata (1988). He later observed that the distribution of aftershock sequences tended to be elliptic rather than circular, established an anisotropic aftershock attenuation function and took the normal distribution as the spatial distribution model of aftershocks (Ogata, 1998). It is widely observed that aftershocks usually occur on or near the fault of a mainshock. However, the normal distribution model does not include the source mechanism information of the mainshock when predicting the aftershock pattern. Kagan and Jackson (1994) introduced an anisotropic spatial smoothing kernel into long-term earthquake prediction and established a smoothing-kernel model that includes the source mechanism information of the mainshock. However, the above models ignore the relationship of relative distance or direction between the mainshock and the aftershocks. To address this, Wong and Schoenberg (2009) proposed a joint distribution model that parameterizes the aftershock location according to the distance and relative angle between the mainshock and aftershocks. In the prediction process of all of the above models, the epicenter of the mainshock is used as a point source, whereas the distribution of the fault plane of the mainshock should be fully considered. In this paper, the distribution of the main fault is taken into account through the finite fault model. At the same time, the relative position and direction between the mainshock and aftershocks are considered when calculating the stress tensor change using the Okada dislocation theory. Therefore, the relative-position relation is also learned in the process of model training.
Compared with the static Coulomb failure stress change method, the aftershock hysteresis model has a better prediction effect.

In the preceding comparison of the two methods, the evaluation was based on subcells: each subcell containing an aftershock was marked. To further test the validity of the model, the actual locations of the aftershock events are used instead of the subcell locations, with the threshold set to 0.5. The prediction results were verified at the focal depth of the two earthquake cases to compare the aftershock hysteresis model with the Coulomb failure stress change method. For the aftershock hysteresis model, 97.6 % of the Wenchuan earthquake aftershocks and 96 % of the Tohoku earthquake aftershocks fall in the area with a predicted value greater than 0.5. For the Coulomb failure stress change method, 87.3 % of the Wenchuan earthquake aftershocks and 45.3 % of the Tohoku earthquake aftershocks fall in the area with a predicted value greater than 0.5 (Figs. 17 and 18). Therefore, when the evaluation is based on the specific locations of the aftershock events, the constructed model still outperforms the Coulomb failure stress change method.
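The event-based check described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fraction_in_predicted_area`, the regular grid layout (`grid_origin`, `cell_size`), and the choice to count events outside the grid as misses are all assumptions made for the example.

```python
import numpy as np

def fraction_in_predicted_area(event_xyz, grid_origin, cell_size, pred, threshold=0.5):
    """Share of aftershock events located in subcells whose predicted value
    exceeds `threshold`.

    event_xyz   : (n, 3) event coordinates in km (hypothetical local frame)
    grid_origin : (3,) coordinates of the grid corner with the smallest values
    cell_size   : edge length of a cubic subcell in km
    pred        : 3-D array of predicted values, one per subcell
    """
    event_xyz = np.asarray(event_xyz, dtype=float)
    if len(event_xyz) == 0:
        return float("nan")

    # Map each event to the integer index of the subcell that contains it.
    idx = np.floor((event_xyz - np.asarray(grid_origin, dtype=float)) / cell_size).astype(int)

    # Assumption: events outside the gridded volume count as misses.
    inside = np.all((idx >= 0) & (idx < np.array(pred.shape)), axis=1)
    idx = idx[inside]

    hits = pred[idx[:, 0], idx[:, 1], idx[:, 2]] > threshold
    return float(hits.sum()) / len(event_xyz)


# Toy example with a 2 x 2 x 2 grid of 5 km subcells (synthetic values only).
pred = np.zeros((2, 2, 2))
pred[0, 0, 0] = 0.9
pred[1, 1, 1] = 0.6
events = [[1, 1, 1], [6, 6, 6], [6, 1, 1], [20, 1, 1]]
result = fraction_in_predicted_area(events, grid_origin=(0, 0, 0), cell_size=5.0, pred=pred)
```

In the toy example, two of the four synthetic events lie in subcells with a value above 0.5, one lies in a subcell below the threshold, and one lies outside the grid. Applying the same function to the output of either method with the same threshold gives the kind of percentage comparison reported in the text.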

In addition, the model is a six-layer neural network, which is a black box model. Is the deep learning model more complex than a traditional statistical or physical model? We think this complexity is relative. In fact, the traditional models and the model in this paper share a similar starting point: both are based on data and try to find a relationship between some basic physical quantities and aftershocks. The complexity of traditional models lies in the process of finding such a relationship, whereas the complexity of the deep learning model lies in its seemingly complex structure. This structure increases the number of internal variables to be learned, but the computing power of today's computers can handle this, thus reducing manpower and time-consuming work. Moreover, the deep learning model is data-driven, so it is more convenient than a traditional model when the dataset or the amount of data changes greatly or when the model needs to be adjusted.

In this paper, based on the criterion of correlation between aftershocks and
stress changes caused by the main earthquakes, a deep neural network is
trained using the SRCMOD finite fault data and the ISC earthquake catalogue
and is used to construct an aftershock hysteresis model. Using the main
aftershock sequences of the Wenchuan and the Tohoku earthquakes as examples,
the characteristics of the aftershock hysteresis effect in plane space and
at different depths are then analyzed. The main contributions are as
follows:

The trained model of aftershock hysteresis is accurate. It can predict the aftershock patterns at multiple timescales after a large earthquake and produce a spatial distribution map of the aftershock hysteresis effect. Compared with static Coulomb failure stress change, this model is more effective.

Compared with the traditional aftershock spatial analysis method, the model fully considers the distribution of actual faults in the prediction of an aftershock pattern, instead of treating the earthquake as a point source. In the analysis of the model, the relative-position information between the mainshock and aftershocks has been included.

The expansion rate of the aftershock patterns changes over time, i.e.,

According to the prediction results of the model, the aftershock patterns at all timescales are roughly similar and anisotropic. The distribution of the aftershock hysteresis effect changes with increasing depth.

The basic data used in this paper mainly include SRCMOD finite fault models and ISC earthquake event data. The SRCMOD finite fault model data can be obtained from

HT conceptualized the project, acquired funding and supervised the project. JC performed the investigation, deployed the software and code, and edited the paper. JC and HT developed the methodology. WC and HT revised the paper.

The authors declare that they have no conflict of interest.

The results of the aftershock hysteresis effect are obtained by programming in Python, and some code refers to previous research by DeVries et al. (2018). In this study, the figures and the subsequent processing of the results are all performed using the ArcGIS software.

This research has been supported by the National Natural Science Foundation of China (grant no. 41971280) and the National Key R&D Program of China (grant no. 2017YFB0504104).

This paper was edited by Filippos Vallianatos and reviewed by Athanassios Ganas and one anonymous referee.