Complex Networks Reveal Teleconnections between the Global SST and Rainfall in Southwest China

and Rainfall in Southwest China Panjie Qiao1, Wenqi Liu1, Yongwen Zhang1,2, and Zhiqiang Gong3 1Data Science Research Center, Faculty of Science, Kunming University of Science and Technology, Kunming 650500, China; 2Department of Physics, Bar-Ilan University, Ramat Gan 52900, Israel; 3Laboratory for Climate Studies, National Climate Research Center CMA, Beijing 100081, China Correspondence: Wenqi Liu (liuwenq2215@sina.com)


Introduction
During the recent decades, natural hazards (such as droughts and floods) have occurred frequently in Southwest China (SWC) due to climate change, causing numerous casualties and property losses [1,2]. In the summer of 2006 and 2011, SWC suffered from record-breaking droughts events [3]. On the other hand, the portion of annual precipitation contributed by extremely heavy precipitation has been found as a result of an increasing trend in the period of 1961-2010 [4]. Due to the population growth and high risk of natural hazards, SWC has attracted lots of attention in the meteorological research fields. According to the CMIP5 multi-model projections, severe and extreme droughts in SWC will increase dramatically in the future, meanwhile, extremely wet events will also increase [5].
Droughts and floods can be attributed to precipitation anomalies. A better understanding of precipitation can further improve the underlying mechanisms of droughts and floods in SWC. It was reported that the annual precipitation over SWC does not show a significant decreasing or increasing trend [3,6]. In fact, the trend of precipitation in SWC has been found to strongly depend on spatiotemporal patterns [4,7]. The forecasting of precipitation in SWC is very challenging, since it can be influenced by both the East Asian monsoon and Indian monsoon which could carry moist air from both the Indian Ocean and the Pacific Ocean to SWC [8][9][10][11]. Previous studies have found that some correlations between the SSTA and precipitation in SWC based on the regression analysis and the Empirical Orthogonal Functions (EOFs) [12,13]. Nevertheless, such traditional analyses

Data
The data is obtained from the global ERA-interim reanalysis of the European Centre for Medium-Range Weather Forecasts (https://www.ecmwf.int/) [36]. We use the daily averaged precipitation in SWC with a resolution of 1. area is since that it mainly represents the SWC and its surrounding areas. We totally have N s = 6936 grid points for the global SST. Due to the seasonality of rainfall, we divide the entire time series of precipitation in SWC into four seasons, i.e., Spring (March-April-May, MAM), Summer (June-July-August, JJA), Autumn (September-October-November, SON) and Winter (December-January-February, DJF), respectively.

Methodology
Firstly, we remove the seasonal cycle to obtain the time series of the SSTA as [37,38], where Y y (t) is the time series of the daily SST. y stands the year and t stands date within a year. The "mean" and "std" denote the mean and standard deviation of the SST. We perform the same analysis to obtain the precipitation anomalies (PA). We then construct the directed and weighted multi-variable network. Network nodes i and j can be classified into two subsets by two different variables, one subset denotes the PA nodes over SWC and the other denotes the global SSTA nodes. We divide one year into four seasons, e.g., JJA for summer as mentioned in Data. For the days in summers, we can obtain their dates D t for 39 years and respectively define the daily time series of grid i for the PA and grid j for the SSTA as, X i (t) and Y j (t + τ), where t spans the dates D t and τ is the time delay (the dates of Y j (t + τ) shift τ days in relative to the dates of X i (t). The negative τ means that the SSTA is ahead the precipitation, vice versa. The cross-correlation function is written as [37,39], where −τ max ≤ τ ≤ τ max is the time lag, τ max = 365 days. is averaged for all t. We identify the largest absolute value ofĈ ij (τ) and denote the corresponding time lag as τ * . The correlation between the site i and j is defined as C ij =Ĉ ij (τ * ). We define that the direction of the linkĈ ij (τ * ) is from i to j when τ * > 0 and from j to i when τ * < 0. The direction is undefined when τ * = 0.
To obtain the adjacency matrix of the network, a threshold ∆ is introduced to exclude noise and only the links from the SSTA to PA are considered [39]. Thus the adjacency matrix is defined as: If A ij = 1, there is a link between i and j. If A ij = 0, there is no link between them. Then the proportion of the positive links is calculated by, where λ j is the latitude for the SSTA node j and cos λ j is introduced as a weight for the link regarding the area size i.e., the pole is just a point and its area size is zero and the area size for a gird in the Equator is biggest [14]. Similarly, we obtain the proportion of the negative links S n . The extent of the SSTA nodes to influence the PA nodes in the network is usually quantified by the weighted out-degree [37,39]. We define the weighted out-degree of the node j as: The positive and negative degrees can more information about positive and negative correlations respectively than the totally degree. We have G j = G p j + G n j that is divided into the positive and negative weighted out-degrees G p j and G n j by the weight C ij , respectively.
To find the important nodes in SWC which are strongly influenced by the most SSTA areas, the weighted in-degree of the PA node i is defined as: Also we obtain the positive and negative weighted in-degrees H p i and H n i .

Significance Tests
According to Equation (2), we obtain C ij for any pair of nodes between the global SSTA and PA in SWC for four different seasons respectively. Figure 1 shows the probability distribution function (PDF) of their corresponding correlation C ij . We find two separated peaks related to positive, and negative correlations, respectively in Figure 1. It further demonstrates the differences between positive and negative correlations. In order to verify the significance of the correlation, we compare the PDFs between the real data and shuffled data as shown Figure 1. Here, we randomly shuffle the order of years for each node, keeping the variations within each year to get shuffle data [37]. Then, we calculate the cross-correlation for the shuffled data. In such shuffling process, the autocorrelations and common seasonality have been kept, while the physical dependencies between the SSTA and PA nodes are destroyed. The PDF of the real data in Figure 1 shows a much slower decay than the shuffle data for both the positive and negative parts. Therefore, it proves that some correlations are non-random. If the correlations are significantly higher than the significant threshold ∆, we regard them as real links; otherwise, they are suspected to be spurious links as Equation (3). We obtain the threshold ∆ = 0.1 by using the 99.5% confidence significance test.
arated peaks related to positive, and negative correlations, respectively in Figure 1. It further demonstrates the differences between positive and negative correlations. In order to verify the significance of the correlation, we compare the PDFs between the real data and shuffled data as shown Figure 1. Here, we randomly shuffle the order of years for each node, keeping the variations within each year to get shuffle data [37]. Then, we calculate the cross-correlation for the shuffled data. In such shuffling process, the autocorrelations and common seasonality have been kept, while the physical dependencies between the SSTA and PA nodes are destroyed. The PDF of the real data in Figure 1 shows a much slower decay than the shuffle data for both the positive and negative parts. Therefore, it proves that some correlations are non-random. If the correlations are significantly higher than the significant threshold ∆, we regard them as real links; otherwise, they are suspected to be spurious links as Equation (3). We obtain the threshold ∆ = 0.1 by using the 99.5% confidence significance test.

Links and Degrees
We obtain the proportions of positive and negative links and by Equation (4) and calculate the averages of correlations for positive and negative links as 〈 〉, 〈 〉 in Table 1. In Table 1, we summarize the statistics of the positive and negative links for four seasons. We find the largest proportion for positive links in MAM and the largest for negative links in DJF (see Table 1). Both the strongest correlations 〈 〉 and 〈 〉 are also found in MAM. The fewest positive and negative links are observed in SON. The correlations 〈 〉 and 〈 〉 also are weakest for SON. Therefore, the connections between the SSTA and PA in SWC strongly depend on seasons.

Links and Degrees
We obtain the proportions of positive and negative links S p and S n by Equation (4) and calculate the averages of correlations for positive and negative links as C p , C n in Table 1. In Table 1, we summarize the statistics of the positive and negative links for four seasons. We find the largest proportion S p for positive links in MAM and the largest S n for negative links in DJF (see Table 1). Both the strongest correlations C p and C n are also found in MAM. The fewest positive and negative links are observed in SON. The correlations C p and C n also are weakest for SON. Therefore, the connections between the SSTA and PA in SWC strongly depend on seasons. Table 1. Statistics of positive and negative links from the SSTA to PA in SWC for four seasons. S p and S n represent the proportions of the positive and negative links, respectively. C p and C n respectively denote the average of correlations for positive and negative links. The error bars show the standard deviations of correlations. Then we present the global out-degree pattern according to Equation (5). The outdegree patterns show that some important regions on the oceans are significant correlated with the PA of SWC in different seasons. In DJF, the nodes that have strong positive outdegree are located in the north Indian Ocean and east Equatorial Pacific in Figure 2a. In contrast, the locations of these nodes move to the Equatorial Atlantic Ocean and west Equatorial Pacific in MAM. Note that the nodes in the east Equatorial Pacific disappeared in Figure 2c. For JJA, Figure 2e shows that the nodes with high degrees can be found in high latitudes but not around the Equator. In SON, a few strong nodes are found locating in the east Equatorial Pacific and the south Indian Ocean. Due to the special geographical location of SWC, both the East Asian monsoon and Indian monsoon carry moist air from the Indian Ocean and Pacific Ocean to SWC resulting in extreme precipitation. Therefore, we suggest that these important out-degree areas and SWC are connected through the bridge function of the East Asian monsoon and Indian monsoon [40][41][42][43]. Different features of the monsoons can be used to explain the changes of the degree patterns in different seasons. To further test the contribution of extreme precipitation to the degree patterns, top and bottom 5% extreme PA data are replaced by the random middle magnitude PA. Then we perform the same analysis to the new time series. Figure 2b,d,f,h show the results. Comparing with Figure 2a for real data, much less significant nodes are observed in the Oceans in Figure 2b. Still, few nodes are in the same locations. Similar results can also be found for other seasons. It implies that extreme precipitation plays an important role in forming the teleconnection degree patterns.

DJF
The distributions of the negatively weighted out-degree are shown in Figure 3. We find that they are very different form the positive case. For instance, we find some major negative degree patterns in the Equatorial Atlantic and west Equatorial Pacific, and few nodes in the east Equatorial Pacific for DJF (see Figure 3a), which are the opposite of the results for the positive weighted out-degree in Figure 2a. Similar results are also found for To further test the contribution of extreme precipitation to the degree patterns, top and bottom 5% extreme PA data are replaced by the random middle magnitude PA. Then we perform the same analysis to the new time series. Figure 2b,d,f,h show the results. Comparing with Figure 2a for real data, much less significant nodes are observed in the Oceans in Figure 2b. Still, few nodes are in the same locations. Similar results can also be found for other seasons. It implies that extreme precipitation plays an important role in forming the teleconnection degree patterns.
The distributions of the negatively weighted out-degree are shown in Figure 3. We find that they are very different form the positive case. For instance, we find some major negative degree patterns in the Equatorial Atlantic and west Equatorial Pacific, and few nodes in the east Equatorial Pacific for DJF (see Figure 3a), which are the opposite of the results for the positive weighted out-degree in Figure 2a. Similar results are also found for other seasons (see Figure 3c,e,g). In fact, there are correlations for the SSTA itself between the two regions i.e., the SSTA in the east Equatorial Pacific is negative correlated with the SSTA in the west Equatorial Pacific. Such anti-correlations can lead to the positive and negatively correlated patterns between the PA in SWC and SSTA in the two different regions. The PA in different locations within SWC could be influenced by different weather systems as suggested by Ma et al. [4] and Shi et al. [7]. To test the correlations between the SSTA and PA nodes, we calculate the weighted in-degree for the PA nodes by Equation (6). We find that some nodes within SWC have small in-degree values and several nodes have strong correlations with the SSTA as shown in Figures 4 and 5. The spatial distribution of in-degree is inhomogeneous and varies with seasons. The important nodes with large in-degree values are almost located in the left-and right-bottom corners of SWC which are close to the Indian, and west Pacific Ocean, respectively. Therefore, these nodes of SWC are easier to be influenced by the Indian and the East Asian monsoons than other nodes in SWC. Note that the largest positive and negative weighted in-degree nodes are the same for MAM (see Figures 4b and 5b). It implies that the most sensitive region in SWC respect to the SSTA can keep both two kinds of correlations. The PA in different locations within SWC could be influenced by different weather systems as suggested by Ma et al. [4] and Shi et al. [7]. To test the correlations between the SSTA and PA nodes, we calculate the weighted in-degree for the PA nodes by Equation (6). We find that some nodes within SWC have small in-degree values and several nodes have strong correlations with the SSTA as shown in Figures 4 and 5. The spatial distribution of in-degree is inhomogeneous and varies with seasons. The important nodes with large in-degree values are almost located in the left-and right-bottom corners of SWC which are close to the Indian, and west Pacific Ocean, respectively. Therefore, these nodes of SWC are easier to be influenced by the Indian and the East Asian monsoons than other nodes in SWC. Note that the largest positive and negative weighted in-degree nodes are the same for MAM (see Figures 4b and 5b). It implies that the most sensitive region in SWC respect to the SSTA can keep both two kinds of correlations.

Important Areas
Since the inhomogeneous influence on SWC's PA is observed, we next consider the connection between the important nodes within SWC and the largest clusters of the SSTA. We first select the nodes in SWC with the largest weighted in-degrees (which are more than 4 times standard deviation above the averaged degree) for each season as shown in Figures 4 and 5. The largest cluster C 1 is identified by the largest successive area where all the inside SSTA nodes are connected to that important node in SWC. We can obtain the second largest cluster C 2 by using a similar way. Figure 6a shows the cluster C 1 (blue) and C 2 (green) are connected to the nodes I 1,1 (as shown in Figure 4a) for DJF. Both of them are positively correlated to I 1,1 in DJF. We show the strongest links for C 1 and C 2 in Figure 6a and their time delays in Figure 6b. The correlationĈ changes with the time lag τ and reaches to the maximum value at −189 days for C 1 and −245 days for C 2 in Figure 6b. Similar time delays are also observed for the other links between I 1,1 and different SSTA nodes within the clusters. Therefore, the cluster C 2 in the east Equatorial Pacific performs a longer time delay that could be useful for long-term prediction. The reason could be related to the El Niño-Southern Oscillation that arouses warm SST in the central and east Equatorial Pacific at the beginning, then lead to temperature anomaly in the Indian Ocean [42,44]. The PA in SWC can be significantly influenced by the El Niño through the bridge of the East Asian monsoon and Indian monsoon [44]. We show that the cluster C 1 and C 2 are negatively correlated with the nodes I 1,11 (as shown in Figure 5a) for DJF in Figure 6c. The cluster C 1 locates in the Equatorial Atlantic that can arouse a series of quasi-stationary wave trains. Such waves can propagate eastward leading to an anticyclone (cyclone) in upper air of SWC, and motivating (inhibiting) PA [42]. The time delay corresponding to the maximum correlation is −190 days for the strongest link of C 1 (see Figure 6d). There is a similar time delay −104 days for the link of C 2 in the Middle East Pacific (see Figure 6d).

Important Areas
Since the inhomogeneous influence on SWC's PA is observed, we next consider the connection between the important nodes within SWC and the largest clusters of the SSTA. We first select the nodes in SWC with the largest weighted in-degrees (which are more than 4 times standard deviation above the averaged degree) for each season as shown in Figures 4 and 5. The largest cluster 1 is identified by the largest successive area where all the inside SSTA nodes are connected to that important node in SWC. We can obtain the second largest cluster 2 by using a similar way. Figure 6a shows the cluster 1 (blue) and 2 (green) are connected to the nodes 1,1 (as shown in Figure 4a) for DJF. Both of them are positively correlated to 1,1 in DJF. We show the strongest links for 1 and 2 in Figure 6a and their time delays in Figure 6b. The correlation ̂ changes with the time lag τ and reaches to the maximum value at −189 days for 1 and −245 days for 2 in Figure 6b. Similar time delays are also observed for the other links between 1,1 and different SSTA nodes within the clusters. Therefore, the cluster 2 in the east Equatorial Pacific performs a longer time delay that could be useful for long-term prediction. The reason could be related to the El Niño-Southern Oscillation that arouses warm SST in the central and east Equatorial Pacific at the beginning, then lead to temperature anomaly in the Indian Ocean [42,44]. The PA in SWC can be significantly influenced by the El Niño through the bridge of the East Asian monsoon and Indian monsoon [44]. We show that the cluster 1 and 2 are negatively correlated with the nodes 1,11 (as shown in Figure  5a) for DJF in Figure 6c. The cluster 1 locates in the Equatorial Atlantic that can arouse a series of quasi-stationary wave trains. Such waves can propagate eastward leading to an anticyclone (cyclone) in upper air of SWC, and motivating (inhibiting) PA [42]. The time delay corresponding to the maximum correlation is −190 days for the strongest link of 1 (see Figure 6d). There is a similar time delay −104 days for the link of 2 in the Middle East Pacific (see Figure 6d).  For MAM, the links between the SSTA and PA in SWC are much higher than other three seasons. The largest cluster size is also larger than other seasons. Figure 7a shows that the cluster C 1 in the west Pacific and C 2 in the Equatorial Atlantic are positively correlated with the nodes I 1,1 in MAM. For the negative correlation, however, the clusters C 1 and C 2 mainly distributed in the east Pacific and Indian Ocean (see Figure 7c). Therefore, most of the correlation patterns come from the Pacific, which implies that the SSTA in the tropical Pacific has a crucial impact on PA in SWC. Moreover, the time delays of the strongest links for C 1 are within 100 days shorter than that of DJF, but the strength of links is stronger (Figure 7b,d). Therefore, we suggest that the proceeding winter SSTA probably influences the PA in MAM. The SSTA in the east Pacific often reaches to the peak during winter [40]. It will further influence the circulation systems, i.e., the Walker atmospheric circulation and the Hadley circulation in the tropical Pacific then cause climate anomalies in East Asia including SWC after around three months. For MAM, the links between the SSTA and PA in SWC are much higher than other three seasons. The largest cluster size is also larger than other seasons. Figure 7a shows that the cluster 1 in the west Pacific and 2 in the Equatorial Atlantic are positively correlated with the nodes 1,1 in MAM. For the negative correlation, however, the clusters 1 and 2 mainly distributed in the east Pacific and Indian Ocean (see Figure 7c). Therefore, most of the correlation patterns come from the Pacific, which implies that the SSTA in the tropical Pacific has a crucial impact on PA in SWC. Moreover, the time delays of the strongest links for 1 are within 100 days shorter than that of DJF, but the strength of links is stronger (Figure 7b,d). Therefore, we suggest that the proceeding winter SSTA probably influences the PA in MAM. The SSTA in the east Pacific often reaches to the peak during winter [40]. It will further influence the circulation systems, i.e., the Walker atmospheric circulation and the Hadley circulation in the tropical Pacific then cause climate anomalies in East Asia including SWC after around three months.  Figures 8 and 9 show the same analysis as Figures 6 and 7, but for JJA and SON, respectively. We find that the cluster sizes are smaller and the correlations are weaker than DJF and MAM. This is given that much more moisture is generated by local sources in SWC for JJA. Besides, we also find that the Indian Ocean and Pacific are important areas to influence the SWC in JJA and SON. Thus, considering both the Indian Ocean and Pacific could better improve the prediction of the rainfall in SWC [45,46].  Figures 8 and 9 show the same analysis as Figures 6 and 7, but for JJA and SON, respectively. We find that the cluster sizes are smaller and the correlations are weaker than DJF and MAM. This is given that much more moisture is generated by local sources in SWC for JJA. Besides, we also find that the Indian Ocean and Pacific are important areas to influence the SWC in JJA and SON. Thus, considering both the Indian Ocean and Pacific could better improve the prediction of the rainfall in SWC [45,46].

Conclusions
In this study, we employ a multi-variable complex network method to study the teleconnection between the SSTA and PA in SWC. We reveal that there are the most links between PA in SWC and the global SSTA network in Spring, followed by Winter, Summer and Autumn, respectively. According to the weighted out-degree of the network, we show the positive and negative correlation patterns over the global and find that most of the out-degree patterns emerge in the Equatorial Indian, Pacific and Atlantic Oceans which are mainly contributed by extreme PA in SWC. The mechanisms could be related to atmospheric bridge function of East Asian and Indian monsoons for the Equatorial Indian and Pacific Oceans. For the links between SWC and the Atlantic SSTA, the long-range Atlantic, long-range planetary waves account for the teleconnections [42,47]. The seasonal variation of the East Asian and Indian monsoon system mainly presents south wind in Summer and north wind in Winter, which can result that the out-degree patterns change with seasons.
According to the weighted in-degree in SWC, we find that the teleconnections are dominated by some important nodes. Therefore, we obtain the two largest clusters in the Oceans which are connected to the important nodes in the SWC. The time delay of the strongest link between the tropical east Pacific cluster and the important node of SWC is respectively −245 days for DJF, −75 days for MAM, −62 days for JJA, and −65 days for SON. Such teleconnection patterns over the global and their time delays for four seasons have not been reported in previous studies. These results are based on a complex network analysis and could be useful in improving the prediction of rainfall in SWC. The potential rainfall prediction and its mechanism need further analysis based on climate models and observation data.
Author Contributions: W.L., Z.G. and Y.Z. designed research; P.Q. performed research; P.Q. and Y.Z. analyzed data; P.Q., W.L., Y.Z. and Z.G. wrote the paper. P.Q., W.L., Y.Z. and Z.G. contributed to reviewing the manuscript. All authors have read and agreed to the published version of the manuscript. Data Availability Statement: The data in this paper was downloaded from the European Centre for Medium-Range Weather Forecasts (https://www.ecmwf.int/).