Assessment of probability distributions and analysis of the minimum storage draft rate in the equatorial region

Rapid urbanization in the state of Selangor, Malaysia, has changed the land use, the physical properties of basins, the vegetation cover and the extent of impermeable surfaces. These changes have affected the patterns and processes of the hydrological cycle and have reduced the ability of basins to store water for supply. The reliability of water supply from river basins depends on their low-flow characteristics. The impacts of minimum storage on hydrological drought are yet to be incorporated and assessed. Thus, this study aims to understand low-flow drought characteristics and the predictive significance of river storage draft rates in managing sustainable water catchments. Long-term streamflow data covering 40 years from seven stations in Selangor were used, and the streamflow trends were analyzed. Low-flow frequency analysis was carried out using the Weibull plotting position and four frequency distributions. Maximum likelihood was used to estimate the distribution parameters, while Kolmogorov–Smirnov tests were used to evaluate their fit to the dataset. The mass curve was used to quantify the minimum storage draft rate required to maintain 50 % of the mean annual flow for the 10-year recurrence interval of low flow. Next, low-flow river discharges were analyzed using the 7 d mean annual minimum, while drought events were determined using the 90th percentile (Q90) as the threshold level. The inter-event time and moving average were employed to remove dependent and minor droughts when determining the drought characteristics. The results show that the lognormal (2P) distribution was the best fit for low-flow frequency analysis to derive the low-flow return period. The analysis reveals September to December to be a critical period for river water storage to sustain water availability during low flow at a 10-year recurrence interval. These findings indicate that hydrological droughts have generally become more critical for the ability of rivers to sustain water demand during low flows. These results can help in emphasizing the natural flow of water to provide water supply for continuous use during low flow.

Extreme drought can cause significant water cycle imbalances that alter the processes of precipitation and evaporation; the circulation of atmospheric water vapor; and the availability of soil moisture, which results in a low volume of water in streams, rivers and reservoirs. The equilibrium between the water that is withdrawn for supply and the water that is replenished by surface runoff must be maintained. A critical issue arises when there is a dry season and no estimated water excess. Under such conditions, water shortages can happen even when the dry season is not extreme. Human activities and poor management of water resources can lead to water scarcity, which can be exacerbated by drought. In certain regions, water consumption increases the severity of water scarcity and triggers water shortage events even in regions that are relatively well endowed with water resources (Wada et al., 2013).
Hydrological drought is a natural event characterized by streamflow deficits in duration and volume (Kubiak-Wójcicka and Bąk, 2018). Not every low-flow occurrence can be called a drought, and several low flows can form one hydrological drought (Teegavarapu et al., 2019). Hydrological drought should therefore not be equated with low flow or other related hazards. Low flow is a commonly used term for low discharge in a stream. It is often defined by an annual minimum series, which does not reflect hydrological drought in all years. Fleig et al. (2006) distinguished between hydrological-drought and low-flow characteristics. For some specific purposes, the main feature of drought is said to be the water deficit. Low flows are usually observed during a drought, but they capture only one aspect of the drought, namely its magnitude. Low-flow analysis attempts to understand the short-term physical development of flows at a point along a river. The annual minimum n d average discharge is the most widely used low-flow index.
Water availability in many areas is becoming less predictable due to climate change. More significant periods of drought and higher temperatures are projected to affect the rainfall distribution and the river flow available for supply, with deleterious effects on water supply. The watershed also plays a significant role in the propagation of drought and affects processes such as pooling, lagging and lengthening (Fleig et al., 2006; Sarailidis et al., 2019). Some research has further explored the specific roles of climate control and watershed influence in regulating features of hydrological drought, and the findings largely depend on the spatial scale (Austin and Nelms, 2017; Barker et al., 2016; Liu et al., 2012; Zarafshani et al., 2016; Zhu et al., 2018). Generally, the duration of hydrological drought and the deficit volume are more climate-related than watershed-related. However, watershed features such as geology, region, slope and groundwater regime play a significant part in regulating the duration of hydrological drought and the deficit volume at the regional scale, where the climate is presumed to be relatively constant (Gianfagna et al., 2015; Blöschl, 2006, 2007; Liu et al., 2016). The influences on hydrological drought are not restricted to climatic and watershed variables; anthropogenic activities in the form of land use modification, reservoir control, irrigation, and water extraction or withdrawal should not be disregarded (Hatzigiannakis et al., 2016; Richter and Thomas, 2007; Sun et al., 2018; Toriman et al., 2013).
When the low flow of the river is insufficient to meet the water demand, storage may be utilized to increase the guaranteed water supply. The hydrological aspects which must be considered are the amount of storage necessary to sustain a given draft rate and the associated risk of insufficient storage to meet this draft rate. The relationship between inflow, storage and draw-off is complex. Significant sources of error are associated with frequency analysis. These errors arise from fitting a particular type of extreme-value distribution to the low-flow series and from the uncertainties associated with assigning recurrence intervals or cumulative probabilities to the events in the series. Drainage basin stores hold significant quantities of water and may regulate the rate at which input feeds through to the output. Channel storage is the volume of water contained within the banks of the river, which operates as a water store between its initial input and ultimate output (Griffiths and Clausen, 1997).
This study was conducted in the state of Selangor on the western coast of Peninsular Malaysia to evaluate and investigate hydrological-drought characteristics using historical streamflow data. High demand for water to meet the daily consumption of a rapidly growing population, together with a lack of rain, has caused disruptions of water supply in Selangor (Khalid, 2018; Kwan et al., 2013; Ngang et al., 2017). Water shortages associated with El Niño-Southern Oscillation (ENSO) events have impacted parts of Malaysia, including Selangor (Sanusi et al., 2015; Zainal et al., 2017). Drought disasters have hit several regions in Malaysia, especially the Klang Valley, Selangor; Penang; and several other places such as Kedah, Kelantan, Sarawak and Sabah (Chan, 2012). Water shortage and drought in Malaysia were recorded as early as 1951, when a drought lasted for 29 months in the Langat River basin (Chan, 2012). After that episode, drought continued to hit Malaysia, with the Klang Valley water crisis in February-May 1998 and a further water shortage in Hulu Langat, Selangor, in 2002 (Ithnin, 2014). This drought caused the water level in some dams in Peninsular Malaysia to reach critical levels, as happened in the 1997-1998 drought episode (Lee et al., 2018). Consequently, the characteristics of hydrological drought must be identified, and the effects of hydrological drought must be quantitatively evaluated. Studies by Iqbal et al. (2016), Azadi et al. (2018) and Tigkas et al. (2012) have highlighted the issue of hydrological drought and its impact on agriculture, socio-economic conditions and streamflow in the watershed. Hydrological drought has been referred to as the most critical aspect of drought, with significantly reduced streamflow and lower water storage in the river system (Hasan et al., 2019).
Because of this, the storage rate for each river should be established to ensure that the minimum storage for water supply during low flow and drought in the coming years is sufficient to accommodate consumers' water demand. Some relevant research questions in the investigation of hydrological drought are:

The scope of this study covers all streamflow stations in the state of Selangor. Selangor covers an area of 8104 km² and is located on Peninsular Malaysia's western coast. Selangor's water supply system not only covers the state of Selangor but also supplies water to the Kuala Lumpur and Putrajaya areas (Sakke et al., 2016a). The basins of the Langat, Klang and Selangor rivers are the main river basins in Selangor. There are also three other river basins in Selangor, which are the basins of the Buloh, Bernam and Tengi rivers. Table 1 shows the locations and characteristics of all streamflow gauging stations involved in this study. The Langat and Semenyih dams, located at the upper reaches of the Langat River (Elfithri et al., 2018), serve to regulate the raw water supplied to treatment plants downstream. The main tributaries of the Selangor River are the Sembah, Kanching, Kerling, Rawang and Tinggi rivers. There are two dams, namely the Selangor and Tinggi dams, in the Selangor River basin.
The state of Selangor lies in the equatorial climate zone, which is warm and humid all year (Lassen et al., 2004). The average annual temperature varies between 27 and 30 °C, and the average annual relative humidity is between 70 % and 90 % (Lee et al., 2013). The equatorial climatic regions are influenced by two monsoons, the southwest Indian monsoon and the northeast Asian monsoon, which bring two rainy seasons with a significant number of storms and a mean annual rainfall of about 2500 mm (Mamun et al., 2010). Even though Selangor is located in the humid region, it occasionally encounters drought periods. Dry spells, low rainfall and high soil impermeability due to population growth are the leading causes of low-flow events. A stream's regime can display one or more low-flow events depending on the climate. The equatorial climate has two rainy and two dry seasons, and the streamflow regime has two corresponding periods of high flow and low flow. Figure 1 shows the seven streamflow gauging stations involved in this study, with four stations located in the Langat River basin at Dengkil, Kajang, Semenyih and Lui. There are also streamflow gauging stations at Rantau Panjang for the Selangor River basin and at Tanjung Malim and Jambatan Sekolah Kebangsaan Cina for the Bernam River basin (Department of Irrigation and Drainage Malaysia, 2011). The headwater of the Langat River basin starts from the northeast of the basin, flows to the southwest and joins the Semenyih River. The Langat and Semenyih dams and the Selangor and Tinggi dams are located at the upper reaches of the Langat River and Selangor River basins, respectively (Elfithri et al., 2018), and regulate the quantities of streamflow to the treatment plants.

Methodology
Daily streamflow data were obtained from the Department of Irrigation and Drainage of Malaysia and cover approximately 40 years (1978 to 2017) of records for all streamflow gauging stations. Precautions were taken to ensure that reasonable low-flow data were captured. A methodological framework was developed for assessing the hydrological-drought characteristics in the state of Selangor, Malaysia, using low-flow and threshold indicators. First, the daily streamflow trend over 40 years was determined using the Mann-Kendall test, the trend slope was calculated using the Sen's slope estimator, and the change points were identified using the CUSUM and Pettitt tests. Next, the probability distribution that best fits the 7 d mean annual minimum (MAM) was evaluated in the low-flow frequency analysis to determine the low flows for different return periods. The low flow for the 10-year return period was then used to estimate the minimum storage draft rate of the river with a mass curve. Next, the threshold level was obtained from the flow duration curve (FDC), and the 90th percentile was selected for drought analysis. Finally, the characteristics of hydrological drought were analyzed, including drought events, durations and drought deficits in seven watershed catchments. The whole methodology is summarized in Fig. 2. The following sections elucidate the specific components incorporated into the methodological framework.

Streamflow trend analysis
The mean annual streamflow was analyzed for significant trends, and distribution changes are discussed. The trend slope was measured using the Sen's slope estimator, which gives the magnitude of change in trends. Finally, the change points in the long-term streamflow series were identified using the CUSUM test, and the changes in streamflow before and after the change points were examined using the Pettitt test. All analyses were conducted at seven stations to recognize the spatial variability of historical streamflow pattern change. The Mann-Kendall test and Sen's slope test are the most commonly used non-parametric trend analysis methods (Hisdal et al., 2001). The Mann-Kendall test was chosen for its capability of identifying a trend in a time series, if there is any, and was used to evaluate the significance of monotonic trends in the streamflow time series. For a series of streamflow data over a time period, the null hypothesis ($H_0$) is that the data originate from a series of independent and identically distributed variables. The alternative hypothesis ($H_1$) is that the data follow a monotonic trend over time. Under $H_0$, the Mann-Kendall test statistic is given by Eq. (1):

$$S = \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} \operatorname{sgn}(x_j - x_i), \tag{1}$$

where $x_j$ and $x_i$ are the data values in years $j$ and $i$, respectively; $n$ is the total number of years; and $\operatorname{sgn}(\cdot)$ equals 1, 0 or $-1$ for a positive, zero or negative argument. The probability associated with $S$ and the sample size $n$ was determined to measure the trend significance statistically. The normalized test statistic $Z$ is expressed by Eq. (2):

$$Z = \begin{cases} \dfrac{S - 1}{\sqrt{\operatorname{Var}(S)}}, & S > 0, \\ 0, & S = 0, \\ \dfrac{S + 1}{\sqrt{\operatorname{Var}(S)}}, & S < 0, \end{cases} \tag{2}$$

where $\operatorname{Var}(S) = n(n-1)(2n+5)/18$ in the absence of tied values. The null hypothesis of no trend is rejected if $|Z| > 2.575$ at the 99 % confidence level. The statistic $S$ sums the signs of the differences between all pairs of data points to show the presence or absence of a trend. A positive value of $Z$ indicates an increasing trend, and a negative value of $Z$ indicates a decreasing trend.
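As an illustration of Eqs. (1) and (2), the following minimal Python sketch computes S, the normalized statistic Z and a two-sided p value for an annual series. It assumes that there are no tied values (no tie correction is applied to the variance) and uses only the standard library; the function names and the example series are hypothetical and are not taken from the study data.

```python
import math


def sign(value):
    """Return 1, 0 or -1 for a positive, zero or negative value."""
    return (value > 0) - (value < 0)


def mann_kendall(series):
    """Return (S, Z, p) for the Mann-Kendall trend test.

    Assumption: no tied values, so Var(S) = n(n - 1)(2n + 5) / 18.
    """
    n = len(series)
    s = sum(sign(series[j] - series[i])
            for i in range(n - 1) for j in range(i + 1, n))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    # Two-sided p value, 2 * (1 - Phi(|Z|)), written with the error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return s, z, p


# Hypothetical mean annual flows (m3/s), for illustration only.
annual_flow = [12.1, 11.8, 13.0, 12.7, 13.9, 14.2, 13.6, 15.1, 14.8, 15.9]
s, z, p = mann_kendall(annual_flow)
print(f"S = {s}, Z = {z:.2f}, significant at 5 %: {p <= 0.05}")
```

A positive Z together with p ≤ 0.05 would be read as a significant increasing trend. Records with many tied values would need the tie-corrected variance, which this sketch omits.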
In this study, a significance level of 0.05 (95 % confidence) was used. If the p value was equal to or less than 0.05, the trend was considered significant, with the p value given by Eq. (3) (Coch and Mediero, 2016):

$$p = 2\,[1 - \Phi(|Z|)], \tag{3}$$

where $\Phi$ is the standard normal cumulative distribution function. Then, a linear trend analysis was also conducted, and the trend magnitude was determined using the Sen's slope method. Sen's slope is a non-parametric method for determining the slope of a trend in a time series. The slope is calculated per unit of time for each pair of data points. If a trend is identified in a time series, its slope can be determined using the slope estimator ($\beta$) of the Sen's slope test. For the entire dataset, the estimator $\beta$ is the median of all slopes between pairs of data points. A positive $\beta$ indicates an increasing trend, and a negative $\beta$ indicates a decreasing trend, as given by Eq. (4):

$$\beta = \operatorname{median}\left(\frac{x_j - x_i}{j - i}\right), \quad i < j, \tag{4}$$

where $n$ is the number of data and $i$ and $j$ are indices with $i = 1, 2, \ldots, n - 1$ and $j = 2, 3, \ldots, n$. After the trend slope had been verified, the change in the average annual streamflow was determined using the equation employed by Petrow and Merz (2009), Eq. (5):

$$X_R = \frac{X_{\mathrm{end}} - X_{\mathrm{first}}}{X_{\mathrm{mean}}} \times 100, \tag{5}$$

where $X_R$ is the amount of change observed in the data series (%), $X_{\mathrm{end}}$ is the last value of the trend line, $X_{\mathrm{first}}$ is the first value of the trend line and $X_{\mathrm{mean}}$ is the mean of all values of the trend line. The distribution-free CUSUM test is a cumulative total of the deviations of a time series from a target value. It is capable of detecting abnormal trends, is simple and produces a good graphical representation of the results (Sonali and Nagesh Kumar, 2013). Consider a set of samples, each of size $n$, with mean $\mu_0$ and standard deviation $\sigma$. The cumulative sum of deviations ($S_i$) from the target value (mean) was calculated using Eq. (6):

$$S_i = \sum_{j=1}^{i} (\bar{x}_j - \mu_0), \tag{6}$$

where $\bar{x}_j$ is the mean of the $j$th sample. Finally, consider a sequence of random variables $x_1, x_2, \ldots, x_T$ that may have a change point at $N$, such that $x_t$ for $t = 1, 2, \ldots, N$ has a common distribution function $F_1(x)$ and $x_t$ for $t = N + 1, \ldots, T$ has a common distribution function $F_2(x) \neq F_1(x)$. The Pettitt test index ($U$) is defined using Eq. (7) (Ahn and Palmer, 2016):

$$U_{t,T} = \sum_{i=1}^{t} \sum_{j=t+1}^{T} \operatorname{sgn}(x_j - x_i), \tag{7}$$

where $t$ is a candidate change point ($1 \le t < T$), $T$ is the length of the series, $x$ is the target variable and $\operatorname{sgn}(x_j - x_i)$ is defined as Eq. (8):

$$\operatorname{sgn}(x_j - x_i) = \begin{cases} 1, & x_j - x_i > 0, \\ 0, & x_j - x_i = 0, \\ -1, & x_j - x_i < 0. \end{cases} \tag{8}$$

The non-parametric statistic (Eq. 9) was applied to locate the change point, which is the time at which $U$ has the highest absolute value:

$$K = \max_{1 \le t < T} \left| U_{t,T} \right|, \tag{9}$$

where $K$ is the final Pettitt statistic and the value of $t$ at which the maximum is attained is the data point at which the change occurs. The probability of significance was approximated by $p \approx 2 \exp\left[-6K^2 / (T^3 + T^2)\right]$. When $p$ is smaller than the specified significance level (0.05), the null hypothesis is rejected.
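The slope and change point calculations described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the Pettitt index is computed with sgn(x_j − x_i) summed over all pairs split at a candidate point t, the approximate p value uses the series length, and the CUSUM target is taken as the series mean. The function names and the example series are hypothetical.

```python
import math
from statistics import mean, median


def sign(value):
    """Return 1, 0 or -1 for a positive, zero or negative value."""
    return (value > 0) - (value < 0)


def sens_slope(series):
    """Median of the slopes between all pairs of points (Sen's estimator)."""
    n = len(series)
    slopes = [(series[j] - series[i]) / (j - i)
              for i in range(n - 1) for j in range(i + 1, n)]
    return median(slopes)


def cusum(series):
    """Cumulative sum of deviations from the series mean (assumed target)."""
    target = mean(series)
    total, sums = 0.0, []
    for value in series:
        total += value - target
        sums.append(total)
    return sums


def pettitt(series):
    """Return (t, K, p): split position, Pettitt statistic, approximate p.

    t counts the values before the change, so series[:t] and series[t:]
    are the two segments.
    """
    n = len(series)
    best_t, best_u = 0, 0
    for t in range(1, n):
        u = sum(sign(series[j] - series[i])
                for i in range(t) for j in range(t, n))
        if abs(u) > abs(best_u):
            best_t, best_u = t, u
    k = abs(best_u)
    p = min(1.0, 2.0 * math.exp(-6.0 * k ** 2 / (n ** 3 + n ** 2)))
    return best_t, k, p


# Hypothetical annual flows with a shift after the fifth year.
annual_flow = [10, 11, 9, 10, 12, 20, 21, 19, 22, 20]
print("Sen's slope:", sens_slope(annual_flow))
print("CUSUM:", [round(v, 1) for v in cusum(annual_flow)])
print("Pettitt (t, K, p):", pettitt(annual_flow))
```

The double loop in `pettitt` is quadratic for each candidate point, which is acceptable for a 40-value annual series but would be slow for daily data.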

Low-flow frequency analysis
Many types of frequency distribution functions have been applied successfully to hydrological data. Frequency analysis is based on fitting a theoretical probability distribution function to the observed data and provides low-flow estimates for any given return period. The choice of probability distribution was guided by the shape parameter, which can be interpreted as a measure of skewness. The frequency analysis starts with the calculation of the annual 7 d minimum streamflow series for each gauging station in order to determine the probability distribution that best fits the minimum 7 d low flow in Selangor. Then, four probability distributions, namely the gamma, Gumbel, lognormal 2P and Pearson type 3 (PE3) distributions, were evaluated to determine which distribution most appropriately fits the low-flow data. The Kolmogorov-Smirnov (KS) test and a ranking method were used to determine the best-fitting distributions. Scores ranging from 1 to 4, representing the ranking of the distributions in fitting the data, were assigned at each station, where a score of 1 indicated the best fit and a score of 4 indicated the worst. The sum of the scores shows the suitability of each distribution, with the best distribution having the lowest sum of scores. After choosing the optimum probability distribution, it is important to estimate the values of the variable for certain return periods. The return period of low-flow occurrence is crucial for determining the magnitude and frequency of low flow, and such information is useful in minimizing and mitigating the risk of drought in the future. The selected regional probability distribution function was then used to estimate the annual 7 d minimum discharge for return periods of 1, 2.3, 5, 10, 25, 50 and 100 years. The 7 d minimum with a 10-year return period (7Q10) was used to derive the minimum storage draft rate required for all stations (Sect. 3.3).
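The two bookkeeping steps in this procedure, extracting the annual 7 d minimum and summing the per-station ranking scores, can be sketched as follows. This is an illustrative sketch only: it assumes that daily flows are already grouped by year and that the ranking is based on the KS statistic (smallest statistic ranked 1). The function names, the data layout and the numbers are hypothetical.

```python
def moving_average(values, window=7):
    """Moving average over `window` consecutive values."""
    if len(values) < window:
        raise ValueError("series is shorter than the averaging window")
    running = sum(values[:window])
    averages = [running / window]
    for k in range(window, len(values)):
        running += values[k] - values[k - window]
        averages.append(running / window)
    return averages


def annual_7day_minimum(daily_by_year, window=7):
    """Map each year to the minimum of its 7 d moving-average flow."""
    return {year: min(moving_average(flows, window))
            for year, flows in daily_by_year.items()}


def rank_sum(ks_by_station):
    """Sum the per-station ranks (1 = smallest KS statistic) per distribution."""
    totals = {}
    for ks_stats in ks_by_station.values():
        ordered = sorted(ks_stats, key=ks_stats.get)
        for rank, name in enumerate(ordered, start=1):
            totals[name] = totals.get(name, 0) + rank
    return totals


# Hypothetical KS statistics at two stations, for illustration only.
ks_by_station = {
    "S01": {"gamma": 0.12, "gumbel": 0.15, "lognormal": 0.08, "pe3": 0.10},
    "S02": {"gamma": 0.11, "gumbel": 0.18, "lognormal": 0.09, "pe3": 0.13},
}
totals = rank_sum(ks_by_station)
print(min(totals, key=totals.get), totals)
```

In this layout the distribution with the lowest total score is taken as the regional best fit, which mirrors the scoring rule described in the text.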
The probabilistic behavior was analyzed using four probability distribution functions (PDFs) that are widely used in extreme-value analysis (Joshi and St.-Hilaire, 2013). The distributions were fitted with their parameters estimated by the method of maximum likelihood (Assefa and Moges, 2018). Goodness of fit was determined by the Kolmogorov-Smirnov test, and a 95 % confidence level was used to accept or reject the hypothesis, based on the D value. The probability plot displays the ith-order statistic of the sample, y(i), as a function of a plotting position, which is simply a measure of the non-exceedance probability related to the ith-order statistic from the assumed standardized distribution (Sharma and Panu, 2015). The ith-order statistic is obtained by ranking the observed sample from the smallest (i = 1) to the greatest (i = n) value, so that y(i) is the ith smallest value. The plotting position of low flow P can be obtained using the Weibull formula, P = i/(n + 1) (Koteia et al., 2016). The probability distributions were selected according to the shape parameter, because the shape parameter can be interpreted as the parameter for skewness. For each distribution, Table 2 provides the probability density function. Once the parameters were estimated, the selected distributions were tested for the assumption that the observed data actually come from the fitted probability distribution. The Kolmogorov-Smirnov (KS) test determines the largest discrepancy between the theoretical (F n (x i )) and empirical (F 0 (x i )) cumulative distribution functions. The KS test yields a D statistic; if D was higher than the critical value (α = 0.05), the distribution was rejected. After the probability P and the corresponding return period T of the low flow were calculated (T = 1/P), the low flow was plotted against the return period T on a semilog graph. From this graph, the low-flow magnitude for a specified return period can be determined (Erfen et al., 2015; Gottschalk et al., 2013).
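For the two-parameter lognormal distribution, the fitting and testing steps above can be sketched with the Python standard library alone. The sketch assumes that the maximum-likelihood estimates are the mean and the population standard deviation of the log-transformed flows, that the return period of a low flow is the reciprocal of its non-exceedance probability, and that the large-sample approximation 1.36/√n is adequate for the 5 % critical value. The function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist


def weibull_positions(sample):
    """Pair each ascending-ranked value with P = i / (n + 1)."""
    n = len(sample)
    return [(x, i / (n + 1)) for i, x in enumerate(sorted(sample), start=1)]


def fit_lognormal_2p(sample):
    """Maximum-likelihood mu and sigma of ln(x) for a 2P lognormal."""
    logs = [math.log(x) for x in sample]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def ks_statistic(sample, mu, sigma):
    """Largest gap between the empirical and fitted lognormal CDFs."""
    dist = NormalDist(mu, sigma)
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = dist.cdf(math.log(x))
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_5pct(n):
    """Large-sample approximation of the 5 % critical value."""
    return 1.36 / math.sqrt(n)


def low_flow_quantile(mu, sigma, return_period):
    """Low flow with non-exceedance probability 1 / T (requires T > 1)."""
    return math.exp(NormalDist(mu, sigma).inv_cdf(1.0 / return_period))


# Hypothetical annual 7 d minimum flows (m3/s), for illustration only.
minima = [2.1, 3.4, 1.8, 2.9, 4.2, 2.5, 3.1, 1.6, 2.2, 3.8]
mu, sigma = fit_lognormal_2p(minima)
d = ks_statistic(minima, mu, sigma)
print("rejected at 5 %:", d > ks_critical_5pct(len(minima)))
print("7Q10 estimate:", round(low_flow_quantile(mu, sigma, 10), 2))
```

Because the parameters are estimated from the same sample, the standard KS critical values are conservative; the sketch keeps them only to mirror the acceptance rule described in the text.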

Method for minimum storage draft rate
The water supply, or inflow, depends on the low-flow characteristics of the stream. If the inflow rate is lower than the outflow (demand) rate, the cumulative difference between the supply and demand volumes is the maximum amount of water drawn from storage during the dry season. Channel storage is a function of both the outflow and inflow discharge and can be considered under two categories, prism and wedge storage. The water surface in the channel is not only non-parallel to the channel bottom but also varies with time. The storage, which is the maximum cumulative deficiency in any dry season, is obtained from the maximum difference in the ordinate between the mass curves of water supply and demand. Thus, the storage required can be expressed as per Eq. (10):

$$\text{Storage} = \max\left(\sum V_D - \sum V_S\right), \tag{10}$$

where $V_D$ is the demand volume and $V_S$ is the supply volume. The minimum storage draft rate was determined by using the mass curve of low flow at a monthly interval (Bharali, 2015). Although specific evaluation of storage requirements is essential for design, reconnaissance planning can frequently be facilitated by using draft storage curves based on low-flow frequency analysis.

Table 2. Probability density functions for the gamma, Gumbel, lognormal 2P and Pearson type 3 (PE3) distributions, where α and β are the location and scale parameters (references: Zou et al., 2018; Win and Win, 2014; Bhatti et al., 2019).

Alrayess et al. (2017) determined the capacity of river storage by the mass curve method. The mass curve has many useful applications in the design of storage capacities, such as determining the storage capacity and flood routing (Gao et al., 2017). The mass curve method can be used to define the storage required for a given draft rate for a monthly record. This approach is limited to draft rates that can be sustained by the streamflow available in any single month, that is, by within-year storage. The usefulness of this analysis depends on the monthly variability of streamflow. In some regions, the maximum draft that can be provided is less than a tenth of the mean flow. In others, notably in Selangor, drafts of half of the mean flow can be provided by within-year storage. The estimation of the storage draft rate in this study determines the minimum storage of a river needed to sustain the water supply during low flows and droughts. The mass curve of the monthly low-flow rate is used in this analysis to obtain the minimum storage rate of the river. The procedure for the mass curve method has the following steps. First, the mass curve of low flow for January to December was plotted for the recurrence interval of 10 years, using the 10-year return period values in Table 7. Second, the cumulative draw-off corresponding to a constant draft rate of 50 % of the mean annual flow was plotted as a straight line. Third, the cumulative draft line was superimposed on the mass curve. Fourth, the largest intercept between the cumulative draft line and the mass curve was measured.
The maximum positive difference between the cumulative draw-off and the cumulative low flow is the minimum storage necessary to maintain a draft rate of 50 % of the mean annual streamflow. An example of the minimum storage required in the river for station S05, obtained by mass curve analysis, is shown in Fig. 3.
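The mass curve steps can be illustrated with a short Python sketch that accumulates the monthly low flows, builds the straight cumulative draft line and takes the largest positive gap between the two. It assumes that the monthly low flows and the mean flow are expressed as monthly volumes in the same units; the function name and the numbers are hypothetical and do not reproduce the values for station S05.

```python
from itertools import accumulate


def minimum_storage(monthly_low_flow, mean_monthly_flow, draft_fraction=0.5):
    """Return (storage, month_index) from a simple mass curve comparison.

    monthly_low_flow: low-flow volumes for consecutive months (Jan to Dec).
    mean_monthly_flow: mean flow volume per month, in the same units.
    draft_fraction: constant draft as a fraction of the mean flow.
    """
    draft = draft_fraction * mean_monthly_flow
    cumulative_supply = list(accumulate(monthly_low_flow))
    cumulative_draft = [draft * (m + 1) for m in range(len(monthly_low_flow))]
    gaps = [d - s for d, s in zip(cumulative_draft, cumulative_supply)]
    worst = max(gaps)
    # A negative gap means the low flow alone always covers the draft.
    return max(worst, 0.0), gaps.index(worst)


# Hypothetical monthly low-flow volumes (million m3), for illustration only.
low_flow = [6, 5, 5, 6, 5, 4, 4, 3, 2, 2, 2, 3]
storage, month = minimum_storage(low_flow, mean_monthly_flow=9.0)
print(f"minimum storage = {storage}, largest gap in month {month + 1}")
```

This follows the simplified rule stated in the text (largest gap measured from the start of the year). A sequent-peak calculation would be needed if deficits had to be tracked across several separate dry spells.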

Threshold analysis
An approach based on deficit characteristics under a given threshold was adopted to identify extreme low-flow occurrences (Fleig et al., 2006). The low-flow period, which depends on the catchment's hydrological regime, is defined by a fixed threshold level. The selection of the threshold level is influenced by the purpose of the study, the region and the available data. The threshold level method can easily obtain the start and end times of a drought or streamflow deficit period and has been used to define streamflow droughts or deficits. The fixed threshold level in this study is the 90th percentile flow ($Q_{90}$) of the FDC, which was compiled using all the available daily streamflow data; the rivers are perennial, with continuous flow. The flow duration curve (FDC) describes the percentage of time during which a specified discharge is equalled or exceeded over a historical period for a particular river basin (Croker et al., 2003; Mohamoud, 2008; Vogel and Fennessey, 1994), and thus reflects the relationship between streamflow magnitude and the percentage of time for which a specific flow is exceeded (Sung and Chung, 2014). The FDC was developed by arranging the streamflow values in decreasing order of magnitude and assigning a rank number to each streamflow value. The largest flow was ranked as one, and the smallest flow was ranked as n, where n is the total number of records. The percentage of time for which a given flow was equalled or exceeded (exceedance probability) was calculated using Eq. (11) (Awass, 2009; Koteia et al., 2016; Yahiaoui, 2019):

$$P = \frac{r}{n + 1} \times 100, \tag{11}$$

where $P$ is the percentage of time a given flow is equalled or exceeded, $n$ is the total number of records and $r$ is the rank of the flow magnitude.

Kannan et al. (2018) indicated that the flow duration curve can be divided into five zones, representing high flows (0 %-10 %), humid conditions (10 %-40 %), medium-range flows (40 %-60 %), dry conditions (60 %-90 %) and low flows (90 %-100 %). The selection of the percentile strongly conditions the classification and evaluation of extreme low-flow events. The magnitude of the drought characteristics was determined by the threshold value and the difference between the threshold and the streamflow time series. Compared with standardized drought indices, a major benefit of this approach is that it allows the deficit volume to be quantified, which is a critical aspect in the management of water supplies. A drought event begins when the flow falls below the threshold level and terminates when the flow exceeds the threshold level. The duration; the total deficit, which is the sum of the deficits; and the magnitude of each drought event can be readily obtained. As a daily data series was used, minor drought events and mutually dependent drought events can occur (Van Loon and Van Lanen, 2013). To deal with this problem, pooling procedures such as the moving average, the inter-event time criterion, and the inter-event time and volume criterion are frequently used (Sung and Chung, 2014). Following Sakke et al. (2016a), minor drought events lasting less than 15 d were excluded, while the mutually dependent events were eliminated using the pooling procedure (Sakke et al., 2016b). In this paper, an inter-event time of 15 d and a 7 d moving average were applied as the pooling procedure to obtain smooth data. Through these methods, mutually dependent drought events are combined into individual and independent drought events (Fleig et al., 2006). Minor drought events are eliminated or combined with individual drought events automatically (Yahiaoui et al., 2009).
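The threshold level method described above can be sketched in Python as follows. The sketch makes several assumptions that should be checked against the study's own definitions: the exceedance probability uses the rank divided by n + 1, Q90 is read as the first ranked flow with an exceedance probability of at least 90 %, events separated by fewer than 15 d are pooled from the first start to the last end, and the pooled deficit sums only the days below the threshold. All function names and values are hypothetical.

```python
def exceedance_probabilities(flows):
    """Pair each flow with P = r / (n + 1) * 100, rank 1 = largest flow."""
    ordered = sorted(flows, reverse=True)
    n = len(ordered)
    return [(q, 100.0 * r / (n + 1)) for r, q in enumerate(ordered, start=1)]


def q_threshold(flows, percent=90.0):
    """Flow equalled or exceeded `percent` % of the time (Q90 by default)."""
    for q, p in exceedance_probabilities(flows):
        if p >= percent:
            return q
    return min(flows)


def drought_events(flows, threshold, min_gap=15, min_duration=15):
    """Return pooled events as (start, end, duration, deficit) tuples."""
    events, start = [], None
    for day, q in enumerate(flows):
        if q < threshold and start is None:
            start = day                      # flow drops below the threshold
        elif q >= threshold and start is not None:
            events.append([start, day - 1])  # flow recovers
            start = None
    if start is not None:
        events.append([start, len(flows) - 1])

    pooled = []
    for event in events:
        # Inter-event time criterion: merge events separated by a short gap.
        if pooled and event[0] - pooled[-1][1] - 1 < min_gap:
            pooled[-1][1] = event[1]
        else:
            pooled.append(event)

    result = []
    for first, last in pooled:
        duration = last - first + 1
        if duration >= min_duration:         # drop minor events
            deficit = sum(max(threshold - q, 0.0) for q in flows[first:last + 1])
            result.append((first, last, duration, deficit))
    return result


# Hypothetical daily flows: two dependent dry spells and one minor event.
daily = [5] * 10 + [1] * 20 + [5] * 5 + [1] * 10 + [5] * 30 + [1] * 3 + [5] * 20
print(drought_events(daily, threshold=3))
```

In practice the daily series would first be smoothed with a 7 d moving average and the threshold would be taken from the full record with `q_threshold`; the fixed threshold of 3 in the example is used only to keep the illustration easy to follow.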

Results and discussion
The streamflow data from the seven streamflow gauging stations were analyzed in three aspects: the mean annual low flow and its probability of occurrence, the drought characteristics based on the threshold level, and the estimation of the storage draft rate of the river. Statistical characteristics were calculated from the observed 40-year daily streamflow time series: the mean, minimum and maximum; standard deviation; skewness; and kurtosis for each station (Table 3).

Streamflow trend analysis
Trend analysis of the annual streamflow series presents an overall view of the shift in streamflow systems (Assefa and Moges, 2018). The Mann-Kendall test, Sen's slope, relative change within 40 years, maximum cumulative sum (CUSUM) with the year of the change point and the p value from the Pettitt test are displayed in Table 4. In the trend significance test, the significance level of α = 0.05 was set as the standard, making $Z_{\alpha/2}$ = 1.96. The analysis indicated that five stations (S01, S02, S04, S05 and S07) have increasing streamflow trends. Two stations, S03 and S06, showed a decreasing trend, with a negative change in streamflow. The trend slope was estimated using the Sen's slope estimator, where an upward (downward) streamflow trend is indicated by a trend slope greater (less) than zero. The trend slope values were also used to construct a trend line for the annual streamflow. Using Eq. (5), the amount of change in annual streamflow was determined. The results indicate that the amount of change in the basin of station S04 was higher than that at other stations (Table 4). Stations S03 and S06 showed marked downward trends, with changes of −20 % and −55 %, respectively. Streamflow trends vary from one station to another in terms of magnitude and trend direction.
At stations S03 and S06, several factors could explain the decreasing streamflow. Some of these involve modifications of the physical characteristics of the catchment, such as changes in land cover in the river basins (Hisdal et al., 2001). The other five stations showed increasing streamflow trends, attributed to climate change through increasing temperature and soil water evaporation (Siwar et al., 2013; Taye et al., 2011). The accuracy of the data analysis is of crucial importance in trend analysis studies, especially for stream discharges. The majority of station trends on the main and secondary branches of the basin showed good consistency in this analysis. Two main rivers, however, show an apparent paradox, with one station showing a declining trend and another station showing an increasing trend. Such behavior can be expected given the location of the stations, dam construction, the linking of another stream to the channel, irrigation and other disruptions in the discharge regime of the river. Stations S01, S02, S03 and S04 are located on the same stream, but the trend at station S04 is not in the same direction. Stations S01, S02 and S03 have a significantly increasing trend, while station S04 shows a downward streamflow trend that is not significant; this contrast may be caused by disruption in the river regime, such as the construction of the Langat dam (Memarian et al., 2012).
The results of the change point in annual streamflow are tabulated in Table 4 using the Pettitt test. For each time sequence, the result gave the most likely change point event.
For the annual streamflow, the results showed that 1997 was the most probable year of change, with a p value of 0.0004. Some stations show a change point at the 5 % significance level, while others do not. CUSUM charts are well suited to indicating process changes and the onset of trends. For station S01, the analysis shows a change point in 1996, with the confidence level set at 95 % and a p value of 0.1215. A change point in 2005 occurred at two stations in the state of Selangor, S05 and S07. The major changes observed in the annual streamflow suggest that rapidly increasing industrial activities in the basin, together with a shift in land use, have influenced the streamflow trend in the basin. The latest change point occurred in 2009 at the Bernam River (S06), coinciding with the implementation of several projects by the state government, such as the construction of a feeder canal for agriculture and the repair of the collapsed stretch of the riverbank that caused the widening of the river channel.
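The change-point detection can be illustrated with a short sketch of the Pettitt test. It assumes the standard rank-based statistic and the usual approximate p value; the function name, the years and the flow values are hypothetical and are not taken from Table 4.

```python
import math


def pettitt(values):
    """Return (t, K, p): split position, test statistic and approximate p value.

    t is the number of observations before the most likely change point,
    so values[t] is the first observation of the new regime.
    """
    n = len(values)
    best_k, best_t = 0, 0
    for t in range(1, n):
        # U_t sums sgn(x_i - x_j) over i in the first segment, j in the second.
        u = 0
        for i in range(t):
            for j in range(t, n):
                d = values[i] - values[j]
                u += (d > 0) - (d < 0)
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    # Approximate two-sided significance probability.
    p = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_t, best_k, p


# Hypothetical annual mean flows (m3/s) with a shift halfway through the record.
years = list(range(1990, 2000))
flows = [8.1, 7.9, 8.4, 8.0, 8.2, 11.5, 11.9, 12.3, 11.7, 12.0]
t_split, k_stat, p_value = pettitt(flows)
change_year = years[t_split]  # first year of the new regime
```

With only 10 values the approximate p value stays above 0.05 even for a clear shift, which is one reason why long records such as the 40-year series used here are preferable for this test.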
For the mean annual streamflow at the gauging stations, five stations indicated an upward trend and two stations indicated a downward trend over the 40 years of data. Trend analysis of relatively partial streamflow records may reflect only a short-term condition and may not be representative of an actual long-term change in streamflow. This issue applies in particular to relatively short records that begin or end in a historically low-flow condition. From the average annual streamflow results, change points are present at a 100 % confidence interval in 1996-1997 and 2005-2007. This implies an impact of rapidly increasing industrial activities in the basin, as well as of a change in the land use pattern, on the streamflow patterns in the basin, which is supported by Abdullah and Nakagoshi (2006). This study is useful in interpreting climate change scenarios and focuses on the revealed characteristics of regional-level hydrological variables.
The anthropogenic effect is shown by transformations of surface water such as the construction of reservoirs, transbasin diversion projects, crop irrigation, urban water supply or drainage, and urbanization. There are three strategic dams in the study area: the Langat dam in S02, the Semenyih dam in S03 and the Selangor River dam in S05. All the dams are functional for domestic and industrial freshwater supply. The Langat dam, however, is only used as a power supply generator for Langat Valley consumption. A study by Shaaban and Low (2003) showed that drought events reduced water discharge at the Langat and Semenyih basins, particularly in the period of 1993-1998. This event supports the change point found in this analysis. These drought events have decreased the trend of water discharge in the Semenyih basin. Due to the increasing size of natural or artificial dams, the reduction of the streamflow trend was regulated at the Langat River basin as compared to the Semenyih basin.
Streamflow variability due to potential human intervention or climate change is important for regional water supply planning and management. Knowledge of streamflow variability and its trend is crucial for the socio-economic sector because any change in streamflow is a limiting factor for the use of water resources. A decreasing streamflow trend could result in important economic losses and affect health and human welfare, as well as aquatic ecosystems. One of the main aims of time series trend analysis is to characterize the behavior represented by the sequence of observations and to predict future values of the time series variable. The analysis of observed streamflow data for changes and trends can be used to assess the impact of climate change. The streamflow trend can be used to estimate future water availability to maintain and sustain ecosystem functions. Moreover, streamflow trend analysis can also be used to predict changes in river flows when making water withdrawal decisions, which indirectly could improve drought management response.

Low-flow frequency analysis
Frequency analysis focuses on fitting a theoretical probability distribution function (PDF) to the observed data and provides low-flow estimates for any given return period. For each station, the annual minimum streamflow was plotted using all the distributions. The goodness of fit was assessed using the Kolmogorov-Smirnov (KS) test. All the PDFs were ranked for streamflow at each station. Ranks, according to these three goodness-of-fit metrics, showed a significant variation. In the case of annual minimum streamflow, various distributions were found to be the best fit for different stations, namely, gamma, Gumbel, lognormal 2P and Pearson type 3. Figure 4 shows an example probability plot of the mean annual minimum flow for station S01. The estimated parameters are shown in Table 5. Information on the return period of extreme events can be used in risk management for extreme events such as hydrological drought, while the geographical station location and the surrounding environmental factors determine the variation of streamflow. Table 6 shows the best-fit results of the KS test and the p values with their ranking. The purpose of the probability distribution fitting is to represent the low-flow probability most accurately. Across all stations, the lognormal 2P distribution yielded the most best-fit cases, while the Gumbel and gamma distributions yielded the second and third most, respectively. It is therefore proposed that the lognormal 2P distribution be used to predict low-flow discharges for all the rivers under analysis, which can support water quality and quantity management at gauged and ungauged areas. In general, three-parameter probability distribution functions are considered more advantageous for fitting 7 d low-flow sequences.
However, in the Selangor region, a two-parameter distribution is more suitable, as it optimally fits the 7 d mean annual minimum flow, as verified in the studies of Granemann et al. (2018) and da Silva Lelis et al. (2020). Once the best-fit probability distribution of the 7 d low-flow series has been determined, the 7 d low-flow discharge can be estimated for any given return period. It should be noted that this analysis is station dependent. Table 7 shows the return period of low flow at all streamflow stations. The 7 d mean annual minimum for the recurrence interval of 10 years (Table 7) was used in the determination of the minimum storage draft rate for each station.

Table 5. Estimated parameters for the gamma, Gumbel, lognormal 2P and Pearson type 3 distributions.
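The chain from daily flows to a 10-year low-flow estimate can be sketched for the lognormal 2P case as follows. This is a minimal sketch under stated assumptions: the function names and the sample values are hypothetical, the parameters are the closed-form maximum likelihood estimates of the log-transformed flows, and the T-year low flow is taken as the quantile with non-exceedance probability 1/T.

```python
import math
from statistics import NormalDist


def seven_day_minimum(daily_flows, window=7):
    """Lowest moving-average flow over `window` consecutive days."""
    means = [
        sum(daily_flows[i:i + window]) / window
        for i in range(len(daily_flows) - window + 1)
    ]
    return min(means)


def fit_lognormal_2p(sample):
    """Maximum likelihood estimates (mu, sigma) of the log-transformed flows."""
    logs = [math.log(x) for x in sample]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)  # MLE divides by n
    return mu, sigma


def ks_statistic(sample, mu, sigma):
    """Kolmogorov-Smirnov distance between the sample and the fitted CDF."""
    dist = NormalDist(mu, sigma)
    ordered = sorted(sample)
    n = len(ordered)
    d = 0.0
    for i, x in enumerate(ordered, start=1):
        f = dist.cdf(math.log(x))
        # Compare the fitted CDF with the empirical CDF just after and before x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def low_flow_quantile(mu, sigma, return_period):
    """Low flow with non-exceedance probability 1 / return_period."""
    z = NormalDist().inv_cdf(1.0 / return_period)
    return math.exp(mu + sigma * z)


# Hypothetical 7 d annual minimum flows (m3/s), one value per year.
annual_minima = [3.1, 2.4, 4.0, 1.9, 2.8, 3.5, 2.2, 2.9, 3.8, 2.6]
mu, sigma = fit_lognormal_2p(annual_minima)
d_stat = ks_statistic(annual_minima, mu, sigma)
q10 = low_flow_quantile(mu, sigma, 10)  # 10-year 7 d low flow
```

Repeating the fit for each candidate distribution and ranking the stations by the KS distance would reproduce the kind of comparison summarized in Table 6; only the lognormal 2P case is shown here.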

Distribution
Whether a catchment responds slowly or quickly to rainfall intensity, and hence shows prolonged or rapid recession, depends entirely on the catchment's physical characteristics. Low flow in catchments that respond quickly is lower than in those that respond slowly, and low flow in catchments that respond slowly is more persistent than in catchments that respond quickly. These differences demonstrate the significant effect of hydrological processes and storages on low-flow events. Figure 5 displays the relationship between low flow and watershed area as a boxplot. The largest range of low flow per area is in S06, while the smallest range is in S01. The boxplot provides information about the shape of a dataset: S01, S02 and S04 are skewed right; S03, S05 and S06 are symmetric; and S07 is skewed left. From the discussion above, it is clear that the natural elements that affect a river's low-flow regime consist of distribution and hydraulic components, climate and topography.

Estimation of the minimum storage draft rate
This study focused on the minimum surface water storage required based on the records from the hydrological stations in the state of Selangor for the 1978 to 2017 period. Hydrological drought is a recurring phenomenon of water shortage that incorporates the storage of surface and subsurface water under the effects of climate change and human activity (Schwalm et al., 2017). The water storage required for all stations is based on their respective monthly streamflow discharge. A graph of the cumulative streamflow and draft rate versus the historical timeline is plotted to find the storage required for each station. Figure 6 shows the mass curve analysis used to determine the minimum storage draft rate of each station, which needs to be maintained at a draft rate of 50 % of the mean annual flow during low flows to sustain the water supply. The minimum storage required to maintain the draft rate is 21.51 m³ s⁻¹ in October for S01, 13.37 m³ s⁻¹ in December for S02 and 4.79 m³ s⁻¹ in December for S03. The minimum storage required for S04 is 2.32 m³ s⁻¹ in October for a 40-year duration period, and for S05 it is 15.00 m³ s⁻¹ in September. The minimum storage required to maintain the draft rate for S06 is 10.90 m³ s⁻¹ in October, and lastly, for S07 it is 6.17 m³ s⁻¹ in September. The result shows that the water storage for all stations did not meet the corresponding water required, while stations S05 and S07 correspond to the required expectation for August to October. This result reveals that the period of September to December is a critical duration for river water storage to sustain water availability during low flow at a 10-year recurrence interval. This finding is consistent with the location of the state of Selangor on the western coast of Peninsular Malaysia, which is affected by two main monsoon seasons and two inter-monsoon seasons, with October and January being relatively dry months (Hazir et al., 2020).
However, there is not enough water storage starting September for stations S05 and S07.
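The mass curve reasoning can be sketched numerically with the sequent-peak form of the same calculation, which tracks the running shortfall between a constant draft and the inflow instead of reading it off a plotted curve. This is a swapped-in equivalent of the graphical method, not the procedure used to produce Figure 6; the function name and the monthly flows are hypothetical, and storage is expressed in the same flow units accumulated per time step.

```python
def required_storage(inflows, draft_fraction=0.5):
    """Return (draft, storage, index) for a constant draft.

    draft   : draft_fraction times the mean inflow
    storage : largest accumulated shortfall of inflow below the draft
    index   : position in `inflows` where that shortfall peaks (None if never)
    """
    mean_flow = sum(inflows) / len(inflows)
    draft = draft_fraction * mean_flow
    deficit = 0.0
    storage = 0.0
    peak_index = None
    for idx, q in enumerate(inflows):
        # The shortfall grows when inflow is below the draft and is refilled
        # (never below zero) when inflow exceeds it.
        deficit = max(0.0, deficit + draft - q)
        if deficit > storage:
            storage, peak_index = deficit, idx
    return draft, storage, peak_index


# Hypothetical monthly mean flows (m3/s) with a two-month dry spell.
monthly_flows = [10, 10, 2, 2, 10, 10, 10, 10]
draft, storage, peak_index = required_storage(monthly_flows)
```

For these values the draft is 4.0, the required storage is 4.0 and the shortfall peaks at index 3, that is, at the end of the dry spell. In a monthly record, the calendar month of the peak index indicates when storage is most critical.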
Low-flow and surface water storage assessment is a critical issue for understanding the global water cycle, which is recognized to be of significant importance on a regional and global scale for the monitoring of water resources. Correspondingly, this analysis provides important scientific data on the minimum storage required for river systems. Sufficient water storage during critical dry periods is largely dependent on the adequacy and efficiency of water supplies from surface water resources. This surface water storage faces many challenges which could lead to a decrease in their optimum yields and eventually result in inadequate supply of water over the next 10 years. This could be due to reasons such as increasing water demand due to increasing population and industry needs and emerging demands for recreation and the conservation of the quality of stream water, biodiversity and aquatic ecosystems.
Note that the 10-year low-flow return period will be used in the determination of the minimum storage draft rate.

Hydrological-drought characteristic analysis
The threshold level values for the Q percentiles obtained from the flow duration curve are shown in Table 8. In this study, only Q90 was used as the threshold level in the determination of drought events. The percentage of time during which the streamflow rate was below the average level is shown, and the respective days were recorded to show the severity of drought events at each station. The growing perception of hydrological-drought improvement on a global scale has some necessary implications for water management. It is recognized, for example, that the duration and the deficit volume of a drought are associated (Fleig et al., 2006). Figures 7 to 10 show the drought characteristics below the threshold level (Q90), with the minor droughts for each station in the Selangor region removed. Station S01 has 39 episodes of drought events in 40 years. This station also recorded 1593 d of drought, with a total deficit of 10 299.97 m³ s⁻¹. The lowest deficit was recorded in 1994 at 41.53 m³ s⁻¹, while the highest deficit was recorded in 1986 at 666.58 m³ s⁻¹. The average water deficit was 264.10 m³ s⁻¹. This river was affected by the water rationing that took place in Selangor in early 2014 for 3 to 4 months. The most prolonged individual drought was recorded in 2014 at 112 d, from 5 March to 24 June. The shortest single drought lasted 15 d, which was recorded three times in 2004 and 2005. Station S02 is part of the Langat River basin and has had 29 episodes of drought events in 40 years. The total duration of the drought events was 1261 d out of the 14 610 d of total observation, which is only 8.63 % of the entire record period, below the threshold level Q90 = 2.99 m³ s⁻¹. The overall deficit for this station was 2340 m³ s⁻¹, with an average of 80.70 m³ s⁻¹. The lowest deficit was in 1993 at 34.44 m³ s⁻¹, while the highest deficit was recorded in 1986 with 179.73 m³ s⁻¹.
The overall total deficit was 1.57 % of the total water flow. The threshold level of S03 was 1.47 m³ s⁻¹ at an average level, with 12 episodes of drought events. The total duration of drought was 1577 d, which was 10.79 % of the overall record of observation. S03 has the lowest recorded value of the total number and series of drought events among all stations. However, S03 also recorded a long period of drought for individual events. The longest single drought took place in 1998, lasting 241 d from 24 February to 22 October. S03 also recorded the lowest deficit amount amongst all stations, with 1660 m³ s⁻¹ during the period of drought. This total was 2.2 % of the total water flow through this station, which was 75 562 m³ s⁻¹. The highest deficit was recorded in 1998, with a total of 226 m³ s⁻¹ over 241 d. The lowest deficit was recorded in the dry season in 1997, with only 21.57 m³ s⁻¹ within 20 d. Station S04 has 28 episodes of drought in 40 years of records. The most prolonged individual and annual drought was recorded in 2004 at 306 d. The shortest period was 15 d in 1999. The number of drought events exceeds the number of drought years because repeated events occurred 18 times, with a maximum of four events in 1 year. The total duration of drought was 1460 d, which is 9.99 % of the total daily flow data. The overall deficit of the 28 drought events was 673.54 m³ s⁻¹. The lowest total deficit was recorded in 1983 at 7 m³ s⁻¹, while the highest deficit was recorded in 2004 with 131.27 m³ s⁻¹. The average total deficit was 24.06 m³ s⁻¹.
Station S05 has been categorized as the most critical station, with the highest number of days of drought events. The longest annual drought was recorded in 1998 with 217 d, and the longest individual drought event occurred in 1999 with a period of 111 d. Using the threshold level Q90 = 21.52 m³ s⁻¹, 1236 d (10 % of the total) are below the threshold level and categorized as drought. Repeated drought events were recorded in 1978, 1979, 1986, 1987, 1990, 1998, 2000 and 2002. Drought episodes were most repetitive in 1998, with four events in that year. The total deficit magnitude of the river during the drought occurrences is 18 695.45 m³ s⁻¹. The value of the minimum storage rate at 67.36 m³ s⁻¹ exceeds the low-flow rate of 35.61 m³ s⁻¹ that will occur at a return period of 50 years. Station S06 shows drought episodes that were seen in succession in 1978, 1983, 1985, 1987, 1990, 1991, 1992, 1998, 1999, 2001, 2002, 2005. The total number of drought days at this station was 1614 d, which was 11.05 % of the total days. S07 recorded a deficit of 21 740 m³ s⁻¹ during the drought episodes, and this percentage is the highest recorded compared to other streamflow stations. This stream recorded a high deficit amount with fewer drought days. The highest deficit reached was 1445 m³ s⁻¹, which was recorded in the drought events in 1990, while the lowest deficit was in 1983 with a total of 161.32 m³ s⁻¹. From the results, S01 exhibits the highest number of drought events, at 39 episodes, with the mean deficit being 264.10 m³ s⁻¹. This station is located downstream of the Langat basin. This indicates that the downstream catchment has more drought episodes than the upstream catchment.
Magnitudes differ significantly between catchments because of their varied hydrological characteristics, such as the spatial distribution of stations, precipitation and temperature magnitudes, and the frequency of extreme events like drought.
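The way drought episodes, durations and deficits are derived from a daily series can be sketched with the threshold level method. This is a minimal sketch under stated assumptions: the function names are hypothetical, the pooling gap `max_gap` and the minimum duration `min_duration` are illustrative values that are not given in this section, pooled events keep the full span from first to last day, and the moving-average smoothing mentioned in the study is not shown.

```python
def q_threshold(flows, exceedance_pct=90):
    """Flow that is equalled or exceeded `exceedance_pct` percent of the time."""
    ordered = sorted(flows, reverse=True)
    # Ceiling division with integers avoids floating-point rank errors.
    rank = max(1, -(-exceedance_pct * len(ordered) // 100))
    return ordered[rank - 1]


def drought_events(flows, threshold, max_gap=2, min_duration=3):
    """Identify droughts below `threshold`, pool dependent ones, drop minor ones."""
    raw = []  # each entry: [start_day, end_day, deficit]
    start, deficit = None, 0.0
    for day, q in enumerate(flows):
        if q < threshold:
            if start is None:
                start, deficit = day, 0.0
            deficit += threshold - q
        elif start is not None:
            raw.append([start, day - 1, deficit])
            start = None
    if start is not None:
        raw.append([start, len(flows) - 1, deficit])

    pooled = []
    for event in raw:
        # Inter-event criterion: merge events separated by at most max_gap days.
        if pooled and event[0] - pooled[-1][1] - 1 <= max_gap:
            pooled[-1][1] = event[1]
            pooled[-1][2] += event[2]
        else:
            pooled.append(list(event))

    # Minor droughts shorter than min_duration days are removed.
    return [
        {"start": s, "end": e, "duration": e - s + 1, "deficit": d}
        for s, e, d in pooled
        if e - s + 1 >= min_duration
    ]


# Hypothetical daily flows (m3/s) and a fixed threshold for illustration.
daily_flows = [5, 5, 1, 1, 5, 1, 1, 5, 5, 5, 1, 5, 5]
events = drought_events(daily_flows, threshold=3, max_gap=1, min_duration=3)
```

Here the two dry spells on days 2 to 3 and 5 to 6 are pooled into one event of 5 d with a deficit of 8, while the isolated dry day 10 is discarded as a minor drought. Summing the durations and deficits of the returned events gives totals of the kind reported per station above.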
Several indices could be used to provide a more accurate representation of hydrological drought, and the choice of index directly affects the result. It is important to note that the Q90 threshold merely identifies low flows that form part of the catchments' regular flow regime, especially in this study area. Therefore, the Q90 threshold does not necessarily imply a situation where functions in nature are affected. The threshold level can reflect a specific requirement, such as for water supply or minimum environmental flow, or it can represent a normal low-flow condition of the river. For a bigger picture and understanding of the broad spectrum of hydrological drought, more indices need to be combined. Different methods capture different characteristics of hydrological droughts. The threshold level method should be used for more detailed deficits and in-depth study. Complex indices would be most useful to verify results in regional studies. While streamflow changes are mainly influenced by rainfall variability, the occurrence of low-flow conditions is also likely to be a function of catchment response, influenced by catchment storage. There can be a significant variance in the frequency, severity and duration of streamflow depletion between neighboring catchments as a drought develops and subsequently decays. In catchments with low storage, streamflow levels typically drop more rapidly than in catchments that receive a consistent flow from stored sources. However, catchments dependent on stored water become increasingly vulnerable in a prolonged or multiyear drought as depletion in groundwater storage begins to affect baseflow levels. Thus, even after rainfall has returned to normal levels, flows in permeable catchments may still be affected.
Selangor's river flow trend reflects the rainfall pattern, and there is a prompt response to rainfall in general, although the response rate varies from one catchment to another. Some catchments, with little or insignificant storage, have a very rapid response to rainfall and are known as flashy catchments. In other catchments, the increase in runoff resulting from rainfall may not be as extreme, as water goes into storage and then contributes to river flow from storage. The state of Selangor enjoys a tropical rainforest climate with two major monsoon seasons and two inter-monsoon seasons. Due to this, heavy rainfall typically occurs in the form of convective rains, and the state is generally wetter than other parts of Peninsular Malaysia. Drought in Selangor is therefore not a very frequent event. However, it is important not to forget that drought events occurred in the past, in 1986, 1994, 1997, 1998, 2003 and 2004, at all stations. This pattern is consistent with the El Niño events that largely influence climate variability over Malaysia, especially the state of Selangor (Tangang et al., 2012). The drought duration is very closely related to the amount of deficit that occurs. Drought is seen as very severe when it occurs over a long period and the amount of water deficit experienced is high.

Conclusions
This study carried out a streamflow trend analysis on seven stations in the state of Selangor, Malaysia, to quantify the trends over 40 years of recorded data. The result shows that two stations experienced significant decreasing trends, with 55.56 % of relative change within the 40 years. From the mean annual streamflow data, change points are present in 1996-1997 and 2005-2007 at a 100 % confidence interval. This implies that fast-growing industrial activities in the basin and a change in the land use pattern have influenced the streamflow trends in the basin. This finding has important implications for water resource management, which will affect future developments in Selangor. The impact of serial and spatial correlation on the trends needs to be investigated. Further study of streamflow trends needs to be carried out, such as prediction or modeling for the forecasting of streamflow trends.
Low-flow analysis is an essential and widely studied design and management strategy for hydrology and water resources. Varying and complex natural processes may produce low flows in a river on a catchment scale. The second aim of this work was to determine the characteristics of low flow by using frequency analysis. To determine the probability distribution that optimally fits the minimum 7 d low-flow values, the 7 d mean annual minimum streamflow series for each gauge was first computed. Then, four probability distributions, namely the gamma, Gumbel, lognormal 2P and Pearson type 3 (PE3) distributions, were evaluated to determine the distribution that most appropriately fits the low-flow data. The Kolmogorov-Smirnov (KS) test and a ranking method were used to determine the best-fitting distributions. Based on the result, the lognormal 2P distribution provided a good fit to the annual minimum flow data at each station. After the suitable probability distribution was selected, the return values for certain return periods were estimated. The return period of low-flow occurrence is crucial for determining the magnitude and frequency of low flow, and such information is valuable in assessing and mitigating drought hazard in the future. Probability distributions are defined by their parameters; hence, to better understand the theoretical probability distribution method, it is necessary to fully understand the principles underlying parameter estimation for established theoretical frequency distributions. The results indicated that the low flow of rivers in Selangor ranged between 0.75 and 19.47 m³ s⁻¹. The 7 d mean annual minimum for the recurrence interval of 10 years was used in the determination of the minimum storage draft rate for each station.
The draft rate of low flow at the recurrence interval of 10 years, obtained from low-flow frequency analysis using lognormal 2P, was used to ensure the minimum storage draft rate required to sustain the water demand during low-flow periods. The restructuring of the minimum storage draft rate must be carried out by hydrologists at a particular return period to ensure that enough water is available at the streamflow gauging station to be supplied to users during the low-flow and drought periods. Based on the analysis, the estimated minimum storage draft rates for each station cannot meet the water demand during low flow at the specific return period, which is a 10-year recurrence interval in this research. This result reveals that September to December is a critical period for river water storage to sustain water availability during low flow at a 10-year recurrence interval. The storage of river water faces several problems that may lead to a decrease in its sustainable yields and even to an inadequate supply of freshwater over the next 10 years.
Hydrological drought is a phenomenon of water shortage when the water supply is below the average level. This study developed a sound principle using threshold level methods to describe the characteristics of streamflow droughts. However, the threshold selection should be further analyzed because it is not clear if Q90 should be used as a representative threshold for rivers in a tropical climate. From this study, we can make the following conclusions:
1. The threshold level using the Q percentile based on the flow duration curve was used as an average level to separate drought events from non-drought conditions. The number of days and duration of droughts for a station can show the severity of the drought that occurs.
2. The drought characteristics were analyzed from the time series below a threshold level (Q90) after removing minor droughts. The magnitude and duration of droughts were determined from the difference between the time series and the threshold level value.
3. The highest number of drought events was 39 episodes, with a mean deficit volume of 557.46 m³ s⁻¹, while the lowest number of drought events was 10 episodes, with a mean deficit volume of 127.71 m³ s⁻¹.
4. Drought in Selangor is not a very frequent event. However, several notable droughts occurred in Selangor in 1986, 1994, 1997, 1998, 2003 and 2004 at all stations.
This research is essential to water resource management. Low-flow analysis and water availability assessment enable water resource managers to make more realistic decisions on water restrictions and provisions for cities and populations. Understanding the concept of low flow and the predictive significance of the river minimum storage draft rate required can also help in managing a sustainable water catchment. This study also helps in emphasizing the natural flow of water to provide water supply for continuous use during low flow. Additionally, through this research, the concept of low-flow analysis, hydrological drought using a threshold level and the predictive significance of the minimum storage draft rate can be developed to produce more efficient water resource management systems during the dry season in Selangor, Malaysia.
Data availability. Data can be made available by the authors upon request. The raw streamflow data is the property of the Department of Irrigation and Drainage (DID) of Malaysia and cannot be shared publicly without prior permission.
Author contributions. The study and methodology were conceived by all authors. HHH carried out the analyses, produced the results, and wrote the paper under the supervision of SFMR and NSM. HHH prepared the original draft with contributions from all authors. SFMR was responsible for funding acquisition and supervised the project. HHH, SFMR, NSM and FMH all contributed by generating ideas, discussing results and editing the paper.
Competing interests. The authors declare that they have no conflict of interest.
Special issue statement. This article is part of the special issue "Recent advances in drought and water scarcity monitoring, modelling, and forecasting (EGU2019, session HS4.1.1/NH1.31)". It is a result of the European Geosciences Union General Assembly 2019, Vienna, Austria, 7-12 April 2019.