Forecasting Vegetation Condition with a Bayesian Autoregressive Distributed Lags (BARDL) Model
 ^{1}The Data Intensive Science Centre, Department of Physics and Astronomy, University of Sussex, Brighton BN1 9QH, UK
 ^{2}Astronomy Centre, Department of Physics and Astronomy, University of Sussex, Brighton BN1 9QH, UK
 ^{3}Department of Geography, School of Global Studies, University of Sussex, Brighton BN1 9QJ, UK
 ^{4}Sackler Centre for Consciousness Science, Department of Informatics, University of Sussex, Brighton BN1 9QJ, UK
Abstract. Droughts form a large part of climate/weather-related disasters reported globally. In Africa, pastoralists living in the Arid and Semi-Arid Lands (ASALs) are the worst affected. Prolonged dry spells that cause vegetation stress in these regions have resulted in the loss of income and livelihoods. To curb this, global initiatives like the Paris Agreement and the United Nations recognised the need to establish Early Warning Systems (EWS) to save lives and livelihoods. Existing EWS use a combination of Satellite Earth Observation (EO) based biophysical indicators like the Vegetation Condition Index (VCI) and socio-economic factors to measure and monitor droughts. Most of these EWS rely on expert knowledge in estimating upcoming drought conditions without using forecast models. Recent research has shown that the use of robust algorithms like Auto-Regression, Gaussian Processes and Artificial Neural Networks can provide very skilled models for forecasting vegetation condition at short to medium range lead times. However, to enable preparedness for early action, forecasts with a longer lead time are needed. The objective of this research work is to develop models that forecast vegetation conditions at longer lead times on the premise that vegetation condition is controlled by factors like precipitation and soil moisture. To achieve this, we used a Bayesian Auto-Regressive Distributed Lag (BARDL) modelling approach which enabled us to factor in lagged information from precipitation and soil moisture levels into our VCI forecast model. The results showed a ∼2-week gain in the forecast range compared to the univariate AR model used as a baseline. The R^{2} scores for the Bayesian ARDL model were 0.94, 0.85 and 0.74, compared to the AR model's R^{2} of 0.88, 0.77 and 0.65 for 6, 8 and 10 weeks lead time respectively.
Edward E. Salakpi et al.
Status: final response (author comments only)

RC1: 'Comment on nhess2021223', Anonymous Referee #1, 01 Nov 2021
General comments
The paper presents a study aiming at forecasting vegetation conditions in arid and semi-arid environments, with up to 10 weeks lead time, in order to improve the management of droughts and anticipate their socio-economic impacts. The study uses different satellite products (NDVI, rainfall and soil moisture) to build statistical models for forecasting the vegetation index. The study builds on a previous paper by the same team (Barrett et al., Remote Sensing of Environment, 2020, https://doi.org/10.1016/j.rse.2020.111886), which used vegetation indices only, and extends it using rainfall and soil moisture remote sensing data. Furthermore, the study uses a Bayesian framework for parameter estimation, which allows the uncertainty of the forecast to be determined.
The paper is well written and well structured, and the results are analyzed comprehensively. The conclusions appear well founded and show that the proposed methodology provides a significant improvement over Barrett et al. (2020), in particular in terms of lead time. Some elements of the paper could, however, be improved. Results are compared to a benchmark model that should be better described. Some hypotheses of the work could lead to uncertainties in the results and should be discussed (use of one image from 2016 to identify grasslands and shrublands, gap filling of data). Some details about the methods, for non-specialists in machine-learning techniques, and about how the figures were built are missing, which hinders understanding of their meaning. Provided the authors address these minor comments, the paper will be suitable for publication in Natural Hazards and Earth System Sciences.
Specific comments
1/ P.2 lines 62-63: revise the sentence that is not correct.
2/ p. 5, line 91: you use Sentinel-2 data from year 2016 to identify grassland and shrubland pixels, but are you sure that this image is representative of the whole 2001-2018 study period? It is likely that land use changes over an 18-year period, so what would be the impact of errors in the grassland and shrubland pixels on the forecasting results?
3/ p.5 lines 100-101: could you elaborate more on the gap-filling method: how does it work and how could the gap filling impact the results of the forecasting model? What is the percentage of gap-filled data?
4/ p.7 lines 122-123: could you elaborate a little more on the method used to assess the forecast probability distribution?
5/ p. 8 line 154: incomplete sentence?
6/ p.9 lines 171-172: In order to assess the validity and robustness of a forecasting model, it is recommended to use different data for model calibration and evaluation. Could you explain in more detail how you proceed with the model evaluation and whether the data used for the evaluation are independent from the ones used to calibrate it?
7/ p.9 line 183: a figure explaining the computation of MPIW and PICP could be useful.
8/ p.9 lines 192-194: Could the authors provide more details about the AR benchmark model? I understood that it was built using only the vegetation index, but was a Bayesian framework also used for parameter estimation? If not, could the performance of the AR model be improved if a Bayesian framework was used for parameter estimation?
9/ p.10 lines 203-205: for non-specialists (possibly in an appendix or supplementary materials) explain how the reliability diagram and sharpness are built.
10/ p.11 Fig. 3: explain in the figure caption how the figure is built: what are the ellipses on the figures?
11/ p.15 Fig. 6: what is AUC in the caption and on the figures? The lines for the two models have the same colours and types, so it is not easy to understand which curve relates to which model. Could you also explain how this curve was built?
12/ p.16 Fig. 7: same remark as for Fig. 6: explain how the figures are built and what information they carry.
12/ p.16 lines 253-256: I do not understand these explanations.
13/ p. 17 lines 274-276: the result mentioned here was not shown before. It can be seen in the appendix but this should be mentioned.
14/ p.18 line 280: figure ?? – please modify.
15/ p.17-18: a discussion of the limitations and impact of the gap filling and of the choice of the 2016 Sentinel-2 image to identify grassland and shrubland should be added.
16/ Fig. B1: Explain what is shown in these figures. They are not understandable with the current caption.
 AC1: 'Reply on RC1', Edward Salakpi, 02 Nov 2021

AC2: 'Reply on RC1', Edward Salakpi, 23 Feb 2022
General comments
The paper presents a study aiming at forecasting vegetation conditions in arid and semi-arid environments, with up to 10 weeks lead time, in order to improve the management of droughts and anticipate their socio-economic impacts. The study uses different satellite products (NDVI, rainfall and soil moisture) to build statistical models for forecasting the vegetation index. The study builds on a previous paper by the same team (Barrett et al., Remote Sensing of Environment, 2020, https://doi.org/10.1016/j.rse.2020.111886), which used vegetation indices only, and extends it using rainfall and soil moisture remote sensing data. Furthermore, the study uses a Bayesian framework for parameter estimation, which allows the uncertainty of the forecast to be determined.
The paper is well written and well structured, and the results are analyzed comprehensively. The conclusions appear well founded and show that the proposed methodology provides a significant improvement over Barrett et al. (2020), in particular in terms of lead time. Some elements of the paper could, however, be improved. Results are compared to a benchmark model that should be better described. Some hypotheses of the work could lead to uncertainties in the results and should be discussed (use of one image from 2016 to identify grasslands and shrublands, gap filling of data). Some details about the methods, for non-specialists in machine-learning techniques, and about how the figures were built are missing, which hinders understanding of their meaning. Provided the authors address these minor comments, the paper will be suitable for publication in Natural Hazards and Earth System Sciences.
Specific comments
1/ P.2 lines 62-63: revise the sentence that is not correct.
Response: Comment accepted; the sentence has been fixed.
2/ p. 5, line 91: you use Sentinel-2 data from year 2016 to identify grassland and shrubland pixels, but are you sure that this image is representative of the whole 2001-2018 study period? It is likely that land use changes over an 18-year period, so what would be the impact of errors in the grassland and shrubland pixels on the forecasting results?
Response: Comment accepted; this has been added as a limitation in the discussion section. An example of the impact on the forecast in the case of a significant change in land cover over the period will be added as well.
3/ p.5 lines 100-101: could you elaborate more on the gap-filling method: how does it work and how could the gap filling impact the results of the forecasting model? What is the percentage of gap-filled data?
Response: Comment accepted; more details will be added on the gap-filling method. However, as seen in the Barrett et al. 2019 paper, the gap-filling method did not significantly impact forecasts.
4/ p.7 lines 122-123: could you elaborate a little more on the method used to assess the forecast probability distribution?
Response: Comment accepted and addressed.
5/ p. 8 line 154: incomplete sentence?
Response: Sentence fixed.
6/ p.9 lines 171-172: In order to assess the validity and robustness of a forecasting model, it is recommended to use different data for model calibration and evaluation. Could you explain in more detail how you proceed with the model evaluation and whether the data used for the evaluation are independent from the ones used to calibrate it?
Response: It was mentioned in line 172 that a rolling window approach was used. To make this clearer, more details, or a walkthrough of how the data were split into training and test (held-out) sets for model training and evaluation, will be added.
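A rolling window evaluation of the kind mentioned in this response can be illustrated with a minimal sketch. It assumes a fixed-length training window that slides forward one step at a time, with a single held-out target `horizon` steps after the end of each window. The function name `rolling_origin_splits` and all parameter values are hypothetical and are not taken from the paper.

```python
def rolling_origin_splits(n_obs, train_size, horizon, step=1):
    """Yield (train_indices, test_index) pairs for a sliding training window.

    Assumption: the held-out observation lies `horizon` steps after the last
    training observation, so it never overlaps the training window.
    """
    start = 0
    while start + train_size + horizon <= n_obs:
        train_idx = list(range(start, start + train_size))
        test_idx = start + train_size + horizon - 1
        yield train_idx, test_idx
        start += step


# Example: 10 weekly observations, 6-week training window, 2-week lead time.
splits = list(rolling_origin_splits(n_obs=10, train_size=6, horizon=2))
for train_idx, test_idx in splits:
    print(f"train weeks {train_idx[0]}-{train_idx[-1]} -> test week {test_idx}")
```

Because every test index is strictly later than its training window, the model is always scored on data it has not seen during fitting.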
7/ p.9 line 183: a figure explaining the computation of MPIW and PICP could be useful.
Response: A detailed figure of the MPIW and PICP can be found in the cited paper. However, a simple version will be added to this paper as suggested.
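For readers unfamiliar with the two interval metrics, a minimal sketch follows. It assumes the usual definitions: PICP (prediction interval coverage probability) is the fraction of observations that fall inside their prediction interval, and MPIW (mean prediction interval width) is the average width of those intervals. The function name and the example values are illustrative only.

```python
import numpy as np


def picp_mpiw(y_obs, lower, upper):
    """Return (PICP, MPIW) for observations and their interval bounds."""
    y_obs, lower, upper = (np.asarray(a, dtype=float) for a in (y_obs, lower, upper))
    inside = (y_obs >= lower) & (y_obs <= upper)  # True where the interval covers the observation
    picp = float(inside.mean())                   # coverage fraction
    mpiw = float((upper - lower).mean())          # average interval width
    return picp, mpiw


# Illustrative values only (not data from the paper).
y_obs = np.array([30.0, 42.0, 55.0, 61.0])
lower = np.array([25.0, 45.0, 50.0, 50.0])
upper = np.array([35.0, 55.0, 60.0, 60.0])
picp, mpiw = picp_mpiw(y_obs, lower, upper)
```

A well-calibrated 95 % interval should give a PICP close to 0.95, and among forecasts with similar coverage a smaller MPIW indicates a sharper forecast.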
8/ p.9 lines 192-194: Could the authors provide more details about the AR benchmark model? I understood that it was built using only the vegetation index, but was a Bayesian framework also used for parameter estimation? If not, could the performance of the AR model be improved if a Bayesian framework was used for parameter estimation?
Response: A description of the AR method will be added to the paper. The AR method used in the paper was not parameterised with the Bayesian approach. A Bayesian method might not significantly improve parameter estimation, but it would add a probabilistic interpretation to the AR results. The improvement in model performance was mainly attributed to the additional variables. The Bayesian approach was used to give a straightforward way to assess forecast uncertainty.
9/ p.10 lines 203-205: for non-specialists (possibly in an appendix or supplementary materials) explain how the reliability diagram and sharpness are built.
Response: Comment accepted and details have been added to the paper.
10/ p.11 Fig. 3: explain in the figure caption how the figure is built: what are the ellipses on the figures?
Response: Comment accepted; more details on the ellipses, which are the joint distribution bins of the scatter plot between the forecasted and observed VCI3M values, will be added to the paper.
11/ p.15 Fig. 6: what is AUC in the caption and on the figures? The lines for the two models have the same colours and types, so it is not easy to understand which curve relates to which model. Could you also explain how this curve was built?
Response: The curves in the figure are not of the same type: the AR curves are dotted and the BARDL curves are solid (as explained in the caption). The line colours indicate the lead times (i.e. 6, 8, 10 and 12 weeks ahead), which are the same for both models, so the two methods being compared in the plot cannot have different colours. More details on the AUC will be added.
12/ p.16 Fig. 7: same remark as for Fig. 6: explain how the figures are built and what information they carry.
Response: Comment accepted; details will be added.
12/ p.16 lines 253-256: I do not understand these explanations.
Response: Comment accepted and fixed.
13/ p. 17 lines 274-276: the result mentioned here was not shown before. It can be seen in the appendix but this should be mentioned.
Response: Comment accepted and well noted; it will be addressed.
14/ p.18 line 280: figure ?? – please modify.
Response: Comment accepted and fixed.
15/ p.17-18: a discussion of the limitations and impact of the gap filling and of the choice of the 2016 Sentinel-2 image to identify grassland and shrubland should be added.
Response: Comment accepted; the limitation has been acknowledged and added.
16/ Fig. B1: Explain what is shown in these figures. They are not understandable with the current caption.
Response: Comment well noted. The caption has been updated.

RC2: 'Comment on nhess2021223', Anonymous Referee #2, 13 Jan 2022
This paper proposes a new model for forecasting the vegetation condition index (VCI) based on a Bayesian autoregressive distributed lag (BARDL) model. The new model can provide the probability distribution of VCI instead of a deterministic value. In a forecasting framework, it is clear that the BARDL model can improve on current methods, as supplying a probability distribution is crucial for decision making. The BARDL model is applied to a set of counties in Kenya with arid and semi-arid conditions. VCI is forecasted from the available information about precipitation and soil moisture content, considering the current information about drought conditions. The new BARDL model is compared with the results obtained by using a deterministic AR model. The comparison is based on a set of measures that quantify both accuracy and precision. The paper offers a new method that can overcome some limitations of the current models to forecast droughts. However, the paper needs to address the comments included below before it can be accepted for publication.
General comments
 The paper uses the vegetation condition index (VCI) to forecast droughts in Kenya. However, other indices are available, such as SPI, SPEI, PDSI, the multivariate standardised drought index (MSDI), the temperature condition index (TCI), the vegetation temperature condition index (VTCI), and the temperature vegetation dryness index (TVDI), among others. A discussion could be included in the paper to support the selection of VCI.
 The Introduction Section focuses on three existing techniques to forecast VCI: Auto-Regression, Gaussian Processes and Artificial Neural Networks. A longer review of the techniques used in recent years to develop EWS for droughts could be included in this section, as well as other papers that develop similar tools. For example, stochastic algorithms based on different types of Markov Chains, autoregressive moving-average (ARMA) and autoregressive integrated moving average (ARIMA) techniques, support vector machines, Kalman filters and multiple regression tree techniques, among others, have been used in recent years to forecast droughts.
 While the BARDL algorithm supplies a probability distribution, the AR model supplies a deterministic value. Therefore, the comparison between the two models is not straightforward. In the paper, a confidence interval for the AR model is estimated from the RMSE and z-score. However, this is a simplified way to estimate the prediction uncertainty, supplying a constant confidence interval regardless of the magnitude of both VCI and the explanatory variables. This step is very important for comparing BARDL results with AR results in a proper way. In addition, the methodology used to compare the two models should be clarified in the paper, as it is not clear how most of the measures used to quantify accuracy and precision have been applied to the probabilistic forecast supplied by BARDL.
 The Discussion Section should be rewritten, as in its current form it is mostly a mixture of conclusions with some additional results considering seasonality.
 The Conclusions Section could be extended to summarise the main findings of the study.
Specific comments:
 Abstract: Some sentences could be included in the abstract about the case study used in the paper.
 14: The acronym AR has not been introduced in the paper at this point yet.
 30: The acronym USAID is not introduced in the paper and could be explained at this point.
 46: The ARDL model has been applied to assess droughts previously, for example by Zhu et al. (2018). References to previous studies in which the ARDL technique is applied to droughts should be included in the paper.
 51: The paper proposes the use of a Bayesian framework in the ARDL model to incorporate the prior knowledge about model parameters in the analysis, obtaining a probability distribution for VCI results. Bayesian networks have also been applied to develop a long-term drought forecast (Shin et al., 2019), supplying probabilistic results that can assess forecast uncertainties. A discussion could be included in the paper, stating the benefits of a BARDL model compared to Bayesian networks.
 Section 2.1: Some information about the number of counties considered in the study could be included in this section, as well as the number of counties that are arid and semi-arid. In addition, some information about the area in km2 that is considered in the study could be useful for the reader.
 70: ‘estimates’ should be changed to ‘estimate’.
 98-99: The description of NVIi and NDVIi variables should be included in this paragraph too.
 103-104: ‘long term’ should be changed to ‘long-term’.
 111: The acronym AR has been introduced in the paper above.
 118: A discussion could be included about the selection of the OLS method for estimating parameters of ARDL. Some other methods are also available.
 131 – Eq. 3: The variable subscripts should be revised in Eq. 3. D_{t-q} seems to be the drought indicator at a constant time step t-q, which seems to be constant in the first summation regardless of the value of i. Similarly, P_{t-p} and S_{t-p} seem to be constant values in the summations. In addition, the regression coefficients are also constant values in the summation, though they could change in terms of i. A discussion should be included about the use of constant values in summations.
 137 – Eq. 4: How does X_{t-i} represent several variables? How can i vary from 0 to i?
 143-146: The variable theta should be explained to readers in this paragraph.
 145-146: The term P(X_t) is ignored because it is difficult to compute. This is not a proper statement for ignoring a variable in a research paper.
 152-154: An analysis should be done to fix the distribution function that best characterises the regression parameters. Why is mu set to 0 and sigma to 0.5?
 153-154: Something is missing in this sentence.
 164: This is not the standard form of AIC.
 161-163: Some figures could be included in the paper to show how a time lag of 6 weeks obtains the best AIC and R^{2} results.
 168: What is i? What is y hat?
 176: The R^{2} measure of Eq. 9 is not a good measure to quantify the accuracy of forecasts.
 188-189: What is m?
 196: ‘inputs’ should be changed to ‘input’.
 213: How are r, R^{2} and RMSE calculated for the BARDL model? The BARDL model supplies a probability distribution, but observations are deterministic.
 214: R^{2} is not a good measure of forecast accuracy. RMSE is more adequate than R^{2}. Therefore, the gain in performance metrics could be assessed with RMSE. However, the BARDL model supplies a probability distribution of VCI. How do you obtain an RMSE value from the comparison between probability distributions and deterministic values of observations?
 Figure 3: What do the coloured lines mean?
 222-224: The R^{2} values do not correspond with the values shown in Table 2.
 Table 2: This table could be summarised in a figure.
 229-231: The table in Appendix A could be summarised in a figure and included in the main text of the paper, in order to analyse the comparison between the two models. The results included in Table 1 show that PICP values are smaller for AR than for BARDL, meaning that a greater number of observations are out of the confidence intervals for BARDL. This result should be discussed in the paper. In addition, most of the PICP values for the BARDL model are smaller than 94-96 %, in contrast to the statement of line 229.
 Figure 5: Please use the same y-axis scale in each row to compare the AR and BARDL results. The dashed line of the left column differs from the dashed line of the right column, though observations do not change. The green line represents the forecast. What is such a forecast for the BARDL model given that it supplies a probability distribution?
 235: A drought is forecasted when VCI3M values are smaller than 35. This is straightforward for the AR model, as it is deterministic. However, how do you apply this criterion to the BARDL outputs considering probability distributions?
 251-253: The BARDL lines lie above the main diagonal of the reliability diagram. This means that the probabilities supplied by the BARDL model tend to underestimate droughts. A comment about this point should be included in the paper.
 253-256: The sharpness diagrams are mostly flat for 10 and 12 weeks. The low values close to 1 mean that the BARDL model is not able to forecast droughts. Therefore, the BARDL model is useful to forecast droughts 6 weeks ahead but not 10 and 12 weeks ahead. A comment about this point should be included in the paper.
 Figure 7: The sharpness diagram should plot percentages on the y-axis.
 266-276: These two paragraphs could be moved to the Conclusions Section.
 273-274: The Authors state that the BARDL model gains 2 weeks based on the results of R^{2}. However, more measures should be taken into account to support such a statement.
 275-276: This statement is not clear from the results included in Section 4.
 280: The number of the figure is missing.
 284-295: This paragraph, with the figures of Appendices C to F, could be extended to form a new section 4.6 devoted to the seasonality analysis.
References
Zhu N., Xu J., Li W., Li K., Zhou C. (2018) A Comprehensive Approach to Assess the Hydrological Drought of Inland River Basin in Northwest China. Atmosphere, 9(10): 370. https://doi.org/10.3390/atmos9100370
Shin J.Y., Kwon H.H., Lee J.H., Kim T.W. (2020) Probabilistic long-term hydrological drought forecast using Bayesian networks and drought propagation. Meteorological Applications, 27: e1827. https://doi.org/10.1002/met.1827

AC3: 'Reply on RC2', Edward Salakpi, 23 Feb 2022
Reviewer 2
This paper proposes a new model for forecasting the vegetation condition index (VCI) based on a Bayesian autoregressive distributed lag (BARDL) model. The new model can provide the probability distribution of VCI instead of a deterministic value. In a forecasting framework, it is clear that the BARDL model can improve on current methods, as supplying a probability distribution is crucial for decision making. The BARDL model is applied to a set of counties in Kenya with arid and semi-arid conditions. VCI is forecasted from the available information about precipitation and soil moisture content, considering the current information about drought conditions. The new BARDL model is compared with the results obtained by using a deterministic AR model. The comparison is based on a set of measures that quantify both accuracy and precision. The paper offers a new method that can overcome some limitations of the current models to forecast droughts. However, the paper needs to address the comments included below before it can be accepted for publication.
General comments
The paper uses the vegetation condition index (VCI) to forecast droughts in Kenya. However, other indices are available, such as SPI, SPEI, PDSI, the multivariate standardised drought index (MSDI), the temperature condition index (TCI), the vegetation temperature condition index (VTCI), and the temperature vegetation dryness index (TVDI), among others. A discussion could be included in the paper to support the selection of VCI.
Response: The work in this paper was done in partnership with the National Drought Management Authority (NDMA) in Kenya, which currently uses VCI for monitoring drought. The NDMA has used the indicator extensively in its monthly drought reports and bulletins. Since our aim was to introduce a forecast model as additional information for the bulletins, we did not want to propose a new index.
The Introduction Section focuses on three existing techniques to forecast VCI: Auto-Regression, Gaussian Processes and Artificial Neural Networks. A longer review of the techniques used in recent years to develop EWS for droughts could be included in this section, as well as other papers that develop similar tools. For example, stochastic algorithms based on different types of Markov Chains, autoregressive moving-average (ARMA) and autoregressive integrated moving average (ARIMA) techniques, support vector machines, Kalman filters and multiple regression tree techniques, among others, have been used in recent years to forecast droughts.
Response: Comment well noted; the section on existing works that use similar tools will be updated with cited papers.
While the BARDL algorithm supplies a probability distribution, the AR model supplies a deterministic value. Therefore, the comparison between the two models is not straightforward. In the paper, a confidence interval for the AR model is estimated from the RMSE and z-score. However, this is a simplified way to estimate the prediction uncertainty, supplying a constant confidence interval regardless of the magnitude of both VCI and the explanatory variables. This step is very important for comparing BARDL results with AR results in a proper way. In addition, the methodology used to compare the two models should be clarified in the paper, as it is not clear how most of the measures used to quantify accuracy and precision have been applied to the probabilistic forecast supplied by BARDL.
Response: Thank you for this comment. The results from the AR model were compared to the means of the forecast distributions obtained from the Bayesian model. We realise this explanation is missing from the paper; this will be addressed, together with the additional comments.
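The difference between the two kinds of interval discussed in this exchange can be sketched as follows. The first function builds a constant-width interval around a deterministic forecast from an RMSE and a z-score, as the referee describes for the AR model; the second summarises a set of forecast samples by its mean and empirical quantiles, which is one common way to handle a Bayesian forecast distribution. Function names, the 95 % level and the synthetic samples are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from statistics import NormalDist


def constant_interval(point_forecast, rmse, level=0.95):
    """Deterministic forecast +/- z * RMSE: the width is the same for every forecast."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95 % level
    point_forecast = np.asarray(point_forecast, dtype=float)
    return point_forecast - z * rmse, point_forecast + z * rmse


def sample_interval(samples, level=0.95):
    """Mean and central quantile interval per forecast (rows = samples, columns = forecasts)."""
    samples = np.asarray(samples, dtype=float)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(samples, tail, axis=0)
    upper = np.quantile(samples, 1.0 - tail, axis=0)
    return samples.mean(axis=0), lower, upper


# Synthetic example: two forecasts whose distributions have different spreads.
rng = np.random.default_rng(0)
samples = rng.normal(loc=[30.0, 50.0], scale=[2.0, 8.0], size=(4000, 2))

ar_lower, ar_upper = constant_interval([30.0, 50.0], rmse=5.0)
mean, lower, upper = sample_interval(samples)
```

In this sketch the RMSE-based interval has the same width for both forecasts, whereas the sample-based interval widens where the forecast distribution is more spread out.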
The Discussion Section should be rewritten, as in its current form it is mostly a mixture of conclusions with some additional results considering seasonality.
The Conclusions Section could be extended to summarise the main findings of the study.
Response: Comment accepted and well noted; the discussion will be restructured to make it more coherent. The conclusion will also be rewritten.
Specific comments:
Abstract: Some sentences could be included in the abstract about the case study used in the paper.
Response: Comment accepted and will be fixed.
14: The acronym AR has not been introduced in the paper at this point yet.
Response: Comment noted and will be fixed.
30: The acronym USAID is not introduced in the paper and could be explained at this point.
Response: Comment noted and will be fixed.
46: The ARDL model has been applied to assess droughts previously, for example by Zhu et al. (2018). References to previous studies in which the ARDL technique is applied to droughts should be included in the paper.
Response: Well noted; this will be considered. However, that paper focused on hydrological droughts in river basins and not on vegetation conditions.
51: The paper proposes the use of a Bayesian framework in the ARDL model to incorporate the prior knowledge about model parameters in the analysis, obtaining a probability distribution for VCI results. Bayesian networks have also been applied to develop a long-term drought forecast (Shin et al., 2019), supplying probabilistic results that can assess forecast uncertainties. A discussion could be included in the paper, stating the benefits of a BARDL model compared to Bayesian networks.
Response: Comment noted and will be considered. However, the results in that paper are also for hydrological drought and are not comparable to agricultural drought indicators.
Section 2.1: Some information about the number of counties considered in the study could be included in this section, as well as the number of counties that are arid and semi-arid. In addition, some information about the area in km2 that is considered in the study could be useful for the reader.
Response: Comment accepted and well noted; more details on this will be included.
70: ‘estimates’ should be changed to ‘estimate’.
Response: Comment well noted and will be fixed.
98-99: The description of NVIi and NDVIi variables should be included in this paragraph too.
Response: Comment well noted and will be fixed.
103-104: ‘long term’ should be changed to ‘long-term’.
Response: Comment well noted and will be fixed.
111: The acronym AR has been introduced in the paper above.
Response: Comment well noted and will be fixed.
118: A discussion could be included about the selection of the OLS method for estimating parameters of ARDL. Some other methods are also available.
Response: The comment is not entirely clear to us, because OLS was not used for the ARDL in this paper.
131 – Eq. 3: The variable subscripts should be revised in Eq. 3. D_{t-q} seems to be the drought indicator at a constant time step t-q, which seems to be constant in the first summation regardless of the value of i. Similarly, P_{t-p} and S_{t-p} seem to be constant values in the summations. In addition, the regression coefficients are also constant values in the summation, though they could change in terms of i. A discussion should be included about the use of constant values in summations.
Response: Comment well noted and will be fixed.
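One way to write an ARDL model so that the lag index varies inside each summation is sketched below. This is a generic textbook form, not necessarily the corrected Eq. 3 of the paper. D, P and S follow the symbols quoted in the comment (drought indicator, precipitation, soil moisture), with q and p as the maximum lag orders; the intercept \alpha, the coefficients \beta_i, \gamma_j, \delta_j and the noise term \varepsilon_t are assumed notation.

```latex
D_t = \alpha
    + \sum_{i=1}^{q} \beta_i \, D_{t-i}
    + \sum_{j=0}^{p} \gamma_j \, P_{t-j}
    + \sum_{j=0}^{p} \delta_j \, S_{t-j}
    + \varepsilon_t
```

Written this way, each summation runs over a lag index that appears both in the coefficient and in the subscript of the lagged variable, so no term is constant across the sum.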
137 – Eq. 4: How does X_{t-i} represent several variables? How can i vary from 0 to i?
Response: Comment noted. These subscripts represent the lag order of the input variables; this will be amended to make it clearer in the paper.
143-146: The variable theta should be explained to readers in this paragraph.
Response: Comment well noted and will be fixed.
145-146: The term P(X_t) is ignored because it is difficult to compute. This is not a proper statement for ignoring a variable in a research paper.
Response: Comment noted; this statement will be better explained.
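The standard argument for dropping the evidence term can be stated briefly. The sketch below assumes the paper uses Bayes' theorem in its usual form, with \theta denoting the model parameters and X_t the data, as the symbols quoted in the comments suggest.

```latex
P(\theta \mid X_t) = \frac{P(X_t \mid \theta)\, P(\theta)}{P(X_t)},
\qquad
P(X_t) = \int P(X_t \mid \theta)\, P(\theta)\, \mathrm{d}\theta
\quad\Longrightarrow\quad
P(\theta \mid X_t) \propto P(X_t \mid \theta)\, P(\theta)
```

Because P(X_t) does not depend on \theta, it only rescales the posterior. Sampling-based inference methods such as Markov chain Monte Carlo need the posterior only up to this constant, so the term can be omitted without affecting the parameter estimates.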
152–154: An analysis should be done to fix the distribution function that best characterises the regression parameters. Why is mu set to 0 and sigma to 0.5?
Response: These parameters were selected after the model optimisation (grid search) process. These 'mu' and 'sigma' values were chosen because they gave the best forecast results with minimum error. This will be better explained in the paper.
153–154: Something is missing in this sentence.
Response: Comment noted and will be fixed
164: This is not the standard form of AIC.
Response: Comment noted and has been fixed
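For reference, the textbook form of the AIC is sketched below. The symbols k (number of estimated parameters) and \hat{L} (maximised value of the likelihood) are the conventional ones and are assumed here; the response does not state the notation used in the revised manuscript.

```latex
\mathrm{AIC} = 2k - 2\ln(\hat{L})
```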
161–163: Some figures could be included in the paper to show how a time lag of 6 weeks obtains the best AIC and R^{2} results.
Response: Comment noted; the figures will be added.
168: What is i? What is y hat?
Response: Comment noted; these will be explained.
176: The R^{2} measure of Eq. 9 is not a good measure to quantify accuracy of forecasts.
Response: We considered the R^{2} score for this work because, in addition to knowing the deviation of the forecast values from the observed values (RMSE), we also needed to test the goodness of fit, i.e. the variation in the dependent variable captured or explained by the model.
188–189: What is m?
Response: Comment noted and will be fixed
196: ‘inputs’ should be changed to ‘input’.
Response: Comment noted and will be fixed
213: How are r, R^{2} and RMSE calculated for the BARDL model? The BARDL model supplies a probability distribution, but observations are deterministic.
Response: To determine these metrics, the means of the forecast probability distributions were used. This will be made clear in the paper.
214: R^{2} is not a good measure of forecast accuracy. RMSE is more adequate than R^{2}. Therefore, the gain in performance metrics could be assessed with RMSE. However, the BARDL model supplies a probability distribution of VCI. How do you obtain an RMSE value from the comparison between probability distributions and deterministic values of observations?
Response: We used R^{2} because we needed to test the goodness of fit and the variation in the dependent variable captured or explained by our model. The R^{2} and RMSE values were determined from the means (averages) of the forecast probability distributions.
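As an illustration of how point metrics can be derived from a probabilistic forecast, a minimal sketch follows. It assumes the forecast distribution is available as an array of posterior predictive draws with shape (n_draws, n_times); the function name `point_metrics` and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def point_metrics(posterior_samples, observed):
    """Collapse forecast draws to their mean, then score against observations.

    posterior_samples: array of shape (n_draws, n_times), assumed layout.
    observed: array of shape (n_times,).
    """
    samples = np.asarray(posterior_samples, dtype=float)
    obs = np.asarray(observed, dtype=float)

    # Mean of the forecast distribution at each time step.
    mean_forecast = samples.mean(axis=0)

    residuals = obs - mean_forecast
    rmse = float(np.sqrt(np.mean(residuals ** 2)))

    # Coefficient of determination: 1 - SS_res / SS_tot.
    ss_res = float(np.sum(residuals ** 2))
    ss_tot = float(np.sum((obs - obs.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    return mean_forecast, rmse, r2


# Synthetic illustration only (not the paper's data).
rng = np.random.default_rng(0)
observed = np.array([30.0, 40.0, 50.0, 60.0])
samples = observed + rng.normal(0.0, 2.0, size=(500, 4))
mean_forecast, rmse, r2 = point_metrics(samples, observed)
```

Scoring only the mean discards the spread of the distribution, which is why interval-based and probabilistic diagnostics are assessed separately.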
Figure 3: What do the coloured lines mean?
Response: The contour lines represent the density and bins of the joint distribution plot. A detailed explanation of this will be added to the paper to make it clear to the readers.
222–224: The R^{2} values do not correspond with the values shown in Table 2.
Response: The R^{2} values do not correspond because Table 2 shows R^{2} values for the separate Arid and Semi-Arid zones, not the overall values seen in the figure.
Table 2: This table could be summarised in a figure.
Response: Comment noted, and will be considered.
229–231: The table in Appendix A could be summarised in a figure and included in the main text of the paper, in order to analyse the comparison between the two models. The results included in Table 1 show that PICP values are smaller for AR than for BARDL, meaning that a greater number of observations are out of the confidence intervals for BARDL. This result should be discussed in the paper. In addition, most of the PICP values for the BARDL model are smaller than 94–96 %, in contrast to the statement of line 229.
Response: Comment noted, the details will be discussed
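To make the PICP discussion concrete, the sketch below computes the percentage of observations falling inside a central prediction interval. It assumes a 95 % interval taken from the 2.5th and 97.5th percentiles of the forecast draws; the function name `picp` and the interval construction are assumptions for illustration, since the response does not specify how the intervals were built.

```python
import numpy as np


def picp(posterior_samples, observed, level=0.95):
    """Prediction Interval Coverage Probability, in percent.

    posterior_samples: array of shape (n_draws, n_times), assumed layout.
    observed: array of shape (n_times,).
    level: nominal coverage of the central interval (assumed 0.95).
    """
    samples = np.asarray(posterior_samples, dtype=float)
    obs = np.asarray(observed, dtype=float)

    alpha = 1.0 - level
    # Central interval from empirical quantiles of the draws.
    lower = np.quantile(samples, alpha / 2.0, axis=0)
    upper = np.quantile(samples, 1.0 - alpha / 2.0, axis=0)

    inside = (obs >= lower) & (obs <= upper)
    return 100.0 * float(inside.mean())
```

A PICP close to the nominal level indicates well-calibrated intervals; values well below it mean that more observations fall outside the intervals than the nominal level implies.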
Figure 5: Please use the same y-axis scale in each row to compare the AR and BARDL results. The dashed line of the left column differs from the dashed line of the right column, though observations do not change. The green line represents the forecast. What is such a forecast for the BARDL model given that it supplies a probability distribution?
Response: Comment well noted; the y-axis will be set to the same scale. The difference in the dashed line is due to the shift in the time series data when creating the observed lead-time datasets.
235: A drought is forecasted when VCI3M values are smaller than 35. This is straightforward for the AR model, as it is deterministic. However, how do you apply this criterion to the BARDL outputs considering probability distributions?
Response: Comment noted; the criterion was applied to the mean of the forecast distribution from the Bayesian model.
We have noted that details on the use of the mean of the forecast distribution in model evaluation are missing from the narrative; this will be addressed.
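A minimal sketch of applying the VCI3M < 35 criterion to a forecast distribution is given below. The flag based on the mean follows the response above; the per-step drought probability is an additional quantity included only for illustration and is not described in the response. The function name `drought_from_samples` and the array layout are hypothetical.

```python
import numpy as np

DROUGHT_THRESHOLD = 35.0  # VCI3M threshold quoted in the referee comment


def drought_from_samples(posterior_samples, threshold=DROUGHT_THRESHOLD):
    """Apply the drought criterion to forecast draws.

    posterior_samples: array of shape (n_draws, n_times), assumed layout.
    Returns (flag_from_mean, drought_probability), each of shape (n_times,).
    """
    samples = np.asarray(posterior_samples, dtype=float)

    # Criterion applied to the mean of the forecast distribution.
    flag_from_mean = samples.mean(axis=0) < threshold

    # Illustrative extra: fraction of draws below the threshold.
    drought_probability = (samples < threshold).mean(axis=0)
    return flag_from_mean, drought_probability


# Toy example with two draws and two time steps.
flags, probs = drought_from_samples([[30.0, 40.0], [36.0, 50.0]])
```

The two quantities can disagree near the threshold, for example when the mean sits just above 35 while a sizeable share of draws falls below it.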
251–253: The BARDL lines lie above the main diagonal of the reliability diagram. This means that the probabilities supplied by the BARDL model tend to underestimate droughts. A comment about this point should be included in the paper.
Response: Comment noted. This was an omission, and the details have now been outlined in the paper.
253–256: The sharpness diagrams are mostly flat for 10 and 12 weeks. The low values close to 1 mean that the BARDL model is not able to forecast droughts. Therefore, the BARDL model is useful for forecasting droughts 6 weeks ahead, but not 10 or 12 weeks ahead. A comment about this point should be included in the paper.
Response: Comment well noted, the details on this point will be outlined.
Figure 7: The sharpness diagram should plot percentages in the y axis.
Response: Comment well noted, will be fixed
266–276: These two paragraphs could be moved to the Conclusions Section.
Response: Comment well noted, will be fixed
273–274: The Authors state that the BARDL model gains 2 weeks based on the results of R^{2}. However, more measures should be taken into account to conclude such a statement.
Response: Comment well noted, additional factors informing this gain will be further discussed in the paper.
275–276: This statement is not clear from the results included in Section 4.
Response: Comment well noted and will be addressed
280: The number of the figure is missing.
Response: Comment noted and fixed
284–295: This paragraph with figures of Appendixes C to F could be extended to form a new section 4.6 devoted to the seasonality analysis.
Response: Thank you for this comment; it has been noted and will be done as suggested.
Edward E. Salakpi et al.