The manuscript has been significantly improved after revision. I think the contribution is interesting and should be considered for publication. However, the manuscript could still be improved in terms of clarity. Below are three major and several minor suggestions for improvement.
[1] Clarify the connection between emulator development and sensitivity testing of the slow/fast simulators. One may expect the sensitivity testing to be carried out in two stages: first identifying the emulator, then using the emulator as a substitute for the simulator within a computationally expensive sensitivity analysis. However, after reading the entire manuscript I understand that there is no such second step, and that the sensitivity testing is a "byproduct" of the emulator identification process itself: the estimated emulator coefficients (beta) are the measures of the simulator's sensitivity to its parameters (x). This could be clarified in the Abstract and Introduction. For example, on P. 4 L. 18-19: "This enables the quantification of the impact of each simulator parameter on the prediction of the dispersion of volcanic ash": the sentence could be revised to clarify "how" the quantification is enabled.
On the same topic, the term "active" should be defined in Sec. 6.1 (how is it operationally concluded that a parameter is "active"?) and the link to sensitivity testing established. I understand that the fact that a parameter is active in the emulator implies that the output of the original simulator is sensitive to that parameter, so "finding active parameters" is the sensitivity testing, but this is not stated explicitly. Please clarify.
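To make my reading concrete, here is a minimal sketch of what I understand "finding active parameters" to mean. It is not the authors' procedure: it assumes a purely linear mean function fitted by ordinary least squares, inputs rescaled to [-1, 1] so that coefficients are comparable, and a hypothetical magnitude threshold for calling a parameter "active". The function names, the threshold value and the synthetic data are all my own assumptions.

```python
import numpy as np

def fit_linear_emulator(x, y):
    """Least-squares fit of y ~ beta_0 + sum_j beta_j * x_j.

    Inputs are rescaled column-wise to [-1, 1] (an assumption of this
    sketch) so that the fitted coefficients are directly comparable.
    """
    lo, hi = x.min(axis=0), x.max(axis=0)
    x_scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    design = np.column_stack([np.ones(len(x_scaled)), x_scaled])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta

def active_parameters(beta, threshold):
    """Indices of parameters whose |beta_j| exceeds a hypothetical threshold."""
    return np.flatnonzero(np.abs(beta[1:]) > threshold)

# Synthetic stand-in for simulator runs: the output depends strongly on
# parameter 0, negligibly on parameter 2, and not at all on parameter 1.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=(200, 3))
y = 5.0 * x[:, 0] + 0.01 * x[:, 2] + rng.normal(0.0, 0.1, size=200)

beta = fit_linear_emulator(x, y)
active = active_parameters(beta, threshold=0.5)
print("coefficients:", beta)
print("active parameter indices:", active)
```

If the manuscript uses a different criterion (for example, stepwise term selection rather than a threshold on the coefficients), stating that criterion explicitly in Sec. 6.1 would resolve the ambiguity.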
[2] The description of the emulation methodology (Sec. 5 and associated Appendices) is still confusing. I think the main text still puts too much emphasis on the statistical rationale underlying the emulators and too little on the practical steps for their construction and use. This may make the manuscript not very accessible to non-statisticians and limit the uptake of the proposed multi-level emulation approach.
Specifically:
- P. 18 L. 9: "Such an emulator then provides predictions for f (x) at a new x. Since it is a statistical model, this prediction also comes with an associated uncertainty". How are the predictions and associated uncertainty obtained in practice? I guess the prediction is the adjusted expected value of Eq. (9) and the associated uncertainty is expressed by an uncertainty interval based on the adjusted variance of Eq. (10), is this correct? Also, what is the link between Eq. (9)-(10) and Eq. (8)? Is the expected value E(B) in Eq. (9) given by the first term in Eq. (8) (the linear combination of "g" functions)? What about the variance? Please clarify.
- P. 19 L. 1 says that the Bayes linear approach is used for "the analysis of the link between the fast and slow emulators". Does this mean that Eq. (9)-(10) are also used to establish a link between the two emulators? If so, does this mean that the prediction of one emulator is adjusted based on the prediction of the other? This seems in contrast with Sec. 5.3, from which I understand that the link between the two emulators is established by linking their respective coefficients beta. Again this should be clarified.
- P. 18 L. 4: "Conceptually, the expectation, variance, and correlation are a priori uncertainty judgements". What exactly does this mean? Which of the three parameters (expected value, variance and correlation length) are actually estimated from the residuals of the emulator predictions and which are assumed by a priori judgement? How is the "correlation length" estimated? The description in A2.2 is unclear: what does "tune" mean on P. 34 L. 25? Manual tuning? How is it checked that the method "has been successful" (P. 35 L. 1)?
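For reference, the questions above assume that Eq. (9)-(10) are the standard Bayes linear adjustment formulas, which I write below in generic notation. Here B is the quantity being adjusted and D is the vector of observed simulator runs; the symbol D and the exact correspondence with the manuscript's equation numbers are my assumptions, so please correct me if the manuscript's equations differ.

```latex
\begin{align}
\mathrm{E}_D(B) &= \mathrm{E}(B) + \mathrm{Cov}(B, D)\,\mathrm{Var}(D)^{-1}\,\bigl(D - \mathrm{E}(D)\bigr), \\
\mathrm{Var}_D(B) &= \mathrm{Var}(B) - \mathrm{Cov}(B, D)\,\mathrm{Var}(D)^{-1}\,\mathrm{Cov}(D, B).
\end{align}
```

Under this reading, the prediction at a new x would be the adjusted expectation, and the uncertainty interval would be the adjusted expectation plus or minus some multiple of the square root of the adjusted variance. Stating in the main text which prior quantities enter E(B), Var(B) and Cov(B, D), and how each is obtained, would answer most of the questions in this point.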
[3] From Figure 3 it seems that the difference between the fast and slow simulators is really small (at least for the chosen output variable). In the best case (top panel) the two simulators produce almost identical output; in the worst case (panel (b)) the outputs are still well related to each other (only a few points in the bottom left part of Fig. 3.b would not align with a simple interpolating line). I guess this is the reason why the two emulators are found to be very close (P. 24 L. 4-6: "the link between mean functions of the two emulators is strong and consistent") and their difference "mostly a rescaling". I think the similarity of the fast/slow simulators should be emphasized more when commenting on the emulator results, also to acknowledge that this case study may not be the most challenging application with which to test the multi-level emulation approach (although the idea remains valid and very interesting in principle).
MINOR
P. 4: maybe remove the line break between line 2 and 3 - the sentence "Finally, ..." should be part of the previous paragraph.
P. 4 L. 7-8: "The emulation method that is presented in this paper gives assessments of uncertainty that can be combined easily with other sources." I agree this is possible in principle, but it is not actually demonstrated in this paper. It may be good to clarify this point.
P. 4 L. 10: "is expensive in both time and money." A bit vague. If I understand correctly, the point here is that sensitivity tests of a wide range of parameters require a lot of model runs and for a complex simulator such as VATD this would take a lot of computing time.
P. 5 L. 15: "the emulators used..." remove "used"?
P. 5 L. 21-22: "Section 5 gives an overview of the statistical methods used in the analysis." Maybe specify that this is about building and evaluating the emulators.
P. 10 L. 2: "all the alternative PSDs could be reconstructed to a reasonable approximation": unclear. What is "reasonable approximation"? Does it mean that the alternative PSDs are compatible with the range of observations by Dacre et al. (2013)?
P. 17 L. 14: "hosen" should be "chosen"?
P. 17 L. 15: "For the rest of this section, attention is restricted to scalar-valued for simplicity of notation." Ok but please clarify whether in your application a vector-valued emulator was used.
P. 18 L. 20: "It" should not be capitalized.

The reviewers have some significant concerns regarding your current manuscript, which you acknowledge in your suggested responses. Please revise the manuscript accordingly and focus especially on the readability of the manuscript as discussed by both reviewers. The revised manuscript will be re-assessed by the referees.

Best regards,

Thorsten Wagener