Statistical theory of probabilistic hazard maps: a probability distribution for the hazard boundary location

The study of volcanic mass flow hazards in a probabilistic framework centers around systematic experimental numerical modelling of the hazardous phenomenon and the subsequent generation and interpretation of a probabilistic hazard map (PHM). For a given volcanic flow (e.g., lava flow, lahar, pyroclastic flow), the PHM is typically interpreted as the point-wise probability of flow material inundation. In the current work, we present new methods for calculating spatial representations of the mean, standard deviation, median, and modal locations of the hazard's boundary from ensembles of many deterministic runs of a physical model. By formalizing its generation and properties, we show that a PHM may be used to construct these statistical measures of the hazard boundary, which have been unrecognized in previous probabilistic hazard analyses. Our formalism shows that a typical PHM for a volcanic mass flow not only gives the point-wise inundation probability, but also represents a set of cumulative distribution functions for the location of the inundation boundary, with a corresponding set of probability density functions. These distributions run over curves of steepest ascent on the PHM. Consequently, 2D space curves can be constructed on the map which represent the mean, median and modal locations of the likely inundation boundary. These curves give well-defined answers to the question of the likely boundary location of the area impacted by the hazard. Additionally, methods of calculation for higher moments including the standard deviation are presented, which take the form of map regions surrounding the mean boundary location. These measures of central tendency and variance add significant value to spatial probabilistic hazard analyses, giving a new statistical description of the probability distributions underlying PHMs.
The theory presented here may be used to construct improved hazard maps, which could prove useful for planning and emergency management purposes. This formalism also allows for application to simplified processes describable by analytic solutions. In that context, the connection between the PHM, its moments, and the underlying parameter variation is explicit, allowing for better source parameter estimation from natural data, yielding insights about natural controls on those parameters.


Introduction
Nat. Hazards Earth Syst. Sci. Discuss., https://doi.org/10.5194/nhess-2018-344. Manuscript under review for journal Nat. Hazards Earth Syst. Sci. Discussion started: 16 November 2018. © Author(s) 2018. CC BY 4.0 License.

The probabilistic study of volcanic hazard is part of an emerging paradigm in physical volcanology, one that openly admits and seeks to calculate and characterize the uncertainty inherent to physical modelling of hazardous volcanic processes (Bursik et al., 2012; Connor et al., 2015; Bevilacqua et al., 2016; Sandri et al., 2016). Critically, this approach is founded on the notion that physical parameters related to these processes can be represented by a probability distribution. The distribution is propagated by a physical model to yield a space of hazard distributions which put probabilistic constraints on quantities that would be fixed in a deterministic, scenario-based modelling framework. In practice this has typically been accomplished by accumulating the statistics of the processes through ensemble deterministic modelling (Neri et al., 2015; Biass et al., 2016; Tierz et al., 2016; Bevilacqua et al., 2017; Gallant et al., 2018; Patra et al., 2018b). Polynomial Chaos approximations and Gaussian model surrogates can provide fast algorithms to obtain these statistics (Madankan et al., 2012; Stefanescu et al., 2012; Spiller et al., 2014; Bayarri et al., 2015; Tierz et al., 2018). Owing to the nature of the studied processes, the results are typically displayed in map form and contain a large amount of spatial information, the interpretation of which has become a source of extensive study (e.g., Thompson et al., 2015). Because the information in these maps is potentially of great interest to a wide community of stakeholders including governments, planners, emergency managers, local residents, and others spanning a wide range of familiarity with probabilistic reasoning, finding the best approach to condensing these data-rich maps is non-trivial.
The raw probabilities are considered less useful to many end users despite their significant information content (Thompson et al., 2015). Often the information in these maps is presented qualitatively as "high", "medium", and "low" hazard, or simply "high" and "low" hazard (e.g., Wagner et al., 2015). The reduction of a complex probability distribution to a relative categorization, as in a high/low hazard scheme, is based on a probability threshold, in many cases a certain percentile of the distribution, which corresponds to a probability contour on the map. While this is the most obvious method, it is just one of many options for declaring this categorical boundary.
At present, a probabilistic hazard map (PHM) for a given hazard is easily constructed from numerical modelling; however, its most commonly used statistical product in hazard communication is just the set of probability contours. Consequently, if a binary volcanic hazard map were to be constructed from a PHM showing the region of no likely impact versus the region of likely impact, the map creator would have to decide where to draw the line: possibly at the 5% probability contour (a safe, conservative choice), the median contour (50%) (a good guess for the most likely impact/no-impact boundary), or some other arbitrarily chosen percentile. Often the choice of the contour level is based on agreements or official protocols made by civil protection authorities. Such potentially high-consequence decisions require more statistical analysis to generate other tools for this task. For example, it may be useful to know the average geographical limits of a particular hazard in addition to the point-wise probability of hazard impact. For any sample data set or theoretical probability distribution, the mean and variance as well as one or more modal values are calculable in addition to the median and other percentiles. An important deficiency in the analysis of the PHM is that previously, estimates of the likely hazard boundary (a single curve on the map) have not been computed by consistent methods.

For any given volcanic hazard analysis which maps probabilities of a given event, the probability values may represent a wide array of concepts in probability theory including cumulative probabilities, probability densities, conditional probabilities, and others. In the following theory and analysis, we only consider spatial probability distributions which map the probability of a given hazard impacting a given location, where there is at least one point in space with a modelled probability of unity. Here, the point-wise probability is interpreted as the relative frequency of hazard impact as predicted by repeated simulation using a deterministic physical model from an ensemble spanning the realistic range of underlying uncertain parameters. Although this excludes the study of many possible PHMs, it covers a sufficiently wide range of models which are regular or regularized and span a realistically limited subspace of the general parameter space; that is, they depend continuously on the inputs or, in this case, maintain some level of spatial correlation across a realistically narrow, continuous range of inputs. Throughout this work we focus on models of geophysical mass flows (GMF) relating to the hazards posed by inundation. This is done primarily because the flow boundary is easily definable in these models, giving a consistent intuition to the complicated mathematical quantities discussed here.
However, these concepts extend to many other types of hazard models where a particular threshold is of interest in defining the region impacted by a hazard, in volcanology and other fields, including concentration thresholds in volcanic clouds, thickness thresholds in tephra fallout, ballistic ejecta impact count thresholds (Biass et al., 2016), pyroclastic surge invasion (Bevilacqua et al., 2017), lava flow inundation (Gallant et al., 2018), tsunami runup thresholds (Geist and Parsons, 2006; Grezio et al., 2017), ground shaking thresholds in seismic hazards (Kvaerna et al., 1999), hurricane wind speed thresholds (Splitt et al., 2014), and many others. A full evaluation of suitable and unsuitable types of probabilistic hazard assessments is beyond the scope of this work.
The work of Stefanescu et al. (2012) provided a rigorous framework for calculating the PHM from a Gaussian emulator of GMF simulations. Within that framework, the PHM is the point-wise mean probability of hazard impact, and the uncertainty bounds for that probability are given a rigorous definition. Although this method is rigorous, it has not been applied in many hazard studies. Furthermore, as with most calculations of PHMs, the interval 0 < Prob(Hazard Impact) < 1 occupies a very large region of space. As in the above hypothetical binary hazard mapping scenario, the question remains: from the perspective of a probabilistic volcanic hazard analysis, where is the likely boundary of the impacted area?
The goal of the present work is to extend the probability theory of PHMs to construct statistically meaningful answers to this question. To this end, we give an explicit mathematical definition of the PHM and define exactly the connection between the cumulative exceedance probabilities and probability densities in a PHM and estimates of the likely hazard boundary.
More specifically, we calculate spatial representations of the PHM's moments including the mean and variance of the hazard boundary location as well as its modality. This effort combines mathematical tools and concepts from probability theory, the study of dynamical systems, and differential geometry.

Hazard Map Distribution Theory
Solutions to GMF models which involve depth-integrated partial differential equations (e.g., Saint-Venant shallow water equations) typically include solving for the distribution of flow thicknesses over position and the parameter space,

h = h(x; β),

where x is the position vector and β ∈ B is an n-vector in parameter space B. From the solutions h(x; β), an indicator function can be constructed to denote the region in space, Ω(β), where the solution space has been inundated:

1_Ω(β)(x) = 1 if h(x; β) ≥ h_0, and 0 otherwise,

where h_0 ∈ R⁺ is the inundation thickness threshold of interest. For many GMFs including lahars, lava flows, and pyroclastic surges, the threshold of interest is h_0 = 0, since any amount of inundation is a significant hazard; however, in the typical practice of PHM construction from numerical modelling, a non-zero thickness tolerance is typically set due to inaccuracies in the solver near free boundaries (h(x) = 0) or other considerations (e.g., Patra et al., 2018b). Although doing so is inaccurate for the GMFs considered in this work, a sufficiently small threshold captures the essential features of the flow boundary. In assessments of other types of hazards with non-zero thresholds of interest (e.g., tephra fallout required for collapse of a particular roofing material), the true threshold may be used without introducing problems due to these solver inaccuracies.
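As a minimal sketch (NumPy, with a made-up one-dimensional thickness profile rather than a shallow-water solution), the indicator function above can be built by thresholding a simulated thickness field at a small tolerance h_0:

```python
import numpy as np

def indicator(h, h0):
    """Indicator 1_Omega(x) of the inundated region: 1 where the
    simulated flow thickness h(x; beta) meets or exceeds the
    threshold h0, and 0 elsewhere."""
    return (h >= h0).astype(float)

# Toy thickness field on a 1D transect: a lobe thinning linearly to zero.
x = np.linspace(0.0, 1.0, 101)
h = np.maximum(0.5 - x, 0.0)
chi = indicator(h, h0=1e-3)   # small nonzero tolerance, as in the text
```

The small nonzero h0 plays the same role as the solver tolerance discussed above: it keeps numerical noise near the free boundary out of the inundated set.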

Probabilistic Hazard Map
The construction of any PHM requires the definition of a probability density function (PDF) f : B → R⁺ which measures the parameter space B. Consequently, calculating the PHM amounts to propagating f(β) from parameter space (B) through a physical model into Euclidean space. The PHM is then calculated by integration:

φ(x) = ∫_B 1_Ω(β)(x) f(β) dβ.

In the integral, the indicator function acts as a filter, removing parameter space points from the integral which do not yield solutions exceeding the threshold at a given space point. In spatial subsets where all values of β in the support of f give solutions which exceed the threshold, the indicator is identically unity and φ = 1 by the scaling condition for a valid PDF f(β). Consequently, 0 ≤ φ ≤ 1 globally, and φ measures the probability of finding material in the simulated flow at position x.
If the range of realistic parameter values were confined to a single point β_0 in parameter space, then the parameter PDF f is the Dirac delta function δ(β − β_0), resulting in a PHM which is merely the indicator function: φ(x) = 1_Ω(β_0)(x). This is also the result for a PHM constructed from a single deterministic run. This corresponds to the intuition that complete certainty of input parameters corresponds with complete certainty of the spatial extent of the hazard (the region where φ = 1).
If the parameter space is measured by a uniform distribution (maximal entropy distribution), the probabilistic hazard map (PHM) is constructed by integral averaging the indicator function through the parameter space:

φ(x) = (1 / Vol(B)) ∫_B 1_Ω(β)(x) dβ.

The mean value integration is a good assumption in hazard analysis if there is no a priori knowledge of the uncertain parameters except for reasonable bounds.
In the context of a typical probabilistic hazard modelling effort, the solution space and parameter space are discrete, with a uniform distribution in parameter space, and these steps are accomplished merely by cell-wise averaging of the indicator functions derived from each model run to generate a discretized 2D hazard map φ = φ_ij. In that context, φ_ij is typically contoured and presented in a hazard assessment as probabilistic zones defined by the specific contours. However, because the underlying physics of these flows is not well constrained, the parameter space must be sampled widely, resulting in large probability dispersion in some cases. In the present work we globally analyze the PHM as a type of probability distribution and generate its associated measures of central tendency, which form curves in 2D space. To analyze the PHM, we must first note several of its features and state some assumptions critical to the analysis. For a PHM generated from a finite, simply-connected parameter space and physically realistic governing equations, the region of certain inundation Ω_1 = {x : φ(x) = 1} and the region of possible inundation Ω_0 = {x : φ(x) > 0} are well defined, with respective boundaries ∂Ω_1 and ∂Ω_0. We will further assume that φ(x) is continuous and at least twice-differentiable for all x ∈ R² \ {∂Ω_0 ∪ ∂Ω_1}. We focus the analysis to follow in regions away from local maxima with φ < 1. This restriction can be mitigated by segmentation of the spatial domain as detailed below.
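The cell-wise averaging described above can be sketched as follows. The toy "model" here is a hypothetical one-dimensional flow whose runout length plays the role of the uncertain parameter β, sampled uniformly; each row of indicators is one deterministic run:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ensemble: N runs of a toy flow whose runout length L
# is drawn uniformly, standing in for a deterministic model evaluated
# over a uniformly sampled parameter space B.
N, nx = 500, 200
x = np.linspace(0.0, 2.0, nx)
L = rng.uniform(0.8, 1.6, size=N)                 # uncertain runout
indicators = (x[None, :] <= L[:, None]).astype(float)

phi = indicators.mean(axis=0)                      # cell-wise average -> PHM
```

Because every run inundates the origin and none reach the end of the transect, the resulting φ decreases monotonically from 1 to 0, exactly the structure the analysis below exploits.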
Formally, φ(x) is a cumulative distribution function (CDF) for the location of the edge of the simulated flow. This is represented schematically in Fig. 1. To see this, consider parametric curves x_n(s) along level set normals of φ, defined by

dx_n/ds = ∇φ(x_n(s)), x_n(0) ∈ ∂Ω_0,

and write φ(s) := φ(x_n(s)) = P(s_F ≤ s), where P(s_F ≤ s) is the cumulative probability that position x_n(s) is farther from x_n(0) than the true inundation front x_n(s_F) is from x_n(0), as measured along the integral curve. Because the integral curves follow the gradient, φ(s) is non-decreasing, and along with 0 ≤ φ(s) ≤ 1 satisfies the description of a CDF.

If local maxima less than unity are present, then the integral curves that ascend these maxima will not be valid CDFs (φ(s)) because they never reach the value φ(s) = 1. Despite this, as long as the PHM contains a finite number of local maxima less than unity (φ(x_m) < 1, where the index m runs over the M local maxima), then for each local maximum we may define a region R_m which surrounds x_m and is bounded by two integral curves as well as a segment of ∂Ω_0 and ∂Ω_1. Because local maxima are sinks for gradient-ascending integral curves, all of the integral curves that begin on R_m ∩ ∂Ω_0 will terminate at x_m.

The two bounding integral curves are therefore defined extrinsically as those integral curves producing valid CDFs originating closest to the segment of ∂Ω_0 that originates the integral curves terminating at x_m. If there is a saddle point between the local maximum and ∂Ω_1, then the two integral curves will meet at the saddle point and no part of ∂Ω_1 will make up the boundary of R_m. In a sense, R_m can be thought of as the "shadow" of trajectories that are influenced most by the local maximum φ(x_m) (Fig. 1). We remove these regions by constructing a new set on which the remainder of the analysis is valid:

Ω̃_0 = Ω_0 \ (R_1 ∪ ... ∪ R_M).

Probabilistic Hazard Density Map
To find the PDF associated with this distribution, consider that by the chain rule,

dφ/ds = ∇φ · dx_n/ds = ||∇φ(x_n(s))||² ≥ 0.

Although this description satisfies the requirements of a PDF along each individual integral curve, it cannot be applied directly to measure the probability density in two dimensions. To measure this probability density, we generate a normalized PDF over 2D space which we refer to as the probabilistic hazard density map (PHDM):

ψ(x) = ||∇φ(x)||² / ∫_Ω̃_0 ||∇φ(x′)||² dA′,

which is finite and positive. With this normalization the integral over Ω̃_0 is unity, giving ψ(x) the properties of a PDF for the inundation front location. Consequently, the value ψ(x) dA represents the probability of finding the inundation edge in an increment of area dA. As noted, this distribution is only valid in Ω̃_0: the region of space where local maxima in the PHM do not influence the integral curves. A proof which extends Eqn. 9 to this normalization of the PHDM is given in Appendix A. In the notional example given above for a PHM generated by a parameter space Dirac delta, given a PHM that is identically the indicator function, the PHDM is defined only in the sense of distributions because of the apparently infinite gradient at ∂Ω(β_0). However, because derivatives can be defined in the sense of distributions, the PHDM corresponding to such certain inputs is a Dirac delta distribution concentrated on the boundary ∂Ω(β_0).

Measures of Central Tendency and Variance
Using these definitions, three measures of central tendency (mode, mean, median) can be generated for each integral curve, generating a locus of points (space curve) for each measure, parameterized by a new parameter (l) which increases along the PHM contours according to

dx_t/dl = R ∇φ(x_t(l)),

where R = [[0, −1], [1, 0]] is a rotation matrix and the x_t(l) are the PHM contours (schematized in Fig. 1). By defining a density function for the inundation edge probability distribution as in Eqn. 10, a notional "modal" set can be constructed by connecting the maximal value of ψ on each integral curve x_n. The modal set (parameterized by l) is

x_M(l) = x_n(s*(l); l), where s*(l) = argmax_s ψ(x_n(s; l)).

For the inundation hazard map, the curve x_M(l) represents the maximum likelihood curve along which to find the inundation front. Although each distribution may not be unimodal, and individual distributions may have unimodal and multimodal regions owing to the complexity of the underlying physics of the problem, this estimate gives a starting point for understanding the central tendency of the distribution.
Furthermore, we may find the mean location of the inundation by considering the first moment of each univariate PDF dφ/ds(s; l). For each integral curve x_n(s; l) connecting a point on ∂Ω_0 to a point on ∂Ω_1, the first moment (mean) of dφ/ds may be written

µ(l) = ∫_0^∞ s (dφ/ds) ds.

We compute this integral by parts, but make a critical substitution before doing so. Note that for the function 1 − φ, d(1 − φ)/ds = −dφ/ds. We may substitute this into Eqn. 14 and then integrate by parts:

µ(l) = −[s (1 − φ(s))]_0^∞ + ∫_0^∞ (1 − φ(s)) ds.

Because φ(s) → 1 as s → ∞, in the same limit 1 − φ(s) → 0. As long as 1 − φ(s) → 0 faster than 1/s → 0 as s → ∞, the endpoint-evaluated terms in the limit will vanish and the integral will be finite. This allows for a simpler representation of the first moment:

µ(l) = ∫_0^∞ (1 − φ(s)) ds.

This formula can be converted from a path integral parameterized by s to a path integral parameterized by Euclidean length by noting that an increment of Euclidean length dζ may be calculated as

dζ = ||dx_n/ds|| ds = ||∇φ|| ds,

yielding an integral over a finite interval,

µ(l) = ∫_C (1 − φ) / ||∇φ|| dζ,

that is, integrating the profile of φ and its derivatives over each curve C in 2D space with distance given by Euclidean distance.
Although this integral is analytically intractable in general, in a real-world context involving quadrature of gridded data, the advantage of this formula is its proper integration bounds, eliminating the need to know the exact behavior of φ as s → ∞. Despite this advantage, this formula will only be accurate in situations where the numerically calculated gradient is very smooth.
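The tail-integral form of the first moment can be checked numerically. The sketch below uses an assumed exponential profile for φ(s), a stand-in with a known mean of exactly 1, not a profile from any of the models discussed here:

```python
import numpy as np

def trapz(y, x):
    # simple trapezoidal rule, kept explicit for clarity
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Along one integral curve, take phi(s) to be an exponential CDF,
# and compare the direct first moment of dphi/ds with the tail
# formula int (1 - phi) ds obtained by integration by parts.
s = np.linspace(0.0, 40.0, 40001)
phi = 1.0 - np.exp(-s)

mu_tail = trapz(1.0 - phi, s)                  # int_0^inf (1 - phi) ds
mu_direct = trapz(s * np.gradient(phi, s), s)  # int_0^inf s dphi/ds ds
```

Both quadratures return values very close to the exact mean of 1, and the tail form never needs dφ/ds explicitly, which is the practical advantage noted above.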
Once µ(l) is calculated for every integral curve x_n(s; l), regardless of parameterization, we may connect these points to form a curve defined by

x_µ(l) = x_n(µ(l); l).

Finally, we may define a curve representing the median value of the probability distribution for inundation front location.
This is simply the 50% probability contour,

x_med = {x : φ(x) = 1/2},

which is a level set of the hazard map. The three curves x_M, x_µ, and x_med represent different measures of the central tendency of the hazard map probability distribution.
Additionally, a measure of the variance may also be constructed for each PDF dφ/ds(s; l), although this requires integration of the density function over s:

σ²(l) = ∫_0^∞ (s − µ(l))² (dφ/ds) ds.

The standard deviation (σ) defines a region around the mean value in that, for each integral curve, the points s = µ(l) ± σ(l) map to points in space along the curves x_µ±σ(l), forming a region about x_µ(l) between x_µ+σ(l) and x_µ−σ(l). Higher moments of the distribution may be calculated similarly along each curve and subsequently connected.
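The second central moment can be checked the same way, again on an assumed exponential stand-in profile for φ(s), for which σ = 1 exactly:

```python
import numpy as np

def trapz(y, x):
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Second central moment of dphi/ds along one curve, using the same
# exponential stand-in CDF (mean 1, standard deviation 1).
s = np.linspace(0.0, 40.0, 40001)
phi = 1.0 - np.exp(-s)
dens = np.gradient(phi, s)                 # dphi/ds along the curve
mu = trapz(s * dens, s)
sigma = np.sqrt(trapz((s - mu) ** 2 * dens, s))
```

In an application, µ ± σ would then be mapped back through x_n(s; l) to trace the curves x_µ±σ(l).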

Results
To highlight the features of the new theory, we present three examples of its application in order of increasing complexity and realism.
Example 1: Analytic model of time-dependent solidification of a viscous gravity current

A simple application of this theory is to a flow which can be calculated analytically. Here, we show the application of the above theory to the problem of time-dependent solidification of a viscous gravity current. This model, used by Quick et al. (2017), is a modification of the famous model due to Huppert (1982) of the axisymmetric (radial) spread of viscous gravity currents, allowing for time-dependent viscosity, where the height of the current is governed by

∂h/∂t = (g / (3 ν(t))) (1/r) ∂/∂r (r h³ ∂h/∂r),

where g is gravitational acceleration. This model incorporates a simple model of time-dependent kinematic viscosity increase representing the solidification of the current,

ν(t) = ν_0 exp(t/Γ),

where ν_0 is the kinematic viscosity (ν) at time t = 0 and Γ is the time it takes for the viscosity to increase by a factor of e. The solution for the release of a constant volume V_0 with initial radius r_0 may be stated in terms of the effective spreading time Θ(t) = Γ(1 − e^{−t/Γ}), so that the radius of the flow front is

r_N(t) = r_0 (1 + Θ(t)/τ)^{1/8},

where τ is the characteristic viscous spreading timescale of the current. Notably, Θ → Γ as t → ∞, implying that eventually the current will solidify and grow to a maximum size. Consequently, when the flow dynamics have ceased, the radius of the edge of the current (r_N), where the current thickness vanishes, will have reached a maximum at

r_N = r_0 (1 + Γ/τ)^{1/8}.
To use this simple model to illustrate the probability distribution properties above, consider the case where all parameters are known precisely except for the viscosity freezing time coefficient Γ, which can be constrained within a range: Γ_min < Γ < Γ_max. Because Eqns. 24 and 27 could be nondimensionalized so as to be rewritten in terms of one fundamental parameter η := Γ/τ, which is proportional to Γ, this single random parameter choice is justified. Under the assumption of no prior knowledge of the likelihood of any particular value of Γ, a uniform distribution may be used to construct an inundation indicator function for this model. To generate a PHM from this indicator function as above, the indicator function must be integrated through the parameter space. The only nonconstant segment of this integral is on the interval r_0 (1 + Γ_min/τ)^{1/8} ≤ r ≤ r_0 (1 + Γ_max/τ)^{1/8}, on which

φ(r) = [Γ_max − Γ_N(r)] / ∆Γ,

with Γ_N(r) = τ[(r/r_0)⁸ − 1] and ∆Γ = Γ_max − Γ_min. This results in the PHM (Fig. 2a,c). From the PHM, the PHDM (Fig. 2b,c) can be calculated by differentiation, and the integral curves are rays of constant azimuthal angle. These relationships lead naturally to the estimates of the center of the distribution (Fig. 2) as defined above, as well as the standard deviation, where Eqn. 32a was substituted into Eqn. 30 and Eqns. 14 and 21 were used to calculate Eqns. 35a and 35b. Higher moments of the distribution are theoretically calculable by repeated integration by parts, from which additional facts may be learned about the distribution.
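The analytic PHM for this example can be cross-checked by brute-force ensemble averaging. The sketch below uses hypothetical values (r_0 = τ = 1, 1 < Γ < 3) rather than any fitted parameters:

```python
import numpy as np

# Monte Carlo check of the analytic PHM for the freezing gravity current.
rng = np.random.default_rng(1)
r0, tau = 1.0, 1.0
G_min, G_max = 1.0, 3.0

G = rng.uniform(G_min, G_max, 50_000)        # uniform f(Gamma)
rN = r0 * (1.0 + G / tau) ** 0.125           # final radii r_N(Gamma)

r = np.linspace(r0, r0 * (1.0 + G_max / tau) ** 0.125, 400)
phi_mc = (rN[None, :] >= r[:, None]).mean(axis=1)   # ensemble-averaged PHM

# Analytic PHM: phi(r) = (G_max - Gamma_N(r)) / (G_max - G_min),
# clipped to [0, 1], with Gamma_N(r) = tau * ((r / r0)**8 - 1).
G_N = tau * ((r / r0) ** 8 - 1.0)
phi_exact = np.clip((G_max - G_N) / (G_max - G_min), 0.0, 1.0)
```

The ensemble average converges to the analytic φ(r), illustrating that the discrete averaging used in the later examples is the same object as the parameter-space integral.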
These calculations can be made analytically because of the ability to write the equation of the boundary and because that equation is invertible for the unknown parameter; that is, it can be written as r_N(Γ) (a boundary in Euclidean space) and as Γ(r_N) (a boundary in parameter space). This example also serves to highlight the asymmetry of the PHM despite the underlying uniform distribution of physical parameters. This asymmetry is highlighted by the significant spread in the mean, median, and modal boundaries (Fig. 2).
This example was greatly simplified by the assumption of radial symmetry, allowing the distribution to be cast onto one random variable, the radius of the flow boundary, giving integral curves of constant azimuthal angle which are linear in map view. However, in most realistic cases, this is a very poor assumption and the distribution will vary between integral curves.

Example 2: Mohr-Coulomb Flow on an Inclined Plane
To see the full realization of these concepts, a true two-dimensional problem is required. A simple example flow which has been well studied is a lab-scale granular flow of sand down an inclined plane with a Mohr-Coulomb (MC) friction relation, analogous to a natural-scale debris avalanche or landslide (Patra et al., 2005; Dalbey et al., 2008; Yu et al., 2009; Webb and Bursik, 2016; Aghakhani et al., 2016; Patra et al., 2018a, b). In this setup (Fig. 3a), a 5 cm-radius, 5 cm-high cylindrical pile is placed 0.7 m up a 1 m long, 38.5° inclined plane and is allowed to flow out onto a flat runout zone 0.5 m long by 0.7 m wide. In accordance with MC theory, the flows were stopped locally when the sum of the drag due to internal and bed friction exceeded the gravitational forces, and the simulations were stopped globally after 1.5 s, when the flow dynamics had typically ceased (Patra et al., 2018b). In this case, the solution variable of interest h is the maximum flow height over time, that is, the maximum flow height that each point experiences through the course of the flow. The uncertain parameters in the problem for a given granular medium are the internal and basal friction angles of the medium, ϕ_int and ϕ_bed respectively. To construct the PHM for this flow, 512 TITAN2D simulations were performed over a Latin-Hypercube-sampled input parameter space of (ϕ_int, ϕ_bed). To construct the indicator functions for this experiment, a thickness threshold of h_0 = 1.0 mm was adopted. The PHM was then computed by averaging the indicator functions: a discrete analogy to Eqn. 4. Owing to the discrete nature of the data, all continuous operations on the PHM were done by numerical finite differences and quadrature.
The PHM for this problem has many important characteristics, notably that the greatest dispersion in the probability contours occurs in the runout zone, owing principally to the uncertainty in ϕ_bed. Additionally, the spreading angle of the flow as a whole experiences a significant increase as the flow passes the break in slope (Fig. 3a). In order to calculate the mean flow boundary set and higher moments of this distribution, the gradient ascent curves of Eqn. 5 were integrated numerically from a finite set of closely spaced points around an approximation of ∂Ω_0 obtained by numerical contouring, where J = 2049 is the number of such points. Eqn. 5 was integrated by Euler's method with a step forward in s (∆s_k = s_{k+1} − s_k) conditioned such that steps in true distance were fixed: ||x_{n_j}(s_{k+1}) − x_{n_j}(s_k)|| = 5.0 mm. This yielded discrete integral curves x_{n_j}(s_k) of varying lengths around the perimeter of the PHM (Fig. 3a). With each x_{n_j}(s_k) calculated, φ(s_k; l_j) was constructed and the parameter µ_j calculated on each curve, yielding an estimate of x_µ(l) by interpolation of the discrete points x_{µ_j} (Fig. 3b). The standard deviations σ_j and the standard deviation space curves x_{µ±σ}(l) were calculated by similar means (Fig. 3b). An important result of these calculations is that for almost every part of the perimeter of the PHM, the mean flow boundary set x_µ is dislocated from the median flow boundary set x_med (φ = 1/2), indicating qualitatively the general asymmetry of this distribution. Additionally, the region bounded by the curves x_{µ±σ} is remarkably different from the region which bounds the central 68% of the data, the same percentage bounded by ±σ in the normal distribution (Fig. 3). All of this indicates that for PHMs of even simple realistic flows, the underlying statistics are not only non-normal, their distributions can be significantly non-symmetric.
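The fixed-Euclidean-step Euler ascent used here can be sketched on a toy radial PHM. The profile below is an assumed stand-in (φ rising from 0 on the circle r = 2 to 1 on the circle r = 1), not the TITAN2D data:

```python
import numpy as np

def grad_phi(p):
    # Gradient of a toy radial PHM: phi = clip(2 - r, 0, 1), so phi
    # ascends from 0 on the circle r = 2 to 1 on the circle r = 1.
    r = float(np.hypot(p[0], p[1]))
    if r <= 1.0 or r >= 2.0:
        return np.zeros(2)
    return -p / r

def ascend(p, step=0.005, max_iter=5000):
    """Euler integration of dx/ds = grad(phi), with each update rescaled
    to a fixed Euclidean length `step` (cf. the 5.0 mm steps in the text)."""
    for _ in range(max_iter):
        g = grad_phi(p)
        norm = float(np.linalg.norm(g))
        if norm == 0.0:
            break                      # reached phi = 1 (or left the annulus)
        p = p + step * g / norm
    return p

end = ascend(np.array([1.99, 0.0]))    # start just inside the phi = 0 contour
```

Rescaling each update to a fixed true distance, rather than stepping in s directly, keeps the step size uniform even where the gradient magnitude varies strongly.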
Example 3: The 1955 Atenquique lahar

We analyze an ensemble of TITAN2D simulations of this flow that were generated by Bevilacqua et al. (2018) using a physically realistic rheology for a debris flow, the Voellmy-Salm (VS) rheology. This rheology is parameterized by a bed friction coefficient µ_VS and a velocity-dependent friction parameter ξ which approximates turbulence-induced dissipation and is uncertain over several orders of magnitude. Consequently, ξ is sampled logarithmically (Patra et al., 2018a, b). The flow was initiated from five source paraboloidal piles of material with unitary aspect ratios and volumes represented as constant fractions of the total flow volume which were proportional to the drainage size (Fig. 4a). The input parameter space B for this simulation was taken as a convex subset of {(arctan(µ_VS), log_10(ξ), V)} with the bounds 1° ≤ arctan(µ_VS) ≤ 3°, 3 ≤ log_10(ξ) ≤ 4, and 3.5 × 10⁶ ≤ V ≤ 5.0 × 10⁶. This set was determined using information about the known flow runout, flow arrival time in the town, and exclusion of ravine overflow, in conjunction with a round of preliminary simulations according to a new procedure of probabilistic modelling set-up.

As in the previous example, the solution variable of interest is the maximum flow height over time; however, here the height threshold h 0 = 5.0 cm is used. The 350 VS simulation indicator functions were averaged as in the previous example to obtain the PHM for this flow (Fig. 4a). For such a complex flow margin, the gradient-ascending integral curves could not be computed reliably at the resolution of the model; however, the PHDM was calculated using a discrete gradient operation, giving a sense of the maximum likelihood for the inundation boundary (Fig. 4b).
Owing to the complexity of this flow, unless some smoothing or interpolation is applied before the gradient calculation, the PHM will decay step-wise, with many grid spaces between steps. When the numerical gradient is calculated over a set cell size, these steps will become spurious peaks in the PDFs and the PHDM. This problem is generally absent where the PHM step size is more comparable to the scale of the grid spacing used by the gradient operator.
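A one-dimensional sketch of this artifact and of a simple smoothing remedy follows. The step spacing and Gaussian kernel width are made-up values chosen only to make the effect visible:

```python
import numpy as np

# A step-wise PHM profile, as produced by averaging a small ensemble:
# phi drops by 0.1 every 20 grid cells, so a raw finite-difference
# gradient is zero almost everywhere and spikes at the steps.
s = np.arange(200)
phi = np.clip(1.0 - np.floor(s / 20) * 0.1, 0.0, 1.0)
raw = -np.gradient(phi)

# Gaussian smoothing (plain NumPy convolution) before differentiating
# spreads each step over the kernel width and removes the spikes.
k = np.exp(-0.5 * (np.arange(-24, 25) / 8.0) ** 2)
k /= k.sum()
smooth = -np.gradient(np.convolve(phi, k, mode="same"))
```

The raw density estimate is a train of isolated spikes at the steps, while the smoothed estimate is far flatter, which is exactly the difference between a spurious multimodal PHDM and a usable one.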
This complex example highlights the need to consider additional probability measures beyond the probability contours alone. Where the flow is not narrowly confined, or where spill-over has occurred, the shape of the distribution becomes relevant and may yield significantly different results than estimating the likely flow boundary as one of the probability contours.

Discussion

As shown by the above three examples, our theory can be applied to give consistent results between continuous, analytic models and discrete, numerical models of flows. The following discussion highlights several consequences of the theory as applied to analytic models in research and to more complex numerical models in hazard assessments.

Although an analytic model (Quick et al., 2017) was invoked in the first example to make an exact calculation of the mean, median, mode, and standard deviation of the flow boundary, the real value of these models is that they are generally invertible and can be used to extract information about the input parameters from the properties of a given flow. Specifically, the PDF measuring the space of input parameters, f, can be inverted from data collected from the natural phenomena that the analytic model represents. Using the example of the freezing viscous gravity current highlighted above (Quick et al., 2017), we show that an ensemble of natural examples of such flows can be used to invert the PDF of input parameters directly from the definition of the PHM. Suppose that remote sensing or geological field work yields data on the maximum radii (r_N) for an ensemble of these natural freezing gravity currents, and that a natural PHM model is constructed, called φ̂(r), which represents the discrete CDF of radii that were measured. As before, we consider a simplified case wherein the most uncertain parameter from among the collection of natural flows is the freezing rate parameter Γ. If the researcher were sufficiently confident that these assumptions and the governing equation (Eqn. 22) accurately describe these flows, then the only differences between φ̂(r) and Eqn. 30 would arise as a consequence of the PDF f(Γ) of the natural input space being non-uniform. Consequently, because the formula for the flow boundary (r_N(Γ) = r_0 (1 + Γ/τ)^{1/8}) is known and is invertible, the formula for the PHM can also be inverted directly in this simple case as follows:

φ̂(r) = ∫_{Γ_N(r)}^{Γ_max} f(Γ) dΓ.

Applying the Fundamental Theorem of Calculus yields

f(Γ_N(r)) = −(dφ̂/dr) (dΓ_N/dr)⁻¹ = −(dφ̂/dr) (dr_N/dΓ).

In practice, the collected data would be in the form of a normalized histogram p̂(r_N) for the maximum flow radii. As a discrete approximation, φ̂(r_N) is the associated cumulative distribution function for p̂ subtracted from unity.
Consequently, the PDF over parameter space could be written approximately as

f(Γ) ≈ p(r_N(Γ)) (dr_N/dΓ).

This illustrates the general fact that the observed distribution of flow boundaries reflects the underlying parameter-space PDF weighted by the derivative of the analytic mapping from parameter space to the flow boundary location in Euclidean space.
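The inversion above amounts to a change of variables from boundary radii back to freezing rates. The following sketch illustrates the discrete version; a triangular distribution is an assumed stand-in for the unknown natural f(Γ), and r_0 = τ = 1 are illustrative values, not taken from the text:

```python
import numpy as np

# Sketch of inverting a "natural PHM" for the parameter-space PDF f(Γ).
# The triangular distribution and r0 = tau = 1 are illustrative assumptions.
rng = np.random.default_rng(0)
r0, tau = 1.0, 1.0

# Synthetic "field data": maximum radii of an ensemble of natural flows.
gamma_true = rng.triangular(0.0, 1.0, 4.0, size=200_000)
r_obs = r0 * (1.0 + gamma_true / tau) ** (1.0 / 8.0)  # r_N(Γ)

# Normalized histogram p(r_N) of the observed maximum radii.
p, edges = np.histogram(r_obs, bins=200, density=True)
r_mid = 0.5 * (edges[:-1] + edges[1:])

# Invert the boundary formula: Γ(r) = τ((r/r0)^8 − 1), with Jacobian
# dr_N/dΓ = (r0 / 8τ)(1 + Γ/τ)^(−7/8).
gamma_mid = tau * ((r_mid / r0) ** 8 - 1.0)
dr_dgamma = (r0 / (8.0 * tau)) * (1.0 + gamma_mid / tau) ** (-7.0 / 8.0)

# Change of variables: f(Γ) ≈ p(r_N(Γ)) · dr_N/dΓ.
f_gamma = p * dr_dgamma

# Sanity check: the recovered density integrates to ~1 over Γ (trapezoid rule).
mass = float(np.sum(0.5 * (f_gamma[1:] + f_gamma[:-1]) * np.diff(gamma_mid)))
print(round(mass, 2))
```

The recovered f_gamma approximates the triangular density used to generate the synthetic data, up to histogram binning and sampling noise.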
Generally, unless the analytic model predicts that the boundary of a flow x(β) is directly proportional to a given parameter, the input-space distribution f(β) undergoes a nonlinear transformation in the construction of a PHM and PHDM. More precisely, we remark that the calculated distribution is the conditional probability density f(Γ|β_i), where β_i represents all of the other parameters that were assumed to be known precisely. Inverting the PHM for the joint parameter-space PDF is beyond the scope of this work.

If the parameter space of a given inundation hazard scenario and a realistic physical model of the process are known, a simple numerical experiment can be run using classical Monte Carlo methods or their variants, as was done for the granular flow on an inclined plane (Patra et al., 2018b) or the 1955 Atenquique lahar, as well as with more sophisticated sampling methods (e.g., Bursik et al., 2012; Bayarri et al., 2015). As in the examples in this study, the hazard map, the hazard density map, and the associated statistics may be generated from such experiments. These may then be used to add information to a probabilistic assessment of the hazard beyond the inundation probability contours alone.
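The Monte Carlo construction can be sketched end-to-end with a toy one-parameter model; here the closed-form boundary formula from the freezing-current example stands in for a full flow simulator, and all parameter values are illustrative assumptions:

```python
import numpy as np

# Minimal Monte Carlo construction of a radial PHM and PHDM.
# The analytic boundary formula stands in for a full flow simulator;
# r0 = tau = 1 and the uniform input range are illustrative.
rng = np.random.default_rng(1)
r0, tau = 1.0, 1.0

n_runs = 10_000
gamma = rng.uniform(0.0, 4.0, size=n_runs)      # sampled uncertain input
r_n = r0 * (1.0 + gamma / tau) ** (1.0 / 8.0)   # one "model run" per sample

# Point-wise inundation probability on a radial grid:
# phi(r) = fraction of runs whose flow reached radius r.
r_grid = np.linspace(0.0, 1.3, 261)
phi = (r_n[None, :] >= r_grid[:, None]).mean(axis=1)

# The hazard density map (PHDM) is the negative radial derivative of phi.
phdm = -np.gradient(phi, r_grid)

print(phi[0], phi[-1])  # 1.0 near the source, 0.0 beyond the largest run
```

For a real flow model each "run" would be a full simulation, but the counting step that turns the ensemble into φ is unchanged.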
In this context, the method outlined here is capable of generating the typical measures that appear in routine statistical analyses of distributions. The most obvious among these are the mean flow boundary and the standard deviation region. These measures are newly calculable with this formalism and should be included in probabilistic hazard assessments.
As illustrated in the examples, the mean boundary may differ significantly from the median boundary owing to the asymmetric statistics inherent to typical nonlinear flows. Furthermore, this asymmetry suggests that the modal, or maximum likelihood, boundary may be of interest to users of these assessments as well. For probabilistic hazard assessments to be used in sophisticated applications, including risk assessments by governments or actuarial assessment for insurance purposes, the full statistics of the PHM must be considered.
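For a radially symmetric PHM, these measures reduce to one-dimensional statistics of the boundary radius. A minimal sketch using the freezing-current example with Γ uniform on [0, 4] and illustrative values r_0 = τ = 1 (not the paper's) shows the asymmetry directly:

```python
import numpy as np

# Boundary statistics extracted from a radial PHM phi(r).
# Illustrative setup: r_N = (1 + Γ)^(1/8) with Γ uniform on [0, 4].
r = np.linspace(1.0, 1.25, 2001)

# Analytic PHM: phi(r) = P(r_N >= r) = P(Γ >= (r^8 − 1)) for uniform Γ.
gamma_of_r = (r ** 8) - 1.0
phi = np.clip(1.0 - gamma_of_r / 4.0, 0.0, 1.0)

# Mean boundary radius: since phi = 1 for r < 1, E[r_N] = 1 + integral of phi.
mean_r = 1.0 + float(np.sum(0.5 * (phi[1:] + phi[:-1]) * np.diff(r)))

# Median boundary radius: the phi = 1/2 contour.
median_r = r[np.argmin(np.abs(phi - 0.5))]

# Modal boundary radius: maximum of the boundary PDF −dphi/dr.
pdf = -np.gradient(phi, r)
mode_r = r[np.argmax(pdf)]

print(mean_r < median_r < mode_r)  # the boundary distribution is asymmetric
```

Here the boundary PDF grows like 2r^7 across its support, so the mean is pulled toward smaller radii while the mode sits at the largest attainable radius; the three curves would plot as distinct contours on a 2D map.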

With new tools to calculate the moments of any given hazard edge distribution from a PHM, hazard map makers and analysts can estimate the hazard edge location and the uncertainty in that estimate. As numerical modelling capabilities continue to grow, the increased ease and speed of this type of analysis will allow the map maker to quickly construct the PHM, the PHDM, the mean hazard boundary curve, the standard deviation region, and higher moments of the hazard boundary distribution, giving improved estimates of the real hazard zone. Although this theory is useful for many applications where probabilities are defined spatially, further work is needed to constrain which types of hazard assessments and models are suitable for this analysis. Similarly, additional work is required to study the probabilistic inverse problem beyond the simple example given here and to evaluate the use of this theory in a time-dependent context. Overall, the analysis put forth here significantly enhances the study of probabilistic hazard maps, yielding new estimates of the likely hazard boundary and of the uncertainty in those estimates.

(v) If φ has local maxima φ(x_m) < 1 with m ∈ {1, ..., M}, then each x_m can be surrounded by a region of influence R_m bounded by integral curves as defined in the text.

Define the normal curves x_n(s) as integral curves of the gradient, dx_n/ds = ∇φ(x_n(s)). Then:

∫ |∇φ(x_n(s))|² ds = 1.

Proof. Along each normal curve, dφ/ds = ∇φ · dx_n/ds = |∇φ|², so the integral of |∇φ|² in s equals the total change of φ along the curve, which is unity. Now consider the tangent curves to the level-set normals,

dx_t/dl = R ∇φ(x_t(l)),   where R = (0 −1; 1 0)

is a rotation matrix. The curves x_t(l) are the level sets of φ. By invoking the tangent curves and parameterizing each normal curve as x_n(s; l), each PDF dφ/ds(s; l) may be extended away from a single integral curve:

∫ |∇φ(x_n(s; l))|² ds dl = dl.
Integration through l would then yield an integral of the square gradient over Ω_0, with the area element parameterized in local orthogonal curvilinear coordinates (s, l) which are derived from the PHM instead of ordinary space. Because of this, the value of the integration will depend on the φ particular to the problem under consideration. Note that in the coordinates (s, l), each domain of influence associated with a local maximum less than unity (R_m) is a rectilinear patch of (s, l) space (a segment of l ∈ [0, ∞)) which is removed, leaving an integration domain in l which contains gaps. The right-hand side of Eqn. A3 can then be evaluated by invoking Green's first identity. Introducing an outward unit normal n on ∂Ω_0 and noting that

∇ · (φ∇φ) = |∇φ|² + φ∇²φ,

Green's first identity gives

∫_{Ω_0} |∇φ|² dA = ∮_{∂Ω_0} φ (∇φ · n) dS − ∫_{Ω_0} φ ∇²φ dA.
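The identity ∫ |∇φ(x_n(s))|² ds = 1 along a gradient integral curve can be checked numerically. The sketch below uses a smooth radial test function as a stand-in for a real PHM, with forward Euler integration of dx_n/ds = ∇φ; the starting point, step size, and stopping threshold are illustrative choices:

```python
import numpy as np

# Numerical check: along an integral curve of the gradient, dx_n/ds = ∇φ,
# we have dφ/ds = |∇φ|², so ∫ |∇φ(x_n(s))|² ds equals the total change in φ
# along the curve (≈ 1 from the far field up to a maximum where φ = 1).
# φ(x) = exp(−|x|²) is a smooth illustrative stand-in, not a real PHM.

def phi(x):
    return np.exp(-np.dot(x, x))

def grad_phi(x):
    return -2.0 * x * phi(x)

x = np.array([2.5, 1.0])  # start in the far field, where phi ~ 7e-4
ds = 1e-3                 # step in the curve parameter s (not arc length)
integral = 0.0
for _ in range(200_000):
    g = grad_phi(x)
    integral += np.dot(g, g) * ds  # accumulate |grad phi|^2 ds
    x = x + g * ds                 # forward Euler step along the gradient
    if phi(x) > 0.999:
        break

print(round(integral, 3))  # close to 1, up to the truncated endpoints
```

Because the curve is truncated where φ ≈ 0.999 rather than at the maximum, the accumulated integral falls slightly short of unity, consistent with the proof.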