Using maps of city analogues to display and interpret climate change scenarios and their uncertainty



Introduction
According to the United Nations Department of Economic and Social Affairs (2004), the majority of humankind now lives in urban areas. The European population in particular is predominantly city-dwelling: 65.6% resided in urban areas in 1975, a share that rose to 71.7% at the turn of the century and is projected to reach 78.3% by 2030. In cities, weather patterns interact with socio-economic structures directly and indirectly in many uncounted and mostly unaccountable ways. Elevated temperatures, particularly during extremes like the 2003 and 2006 summers, have shown the heavy strain on, and the need for adaptation of, sanitary systems, production strategies (above all in construction), power supply systems and living conditions.

Characterizing climate with indicators

Indicators
Climate can be defined as the weather conditions in a certain geographical area averaged over a long period of time. A more quantitative definition is needed for a computer-based method. A well-accepted approach to characterizing climates is to select a few aggregate indicators quantifying the most relevant attributes, such as measures of seasonal and annual warmth or cold, accumulated wetness or dryness, solar radiation, atmospheric humidity, etc. Many climate indicators exist, as climates can be defined in different ways for different purposes. For example, in agriculture the total annual evapotranspiration is an important indicator for plant growth, whereas in tourism the total number of rainy days might be of primary interest instead. The literature suggests that general-purpose characterizations of climates, such as the Köppen classification, tend to include at least one indicator related to temperature (or energy) and one indicator related to moisture (or water).
The popularity of the Holdridge (1947) Life Zone system shows that three indicators are sufficient to define a useful classification of climates (in this classification, the indicators are temperature, precipitation and evaporation, but the zones are represented on a two-dimensional triangle because the third indicator is a combination of the first two).
In order to characterize climate from the point of view of its impact on cities and urban life, we considered the combination of the following three climate indicators: annual Aridity Index, annual Heating Degree Days and annual Cooling Degree Days. The annual Aridity Index represents a key factor defining water deficit. This index is widely used in the categorization of climate types, and water stress is expected to be a key social impact of climate change. The Heating and Cooling Degree Days measure accumulated temperature and are known to correlate well with the energy demand for heating and air conditioning, respectively. They are used in financial markets to settle the price of weather derivatives and futures (e.g. van Asseldonk, 2003), or to estimate a building's or a city's energy needs. Similar measures of accumulated temperature are also used in agriculture (Monteith, 1981), for example the Effective Temperature Sum (ETS) used by Fronzek and Carter (2007) as an indicator of thermal suitability for crop development.
Although capturing additional aspects of climate or investigating selected features or particular subsystems of urban areas might require additional indicators, there is a certain trade-off between the exhaustive description of a local climate and the practical ability to identify analogues of that climate. We believe that the combination of these three indicators provides a sufficient description of a city's climate to assess the impact of climatic change on urban areas, so we define climate for this study as the 30-year joint three-way distribution of the Aridity Index, Heating Degree Days and Cooling Degree Days. In order to statistically compare climatic (dis)similarities between different times and places, we assume stationarity, as if the 30 years were drawn from the same unchanging distribution. No assumption is made on the shape of this distribution.
The three indicators are defined in principle from daily data, but monthly mean temperature and precipitation data are more readily available. As we show next, the indicators can be computed to a good approximation from monthly data.

Figure 1: Monthly precipitation and potential evapotranspiration. Mean monthly precipitation (blue bars) and potential evapotranspiration (green bars) of the simulated climate of Paris in 2071 (HadRM3H model, single grid box). Absolute aridity is the area above the precipitation and below the evapotranspiration bars of the deficient months (hollow rectangles). It is divided by the total evapotranspiration of the deficient months to get the aridity index.

Aridity
Aridity describes the availability of water that plants can use. It is a fundamental indicator for a climate's vegetation, likely to change significantly in a changing climate. There are several variants of an aridity index available in the literature: absolute or relative, aridity or humidity. For the purpose of describing climates statistically they are largely equivalent, so we settled on the classical Aridity Index AI as defined by Thornthwaite (1948) (see Figure 1) and restated below. In any given month $i$, the water deficit is the difference between the monthly potential evapotranspiration $e_i$ and the precipitation $p_i$; summed over all water-deficient months of a year, it gives the annual water deficit. The annual Aridity Index is defined relative to the total potential evapotranspiration of the deficient months:

$$AI = \frac{\sum_{i:\,p_i < e_i} (e_i - p_i)}{\sum_{i:\,p_i < e_i} e_i} \qquad (1)$$

Thornthwaite (1948) also provides an empirically derived method for closely estimating the monthly potential evapotranspiration $e_i$ of a standard month of 30 days, in cm, from the mean monthly temperature $t_i$ in °C:

$$e_i = 1.6 \left( \frac{10\, t_i}{I} \right)^{a} \qquad (2)$$

$$I = \sum_{i=1}^{12} \left( \frac{t_i}{5} \right)^{1.514} \qquad (3)$$

$$a = 0.000000675\, I^3 - 0.0000771\, I^2 + 0.01792\, I + 0.49239 \qquad (4)$$

As the days in a month vary and the number of hours of sunshine per day depends on the season and the latitude, Thornthwaite (1948) also introduced an adjustment factor for the unadjusted potential evapotranspiration calculated above. This method is also known to systematically underestimate the potential evapotranspiration in more arid regions and seasons, as it was developed and parameterized for conditions in the USA (e.g. Deichmann and Eklundh, 1991). In the present work, we neglected these adjustments, but follow-up studies should investigate this aspect.
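The aridity computation described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Fortran code used for the study. The function names are hypothetical, the coefficients follow the commonly quoted form of Thornthwaite's (1948) formula (in which the quadratic term of the exponent is negative), months with a mean temperature at or below 0°C are assumed to contribute no evapotranspiration, and the day-length adjustment is neglected as in the text. Precipitation must be given in the same unit as evapotranspiration (cm).

```python
def thornthwaite_pet(monthly_temps_c):
    """Unadjusted potential evapotranspiration (cm per 30-day month)
    for 12 monthly mean temperatures in degrees Celsius."""
    # Annual heat index I; months at or below 0 degC are assumed to add nothing.
    heat_index = sum((t / 5.0) ** 1.514 for t in monthly_temps_c if t > 0)
    if heat_index == 0:
        return [0.0] * len(monthly_temps_c)
    # Empirical exponent a as a cubic polynomial in the heat index.
    a = (6.75e-7 * heat_index ** 3 - 7.71e-5 * heat_index ** 2
         + 1.792e-2 * heat_index + 0.49239)
    return [1.6 * (10.0 * t / heat_index) ** a if t > 0 else 0.0
            for t in monthly_temps_c]


def aridity_index(monthly_pet, monthly_precip):
    """Water deficit of the deficient months divided by the potential
    evapotranspiration of those same months (0 if no month is deficient)."""
    deficit = 0.0
    pet_of_deficient_months = 0.0
    for e, p in zip(monthly_pet, monthly_precip):
        if p < e:  # water-deficient month
            deficit += e - p
            pet_of_deficient_months += e
    if pet_of_deficient_months == 0:
        return 0.0
    return deficit / pet_of_deficient_months
```

With one deficient month (evapotranspiration 10 cm, precipitation 5 cm) and one surplus month, `aridity_index([10, 10], [5, 20])` gives 0.5: half of the evaporative demand of the deficient month is not met by precipitation.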

Cooling and heating degree days
Heating and cooling degree days (HDD and CDD, see Figure 2) can be seen as measures of heating and air conditioning needs, respectively. They are based on the simple idea that heaters (or air conditioners) are turned on when the daily mean temperature falls below (or rises above) a base temperature. Their relevance to the climate change issue is twofold. On the one hand, changes in their distributions are an expected important impact of climate change. On the other hand, they also matter for mitigation, since degree days empirically characterize households' energy consumption very well. Mathematically, annual heating and cooling degree days are defined as follows, where $t_d$ is the mean temperature of day $d$ and $b$ the base temperature:

$$HDD = \sum_{d} \max(b - t_d,\, 0), \qquad CDD = \sum_{d} \max(t_d - b,\, 0)$$

Although degree days are based, by definition, on a daily difference to the base, the daily temperature distribution has a known shape, so monthly degree days can be estimated statistically from monthly temperature means, neglecting the suggestion of Schär et al. (2004) that climate change may alter this known shape. Thom (1954, 1966) proposed a method to calculate the monthly degree days above ($CDD_m$) or below ($HDD_m$) any base. Its inputs are $N$, the month length in days; $t_m$, the monthly mean temperature; $\sigma_m$, the standard deviation of the monthly average temperature (calculated in our case from the average monthly temperatures available over several years); and the base $b = 18$°C. The method also uses $x_0$ and $l^*_{HDD/CDD}$, the so-called truncation point and respective truncation coefficient, which are related empirically and calculated with an exponential approximation (Thom, 1966). This method only approximates the monthly HDD and CDD, but for European climatic conditions it provides a reasonable replacement estimate for calculations based on daily temperature.
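For reference, the daily definition of degree days (not Thom's monthly approximation) can be written as a short Python sketch. The function name is hypothetical, and the base of 18°C is the one stated in the text.

```python
def degree_days(daily_mean_temps_c, base_c=18.0):
    """Return (HDD, CDD) accumulated over a sequence of daily mean temperatures.

    Each day contributes its shortfall below the base to HDD
    and its excess above the base to CDD; a day exactly at the base adds nothing.
    """
    hdd = sum(max(base_c - t, 0.0) for t in daily_mean_temps_c)
    cdd = sum(max(t - base_c, 0.0) for t in daily_mean_temps_c)
    return hdd, cdd
```

For three days at 10, 18 and 25°C, the first day adds 8 heating degree days and the last adds 7 cooling degree days, so `degree_days([10, 18, 25])` returns `(8.0, 7.0)`.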

The 1-Dimensional two sample Kolmogorov-Smirnov test
The Kolmogorov-Smirnov test is a commonly used and relatively simple non-parametric statistical test. It can be used to examine whether a sample comes from a known distribution, or whether two samples come from the same unknown distribution. Our use is the latter: to compare climates from two different places and periods, using samples of 30 years. In its basic univariate case, the Kolmogorov-Smirnov statistic D is defined as the maximum vertical distance between the cumulative distribution functions of the two samples. This is illustrated in Figure 3 for the cumulative distributions of the 30 annual aridity indices of the climate of Paris from 2071 to 2100 and of the southern Italian city Barletta from 1961 to 1990 (aridity indices computed using the results of the HadRM3H model simulation).

Figure 3: Cumulative distributions of the annual aridity index for Paris (2071-2100) and Barletta (1961-1990), data from HadRM3H model. The Kolmogorov-Smirnov statistic D is the maximum vertical distance between the two curves.

The basic idea of the test is the following. When one draws two samples of numbers according to a given probability distribution f, the cumulative distribution curves of the two samples will both tend to fall around the same curve, the cumulative distribution function (CDF) of f. Thus, although one cannot expect D to be exactly 0, one can expect it to be small. But when one draws two samples according to very distinct probability distributions, f and g respectively, the cumulative distribution curves of the two samples will tend to fall around the CDF curves of f and g respectively. If these curves are well apart, one can expect the statistic D to be close to 1. To illustrate with an extreme case, if numbers drawn according to f are known to lie within [1, 2] and numbers drawn according to g are in [3, 4], then the distance will certainly be 1.
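The univariate statistic can be illustrated with a minimal Python sketch. The function name is hypothetical and the code is not taken from the study's implementation. It walks through the pooled sorted values and tracks the largest gap between the two empirical cumulative distributions.

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic D: the largest vertical distance
    between the empirical cumulative distribution functions of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        # Advance both samples past every value <= x (this handles ties).
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        # i / na and j / nb are the two empirical CDFs evaluated at x.
        d = max(d, abs(i / na - j / nb))
    return d
```

In the extreme case mentioned in the text, with one sample inside [1, 2] and the other inside [3, 4], for instance `ks_statistic([1, 1.5, 2], [3, 3.5, 4])`, the first empirical distribution reaches 1 before the second leaves 0, so the function returns 1.0.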
The frequency distribution p of the K-S statistic D for two samples of 30 drawn from the same distribution f can be computed empirically, to a level of precision adequate for application, by Monte-Carlo simulation. The key to the K-S test is that p does not depend on the shape of f itself. Thus, the K-S test is non-parametric, as very few assumptions need to be made on the unknown distribution: Fasano and Franceschini (1987) show that in the 2- and 3-dimensional cases, only the correlation between the variables matters. For example, if two samples come from the same distribution, the probability that their K-S distance is 0.3 or more is 10.9% (Figure 4).

Figure 4: The p value as a function of the K-S statistic D, after Press et al. (1992). It measures on an absolute scale (between 0 and 1) how likely it is for two samples to be drawn from the same distribution, i.e. in our case, how well the two climates, as described by the indicator, correspond.
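The Monte-Carlo idea can be sketched as follows in Python. This is an illustration under stated assumptions rather than the procedure actually used: the function names, the number of trials, the fixed seed and the choice of a uniform distribution for f are all assumptions of the sketch (the last one is harmless in the univariate case, where the null distribution of D does not depend on f).

```python
import random


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    na, nb = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < na and j < nb:
        x = min(a[i], b[j])
        while i < na and a[i] <= x:
            i += 1
        while j < nb and b[j] <= x:
            j += 1
        d = max(d, abs(i / na - j / nb))
    return d


def exceedance_probability(d_obs, n=30, trials=2000, seed=0):
    """Estimate P(D >= d_obs) for two samples of size n drawn from the same
    distribution, by repeatedly drawing such pairs and counting exceedances."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = [rng.random() for _ in range(n)]
        b = [rng.random() for _ in range(n)]
        # Small tolerance: D is a multiple of 1/n computed in floating point.
        if ks_statistic(a, b) >= d_obs - 1e-12:
            hits += 1
    return hits / trials
```

Because D can only take multiples of 1/30 for two samples of 30, an estimate obtained this way need not coincide exactly with a value read from a smooth approximation formula.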
The Kolmogorov-Smirnov distance D offers an absolute measure of similarity between two samples based on statistical theory. In technical language, the probability that two samples drawn from the same distribution have a K-S statistic at least as great as D is called the p-value. For example, the p-value of two identical samples (D = 0) is p = 1. When the p-value is small, there is reason to reject the hypothesis that the two samples come from the same distribution. Conversely, the larger p is, the less reason there is to reject the hypothesis that the two samples were indeed drawn from the same distribution.
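For the univariate case, the conversion from D to a p-value can be sketched with the asymptotic approximation given in Press et al. for the two-sample test. The Python function below is an illustrative sketch: its name, the number of series terms and the cut-off below which p is set to 1 are assumptions of this example.

```python
import math


def ks_p_value(d, n1=30, n2=30, terms=100):
    """Approximate probability that two samples from the same distribution
    have a K-S statistic at least as large as d (univariate case)."""
    n_eff = n1 * n2 / (n1 + n2)  # effective number of data points
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        # The alternating series converges poorly here; p is 1 to many digits.
        return 1.0
    total = 0.0
    for j in range(1, terms + 1):
        total += (-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
    return min(max(2.0 * total, 0.0), 1.0)
```

For two samples of 30 and D = 0.3, this approximation gives p of about 0.109, which matches the 10.9% quoted in the text. The function also shows how nonlinear the relationship is: p stays close to 1 for small D and drops to practically 0 well before D reaches 1.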

The 2-D and 3-D Kolmogorov-Smirnov test
The classical K-S test presented above deals with real-valued variables (i.e. it is 1-dimensional). However, we characterize climates with three indicators, so we have to test the three-way joint probability distribution of AI, HDD and CDD. Generalization is not trivial because in higher dimensions there is no obvious total ordering relation, so the notion of cumulative distribution is not immediately applicable. We used generalizations of the K-S test for two (Peacock, 1983) and three (Fasano and Franceschini, 1987) dimensions.
In the case of two-dimensional samples, each data point is a pair of numbers, such as (AI, CDD). The approach of Peacock (1983) is best understood graphically, as illustrated in Figure 5 for the combination of annual Aridity Index and annual Cooling Degree Days over 30 years of simulated climate for Paris, from 2071 to 2100, and Barletta, Italy, from 1961 to 1990. It replaces the cumulative probability distribution with a description of the integrated probability in each of the 4 quadrants around a given reference point (x, y) of the sample. Practically, each data point of the sample is successively used as the reference point. For each such reference point, the relative frequencies for the two samples are calculated in each quadrant, as the ratio of the number of data points in the quadrant to the total number of data points. Finally, the K-S statistic D between two samples is the maximum difference of the relative frequencies in the 4 quadrants, when considering successively all data points as the reference point.

Figure 5: Spatial distribution of the combination of the two climate indicators annual Aridity Index and annual Cooling Degree Days (the two axes of the plot) from 30 years for Paris (2071-2100) and the southern Italian city Barletta (1961-1990), data from HadRM3H model. The Kolmogorov-Smirnov statistic D is the maximum difference of the integrated probabilities of the two distributions in the 4 quadrants around each data point. The figure displays this calculation with one data point as a reference, in which case the maximum difference is 9/30 = 0.3, found in the fourth quadrant. To find the maximum, the same calculation is performed for all data points, which yields the displayed difference of 9/30 = 0.3 as absolute maximum.
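The quadrant-counting procedure described for the two-dimensional case can be sketched as follows in Python. The function name is hypothetical. The sketch takes the plain maximum over reference points drawn from both samples and assigns points lying exactly on a quadrant boundary to the lower or left side; both are assumptions of this example, and published variants of the test differ in such details.

```python
def ks_2d_statistic(sample_a, sample_b):
    """2-D K-S statistic: largest difference, over all reference points and
    all 4 quadrants, between the relative frequencies of the two samples."""

    def quadrant_fractions(sample, x0, y0):
        counts = [0, 0, 0, 0]
        for x, y in sample:
            # Quadrant index built from "right of x0" and "above y0".
            index = (1 if x > x0 else 0) + (2 if y > y0 else 0)
            counts[index] += 1
        return [c / len(sample) for c in counts]

    d = 0.0
    # Each data point of either sample serves in turn as the reference point.
    for x0, y0 in list(sample_a) + list(sample_b):
        fractions_a = quadrant_fractions(sample_a, x0, y0)
        fractions_b = quadrant_fractions(sample_b, x0, y0)
        d = max(d, max(abs(p - q) for p, q in zip(fractions_a, fractions_b)))
    return d
```

Two identical samples give D = 0, while two samples lying in disjoint corners of the plane give D = 1, in line with the univariate intuition.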
Generalizing the 2-dimensional version of the K-S statistic D to the 3-dimensional case is straightforward. Each data point is a triple, for example (AI, CDD, HDD). These data points can be seen as a cloud in 3-dimensional space. There are 8 octants in the space around each data point instead of 4 quadrants in the plane. The K-S statistic D between two sample distributions is taken as the maximum difference of the relative frequencies when considering the 8 octants around all data points.
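The same counting idea extends to octants, and in fact to any number of dimensions, as in the Python sketch below. As before, this is an illustration and not the study's code: the function name is hypothetical, reference points are taken from both samples, and the plain maximum is used.

```python
def ks_nd_statistic(sample_a, sample_b):
    """K-S statistic for samples of d-dimensional points (d = 3 gives the
    8 octants around each reference point)."""

    def orthant_fractions(sample, origin):
        counts = {}
        for point in sample:
            # One boolean per coordinate identifies the orthant of the point.
            key = tuple(c > o for c, o in zip(point, origin))
            counts[key] = counts.get(key, 0) + 1
        return {key: n / len(sample) for key, n in counts.items()}

    d = 0.0
    for origin in list(sample_a) + list(sample_b):
        fractions_a = orthant_fractions(sample_a, origin)
        fractions_b = orthant_fractions(sample_b, origin)
        # Orthants that are empty for one sample count as frequency 0.
        for key in set(fractions_a) | set(fractions_b):
            gap = abs(fractions_a.get(key, 0.0) - fractions_b.get(key, 0.0))
            d = max(d, gap)
    return d
```

Each point would be a triple such as (AI, CDD, HDD) for one year, so that a climate is a cloud of 30 triples. Since the three indicators have very different units, it is worth noting that only comparisons of coordinates enter the computation, so no rescaling is needed.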
For statistical testing, the translation of the K-S statistic D into the p value in the multi-dimensional cases is based on the same Monte-Carlo methods as in the 1-dimensional case. Technically, given a distance D measured between two tested samples, the p value is the probability that the K-S distance between two samples randomly drawn from the same distribution is greater than D. It describes how similar the two samples are, or how plausibly they could come from the same probability distribution. In our case, it also indicates how well two climates, as described by multiple indicators, coincide. Technical details on the probability distributions used in the 2- and 3-dimensional cases are given in Appendix A.

Analogue Filtering, Selection and Visualization
The selection of the best current analogue to a city's future climate amounts to searching for a local minimum in D. We used two additional filters.
First, only grid points in the model with a p value greater than 0.5 were considered acceptable for further evaluation. Locations that reject the "same climate" hypothesis at a 50% confidence level were not acceptable. Compared with the usual practice of statistical testing at 95%, this is quite a low confidence level. But the purpose is not to test all analogues, only to simplify further computations by filtering out a large fraction of grid cells. When no grid cell is acceptable, the search fails.
Second, we penalized narrow optima by applying a low-pass spatial filter before minimizing the D field. The filter combined the score of a cell, with a 0.5 weight, with the scores of its four cardinal neighbors located at plus or minus 0.5° latitude/longitude, each with a 0.125 weight. Neighbors were obtained by interpolation when the data grid made it necessary. The justification for this smoothing is heuristic. The analogue is meant to represent a climate to readers who have a fuzzy mental representation of European climates. This goal is better accomplished when the optimum is within a large region of good analogues.
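The smoothing step can be sketched as a five-point weighted average in Python. The sketch assumes that the D field is stored as a list of rows on a regular grid whose spacing is 0.5°, so that the cardinal neighbors are simply the adjacent cells (no interpolation). It also renormalizes the weights at the border of the grid, which is a choice made for this example and is not described in the text. The function name is hypothetical.

```python
def smooth_d_field(d_field):
    """Low-pass filter: weight 0.5 for the cell itself and 0.125 for each of
    its four cardinal neighbors; weights are renormalized at the grid border."""
    rows, cols = len(d_field), len(d_field[0])
    smoothed = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total = 0.5 * d_field[r][c]
            weight = 0.5
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    total += 0.125 * d_field[rr][cc]
                    weight += 0.125
            smoothed[r][c] = total / weight
    return smoothed
```

An isolated cell with D = 0 surrounded by cells with D = 1 is smoothed to 0.5, so a narrow optimum scores worse after filtering than a cell of moderate D sitting inside a broad region of similar values.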
The optimum was found using exhaustive search, as this is nonconvex optimization with a finite, computationally tractable number of points (one per grid cell). Compared to , no further heuristic arbitration between candidate optima was needed. The smallest smoothed K-S statistic at an acceptable location was considered the best analogue. It was then possible to name the analogue according to the closest meteorological station or city.
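The filtering and exhaustive search can be sketched together in Python. The function name, the grid layout (lists of rows) and the return convention are assumptions of this example. The threshold of 0.5 on the p value is the one stated in the text.

```python
def best_analogue(smoothed_d, p_values, p_min=0.5):
    """Return the (row, column) of the acceptable grid cell with the smallest
    smoothed K-S statistic, or None when no cell is acceptable."""
    best = None
    for r, row in enumerate(smoothed_d):
        for c, d in enumerate(row):
            # A cell is acceptable only if its p value exceeds the threshold.
            if p_values[r][c] > p_min and (best is None or d < best[0]):
                best = (d, r, c)
    if best is None:
        return None  # the search fails
    return best[1], best[2]
```

Because every grid cell is visited once, the cost grows linearly with the number of cells, which remains small for a 50 km grid over Europe. Mapping the returned cell to the closest station or city would be a separate lookup step.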
Based on this method, two kinds of maps were drawn. The first kind is the "climate analogues quality" map, shown in Figure 7. It shows where one can currently find the future climate of a given city, by mapping the K-S statistic D on a regular grid of Europe, at the resolution of the original dataset. We used interpolation when the original dataset was not on a rectangular grid. Appendix C discusses why we chose to display D instead of the p-value. This kind of map allows the reader to check visually the quality of the "best" analogue, which is necessary since it involves nonconvex local minimization.
A second kind of map is the "climate relocation" map. It is obtained by selecting a set of cities, and displaying where their best analogue lies on a common map of Europe (see Figure 8). These maps communicate the directions and order of magnitude of climate changes expected over the course of the century. In order to convey the uncertainty related to climate simulations, it is important to always show several such maps obtained by different models or emissions scenarios.

Data and implementation
The method was implemented in Fortran, using g77 with the GrADS, NetCDF and CDO libraries. For city coordinates, we took the list of stations from the Global Historical Climatology Network 2 dataset. The code uses resources from Press et al. (1986), is released under the GPL and is available from the CIRED web site (http://www.centre-cired.fr). It can be parameterized to examine most large cities in Europe. For this paper, we examined analogues for 12 large European cities: Athens, Barcelona, Berlin, Helsinki, Istanbul, London, Madrid, Oslo, Paris, Rome, Saint-Petersburg and Stockholm (see Figure 6).

Figure 6: Cities examined in this study, displayed on a mean temperature background from the HadRM3H control run for a basic impression of relative temperatures.

The key inputs needed are regional 2D fields of mean monthly surface temperature and precipitation. Data should be at relatively high spatial resolution, on a grid of about 50 km. It should cover two 30-year spans, in order to compare the present and the future climates. Finally, it should cover a reasonably wide latitudinal zone, since warmer climates are to be found southward.
We used two climate simulation datasets from models of the PRUDENCE project (from ensemble 1 simulations described in Christensen and Christensen (2007)). One dataset is the DE6 run of the ARPEGE-Climate model from CNRM/Météo-France. This is a global circulation model with a variable horizontal resolution of up to 50 km in Europe. This atmospheric model was forced by sea surface temperature of the HadCM3 A2 model. The other dataset is the ackda run of the HadRM3H model from the Hadley Centre, a regional model with a 50 km resolution, forced by the global circulation model HadAM3H A2. Both models simulate a warming over Europe with an increase in precipitation in the North and a strong drying over the Mediterranean. In these datasets, the HadRM3H model simulates a stronger global warming response than the ARPEGE model, but both are within the range of the typical literature values according to the PRUDENCE intermodel comparisons. They both provided monthly mean temperature and precipitation over 30 years in the present climate (control run, 1961-1990) and the projected future climate (2071-2100).

Comparing two versus three indicators
We compared empirically the results of the method as described above, based on a 3-dimensional K-S test using three indicators (Aridity Index and both Degree Days), with a simplified version using only two indicators (and a 2-dimensional K-S test). There are three possible ways to pick two indicators out of three, but theoretically it is hardly defensible to throw away the Aridity Index and keep only the two temperature-based indicators. This is why we tested only (AI, HDD) and (AI, CDD). Figure 7, based on the HadRM3H model simulation, compares the climate analogues maps computed with three and two indicators for Paris and Saint-Petersburg. Logically, it can be seen that the first map in each row is like the fuzzy intersection of the second and third map.
The analogue location selected by the 3D test is also relatively good when tested with the 2-dimensional criteria, whereas the converse is not necessarily true. For example, for Paris the testing method with all 3 indicators found the best climate analogue close to the small Spanish city of Badajoz at the Spanish-Portuguese border with a p value of 90%. This location also evaluates to a p value of 100% in the 2-dimensional test with Aridity Index and Heating Degree Days as well as a p value of 75% in the 2-dimensional test with Aridity Index and Cooling Degree Days.
Also, the best analogue with (AI, CDD) may be a poor one when seen with (AI, HDD) or vice versa. In the same example, the locations of the best analogues found by either of the 2-dimensional tests (for the test with Aridity Index and HDD located in the Black Sea and for the test with Aridity Index and CDD close to the Spanish city of Ciudad-Real) evaluate to a p value of 0% in the other test. This example is representative of all 12 examined cities (see Appendix B).
In short, the results are not only theoretically but also empirically more satisfying using three indicators, and since the supplementary computational cost is modest, there is no reason to use just two. We did not look beyond three, but in some cases this may be useful, because there are often several analogues approximately as good as each other: see, for example, the dark red areas in Figure 7 for cities like Paris and Saint-Petersburg. Possible extensions, to name only a few, would be an indicator expressing seasonality to account for urban adaptation to seasonal variations, an indicator reflecting the effect of elevation in order to consider climatic particularities at different altitudes, or an indicator capturing water surplus to account for structural adaptations needed to combat extreme events linked to excess water such as flooding, landslides and erosion.

Figure 7: Comparison of the 3-dimensional K-S statistic results (with Aridity Index, HDD and CDD) and the two 2-dimensional K-S statistic results (with Aridity Index and HDD/CDD respectively) for Paris, Saint-Petersburg and Athens. Each city's actual location is indicated on each map along with the best climate analogue (if one exists). (HadRM3H model simulation)

Climate relocation maps
Variability in the prediction of future climate arising from the stochastic nature of climatic processes is accounted for, but deeper uncertainties remain. Climate relocation maps can be used to compare the output of climate models and to better understand the differences between climate change simulations. Figure 8 compares the analogues found for the different datasets: the ARPEGE and HadRM3H models projecting global warming. Figure 6 is the reference map of the actual locations of the 12 examined cities in Europe.
No good analogues were found for Athens in either model. For HadRM3H, Barcelona has an analogue near the town of Ouezzane in northern Morocco, and Rome has an analogue on the southern coast of Turkey, with Nicosia (the capital of Cyprus) being the closest analogue city. Neither has a good analogue for ARPEGE. Madrid, by contrast, has no good analogue for the HadRM3H simulation, while its analogue for ARPEGE is Biskra in Algeria. Berlin, London, Paris and Istanbul have good analogues near Chlef (Algeria), Vila Real (Portugal), Badajoz (Spain) and Kamaran (Turkey) respectively for HadRM3H, and Campobasso (Italy), Nantes (France), Vieste Aero near Rome (Italy) and Moron de la Frontera in Andalusia (Spain) for ARPEGE. Helsinki, Oslo, Stockholm and Saint-Petersburg have good analogues near Sandomierz (Poland), Teruel (Spain), Soria (Spain) and Ternopol (Ukraine) respectively for HadRM3H, and Banja Luka (Bosnia and Herzegovina), Klodzko (Poland), Lindenberg (Germany) and Rovno (Ukraine) for ARPEGE. Comparing the analogues found with the ARPEGE and HadRM3H models, which are two leading climate simulation models, gives an impression of the extent of uncertainty in climate change prediction for Europe. Despite the differences, however, both models agree in showing a clear drift towards warmer regions in the climate analogues. This supports the expected effect of global warming on European local climates towards the end of the 21st century, under the A2 greenhouse gas emission scenario. It has to be noted, however, that this simple two-model comparison provides only a limited impression of the uncertainty. Future studies should investigate multi-model and multi-scenario comparisons to further assess the applicability of the method for the visualization of uncertainty and possibly the identification of different sources of uncertainty. Also, uncertainties such as those arising from biases in the control runs representing present-day climate should be taken into consideration.

Conclusions
We described a method to analyze the results of climate simulation models, improving on . It is based on the concept of climate analogues, i.e. finding a City B whose present climate statistically corresponds to the simulated future climate of an evaluated City A. This provides an intuitive visualization of climate change effects on urban areas, by replacing the change of climate (in time) with a change of a city's location (in space). Through the use of several models and scenarios, this approach also clarifies the extent of the uncertainty in climatic change predictions, and in their effects on urban areas.
Climates were characterized using three annual indicators: Aridity Index, Heating Degree Days and Cooling Degree Days. These indicators can readily be computed from monthly precipitation and temperature datasets. To compare climates, we compared 30-year time series of these indicators using the two-sample three-dimensional Kolmogorov-Smirnov test. We found that using three instead of only two climate indicators provided a more satisfying analogue selection, at the cost of a moderate increase in computational complexity.
The limitations of the approach lie primarily with the assumption of climate stationarity and the interpretation of a density map by its maximum alone (the best analogue). Also, the analogues found might be physically implausible, such as locations in the Mediterranean or Black Sea, and the selection of climate indicators, focused primarily on characteristics derived from precipitation and temperature, could obscure important differences between a location and its analogue such as topography, length of day, level of economic development, etc. The analogues of Paris are representative of the kind of scientific policy-oriented message this method provides: according to one simulation, Paris could have at the end of the 21st century a climate similar to Vieste Aero near Rome. That may not be seen as an adverse change by many stakeholders. However, according to another simulation, Paris could also have the climate of the city of Badajoz in Southern Spain. It is widely held that heat waves and water shortages, which were not considered a significant problem in Paris only ten years ago, are nowadays recurring sources of trouble in the Badajoz area. This work illustrates, therefore, how new climate-related problems will appear in numerous cities because of climate change. The related evolution of natural risks has to be managed in the most proactive ways to avoid the repetition of costly surprises like the 2003 heat wave in Europe and its dramatic consequences.
In some cases no suitable analogue for the projected climate of a given city was found. This indicates a lack of the type of climate projected for the city within Europe, at a 50% confidence level. For example, Athens lacks a good analogue in Figure 7. It can only be supposed that a suitable analogue might be found further south. An obvious extension of this work would be to search for potential analogues not only within Europe but worldwide, and to assess sources of uncertainty within a larger range of models and scenarios. Another would be to search for the analogue using climatological observation data instead of model-based datasets.
Evidence of this method's communication value comes from its use in teaching and in European popular science and mass media (Kopf et al., 2007; Adam, 2007). In addition to, and in comparison with, existing socio-economic simulations for future scenarios (e.g. Lorenzoni et al., 2000; Kaivo-oja et al., 2004), climate analogues provide an alternative way to rigorously frame the climate change issue for cities, as well as an estimate of the extent of uncertainty in the prediction of climatic changes. Although the limitations and drawbacks of this method have to be kept in mind, it provides a strong basis for the visualization of climate change and allows socio-economic adaptation to different climates to enter the mental model.

Appendix A: Parameterization of the Kolmogorov-Smirnov tests
To determine the p value corresponding to a value of the 2- and 3-dimensional Kolmogorov-Smirnov statistic D, we derived a set of sample probability distributions from the procedure and data reported in Appendix A and B of Fasano and Franceschini (1987), and calculated the appropriate approximation formulae for each needed sample size (and a variety of correlation coefficients) through third-order polynomial interpolations using an appropriate function from Press et al. (1986). In the 2-dimensional case, data points of the probability distribution for the needed sample sizes were calculated using the polynomial expansion proposed by Fasano and Franceschini (1987) for the 2-dimensional Kolmogorov-Smirnov test. In the 3-dimensional case, the data points for the needed sample sizes were obtained by linear interpolation of the data calculated by Fasano and Franceschini (1987) with Monte Carlo simulations. The range of correlation coefficients covered (in both the 2- and 3-dimensional cases) was CC = 0, 0.5, 0.6, 0.7, 0.8 and 0.9, as values between 0 and 0.5 do not differ significantly from the uncorrelated CC = 0 case. As our calculations with the three climate indicators Aridity Index, HDD and CDD in the 3-dimensional test did not yield partial correlation coefficients exceeding 0.95, the average of the 3 CC could be used (Fasano and Franceschini, 1987). Tables 1 and 2 report the constants of the derived polynomials for the 2- and 3-dimensional cases of the main scenario of sample distributions with 30 samples (i.e. annual Aridity Index and Degree Days over 30 years). Sample points and polynomial estimates for the main scenario are furthermore visualized in Figures 9 and 10. All polynomials have the form:

Table 3: Robustness of analogues computed with 3 indicators (HadRM3H dataset). In each cell, the mini barchart shows the climatic similarity between the station named in that cell and the city indicated in column 1, for three different ways to define climatic similarity. The three different ways to define climatic similarity are: using all three climate indicators (left bar), using only aridity and HDD (middle bar), and using only aridity and CDD (right bar). The station named in each cell is the best analogue for one of the three ways, as indicated by the respective column caption. The vertical scale of the barcharts goes from 0 to 1, each bar representing a p-value, i.e. taller bars in the barcharts indicate more similar climates.
Table 3 column headings: City; City's future best analogue, climates compared on: Aridity, HDD and CDD; Aridity and HDD; Aridity and CDD.

Appendix C: Displaying K-S statistic D instead of the p-value
For visualization of the Kolmogorov-Smirnov test results, we have chosen to display the K-S statistic D rather than the corresponding p value. The relationship between D and p is monotonic, so mathematically no information is lost, and the selection does not affect the location of the optimum analogue. Figure 11 illustrates the difference between a D map and a p map. Admittedly, displaying p-values would be more meaningful to the statistician theoretically. But to everyone else, the K-S statistic gives a better visual indication of graduated differences between climates. This is because, as Figure 4 shows, the p(D) function is very nonlinear. Lower values of D give practically p = 1, and higher values give p = 0. Therefore, displaying p on a linear scale tends to produce more categorical maps of "good" versus "poor" analogues, while displaying D produces more gradual, aesthetically pleasing maps. Figure 12 shows the 3-dimensional K-S statistic for all 12 examined cities (HadRM3H model simulation).