A hypothetical Pan-European Indoor Radon Map has been developed using summary statistics estimated from 1.2 million indoor radon samples. In this study we have used the arithmetic mean (AM) over grid cells of 10 km

Radon (Rn) is the major contributor to the ionizing radiation dose received by the general population and is the second leading cause of lung cancer death after smoking (WHO, 2009). Worldwide, radon exposure is linked to an estimated 222 000 of the 1.8 million lung cancer cases reported per year (Gaskin et al., 2018), and in Europe alone it has been estimated that 18 000 lung cancer cases per year are induced by radon (Gray et al., 2009). Since lung cancer survival rates after 5 years can be below 20 % (Cheng et al., 2016), a reduction in radon exposure will have a significant positive impact on the health of the general population. In this context, the EU recently revised and consolidated the Basic Safety Standards Directive (Council Directive 2013/59/EURATOM), which aims to reduce the number of radon-induced lung cancer cases.

The main sources of radon indoors are the subsoil on which buildings are located, the groundwater used in the building, and the building materials (Cothern and Smith, 1987). Consequently, radon is present everywhere. The likelihood of having a high indoor radon concentration is, however, higher in some areas than in others. Radon maps are therefore an essential large-scale tool for identifying the areas where high concentrations are more likely, helping policymakers to design cost-effective radon action plans (Gray et al., 2009). Importantly, because of high local variability, large-scale Rn maps cannot predict the Rn concentration in a particular building; that requires measurements in the building itself.

In 2006, the EU's Joint Research Centre (JRC) launched a long-term project
to map radon at the European level (Tollefsen et al., 2014). For more than 10 years now, the JRC has been developing the European Atlas of Natural Radiation (Cinelli et al., 2019). It includes maps of the natural radioactive levels of (i) annual cosmic-ray dose; (ii) indoor radon concentration; (iii) uranium, thorium, and potassium concentration in soil and in bedrock; (iv) terrestrial gamma dose rate; and (v) soil permeability. Digital versions of these maps are available from a JRC website (

The European Indoor Radon Map (EIRM) displays the annual average indoor
radon concentration (Rn;

The dataset underlying the EIRM represents a huge amount of work. At the time of writing (end of 2018), 32 countries (EU and non-EU member states alike) had contributed data, and information from almost 1.2 million dwellings had been aggregated into 28 468 grid cells. Since some cells overlap between countries, 28 203 of these grid cells were filled by one country, while 262 and 3 grid cells were filled by two and three countries, respectively (i.e. border areas which share the same grid cell) (version: 29-09-2018). However, a large number of grid cells over European territory still have no data, and the number of measurements per grid cell varies widely, from many cells with only one measurement up to a single cell with 23 993 dwellings sampled (Table 1). Evaluating the radon exposure of European citizens would therefore require another 10 years, or more, if it had to be based on indoor radon measurements in every grid cell.

Number of dwellings sampled by grid cells of 10 km

Interpolation techniques are therefore essential at this stage to predict a
mean indoor radon concentration in the grid cells for which no or few data
are available, and thus develop a Pan-European Indoor Radon Map. We have
tested four interpolation techniques: two that use solely indoor radon
concentration measurements, viz. inverse distance weighting (IDW) and
ordinary kriging (OK), and another two which also take into account geological information, viz. collocated cokriging with the uranium concentration
in topsoil as a secondary variable (CCK) and regression kriging with topsoil
geochemistry and bedrock geology as secondary variables (RK). Cross-validation exercises were carried out to assess the uncertainties
associated with each method. The map generated here is a hypothetical indoor
Rn map in the sense that it estimates the mean per 10 km

Arithmetic mean (AM_z) over 10 km

Histogram and q–q plot of average indoor radon concentration (AM_z) on the ground floor of dwellings.

The primary dataset used to predict the mean per grid cell with no or few
data is the one of arithmetic means (AM_z). The AM was assigned to the centre of each grid cell, and predictions were carried out only in grid cells where U, Th, and

In the study area (i.e. area with topsoil geochemistry data) there are
25 367 grid cells with indoor radon measurements (Fig. 1). The distribution of the AM is approximately log-normal (Fig. 2), with values ranging from 1 to 10 116 Bq m

Summary statistics of indoor radon data (AM_z) after merging border grid cells (

A mean (over a 10 km

The inverse distance weighting (IDW) interpolation technique estimates a
weighted average at an unsampled point (

The result is highly influenced by the inverse distance weighting power
chosen. An optimal value of

Inverse distance weighting power (idp) optimization.
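As an illustration of the IDW estimator and of how its power can be tuned, the following is a minimal Python sketch. The function names (`idw_predict`, `loo_rmse`), the use of all observations as neighbours, and the leave-one-out RMSE as tuning criterion are assumptions made for this example; they are not taken from the study, which may have used a different search neighbourhood and criterion.

```python
import numpy as np

def idw_predict(xy_obs, z_obs, xy_new, power=2.0):
    """Weighted average of observations with weights 1 / distance**power."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    xy_new = np.asarray(xy_new, dtype=float)
    # Distances between prediction points (rows) and observations (columns).
    d = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    preds = np.empty(len(xy_new))
    for i, row in enumerate(d):
        hit = row == 0.0
        if hit.any():
            # A prediction point that coincides with data returns the data value.
            preds[i] = z_obs[hit].mean()
        else:
            w = 1.0 / row**power
            preds[i] = np.sum(w * z_obs) / np.sum(w)
    return preds

def loo_rmse(xy_obs, z_obs, power):
    """Leave-one-out RMSE for a given power (assumed tuning criterion)."""
    xy_obs = np.asarray(xy_obs, dtype=float)
    z_obs = np.asarray(z_obs, dtype=float)
    errors = []
    for i in range(len(z_obs)):
        keep = np.arange(len(z_obs)) != i
        pred = idw_predict(xy_obs[keep], z_obs[keep], xy_obs[i:i + 1], power)[0]
        errors.append(pred - z_obs[i])
    return float(np.sqrt(np.mean(np.square(errors))))
```

A candidate power could then be chosen as the value that minimizes `loo_rmse` over a grid of trial values, for example `min(np.arange(0.5, 4.5, 0.5), key=lambda p: loo_rmse(xy, z, p))` for hypothetical arrays `xy` and `z`.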

Trans-Gaussian kriging using Box–Cox transforms (function krigeTg in R software, packages “gstat” and “MASS”; Gräler et al., 2016; Kendall et al., 2016; Pebesma, 2004; R Core Team, 2018; Venables and Ripley, 2002) was performed with the arithmetic mean. The normal transformation of data (

The variogram was modelled with two components: a Matérn model (Minasny and McBratney, 2005; Pardo-Iguzquiza and Chica-Olmo, 2008) up to a distance of 50 km and an exponential model up to 1500 km (Fig. 4). The very low kappa (0.15) points to high “roughness” of the field. Predictions were then carried out with observations within a distance of 1000 km and using a minimum and a maximum number of nearest observations of 5 and 75, respectively.

Model variogram (blue line; green dots are pairs of points up to a distance of 50 km and red points up to 1500 km) and 100 variograms from random permutations of the data (grey lines).

Collocated cokriging (CCK) is a special case of cokriging where only the direct correlation between the primary (e.g. AM_z) and the secondary variables (e.g. U) is used, ignoring the direct variogram of the secondary variable and the cross variograms. This simplifies the cokriging equations, although the secondary variable must be sampled at all prediction points (Bivand et al., 2008). The method is a simplification of the physical reality because the dependence structure between covariates is more complex, as they result from different physical processes.

We performed the CCK with the uranium concentration in topsoil as a secondary
variable since radon is generated in the uranium decay series (Cothern and Smith, 1987), and a positive correlation between uranium and indoor radon is therefore expected. The analysis was carried out with the data log-transformed and then back-transformed to the original scale (AM_z) with Eqs. (16) and (17) (where

Regression kriging (RK) is a two-step interpolation technique: first, a regression estimation of the dependent variable (e.g. AM_z) is performed against secondary variables (e.g. geogenic factors), and, second, an analysis of the spatial distribution of the residual is carried out using geostatistical methods (i.e. OK; Pásztor et al., 2016). The final estimates are the sums of the regression estimates and the ordinary kriging estimates of the residuals (Di Piazza et al., 2015). The analysis was also carried out with the log-transformed data and directly back-transformed with the same equation as in CCK.
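The two steps of RK can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the residual variogram is taken to be an exponential model without nugget, with hypothetical parameters `sill` and `vrange` that are not the values fitted in this study; all observations are used as neighbours; and the response `y_obs` is assumed to be already log-transformed, with the back-transformation omitted.

```python
import numpy as np

def exp_variogram(h, sill, vrange):
    """Exponential variogram without nugget (assumed model, not the fitted one)."""
    return sill * (1.0 - np.exp(-h / vrange))

def ordinary_kriging(xy_obs, r_obs, xy_new, sill, vrange):
    """Ordinary kriging predictions of r at xy_new, using all observations."""
    xy_obs, xy_new = np.asarray(xy_obs, float), np.asarray(xy_new, float)
    r_obs = np.asarray(r_obs, float)
    n = len(r_obs)
    d_obs = np.linalg.norm(xy_obs[:, None, :] - xy_obs[None, :, :], axis=2)
    # Kriging system; the extra row/column is the Lagrange multiplier that
    # forces the weights to sum to 1.
    A = np.ones((n + 1, n + 1))
    A[:n, :n] = exp_variogram(d_obs, sill, vrange)
    A[n, n] = 0.0
    d_new = np.linalg.norm(xy_new[:, None, :] - xy_obs[None, :, :], axis=2)
    b = np.ones((n + 1, len(xy_new)))
    b[:n, :] = exp_variogram(d_new, sill, vrange).T
    weights = np.linalg.solve(A, b)[:n, :]
    return weights.T @ r_obs

def regression_kriging(xy_obs, X_obs, y_obs, xy_new, X_new, sill, vrange):
    """Linear regression on covariates plus ordinary kriging of the residuals."""
    y_obs = np.asarray(y_obs, float)
    # Step 1: ordinary least squares with an intercept.
    D_obs = np.column_stack([np.ones(len(y_obs)), np.asarray(X_obs, float)])
    D_new = np.column_stack([np.ones(len(X_new)), np.asarray(X_new, float)])
    beta, *_ = np.linalg.lstsq(D_obs, y_obs, rcond=None)
    residuals = y_obs - D_obs @ beta
    # Step 2: ordinary kriging of the regression residuals.
    r_new = ordinary_kriging(xy_obs, residuals, xy_new, sill, vrange)
    # Final estimate: regression estimate plus kriged residual.
    return D_new @ beta + r_new
```

Because the assumed variogram has no nugget, the sketch reproduces the observed values exactly at the data locations; a fitted model with a nugget component would smooth them instead.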

The technique applied in the regression step can vary (Li and Heap, 2008); here, we have performed a linear regression using topsoil geochemistry (i.e. U and

Simplified geology map with geological units defined on a lithology basis (Nogarotto et al., 2018). The base geological map is the IGME (Asch, 2003).

The procedure is therefore (i) to fit a linear model to the data (Fig. 7a and Table 3), where the total indoor radon variance explained by U,

ANOVA table for indoor radon concentration.

The performances of the different methods were assessed by

The accuracy of the different methods was assessed using six indicators: the
mean absolute error (MAE), the root-mean-square error (RMSE), the
root-mean-square

The

Box plot of the

The

Similar results are obtained in the MWCV exercise (Table 5). Geostatistical techniques (i.e. OK, CCK, RK) also have the highest

Moving-window cross-validation results.
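Two of the accuracy indicators named above, the mean absolute error (MAE) and the root-mean-square error (RMSE), can be computed from paired observed and cross-validated values as in the following minimal Python sketch. The function names and argument order are choices made for this example, and the remaining indicators used in the study are not covered.

```python
import numpy as np

def mae(observed, predicted):
    """Mean absolute error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs(predicted - observed)))

def rmse(observed, predicted):
    """Root-mean-square error between observed and predicted values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.sqrt(np.mean((predicted - observed) ** 2)))
```

RMSE penalizes large errors more strongly than MAE, so a gap between the two indicates that a few grid cells are predicted much worse than the rest.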

Radon predictions with the different methods range from minimum values of
1–4 Bq m

Summary of indoor radon predictions (AM, ground floor).

Small differences can be observed among the predictions of the different
interpolation techniques (Fig. 9). IDW and OK are methods that rely on the Rn data only, while CCK and RK use additional predictors (i.e. geology, U and

Indoor radon predictions (AM (Bq m

The influence of geogenic factors on indoor radon is well known and routinely used for radon mapping (e.g. Casey et al., 2015; Elío et al., 2017; Pásztor et al., 2016; Scheib et al., 2013; Tondeur et al., 2014). In our case, an ANOVA (Table 3) shows that the total indoor radon
variance explained by U,

Geology is associated with both uranium and radon sources and with physical
properties which permit the release of radon from the soil matrix and its
transport in the environment (e.g. mineralogy, porosity, permeability). The total indoor radon variance explained by geology is normally of
the order of 5 %–25 % (Appleton and Miles, 2010; Borgoni et al., 2014; Miles and Appleton, 2005; Tondeur et al., 2014; Watson et al., 2017), although it depends on the scale of the geological map (i.e. it increases with the scale; Appleton and Miles, 2010). An explained indoor radon variance of 4.64 % is therefore reasonable, taking into account that we used a simplified

The positive correlation between indoor radon and potassium is, however, not
evident.

Summary of indoor radon at the European scale.

Back-transforming predictions to the original scale is a critical point of
log-normal and trans-Gaussian kriging. OK as given in this study solves this
problem by using the Lagrange multiplier in the back-transformation. However, the

Finally, a theoretical problem with kriging-type interpolators is that the input data are actually cell or grid means (blocks in geostatistical language) but are treated as point samples. The change-of-support problem, which is particularly unpleasant in log-normal kriging, may be alleviated here because the input and target supports are the same: we regard the input data as point data at the cell centre, and we estimate points at other locations that again represent cells of the same size. However, the theoretical aspect remains to be clarified in more depth. Despite these limitations and weaknesses, the solution demonstrated here represents an acceptable compromise between mathematical exactness, numerical tractability, and complexity of the physical realm.

We would like to produce the Pan-European Indoor Radon Map with minimal data
processing, and therefore we prefer to estimate the radon average directly
from the indoor radon measurements carried out in each grid cell (i.e. AM_z). However, if the number of measurements is low, the uncertainty of this value can be high. If dwellings are randomly selected and therefore representative, which is the condition for unbiased estimates of the mean and other statistics, and the sample size is large, the mean value and the confidence interval would be (Eq. 24)
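A minimal Python sketch of such an interval is given below. It assumes the usual large-sample normal approximation, mean ± 1.96 s/√n for a 95 % interval, with s the sample standard deviation; this form and the function names are assumptions for illustration and may differ from the exact expression of Eq. (24).

```python
import math
import statistics

def mean_with_ci(values, z=1.96):
    """Arithmetic mean and approximate confidence interval (normal approximation).

    z = 1.96 corresponds to a 95 % interval for large samples (assumption).
    """
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)          # sample standard deviation (n - 1)
    half_width = z * sd / math.sqrt(n)     # shrinks as 1 / sqrt(n)
    return mean, mean - half_width, mean + half_width

def relative_half_width(values, z=1.96):
    """Half-width of the interval relative to the mean."""
    mean, lower, upper = mean_with_ci(values, z)
    return (upper - mean) / mean
```

Because the half-width scales with 1/√n, quadrupling the number of dwellings sampled in a grid cell roughly halves the relative uncertainty of its mean.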

The confidence interval narrows as the sample size increases. In our
case (Fig. 10), the relative (to the mean) CI

Variation in the 95 % confidence interval of the arithmetic mean according to the sample size (

Grid cells with 30 or more indoor radon measurements (

For the final Pan-European Indoor Radon Map (Table 7 and Fig. 12), we therefore use the AM of the grid cells with 30 or more measurements (Fig. 11)
and the value predicted by RK (Fig. 9) in the cells with fewer than 30 measurements. Indoor radon concentration ranges from 3 to 2662 Bq m

Final Pan-European Indoor Radon Map.
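The rule used to assemble the final map, the measured AM where a grid cell has 30 or more measurements and the RK prediction otherwise, can be written as a short Python sketch. The function and argument names are hypothetical, and the inputs are assumed to be arrays aligned cell by cell.

```python
import numpy as np

def combine_map(am, n_meas, rk_pred, min_n=30):
    """Per grid cell: measured AM if n_meas >= min_n, else the RK prediction."""
    am = np.asarray(am, dtype=float)
    n_meas = np.asarray(n_meas)
    rk_pred = np.asarray(rk_pred, dtype=float)
    use_measured = n_meas >= min_n
    # Cells without measurements (n_meas == 0, AM possibly NaN) fall back to RK.
    return np.where(use_measured, am, rk_pred)
```

For example, cells with 35, 4, and 0 measurements would take the measured AM, the RK prediction, and the RK prediction, respectively.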

After more than 10 years of collecting and processing Rn data, with the
support of 32 European countries, we could cover approximately 50 % of the
continent with 10 km

Of the four methods tested in this study, regression kriging (RK), using a
simplified geological map and the topsoil concentration of U and

The Pan-European Indoor Radon Map is not a finished map, and it will be
upgraded as new data become available. In future versions a larger scale of
the geological map (e.g. scale

The indoor radon data related to this article are confidential. Additional data and code can, however, be made available by the authors upon request. The latest versions of the maps used in the article can be viewed on a JRC website (

JE was mainly responsible for the data analysis and interpretation, and he wrote the article with inputs from all authors. PB and JLGV contributed to the data analysis and interpretation. AN and RB carried out the simplification of the geological map of Europe. GC, TT, and MDC, contact persons of the European Atlas of Natural Radiation, have provided the indoor radon data – collecting them from national authorities, maintaining the datasets, and upgrading them when new data are available. They also helped with data interpretation.

The authors declare that they have no conflict of interest.

We wish to thank all the national competent authorities, universities, and
laboratories who have provided, and continue to provide, indoor radon data
to the JRC (see

This paper was edited by Heidi Kreibich and reviewed by two anonymous referees.