RADON POTENTIAL MAPPING: THE PROBLEM OF GEOGRAPHIC SCALE
George Aspbury
Department of Geography-Geology, Illinois State University
Normal, IL
ABSTRACT
The purpose of this study is to investigate the significance of geographic scale in the construction of radon
hazard potential maps using a vector-based Geographic Information System (GIS). I compare and discuss radon
potential maps produced at the national scale by the United States Geological Survey, several state-level maps based
on observed residential radon levels using counties and zip-code areas as the units of analysis, and lastly, a highly
localized (large-scale) map based on individual (non-aggregated) observations. In addition, one-way analysis of
variance is used to test the relationship between type of glacial deposit and observed
radon levels. I demonstrate that when data is aggregated at both the zip-code and county scales, there appears to be
no strong relationship between Quaternary units and observed radon levels. However, at the large scale, statistically
significant relationships are observed. I also demonstrate that such analyses within a GIS environment aid in the
more meaningful construction of radon hazard potential maps.
INTRODUCTION
Due to the efforts of the United States Environmental Protection Agency, the public has become
increasingly aware of the potential risks and health hazards associated with elevated levels of residential radon.
Several studies in recent years, however, have questioned the extent to which radon is a contributing
risk factor associated with lung cancer mortality. Other studies have gone so far as to completely reject the notion
that residential radon is related to lung cancer. A moderate position on this issue is to assume that residential radon is
in fact a causal factor, but the critical level of residential radon may have to be reassessed. If this is the case, one can
argue for additional research.
This paper explores some of the difficulties inherent in the cartographic and geographic information
systems (GIS) analysis of residential radon data. I also examine and demonstrate some of the problems of
geographic scale and how the development of radon potential maps must carefully consider this factor. This is
particularly critical when researchers are attempting to explain spatial variations in radon concentrations as the result
of a multivariate process. As a general rule, very different spatial patterns may develop as a result of the same
explanatory processes operating at different spatial scales. The data used in this study is drawn from a variety of
sources and for three different geographical scales for the state of Illinois.
CARTOGRAPHIC AND GIS ANALYSIS OF RADON DATA
It has been well established that residential radon levels are not geographically constant but
typically exhibit significant geographical variation regardless of the spatial scale of the data. If a map is worth a
thousand words, then it is only natural to represent tabular data on residential radon levels cartographically. Thus, for
example, the United States Geological Survey has published a map (Map 1) of the United States, "The Generalized
Geologic Radon Potential of the United States" (United States Geological Survey, 1993). This simple map identifies
three categories of radon potential regions: low potential areas (<2 pCi/L), moderate potential areas (2-4 pCi/L),
and high potential areas (>4 pCi/L). As a generalization, this map is an excellent first step in that it links
characteristics of geologic provinces to radon potential. With reference to this work, the "EPA's Map of Radon
Zones: Illinois" (United States Environmental Protection Agency, 1993) states: "It is important to note that EPA's

1996 International Radon Symposium 1 4.1
extrapolation from the province level to the county level may mask significant "highs" and "lows" within specific
counties. In other words, within-county variations in radon potential are not shown on the Map of Radon Zones."
Radon level and radon potential maps have also been developed for individual states; these state-level maps
have been developed by extrapolating from the national geologic province level to the county level. Other state-level
studies, namely for Ohio, have been more ambitious, treating residential radon levels aggregated at the zip-code level
and incorporating geologic factors into a large database (Kumar et al., 1990). A number of these same studies also
endeavor to explain the resulting spatial patterns on the basis of certain geologic factors, including soil permeability,
thickness of glacial deposits, type of bedrock, depth to bedrock, etc.
Recent developments in automated and analytical cartography and geographic information systems (GIS)
have greatly improved the extent and quality of radon potential analyses and research. GIS platforms (including
software and hardware) have simultaneously become less expensive, more sophisticated, more powerful and
certainly more user-friendly than they were only several years ago. While GIS has become a progressively more
widely accepted technology in a variety of disciplines, problems can and do arise because of the manner in which
the data itself is handled by the researcher.
Ordinary descriptive statistics and the compilation of data in the form of frequency distributions applied to
radon levels can provide obvious and useful insights about the extent and intensity of the residential radon hazard. In
this paper I am concerned with some problems inherent in analyzing and visualizing such data, namely identifying
the spatial distribution of the data for a given geographic area. A number of studies routinely incorporate maps in
which radon concentrations are cartographically displayed. Most published maps of radon concentrations that this
author is aware of use either counties or zip-code areas as the spatial units. In all likelihood, maps using either
of these geographical units were ultimately constructed by aggregating individual residential observations. This
data aggregation, however, raises serious questions with respect to the usefulness of such maps for analytical
purposes. Stated differently, these maps are a generalization of the reality of the pattern of spatial distribution and
thus may mask very significant statistical variations both within and among the spatial units being utilized. The
question arises then, as to what are the appropriate spatial units that should be used in the construction of more
powerful, predictive maps of radon potential.
SPATIAL SCALE AND DATA AGGREGATION
The creation of frequency distributions from 'raw' or 'unclassified' data forces the researcher to carefully
consider issues such as the number of frequency classes and the upper and lower limits of each of these frequency
classes. Failure to select a meaningful number of classes or appropriate class intervals can lead to erroneous
interpretation of the generalized data. An analogous problem is faced by geographers or others creating choropleth
maps. Barber (Barber, 1988) states, "...spatial aggregation tends to reduce the variation depicted on a map.
Comparisons of maps of a variable at different levels of aggregation must take into account the ramifications of this
variance reduction. The problem becomes even more acute when we try to examine the relationship between two
maps of different variables." He proceeds to illustrate this with the following example (Figure 1).
Figure 1: Spatial Scale and Aggregation Effects on Summary Statistics
Fig. 1.a, LARGE SCALE DATA: m = 7.5, s² = 9.75
Fig. 1.b, MEDIUM SCALE DATA: m = 7.5, s² = 1.75
Fig. 1.c, SMALL SCALE DATA: m = 7.5, s² = 0.00
In this illustration assume that each group of square cells is a map of some variable for the same region. In
Figure 1.a, the region is subdivided into sixteen square cells and the mean for the entire region is m = 7.5 with a
variance s² = 9.75. If this data is then aggregated by combining two horizontally adjacent cells (Figure 1.b) such that
the region is partitioned into only eight cells, the mean remains unchanged but the variance is now reduced to s² =
1.75. If further data aggregation and generalization is undertaken (Figure 1.c) such that the region is partitioned into
only four subregions, the mean again remains unchanged but the variance is now s² = 0.00. In other words, all of the
original statistical and spatial variation in the data has been lost. Other examples could be provided in which both
mean and variance would change in response to different spatial configurations and aggregations of the data.
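The variance-reduction effect in this illustration can be reproduced with a short sketch. The cell values below are hypothetical: they were chosen only so that the summary statistics agree with those quoted for Figure 1 (m = 7.5 with s² = 9.75, 1.75 and 0.00) and are not Barber's actual data. The function names are illustrative, and the variance is assumed to be the population variance.

```python
from statistics import mean, pvariance

# Hypothetical 4 x 4 grid of cell values (an assumption, not Barber's data).
GRID = [
    [13, 6, 12, 5],
    [2, 9, 3, 10],
    [11, 6, 10, 7],
    [4, 9, 7, 6],
]


def aggregate(grid, block_rows, block_cols):
    """Average non-overlapping blocks of block_rows x block_cols cells."""
    out = []
    for r in range(0, len(grid), block_rows):
        row = []
        for c in range(0, len(grid[0]), block_cols):
            cells = [grid[i][j]
                     for i in range(r, r + block_rows)
                     for j in range(c, c + block_cols)]
            row.append(mean(cells))
        out.append(row)
    return out


def summarize(grid):
    """Return (mean, population variance) over all cells of a grid."""
    values = [v for row in grid for v in row]
    return mean(values), pvariance(values)


large = summarize(GRID)                    # sixteen cells
medium = summarize(aggregate(GRID, 1, 2))  # eight cells, horizontal pairs
small = summarize(aggregate(GRID, 2, 2))   # four subregions

for label, (m, s2) in [("large", large), ("medium", medium), ("small", small)]:
    print(f"{label:>6} scale: m = {m:.2f}, s2 = {s2:.2f}")
```

The mean is identical at every level because each aggregated cell is an average of equally sized groups; only the variance shrinks.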
Another example of aggregation effects is presented by Clark and Hosking (Clark and Hosking, 1986) and
Clark and Avery (Clark and Avery, 1976). With particular emphasis on the effects of data aggregation with respect
to correlation and regression analyses, they observed that, "Aggregation of observational units on the basis of
proximity leads to substantially biased correlation coefficients, with an increase in r as the level of grouping
increases." In the case of cartographic representation of radon concentrations or the development of radon potential
maps, the researcher should utilize (in most cases) data from the smallest geographic units available. Similarly in the
development of multivariate statistical models in which radon concentration, the dependent variable, Y, is a function
of a set of independent variables, Xi, or symbolically represented as:
$$ Y = f(X_1, X_2, \ldots, X_n) + e $$
Equation 1
the effects of data aggregation must be acknowledged in the research design and dealt with explicitly. Failure to
account for this effect may lead to erroneous and misleading results. If it is operationally necessary or imperative to
aggregate data into larger geographic units, I suggest that aggregation should occur among proximal units with
approximately the same data values. This practice will tend to minimize the problem of variance reduction. In a
quantitative sense, the effect of numerical map generalization is to obscure or mask fundamental and important
spatial variations in the data.
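The suggestion to aggregate proximal units with similar values can be illustrated by asking how much of the cell-level variance survives on the aggregated map. A minimal sketch follows, assuming a hypothetical 2 x 4 grid and two alternative ways of pairing adjacent cells; all values and names are assumptions made for illustration only.

```python
from statistics import mean, pvariance

# Hypothetical 2 x 4 grid of cell values (not taken from the study).
GRID = [
    [2, 3, 9, 8],
    [9, 8, 2, 3],
]

# Two ways of merging adjacent cells into four two-cell units.
# Each unit is a list of (row, column) positions.
HORIZONTAL = [[(0, 0), (0, 1)], [(0, 2), (0, 3)],
              [(1, 0), (1, 1)], [(1, 2), (1, 3)]]
VERTICAL = [[(0, c), (1, c)] for c in range(4)]


def retained_share(grid, units):
    """Share of the cell-level variance that survives aggregation."""
    cells = [v for row in grid for v in row]
    unit_means = [mean([grid[r][c] for r, c in unit]) for unit in units]
    return pvariance(unit_means) / pvariance(cells)


share_similar = retained_share(GRID, HORIZONTAL)   # neighbors with like values
share_dissimilar = retained_share(GRID, VERTICAL)  # neighbors with unlike values
print(f"similar neighbors merged:    {share_similar:.3f}")
print(f"dissimilar neighbors merged: {share_dissimilar:.3f}")
```

Both schemes merge adjacent cells, but only the one that pairs like values keeps most of the original variation.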
Cartographers and geographers make critical distinctions with respect to spatial data. Any univariate spatial
data set can be reduced to a geometric primitive of point, line, area or volume. Of concern here is the distinction
between areal data and point data. Areal data is most commonly represented in the form of a choropleth map (for
example, Map 2). The implicit assumption is that the variable being mapped is continuous and constant over each
areal unit and hence, any variance within the original data aggregated to this scale is reduced to zero. Furthermore,
the data, in a mathematical sense, is implicitly discrete.
In many cases, it is common practice to assign the data variable for each areal unit to the centroid of that
unit and then use some automated technique to contour this data. The result is a contour map which is a continuous
representation of the variable. A three dimensional representation of this data can also be easily generated from a
contour map to produce a continuous data surface. Barber states, "Cartographers sometimes argue that a continuous
areal representation is appropriate if the phenomena exists everywhere on the map, both at and between observation
points. This argument appears to be valid for many physical phenomena such as rainfall and temperature, but is less
compelling for a variable such as population density." (Barber, 1988).
GIS DATA SETS
In this study, three separate and independent data sets are used in the analysis of radon concentration levels
in Illinois. The first data set is drawn from data published by the U.S. Environmental Protection Agency (U.S.
Environmental Protection Agency, 1993). This data is published in tabular form and also shown cartographically as
a choropleth map. The second data set consists of radon concentration values compiled by an independent radon
testing agency. Although this data contains information for individual residential radon levels, the locational
information for each of these is identified at the zip-code level rather than the county level. The third data set, used to
observe microscale radon levels within a single county, was provided by another testing company. Most of the
individual observations in this data are concentrated in the central Illinois area. For the purpose of this study this data
set was further reduced to include only observations located within the Bloomington-Normal, Illinois, metropolitan
area. The locational information in this set contains both a zip-code identifier and street address for each observation.
This data has been treated in such a manner as to guarantee complete locational confidentiality of individuals by
minimally aggregating the data at the subdivision level. Each of the three data sets was provided in digital form,
either in a spreadsheet or a database file format. Because of the relative ease of file conversion, each set was
translated into a standard dBase IV format.
The second major task was to create a standard digital map such that the county outline map, zip-code area
map, and address location map all share a common coordinate system. A pcARC/INFO coverage of county
boundaries was provided by the Illinois Department of Energy and Natural Resources. The zip-code area map data
was translated from an Atlas GIS data file to a pcARC/INFO coverage. For purposes of address matching for the
Bloomington-Normal data, the most recent version of the U.S. Bureau of the Census TIGER (Topologically
Integrated Geographic Encoding and Referencing) line files was utilized. These files were translated into a pcARC/INFO
coverage. Because TIGER line files are based upon latitude and longitude coordinates, these were converted into a
Lambert conformal projection using state plane feet as the coordinate units to conform to the same projection units
as the zip-code and county coverages.
Each of the three radon concentration data sets was then related to its respective GIS coverage. The three
separate resulting pcARC/INFO coverages could then be individually displayed and queried as well as overlaid on
other coverages. For analytical purposes, the Illinois State Geological Survey provided pcARC/INFO coverages of the
stack unit map (digital map and related database) and Quaternary deposits for Illinois. They
additionally provided bedrock depth and thickness of Quaternary deposits coverages for McLean County, Illinois,
which will be used in subsequent analyses of the radon concentration data for the Bloomington-Normal metropolitan
area.

COUNTY LEVEL DATA ANALYSIS
Table 1 summarizes data on radon concentrations by county. Of the 102 counties in the state, fifty-nine
(58%) are located in areas dominated by Wisconsinan and Woodfordian stage glacial deposits. Although only seven
counties are associated with Liman stage deposits, these counties have the highest mean radon concentrations (6.47
pCi/L). Only Woodfordian counties have a mean above 4.00 pCi/L.

Average county radon concentration is shown in Map 2. Clearly the counties having average radon
concentrations above 4.00 pCi/L are geographically concentrated in the central, west-central and northern tier of
counties in the state. Recall that representing data in this fashion implies that, even as average values, these values
are assumed to be uniform over each entire county and the county boundaries imply sharp transitions in average
value. The continuous nature of the data is more effectively shown by contouring the data. pcARC/INFO does not
provide a tool for contouring discrete data values. To remedy this problem, a point coverage using county centroids
was developed and x- and y-coordinates assigned to each point. This database was then imported into SURFER
(Golden Software, 1995), a powerful and versatile contouring and 3D surface mapping package, and gridded using
the kriging routine. This provides a more meaningful cartographic representation of the spatial trends of
radon concentrations across the state. Map 4 is a three-dimensional representation of the contour surface and allows
for the easy identification of the "hills" (high concentrations) and "valleys" (low concentrations).
Map 3 is a more accurate representation of the data. The spatial trends in the data now become more
apparent with the high "ridge" of radon values extending in a band from the southeast to northwest across north-central
Illinois. A trend to progressively lower values extends from this ridge toward the southern part of the state.
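The centroid-and-grid workflow described in this section can be sketched in a few lines. The sketch does not reproduce SURFER's kriging routine; it substitutes inverse distance weighting, a simpler interpolator, purely to show how values attached to county centroids become a continuous surface that can then be contoured. The centroid coordinates, radon values and function names are all hypothetical.

```python
import math

# Hypothetical county centroids: (x, y, mean radon in pCi/L).
CENTROIDS = [
    (0.0, 0.0, 2.0),
    (10.0, 0.0, 6.0),
    (0.0, 10.0, 4.0),
    (10.0, 10.0, 8.0),
]


def idw(x, y, points, power=2.0):
    """Inverse-distance-weighted estimate of the variable at (x, y)."""
    weighted_sum = 0.0
    weight_total = 0.0
    for px, py, value in points:
        distance = math.hypot(x - px, y - py)
        if distance == 0.0:
            return value  # the node coincides with a centroid
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def make_grid(points, nodes_per_side):
    """Estimate values on a square grid of nodes spanning the centroids."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    x_step = (max(xs) - min(xs)) / (nodes_per_side - 1)
    y_step = (max(ys) - min(ys)) / (nodes_per_side - 1)
    return [
        [idw(min(xs) + i * x_step, min(ys) + j * y_step, points)
         for i in range(nodes_per_side)]
        for j in range(nodes_per_side)
    ]


surface = make_grid(CENTROIDS, nodes_per_side=3)
for row in surface:
    print(" ".join(f"{value:5.2f}" for value in row))
```

Grid nodes that fall on a centroid keep the observed value, and every other node receives a weighted blend of the surrounding centroids, which is what removes the artificial steps at county boundaries.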
ANALYSIS OF VARIANCE MODEL
Although the maps reveal a great deal about the spatial distribution of radon concentration using the county
data, we are now in a position to ask whether there are statistically significant differences among counties based on
their dominant Quaternary deposits. A simple one-way analysis of variance (ANOVA) seems appropriate to apply
to this problem. In general, this model provides a simple statistical technique for identifying whether statistically
significant differences exist among a partitioning of the data into meaningful categories. In the context of this study,
the radon concentration data for the county-level and zip-code area level data can be categorized on the basis of the
particular glacial stage of Quaternary deposits and till members. In the case of the local level data, individual
observations are classified on the basis of a glacial substage. In the analysis of variance, the variance within each of
the categories is compared to the variance between categories. The ratio of the between-categories variance
to the within-categories variance yields the F-ratio, and the F probability distribution is used as the statistical test. To
calculate each of the variance terms requires first computing the sum of squares between categories and within
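categories. The null hypothesis can be formally stated as:
$$ H_0: \mu_1 = \mu_2 = \cdots = \mu_k $$
and thus the alternate hypothesis:
$$ H_1: \mu_i \neq \mu_j \text{ for at least one pair of categories } i, j $$
where \mu_j is the mean radon concentration of category j and k is the number of categories.
If the computed F-ratio exceeds the tabled value of the F-distribution for the relevant degrees of freedom and at a
given level of significance (α = .05 or .01 typically), we can reject the null hypothesis.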
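The computation of the F-ratio from between-category and within-category sums of squares can be sketched as follows. The sample values are hypothetical and the function name is illustrative; the sketch stops at the F-ratio and its degrees of freedom, since the critical value is assumed to be read from a table of the F-distribution as described above.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F-ratio, df_between, df_within) for a list of samples."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    group_means = [mean(group) for group in groups]

    # Sum of squares between categories: group means around the grand mean.
    ss_between = sum(len(group) * (gm - grand_mean) ** 2
                     for group, gm in zip(groups, group_means))
    # Sum of squares within categories: observations around their group mean.
    ss_within = sum((v - gm) ** 2
                    for group, gm in zip(groups, group_means)
                    for v in group)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical radon readings (pCi/L) for three categories of deposits.
SAMPLES = [
    [2.0, 3.0, 4.0],
    [4.0, 5.0, 6.0],
    [6.0, 7.0, 8.0],
]

f_ratio, df_between, df_within = one_way_anova(SAMPLES)
print(f"F = {f_ratio:.2f} with {df_between} and {df_within} degrees of freedom")
```

The first degrees-of-freedom value belongs to the between-category variance (the numerator) and the second to the within-category variance (the denominator); the tabled F value must be looked up in that order.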
The analysis of variance (Table 2) for the county level data yields a calculated value of F = 3.744. For 97
and 5 degrees of freedom respectively and with α = .05, F = 4.40. Therefore H0 is accepted, implying that there are
no statistically significant differences between counties on the basis of their dominant Quaternary deposits.
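As a check on the arithmetic, the calculated F-ratio follows directly from the sums of squares and degrees of freedom printed in Table 2, each variance being a sum of squares divided by its degrees of freedom (the subscripts B and W, for between and within categories, are notation introduced here):

$$ F = \frac{SS_B / df_B}{SS_W / df_W} = \frac{75.948 / 5}{393.57 / 97} \approx \frac{15.19}{4.057} \approx 3.744 $$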
ZIPCODE LEVEL DATA ANALYSIS
Average radon concentrations based on zip-code area data and Quaternary formation are shown in Table 3.
Of the approximately thirteen hundred zip-code areas in Illinois, only 862 had at least one valid observation. Even
though the individual observations from which the averages for each zip-code area were compiled represent a
statistically independent sample (from the county level data), the Liman formation zip-code areas have the highest
mean but also a large variance. This is comparable to the county area data. Of the 862 valid zip-code areas, a total of
34.06 percent had average radon levels in excess of 4.00 pCi/L. The highest mean is associated with the Liman
formation (6.47 pCi/L) with 55.9% of zip codes exceeding 4 pCi/L. Although the Jubilean Formation has the highest
percent of zip-code areas above 4 pCi/L (68.8%), its mean is considerably less (4.5 pCi/L). The Wisconsinan, Mixed and
Monican Formations also have high coefficients of variation (.9973, .8821, and .831).
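The coefficient of variation used here is the standard deviation divided by the mean. A minimal sketch, checked against the county-level totals printed in Table 1 (variance 3.868, mean 3.85); the function name is illustrative:

```python
import math


def coefficient_of_variation(variance, mean_value):
    """Standard deviation expressed as a fraction of the mean."""
    return math.sqrt(variance) / mean_value


# County-level totals as printed in Table 1.
cv_counties = coefficient_of_variation(variance=3.868, mean_value=3.85)
print(f"coefficient of variation = {cv_counties:.4f}")
```

Because it is unit-free, the coefficient allows the spread of formations with very different mean concentrations to be compared directly.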
Map 5 shows the spatial pattern of radon concentrations. Maps 6 and 7 show the same data as a contour and
three-dimensional surface respectively. This surface appears rather "spiked" because there are a number of zip-code
areas that have no data. Generally, however, these maps do suggest a spatial pattern comparable to what was
observed in the county level data (namely higher values concentrated in the central, west-central and northern
areas of the state).
Table 4 shows the analysis of variance of the zip-code level data. At this level, in contrast to the county level
data, the calculated F-ratio is 12.84. The value of the F-distribution for 8 and 1000 degrees of freedom and α = .01 is
2.53. Thus the null hypothesis (H0) is rejected: at the .01 significance level, there are
differences among the Quaternary units in their associated average radon concentrations.
LOCAL LEVEL DATA ANALYSIS
BLOOMINGTON-NORMAL METRO AREA
The Bloomington-Normal metropolitan area is located in McLean County, Illinois. The area has
experienced relatively recent rapid population growth. Most of this growth has been concentrated on the eastern side
of the community, resulting in the development of new residential subdivisions. More recently, there has also been a
population expansion toward the southwest.
A locally based radon testing agency supplied data on residential radon levels covering mostly the McLean
County area. Three hundred ninety-one observations in the database were located within the metropolitan area. This
database identified the geographic location of each tested residence by street name and address and subdivision. This
database was then easily merged with the GIS street coverage for Bloomington-Normal. Using the address matching
procedures available in pcARC/INFO, a point coverage of all residences was created. This coverage was then
overlaid on the Quaternary deposits coverage. Unlike the case with the county and zip-code level data (in which the
Quaternary stages were used), at this scale the analysis could be undertaken using individual till units, smaller spatial
units associated with Quaternary substages.
Table 5 presents these results. Although there are clear differences in the mean radon concentration values
by till units, a total of 147 of the 391 observations (37 percent) had concentrations above 4.01 pCi/L. For all
observations, the mean level for the metropolitan area is 4.04 pCi/L. Figure 2 shows the frequency distribution of
radon concentration by till units.
Because each observation's location was address-matched to a given city block's address range, for
reasons of confidentiality individual observations were aggregated to the approximate geographic centroid of
residential subdivisions. This data was then contoured and overlaid on the street network and Quaternary deposits
coverages. Spatial trends in the radon concentration surface are again evident. A ridge of high values exists over the
northern section of the metropolitan area. From that ridge there is generally a trend toward areas of progressively
lower concentrations. Maps 8 and 9 show these patterns.
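The confidentiality step, in which address-matched points are replaced by one point per subdivision, can be sketched as below. The observations, coordinates and subdivision names are hypothetical, and the centroid is approximated here by the mean of the observation coordinates, which is an assumption rather than the procedure actually used in pcARC/INFO.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical address-matched observations:
# (subdivision name, x, y, radon in pCi/L).
OBSERVATIONS = [
    ("A", 100.0, 200.0, 3.0),
    ("A", 120.0, 220.0, 5.0),
    ("B", 400.0, 100.0, 6.0),
    ("B", 420.0, 140.0, 8.0),
    ("B", 440.0, 120.0, 7.0),
]


def aggregate_by_subdivision(observations):
    """Replace individual points with one summary point per subdivision."""
    grouped = defaultdict(list)
    for name, x, y, radon in observations:
        grouped[name].append((x, y, radon))

    summary = {}
    for name, rows in grouped.items():
        summary[name] = {
            "x": mean([row[0] for row in rows]),      # approximate centroid x
            "y": mean([row[1] for row in rows]),      # approximate centroid y
            "radon": mean([row[2] for row in rows]),  # mean concentration
            "count": len(rows),
        }
    return summary


subdivisions = aggregate_by_subdivision(OBSERVATIONS)
for name, record in sorted(subdivisions.items()):
    print(name, record)
```

Only the summary points would be passed on for contouring, so no individual residence can be located from the resulting map.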
Lastly, an analysis of variance was performed on this data set. The calculated F-ratio is 19.15. For the
appropriate degrees of freedom the null hypothesis is again rejected at the α = .01 level, indicating that there are
statistically significant differences among till units in terms of their radon concentrations. The specific characteristics
of these till units and their relationship to radon emissions await further investigation.
SUMMARY
Geographic Information Systems provide an excellent analytical tool for researching and modeling the
spatial patterns of radon concentrations. Researchers wishing to analyze radon concentration data using a GIS
must, however, confront the problem of geographic scale. As suggested earlier, data at the least aggregated scale
possible should be utilized without compromising the confidentiality of individuals. By doing this the problem of
masking meaningful spatial and statistical variation that exists within and among the geographic units of analysis
will be avoided.
The most significant difficulty in this study is the lack of a sufficiently large database at the zip-code level.
As more individual observations are aggregated to this spatial scale the contour surface should become more
powerful as a predictive tool for the development of radon hazard potential maps. It will also aid in the analysis of
the explanatory causes, geologic or otherwise, of high radon areas.
REFERENCES
ARCVIEW, Environmental Systems Research Institute, Redlands, CA
Atlas GIS, Strategic Mapping, Inc., Santa Clara, CA
Barber, Gerald M. Elementary Statistics for Geographers, New York: The Guilford Press, 1988, pp. 114-119

Christensen, Lindsay G. and Rigby, James G. GIS Applications of Radon Hazard Studies: An Example from
Nevada, Nevada Bureau of Mines and Geology, 1996
Clark, W.A.V. and Hosking, P.L. Statistical Methods for Geographers, New York: John Wiley & Sons, Inc., 1986
Clark, W.A.V. and Avery, K. (1976) The Effects of Data Aggregation in Statistical Analysis, Geographical Analysis
8:428-438.
Illinois Department of Energy and Natural Resources, Digital GIS Data, 1994
Kumar, Ashok, Heydinger, Andrew G. and Harrell, James A. Development of an Indoor Radon Information System
for Ohio and Its Application in the Study of the Geology of Radon in Ohio, Ohio Air Quality Development
Authority, 1990
Lineback, Jerry A. Quaternary Deposits of Illinois: Illinois State Geological Survey, scale 1:500,000, 1979
Nero, Anthony Developing a Methodology for Identifying High Radon Areas,
http://eande.lbl.gov/CBS/Newsletter/NL3/Radon2.html, 1995, p. 3.
pcARC/INFO, Environmental Systems Research Institute, Redlands, CA
SURFER For Windows, Golden Software, Inc. Golden, CO
United States Bureau of the Census, 1993. TIGER Line Postcensus Files (Illinois), Washington, D.C.
United States Geological Survey, Geologic Radon Potential Maps for Counties in the Washington, D.C.
Metro Area, http://sedwww.cr.usgs.gov:8080/radon/mcounty.html
United States Environmental Protection Agency, EPA's Map of Radon Zones: Illinois, Radon Division, Office of
Radiation and Indoor Air, U.S. Environmental Protection Agency, September, 1993
Willman, H.B. and Frye, John C. Pleistocene Stratigraphy of Illinois, Bulletin 94, Illinois State Geological Survey,
Urbana, IL, 1970

Table 1: Radon Concentration Analysis by County

FORMATION   NUMBER OF OBSERVATIONS   MEAN   VARIANCE   STANDARD DEVIATION   COEFF. OF VARIATION
TOTALS      102                      3.85   3.868      1.966723163          0.510837185

MAP 3
Average Radon Concentrations
By Counties, Contoured Values

MAP 4
Average Radon Concentrations
By Counties (Surface), pCi/L

Table 2: Analysis of Variance
County-Scale Data

Source of Variation   Sum of Squares   Degrees of Freedom   Variance   F-Ratio
Between               75.948           5                    15.189     3.744
Within                393.57           97                   4.057
Total                 469.518          102                  19.246
Table 3: Radon Concentration Data
Zip-Code Data by Quaternary Deposits
Columns: FORMATION; NUMBER OF OBSERVATIONS; NUMBER OF OBSERVATIONS > 4 pCi/L; PERCENT > 4.0 pCi/L; MEAN; VARIANCE; STANDARD DEVIATION; COEFF. OF VARIATION

MAP 6
Average Radon Concentrations
By Zip-Code Area, Contoured Values

Table 4: Analysis of Variance
Zip-Code-Scale Data

Source of Variation   Sum of Squares   Degrees of Freedom   Variance   F-Ratio
Between               471.77           8                    58.97      12.84
Within                3963.35          854                  4.59
Total                 4435.12          862                  63.56
Table 5: Radon Concentration Data
Bloomington-Normal Data
Columns: UNITS CLASS; TOTALS; MEAN; MEDIAN; VARIANCE; ST. DEVIATION; % ABOVE 4.00; % BELOW 4.00

UNITS CLASS   TOTALS
hm            32
wb            90
wbm           60
ws            24
wsm           185
TOTAL         391

Other values (column assignment lost in extraction): 5.5, 3.74, 3.96, 3.24, 3.51, 2.70, 5.18, 9.69, 3.72, 3.00, 4.04, 6.22; 3.15, 39%, 61%; 3.38, 35%, 65%; 3.19, 42%, 58%; 2.97, 37%, 63%; 4.16, 44%, 56%

hm: Mackinaw Member of Henry Formation; wb: Batestown Till Member of Wedron
Formation; ws: Snider Till Member of Wedron Formation; "m" as identifier indicates moraine

FIGURE 2
Indoor Radon Levels by Type (axis label: Count)