Environ Geol (2008) 54:945–956 DOI 10.1007/s00254-007-0897-1

ORIGINAL PAPER

Sinkhole hazard assessment in Minnesota using a decision tree model

Yongli Gao · E. Calvin Alexander Jr

Received: 31 December 2005 / Accepted: 23 March 2007 / Published online: 18 July 2007
© Springer-Verlag 2007

Abstract An understanding of what influences sinkhole formation and the ability to accurately predict sinkhole hazards is critical to environmental management efforts in the karst lands of southeastern Minnesota. Based on the distribution of distances to the nearest sinkhole, sinkhole density, bedrock geology and depth to bedrock in southeastern Minnesota and northwestern Iowa, a decision tree model has been developed to construct maps of sinkhole probability in Minnesota. The decision tree model was converted as cartographic models and implemented in ArcGIS to create a preliminary sinkhole probability map in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties. This model quantifies bedrock geology, depth to bedrock, sinkhole density, and neighborhood effects in southeastern Minnesota but excludes potential controlling factors such as structural control, topographic settings, human activities and land-use. The sinkhole probability map needs to be verified and updated as more sinkholes are mapped and more information about sinkhole formation is obtained.

Keywords Decision tree model · Sinkhole probability · Karst feature database (KFD) · Knowledge discovery in database (KDD) · Nearest neighbor analysis (NNA) · Minnesota

Y. Gao (✉)
Department of Physics, Astronomy and Geology, East Tennessee State University, Johnson City, TN 37614, USA
e-mail: [email protected]

E. C. Alexander Jr
Department of Geology and Geophysics, University of Minnesota, 310 Pillsbury Dr., SE, Minneapolis, MN 55455, USA
e-mail: [email protected]

Introduction

An understanding of what influences sinkhole formation and the ability to accurately predict sinkhole hazards is critical to environmental management efforts in the karst lands of southeastern Minnesota. Several regression analyses and mathematical models have been conducted to assess sinkhole hazards and develop sinkhole probability maps. Matschinski (1968) treated sinkholes as points and did not consider their dimensions and orientations. LaValle (1967, 1968) investigated the sinkhole morphology in south central Kentucky. Multiple regression analyses were used to study the relationships among drainage systems, karst relief, structurally aligned depressions, limestone density index, insoluble residue content, flank slope, and bedding thickness. However, despite his elegant statistical arguments, his conclusions are not convincing and Williams (1972) criticized some of his geomorphic assumptions. For instance, the karst relief ratio seems insufficient as a measure of hydraulic gradient. Williams (1972) emphasized that a firm geomorphic foundation is necessary prior to morphometric studies. McConnell and Horn (1972) tested several hypotheses about sinkhole development. These hypotheses included Poisson models (single random process), Negative Binomial models (contagious process), and Mixed Poisson models (two mutually independent random processes). The Mixed Poisson models fitted the sinkhole data in Mitchell Plain of southern Indiana. McConnell and Horn (1972) interpreted this fit in terms of two mutually independent random processes of "cavern roof collapse" and "corrosion" for sinkhole development.
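The contrast between a single random (Poisson) process and a contagious (Negative Binomial) process can be illustrated with a simple diagnostic: for Poisson counts the variance equals the mean, while clustered counts are overdispersed. The sketch below computes the variance-to-mean ratio of sinkhole counts per quadrat. It is not the fitting procedure of McConnell and Horn (1972); the function name and the quadrat counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(quadrat_counts):
    """Variance-to-mean ratio of sinkhole counts per quadrat.

    A ratio near 1 is what a Poisson (single random) process produces;
    a ratio well above 1 points to a contagious (clustered) process.
    """
    m = mean(quadrat_counts)
    if m == 0:
        raise ValueError("no sinkholes in any quadrat")
    return variance(quadrat_counts) / m


# Hypothetical counts of sinkholes in four equal-area quadrats.
even_counts = [2, 2, 2, 2]       # more regular than random
clustered_counts = [0, 0, 0, 8]  # all sinkholes in one quadrat

print(dispersion_index(even_counts))
print(dispersion_index(clustered_counts))
```

A formal comparison of candidate models would fit each distribution to the observed counts and test goodness of fit; the ratio only indicates which family is worth fitting.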

Palmquist (1977) demonstrated that the major control on doline density is the amount of groundwater recharge in three counties of northern Iowa by using regression analyses. According to Kuhns et al. (1987), a loose zone of mixtures of sand, silt and clay was a possible indicator of ongoing sinkhole activity in Maitland, Florida. Upchurch and Littlefield's (1987) moving-average analyses and chi-square tests showed that ancient sinkholes in bare karst areas of twelve 7.5' quadrangles in Hillsborough County, Florida, could significantly predict the locations of modern sinkholes. Veni's (1987) research showed that fracture permeability should be considered when assessing the sensitivity of a karst area to human development based on a survey of over 300 caves and sinkholes in the southeastern corner of Edwards Plateau, Texas.

GIS based models have been widely used for decision-making on sinkhole hazard analysis in the last decade (Gao and Alexander 2003). Whitman and Gubbels (1999) demonstrated the importance of hydrostatic loads in sinkhole hazard, and this information can then be used to construct predictive models of sinkhole hazard. Lei et al. (2001) investigated sinkhole distributions based on factors such as the types of carbonate rock, the geomorphologic settings, hydrogeologic conditions, human activities, and land use. All factors were digitized as corresponding GIS coverages and processed in a grid-based IDRISI GIS system. A series of grid-based relative risk maps of sinkhole hazard were developed for four cities, Tangshan, Xiangtan, Yulin, and Liupanshui in China (Lei et al. 2001). Jiang et al. (2005) expanded the sinkhole hazard assessment to a national scale and applied analytic hierarchy process (AHP) to develop a relative sinkhole risk map in China. Zhou et al. (2003) conducted orientation analysis of sinkholes along I-70 highway near Frederick, Maryland and demonstrated that orientations of sinkhole pairs correspond to regional and local structures.

Minnesota karst and sinkhole distribution

Southeastern Minnesota is part of the Upper Mississippi Valley Karst (Hedges and Alexander 1985) that includes northwestern Illinois, southwestern Wisconsin, and northeastern Iowa. Karst lands in Minnesota are developed on Paleozoic carbonate and sandstone bedrock. Most surficial karst features such as sinkholes, stream sinks, springs, and caves are found only in those areas with less than 50 ft (15 m) of sedimentary cover over bedrock surface (Fig. 1). Data sources for bedrock geology and depth to bedrock in southeastern Minnesota are listed in Table 2. Figure 2 shows significant sandstone karst developed in Pine County (Shade 2002). Much of the scientific karst literature (Davies and Legrand 1972; Dougherty et al. 1998; Troester and Moore 1989) has focused on other parts of the country and world and few scientific descriptions of the Upper Mississippi Valley Karst exist. Nevertheless, the karst lands of southeastern Minnesota present an ongoing challenge to environmental planners and researchers and have been the focus of a series of research projects and studies by researchers for more than 30 years (Giammona 1973; Wopat 1974).

Gao et al. (2001) divided the sinkholes in southeastern Minnesota into three karst groups: Cedar Valley Karst (Middle Devonian), Galena/Spillville Karst (Upper Ordovician/Middle Devonian), and Prairie du Chien Karst (Lower Ordovician). Gao et al. (2005) revised the classification to Prairie du Chien Karst (Lower Ordovician, closest to Mississippi river valley), Galena-Maquoketa

Fig. 1 Minnesota Karst lands. This map overlays the areas with <50 ft (15 m), 50–100 ft (15–30 m), and >100 ft (30 m) of surficial cover over the areas underlain by carbonate bedrock. This map emphasizes the patchy nature of the thick sediment cover and the importance of site-specific information for land-use decisions


Fig. 2 Sandstone Karst in Pine County. The red triangles are mapped sinkholes that indicate a sandstone karst area developed in the Mesoproterozoic Hinckley Sandstone. Pine County is in east central Minnesota, about 100 miles north of Twin Cities (data sources: Boerboom 2001; Shade et al. 2001)

Karst (Upper Ordovician) and Devonian Karst (the most distant from Mississippi river valley) based on more recent sinkhole and bedrock distribution in southeastern Minnesota and northwestern Iowa. Figure 3 shows the three bands of sinkholes distributed in these three karst groups.

All analyses of sinkhole distribution in southeastern Minnesota reveal that sinkholes in Minnesota tend to be clustered at a regional scale (Gao et al. 2002; Magdalene and Alexander 1995). Gao et al. (2005) studied sinkhole distribution in Minnesota at different scales using nearest neighbor analysis (NNA). The sinkhole distribution pattern changes from clustered to random to regular as the scale of the analysis decreases from 10–100 km² to 5–30 km² to 2–10 km². The distribution of distance to the nearest neighbor (DNN) within the sinkhole plains of Fillmore County fits a lognormal distribution (Gao et al. 2005). Isolated sinkholes occur more often in Prairie du Chien Karst (Gao et al. 2002) compared to sinkholes in Devonian Karst and Galena-Maquoketa Karst.

Sinkhole probability maps have been developed for southeastern Minnesota in Winona County (Dalgleish and Alexander 1984), Olmsted County (Alexander and Maki 1988), Fillmore County (Witthuhn and Alexander 1995), and Goodhue County (Alexander et al. 2003). A Karst Hydrogeomorphic Unit map, which includes known karst features, has been developed for Mower County (Green et al. 2002a, b). These mapping efforts occurred during the time period that digital GIS technology was introduced and became an integral part of geologic mapping in Minnesota. As part of the transition of this mapping effort into a digital GIS environment, a decision tree model is created to quantify the map-making process to reduce the potential for subjective biases for developing sinkhole probability maps. A revised map of relative sinkhole risk in Fillmore County (Gao and Alexander 2003) was constructed by implementing a decision tree model in a GIS system. This paper describes the expansion of the Fillmore County sinkhole hazard assessment to southeastern Minnesota using the decision tree model. The resulting regional sinkhole probability map includes Fillmore, Goodhue, Mower, Olmsted, and Wabasha Counties where relatively complete sinkhole datasets exist.

Knowledge discovery and decision tree model

Spatial data mining aims at discovering spatial patterns embedded in large spatial databases (Shekhar and Chawla 2002). The goal of Knowledge Discovery in Database (KDD) is to extract knowledge from data in the context of large databases (Fayyad et al. 1996). Machine learning is one of the most popular methods for both KDD and spatial data mining. Machine learning was originally developed in the field of artificial intelligence and became an important approach in data mining in the 1990s. The process of machine learning allows systems to learn and improve with experience (Mitchell 1997). Most machine learning models use inductive inference to predict the overall system from a set of training examples. Decision tree learning is one of the most widely used methods for inductive inference (Mitchell 1997; Winston 1992). Decision trees are constructed from top-down, divide-and-conquer strategies that


partition sets of objects into smaller subsets along with the growth of the tree (Quinlan 1990).

Fig. 3 Sinkhole distribution and bedrock geology in southeastern Minnesota. Notice the three bands of karst development that are arranged parallel to the Mississippi River: Prairie du Chien Karst (Lower Ordovician), Galena-Maquoketa Karst (Upper Ordovician) and Devonian Karst

The structures of decision trees include series of tree nodes and branches. Decision trees have three types of nodes: (1) root nodes that have no incoming branches; (2) internal nodes that connect with one incoming branch and two or more outgoing branches; and (3) leaf nodes that have one incoming branch and no outgoing branches. Each non-leaf node is associated with attribute values of the database. A test condition will be made for each non-leaf node to partition the data set (Tan et al. 2005). The leaf node represents the classification of the decision tree model. Figure 4 illustrates how to classify sinkhole probability using a decision tree model. For example, the root node at the top is associated with the test condition of bedrock formation in southeastern Minnesota. The karst dataset is then partitioned into two sets of data. Areas on top of bedrocks that are older than Ordovician or younger than Devonian can be classified as no probability area since no sinkholes have been found in those areas. This classification is represented as the first leaf node of the decision tree, "no sinkhole probability". Areas underlain by bedrocks in Ordovician or Devonian will be further tested down the decision tree to define other sinkhole probability areas.

[Figure 4 diagram: the root node tests Bedrock Units (below CJDN or above DCUU leads to the leaf "No sinkhole probability"; between CJDN and KRET leads on); the internal node tests Depth to Bedrock (≥ 50 ft (15 m) leads to the leaf "Low probability"; < 50 ft (15 m) leads to a further Bedrock Units test).]

Fig. 4 The structure of a decision tree to classify sinkhole probability in southeastern Minnesota (See Table 1 for the sequence of bedrock units)

Model implementation

Based on the available karst feature data stored in the Karst Feature Database (KFD) of Minnesota, the primary controls on sinkhole development are stratigraphic position or bedrock geology and the thickness of surficial cover over bedrock surface. Major secondary controls appear to be structural geology such as joints and position in the landscape. However, the majority of the sinkhole population tends to form in highly concentrated zones. Neighborhood

Fig. 5 Decision tree model for sinkhole probability map in southeastern Minnesota (See Table 1 for the sequence of bedrock units)

[Figure 5 diagram, tests from root to leaves:
Bedrock Units: below CJDN or above DCUU (No sinkhole probability); between CJDN and KRET (continue).
Depth to Bedrock: ≥ 50 ft (15 m) (Low probability); < 50 ft (15 m) (continue).
Bedrock Units: DCLC, ODPG, or OGCM (Low probability); karst bedrock (continue).
Sinkhole Density: < 1/km² (Low to moderate probability); ≥ 1/km² (continue).
Bedrock Units: OGPR – OMAQ, DSPL, DCLP, DCUM, or DCUU; OPDC or OSTP.
Dist. to Nearest Sinkhole, one test per bedrock branch: ≥ 700 m, < 700 m; ≥ 700 m, 400 – 700 m, < 400 m.
Leaves: Moderate to high probability, High probability; Moderate to high probability, High probability, Sinkhole plain.]
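The tree in Fig. 5 can be read as a chain of nested tests. The following is a minimal sketch of that reading, not the authors' GIS implementation. Several points are assumptions: the function and variable names are hypothetical, the range "OGPR – OMAQ" is expanded using the unit order in Table 1, and the three-way distance split (400 and 700 m) is assigned to the Galena-Maquoketa and Devonian units while the two-way split (700 m) is assigned to OPDC and OSTP, following the nearest-neighbor percentages reported in the text.

```python
# Unit groupings read from Fig. 5 and Table 1 (assumed expansions).
LOW_PROBABILITY_UNITS = {"DCLC", "ODPG", "OGCM"}
PRAIRIE_DU_CHIEN_UNITS = {"OPDC", "OSTP"}
GALENA_DEVONIAN_UNITS = {
    "OGPR", "OGSV", "OGAL", "ODUB", "OMAQ",  # assumed meaning of "OGPR - OMAQ"
    "DSPL", "DCLP", "DCUM", "DCUU",
}


def classify(unit, in_karst_sequence, depth_ft, density_per_km2, dist_m):
    """Return the probability class for one map polygon.

    unit              -- bedrock unit symbol (Table 1)
    in_karst_sequence -- False if below CJDN or above DCUU
    depth_ft          -- depth to bedrock in feet
    density_per_km2   -- sinkhole density, sinkholes per square kilometer
    dist_m            -- distance to the nearest sinkhole in meters
    """
    if not in_karst_sequence:
        return "No sinkhole probability"
    if depth_ft >= 50:
        return "Low probability"
    if unit in LOW_PROBABILITY_UNITS:
        return "Low probability"
    if density_per_km2 < 1:
        return "Low to moderate probability"
    if unit in PRAIRIE_DU_CHIEN_UNITS:
        # Assumed two-way split for the Prairie du Chien branch.
        if dist_m < 700:
            return "High probability"
        return "Moderate to high probability"
    if unit in GALENA_DEVONIAN_UNITS:
        # Assumed three-way split for the Galena-Maquoketa/Devonian branch.
        if dist_m < 400:
            return "Sinkhole plain"
        if dist_m < 700:
            return "High probability"
        return "Moderate to high probability"
    raise ValueError(f"unit {unit!r} is not covered by this sketch")


print(classify("OGPR", True, 20, 5.0, 300))
print(classify("OPDC", True, 20, 2.0, 500))
```

Writing the tree as ordered tests makes the precedence explicit: thick cover overrides every later attribute, and distance only matters once the density threshold is met.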

effect plays a very important role in sinkhole distribution and formation (Gao and Alexander 2003).

Based on the sinkhole distribution of distances to the nearest sinkhole, sinkhole density, bedrock geology and depth to bedrock in southeastern Minnesota and northwestern Iowa, a decision tree model has been developed to construct sinkhole probability maps. This model quantifies bedrock geology, depth to bedrock, sinkhole density, and neighborhood effects in southeastern Minnesota but does not include potential controlling factors such as structural control, topographic settings, human activities and land-use. The mean and standard deviation of DNN were used to define boundaries for extended NNA and sinkhole probability modeling (Gao and Alexander 2003). According to Gao (2002), 95% of Devonian sinkholes and 99% of the Galena-Maquoketa sinkholes are less than 400 m away from their nearest neighbor; 70% of Prairie du Chien sinkholes are less than 700 m away from their nearest neighbor. Therefore, 400 and 700 m were used to define concentrated sinkhole zones for the two highest sinkhole probability areas.

The non-leaf nodes of the decision tree are associated with attributes such as bedrock geology, depth to bedrock, sinkhole density, and distances to the nearest sinkhole stored in the KFD of Minnesota. A test condition was conducted on each non-leaf node to partition the decision tree. By sorting down the decision tree from the top to the bottom recursively, the karst land of southeastern Minnesota can be classified into six probability areas which are represented by the leaf nodes of the decision tree (Fig. 5).

Figure 6 is the cartographic model used to create reclassified karst areas in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties. Bedrock geology and depth to bedrock coverages from each county were reclassified and merged to generate reclassified bedrock geology and depth to bedrock coverages in the five-county area. An intersection of reclassified bedrock geology and depth to bedrock coverages generates reclassified karst coverage in the five-county area. The karst areas were reclassified as non-carbonate, active karst, transition karst, and covered karst.

Figure 7 represents the cartographic model to create a sinkhole probability map in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties. This model first creates sinkhole buffer zones and densities. The sinkhole density and buffer zones were then intersected to generate a coverage including attributes of both buffer distances and sinkhole density. The final intersection of buffer zone, density, and reclassified karst areas is reclassified according to the decision tree model to construct the final sinkhole probability in the five-county area.

The cartographic models illustrated by Figs. 6, 7 were implemented using ArcView GIS. The intermediate coverages were cleaned or built prior to proceeding to the next step to ensure correct topology and to reduce propagation errors (Gao and Alexander 2003).

Results

Figure 8 is a draft sinkhole probability map of Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties, produced by implementing the decision tree model in ArcView GIS. Relative "sinkhole risk" was used in the early attempt at sinkhole hazard assessment using the decision tree model (Gao and Alexander 2003). However, the model did not capture


[Figure 6 flowchart: for each of Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties, the Bedrock Geology coverage is reclassified into Grouped Bedrock Units and the Depth to Bedrock coverage into Grouped Depth to Bedrock. A Union (followed by Build or Clean) merges the county coverages into Reclassified Bedrock Units and Reclassified Depth to Bedrock for the five counties. An Intersect and Reclass (followed by Build or Clean) of these two coverages yields the Reclassified Karst Areas for the five counties.]

Fig. 6 Cartographic modeling flowchart to create reclassified karst areas in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties. The reclassified karst areas are used to construct a sinkhole probability map for the five-county area
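The reclassify-then-intersect logic of Fig. 6 can be sketched on a small raster. This is a simplification: the paper works with vector coverages in a GIS, whereas the sketch overlays two grids cell by cell. The depth breaks of 50 ft and 100 ft come from the cover classes in Fig. 1, but their assignment to active, transition, and covered karst is an assumption, as are all function names and sample values.

```python
def reclass_karst(is_carbonate, depth_ft):
    """Combine a bedrock flag and depth to bedrock into one karst class."""
    if not is_carbonate:
        return "non-carbonate"
    if depth_ft < 50:        # assumed break for active karst
        return "active karst"
    if depth_ft <= 100:      # assumed break for transition karst
        return "transition karst"
    return "covered karst"


def intersect(bedrock_grid, depth_grid):
    """Cell-by-cell overlay of two grids with identical shape."""
    same_rows = len(bedrock_grid) == len(depth_grid)
    same_cols = same_rows and all(
        len(b_row) == len(d_row)
        for b_row, d_row in zip(bedrock_grid, depth_grid)
    )
    if not same_cols:
        raise ValueError("grids must have the same shape")
    return [
        [reclass_karst(b, d) for b, d in zip(b_row, d_row)]
        for b_row, d_row in zip(bedrock_grid, depth_grid)
    ]


# Hypothetical 2 x 2 grids: carbonate flag and depth to bedrock in feet.
bedrock = [[True, True], [False, True]]
depth = [[20, 75], [10, 150]]

for row in intersect(bedrock, depth):
    print(row)
```

Reclassifying each input before the overlay keeps the number of output classes small, which is the same reason the flowchart groups units county by county before the union.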

[Figure 7 flowchart: the Sinkholes coverage for the five counties is buffered into Sinkhole Zones and, in parallel, used to calculate a Sinkhole Density Grid that is converted to polygon Sinkhole Densities. An Intersect of the zones and densities gives Density and Buffer Zones, which is intersected with the Reclassified Karst Areas and reclassified into Sinkhole Probability for the five counties. Each step is followed by Build or Clean.]

Fig. 7 Cartographic modeling flowchart to create a sinkhole probability map in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties

the detailed "risk" level in moderate to low "risk" areas. The term sinkhole probability is resumed for sinkhole hazard assessment in southeastern Minnesota. To be consistent with the published sinkhole probability maps in Winona, Fillmore, Olmsted, and Goodhue Counties, the sinkhole probability map includes six probability zones: NO SINKHOLE PROBABILITY, LOW PROBABILITY, LOW TO MODERATE PROBABILITY, MODERATE TO HIGH PROBABILITY, HIGH PROBABILITY, and SINKHOLE PLAINS. The descriptions of these probability areas are as follows:

No sinkhole probability

The only places where sinkholes cannot form are those areas where non-carbonate formations are the uppermost bedrock. Many of these areas are in deep river valleys in which erosion has removed all of the carbonate bedrock.


Fig. 8 A partial sinkhole probability map in Goodhue, Wabasha, Olmsted, Fillmore, and Mower Counties

Low probability

Areas underlain by carbonate bedrock or mixtures of carbonate and non-carbonate bedrocks where the boundaries are not clearly defined, but in which essentially no sinkholes were observed, are shown on the map as having low probability for sinkhole development. Some of these areas are slopes containing abundant evidence of past karst solution, such as caves and enlarged joints. If new sinkholes do form, they may not be noticed because of the rapid erosion of the down slope rim of the sinkhole and filling of the sinkhole with sediments.

Low to moderate probability

Areas underlain by carbonate rock covered with only a thin layer of surficial material, but containing only widely scattered individual sinkholes or isolated clusters of two or three sinkholes. The sinkhole density is less than one sinkhole per square kilometer. The expected future sinkhole development is generally low in these areas, but is moderate where small sinkhole clusters have developed. These areas of low to moderate probability are underlain by any of the carbonate bedrock units, but coverage by unconsolidated materials and soils is usually less than 50 ft (15 m). The near-surface carbonate aquifers in these areas are clearly karst aquifers, although the distribution of sinkholes within these units varies considerably. The presence or absence of sinkholes alone is not a sufficient predictor of susceptibility of the groundwater to contamination.

Moderate to high probability

Areas in which sinkholes are a routine part of the landscape. Sinkholes occur as diffuse clusters of three or more sinkholes. The minimum sinkhole density is 1 per square kilometer.

High probability

Areas in which sinkholes are a common part of the landscape. The distance to the nearest sinkhole is less than 700 m, and the minimum sinkhole density is 1 per square kilometer. New sinkholes periodically appear and many more are expected to form. These areas are usually near areas where sinkhole density exceeds 10 per square kilometer, but exhibit a noticeably lower density within the area or areas where sinkholes describe obvious visible trends, such as along fractures in the bedrock. Clusters of sinkholes may develop in response to local changes, such as fluctuation of the water table, construction of a building or water-retention facility, or hydraulic changes due to the formation or reactivation of isolated sinkholes.

Sinkhole plains

Areas in which sinkholes are the dominant landscape features. The distance to the nearest sinkhole is less than 400 m, and the minimum sinkhole density is 1 per square kilometer. Essentially all of the precipitation that is not lost to evapotranspiration either infiltrates or runs into a


Table 1 Lithostratigraphic codes and karst groups in southeastern Minnesota as used in spatial analysis and probability modeling (modified from Gao et al. 2005) Series Group, formation, member Unit symbol Karst group

Series | Group, formation, member | Unit symbol | Karst group
Middle Devonian | Lithograph city formation–Hinkle & Eagle Center Mbrs; Chickasaw member– | DCUU, DCUM, DCLC (a), DCLP, DSPL | Devonian Karst
Upper Ordovician | Maquoketa & Dubuque formations; Galena Group (Stewartville, Prosser and Cummingsville formations); –Glenwood formations | OMAQ, ODUB, OGAL, OGSV, OGPR, OGCM, ODPG (b) | Galena-Maquoketa Karst
Upper Ordovician | St Peter Sandstone | OSTP | Prairie du Chien Karst
Lower Ordovician | Prairie du Chien Group (Shakopee and Oneota formations) | OPDC, OPSH, OPOD | Prairie du Chien Karst

(a) The Devonian karst as defined above may need to be subdivided in future work
(b) For map compatibility reasons, the map bottom of the Galena-Maquoketa karst in this paper is taken to be the Cummingsville/Prosser (OGCM/OGPR) contact. The top two thirds of the Cummingsville formation is an active part of this karst, but does not contain many sinkholes

sinkhole. New sinkholes often appear. Sinkholes are a major problem for agriculture and prevent the cultivation of a significant fraction of many fields. Sinkhole collapse is a major, ongoing concern for roads and any structures or facilities.

Discussion

This sinkhole probability map is mainly based on sinkhole distribution in Devonian and Galena-Maquoketa karst areas. The sinkhole distribution in the Prairie du Chien Karst is significantly different and may not be adequately described by this algorithm. A second problem in Prairie du Chien Karst is that these areas are not fully mapped. Sinkholes in Winona County were more intensively investigated and present more small clusters of sinkholes than the rest of Prairie du Chien Karst.

As shown in Table 1, boundaries for karst areas, especially in Devonian Karst, are not well defined. Approximately 75% of sinkholes concentrate in 3% of active karst areas in the Devonian Karst. Even for carbonate bedrock units, different counties may have different standards. For example, in Goodhue County, few sinkholes were observed in the Oneota Dolomite, the lower formation in the Prairie du Chien Group. However, in Olmsted County, Oneota Dolomite is not separated from the Prairie du Chien Group.

Another problem associated with this sinkhole probability map is that the boundaries for bedrock geology and depth to bedrock are not accurately known. While some sinkholes exist outside of the active karst areas, they are near the boundary between active and non-active karst areas and this probably indicates that some errors exist in the bedrock geology and depth to bedrock maps. There are also major scale problems. The bedrock geology and depth to bedrock information was mapped at 1:100,000 scale and becomes increasingly inaccurate as the scale is increased (Table 2).

Other analyses that were conducted to look for controlling factors of sinkhole distributions include searches for correlations with land surface slope, depth to bedrock, and bedrock dips. Slope values were derived from USGS 30 m DEMs for the locations of the sinkholes. Figure 9 shows the histogram of the slopes of the land at sinkhole locations in Olmsted County. In this county, 64% of sinkholes are located on slopes of less than 5° and 83% on slopes of less than 10°. The majority of the sinkholes are on relatively flat surfaces. No detailed correlation has been detected between sinkhole distribution and surface slope. The slope values derived from the 30 m DEM are not accurate enough in many areas in southeastern Minnesota due to the widely distributed river and stream valleys. This correlation needs to be further tested when more accurate DEM data are developed.

Correlation between sinkhole distribution and depth to bedrock is limited by the 50 ft (15 m) resolution of available depth to bedrock information and the absence of structural contour maps. Refined depth to bedrock and structural contour maps were attempted. Figure 10 shows the distribution of sinkholes and water wells used to obtain


Table 2 Data sources for bedrock geology and depth to bedrock in southeastern Minnesota

County or multi-county area | Bedrock geology | Depth to bedrock
Fillmore County | (Mossler 1995a) | (Mossler and Hobbs 1995)
Goodhue County | (Runkel 1998) | (Setterholm and Bloomgren 1998)
Houston County | (Runkel 1996) | (Runkel 1996)
Mower County | (Mossler 1998a) | (Mossler 1998b)
Olmsted County | (Olsen 1988a) | (Olsen 1988b)
Rice County | (Mossler 1995b) | (Mossler 1995c)
Wabasha County | (Mossler 2001a) | (Mossler 2001b)
Steele, Dodge, Olmsted and Winona Counties | (Mossler 2004a) | (Mossler 2004b)
Seven-county metropolitan Twin Cities area | (Mossler and Tipping 2000) | (Mossler and Tipping 2000)
13 counties of south central Minnesota | (Water Resources Center 1999a) | (Water Resources Center 1999b)

original depth to bedrock information. A model was built in ArcView GIS to derive several depth to bedrock grids from 95% randomly selected water wells in Olmsted County using different interpolation methods (Gao 2007). The remaining 5% of the water wells were used to evaluate the accuracy of the different interpolation methods. A similar model was also constructed to calculate the bedrock dips in this county. These models were implemented in ArcView GIS and the results show that the depth to bedrock and bedrock dip are statistically acceptable in areas where water wells are highly concentrated. Unfortunately, the water wells are usually not drilled in areas where many sinkholes exist. Therefore, more accurate depth to bedrock and bedrock dips do not exist in areas of highly concentrated sinkholes due to the lack of water well data. Geophysical explorations such as seismic exploration, microgravity surveys, electrical resistivity, and ground penetrating radar (GPR) could be used to detect more accurate information about depth to bedrock and bedrock dips in selected areas with highly concentrated sinkholes.

Conclusions

This decision tree model quantifies bedrock geology, depth to bedrock, sinkhole density, and distances to the nearest sinkhole in southeastern Minnesota but potential controlling factors such as structural control, topographic settings, human activities, and land-use are not yet built into the model due to the lack of data coverage. Compared with earlier, conventional versions of county scale sinkhole probability map, the decision tree model reproduces most of the important features seen on the original maps in the high density areas and has led to new insights about the internal structure of high density areas. However, the decision tree model is less successful in capturing the details of the lower density areas especially in Prairie du

[Figure 9 plot: histogram with Slope on the horizontal axis (bins 0, 5, 10, 15, 20, More), Frequency on the left axis (0 to 500), and cumulative fraction on the right axis (0.0 to 1.0).]

Fig. 9 Histogram of the slopes of the land surfaces at sinkhole locations in Olmsted County

Fig. 10 Distributions of water wells and sinkholes in Olmsted County
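The hold-out check on depth to bedrock grids described in the Discussion (interpolate from 95% of the water wells, evaluate on the remaining 5%) can be sketched as follows. The paper does not name the interpolation methods it compared; inverse distance weighting is used here only as one common choice, and the function names, the random seed, and the synthetic wells are all hypothetical.

```python
import math
import random


def idw(x, y, wells, power=2.0):
    """Inverse-distance-weighted depth at (x, y) from (x, y, depth) wells."""
    weighted_sum = 0.0
    weight_total = 0.0
    for wx, wy, depth in wells:
        d = math.hypot(x - wx, y - wy)
        if d == 0.0:
            return depth  # exactly on a well: use its value
        w = 1.0 / d ** power
        weighted_sum += w * depth
        weight_total += w
    return weighted_sum / weight_total


def holdout_rmse(wells, test_fraction=0.05, seed=0):
    """RMSE of IDW predictions at wells withheld from the interpolation."""
    rng = random.Random(seed)
    shuffled = list(wells)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    test, train = shuffled[:n_test], shuffled[n_test:]
    squared = [(idw(x, y, train) - depth) ** 2 for x, y, depth in test]
    return math.sqrt(sum(squared) / len(squared))


# Synthetic wells on a 10 x 10 grid with a gently sloping bedrock surface.
wells = [(i * 100.0, j * 100.0, 20.0 + 0.01 * i * 100.0)
         for i in range(10) for j in range(10)]
print(holdout_rmse(wells))
```

Running the same split with several interpolators and comparing the resulting errors is one way to choose among them, with the caveat noted in the text that wells are sparse where sinkholes are dense.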

Chien Karst, where the subjective criteria are more significant and has no simple way of extrapolating across areas in which the sinkholes have not been mapped. This result confirms and expands Gao and Alexander's (2003) conclusions with regard to the entire Minnesota data set.

Even though the decision tree model defines some mathematical boundaries, such as the minimum distance to the nearest sinkhole and the minimum sinkhole density for relatively higher sinkhole risk areas, the sinkhole probability map developed using this model does not replace the original county-scale probability maps. Boundaries of the probability map developed for the five-county area need to be adjusted based on local karst feature distribution, detailed depth to bedrock, topographical setting, structural controls, human activities, and land-use. The sinkhole probability map needs to be verified and updated as more sinkholes are mapped and more information about sinkhole formation is obtained.

Acknowledgements The decision tree model is built upon a database including karst feature data collected by a series of research and mapping projects by researchers for three decades. The karst feature locating and verification efforts of Scott Alexander, David Berner, Janet Dalgleish, Jeff Green, Sue Magdalene, Geri Maki, Ron Spong, Robert Tipping, Bev Shade, Betty Wheeler, Kathleen Witthuhn, and many karst workers and researchers are greatly appreciated. These research projects were supported by a series of grants and contracts from the Legislative Commission on Minnesota Resources via the Minnesota Geological Survey (MGS), Minnesota Department of Natural Resources (MnDNR) and the University of Minnesota Department of Geology and Geophysics and support from the Minnesota Department of Health (MnDH) and the Counties involved. We thank Professor Shashi Shekhar and research fellow Ranga Raju Vatsavai of the Department of Computer Science at the University of Minnesota for sharing their ideas and experience in using decision tree models in spatial data mining.

References

Alexander EC Jr, Berner D, Gao Y, Green JA (2003) Sinkholes and Sinkhole probability, and springs and seeps. Geologic Atlas of Goodhue County, Minnesota, County Atlas Series C-12, Part B, Plate 10 (1:100,000). Minnesota Department of Natural Resources, Division of Waters
Alexander EC Jr, Maki GL (1988) Sinkholes and Sinkhole probability. Geologic Atlas Olmsted County, Minnesota, County Atlas Series C-3, Plate 7 (1:100,000). Minnesota Geological Survey, University of Minnesota
Boerboom TJ (2001) Bedrock geologic map and sections. Geologic Atlas of Pine County, Minnesota, County Atlas Series C-13, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Dalgleish JD, Alexander EC Jr (1984) Sinkholes and sinkhole probability. Geologic Atlas Winona County, Minnesota, County Atlas Series C-2, Plate 5 (1:100,000). Minnesota Geological Survey, University of Minnesota
Davies WE, Legrand HE (1972) Karst of the United States. In: Herak M, Stringfield VT (eds) Karst, important karst regions of the northern hemisphere. Elsevier, Amsterdam, pp 466–505
Dougherty PH, Jameson RA, Worthington SRH, Huppert GN, Wheeler BJ, Hess JW (1998) Karst regions of the eastern United States with special emphasis on the Friars Hole Cave system. In: Yuan D, Liu Z (eds) Global karst correlation. Science Press, Beijing, pp 137–155
Fayyad UM, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery: an overview. In: Fayyad UM, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (eds) Advances in knowledge discovery and data mining, AAAI Press, Menlo Park, pp 1–34
Gao Y (2002) Karst feature distribution in southeastern Minnesota: extending GIS-based database for spatial analysis and resource management. Ph.D. thesis. Department of Geology and Geophysics, University of Minnesota, 210 p
Gao Y (2007) Spatial operations in a GIS-based karst feature database. Environ Geol (in press)
Gao Y, Alexander EC Jr (2003) A mathematical model for a sinkhole probability map in Fillmore County, Minnesota. In: Beck BF (ed) Sinkholes and the engineering and environmental impacts of karsts. Proceedings of the ninth multidisciplinary conference. Huntsville, Alabama, September 6–10, ASCE Geotechnical Special Publication, no. 122, pp 439–449
Gao Y, Alexander EC Jr, Barnes RJ (2005) Karst database implementation in Minnesota: analysis of sinkhole distribution. Environ Geol 47(8):1083–1098
Gao Y, Alexander EC Jr, Tipping RG (2001) Application of GIS technology to study karst features of southeastern Minnesota. In: Beck BF, Herring JG (eds) Geotechnical and environmental applications of karst geology and hydrology. Proceedings of the eighth multidisciplinary conference on sinkholes and the engineering and environmental impacts of karsts. Louisville, KY, 1–4 April, A. A. Balkema, Lisse, pp 83–88
Gao Y, Alexander EC Jr, Tipping RG (2002) The Development of a karst feature database for southeastern Minnesota. J Cave Karst Stud 64(1):51–57
Giammona CP (1973) Fluorescent dye determination of groundwater movement and contamination in permeable rock strata. Int J Speleology 5(3–4):201–208
Green JA, Marken WJ, Alexander EC Jr, Alexander SC (2002a) Karst unit mapping using geographic information system technology, Mower County, Minnesota, USA. Environ Geol 42(5):57–461
Green JA, Alexander EC Jr, Marken WJ, Alexander SC (2002b) Karst hydrogeomorphic units. Geologic Atlas of Mower County, Minnesota, County Atlas Series C-11, Part B, Plate 10 (1:100,000). Minnesota Department of Natural Resources, Division of Waters
Hedges J, Alexander EC Jr (1985) Karst-related features of the Upper Mississippi valley region. Stud Speleology 6:41–49
Jiang X, Lei M, Li Y, Dai J (2005) National-scale risk assessment of sinkhole hazard in China. In: Beck BF (ed) Sinkholes and the engineering and environmental impacts of karst. Proceedings of the tenth multidisciplinary conference San Antonio, Texas, September 24–28, ASCE Geotechnical Special Publication, no. 144, pp 649–658
Kuhns GL, Phelps LM, Marshall BP, Cox EA III (1987) Subsurface indicators of potential sinkhole activity at the Maitland Colonnades project in Maitland, Florida. In: Beck BF, Wilson WL (eds) Karst hydrogeology: Engineering and environmental applications. Proceedings of the second multidisciplinary conference on sinkholes and the environmental impacts of karst. Orlando, Florida, 9–11 February, A. A. Balkema, Rotterdam, pp 365–381
LaValle P (1967) Some aspects of linear karst depression development in south central Kentucky. Ann Assoc Am Geogr 57(1):49–71


LaValle P (1968) Karst depression morphology in south central Kentucky. Geogr Ann 50(2):94–108
Lei M, Jiang X, Li Y (2001) New advances of karst collapse research in China. In: Beck BF, Herring JG (eds) Geotechnical and environmental applications of karst geology and hydrology. Proceedings of the eighth multidisciplinary conference on sinkholes and the engineering and environmental impacts of karsts. Louisville, KY, 1–4 April, A. A. Balkema, Lisse, pp 145–151
Magdalene S, Alexander EC Jr (1995) Sinkhole distribution in Winona County, Minnesota revisited. In: Beck BF, Person FM (eds) Karst Geohazards. Proceedings of the fifth multidisciplinary conference on sinkholes and the engineering and environmental impact of karst. Gatlinburg, Tenn., 2–5 April, A. A. Balkema, Rotterdam, pp 43–51
Matschinski M (1968) Alignment of dolines northwest of Lake Constance, Germany. Geol Mag 105:56–61
McConnell H, Horn JM (1972) Probabilities of surface karst. In: Chorley RJ (ed) Spatial analysis in geomorphology. Harper & Row, New York, pp 111–133
Mitchell TM (1997) Machine learning. McGraw-Hill, New York, 414 pp
Mossler JH (1995a) Bedrock geology. Geologic Atlas Fillmore County, Minnesota, County Atlas Series C-8, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (1995b) Bedrock geology. Geologic Atlas of Rice County, Minnesota, County Atlas Series C-9, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (1995c) Depth to bedrock and bedrock topography. Geologic Atlas of Rice County, Minnesota, County Atlas Series C-9, Part A, Plate 5 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (1998a) Bedrock geology. Geologic Atlas of Mower County, Minnesota, County Atlas Series C-11, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (1998b) Depth to bedrock and bedrock topography. Geologic Atlas of Mower County, Minnesota, County Atlas Series C-11, Part A, Plate 5 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (2001a) Bedrock geology. Geologic Atlas of Wabasha County, Minnesota, County Atlas Series C-14, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (2001b) Bedrock topography and depth to bedrock. Geologic Atlas of Wabasha County, Minnesota, County Atlas Series C-14, Part A, Plate 4 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (2004a) Bedrock geology of Steele, Dodge, Olmsted and Winona Counties. Data in support of EPA 319 demonstration project: contaminant management in the karst region. Produced in conjunction with development of state karst features database and enhanced karst feature inventory of Steele, Dodge, Olmsted and Winona Counties. (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH (2004b) Depth to bedrock of Steele, Dodge, Olmsted and Winona Counties. Data in support of EPA 319 demonstration project: contaminant management in the karst region. Produced in conjunction with development of state karst features database and enhanced karst feature inventory of Steele, Dodge, Olmsted and Winona Counties. (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH, Hobbs HC (1995) Depth to bedrock and bedrock topography. Geologic Atlas Fillmore County, Minnesota, County Atlas Series C-8, Part A, Plate 4 (1:100,000). Minnesota Geological Survey, University of Minnesota
Mossler JH, Tipping RG (2000) Bedrock geology and structure of the seven-county metropolitan twin cities area, Minnesota. Miscellaneous Map Series, m-104 (1:100,000). Minnesota Geological Survey, University of Minnesota
Olsen BM (1988a) Bedrock geology. Geologic Atlas Olmsted County, Minnesota, County Atlas Series C-3, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Olsen BM (1988b) Depth to bedrock and bedrock topography. Geologic Atlas Olmsted County, Minnesota, County Atlas Series C-3, Plate 4 (1:100,000). Minnesota Geological Survey, University of Minnesota
Palmquist RC (1977) Distribution and density of dolines in areas of mantled karst. In: Dilamarter RR, Csallany SC (eds) Hydrologic problems in karst regions: International symposium on hydrologic problems in karst regions. Bowling Green, Ky., pp 117–129
Quinlan JR (1990) Decision trees and decision making. IEEE Trans Syst Man Cybern 20(2):339–346
Runkel AC (1996) Bedrock geology of Houston County, Minnesota. Minnesota Geological Survey open file report 96–4, 3 pls. Scale 1:100,000. Pl. 1, bedrock geology; pl. 2, bedrock topography; pl. 3, orientation of fractures in carbonate rocks; text, 13 p. Minnesota Geological Survey, University of Minnesota
Runkel AC (1998) Bedrock geology. Geologic Atlas of Goodhue County, Minnesota, County Atlas Series C-12, Part A, Plate 2 (1:100,000). Minnesota Geological Survey, University of Minnesota
Setterholm DR, Bloomgren BA (1998) Bedrock topography. Geologic Atlas of Goodhue County, Minnesota, County Atlas Series C-12, Part A, Plate 5 (1:100,000). Minnesota Geological Survey, University of Minnesota
Shade BL (2002) The genesis and hydrogeology of a sandstone karst in Pine County. M.S. thesis. University of Minnesota
Shade BL, Alexander SC, Alexander EC Jr, Martin S (2001) Sinkhole distribution, Depth to bedrock, and Bedrock topography. Geologic Atlas of Pine County, Minnesota, County Atlas Series C-13, Part A, Plate 6 (1:100,000). Minnesota Geological Survey, University of Minnesota
Shekhar S, Chawla S (2002) Spatial databases: a tour. Prentice Hall, 300 pp
Tan P-N, Steinbach M, Kumar V (2005) Introduction to data mining. Addison Wesley, Reading, USA, 769 pp
Troester JW, Moore JE (1989) Karst hydrogeology in the United States of America. Episodes 12(3):172–178
Upchurch SB, Littlefield JR Jr (1987) Evaluation of data for sinkhole-development risk models. In: Beck BF, Wilson WL (eds) Karst hydrogeology: Engineering and environmental applications. Proceedings of the second multidisciplinary conference on sinkholes and the environmental impacts of karst. Orlando, Florida, 9–11 February, A. A. Balkema, Rotterdam, pp 359–364
Veni G (1987) Fracture permeability: implications on cave and sinkhole development and their environmental assessments. In: Beck BF, Wilson WL (eds) Karst hydrogeology: Engineering and environmental applications. Proceedings of the second multidisciplinary conference on sinkholes and the environmental impacts of karst. Orlando, Florida, 9–11 February, A. A. Balkema, Rotterdam, pp 101–105
Water Resources Center (1999a) Bedrock geology of the 13 counties of south central Minnesota. 13 County ArcView GIS (1:150,000). Mankato State University
Water Resources Center (1999b) Depth to bedrock of the 13 counties of south central Minnesota. 13 County ArcView GIS (1:150,000). Mankato State University


Whitman D, Gubbels T (1999) Applications of GIS technology to the triggering phenomena of sinkholes in central Florida. In: Beck BF, Pettit AJ, Herring GJ (eds) Hydrogeology and engineering geology of sinkholes and karst. Proceedings of the seventh multidisciplinary conference on sinkholes and the engineering and environmental impacts of karst. Harrisburg-Hershey, Penn., 10–14 April, A. A. Balkema, Rotterdam, pp 67–73
Williams P (1972) The analysis of spatial characteristics of karst terrains. In: Chorley RJ (ed) Spatial analysis in geomorphology. Harper & Row, New York, pp 135–163
Winston PH (1992) Learning by building identification trees. In: Winston P (ed) Artificial intelligence. Addison-Wesley, Reading, USA, pp 423–442
Witthuhn MK, Alexander EC Jr (1995) Sinkholes and sinkhole probability. Geologic Atlas Fillmore County, Minnesota, County Atlas Series C-8, Part B, Plate 8 (1:100,000). Minnesota Department of Natural Resources, Division of Waters
Wopat MA (1974) The karst of southeastern Minnesota; and methods for statistical analysis of polymodal two-dimensional orientation data. M.S. thesis. University of Wisconsin-Madison
Zhou W, Beck BF, Adams AL (2003) Application of matrix analysis in delineating sinkhole risk areas along highway (I-70 near Frederick, Maryland). Environ Geol 44(7):834–842
