Mammalian Faunas, Ecological Indices, and Machine-Learning Regression for the Purpose of Paleoenvironment Reconstruction in the Miocene of South T America ⁎ Jackson P
Total Page:16
File Type:pdf, Size:1020Kb
Palaeogeography, Palaeoclimatology, Palaeoecology 518 (2019) 155–171 Contents lists available at ScienceDirect Palaeogeography, Palaeoclimatology, Palaeoecology journal homepage: www.elsevier.com/locate/palaeo Mammalian faunas, ecological indices, and machine-learning regression for the purpose of paleoenvironment reconstruction in the Miocene of South T America ⁎ Jackson P. Spradleya,b, , Bryan J. Glazerc, Richard F. Kayb,d a University of North Carolina-Wilmington, Department of Biology and Marine Biology, Wilmington, NC 28403, United States of America b Duke University, Department of Evolutionary Anthropology, Durham, NC 27708, United States of America c Vanderbilt University, Department of Biomedical Informatics, Nashville, TN 37232, United States of America d Duke University, Division of Earth and Ocean Sciences, Nicholas School of the Environment, 27710, United States of America ARTICLE INFO ABSTRACT Keywords: Reconstructing paleoenvironments has long been considered a vital component for understanding community Gaussian process regression structure of extinct organisms, as well as patterns that guide evolutionary pathways of species and higher-level Random forest taxa. Given the relative geographic and phylogenetic isolation of the South American continent throughout Santa Cruz much of the Cenozoic, the South American fossil record presents a unique perspective of mammalian community La Venta evolution in the context of changing climates and environments. Here we focus on one line of evidence for paleoenvironment reconstruction: ecological diversity, i.e. the number and types of ecological niches filled within a given fauna. We propose a novel approach by utilizing ecological indices as predictors in two regressive modeling techniques—Random Forest (RF) and Gaussian Process Regression (GPR)—which are applied to 85 extant Central and South American localities to produce paleoecological prediction models. Faunal richness is quantified via ratios of ecologies within the mammalian communities, i.e. ecological indices, which serve as predictor variables in our models. Six climate/habitat variables were then predicted using these ecological in- dices: mean annual temperature (MAT), mean annual precipitation (MAP), temperature seasonality, precipita- tion seasonality, canopy height, and net primary productivity (NPP). Predictive accuracy of RF and GPR is markedly higher when compared to previously published methods. MAT, MAP, and temperature seasonality have the lowest predictive error. We use these models to reconstruct paleoclimatic variables in two well-sampled Miocene faunas from South America: fossiliferous layers (FL) 1–7, Santa Cruz Formation (Early Miocene), Santa Cruz Province, Argentina; and the Monkey Beds unit, Villavieja Formation (Middle Miocene) Huila, Colombia. Results suggest general concordance with published estimations of precipitation and temperature, and add in- formation with regards to the other climate/habitat variables included here. Ultimately, we believe that RF and GPR in conjunction with ecological indices have the potential to contribute to paleoenvironment reconstruction. 1. Introduction (Fleming, 1973; Andrews et al., 1979; Kay and Madden, 1997; Reed, 1998; Kay et al., 2012a; Di Giacomo and Fariña, 2017). For the pur- Reconstructing paleoenvironments has long been considered a vital poses of this study, we focus on this latter form of evidence, ecological component for understanding the paleobiology of extinct organisms, as richness. That is, the number and types of ecological niches well as the evolutionary pathways of species and higher-level taxa. (Hutchinson, 1957) filled within any given faunal community. This However, the ability for researchers to reconstruct past environments is study proposes a novel approach for the reconstruction of climate necessarily limited by the available lines of evidence; including geo- parameters in the fossil record by utilizing two machine-learning al- chemical indicators of the environment (Wilson, 1974; Hayes et al., gorithms—the random forest (RF) and Gaussian process regression 1990; Kohn, 2010; Cerling et al., 2011), the adaptations of the organ- (GPR)—which are applied to faunal diversity and richness, and climate isms present in fossil assemblages (Andrews and Van Couvering, 1975; data from several extant Central and South American localities to Vizcaíno et al., 2017), and the ecological richness of the assemblage produce a paleoecology prediction model. In order to gauge the ⁎ Corresponding author at: University of North Carolina-Wilmington, Department of Biology and Marine Biology, Wilmington, NC 28403, United States of America. E-mail address: [email protected] (J.P. Spradley). https://doi.org/10.1016/j.palaeo.2019.01.014 Received 13 September 2018; Received in revised form 8 January 2019; Accepted 9 January 2019 Available online 15 January 2019 0031-0182/ © 2019 Elsevier B.V. All rights reserved. J.P. Spradley et al. Palaeogeography, Palaeoclimatology, Palaeoecology 518 (2019) 155–171 predictive power of the generated models, the models were used to that niche structure covaries predictably with precipitation (Kay et al., estimate climate parameters at several extant Australian sites based on 1997). (For a recent example see Robinson et al., 2017). The former their faunal assemblages. Australia serves as an ideal test for our data approach of Andrews et al. (1979), for reconstructing the habitat type for two primary reasons: 1) due to the relative geographic isolation of reveals a potential flaw of the approach of Kay et al. (1997); namely, Australia, the mammalian fauna there is phylogenetically distinct from focusing on a specific climate variable fails to consider that a climate that of South America, and thus similar ecologies and morphologies variable (such as precipitation) is related to community structure only between the two faunas are almost certainly due to convergence rather insofar as it is also correlated with a certain habitat type (specified by than a signal of phylogenetic relationship; and 2) Australia is a large Andrews et al. as the vegetational structure). In other words, by using continent with a great diversity of climates and habitats, much like community structure to reconstruct a single climate variable like rain- South America. fall, we are potentially losing track of the important matter of de- To the degree that we find the accuracy of our model for estimating scribing the habitat in which these animals lived. On the other hand, modern climate parameters to be accurate between the two phylogen- using quantitative variables eliminates the need to subjectively char- etically distinct faunas, we have gained confidence for its use for re- acterize a habitat into one category or another. To reconcile these two constructing paleoclimatic variables in two well-sampled Miocene approaches, we include a number of different climatic variables (tem- faunas from South America, the Santa Cruz fauna of Argentina and the perature, precipitation, and seasonality) as well as two quantitative La Venta fauna of Colombia. The former is an unusually well sampled variables directly associated with the botanic habitat (canopy height (from the perspective of species richness and fossil completeness) late and net primary productivity). Combined, these variables can give a Early Miocene fauna from the far southern tip of South America. The specific description of the habitat without relying on somewhat sub- latter is the only Middle Miocene neotropical locality for which we have jective qualitative descriptions of habitat. a relatively complete sample of well-preserved mammal taxa. For both In the context of paleoclimate reconstruction, it is thus proposed formations, we are restricting our analysis to specific well-sampled that the kinds and proportions of species within specific niches within a stratigraphic levels that have previously been identified as approx- mammalian fauna as a whole can serve as a proxy for habitat structure imating a contemporary fauna, fossiliferous levels 1–7 (FL 1–7) of the or climate, as has been done in a number of previous studies (e.g., Santa Cruz formation (Kay et al., 2012b) and the Monkey Beds of the Andrews et al., 1979; Gingerich, 1989; Fariña, 1996; Vizcaíno et al., Honda Group in La Venta (Kay and Madden, 1997). We compare our 2010; Kay et al., 2012a, 2012b; Robinson et al., 2017). Such studies are results to those of Kay and Madden (1997) and Kay et al. (2012b). limited by a multitude of factors, specifically related to the difficulties inherent to sampling, both in living faunas and in the fossil record. In 2. Background relation to living faunas, an incomplete sampling effort can lead to the exclusion of rare species from a faunal list. It can also be difficult to give That mammal species occupying similar habitats have convergent an accurate estimation of the general climate or habitat, depending on adaptations is one of the foundational blocks of natural philosophy, the length of the observation. In order to overcome these potential including Cuvier (Gould, 1982) and Darwin (Darwin, 1859), and serves pitfalls, long-term field studies can be used. Such stations collect data as the theoretical basis for the use of ecological richness as an avenue on the fauna present as well as track various climatic variables (e.g., for understanding and reconstructing paleoecology. Referred to by precipitation, temperature, etc.) over decades, providing an approx- Charles Elton in 1927 as ecological “vicars”, Darwin, Elton, and