Articles and Minimizes the Loss of Material
Total Page:16
File Type:pdf, Size:1020Kb
Clim. Past, 16, 2415–2429, 2020 https://doi.org/10.5194/cp-16-2415-2020 © Author(s) 2020. This work is distributed under the Creative Commons Attribution 4.0 License. Technical note: A new automated radiolarian image acquisition, stacking, processing, segmentation and identification workflow Martin Tetard1, Ross Marchant1,a, Giuseppe Cortese2, Yves Gally1, Thibault de Garidel-Thoron1, and Luc Beaufort1 1Aix Marseille Univ, CNRS, IRD, Coll France, INRAE, CEREGE, Aix-en-Provence, France 2GNS Science, Lower Hutt, New Zealand apresent address: School of Electrical Engineering and Robotics, Queensland University of Technology, Brisbane, Australia Correspondence: Martin Tetard ([email protected]) Received: 3 June 2020 – Discussion started: 9 July 2020 Revised: 22 September 2020 – Accepted: 5 October 2020 – Published: 2 December 2020 Abstract. Identification of microfossils is usually done ing, processing, segmentation and recognition, is entirely by expert taxonomists and requires time and a significant automated via a LabVIEW interface, and it takes approx- amount of systematic knowledge developed over many years. imately 1 h per sample. Census data count and classi- These studies require manual identification of numerous fied radiolarian images are then automatically exported and specimens in many samples under a microscope, which is saved. This new workflow paves the way for the analysis very tedious and time-consuming. Furthermore, identifica- of long-term, radiolarian-based palaeoclimatic records from tion may differ between operators, biasing reproducibility. siliceous-remnant-bearing samples. Recent technological advances in image acquisition, pro- cessing and recognition now enable automated procedures for this process, from microscope image acquisition to taxo- nomic identification. 1 Introduction A new workflow has been developed for automated radio- larian image acquisition, stacking, processing, segmentation The term radiolarians currently refers to the polycystine ra- and identification. The protocol includes a newly proposed diolarian orders of Spumellaria and Nassellaria (whose shell methodology for preparing radiolarian microscopic slides. is made of opaline silica) that are relatively well preserved We mount eight samples per slide, using a recently developed in the fossil record by comparison with the Acantharia and 3D-printed decanter that enables the random and uniform set- Phaeodaria groups. Radiolarians are marine micro-organisms tling of particles and minimizes the loss of material. Once whose siliceous shells are found in the sedimentary record ready, slides are automatically imaged using a transmitted dating back to their appearance during the Cambrian period light microscope. About 4000 specimens per slide (500 per (Boltovskoy, 1999; Lazarus et al., 2005; Suzuki and Not, sample) are captured in digital images that include stacking 2015). While they have been neglected for a long time in techniques to improve their focus and sharpness. Automated biostratigraphical studies due to several documented cases of image processing and segmentation is then performed using recurrent evolution in the overall morphology of some taxa a custom plug-in developed for the ImageJ software. Each (e.g. Schrock and Twenhofel, 1953; Campbell, 1954; Bjørk- individual radiolarian image is automatically classified by a lund and Goll, 1979), radiolarian taxonomy and stratigra- convolutional neural network (CNN) trained on a Neogene phy have significantly progressed due to Deep Sea Drilling to Quaternary radiolarian database (currently 21 746 images, Project (DSDP) studies since 1968 (Sanfilippo et al., 1985) corresponding to 132 classes) using the ParticleTrieur soft- and are currently of major interest. Radiolarians are com- ware. monly used in biostratigraphy by documenting the presence The trained CNN has an overall accuracy of about 90 %. or absence of key marker species as well as in palaeoceano- The whole procedure, including the image acquisition, stack- graphic reconstructions of the past productivity, temperature and variability of water masses, wherein these approaches Published by Copernicus Publications on behalf of the European Geosciences Union. 2416 M. Tetard et al.: A new radiolarian workflow instead rely on relative species abundances. For both of oped a structured multilayer fat neural network for coccolith these approaches, radiolarians are particularly useful in high- recognition, which was first applied in 2001 to Late Pleis- latitude settings (e.g. the Southern Ocean) where both the tocene primary productivity reconstructions (Beaufort et al., preservation and species diversity of calcareous microfossils 2001). This formed the base for the following Système de Re- are very low. connaissance Automatique de Coccolithes (SYRACO) work- Indeed, radiolarian’s delicate siliceous remains have been flow, which used dynamic neural networks (Beaufort and proven to be important for decades in micropalaeontologi- Dollfus, 2004) and is still operating today. For the past few cal studies focusing on palaeoenvironmental reconstructions years, the field of computer vision has seen the emergence from various oceanic areas to investigate primary and ex- and development of convolutional neural networks (CNNs), port productivity (e.g. Welling et al., 1992; Lazarus, 2002; a deep-learning approach that enables the automated classifi- Abelmann and Nimmergut, 2005; Lazarus et al., 2006; cation of large sets of images. Convolutional neural networks Hernández-Almeida et al., 2013; Matsuzaki et al., 2019), sea are a class of deep neural networks that consist of an input surface temperature (e.g. Abelmann et al., 1999; Lazarus, and an output layer as well as multiple convolutional layers. 2002; Cortese and Abelmann, 2002; Lüer et al., 2008; Panitz This architecture is similar to the organization of the visual et al., 2015; Kamikuri, 2017; Hernández-Almeida et al., cortex in the human brain. 2017; Matsuzaki et al., 2019), water masses (e.g. Welling et Several workflows inspired by SYRACO and now using al., 1992; Kamikuri et al., 2009; Kamikuri, 2017; Hernández- CNNs have been successively developed at CEREGE and ap- Almeida et al., 2017; Matsuzaki et al., 2019) and oxygena- plied to microfossil taxa (e.g. Marchant et al., 2020; Bourel tion (e.g. Matsuzaki et al., 2019) across the Cenozoic. At et al., 2020). Regarding radiolarians, previous attempts have present, radiolarian assemblages are considered to be con- mainly focused on the identification step. Apostol et al. sistent and valuable micropalaeontological bioindicators, as (2016) used morphometrical measurements and support vec- they are largely distributed in all oceans dating back to their tor machine methods on four radiolarian species recovered appearance and can be very abundant in sediments (e.g. San- from Triassic sediments. In 2017, Keceli et al.(2017) in- filippo et al., 1985; Boltovskoy, 1998; Hernández-Almeida vestigated scanning electron microscope (SEM) images of et al., 2017). 27 selected Triassic species. Renaudie et al.(2018) recently However, despite their usefulness for such investigations, achieved promising results focusing on the automated iden- radiolarians are not as utilized in the same way as other mi- tification of species from the same genus with transmitted crofossil groups, such as benthic and planktic foraminifera, light microscope images. They obtain an overall identifica- or nannofossils, such as coccolithophorids. Experts on living tion accuracy of 73 %, achieved over 16 species from 2 gen- and fossil radiolarians are relatively scarce, and some radi- era, where the morphological difference between species can olarian species still lack a satisfactory taxonomy, especially be very tricky. for taxa within the order Spumellaria (Riedel, 1967; Sanfil- In this paper, we also propose a workflow for the auto- ippo et al., 1985). Identification of a substantial and suffi- mated identification of radiolarians. Our approach differs in cient number of specimens per sample (usually about 300 that we wanted to generate a neural network that could rec- for reliable assemblage composition estimations; Fatela and ognize most of the common radiolarian species, rather than Taborda, 2002) is very time-consuming and requires a con- those of a specific genus, in order to investigate their abun- sistent and detailed taxonomic knowledge. Moreover, it is dance (relative and absolute) and diversity and, thus, use common for the determination and taxonomy of recovered them as bioindicators to reconstruct palaeoenvironmental pa- specimens of all microfossil groups to be different between rameters. It is necessary to obtain a large database of im- studies; this is especially true for radiolarians, as the above- ages covering the common species in order to train the net- mentioned factors can be biased by the subjective appreci- work. Of the modern living 400 to 500 polycystine species, ation of the operator, influencing the reproducibility of the about 100 are relatively common (Boltovskoy, 1998); how- census counts. ever, they have yet to be imaged to create a database for au- Recent technological advances in image acquisition, pro- tomated recognition purposes. Some online Cenozoic radi- cessing and recognition have enable automated procedures, olarian databases have already existed for a few years (e.g. from microscopic slide field-of-view acquisition to taxo- WoRaD – Boltovskoy et al., 2010; radiolaria.org – Dolven nomic identification, that can ease radiolarian studies. In the and Skjerpen, 2006;