Massive Data Management and Sharing Module for Connectome Reconstruction
Total Page:16
File Type:pdf, Size:1020Kb
brain sciences Article Massive Data Management and Sharing Module for Connectome Reconstruction Jingbin Yuan 1 , Jing Zhang 1, Lijun Shen 2,*, Dandan Zhang 2, Wenhuan Yu 2 and Hua Han 3,4,5,* 1 School of Automation, Harbin University of Science and Technology, Harbin 150080, China; [email protected] (J.Y.); [email protected] (J.Z.) 2 Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; [email protected] (D.Z.); [email protected] (W.Y.) 3 The National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China 4 Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences, Shanghai 200031, China 5 The School of Future Technology, University of Chinese Academy of Sciences, Beijing 100049, China * Correspondence: [email protected] (L.S.); [email protected] (H.H); Tel.: +86-010-82544385 (L.S.); +86-010-82544710 (H.H) Received: 30 April 2020; Accepted: 20 May 2020; Published: 22 May 2020 Abstract: Recently, with the rapid development of electron microscopy (EM) technology and the increasing demand of neuron circuit reconstruction, the scale of reconstruction data grows significantly. This brings many challenges, one of which is how to effectively manage large-scale data so that researchers can mine valuable information. For this purpose, we developed a data management module equipped with two parts, a storage and retrieval module on the server-side and an image cache module on the client-side. On the server-side, Hadoop and HBase are introduced to resolve massive data storage and retrieval. The pyramid model is adopted to store electron microscope images, which represent multiresolution data of the image. A block storage method is proposed to store volume segmentation results. We design a spatial location-based retrieval method for fast obtaining images and segments by layers rapidly, which achieves a constant time complexity. On the client-side, a three-level image cache module is designed to reduce latency when acquiring data. Through theoretical analysis and practical tests, our tool shows excellent real-time performance when handling large-scale data. Additionally, the server-side can be used as a backend of other similar software or a public database to manage shared datasets, showing strong scalability. Keywords: connectome; massive data management; distributed storage and retrieval; electron microscope image; segmentation result; image cache 1. Introduction The brain is the organ of thought, and as the most advanced part of the nervous system, it dominates all the activities of the body. To explore the brain, scientists have done much research and have a basic understanding of its functions. For example, MRI technology is adopted to study the connections between different gray matter areas of the brain on a macro scale (millimeter scale) [1]. Based on optical technology, the projection path of a single neuron axon and its connection with upstream and downstream neurons are studied at the mesoscopic scale (micron scale) [2,3]. To ensure that all parts of the brain work together and drive the body’s activities, the transmission of signals, and the connection between the large number of neurons is still a mystery. The significant number of neurons and the complex connections between them make the structure and function of the brain extremely complicated. One of the ways to explore the mechanism is to use the electron microscope (EM) to image sequence Brain Sci. 2020, 10, 314; doi:10.3390/brainsci10050314 www.mdpi.com/journal/brainsci Brain Sci. 2020, 10, 314 2 of 15 Brain Sci. 2020, 10, x FOR PEER REVIEW 2 of 15 slices(EM) of the to brainimage tissue sequence and slices then of draw the brain the connectometissue and then of draw it [4 –the6]. connectome It possesses of oneit [4–6]. advantage It possesses that the imagingone modeadvantage has enoughthat the highimaging spatial mode resolution has enough (nanoscale) high spatial to clearlyresolution distinguish (nanoscale) the to connections clearly concerningdistinguish the delicatethe connections structures concerning of the neurons, the delicate such structures as connections of the neurons, between such synapses. as connections Thebetween reconstruction synapses. of the connectome usually requires a series of steps such as sample preparation, The reconstruction of the connectome usually requires a series of steps such as sample slicing, EM imaging, image registration, automatic segmentation, and proofreading. A schematic preparation, slicing, EM imaging, image registration, automatic segmentation, and proofreading. A diagram of obtaining the sequence images by the electron microscope from a brain tissue block is schematic diagram of obtaining the sequence images by the electron microscope from a brain tissue shownblock in Figure is shown1a,b. in Figure 1a,b. (a) (b) (c) (d) FigureFigure 1. A 1. schematic A schematic diagram diagram of of the the sequential sequential electronelectron microscope microscope images images and and the thesegmentation segmentation data thatdata makesthat makes up aup neuron. a neuron. (a) (a A) A schematic schematic diagram diagram of the the brain brain tissue tissue block; block; (b) (Sequentialb) Sequential electron electron microscopemicroscope (EM) (EM) images images obtained obtained from from the the brain brain tissuetissue block; block; (c ()c Different) Different colors colors represent represent different different segments.segments. Each Each segment segment represents represents a segmentationa segmentation region of of a a neuron neuron in inthe the layer; layer; (d) (Thed) The segments segments on each layer form a neuron. on each layer form a neuron. Table 1. Reconstruction results over the years With the development of advanced technology and the expectation of reconstructing complete connectome,Year the volume ofSample electron microscopeVolume images (um grows3) significantly.Reconstruction The State difficulty of EM images management2011 Visual is increasing. cortex of mice For [7] one, to ensure120 × the80 × continuity130 betweenSparse sequence 1 images and the limitation2013 of theMouse actual inner physical reticulum size [8] of the neural132 × structure,114 × 80 the thicknessSparse of the slice should be thinner and2015 is generally Rat 30–50cortex/retina nm or [9] even 544 nm. × 60 For× 141/93 another, × 60 × in93 order toSparse observe a more delicate 2016 Rat visual thalamus [10] 400 × 600 × 280 Sparse structure, the spatial resolution of imaging is also required to be higher. With the development of EM 2017 Songbird X Area [11] 98 × 96 × 115 Sparse imaging technology,2018 itDrosophila is possible brain to fulfill[12] these750 requirements, × 369 × 286 such as the serialSparse block-face scanning electron microscopy2018 (SBF-SEM)Songbird X [ 7Area], the [13] focused-ion-beam 98 × 96 scanning × 115 electron microscopyDense 2 (FIB-SEM) [8,9], the automated2019 tape-collecting Drosophila ultramicrotome brain [14] scanning995 × 537 electron× 283 microscopyDense (ATUM-SEM) [10,11], TEMCA [112 The], sparse and the state Zeiss means MSEMto reconstruct [13,14 a few]. neurons As a consequence, of interest.2 The dense the imaging state means speed to reconstruct is significantly improvedmost and of more the neurons image in a data tissue can block. be obtained at an acceptable time. Thus, the volume of the image dataWith obtained the development by the EM of advanced becomes technology more massive and the and expectation can be up of toreconstructing terabytes (TB) complete and even petabytesconnectome, (PB). Table the 1volume shows of the electron reconstruction microscope results images over grows the years.significantly. The volume The difficulty of reconstruction of EM is increasingimages management year by year, is increasing. and the state For one, of reconstruction to ensure the continuity gradually between changes sequence from sparse images toand dense. Therefore,the limitation large-scale of the image actual datasets physical will size beof morethe neural common structure, in the the future, thickness with of thethe excellentslice should value be for understandingthinner and the is generally neuronal 30–50 connection nm or andeven the5 nm. mechanism For another, of signalin order transmission to observe a inmore our delicate brain. structure, the spatial resolution of imaging is also required to be higher. With the development of EM imaging technology, it is possibleTable 1. toReconstruction fulfill these requ resultsirements, over such the years as the serial block-face scanning electron microscopy (SBF-SEM) [15], the focused-ion-beam scanning electron microscopy (FIB-SEM) [16,17],Year the automated Sampletape-collecting ultramicrotomeVolume scanning (um3 )electron microscopyReconstruction (ATUM-SEM) State [18,19],2011 TEMCA Visual [20], cortex and ofthe mice Zeiss [15 ]MSEM [21,22]. 120 As80 a 130consequence, the imagingSparse 1speed is × × significantly2013 improved Mouse inner and reticulum more image [16 ]data can be 132 obtained114 at 80an acceptable time. Thus,Sparse the volume × × of 2015the image data Rat obtained cortex/retina by the [17 EM] becomes 44 mo60re massive141/93 and60 can93 be up to terabytesSparse (TB) and × × × × even2016 petabytes Rat (PB). visual Table thalamus 1 shows [18] the reconstruc 400 tion600 results280 over the years. TheSparse volume of × × reconstruction2017 is Songbird increasing X Areayear [19by] year, and