Analyzing Data with 1D Non-Linear Shapes Using Topological Methods
Dissertation

Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate School of The Ohio State University

By
Suyi Wang, M.S.
Graduate Program in Computer Science and Engineering
The Ohio State University
2018

Dissertation Committee:
Yusu Wang, Advisor
Rephael Wenger
Tamal K. Dey

© Copyright by Suyi Wang 2018

Abstract

Shape analysis has been applied in many applications across a broad range of domains. Among the various families of "complex" shapes, those with 1D non-linear topological structures (skeletons) are particularly interesting. These shapes are simple, as they can be decomposed into 1D pieces, yet still informative in representing the connectivity and other important information behind data. In this thesis, I focus on two objects from computational topology that have been useful for modeling the skeleton of data: the Reeb graph (and its variants) and the 1-(un)stable manifold from discrete Morse theory. I study their properties as well as their applications to shape analysis. Both objects are already widely used in practice; this thesis provides further theoretical understanding and applications of them in modeling and studying the skeleton of data.

The first part of the dissertation concerns the Reeb graph and its loop-free variant, the contour tree, which can be used to provide a 1D tree summary of an input scalar field. The contour tree has been commonly used in computer graphics and visualization. I investigate a problem concerning the theoretical understanding of the contour tree, and I develop a variant of the Reeb graph to address the issue of noise.

Carr et al. have proposed an algorithm for computing the contour tree for a piecewise linear function defined on a simplicial complex domain.
This algorithm is simple and efficient, and has been widely used in practice. However, the algorithm is often applied even when the output contour tree may not exist, in which case it may not terminate, or may exit with only partial output. My work provides a new understanding of this contour tree algorithm and characterizes the cause of such behavior. I also propose a simple variation of the contour tree (called the JS-graph) to handle this situation in practice.

The Reeb graph, which provides a general 1D summary of a scalar field, has been used to extract the skeleton behind data. However, when there is a significant amount of ambient noise, the Reeb graph may no longer reflect the structures of the data. To handle such ambient noise, I propose and develop a concept called the "gradational Reeb graph", which incrementally merges the Reeb graphs from different density levels while keeping structures from high-density regions. I demonstrate that when extracting a road network from GPS samples, the gradational Reeb graph can capture finer structures that the traditional Reeb graph cannot.

In the second part of my thesis, I focus on the 1-(un)stable manifold from Morse theory, which is another topological object for modeling and extracting the skeleton of data. I develop a pipeline for the automatic map reconstruction problem, which aims to reconstruct the underlying road network, viewed as a hidden geometric graph, from GPS trajectory samples. I also develop a Morse-theory-based approach that can integrate multiple maps (graphs) into a single one, as well as correct an existing map using a partial but more trustworthy map. The effectiveness of the method is demonstrated by reconstructing maps from GPS trajectories sampled in the cities of Athens, Beijing, Berlin and Chicago. This work proposes a new methodology with a simpler pipeline that handles large maps with better performance, advancing the state of the art.
The closing topic of this thesis is the extension of the 1-stable manifold to a 3D application, neuron tracing, which asks to extract trees representing neurons from digital images. Here I aim to tackle challenges such as the massive size and noisy nature of the data. I have developed a new algorithm that handles large new 3D meso-scale brain datasets with a divide-and-conquer topological method. Experiments demonstrate that the proposed method obtains quantitatively better results than existing algorithms in tracing single neurons, while it can also produce a summary of trends for more challenging and general injection data containing multiple neurons.

Dedicated to my parents, advisor, those I love and those who love me

Acknowledgments

First of all, I would like to sincerely express my gratitude to my advisor, Dr. Yusu Wang. She led me into the field of topological data analysis and has tremendously supported me throughout my studies. She helped me with her experience and insightful understanding of the field in selecting research topics, tackling critical problems and presenting the work. I am also grateful that she has been patient and tolerant towards my mistakes, and has sometimes even thought ahead of me to prevent potential detours. I am fortunate to have had Dr. Wang as my advisor on the Ph.D. journey. These accomplishments would not have been achieved without her advice.

I would like to thank Dr. Tamal Dey and Dr. Rephael Wenger for their advice on my work and my defense.

Thanks to the graduate students I have met in the group: Chuanjiang Luo, Lei Wang, Xiaoyin Ge, Fengtao Fan, Andrew Slatton, Alfred Rossi, Dayu Shi, Zhe Dong, Sayan Mandal, Jiayuan Wang, Tianqi Li, Dingkang Wang, Cheng Xin, Elena Farahbakhshtouli, Ryan Slechta and Minghao Tian. Thanks to all the friends I have made during the Ph.D. studies for their companionship.
Finally, I would like to thank my parents and family, especially my uncle Yuan Ma and Yixing Yuan, for their advice and support.

Vita

2011            B.S. Physics, Beijing Normal University
2015            M.S. Computer Science and Engineering, The Ohio State University
2016-present    Ph.D. Candidate, Computer Science and Engineering, The Ohio State University

Publications

Research Publications

S. Wang, Y. Wang, Y. Li. Efficient Map Reconstruction and Augmentation via Topological Methods. In Proceedings of the 23rd SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL '15), 25:1-25:10, 2015.

S. Wang, Y. Wang, R. Wenger. JS-Graph of Join and Split Trees. In Proceedings of the Thirtieth Annual Symposium on Computational Geometry (SOCG '14), 539:539-539:548, 2014.

L. Che, Y. Xiao, S. Wang, Q. Jiang. Application of K-means Clustering Analysis in Chinese Pronunciation Degree. Computer Technology and Development, Vol. 21, 223-225, 2011.

Fields of Study

Major Field: Computer Science and Engineering

Table of Contents

Abstract
Dedication
Acknowledgments
Vita
List of Tables
List of Figures

1. Introduction
   1.1 Overview
   1.2 Representing Data Skeleton via Reeb graph
       1.2.1 Computing Contour Trees
       1.2.2 Handling Noise in Reeb graph
   1.3 Representing Data Skeleton via 1-stable manifold
       1.3.1 DiMorSC Implementation
       1.3.2 Automatic Map Reconstruction and Augmentation
       1.3.3 Neuron Tracing
2. Preliminary
   2.1 Simplicial complex
   2.2 Critical points and 1-(un)stable manifold
   2.3 Reeb graph
   2.4 Persistence Simplification
3. JS-Graph of Join and Split trees
   3.1 Constructing a Contour Tree
       3.1.1 Computing Join / Split Tree
       3.1.2 Computing the Merge Tree
       3.1.3 Types of problematic output
   3.2 JS-graphs and its Characterization
       3.2.1 Definition of JS-graphs
       3.2.2 Characterization of JS-graphs
   3.3 An Efficient Algorithm to Compute A Linear-size JS-graph
       3.3.1 The algorithm
       3.3.2 Correctness of Algorithm 3
       3.3.3 Time complexity analysis
   3.4 Proofs in Computing JS-Graph
       3.4.1 Missing Details from Section 3.2
       3.4.2 Missing Details from Section 3.4.2
       3.4.3 Characterization of Join and Split Trees which have Merge Trees
   3.5 Concluding Remarks
4. Gradational Reeb Graph
   4.1 Representing data skeleton with Reeb graph
   4.2 Gradational Reeb Graph
   4.3 Experiment
       4.3.1 GPS data under threshold
       4.3.2 Merged Result
5. Map Reconstruction
   5.1 Related Work
   5.2 Map Reconstruction via Morse Theory
       5.2.1 (Step 1) Density field construction
       5.2.2 (Step 2) Ridge extraction
       5.2.3 DiMorSC Implementation
   5.3 Map Integration and Augmentation
   5.4 Results
6. Neuron Tracing
   6.1 Related Work
   6.2 Method
       6.2.1 Pipeline
       6.2.2 Stitching
       6.2.3 Tree simplification
   6.3 Results
       6.3.1 Tracing a single neuron
       6.3.2 Tracing multiple larger neurons
       6.3.3 Simplifying the neuron bundle
7. Conclusion and Future Work

Bibliography

List of Tables

6.1 Running time in seconds
6.2 DIADEM score

List of Figures

1.1 The Reeb graph of a uniform function on a surface.
1.2 Reconstruct road map from GPS traces
2.1 Critical points in 3D: For a maximum (minimum), the function values within a small ball are all lower (higher) than it. For a 2-critical point, the points with function values lower than it form a connected component while the points with function values higher than it are separated into two components.
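
The description of critical points in the caption of Figure 2.1 can be illustrated with a small sketch. This is not the dissertation's implementation. It assumes a scalar function sampled on an integer 3D grid, uses the 26 surrounding grid cells as a stand-in for the "small ball", and assumes that neighboring values are distinct. The names `classify` and `count_components` are hypothetical, and so is the label "1-critical" for the mirrored saddle case, which the caption does not name.

```python
from itertools import product

# The 26 grid offsets surrounding a point (an assumed stand-in for a "small ball").
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def count_components(cells):
    """Count groups of offsets that are connected through 26-adjacency."""
    cells = set(cells)
    components = 0
    while cells:
        components += 1
        stack = [cells.pop()]
        while stack:
            c = stack.pop()
            # Unvisited cells at Chebyshev distance 1 from c belong to the same group.
            near = {n for n in cells
                    if max(abs(a - b) for a, b in zip(c, n)) == 1}
            cells -= near
            stack.extend(near)
    return components


def classify(f, p):
    """Classify grid point p of function f by its lower and higher neighbors."""
    value = f(*p)
    lower, upper = [], []
    for d in OFFSETS:
        neighbor_value = f(*(a + b for a, b in zip(p, d)))
        if neighbor_value < value:
            lower.append(d)
        elif neighbor_value > value:
            upper.append(d)
    if not lower and not upper:
        return "flat"            # all neighbors tie; outside the stated assumptions
    if not upper:
        return "maximum"         # every neighbor is lower
    if not lower:
        return "minimum"         # every neighbor is higher
    n_lower, n_upper = count_components(lower), count_components(upper)
    if n_lower == 1 and n_upper == 2:
        return "2-critical"      # lower part connected, higher part in two pieces
    if n_lower == 2 and n_upper == 1:
        return "1-critical"      # mirrored case (hypothetical label)
    if n_lower == 1 and n_upper == 1:
        return "regular"
    return "degenerate"


origin = (0, 0, 0)
print(classify(lambda x, y, z: -(x * x + y * y + z * z), origin))    # maximum
print(classify(lambda x, y, z: -x * x - y * y + 3 * z * z, origin))  # 2-critical
```

In the second call, the eight neighbors in the plane z = 0 are lower and form one ring, while the nine neighbors above and the nine below are higher and do not touch each other, which matches the caption's description of a 2-critical point.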