<<

UNIVERSIDAD DE CHILE
FACULTAD DE CIENCIAS FÍSICAS Y MATEMÁTICAS
DEPARTAMENTO DE CIENCIAS DE LA COMPUTACIÓN

A ROBUST VOID-FINDING ALGORITHM USING COMPUTATIONAL GEOMETRY AND PARALLELIZATION TECHNIQUES

TESIS PARA OPTAR AL GRADO DE MAGÍSTER EN CIENCIAS, MENCIÓN COMPUTACIÓN

DEMIAN ALEY SCHKOLNIK MÜLLER

PROFESOR GUÍA:
BENJAMÍN BUSTOS CÁRDENAS
NANCY HITSCHFELD KAHLER

MIEMBROS DE LA COMISIÓN:
MAURICIO CERDA VILLABLANCA
GONZALO NAVARRO BADINO
MAURICIO MARÍN CAIHUAN

SANTIAGO DE CHILE
2018

Resumen

The current and most widely accepted cosmological model of the universe is called Lambda Cold Dark Matter. It is the simplest model that provides a reasonably good account of the evidence observed so far. The model suggests the existence of large-scale structures present in our universe: nodes, filaments, walls, and voids. Voids are of great interest to astrophysicists, since their observation serves as a validation of the model. Voids are usually defined as low-density regions in space, with only a few galaxies inside them. In this thesis, we present a study of the current state of void-finding algorithms. We show the different techniques and approaches, and we attempt to deduce the algorithmic complexity and memory usage of each void finder presented. We then present our new void-finding algorithm, called ORCA. It was built using Delaunay triangulations to find the nearest neighbors of each point. Using this, we classify points into three categories: center, border, and outliers. Outliers are removed as noise. We classify the triangles of the triangulation into void triangles and central triangles. This is done by checking a distance criterion, and whether the triangles contain outliers. This method allows us to create a fast and robust void-finding algorithm. Additionally, a parallel version of the algorithm is presented.

Abstract

Cosmic voids are generally described as large, underdense regions of the Universe. Over the past years, there have been many attempts to build algorithms to find voids in the large-scale structure of the Universe. There are many different methods, but most approaches do not consider robustness. In this thesis, we present an efficient, fast and robust void-finding algorithm, built using a series of computational geometry and parallelization techniques. We take advantage of the properties of certain structures, such as Delaunay triangulations, and of k-nearest-neighbour search algorithms. Additionally, we made a parallel version of the algorithm (on GPU), a useful feature for large data sets, since it speeds up running time. We successfully built a cosmic void-finding algorithm that is both robust and efficient. We tested the algorithm on randomly generated two-dimensional data sets, and found most voids on most sets, with an average retrieval rate above 90%. In order to test robustness, we inserted random noise into voids, and the algorithm proved to be highly tolerant to it, still detecting a void even with 200 noise points inside it. Regarding running time, the new algorithm is around three times as fast as the algorithm against which it was benchmarked. The parallel version is about twice as fast as the sequential algorithm.

Dedicatoria

Dedicated to my parents, my thesis advisors, and my friends.

Contents

1 Introduction 1

1.1 Motivation...... 1

1.2 Research Questions...... 2

1.3 Hypothesis...... 2

1.4 General Objective...... 2

1.5 Specific Objectives...... 2

1.5.1 Development Plan...... 3

1.6 Contributions...... 3

1.6.1 Algorithmic Complexity and running time...... 3

1.6.2 Memory Usage...... 3

1.6.3 Effectiveness...... 3

1.6.4 Robustness...... 4

1.6.5 Parallelization...... 4

2 Basic Concepts 5

2.1 Review of current Literature...... 5

2.1.1 Adaptive tree (ART)...... 5

2.1.2 Gridding, and cube growing...... 6

2.1.3 Distance Field by gridding, then climbing algorithm...... 7

2.1.4 Statistical analysis...... 8

2.1.5 Delaunay tetrahedra / triangulation...... 8

2.1.6 Distance Field and Watershed techniques...... 10

2.1.7 Voronoi Tessellation and watershed techniques...... 11

2.1.8 Wall Builder and Sphere Growing...... 11

2.1.9 Discussion...... 13

2.2 Computational Techniques...... 15

2.2.1 Delaunay Triangulation...... 15

2.2.2 K-Nearest-Neighbor Search...... 15

2.2.3 KD-Tree...... 16

2.2.4 Parallelization, OpenCL and pyOpenCL...... 17

2.3 Research Methodology...... 17

2.3.1 Performance Indicators...... 18

3 Cosmic Void-Finding Algorithm 19

3.1 First Approaches...... 19

3.1.1 Brute Force...... 19

3.1.2 Delaunay Triangulation...... 22

3.1.3 KD-Tree...... 24

3.2 Improved Solutions...... 25

3.2.1 KD-Tree with high k + Image Processing...... 25

3.2.2 ORCA: Higher generation Delaunay Triangulation k-NN search for Noise removal + edge removal...... 26

3.3 Discussion and selection of algorithm...... 29

3.4 Parallel Approach...... 30

3.4.1 Parallelization tools and frameworks...... 30

3.4.2 First approach...... 31

3.4.3 Final parallel version...... 32

4 Results and Analysis 34

4.1 Generated Samples...... 34

4.2 Void-finding algorithm comparison...... 38

4.2.1 Comparison conditions and Data Sets...... 38

4.2.2 DELFIN and ORCA...... 38

4.2.3 Performance Indicators...... 39

5 Conclusions and Future Work 47

5.1 Conclusions...... 47

5.2 Future Work...... 48

Bibliography 49

List of Tables

2.1 Overview of existing void-finders...... 6

4.1 Running time comparison between DELFIN, ORCA, and ORCA Parallel...... 39

4.2 Memory used by DELFIN and by ORCA...... 40

List of Figures

2.1 A Delaunay Triangulation with circumcircles shown...... 15

2.2 A representation of 3-NN search [?]...... 16

2.3 KD-Tree...... 16

3.1 KD-Tree algorithm run with n=8192 points, k=9, ε=100, plotting center points only..... 25

3.2 Example of second-gen Triangulation Neighbors...... 27

4.1 A randomly generated 4096-point dataset...... 35

4.2 Voids found on a randomly generated 4096-point dataset, using 4th generation Delaunay neighbors, ε value of 100 and k value of 15...... 35

4.3 A randomly generated 8192-point dataset...... 36

4.4 Voids found on a randomly generated 8192-point dataset, using 3rd generation Delaunay neighbors, ε value of 100 and k value of 8...... 36

4.5 A randomly generated 32768-point dataset...... 37

4.6 Voids found on a randomly generated 32768-point dataset, using 5th generation Delaunay neighbors, ε value of 80 and k value of 20...... 37

4.7 Recovery and Error Rates of irregular voids over a 1,000-point set, using values of k=3 and ε=150...... 41

4.8 Recovery and Error Rates of irregular voids over a 5,000-point set, using values of k=7 and ε=100...... 42

4.9 Recovery and Error Rates of irregular voids over a 10,000-point set, using values of k=12 and ε=70...... 42

4.10 Recovery and Error Rates of irregular voids over a 50,000-point set, using values of k=15 and ε=50...... 42

4.11 Recovery and Error Rates of regular voids over a 1,000-point set, using values of k=3 and ε=200...... 43

4.12 Recovery and Error Rates of regular voids over a 5,000-point set, using values of k=7 and ε=120...... 43

4.13 Recovery and Error Rates of regular voids over a 10,000-point set, using values of k=10 and ε=100...... 44

4.14 Recovery and Error Rates of regular voids over a 50,000-point set, using values of k=14 and ε=70...... 44

4.15 Recovery and Error Rates of regular voids over a 100,000-point set, using values of k=14 and ε=70...... 44

4.16 Comparison between ORCA (left) and DELFIN (right), using a set of 8,192 points...... 45

4.17 Comparison between ORCA (left) and DELFIN (right), using a set of 8,192 points, showing in green similar voids found, and in orange voids with different shapes found...... 45

4.18 Comparison between ORCA (left) and DELFIN (right), using a set of 10,000 randomized points with a 200-radius void with 75 random noise points inside...... 46

4.19 Comparison between ORCA (left) and DELFIN (middle and right), using a set of 10,000 randomized points with a 200-radius void with 125 random noise points inside. For DELFIN the parameters were adjusted so that it found many small voids (middle) and on the next iteration of parameters it did not find voids (right)...... 46

Chapter 1

Introduction

The current, and most accepted, cosmological model of the universe is called Lambda Cold Dark Matter (ΛCDM). This is the simplest model that provides a reasonably good account of the observed evidence thus far. The model suggests the existence of large-scale structures present in our universe: nodes, filaments, walls, and voids. Voids are of great interest to astrophysicists since their observation serves as a validation for the model. Voids are usually defined as under-dense regions in space, with only a few galaxies inside them. This is, of course, an oversimplification. Many authors have different definitions of voids, which, in turn, makes the task of building a robust void finder very difficult.

1.1 Motivation

Voids can be defined as large, underdense sections of space, with very few or no galaxies. Voids typically have a diameter of 10 to 100 megaparsecs. They were first discovered in 1978 in a pioneering study by Stephen Gregory and Laird A. Thompson at the Kitt Peak National Observatory [?].

The applications of voids are broad and impressive, ranging from shedding light on the current understanding of dark energy, to refining and constraining cosmological evolution models. Voids act as bubbles in the Universe that are sensitive to background cosmological changes. This means that the evolution of a void's shape is in part the result of the accelerating expansion of the Universe. Since this acceleration is believed to be caused by dark energy, studying the changes of a void's shape over a period of time can further refine the ΛCDM model and provide a more accurate dark energy equation of state [?]. Additionally, the abundance of voids is a promising way to constrain the dark energy equation of state [?].

Broadly speaking, there are two main categories of void-finding algorithms. First, there are algorithms that are based on computing all distances between points, such as cube and sphere growing algorithms (see Section 2.1.2 and Section 2.1.8), and those which use a distance field (see Section 2.1.3 and Section 2.1.6). The algorithmic complexity of these algorithms is, for the most part, quadratic, due to the fact that they compute all distances between all points. This makes them somewhat slow for big data sets, and not really scalable.

The second type of algorithms are those that use computational geometry techniques (see Section 2.1.5 and Section 2.1.7). These algorithms, although faster than the quadratic ones, can be less precise. Finally, most algorithms are susceptible to noise in the data.

In this thesis, we propose a new method, using new approaches to find voids, such as combining Delaunay Triangulations with k-nearest-neighbor search algorithms. We develop a new, fast and robust cosmic void-finding algorithm, ORCA, that outperforms existing ones in terms of running time, memory usage, and effectiveness, as well as robustness to noise in the data. We will benchmark the new algorithm and compare it to an existing one (DELFIN [?]), in terms of running time, memory usage, effectiveness, and robustness, in order to validate it.

1.2 Research Questions

• Are there faster and more robust algorithms for void-finding that are as effective as existing ones?

• How can we implement a parallel Void-Finding Algorithm?

1.3 Hypothesis

• K-nearest neighbor search algorithms combined with Delaunay triangulations can be used to build a robust cosmic Void-Finding algorithm that outperforms existing ones in terms of running time, memory usage, effectiveness, and robustness.

• GPU computing can be used to parallelize void-finding algorithms, speeding up running time.

1.4 General Objective

• Create a robust, fast and effective cosmic Void-finding Algorithm, along with a parallel version of it.

1.5 Specific Objectives

• Design and implement a new Void-Finding Algorithm using Delaunay triangulations and k-nearest neighbor search algorithms.

• Design and implement a parallel Void-Finding Algorithm using GPU computing techniques.

• Benchmark the algorithms and compare them to an existing Void-Finding Algorithm, in terms of running time, memory usage, effectiveness, and robustness.

1.5.1 Development Plan

1. Research of the current state of the art.

2. Design, implement and test the k-nearest neighbor search algorithm (k-NNS) over a two-dimensional data set. Refine both parameters (k and epsilon), and try out relevant dependencies (epsilon depending on k and vice versa).

3. Design, implement and test an algorithm to classify points according to the results of k-NNS.

4. Design, implement and test a hybrid algorithm between k-NNS and Delaunay triangulation or Voronoi tessellation.

5. Design, implement and test an algorithm that is able to run parallel and that takes advantage of GPU computing.

6. Benchmark ORCA in terms of running time, memory usage, effectiveness, and robustness, and compare it to DELFIN [?].

1.6 Contributions

1.6.1 Algorithmic Complexity and running time

Most reviewed Void-Finding Algorithms present quadratic algorithmic complexity if they are implemented without special data structures or algorithms. This means that as the data grows, the running time becomes impractical. In this sense, the new algorithm has an algorithmic complexity lower than that bound, namely O(n log n). Also, the running time of ORCA is similar to or faster than that of DELFIN, and the parallel version of ORCA is faster still.

1.6.2 Memory Usage

In terms of memory, the reviewed algorithms should not perform badly; in fact, most use an amount of memory linear in the size of the data. ORCA also uses a linear amount of memory, that is, only as much memory as the original data set multiplied by some constant factor. Concretely, ORCA uses a base memory of ∼38 MB, plus ∼0.0018 MB per point (so, for example, a 100,000-point data set needs roughly 38 + 180 = 218 MB).

1.6.3 Effectiveness

When detecting cosmic voids, we set out to recover at least 80% of the voids found by existing void finders. When analyzing individual voids, we also set out to achieve at least a 50% overlap between the areas of the voids found by our algorithm and those found by existing ones (DELFIN [?]). Both goals were achieved and surpassed.

1.6.4 Robustness

The new algorithm is very robust. This implies that strange sets of data do not break it. It also implies that large data sets are processed correctly and that voids are found, even in the presence of noise in the data.

1.6.5 Parallelization

The parallel version of ORCA runs on multiple CPU cores and on the GPU at the same time. The most time-consuming part runs in parallel, in order to reduce bottleneck effects. This was an important factor we considered when building the algorithm and choosing which techniques to use.

Chapter 2

Basic Concepts

In the present chapter we will address three points. First, we will talk about the present state of cosmic void-finders, in an extensive review of the current literature (see Section 2.1). Next, we present the computational techniques used throughout this thesis (see Section 2.2). Finally, the research methodology is presented and explained (see Section 2.3).

2.1 Review of current Literature

In this section, a series of related studies on cosmic void finders are shown. The studies have been sorted by algorithmic strategy. Table 2.1 shows a summary of the existing void finders. Since most studies do not present their algorithms directly, we have made an effort to estimate their algorithmic complexity, assuming average cases.

2.1.1 Adaptive tree (ART)

Gottlöber et al. [?] address the problem of voids using high-resolution N-body simulations. All numerical simulations were run using the adaptive tree (ART) N-body code of Kravtsov, Klypin & Khokhlov [?].

The ART has linear running time depending on the number of cells, Nc, i.e. ∼ O(Nc). An adaptive mesh refinement technique is used to achieve high resolution in the regions of interest. The authors start with running a low-resolution simulation and proceed to higher-resolution simulations as they assign velocity and displacement to the particles. Patiri et al. [?] also use this technique in one of their algorithms, which is run over n-body simulations.

The search for voids starts with the construction of the minimal spanning tree of the haloes. The simulation is then searched for the point with the largest distance R1 to the set of haloes. This point is the center of the largest void (with a radius of R1). The search continues with the next point, not already within a void, that has the greatest distance, and so on.

Family | Name
Delaunay / Voronoi | Structure in the 3D Distribution. II. Voids and Watersheds of Local Maxima and Minima
Delaunay / Voronoi | Zipf's law for fractal voids and a new void-finder
Delaunay / Voronoi | On Finding Large Polygonal Voids Using Delaunay Triangulation: The Case of Planar Point Sets
Delaunay / Voronoi | ZOBOV: a parameter-free void-finding algorithm
Distance Field | A Simple Void-Searching Algorithm
Distance Field | Voids in a CDM universe
Distance Field | Statistics of voids in the two-degree Survey
Distance Field | A cosmic watershed: the WVF void detection technique
Growing | Voids in the distribution of galaxies: an assessment of their significance and derivation of a void spectrum
Growing | Void scaling and void profiles in cold dark matter models
Growing | Voids in the PSCz Survey and the Updated Zwicky Catalog
Growing | The Size, Shape and Orientation of Cosmological Voids in the Sloan Digital Sky Survey
Growing | Automated Detection of Voids in Redshift Surveys

Table 2.1: Overview of existing void-finders

As the authors of [?] explain, the search for the minimum spanning tree of haloes can be achieved in ∼O(N). Then comes the complex part, namely the search for the voids themselves. If the simulation has a cube edge length of k, then they have to search for the point with the greatest distance R1 to the haloes. Assuming there are h haloes, this would take ∼O(hk³), just for the first void. This has to be repeated until all voids are found. If there are v voids, the algorithm should run in ∼O(vhk³).

2.1.2 Gridding, and cube growing

Gridding is the process of dividing space into a grid. Gridding algorithms usually start off by dividing space into a cubical grid, and then flagging cubes according to emptiness criteria.

G. Kauffmann and A. P. Fairall [?] defined voids as spherical regions, completely empty of galaxies. They constrain the shape of the void by fitting an ellipsoid or sphere to the galaxies on the boundaries of the voids. They can then calculate the volume of the void by measuring the volume of the sphere or ellipsoid. The other restriction is the size of the void. Their algorithm is called VOIDSEARCH. The data was split into cubes, and empty and non-empty cubes were marked as 'off' and 'on'. The algorithm then searches for cubical voids. To better approximate the spherical or ellipsoidal shape of the void, the program attempted to append single-layered, rectangular groupings of 'off' cubes, each equal to or smaller in area than the face onto which it is added, and covering no less than two-thirds of that face. The algorithm proceeded by finding the largest base voids first, progressing down in size to the smaller voids, adding adjacent faces as explained above. An important parameter is the cube edge length: a smaller gridding length gives a higher resolution, but has an impact on the running time, as we will discuss below.

S. Arbabi-Bidgoli and V. Müller [?] use an adapted version of the void search algorithm proposed by Kauffmann & Fairall [?]. They use a high-resolution density field grid where each galaxy occupies one grid cell. The first void is found, and then smaller and smaller ones are added to the list of voids. In order to find a void, empty cubes are formed with empty cells. Once the biggest cubes have been found and tagged as voids, the faces start checking if they can expand the void a little further, with the condition that the area covered by empty cells is larger than two-thirds of the face area.

Clearly, we can see that there is a direct relationship between cube edge length and running time. A smaller cube edge length leads to more cubes, so if the cube edge length is s, and the total edge length of the 'data cube' to be analyzed is M, we would have a grid consisting of (M/s)³ cubes. Additionally, the algorithm needs to traverse all objects of the survey in order to mark cubes as 'on' or 'off'. We will call the number of objects present in the survey N. We can mark the cubes in linear time, O(N), depending directly on the number of particles in the survey. After this initial step, the program looks for voids. For each cube marked 'off', all neighboring faces have to be checked recursively. The recursive check, for a single cube, takes O(k³/N). But every 'off' cube has to be visited at least once, so that gives us an estimated order of O((k³ − N) · (k³/N)) = O(k⁶/N − k³). In order to save the 'on'/'off' grid in memory, we need k³ boolean variables in the three-dimensional array. The N galaxies have to be saved as triads of floats, or similar. In order to save the voids, we need only the center and radius. However, since there are normally few voids (less than 20), this is negligible in terms of memory.

2.1.3 Distance Field by gridding, then climbing algorithm

This is a mathematical approach to voids. We start off by defining a scalar field D : L³ → R as the distance of a given point x to the nearest galaxy. In this way, we can define D(x) = min_n {|x − X_n|}, where X_n, n = 1, ..., N are the locations of the particles. This is what J. Aikio and P. Mähönen [?] call a Distance Field (DF). Thus, local maxima of the DF are the points in space with the longest distance to the nearest galaxy and are then taken as the "centers" of voids.

Their algorithm first defines a cubical mesh over the survey volume. The L³ cube is divided into k³ elementary cells, where k = L/s and s is called a resolution parameter. Afterward, for each elementary cell center, they calculate the minimum distances to the other points. That results in a discrete DF D(x). From there they find the local maxima.

After the previous steps, they have to divide the elementary cells among the voids. This is done using the ”climber algorithm”. The void to which a certain cell belongs is found by ”climbing” on the DF, towards the center, in other words, one of the local maxima. It is easy to see that as the climbing goes on, every cell along the way belongs to the same void, and is marked as such. If the climbing gets to a cell already belonging to a void, the climbing ends.
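To make the idea concrete, the following is a schematic two-dimensional sketch of the distance-field-and-climber strategy described above. It is only an illustration of the technique, not the authors' code: the grid size, the helper names and the use of SciPy's cKDTree are assumptions made here, and the published algorithm works on a three-dimensional mesh.

import numpy as np
from scipy.spatial import cKDTree

def climber_labels(galaxies, L=1000.0, k=64):
    # Distance field D(x) on a k x k grid of cell centers (2-D illustration only).
    s = L / k                                   # resolution parameter
    xs = (np.arange(k) + 0.5) * s
    gx, gy = np.meshgrid(xs, xs, indexing="ij")
    cells = np.column_stack([gx.ravel(), gy.ravel()])
    df = cKDTree(galaxies).query(cells)[0].reshape(k, k)   # distance to nearest galaxy

    label = np.full((k, k), -1, dtype=int)      # void id of each cell (-1 = unassigned)
    n_voids = 0
    for start in np.ndindex(k, k):
        path, cur = [], start
        while label[cur] == -1:
            path.append(cur)
            i, j = cur
            neigh = [(i + di, j + dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
                     if (di or dj) and 0 <= i + di < k and 0 <= j + dj < k]
            nxt = max(neigh, key=lambda c: df[c])
            if df[nxt] <= df[cur]:              # local maximum: center of a new void
                label[cur] = n_voids
                n_voids += 1
                break
            cur = nxt                           # keep climbing towards higher DF values
        for c in path:                          # the whole path joins cur's void
            label[c] = label[cur]
    return df, label

galaxies = np.random.rand(2000, 2) * 1000.0
df, label = climber_labels(galaxies)
print(label.max() + 1, "void regions found")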

For every cell, they have to calculate the distance to every galaxy. In other words, this is O(k³/N), where N is the number of galaxies and k³ the number of cells. So calculating the DF, in total, takes O(k³ · (k³/N)) = O(k⁶/N).

Afterward, they have to take each cell and climb to the nearest void center. Every cell has to climb, so the complexity of this step should be around O(k³). This could present a problem if the resolution parameter s is very small (lower values of s over a given space L increase the value of k³). In order to save the DF in memory, we need k³ floats, to store all distances. The climbing part needs k³ ints, because every cell needs to be flagged as belonging to a void. Voids, as in all strategies, are negligible. This algorithm requires more memory than cube growing (see Section 2.1.2).

Colbert et al. [?] use a variant of the algorithm proposed by Aikio and Mähönen [?]. It is based on the assumption that voids are primordial negative overdensity perturbations that grew gravitationally and have reached shell crossing at the present time. The algorithm is tested over n-body simulations. The authors map all particles to a three-dimensional mesh. Then, the grid is smoothed adaptively. Local minima in the particle distribution are found, and spheres are centered on these minima. Complexity analysis shows that this variant of the algorithm performs the same way as the one described previously.

Patiri et al. [?] present us with a statistical analysis of voids. They use the data from the two-degree Field Galaxy Redshift Survey and define voids as non-overlapping maximal spheres empty of haloes or galaxies. Additional constraints are mass or luminosity above a given value. The algorithm is called CELLS Void Finder and was created in order to search, based on a grid, for all the voids in a galaxy sample. The algorithm calculates the distances between each of the empty grid cells and all the galaxies, keeping the minimum distance. With the list of minimum distances, they search for the local maxima, which correspond to the void centers.

2.1.4 Statistical analysis

Patiri et al. [?] called one of their algorithms HB Void Finder. First, they generate a sample of random trial spheres with a fixed radius R. The code then checks which spheres are empty, and keeps them. Then, for each empty trial sphere, they find the four nearest objects, and the sphere is expanded until they lie on its surface.

As the authors explain, the CPU time of the codes mainly depends on the number of particles and on the number and radius of trial spheres in the case of the HB algorithm, and on the number of cells (i.e. resolution) and the levels of neighboring cell marking for the Cell Void Finder. For the Cell Void Finder, each empty grid cell has to compute the distance to all galaxies, so if there are g galaxies and k³ empty grid cells, the running time of this part should be around O(E + gk³), where E is the number of generated trial spheres. This method requires E spheres to be stored, so E times (x, y, z, r): the coordinates and the radius. These could be stored as ints or floats according to the desired precision.

2.1.5 Delaunay tetrahedra / triangulation

A Delaunay triangulation for a set P of points in a plane is a triangulation DT(P) such that no point in P is inside the circumcircle of any triangle in DT(P). Delaunay triangulations maximize the minimum angle of all the angles of the triangles in the triangulation; they tend to avoid sliver triangles.

Way et al. [?] define a new algorithm which they call HOP. HOP is a parameter-free method of assigning groups of galaxies to local density maxima or minima. They also present a void finder that uses Delaunay tetrahedra. The general idea is to assemble a discrete set of entities, called objects, then compute the value of a function f (the so-called HOP function) for each object, and finally analyze the adjacency information among the objects.

In order to correctly identify voids, the algorithm first computes the Delaunay tessellation of the galaxy positions. After that, they identify groups of tetrahedra making up voids using HOP with f = tetrahedral volume. Then, for each Delaunay void containing N_void tetrahedra, all 4·N_void triangular faces of the tetrahedra making up the void are collected. The faces that appear on the list only once are identified, and thus we know that the vertices of such faces are on the surface. Finally, we can identify the internal galaxies that are not on the surface.

Gaite [?] tests whether Zipf's power law holds for voids found on fractal point sets. In order to do this, the author establishes a definition of voids based on discrete stochastic geometry, in particular, Delaunay and Voronoi tessellations. The algorithm is made for two-dimensional point sets, but could easily be extended to three-dimensional ones. The algorithm works as follows. First, a Delaunay and Voronoi tessellation is built for the point set. The Delaunay triangles are then sorted by size. The first (largest) triangle is selected to start building the first void. The void grows by adding adjacent triangles if the overlap criterion is met. This criterion states that the distance between the centers of two overlapping spheres (in this case, the circumscribing spheres of the Delaunay triangles) be less than a given fraction f of the smaller radius. The set of added triangles finally constitutes a void. The next unused biggest triangle is the starting point for the next void, and so on. The first part, the Delaunay triangulation, can be achieved with divide and conquer in O(n log n), as shown by Cignoni et al. [?]. Triangle sorting can be done by any modern sorting algorithm in O(n log n). The Voronoi diagram helps to quickly find the neighboring triangles. Every triangle has to be analyzed, so this step takes at most O(n). In memory, we would need to save N points and N triangles. With a custom data structure, this could take up very little memory.

Hitschfeld et al. [?] define a void as a zone of low point density in a point set, which may have some points in its interior. The main tool used here is the Delaunay triangulation. The idea is to look for the longest-edge of the triangulation that does not belong to any already found polygon. This idea is based upon the fact that when a void is present in a planar point distribution, the edges of the Delaunay triangulation that cross the void itself are local longest-edges in comparison with edges belonging to the neighborhood of the void.

The proposed algorithm begins by reading the Delaunay triangulation and the threshold value. Then, it orders the triangles by their longest edge. It takes the first two triangles from the list (the ones with the longest edges) and labels them as 'used'. A triangle set with those two triangles is constructed. Then, each neighbor of a triangle in the set is added to the set if and only if it shares its longest edge with a triangle from the set. In any case, checked triangles are labeled as used. If at the end of this step the area of the triangles in the triangle set is greater than the threshold value, then this triangle set is added to the list of voids. The process is then repeated for the next unused triangles in the sorted triangle list. As stated before, creating a Delaunay triangulation can be achieved in roughly O(n log n). The ordering of the triangles can be done with any modern sorting algorithm in O(n log n) too. Then, for each of the triangles in this list, we have to start looking for neighbors and for triangles which share its longest edge. However, in this step, if we use a triangle, we will not use it again, so this whole step takes only O(t), where t is the number of Delaunay triangles, which in this case is O(n), where n is the number of points in the set [?].

After this step, there is a void joining step. The subvoids are joined into candidate voids if they fulfill a certain criterion specified by the user (arc criterion, frontier criterion, second longest-edge criterion and frontier-edge criterion).

2.1.6 Distance Field and Watershed techniques

The watershed is a transformation (commonly used in the field of image processing) defined on a grayscale image. The name refers metaphorically to a geological watershed, or drainage divide, which separates adjacent drainage basins. The watershed transformation treats the image it operates upon like a topographic map, with the brightness of each point representing its height, and finds the lines that run along the tops of ridges.

Platen et al. [?] base their algorithms on the watershed transform (WST) of Beucher & Lantuejoul (1979) and Beucher & Meyer (1993). The WST is mainly used in geophysics and serves to segment images into regions and objects. It operates by 'filling' a landscape with water, starting at the lowest points. If two different sources of water touch, they form a ridge (which corresponds to saddle points in the density field). It possesses several desirable qualities for void-finding algorithms: it uses a relatively low number of parameters, it does not restrict the shape of a void, and it normally produces closed contours.

The first part of the algorithm creates a density field from a point distribution. The density field is then gridded and smoothed, using natural neighbor maxmin and median filtering. The next step is to transform the image into a discrete set of density levels and to eliminate pixel noise. Now the algorithm is ready to find the field minima and start the ’flooding’. Finally, once a pixel is reached by two distinct basins it is identified as belonging to their segmentation boundary. By continuing this procedure up to the maximum density level the whole region has been segmented into distinct void patches. The hierarchy is corrected by removing segmentation boundaries whose density is lower than some density threshold.

As seen in Section 2.1.3, the longest part is calculating the distance field (O(k⁶/N)). Then, finding the minima should take O(k³), since every cell has to be visited once and exactly once. The watershed has to 'paint' every cell, exactly once, so here we have again a bound of O(k³). As in the previous distance field approach (see Section 2.1.3), we will need k³ ints or floats, and for the watershed, each cell has to be marked with the void it belongs to, so again we will have to save k³ ints. The memory cost of voids, as usual, can be dismissed.

2.1.7 Voronoi Tessellation and watershed techniques

A Voronoi tessellation is a partitioning of a plane into regions based on distance to points in a specific subset of the plane. That set of points (called seeds) is specified beforehand, and for each seed, there is a corresponding region consisting of all points closer to that seed than to any other. These regions are called Voronoi cells. The Voronoi diagram of a set of points is dual to its Delaunay triangulation. Once the Voronoi tessellation is complete, the aforementioned watershed techniques are used.

Neyrinck [?] creates an algorithm called ZOBOV (ZOnes Bordering On Voidness), which finds depressions in a set of points. One of the advantages of Neyrinck's approach is that ZOBOV does not have any free parameters or assumptions about shape. The algorithm works based on Voronoi tessellations in order to estimate densities and to find voids and subvoids. The methods used are somewhat similar to the one used by Platen et al. [?], since both use tessellation techniques to measure densities, and both use the 'watershed' concept.

The algorithm starts off by estimating the density using a Voronoi tessellation. After tessellating, each particle i is given a density according to the formula 1/V(i), where V(i) is the volume of the Voronoi cell around particle i. The second step is to 'zone'. Each particle is sent to its neighbor with lower density until it arrives at a density minimum (called a zone's core). All Voronoi cells that 'flow' towards the same core are part of a void. However, due to discreteness noise, many zones are spurious, and so it is necessary to join some voids. The final step is the joining of voids. In this part, watershed techniques are used. For each zone z, the water level is set to z's minimum density and then raised gradually. The overflow then shows which voids to connect.

Sutter et al. [?] create a set of tools they call VIDE. At its core, VIDE uses a substantially enhanced version of ZOBOV to calculate a Voronoi tessellation for estimating the density field and performing a watershed transform to construct voids. Additionally, VIDE provides significant functionality for both pre- and post-processing.

Voronoi tessellations can be built in O(n log n) with Fortune's algorithm [?]. Calculating each Voronoi cell volume, and applying it to each particle, should take O(p), where p is the number of particles in the set. The zoning part has to be performed by every particle in the set, so again we have a lower bound of O(p). In memory, saving the Voronoi cells would need N cells. The density field depends on N too, as does the climbing.

2.1.8 Wall Builder and Sphere Growing

Hagai El-Ad and Tsvi Piran developed their algorithm in 1996, based on a model in which the main features of the Large-Scale Structure (LSS) of the Universe are voids and walls [?]. 'Walls' are thin 2D structures with high galaxy density. They define galaxies within walls as 'wall galaxies'. Wall galaxies then constitute boundaries between under-dense regions; the authors define these regions as voids. In this definition, voids are not completely empty. The few scattered galaxies inside voids are called 'field galaxies'. The algorithm defines a void as a continuous volume that does not contain any wall galaxies and is nowhere thinner than a given diameter. The algorithm is divided into two steps: Wall Builder and Void Finder. The algorithm uses three parameters: n, β and ξ.

A wall galaxy is required to have at least n other wall galaxies within a sphere of radius L around it. Every galaxy that does not satisfy this condition is classified as a field galaxy. The algorithm applies these conditions recursively until all the field galaxies are found. Let us say there are N galaxies in the dataset. Each galaxy has to be compared to all others, which is clearly ∼O(N²). This has to be done repeatedly until we have found all field galaxies. The Void Finder searches for spheres that are devoid of any wall galaxies; in other words, the authors keep the wall galaxies and discard the field galaxies. For a void with a maximal sphere of diameter d_max, the authors take only spheres with diameters larger than ξ·d_max, where ξ is the 'thinness parameter'. If the void is composed of more than one sphere, then each sphere must intersect at least one other sphere with a circle wider than the minimal diameter ξ·d_max. The authors approximated the number of voids to be expected via Poisson distributions. The algorithm stopped when the expected number of voids was found.

Later on, Fiona Hoyle and Michael S. Vogeley [?] study voids by using a method based on the one by El-Ad & Piran [?]. They apply their algorithm to the Point Source Catalogue Survey and the Updated Zwicky Catalog. Hoyle and Vogeley use n-body simulations, and also classify galaxies into ”field galaxies” and ”wall galaxies”. They first classify all galaxies in one of those categories. Then, they detect empty cells in the distribution of wall galaxies. Maximal empty spheres are grown, and then unique voids are classified. Finally, they enhance the void volume.

For the first part, they use the same method as El-Ad & Piran [?]. Then, when detecting empty cells, they place the wall galaxies onto a three-dimensional grid and count the number of galaxies in each cell. The authors refer to these empty grid cells as holes. Each hole is considered to be part of a possible void. Each hole starts to grow a sphere until it reaches a wall galaxy. Then, the sphere starts moving in the opposite direction of the wall galaxy found, and continues to grow. When a second wall galaxy is hit, they next find the vector that bisects the line joining the two galaxies and move the hole in this direction until a third galaxy is found, as before. This is repeated a final time.

Once all possible voids are detected, they are sorted by radius length, largest first. They assume the largest one is a void, and then, using a fractional overlap parameter, they calculate the overlaps of all voids and so join them together if they overlap by a significant amount. The final step, enhancement of void volume, goes as follows: They define a certain threshold and then compute the volume of each void by Monte Carlo integration, i.e. they embed it in a box that is larger than the void and generate many random particles within the box and count how many lie within one of the holes that make up the void.

The first part was already discussed previously. To count the number of empty cells, they have to take every galaxy and put it into one of the cells. This can be done in linear time, with simple geometry. If there are n galaxies, then this would take O(n). The growing part is more tricky. Most cells will be empty, so we can assume n holes. Each has to start growing a sphere. We will assume that this takes a certain number of iterations until the first wall galaxy is found. More voids imply fewer sphere-growing iterations, so if there are v voids, we can assume an average of k³/(2v) iterations, where k³ is the number of cubes. So, this step takes a total of O(k³ · k³/(2v)) = O(k⁶/(2v)). Then comes the classification step. This depends largely on the number of voids found in the previous step; let us call that number v₀. For each void, they have to overlap it with all other bigger ones, so that leaves us with something a little better than O(v₀²) (since we are not comparing every possible void to every other, but every void just with the bigger ones). Void enhancement seems to use a classic Monte Carlo algorithm, in which case the runtime would be bounded by O(v), where v is the number of voids found in the previous step. In memory, we will need to save only the wall galaxies, so something less than N. Spheres can be saved as k³ records of 4 coordinates.

Foster and Nelson [?] came up with an algorithm by extending the one from Hoyle & Vogeley [?] described previously. The authors apply the algorithm to the Data Release 5 of the Sloan Digital Sky Survey. A statistical analysis of the distribution of the size, shape, and orientation of voids is performed. The Void Finding algorithm is divided into seven steps. (i) data input; (ii) classification of galaxies as field or wall galaxies; (iii) detection of the empty cells in the distribution of wall galaxies; (iv) growth of the maximal sphere; (v) classification of the unique voids; (vi) enhancement of the void volume; and (vii) calculation of the void properties.

The data input step processes the data and transforms the coordinates. Every point undergoes a mathematical transformation, which takes ∼O(n). To classify the galaxies, the average distance to the third nearest galaxy, as well as the standard deviation of that value, is computed. This has a lower bound of O(n²), since we have to compare each galaxy with every other one in the set. The detection of empty cells will depend entirely upon the resolution used. With k³ cells, and having to traverse every one of them, the running time is O(k³). The sphere-growing part starts off with the position of the nearest galaxy to the center of every empty cell. A first growing vector, pointing from the nearest galaxy to the center of the empty sphere, is computed, and the radius is gradually increased. The algorithm goes through the entire set of galaxies and finds the one which yields the smallest sphere whose center has moved along the vector. Every empty cell has to be traversed, and every time we have to compare it with every galaxy. This leaves us with a lower bound of O(nk³). The final steps are bounded by the classification of unique voids, which has to compare each void to a certain parameter for the radius, so this step is linear in the number of voids found.

2.1.9 Discussion

There seem to be two big algorithm families. On the one hand, we have cube and sphere growing ([?,?,?,?,?]) and the calculation of a distance field ([?,?,?,?]). All these algorithms involve an essentially quadratic computation. However, since we are working in a three-dimensional space, the relevant parameter of the gridding is the cube edge length. By shrinking the cube edge length, the number of cubes grows cubically. So, in the end, the algorithms, by depending on the cube edge length, are actually bounded by a power of six (O(k⁶)). This could present problems with running time when the resolution is too high. To better see why growing spheres or cubes and computing a distance field are algorithmically similar, note that when building a distance field, each cell is searching for its nearest neighbor, and the best way to do this is usually by growing a sphere or cube.

The second group of algorithms are those using Delaunay triangulations and tetrahedra, or Voronoi cells ([?,?,?,?]). These algorithms show running times bounded by the building of the triangulation/tessellation, which is near O(n log n), where n is the number of triangles, a quantity that correlates linearly with the number of initial points. The processing done after this initial step usually involves only traversing the triangles or cells, so normally around O(n). With bigger datasets, these algorithms should perform better than those of the first group.

2.2 Computational Techniques

In this section we present the computational techniques used in the experimental void-finding algorithms we built, as well as in the final version of the algorithm.

2.2.1 Delaunay Triangulation

For a given set P of discrete points in a plane, there is a triangulation DT(P) such that no point in P is inside the circumcircle of any triangle in DT(P) (see Fig. 2.1). Delaunay triangulations maximize the minimum angle of all the angles of the triangles in the triangulation; they tend to avoid sliver triangles. The triangulation is named after Boris Delaunay [?].

Figure 2.1: A Delaunay Triangulation with circumcircles shown
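The following minimal sketch shows how such a triangulation can be built and queried in practice, assuming SciPy's Delaunay wrapper (an illustrative choice, not necessarily the library used in the thesis implementation).

import numpy as np
from scipy.spatial import Delaunay

points = np.random.rand(100, 2) * 1000.0   # 100 random points in a 1000 x 1000 box
tri = Delaunay(points)

print(tri.simplices.shape)   # (number of triangles, 3): vertex indices of each triangle
print(tri.neighbors[0])      # triangles adjacent to triangle 0 (-1 means no neighbor)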

2.2.2 K-Nearest-Neighbor Search

K-Nearest Neighbor search (k-NN search) is a variant of the Nearest Neighbor search problem, proposed in 1973 by Donald Knuth as the Post office problem [?].

The Nearest Neighbor problem is defined as follows: given a set S of points in a space M and a query point q ∈ M, find the closest point in S to q. Usually, M is a metric space and dissimilarity is expressed as a distance metric; often, M is taken to be a d-dimensional vector space where dissimilarity is measured using the Euclidean distance. In the case of this work, we will be using a two-dimensional vector space and the Euclidean distance.

The k-NN problem is a direct generalization of the NN problem, where we need to find the k closest points.

Figure 2.2: A representation of 3-NN search [?]
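As an illustration, a brute-force k-NN query in the two-dimensional Euclidean setting used in this work can be written as follows (the function name and the random data are chosen here for demonstration only).

import numpy as np

def knn_brute_force(points, q, k):
    # Indices of the k points closest to the query point q (Euclidean distance).
    d = np.linalg.norm(points - q, axis=1)
    return np.argsort(d)[:k]

points = np.random.rand(1000, 2) * 1000.0
print(knn_brute_force(points, np.array([500.0, 500.0]), k=3))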

2.2.3 KD-Tree

A k-dimensional tree is a space-partitioning data structure for organizing points in a k-dimensional space. k-d trees are a special case of binary space partitioning trees. They were invented by Jon Louis Bentley in 1975 [?].

Every non-leaf node can be thought of as implicitly generating a splitting hyperplane that divides the space into two parts, known as half-spaces. Points to the left of this hyperplane are represented by the left subtree of that node and points right of the hyperplane are represented by the right subtree (see Figure 2.3). The hyperplane direction is chosen in the following way: every node in the tree is associated with one of the k-dimensions, with the hyperplane perpendicular to that dimension’s axis [?].

Figure 2.3: KD-Tree
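A minimal sketch of building a KD-tree and issuing the two kinds of queries used later in this work, assuming SciPy's cKDTree (an illustrative choice, not necessarily the exact structure used in our implementation):

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(8192, 2) * 1000.0
tree = cKDTree(points)

# the 10 closest points to the first point (the point itself is included, at distance 0)
dist, idx = tree.query(points[0], k=10)

# all epsilon-neighbors of the first point
eps = 100.0
neigh = tree.query_ball_point(points[0], r=eps)
print(len(neigh) - 1)   # exclude the point itself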

2.2.4 Parallelization, OpenCL and pyOpenCL

By Parallelization we mean the use of a graphics processing unit (GPU), which typically handles computation only for computer graphics, to perform computation in applications traditionally handled by the central processing unit (CPU). Essentially, it acts as a pipeline of parallel processing between one or more GPUs and CPUs that analyzes data as if it were in image or other graphic form. While GPUs operate at lower frequencies, they typically have many times the number of cores. Thus, GPUs can process far more pictures and graphical data per second than a traditional CPU. Migrating data into graphical form and then using the GPU to scan and analyze it can create a large speedup.

OpenCL [?] (Open Computing Language) is a framework for writing programs that execute across heterogeneous platforms, consisting for example of CPUs, GPUs, DSPs, and FPGAs. OpenCL specifies a programming language (based on C99) for programming these devices and application programming interfaces (APIs) to control the platform and execute programs on the compute devices. OpenCL provides a standard interface for parallel computing using task-based and data-based parallelism.

PyOpenCL [?] is a Python wrapper for OpenCL. It has object cleanup tied to the lifetime of objects. This idiom, often called RAII in C++, makes it much easier to write correct, leak- and crash-free code. A big advantage of PyOpenCL is its completeness, giving access to the complete list of OpenCL's features. Additionally, it has automatic error checking, translating all errors automatically into Python exceptions. PyOpenCL's base layer is written in C++, so it runs virtually as fast as the original OpenCL. PyOpenCL is open-source under the MIT license and free for commercial, academic, and private use.
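As a small illustration of the PyOpenCL workflow (context creation, kernel compilation, buffers, and kernel launch), the sketch below computes squared distances from every point to a fixed query point on the GPU. The kernel and the variable names are assumptions made for this example; this is not the thesis kernel.

import numpy as np
import pyopencl as cl

pts = (np.random.rand(8192, 2) * 1000.0).astype(np.float32)
qx, qy = np.float32(500.0), np.float32(500.0)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

src = """
__kernel void sqdist(__global const float2 *pts, float qx, float qy,
                     __global float *out) {
    int i = get_global_id(0);
    float dx = pts[i].x - qx;
    float dy = pts[i].y - qy;
    out[i] = dx * dx + dy * dy;
}
"""
prg = cl.Program(ctx, src).build()

pts_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=pts)
out = np.empty(len(pts), dtype=np.float32)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

prg.sqdist(queue, (len(pts),), None, pts_buf, qx, qy, out_buf)  # one work-item per point
cl.enqueue_copy(queue, out, out_buf)
print(out[:5])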

2.3 Research Methodology

Due to the exploratory nature of the Thesis, conventional development methodologies will have to be adjusted, and a more agile approach will be used. Techniques, algorithms, and combinations of both will be tried out, in order to determine the ones that will actually be used in the final iteration of ORCA. Therefore, the methodology will be action-research [?], [?].

Action-research consists of a series of repeating steps. The first step is the planning phase. Here, the groundwork for the next phase is laid out, and the researcher makes design and practical decisions. The idea is to make quick cycles, in order to arrive at this phase with new insights on the problem.

The next phase, act, is the most time-consuming phase since it consists of doing the actual work. In the case of this Thesis, most of the programming will be done in this phase, as well as future optimization work and tweaks to enhance performance.

In the observing phase, the researcher merely collects information and data. Regarding the algorithms, this is the step where the performance (running time and memory) will be benchmarked, as well as the accuracy of the algorithm and its resistance to noise in the data.

The last phase is reflect, where the researcher draws conclusions based on the observations made earlier. The next planning phase will depend on said conclusions. This means that the algorithm will be refined as the cycles repeat.

2.3.1 Performance Indicators

When testing the algorithm, four distinct performance indicators will be used. First, we will measure the running time of ORCA over the various data sets, and compare it to the running time of DELFIN [?]. The second performance indicator is the total memory usage. The third one is effectiveness, which will be measured as the percentage of overlap between the voids found by our algorithm and the ones found by the DELFIN algorithm [?]. As a fourth and last indicator, we will test ORCA's and DELFIN's robustness, by designing voids with noise inside them.

Chapter 3

Cosmic Void-Finding Algorithm

A new Void-Finding Algorithm has been developed, following a series of smaller trial-and-error iterations. This algorithm is described in Section 3.2.2. In Section 3.3 we discuss the reasoning behind the choice.

The rest of the chapter describes our first attempts at building a void finder. The first one is brute force, which gives us some insights and serves as a baseline for the next ones. We then describe a first approach to finding nearest neighbors using a Delaunay triangulation. The next algorithm uses a KD-tree as its base structure. The second section depicts the improved solutions. The first one uses high k and ε values, creating a sort of dense network, and then fills the voids according to their area, using image processing. The next algorithm uses gen generations of Delaunay triangulation neighbors in order to classify points and eliminate noise, and then proceeds to eliminate the edges of the Delaunay triangulation that are smaller than the ε used. Lastly, we build a parallel version of the algorithm, using PyOpenCL [?]. For more details on the used algorithms see Section 2.2.

3.1 First Approaches

Every algorithm described in this section uses a different method to classify points into three different categories. We will be using k-Nearest Neighbor search (see Section 2.2.2) for this end. Every algorithm uses two parameters: ε and k. We define two points as ε-neighbors if they are within distance ε of each other. A point is considered a center point if it has at least k ε-neighbors. If a point is not a center point, but one of its ε-neighbors is a center point, then the point is considered a border point. If the point is neither a center nor a border point, then it is considered an outlier point. This strategy allows us to remove outlier points as noise, thereby making the algorithm more robust (in this case, resistant to noise).
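A minimal sketch of this classification criterion is shown below. Here a plain distance matrix stands in for the neighbor lookup, which is the part that varies between the algorithms of this section; the function name is illustrative and this is not the thesis code.

import numpy as np

def classify_points(points, eps, k):
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))                   # full distance matrix
    neigh = (dist <= eps) & ~np.eye(len(points), dtype=bool)   # epsilon-neighbor matrix
    center = neigh.sum(axis=1) >= k                            # at least k epsilon-neighbors
    border = ~center & (neigh & center[None, :]).any(axis=1)   # a center among the neighbors
    outlier = ~center & ~border
    return center, border, outlier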

3.1.1 Brute Force

The very first approach used is simply brute force. We build a distance matrix, storing the distances between all pairs of points (N²/2 distances). After that, we simply classify all points using the k-ε-neighbor criterion.

19 Code

Listing 3.1: Brute Force

DECLARE epsilon, k, file
READ file, with n points
DECLARE distance matrix M: shape(n,n)

//Filling distance matrix M
FOREACH point i IN point list:
    FOREACH point j IN point list:
        M[i,j] = M[j,i] = distance(point i, point j)

//Calculate epsilon depending on k:
//epsilon is the mean distance of the k-th neighbor.
DECLARE EMPTY LIST kNearest
FOR i FROM 0 TO n:
    APPEND first k elements of sorted M[i] TO kNearest
DEFINE epsilon AS MEAN OF kNearest

DECLARE EMPTY LISTS center, candidates, outlier, border

//Check for center points.
FOREACH point i IN point list:
    DEFINE nrNeighbors AS 0
    FOREACH point j IN point list:
        IF (M[i,j] <= epsilon) AND i != j:
            INCREMENT nrNeighbors BY 1
    IF nrNeighbors >= k:
        APPEND point i TO center list
    ELSE:
        APPEND point i TO candidates list

//Move candidates to the border list if they fulfill the criterion.
//If not, put them on the outlier list.
FOREACH candidate IN candidates list:
    wasBorder = False
    FOREACH center IN center list:
        IF M[candidate,center] <= epsilon:
            APPEND candidate TO border list
            DEFINE wasBorder AS True
            BREAK
    IF NOT wasBorder:
        APPEND candidate TO outlier list

PLOT center, outlier, border lists

Code Description

First, we define a distance function. This is just the Euclidean distance d = √((x₁ − x₂)² + (y₁ − y₂)²). We then declare the appropriate ε, k, and the file to be read. We parse the data in the file, and save it into a variable, as a set of points. We create the distance matrix (with a shape of n × n) and proceed to fill it with each distance between points. This is one of the slowest parts of the algorithm: it takes O(n²) every time. Optionally, we can calculate a specific ε depending on k: we use the mean distance of the k-th neighbor in this case. We then declare a series of empty arrays, to put the points in (we will classify those points later). Now we loop over every point, and compare its distance to every other. If we find another point whose distance is less than or equal to ε, we count it as a neighbor. Once this loop is finished, if we have at least k neighbors, we add the point to the center point list. If not, we add it to the candidates list. After this step is over, we loop over all candidates, to check if they fulfill the border criterion: to have at least one center point among their ε-neighbors. Finally, we plot the points.
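The optional heuristic for deriving ε from k (the mean distance to each point's k-th nearest neighbor) can be sketched as follows; the helper name is illustrative and the snippet is not the thesis code.

import numpy as np

def epsilon_from_k(points, k):
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))   # full distance matrix, O(n^2) memory
    dist_sorted = np.sort(dist, axis=1)        # column 0 is each point itself (distance 0)
    return dist_sorted[:, k].mean()            # mean distance of the k-th neighbor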

3.1.2 Delaunay Triangulation

The next approach is calculating the k-ε-neighbors with a Delaunay triangulation (see Section 2.2.1). This saves a lot of computation time, since building a Delaunay triangulation takes O(n log n) time, and neighbors in the triangulation can be found in O(1) time. However, on average, each point has 6 neighbors in the triangulation. This means that for higher k, this method will not work properly, because the Delaunay triangulation is not going to provide more than 6 neighbors per point, on average.

Code

Listing 3.2: Delaunay Triangulation

DECLARE epsilon, k, file
READ file, with n points

//Make the Delaunay triangulation over the points.
triangulation = Delaunay(points)

DECLARE EMPTY point LISTS center, candidates, outlier, border

//Loop over every point.
FOR EACH p IN point list:
    DEFINE nrNeighbors AS 0
    FOR EACH neighbor IN find-neighbors(p, triangulation):
        IF distance(neighbor, p) <= epsilon:
            INCREMENT nrNeighbors BY 1
    IF nrNeighbors >= k:
        APPEND p TO center list
    ELSE:
        APPEND p TO candidates list

//Move candidates to the border list if they fulfill the criterion.
//If not, put them on the outlier list.
FOR EACH candidate IN candidates list:
    wasBorder = False
    FOR EACH center IN center list:
        IF distance(center, candidate) <= epsilon:
            APPEND candidate TO border list
            wasBorder = True
            BREAK
    IF NOT wasBorder:
        APPEND candidate TO outlier list

PLOT center, outlier, border lists

Code Description

As in the previous case, we start by declaring the desired ε, k, and the file with the points to be read. We proceed to read the file and store the points in a local variable. Afterwards, we build the Delaunay triangulation (see Section 2.2.1) and store it in a variable too. We declare the empty lists, to be filled out later. Now comes the classifying part: we loop over each point in the dataset. For each point, we look up its neighbors in the Delaunay triangulation. This is very fast (O(1)), as the triangulation already has the neighbors stored. We use these neighbors to check the ε criterion, adding points with at least k neighbors to the list of center points. The next step is to determine whether the candidates are outliers or border points. Once this step is finished, we proceed to plot the points.

3.1.3 KD-Tree

The last algorithm of this section uses a KD-Tree (see Section 2.2.3). The idea is to take advantage of the data structure, which allows us to check for k-Nearest Neighbors quickly (O(log n) time on average, worst case of O(n) ).
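A short sketch of this idea, assuming scipy's cKDTree and illustrative parameter values (not our actual code):

import numpy as np
from scipy.spatial import cKDTree

points = np.random.rand(1000, 2) * 2000   # illustrative data set
epsilon, k = 100.0, 8

tree = cKDTree(points)
# indices of all epsilon-neighbours of every point (each list includes the point itself)
neighbors = tree.query_ball_point(points, r=epsilon)

center = [i for i, nb in enumerate(neighbors) if len(nb) - 1 >= k]
center_set = set(center)
candidates = [i for i in range(len(points)) if i not in center_set]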

Code

Listing 3.3: KD-Tree

DECLARE epsilon, k, file
READ file, with n points
BUILD kdTree WITH points
DECLARE EMPTY point LISTS center, candidates, outlier, border

//check for center objects.
FOR EACH point:
    GET number of epsilon-neighbors FROM kdTree
    IF number of epsilon-neighbors >= k:
        APPEND point TO center list
    ELSE:
        APPEND point TO candidate list

//Move candidates from outlier to border,
//if they fulfill the criterion.
//If not, put them on the outlier point list.
FOR EACH candidate IN candidates list:
    wasBorder = False
    FOR EACH center IN center list:
        IF distance(candidate, center) <= epsilon:
            APPEND candidate TO border list
            SET wasBorder TO True
            BREAK
    IF NOT wasBorder:
        APPEND candidate TO outlier list

PLOT center, outlier , border lists

Code Description

The algorithm starts by declaring ε, k, and the file to be read. We read the file and store it in memory. We then build the KD-tree (see Section 2.2.3) with the given points. Empty lists are declared, to be filled later. We loop over every point, and for each one we get the number of ε-neighbors from the KD-tree. This is accomplished in O(log n) time on average (or O(n) in the worst case), given the structure of the KD-tree. Since we do this for every point in our set, this gives us an average total running time of O(n log n). As usual, if the k-ε-neighbor criterion is met, the point is put into the center list; otherwise, into the candidates list. Upon completion of this part, we check which candidates belong in the border point list and which in the outlier list.

3.2 Improved Solutions

3.2.1 KD-Tree with high k + Image Processing

If we run the previous KD-tree implementation with high values of k and ε, and we plot only the edges that connect center points, we notice that voids are generated naturally (see Figure 3.1). The general idea is to take these images, and via image processing analyze the voids generated this way. We will need a new parameter: A threshold value used to determine if the area of a void is big enough to classify it as a cosmic void.

Figure 3.1: KD-Tree algorithm run with n=8192 points, k=9, ε=100, plotting center points only

Code

Listing 3.4: KD-tree + Image Processing

DECLARE epsilon, k, threshold, file
READ file, with n points
GENERATE plot with KD-Tree
SAVE plot as plot-File
//remove borders and leave only the image
PROCESS plot-File

//loop over pixels:
FOR EACH pixel p IN plot-file:
    DEFINE toCheck AS empty list
    DEFINE voidPoints AS empty list
    APPEND p TO toCheck
    WHILE toCheck IS NOT EMPTY:

        DEFINE current AS POP FROM toCheck
        APPEND current TO voidPoints
        IF current(x+1) IS inside image AND current(x+1) is white:
            APPEND current(x+1) TO toCheck
        IF current(x-1) IS inside image AND current(x-1) is white:
            APPEND current(x-1) TO toCheck
        IF current(y+1) IS inside image AND current(y+1) is white:
            APPEND current(y+1) TO toCheck
        IF current(y-1) IS inside image AND current(y-1) is white:
            APPEND current(y-1) TO toCheck
    IF length(voidPoints) > threshold:
        FILL(voidPoints)

Code Description

We will first declare our parameters. In this case, we will use ε and k to generate the first plot. We will also need a threshold value, for the second part of the algorithm. We then read the file. Using the saved points, we generate the plot, and save it, using the KD-tree algorithm. The generated plot has big white borders, meaning that we need to crop the image first.

After the cropping (see Figure 3.1), we start to loop over each pixel. An earlier version of this code used a recursive algorithm to count adjacent white pixels, but due to constant stack overflows it was changed to an iterative implementation. The algorithm goes as follows: two empty lists are declared. The first one holds the pixels that are yet to be checked; whenever a white pixel is found, it is added to the second list, the void-pixel list. Once this step is completed, we can count how many pixels make up the void. If the count is above the defined threshold, we count that region as a cosmic void and paint all its pixels a random color. Alternatively, we can return the list of cosmic voids as sets of pixels.
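A compact Python sketch of this iterative flood fill, assuming the cropped plot is already available as a boolean numpy array img where white pixels are True (the function name and the visited bookkeeping are illustrative additions, not our exact code):

from collections import deque
import numpy as np

def find_pixel_voids(img, threshold):
    """Return lists of white-pixel coordinates forming regions larger than threshold."""
    h, w = img.shape
    visited = np.zeros_like(img, dtype=bool)
    voids = []
    for sy in range(h):
        for sx in range(w):
            if not img[sy, sx] or visited[sy, sx]:
                continue
            region, to_check = [], deque([(sy, sx)])
            visited[sy, sx] = True
            while to_check:                      # iterative, avoids stack overflows
                y, x = to_check.popleft()
                region.append((y, x))
                for ny, nx in ((y, x + 1), (y, x - 1), (y + 1, x), (y - 1, x)):
                    if 0 <= ny < h and 0 <= nx < w and img[ny, nx] and not visited[ny, nx]:
                        visited[ny, nx] = True
                        to_check.append((ny, nx))
            if len(region) > threshold:
                voids.append(region)
    return voids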

3.2.2 ORCA: Higher generation Delaunay Triangulation k-NN search for Noise removal + edge removal

The idea behind this algorithm is quite simple, and its low algorithmic complexity comes from that simplicity. As discussed in a previous section (Section 3.1.2), using a Delaunay triangulation to perform a k-NN search has the slight disadvantage that you cannot use values of k much higher than 6. However, there is a nice way to circumvent this limitation. If you use not only the direct neighbors of a point in the triangulation but also the neighbors of those neighbors (called, henceforth, second-generation or second-order neighbors), you can easily expand your possible k values. In fact, just by using second-gen neighbors, we can go from ∼6 up to ∼18 neighbors (∼6 from first-gen, plus ∼12 more from second-gen, as depicted in Figure 3.2, where we show first-gen neighbors in blue and second-gen neighbors in red).
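A small sketch of how such higher-generation neighbours can be gathered, assuming the indptr/indices arrays returned by scipy's vertex_neighbor_vertices shown earlier (the helper name and the gen parameter handling are illustrative):

def higher_gen_neighbors(i, gen, indptr, indices):
    """Neighbours of vertex i up to `gen` generations away in the triangulation."""
    frontier, seen = {i}, {i}
    for _ in range(gen):
        # expand the frontier by one generation, skipping vertices already seen
        frontier = {n for v in frontier
                      for n in indices[indptr[v]:indptr[v + 1]]} - seen
        seen |= frontier
    seen.discard(i)
    return seen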

With this in mind, we will conduct a higher-generation k-NN search in order to classify our points into the three categories (center, border and outlier points), as explained in Section 3.1. This search, because of the

Figure 3.2: Example of second-gen Triangulation Neighbors

properties of the Delaunay Triangulation, has a quick average running time of O(n log n). However, this is a bit slower than the actual creation of the Delaunay Triangulation. Thus, this part is the bottleneck of the algorithm. Once all points are classified, we begin the second step of the algorithm.

We will now classify every triangle in our Delaunay triangulation into two possible categories: center triangles or void triangles. If all three points of a triangle are center points, and all of its edges are smaller than ε, then it is a center triangle. Otherwise, we count it as a void triangle. Coloring all void triangles gives us a clear and quick view of the voids in the data set.
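A sketch of this triangle classification over a scipy Delaunay triangulation, assuming center holds the indices of the center points (names are illustrative, not our implementation):

import numpy as np

def classify_triangles(tri, points, center, epsilon):
    """Split Delaunay triangles into center triangles and void triangles."""
    center = set(center)
    dist = lambda a, b: np.linalg.norm(points[a] - points[b])
    center_tri, void_tri = [], []
    for simplex in tri.simplices:          # each simplex is a triangle (a, b, c)
        a, b, c = simplex
        if (a in center and b in center and c in center
                and dist(a, b) < epsilon and dist(a, c) < epsilon and dist(b, c) < epsilon):
            center_tri.append(simplex)
        else:
            void_tri.append(simplex)
    return center_tri, void_tri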

Code

Listing 3.5: ORCA

DECLARE epsilon, k, gen, file
READ file, with n points

//make the Delaunay triangulation over the points.
triangulation = Delaunay(points)

DECLARE EMPTY point LISTS center, candidates, outlier, border

//loop over every point
FOR EACH p IN point list:
    DEFINE nrNeighbors AS 0
    FOR EACH neighbor IN find-neighbors(p, gen, triangulation):
        IF distance(neighbor, p) <= epsilon:
            INCREMENT nrNeighbors BY 1
    IF nrNeighbors >= k:
        APPEND p TO center list
    ELSE:
        APPEND p TO candidate list

//Move candidates from outlier to border,
//if they fulfill the criterion.
//If not, put them on the outlier point list.
FOR EACH candidate IN candidate list:
    wasBorder = False
    FOR EACH center IN center list:
        IF distance(center, candidate) <= epsilon:
            APPEND candidate TO border list
            wasBorder = True
            BREAK
    IF NOT wasBorder:
        APPEND candidate TO outlier list

DECLARE EMPTY triangle LISTS void-triangle, center-triangle
FOR EACH triangle t IN triangulation:
    DEFINE a, b, c AS points of triangle t

    IF a, b, c IN center list AND distance(a,b), distance(a,c), distance(b,c) < epsilon:
        APPEND t TO center-triangle
    ELSE:
        APPEND t TO void-triangle

Code Description

As in the previous case, we start by declaring the desired ε, k, and the file with the points to be read. Additionally, we declare a gen parameter to specify how many generations of neighbors in the triangulation we want to look up. We proceed to read the file and store the points in a local variable. Afterwards, we build the Delaunay triangulation (see Section 2.2.1) and store it in a variable too. We declare the empty lists, to be filled later. Now comes the classifying part: we loop over each point in the data set. For each point, we look up its neighbors in the Delaunay triangulation. Here lies a significant difference with the previous Delaunay triangulation algorithm: we not only check the direct neighbors in the triangulation, but also look up to gen generations of neighbors. This is not as fast as the previous case, but it is still very fast, and it solves the problem for higher k. We use these neighbors to check the ε criterion, adding points with at least k neighbors to the list of center points. The next step is to determine whether the candidates are outliers or border points.

After the classification of points, we start classifying triangles. Two empty triangle lists are declared, one for center triangles and one for void triangles. For each triangle in the Delaunay triangulation, we define its three points as a, b and c. Then, we check whether a, b and c are all center points, and whether all edges of the triangle are smaller than ε. If so, we add that triangle to the center triangle list; if not, we put it into the void triangle list. This step is quite fast: a Delaunay triangulation of a data set with n points has O(n) triangles, and the lookups required per triangle take no more than O(log n) time, so we arrive at an upper bound of O(n log n) time for this step. After the triangles are classified, we can plot them or, alternatively, return them as output.
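For illustration only, the sketches above could be combined roughly as follows, assuming the helper functions higher_gen_neighbors and classify_triangles defined earlier, a point array points, and chosen values of epsilon, k and gen (the border/outlier step is omitted here):

import numpy as np
from scipy.spatial import Delaunay

# assumes: points (n x 2 array), epsilon, k, gen already defined
tri = Delaunay(points)
indptr, indices = tri.vertex_neighbor_vertices

# classify center points using higher-generation Delaunay neighbours
center = [i for i in range(len(points))
          if sum(np.linalg.norm(points[j] - points[i]) <= epsilon
                 for j in higher_gen_neighbors(i, gen, indptr, indices)) >= k]

center_tri, void_tri = classify_triangles(tri, points, set(center), epsilon)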

3.3 Discussion and selection of algorithm

The first approaches (see Section 3.1) show our initial attempts at building the void finder by using k-Nearest-Neighbor search (see Section 2.2.2). The brute force approach (see Section 3.1.1) uses a memory-heavy matrix to store all distances. Even worse: this step takes O(n²) time. The next step is equally slow: we need to traverse every point in our data set and compare it to the others in order to classify it. We then tried a faster approach, using a Delaunay triangulation (see Section 2.2.1 and Section 3.1.2) to classify points. This is already a huge improvement, since building the Delaunay triangulation takes O(n log n) time. The downside is that every point in the triangulation has 6 neighbors on average, meaning that this method will not work if we want to use higher values of k. This is because if we want to check whether a given point has, say, 10 ε-neighbors, the Delaunay triangulation will provide us with only about 6 of them; thus, we cannot know whether said point has the 10 ε-neighbors or not. The last approach we tried was using a KD-Tree (see Section 2.2.3 and Section 3.1.3). This algorithm is really fast, given the underlying data structure; however, the results obtained show that the classification of points is not very precise.

The next part (see Section 3.2) presents more refined solutions that yielded better results, mostly based upon the first approaches. First, we noted that with the basic KD-Tree version (see Section 3.1.3), if we plotted the data using a high value of k, most non-void parts of the data would end up covered, while voids would end up mostly empty. Using image processing, we could then extract the voids. The downside was the second part: the image processing step proved too slow.

Finally, we arrive at the best solution (see Section 3.2.2). We use a Delaunay triangulation and do not limit ourselves to the direct neighbors, but traverse further generations of neighbors so as not to limit the k parameter. By classifying points, we can remove outliers and clean up our data set. Finally, we classify the triangles in the triangulation into void and center triangles. This is the algorithm we selected for parallelization (Section 3.4) and for comparison purposes (Chapter 4).

3.4 Parallel Approach

As discussed in Section 3.3, we selected ORCA as the best solution. In this section we first discuss the different parallelization tools and frameworks available, and then show two approaches to a parallel version of ORCA.

3.4.1 Parallelization tools and frameworks

There are multiple tools for programming across multiple platforms and hardware such as CPUs, GPUs, etc. In this subsection we will discuss the various tools and frameworks that exist today, and why OpenCL (specifically, pyOpenCL) was chosen to make the parallel version of ORCA.

CUDA

CUDA is a parallel computing platform and programming model developed by NVIDIA for general computing on graphics processing units (GPUs) [?]. It serves as a platform layer giving access to the GPU, and it works by executing kernels. CUDA can be programmed in C, C++, Fortran, Python and MATLAB. NVIDIA also provides a toolkit for developing GPU-accelerated applications. CUDA possesses numerous advantages. It supports scattered reads, meaning that code can read from arbitrary addresses in memory. It also offers unified virtual memory and unified memory, and it provides a fast shared memory region that can be accessed by threads. Additionally, it features fast downloads and read-backs to and from the GPU, and it has full support for integer and bitwise operations, including integer texture lookups.

However, CUDA also has several limitations. All CUDA source code has to be processed according to C++ rules, both for the host computer and for the GPU device. When interacting with OpenGL, the interoperability is one-way: OpenGL has access to registered CUDA memory, but not the other way around. Copying between host and device memory may incur performance hits, due to system bus bandwidth and latency. In order to achieve the best performance, threads have to run in groups of at least 32, and the total number of threads may number in the thousands. Unlike OpenCL, GPUs that can run CUDA code are only available from NVIDIA, and there is no fallback functionality nor an emulator. Run-time information and exception handling are only supported in host code, not in device code.

C++ AMP

C++ Accelerated Massive Parallelism (C++ AMP) is a C++ Library that provides tools for developing programs that compile and execute on data-parallel hardware such as GPUs. It is implemented on DirectX 11 by Microsoft, programming directly in C++. It is portable, and the Microsoft implementation is included in Visual Studio 2012, and it features a debugger and profiler support.

DirectCompute

Microsoft's DirectCompute is an API intended to run compute kernels on CPUs and GPUs on Windows Vista, Windows 7 and later versions. It was released with the DirectX 11 API. The main disadvantage is portability: a Windows machine is required in order to execute DirectCompute kernels. This restriction led us to discard this tool.

OpenCL

OpenCL (Open Computing Language) is a framework developed by the Khronos Group for programming across heterogeneous platforms, consisting mainly of CPUs and GPUs. It specifies programming languages based on C99 and C++11. OpenCL features an API to control the platform and thereby execute programs on the compute devices. It provides a standard interface for parallel computing, supporting not only data-based but also task-based parallelism.

Basically, OpenCL treats a computing system as an array of compute devices. These consist of CPUs, GPUs, etc. There is a C-like programming language used to program kernels. Kernels are series of instructions (functions) executed on OpenCL devices. The idea is that a single kernel execution may run on many of the processing elements in parallel.

One of the key features of OpenCL is its API. It allows programs running on the host to launch kernels on the compute devices, as well as to manage device memory. Third-party bindings exist for other programming languages and platforms, such as Python (pyOpenCL). This last part is crucial for this work, since the bulk of ORCA's software is written in Python.

3.4.2 First approach

The most time-consuming part of ORCA is the computation of distances between points. This is done in two different parts of the code: first, when looking for center points, and afterwards, when looking for border points. Thus, we define two kernels, one for each task.

The first approach was a direct modification of the original ORCA algorithm. In order to give the kernels the proper input, numpy vectors and matrices had to be constructed. Surprisingly, while benchmarking this first version, we saw that the running time was around 2 seconds slower than that of sequential ORCA. Further inspection revealed that the parallel part was extremely fast (usually less than a second), but the construction and preparation of the data took 5 or 6 seconds. Thus, a new version was created that does not pre-process as much data.

Next, we will show the kernel used:

Listing 3.6: First Kernel

__kernel void isCore(__global const float2 *points,
                     __global const int *matriz,
                     __global int *results,
                     int k, float epsilon, int ancho){
    int idx = get_global_id(0);
    int nrVec = 0;
    float2 A = points[idx];
    // walk the idx-th row of the precomputed neighbor matrix
    for(int i = 0; i < ancho; i++){
        int indexB = matriz[getPos(idx, i, ancho)];
        float2 B = points[indexB];
        if(distancia(A, B) <= epsilon){
            nrVec++;
        }
        if(nrVec >= k){
            results[idx] = 1;  // center point
            return;
        }
    }
    results[idx] = 0;
}

__kernel void isBorder(__global const float2 *points,
                       __global const int *matriz,
                       __global int *results,
                       float epsilon, int ancho){
    int idx = get_global_id(0);
    if(results[idx] == 0){
        float2 A = points[idx];
        for(int i = 0; i < ancho; i++){
            int indexB = matriz[getPos(idx, i, ancho)];
            if(results[indexB] == 1){
                float2 B = points[indexB];
                if(distancia(A, B) <= epsilon){
                    results[idx] = 2;  // border point
                    return;
                }
            }
        }
    }
}

3.4.3 Final parallel version

In the new, final version, each kernel receives the list of points, k, ε, and a list of results to be filled out. A single kernel instance takes a single point and calculates distances to the other points. The center-finding kernel checks whether a given point has at least k ε-neighbors, and marks this result as a 0 or a 1 in the results list. Later on, the border-finding kernel is called. Each instance checks whether its point has a center point at distance ε or less, and marks itself in the results list as a 2 if this is the case.

Code:

Listing 3.7: Kernel 2

__kernel void isCoreNxN(__global const float2 *points,
                        __global int *resultados,
                        int k, float epsilon, int ancho){
    int idx = get_global_id(0);
    int nrVec = 0;
    float2 A = points[idx];
    // compare against every other point
    for(int i = 0; i < ancho; i++){
        float2 B = points[i];
        if(i != idx && distancia(A, B) <= epsilon){
            nrVec++;
        }
        if(nrVec >= k){
            resultados[idx] = 1;  // center point
            return;
        }
    }
    resultados[idx] = 0;
}

__kernel void isBorderNxN(__global const float2 *points,
                          __global int *resultados,
                          float epsilon, int ancho){
    int idx = get_global_id(0);
    if(resultados[idx] == 0){
        float2 A = points[idx];
        for(int i = 0; i < ancho; i++){
            if(resultados[i] == 1){
                float2 B = points[i];
                if(distancia(A, B) <= epsilon){
                    resultados[idx] = 2;  // border point
                    return;
                }
            }
        }
    }
}
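A hedged host-side sketch of how such kernels can be launched with pyOpenCL, assuming the kernel source of Listing 3.7 is available as the string kernel_src (buffer handling simplified; names are illustrative, not the thesis code):

import numpy as np
import pyopencl as cl

def classify_parallel(points_np, k, epsilon, kernel_src):
    """points_np: float32 array of shape (n, 2). Returns 0/1/2 labels per point."""
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    prg = cl.Program(ctx, kernel_src).build()

    n = len(points_np)
    mf = cl.mem_flags
    pts = np.ascontiguousarray(points_np, dtype=np.float32)   # maps to float2 on device
    res = np.zeros(n, dtype=np.int32)
    pts_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=pts)
    res_buf = cl.Buffer(ctx, mf.READ_WRITE | mf.COPY_HOST_PTR, hostbuf=res)

    # one work-item per point: first mark center points, then border points
    prg.isCoreNxN(queue, (n,), None, pts_buf, res_buf,
                  np.int32(k), np.float32(epsilon), np.int32(n))
    prg.isBorderNxN(queue, (n,), None, pts_buf, res_buf,
                    np.float32(epsilon), np.int32(n))
    cl.enqueue_copy(queue, res, res_buf)
    return res   # 1 = center, 2 = border, 0 = outlier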

See Section 4.2.3 for a comparison of running times between the sequential version and the parallel one.

Chapter 4

Results and Analysis

In this chapter we show the results obtained by ORCA. First, we show some generated samples. Then, we compare ORCA with DELFIN [?].

4.1 Generated Samples

In this section, we show some results we found whilst running ORCA. The points were generated for the study and benchmarking of DELFIN [?].

Important note: it is common practice to test algorithms on two-dimensional artificial data sets before moving to real ones. Furthermore, real data comes in three dimensions, which is beyond the scope of the present thesis.

First, in Fig. 4.1 we can see a data set of 4096 randomly generated points, with a density of 1.024×10⁻³ points/area. After this, carefully placed and shaped irregular voids are carved out of the set.

When running ORCA over the 4096-point irregular data set, we can see that the algorithm detects most big voids. Strangely shaped and small voids, however, are not detected (see Fig. 4.2).

Next, we can see the data set expanded to 8192 random points, with a density of 2.048×10⁻³ points/area (with the same voids carved out of the data, see Fig. 4.3).

The algorithm still finds most voids on this more densely packed data set, and it still struggles to detect the smaller, oddly shaped voids (see Fig. 4.4).

The following figure (4.5) shows a randomly generated data set of 32768 points, with a density of 8.024×10⁻³ points/area. We can clearly see the voids, most of them built in non-traditional shapes in order to test our algorithm.

The point classifier successfully allows ORCA to classify most points as center points when using the appropriate values for ε and k (see Fig. 4.6).

Figure 4.1: A randomly generated 4096-point dataset

Figure 4.2: Voids found on a randomly generated 4096-point dataset, using 4th generation Delaunay neighbors, ε value of 100 and k value of 15

Figure 4.3: A randomly generated 8192-point dataset

Figure 4.4: Voids found on a randomly generated 8192-point dataset, using 3rd generation Delaunay neighbors, ε value of 100 and k value of 8

Figure 4.5: A randomly generated 32768-point dataset

Figure 4.6: Voids found on a randomly generated 32768-point dataset, using 5th generation Delaunay neighbors, ε value of 80 and k value of 20

4.2 Void-finding algorithm comparison

In this section, we aim to compare ORCA with DELFIN [?], in order to validate ORCA and benchmark it according to the objectives stated (see Section 1.5).

4.2.1 Comparison conditions and Data Sets

We will compare the new void finding algorithm, ORCA, with DELFIN [?].

For the first trials we used a randomly generated set of 1 024 two-dimensional points. As the development process continued, we used higher-density data sets, doubling the number of random points each time, up to 262 144 points.

Both algorithms were tested on the following machine:

• MacBook Pro (Retina, mid 2014)

• Processor: 2.6 GHz Intel Core i5

• Memory 8 GB 1600 MHz DDR3

• Graphics Intel Iris 1536 MB

4.2.2 DELFIN and ORCA

The DELFIN algorithm has two main parts [?] (see Section 2.1.5):

1. Subvoid building step: DELFIN builds terminal-edge regions from the Delaunay triangulation of the data set. Small terminal-edge regions are discarded by using area or length values. The remaining regions are considered as subvoids.

2. Joining subvoids: subvoids found in the previous part are joined into candidate voids if they fulfill a criterion specified by the user. Candidate voids are marked as voids if their area is larger than a minimum valid void area parameter.

The new algorithm, ORCA, also has two distinct steps (see Section 3.2.2):

1. A Delaunay triangulation is built. With it, ORCA identifies neighbors and classifies points into three categories: center points, border points and outliers.

2. Outliers are discarded, and border points together with the Delaunay triangulation are used to build the voids.

# of Points    DELFIN (s)    ORCA (s)    ORCA Parallel (s)
1 024          2.58          0.17        0.08
2 048          4.14          0.38        0.21
4 096          5.08          0.60        0.33
8 192          5.40          1.17        0.61
16 384         8.29          2.25        1.25
32 768         14.53         4.74        2.40
65 536         27.90         9.14        4.67
131 072        54.61         18.28       9.42
262 144        107.29        36.97       18.58

Table 4.1: Running time comparison between DELFIN, ORCA, and ORCA Parallel

4.2.3 Performance Indicators

When testing ORCA, four distinct performance indicators were used. First, we measured the running time of ORCA over the chosen datasets and compared it to the running time of the void finder DELFIN [?]. The second performance indicator was total memory usage. The third one was effectiveness, which will be measured as the % of overlap between the voids found by ORCA and the ones found by the DELFIN algorithm, as well as comparing it to the actual areas of the voids. As a fourth indicator, we measured robustness, i.e., the resistance to noise in the data.

Running time

The following table shows the running time, in seconds, of DELFIN and of ORCA. They were tested on the numbers of points shown in Table 4.1.

As we can see, both algorithms scale almost linearly with the number of points (i.e., if we double the points we roughly double the running time). However, ORCA outperforms DELFIN considerably: it is almost three times as fast.

Memory Usage

Table 4.2 shows the memory used by both DELFIN and ORCA.

Due to the structures used, and the way it was built, ORCA uses less memory than DELFIN.

# of Points    DELFIN Memory Used (MB)    ORCA Memory Used (MB)
1 024          42                         37
2 048          47                         39
4 096          58                         43
8 192          80                         52
16 384         126                        66
32 768         212                        98
65 536         391                        161
131 072        728                        287
262 144        1434                       502

Table 4.2: Memory used by DELFIN and by ORCA

Effectiveness

• Recovery and Error Rates:

To measure the performance of the algorithm, we used the same two parameters as used in Ortega [?]: rv, the retrieval rate for void v, and ev, the overdetection rate for void v. These quantities are respectively calculated as the detected fraction of the original void, and the non-void fraction of the detected void. More precisely,

we define the retrieval rate of a void as rv = A∩v / Av, where A∩v is the area of the intersection between the generated void v and the retrieved void v∗, and Av is the area of v. Similarly, we define the overdetection or error rate of a void as ev = 1 − A∩v / Av∗, where Av∗ is the area of the retrieved void v∗. For example, if a generated void of area 100 overlaps the corresponding retrieved void of area 120 over an area of 90, then rv = 0.9 and ev = 1 − 90/120 = 0.25.

In order to test our algorithm, we experimentally determined the best values for k and ε. Generally speaking, the values of k range from 3 (for sparse data sets) to close to 20 (for very dense data sets). In the case of ε, it ranges from 200 (for very sparse data sets) down to 50 (for very dense data sets). If the algorithm finds too few voids, too many points are being classified as center points, meaning that we have to increase k and/or decrease ε. Conversely, if the algorithm is overdetecting voids, too many points are being classified as outliers or border points, which means that we have to decrease k and/or increase ε.

Irregular voids:

Figure 4.7: Recovery and Error Rates of irregular voids over a 1.000 points set, using values of k=3 and ε=150.

As can clearly be seen in Figure 4.7, there were some voids not detected at all by ORCA. Many other voids were not detected accurately. This is due to the distance between points on the 1.000 points set, and the way ORCA works.

While detecting voids on the 5.000 points dataset (see Fig. 4.8), ORCA did considerably better. Just two voids were not detected, and most other voids were detected with only a small overdetection percentage.

On the 10.000 points set (Fig. 4.9), all voids were successfully detected; ORCA only had trouble recovering void 6 accurately.

Finally, on the 50.000 points set (Fig. 4.10), ORCA detected all voids with only small error and overdetection rates.

Figure 4.8: Recovery and Error Rates of irregular voids over a 5.000 points set, using values of k=7 and ε=100.

Figure 4.9: Recovery and Error Rates of irregular voids over a 10.000 points set, using values of k=12 and ε=70.

Figure 4.10: Recovery and Error Rates of irregular voids over a 50.000 points set, using values of k=15 and ε=50.

Regular Voids:

Figure 4.11: Recovery and Error Rates of regular voids over a 1.000 points set, using values of k=3 and ε=200.

Again, while testing ORCA on a 1.000 point set (Fig. 4.11), the algorithm had trouble detecting some of the voids, because of the low density of the data and voids being rather sparse.

Figure 4.12: Recovery and Error Rates of regular voids over a 5.000 points set, using values of k=7 and ε=120.

On the 5.000 points set (Fig. 4.12), ORCA did significantly better. Two voids were not detected, and two more had low recovery and high overdetection rates.

Four distinct voids were not detected on the 10.000 points set (Fig. 4.13). They were either too small or lay on a border.

On the 50.000 points set (Fig. 4.14), just one void was not detected (void 3). This void lay on the border, and so was not accounted for. For most other voids, the error was virtually non-existent.

On the 100.000 point set (Fig. 4.15), most voids that were not detected were voids that merged. ORCA detected them as a single void, and thus, they are not accounted for.

Figure 4.13: Recovery and Error Rates of regular voids over a 10.000 points set, using values of k=10 and ε=100.

Figure 4.14: Recovery and Error Rates of regular voids over a 50.000 points set, using values of k=14 and ε=70.

Figure 4.15: Recovery and Error Rates of regular voids over a 100.000 points set, using values of k=14 and ε=70.

• Comparison with DELFIN

In this section, we aim to compare the effectiveness of DELFIN and ORCA. To this end, we will compare the number and shapes of the found voids, as well as comparing it to the actual voids inserted in the data.

Fig. 4.16 shows a set of 8.192 points, with irregular shapes cut out. On the left side are the voids found by ORCA, and on the right, the ones found by DELFIN. Fig. 4.17 highlights in green the voids that approximately match in size, and in orange the voids that differ in shape or size. Note that every big void has been found.

Figure 4.16: Comparison between ORCA (left) and DELFIN (right), using a set of 8192 points

Figure 4.17: Comparison between ORCA (left) and DELFIN (right), using a set of 8.192 points, showing in green similar voids found, and in orange voids with different shapes found.

Robustness

To test the robustness (i.e., resistance to noise in the data) of the algorithms, a special test was designed. First, 10.000 points were randomly placed on a plane following a uniform distribution, except for the inside of a 200-radius circle centered on the origin. Then, noise was added. Fig. 4.18 shows 75 random noise points added to the void in the middle. Here, ORCA detects most parts of the central void, while DELFIN detects two

small voids. However, as seen in Fig. 4.19, when putting 125 random noise points inside the circle, DELFIN is no longer capable of detecting it as a void (right side). If we adjust DELFIN's minimal area parameter, we get many non-existent voids (central image), while our algorithm still detects the large void in the center.

Figure 4.18: Comparison between ORCA (left) and DELFIN (right), using a set of randomized 10.000 points with a 200-radius void with 75 random noise points inside.

Figure 4.19: Comparison between ORCA (left) and DELFIN (middle and right), using a set of randomized 10.000 points with a 200-radius void with 125 random noise points inside. For DELFIN the parameters were moved so that it found many small voids (center) and on the next iteration of parameters it did not find voids (right)

Chapter 5

Conclusions and Future Work

In this chapter we present the main conclusions drawn from this work. We discuss how the objectives were achieved and answer our research questions. Finally, we explain what future work can be derived from this thesis.

5.1 Conclusions

We first set out to do an extensive review of the current state of the art of void-finding algorithms. This review not only classified the algorithms and techniques found, but also estimated the algorithmic complexity of each algorithm (see Section 2.1). A surprising conclusion was that there are broadly two big categories of void-finding algorithms, and that many existing algorithms perform essentially the same computations, although they seem different.

The main goal was to construct a new void-finding algorithm that could outperform existing ones in terms of running time, memory usage, effectiveness and robustness. This was successfully achieved by using a Delaunay triangulation to look for neighbouring points. With this, a k-nearest-neighbor search was implemented to classify points into three categories. This proved useful not only for clearing noise from the data, but also for classifying triangles into void and non-void triangles.

As was shown in Section 4.2.3, ORCA outperforms DELFIN in running time. This way, the new algorithm can process larger data sets quickly. We tested a data set with up to ∼ 250.000 points, and obtained good results in only half a minute.

In terms of memory usage, our algorithm uses light data structures, which means a lowered memory cost. As shown in Section 4.2.3, when processing data sets with increased number of points, ORCA uses only around a third of DELFIN’s memory. This can also prove extremely useful, since extremely populated data sets could lead to memory shortage, thus forcing memory swaps and slowing the running time immensely.

Regarding effectiveness, ORCA performs better on denser data sets, finding voids more accurately and overdetecting less. In comparison, DELFIN can detect strangely shaped, non-convex voids a little better than ORCA, because of the way DELFIN appends triangles to the large-edge triangles it finds, and the way it joins the resulting voids afterwards. Our algorithm still detects strangely shaped voids, but only their larger parts, missing thin extensions such as tentacle-shaped regions.

Concerning robustness, our algorithm can detect voids even with a high percentage of noise in the data. We tested more than 200 random noise points inside a 200-radius circle, and the algorithm still detected the void. It is worth noting, however, that random points near the outer shell of the circle should no longer be considered noise; they become part of the void's border, and our algorithm treats them as such.

We successfully built a parallel version of ORCA, taking advantage of OpenCL's API, which allowed us to program ORCA in a multi-threaded way, offloading the most time-consuming parts to the GPU. This new, parallel algorithm has all the advantages of the original ORCA, but runs around twice as fast (see Table 4.1 and Section 3.4).

5.2 Future Work

First of all, ORCA can be further refined by reducing the number of parameters it uses. k and ε could be derived from the data, using density of points or desired void diameter.

Next, a new filter could be applied to the final voids, limiting them by area. Also, as is done in DELFIN [?], voids on the border should be discarded: not only can they be incomplete, but we also do not know what lies beyond the data set.

A future version of ORCA could also be extended to more than two dimensions. This new 3D version of ORCA could be tested on real data, as found in the SDSS surveys, the Millennium Simulation Project, or similar. In order to extend it to three dimensions, Delaunay tetrahedra can still be used to find neighbors efficiently. One of the main difficulties of extending the algorithm to more dimensions is the visualization of the data, mainly of the voids the algorithm finds, since it becomes harder to check voids by visual inspection. Another difficulty is that the Delaunay triangulation of a three-dimensional data set contains many more elements, and thus our previous O(log n) lookup time will no longer be easily achievable.

Regarding parallelization, a new version could further extend the idea of preparing and refining the data, thus reducing the running time on the GPU. The triangle-classification step in the last part of ORCA could also be processed on the GPU in a future version.

Bibliography

[1] J Aikio and P Mähönen. "A simple void-searching algorithm". In: The Astrophysical Journal 497.2 (1998), p. 534.
[2] Kory J Allred and Wei Luo. "Data-mining Based Detection of Glaciers: Quantifying the Extent of Alpine Valley Glaciation". In: Geosciences 1.1 (2015), pp. 1–18.
[3] R Alonso et al. "Delaunay based algorithm for finding polygonal voids in planar point sets". In: Astronomy and Computing 22 (2018), pp. 48–62.
[4] S Arbabi-Bidgoli and V Müller. "Void scaling and void profiles in cold dark matter models". In: Monthly Notices of the Royal Astronomical Society 332.1 (2002), pp. 205–214.
[5] Jon Louis Bentley. "Multidimensional binary search trees used for associative searching". In: Communications of the ACM 18.9 (1975), pp. 509–517.
[6] Mark de Berg, Marc van Kreveld, Mark Overmars and Otfried Schwarzkopf. "Computing the Voronoi Diagram". In: Computational Geometry (2nd revised ed.) Springer-Verlag, 2000. Chap. 7.2, pp. 151–160.
[7] Graham Birley and Neil Moreland. A practical guide to academic research. Routledge, 2014.
[8] CUDA. https://developer.nvidia.com/cuda-zone. Accessed: 2018-08-17.
[9] Rupert F Chisholm and Max Elden. "Features of emerging action research". In: Human Relations 46.2 (1993), pp. 275–298.
[10] Paolo Cignoni et al. "DeWall: A fast divide and conquer Delaunay triangulation algorithm in Ed". In: Computer-Aided Design 30.5 (1998), pp. 333–341.
[11] Jörg M Colberg et al. "Voids in a ΛCDM universe". In: Monthly Notices of the Royal Astronomical Society 360.1 (2005), pp. 216–226.
[12] Boris Delaunay. "Sur la sphere vide". In: Izv. Akad. Nauk SSSR, Otdelenie Matematicheskii i Estestvennyka Nauk 7.793-800 (1934), pp. 1–2.
[13] Hagai El-Ad et al. "Automated detection of voids in redshift surveys". In: The Astrophysical Journal Letters 462.1 (1996), p. L13.
[14] Caroline Foster and Lorne A Nelson. "The size, shape, and orientation of cosmological voids in the Sloan Digital Sky Survey". In: The Astrophysical Journal 699.2 (2009), p. 1252.
[15] José Gaite. "Zipf's law for fractal voids and a new void-finder". In: The European Physical Journal B - Condensed Matter and Complex Systems 47.1 (2005), pp. 93–98.
[16] Stephen A Gregory and Laird A Thompson. "The Coma/A1367 and its environs". In: The Astrophysical Journal 222 (1978), pp. 784–799.
[17] G Kauffmann and AP Fairall. "Voids in the distribution of galaxies: an assessment of their significance and derivation of a void spectrum". In: Monthly Notices of the Royal Astronomical Society 248.2 (1991), pp. 313–324.
[18] Andreas Klöckner et al. "PyCUDA and PyOpenCL: A scripting-based approach to GPU run-time code generation". In: Parallel Computing 38.3 (2012), pp. 157–174.
[19] Donald E Knuth. Fundamental algorithms: the art of computer programming. Vol. 3. 1973.
[20] Andrey V Kravtsov et al. "Adaptive refinement tree: a new high-resolution N-body code for cosmological simulations". In: The Astrophysical Journal Supplement Series 111.1 (1997), p. 73.

[21] Jounghun Lee and Daeseong Park. "Constraining the dark energy equation of state with cosmic voids". In: The Astrophysical Journal Letters 696.1 (2009), p. L10.
[22] Mark C Neyrinck. "ZOBOV: a parameter-free void-finding algorithm". In: Monthly Notices of the Royal Astronomical Society 386.4 (2008), pp. 2101–2109.
[23] Rodrigo Ignacio Alonso Ortega et al. "A Delaunay Tesselation based Void Finder Algorithm". In: Universidad de Chile (2016).
[24] Santiago G Patiri et al. "Statistics of voids in the two-degree Field Galaxy Redshift Survey". In: Monthly Notices of the Royal Astronomical Society 369.1 (2006), pp. 335–348.
[25] Alice Pisani et al. "Counting voids to probe dark energy". In: Physical Review D 92.8 (2015), p. 083531.
[26] Erwin Platen et al. "A cosmic watershed: the WVF void detection technique". In: Monthly Notices of the Royal Astronomical Society 380.2 (2007), pp. 551–570.
[27] Raimund Seidel. "The upper bound theorem for polytopes: an easy proof of its asymptotic version". In: Computational Geometry 5.2 (1995), pp. 115–116.
[28] Anatoly Klypin, Stefan Gottlöber, Ewa L. Lokas and Yehuda Hoffman. "The dark side of the halo occupation distribution". In: The Astrophysical Journal 609.1 (2003), p. 35.
[29] John E Stone et al. "OpenCL: A parallel programming standard for heterogeneous computing systems". In: Computing in Science & Engineering 12.3 (2010), pp. 66–73.
[30] PM Sutter et al. "VIDE: the Void IDentification and Examination toolkit". In: Astronomy and Computing 9 (2015), pp. 1–9.
[31] FHMS Vogeley. Voids in the PSCz Survey and the Updated Zwicky Catalog. Tech. rep. 2001.
[32] Michael J Way et al. "Structure in the 3D Galaxy Distribution. II. Voids and Watersheds of Local Maxima and Minima". In: The Astrophysical Journal 799.1 (2015), p. 95.
