
Astrometry.net: Automatic recognition and calibration of astronomical images

by

Dustin Lang

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy
Graduate Department of Computer Science
University of Toronto

Copyright © 2009 by Dustin Lang

This is Subversion revision 13655 in the Astrometry.net repository, dated November 17, 2009 at 13:39:31.

Abstract

Astrometry.net: Automatic recognition and calibration of astronomical images

Dustin Lang

Doctor of Philosophy

Graduate Department of Computer Science

University of Toronto

2009

We present Astrometry.net, a system for automatically recognizing and astrometrically calibrating astronomical images, using the information in the image pixels alone. The system is based on the geometric hashing approach in computer vision: we use the geometric relationships between low-level features (stars and galaxies), which are relatively indistinctive, to create geometric features that are distinctive enough that we can recognize images that cover less than one-millionth of the area of the sky. The geometric features are used to rapidly generate hypotheses about the location—the pointing, scale, and rotation—of an image on the sky. Each hypothesis is then evaluated in a Bayesian decision theory framework in order to ensure that most correct hypotheses are accepted while false hypotheses are almost never accepted. The feature-matching process is accelerated by using a new fast and space-efficient kd-tree implementation. The Astrometry.net system is available via a web interface, and the software is released under an open-source license. It is being used by hundreds of individual astronomers and several large-scale projects, so we have at least partially achieved our goal of helping “to organize, annotate and make searchable all the world’s astronomical information.”

Acknowledgements

First, I want to thank Sam Roweis for accepting me as his student, for handing me the nascent Astrometry.net project which has been so fun to work on, for providing help, support, and encouragement throughout, and, most importantly, for his warm friendship.

I owe huge thanks to David Hogg, for serving as my semi-official co-supervisor, mentor, and guide to the world of astronomy. It is hard to imagine what my life would be like if I hadn’t met him: no Delion-fuelled hack sessions in New York, no lovely summers in Heidelberg, and I almost certainly would not have made the jump into astronomy. He has shown me a great deal about how to be a good scientist and how to love my job and love my family at the same time. It has been a great privilege working with him.

I have been fortunate to work with a number of great people on the Astrometry.net team, in particular Keir Mierle, who taught me many things about the practice of software development, and Christopher Stumm, who has done excellent work in bringing Astrometry.net to the amateur astronomy community. I must also thank the army of alpha-testers who have provided a lot of bug reports, great ideas for improving the system, and some very encouraging feedback.

It is a pleasure to thank Iain Murray for many helpful ideas and insights; for listening politely while I rambled about some half-understood problem I was working on; and for being so amusingly English.

I thank Sven Dickinson and Rich Zemel for serving on my advisory committee. They provided excellent guidance, advice, and suggestions for improving this thesis. I thank Doug Finkbeiner for taking the time to read my thesis and provide thoughtful comments, and for travelling to Toronto to serve as my external examiner.

I thank my mom and my brother, Gerda and Devon Lang, who are two fantastic people who have always been there for me.

I give huge thanks to my lovely wife Micheline. A PhD is a team effort, and she is the foundation of Team Lang. She provided a huge amount of love and encouragement, proof-read thesis drafts, forced me to practice my talks and listened patiently while I did, and kept our household running while I was embroiled in writing. I couldn’t have done it without her. There are exciting times ahead and I’m so very happy to have her at my side.

Finally, I thank my baby son Julian for having brought so much joy into our lives, and for sleeping soundly while I write these words.

Contents

1 Introduction 1

1.1 Preface ...... 2

1.2 Introduction ...... 2

1.3 Astronomical imaging for computer scientists ...... 4

1.4 Pattern recognition ...... 6

1.5 Visual pattern recognition ...... 7

1.6 The geometric hashing framework ...... 8

1.7 Related work in fast feature matching ...... 12

1.7.1 Bloom filters ...... 13

1.7.2 Locality Sensitive Hashing ...... 14

1.7.3 Kd-trees ...... 17

1.7.4 Other approaches ...... 18

1.8 Astrometric calibration as a pattern recognition task ...... 19

1.9 Related work in astrometric calibration ...... 22

1.9.1 Non-blind astrometric calibration ...... 23

1.9.2 Blind astrometric calibration ...... 26

1.9.3 Fine-tuning astrometric calibrations ...... 30

1.10 Summary ...... 31

2 Astrometry.net: recognizing astronomical images 33

2.1 Introduction ...... 34

2.2 Methods ...... 38

2.2.1 Star detection ...... 39

2.2.2 Hashing of asterisms to generate hypotheses ...... 40

2.2.3 Indexing the sky ...... 44

2.2.4 Verification of hypotheses ...... 52

2.3 Results ...... 54

2.3.1 Blind astrometric calibration of the Sloan Digital Sky Survey ... 54

2.3.2 Blind astrometric calibration of Galaxy Evolution Explorer data . 77

2.3.3 Blind astrometric calibration of Hubble Space Telescope data ... 79

2.3.4 Blind astrometric calibration of other imagery ...... 83

2.3.5 False positives ...... 85

2.4 Discussion ...... 85

3 Verifying an astrometric alignment 93

3.1 Introduction ...... 93

3.2 Bayesian decision-making ...... 94

3.3 A simple independence model ...... 96

3.3.1 Issues with this model ...... 99

3.4 The Astrometry.net case ...... 117

3.5 Discussion ...... 122

4 Efficient implementation of kd-trees 124

4.1 Introduction ...... 125

4.2 The standard kd-tree ...... 125

4.3 Kd-tree implementation ...... 128

4.3.1 Data structures ...... 128

4.3.2 Construction ...... 128

4.3.3 Distance bounds ...... 129

4.3.4 Nearest neighbour ...... 133

4.3.5 Rangesearch ...... 133

4.3.6 Approximations ...... 135

4.4 Efficient Implementation Tricks ...... 137

4.4.1 Store the data points as a flat array...... 137

4.4.2 Create a complete tree...... 138

4.4.3 Don’t use pointers to connect nodes...... 138

4.4.4 Pivot the data points while building the tree...... 139

4.4.5 Don’t use C++ virtual functions...... 139

4.4.6 Consider discarding the permutation array...... 140

4.4.7 Store only the rightmost offset of points owned by a node. .... 140

4.4.8 Don’t store the R offsets of internal nodes...... 141

4.4.9 With median splits, don’t store the R offsets...... 141

4.4.10 Consider transposing the data structures...... 142

4.4.11 Consider using a smaller data type...... 142

4.4.12 Consider bit-packing the splitting value and dimension...... 143

4.4.13 Consider throwing away the data...... 143

4.5 Speed Comparison ...... 144

4.6 Conclusion ...... 147

5 Conclusion 148

5.1 Contributions ...... 149

5.2 Future work ...... 150

5.2.1 Tuning of the Astrometry.net system ...... 150

5.2.2 Additions to the Astrometry.net system for practical recognition

of astronomical images ...... 152

5.2.3 Other kinds of calibration ...... 155

5.2.4 Using heterogeneous images for science ...... 156

Bibliography 157

Chapter 1

Introduction


1.1 Preface

1.1 This thesis describes some of the research carried out by me and my colleagues in the Astrometry.net group. Astrometry.net started as a collaboration between Sam Roweis, a University of Toronto computer scientist, and David W. Hogg, a New York University astronomer. An idea that began as a crazy scrawl on the back of a napkin—probably in a bar—eventually grew into a real project and began attracting collaborators and students. Fast-forwarding a few years, we have achieved a rare feat: a computer vision software system that works, with little human intervention, and is being used by hundreds of real astronomers doing real research. We have also branched out and explored a number of other promising areas where ideas in computer vision and machine learning can be applied to astronomical data to great benefit.

1.2 This thesis is quite blatantly a collection of manuscripts. I have done a bit more than staple them together to produce the chapters of this thesis, but the original tone and style of each manuscript still shines through. I hope my readers will find the chapter-to-chapter variations a refreshing, rather than jarring, change of pace. One advantage of collecting manuscripts into a thesis is that each chapter should largely stand alone. Another advantage is that each manuscript was prepared with a list of authors, so it is simpler to identify the collaborators who have contributed to each chapter.

1.2 Introduction

1.3 The general idea of the Astrometry.net collaboration is to apply ideas in computer vision and machine learning to problems and data in astronomy. Astronomical images are a good domain in which to explore ideas in computer vision, because the problems to be solved are often much more constrained than in general imaging, the telescopes and detectors are often well-understood and calibrated, large volumes of data and “ground truth” information exist, and with new telescopes constantly coming online and producing greater and greater volumes of imaging, there is a need for sophisticated computation. I briefly outline astronomical imaging for computer vision researchers in section 1.3.

1.4 The specific idea addressed in this thesis is that pattern recognition approaches can be applied to astronomical images. The goal is to produce a system that is able to automatically recognize the stars in any image of the sky, using the information in the image pixels alone. Recognizing the stars in an image is equivalent to finding the pointing, scale, and rotation of the image on the sky in a standard coordinate system. The branch of astronomy concerned with measuring the positions and motions of celestial bodies is called astrometry, and the task of placing a new image in a standard reference frame is known as “calibrating the astrometry” of the image. The blind astrometric calibration task is to calibrate an image using only the image itself. The broad goal of the Astrometry.net project was to build a system that would allow us to create correct, standards-compliant astrometric meta-data for every useful astronomical image ever taken, past and future, in any state of archival disarray. This is part of a larger effort to organize, annotate and make searchable all the world’s astronomical information.

1.5 After a brief introduction to astronomical imaging for computer scientists, the remainder of this chapter reviews the area of pattern recognition, and in particular the framework of geometric hashing for object recognition in images. One of the contributions of this thesis is to replace the simple hash table used in traditional geometric hashing with a kd-tree, so I review related work in hashing and other approaches for fast feature matching. Finally, I present the blind astrometric calibration task as an instance of object recognition, and review some previous approaches to the problem.

1.6 The remaining chapters present our approach to the blind astrometric calibration problem. Chapter 2 explains our approach and presents the results of large-scale tests of the system on real-world data. Chapters 3 and 4 delve into details of the approach: Chapter 3 presents the Bayesian decision theory problem that lies at the heart of our approach, while chapter 4 explains the technical details of the kd-tree data structure implementation that is key to making our system fast enough to be a practical tool. Chapter 5 summarizes our results.

1.3 Astronomical imaging for computer scientists

1.7 Astronomical images are very different from images produced by typical consumer-grade cameras. This section is intended to give non-astronomers a sense of the typical parameters of contemporary astronomical imaging setups, and introduce some of the terminology and key ideas.

1.8 The Sloan Digital Sky Survey (SDSS) [69, 29] is one of the most important projects in contemporary astronomy. The primary imaging survey was carried out from 2000 through 2008, so the hardware is no longer cutting-edge, but it is still very impressive. The telescope’s primary mirror is 2.5 meters in diameter. Incoming light first strikes the primary mirror, then a one-meter secondary mirror, then passes through two corrective lenses in front of the camera. The nearly distortion-free field of view is about 3 degrees wide. The camera is composed of 30 charge-coupled devices (CCDs), each 2048 × 2048 pixels, arranged in six columns. The five CCDs in each column have different bandpass filters, ranging from the near-infrared (913 nm) through the optical to the near-ultraviolet (354 nm). The CCDs and associated electronics are cooled with liquid nitrogen to −80°C and held under vacuum. Each CCD is about 5 cm square, and the whole focal plane of the camera is about 65 cm in diameter. The camera assembly has a mass of over 300 kg.

1.9 The survey operates in “drift-scanning” mode: the telescope is pointed in a nearly fixed direction, and as the Earth rotates the sky drifts past the camera. The camera electronics shift the electrons along the CCD columns at the same rate. As an object drifts through the field of view, the electrons it produces in the CCD pixels march along with it, and are read out as it disappears from view. In the SDSS camera, the object drifts past each of the five bandpass-filtered CCDs of the camera in turn, producing a five-color image. Over the course of a night of observing, the camera sweeps out a great circle, creating six columns of five-band images that are 2048 pixels wide and over a million pixels long: over 120 gigabytes of pixels per night. This clever drift-scanning scheme has many advantages, one being that the readout rate of the CCDs can be relatively slow, resulting in low readout noise. Calibration of the camera is simpler because charge is integrated over columns of the CCD: it is only necessary to calibrate whole columns rather than individual pixels. Note that for this scheme to work, the CCDs must be very carefully aligned in the focal plane, and the telescope control system must be capable of keeping the system pointed stably at the level of a pixel or better. The moving mass of the telescope is over 15,000 kg, so this is no mean feat!
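As a rough consistency check of that data rate (my arithmetic, assuming roughly two bytes per pixel and a strip length of about a million pixels):

\[
6 \ \text{columns} \times 5 \ \text{bands} \times 2048 \times 10^{6} \ \text{pixels} \approx 6.1 \times 10^{10} \ \text{pixels}
\;\times\; 2 \ \text{bytes/pixel} \;\approx\; 120 \ \text{GB per night}.
\]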

1.10 Astronomical imaging systems are designed so that the point-spread function (PSF)—the image on the CCD of a distant point source of light—is spread over several pixels. This allows the position of a point source to be measured to sub-pixel accuracy by fitting a model of the PSF to the observed pixel values. If the system focused a point source onto a single pixel, this would be impossible. The goal is typically for the full-width at half-maximum (FWHM) of the PSF to be a few pixels, so that the PSF is well-sampled but the signal is not spread out over too many pixels.

1.11 For ground-based telescopes, the moving atmosphere causes the image of a distant point source of light to dance around on the CCD; in typical exposures of many seconds this “speckling” is integrated into a point-spread function. The size of this PSF is called the “seeing” and a value of one arcsecond FWHM is considered quite good. The SDSS camera has pixels of size 0.396 arcseconds, and the point-spread function is dominated by the seeing: the optics of the system are good enough that the images are as clear as they can be given the atmosphere.

1.12 The positions of astronomical objects are usually measured in an equatorial coordinate system: essentially the projection of the Earth’s latitudes and longitudes onto the celestial sphere at a specific instant. The latitudinal direction is called “declination” and abbreviated “Dec”, and the longitudinal direction is called “right ascension” and abbreviated “RA”. Declination is typically measured in degrees from −90 to +90, and right ascension is typically measured in degrees from 0 to 360, or hours from 0 to 24. There are 60 arcminutes per degree and 60 arcseconds per arcminute. For reference, the Moon as seen from Earth is about half a degree in diameter. The 0.396 arcsecond pixels of the SDSS camera view an area the size of a penny at a distance of ten kilometers; each 2048 × 2048 pixel CCD views an area of about one-millionth of the sky.
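The penny comparison checks out (my back-of-the-envelope calculation; one arcsecond is 1/206265 of a radian, and a penny is about 1.9 cm across):

\[
0.396'' \times \frac{1 \ \text{rad}}{206265''} \approx 1.9 \times 10^{-6} \ \text{rad},
\qquad
1.9 \times 10^{-6} \times 10 \ \text{km} \approx 1.9 \ \text{cm}.
\]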

1.13 The SDSS is very impressive, but the next generation of telescopes currently in development dwarfs its specifications. The Large Synoptic Survey Telescope (LSST) [35], for example, has an 8.4 meter primary mirror, with a 3.5 degree field of view almost completely covered by an array of nearly 200 CCDs, each with 4096 × 4096 pixels: a total of about 3.2 gigapixels. The camera is the size of a small car. It will take exposures of about 15 seconds, yielding over 10 terabytes of data per night.

1.4 Pattern recognition

1.14 Pattern recognition encompasses a huge class of problems and techniques in computer vision and machine learning. Many pattern recognition problems are of the type “identify this object” or “find things like this”: the system is first told about a large set of objects, then is presented with a novel object and is asked to describe it or find similar objects from the set of known objects.

1.15 Pattern recognition systems often take a feature matching approach. A feature is a compact description of (problem-specific) relevant, invariant properties of an object, such that objects that are similar in the problem domain have similar features. Defining how features are extracted from objects and defining the similarity between features are key factors in the success of such systems. Often the representation of a feature (the feature descriptor) is a point in a real coordinate space, and a distance metric defines the similarity between features.

1.16 Feature-based approaches typically involve two phases: an indexing phase, in which the set of known objects is shown to the system, and a test phase, in which novel objects are presented. During indexing, features are extracted from the set of known objects. At test time, features are extracted from the novel object and compared with the set of features from the known objects. In many domains, the set of known objects is very large, so it is critical that features be stored in a manner that allows the system to rapidly find known features that match a given feature.

1.17 Feature-based approaches often suffer from feature aliasing problems: two objects that are dissimilar in the problem domain may yield similar features. In this case, systems may use voting or verification schemes to ensure that aliased (false) feature matches are eliminated. Voting schemes typically assume that false feature matches are independently randomly distributed, so the probability of matching several features to the same incorrect object is the product of the individual probabilities; if we demand a sufficient number of matches, the probability of a false match becomes small. Verification or hypothesize-and-test schemes treat a feature match as a hypothesis that the novel object is similar to a known object, and then apply some test to decide whether the hypothesis is correct. Such schemes can be seen as using feature matching as a search heuristic.
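To make the voting argument concrete (an invented illustration, with made-up numbers): if each extracted feature falsely matches a given incorrect object with probability p, independently, then k such matches occur together with probability

\[
P(k \ \text{false matches to one object}) = p^{k};
\qquad
p = 10^{-2},\ k = 3 \ \Longrightarrow\ p^{3} = 10^{-6}.
\]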

1.5 Visual pattern recognition

1.18 Pattern recognition in images is a particularly challenging and complex problem domain which has been studied extensively. Objects to recognize include faces, fingerprints, specific physical objects, or even general classes of physical objects such as mugs, chairs, or elephants.

1.19 Visual pattern recognition of general physical (3-dimensional) objects is difficult for many reasons: objects may be deformable, can appear different when viewed from different angles or under different lighting conditions, and can be occluded by other objects. Features used in visual pattern recognition include edges (lines and curves), corners, or the color or texture of image patches.

1.20 An important consideration when designing features for visual pattern recognition is the size (i.e., spatial extent in the image) of the feature. Local features gather information from a small region of the image, while global features incorporate information from the whole image. Local features can provide robustness to occlusion and deformation: if part of the object is hidden, features belonging to that part will be lost, but the features belonging to the remaining parts will still be found. Since local features aggregate information from only a small portion of the image, they may be more noisy or less distinctive than global features.

1.21 Since a single local feature may be insufficient to guarantee that an object has been correctly recognized, some systems build high-level features by combining multiple low-level features. When the objects to be recognized are rigid or articulated (composed of multiple rigid parts connected by simple joints), a common approach is to build features that describe the relative geometry of low-level features. This use of geometric features is often called geometric hashing, since the result is a feature descriptor or hash code which describes the geometric arrangement of the low-level features.

1.6 The geometric hashing framework

1.22 Lamdan et al. [42] and Wolfson and Rigoutsos [67] give an excellent overview of the geometric hashing approach. See Figure 1.1 for a block-diagram overview. Geometric hashing is based on the fact that when rigid objects undergo similarity transformations (translation, rotation, and scaling), angles between points on the object and relative distances between points do not change. Therefore, if we describe the relative arrangement of points on an object in terms of these invariants, our description will be invariant under these transformations of the object. Invariant geometric properties can be found for other types of transformations such as affine or perspective projection.

Figure 1.1: Outline of the Geometric Hashing scheme (preprocessing: object acquisition, interest point extraction, basis choice, and hash table construction; recognition: interest point extraction, voting for (object, basis) pairs, least-squares matching, and verification). This diagram is a reproduction of figure 2 in Lamdan et al. [42].

1.23 Geometric hashing builds geometric features out of lower-level features. We will call the low-level features interest points for clarity.

1.24 Given the types of transformations the system must handle, there is some sufficient number of interest points required to define an invariant local coordinate system; these are called the basis set. For example, for 2-D objects that can undergo similarity transformations, two interest points can be used to define a coordinate system. For 3-D objects with similarity transformations, three (non-collinear) interest points are required.

1.25 Geometric hashing follows an indexing approach. During the indexing phase, the system preprocesses a database of objects which are to be recognized. For each object, many interest points are extracted which are invariant under the transformations that are expected, and can be well localized. The system then enumerates all basis sets by looking at all combinations of interest points; each set of basis features defines a relative coordinate system. For each basis set, all remaining interest points are enumerated and their positions described in this coordinate system. Each relative coordinate vector forms a geometric feature descriptor. In this way, the relative geometric arrangement of a set of interest points is converted into a point in a vector space.

1.26 In the standard geometric hashing approach, the geometric feature descriptor is discretized and encoded as a hash table key. A grid is placed over the feature descriptor space, and the hash table key of a geometric feature is the identifier of the grid cell containing it. Each entry in the hash table maps back to the basis set plus the feature whose position is encoded.
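A minimal sketch of this indexing step for 2-D objects under similarity transformations (my own illustration: the two-point basis convention, grid cell size, and table layout are arbitrary choices, not the specific scheme of Lamdan et al.):

```python
import itertools
import math

def basis_transform(a, b):
    """Local coordinate system defined by an ordered basis pair: a maps to
    (0, 0) and b maps to (1, 0), so the relative coordinates of any third
    point are invariant to translation, rotation, and scale."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm2 = dx * dx + dy * dy
    def to_local(p):
        px, py = p[0] - a[0], p[1] - a[1]
        # Complex division (px + i py) / (dx + i dy).
        return ((px * dx + py * dy) / norm2,
                (py * dx - px * dy) / norm2)
    return to_local

def grid_key(coord, cell=0.05):
    # Hash-table key: identifier of the grid cell containing the descriptor.
    return (math.floor(coord[0] / cell), math.floor(coord[1] / cell))

def index_object(obj_id, points, table):
    """Indexing phase: for every ordered basis pair, describe every other
    interest point in that basis and record it in the hash table."""
    for i, j in itertools.permutations(range(len(points)), 2):
        to_local = basis_transform(points[i], points[j])
        for k, p in enumerate(points):
            if k not in (i, j):
                table.setdefault(grid_key(to_local(p)), []).append((obj_id, i, j, k))
```

At recognition time, the same enumeration is run on the test image, and every table entry retrieved for a matching key is a vote for an (object, basis) correspondence, as the next paragraph describes.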

1.27 Given a new image, a geometric hashing system extracts all interest points from the image and, as at indexing time, proceeds to enumerate all basis sets and compute the positions of the remaining interest points in each basis set. This generates many geometric features whose descriptors are discretized and converted to hash table keys, as before. Similar arrangements of interest points will lead to similar geometric feature descriptors, which—with luck—will be mapped to the same hash table key. The hash table values associated with each key are retrieved, and each value that is found is considered a match. Each match votes for the correspondence between the interest points from the database and the interest points from the image that were used to define the geometric feature.

1.28 After all features from the image have been processed, many votes will have accumulated. False matches (due to aliasing or hash table collisions) should be uniformly distributed over the set of known objects, so should not generate very many votes for any particular object. Objects that are actually present in the image should generate many votes. Objects that receive an insufficient number of votes are rejected, and the remainder are subjected to a verification process.

1.29 The verification process requires computing a transformation from object to image coordinates, using the corresponding interest points from the known object and the image. Using this transformation, all the interest points belonging to the known object are projected into image coordinates. If the match is correct, we expect the image to contain features at these locations. Some researchers add an extra verification stage by projecting the known object into the image and checking that the predicted edges of the object are found in the image (eg, [42, 34]).

1.30 The geometric hashing approach is flexible and applicable to many problems, but there are some issues with the basic version described here. The hash function described above simply places a grid over the relative coordinate system and returns the identifier of the grid cell in which the interest point lands. This is prone to edge effects: if an interest point is near the edge of a grid cell, a small change in its position due to noise can move it across the edge into a different grid cell, which causes the hash code to change and a different set of feature matches to be found. This problem can be overcome by detecting interest points that are near the edges of their cells and also searching the hash table using the hash codes of neighbouring cells. This results in more false feature matches, and can become expensive as the dimensionality of the geometric feature space increases (there are more edges, and the proportion of the hypervolume of feature space that is near an edge increases). Using a data structure other than the hash table may be beneficial.

1.31 Another issue with uniform grid-based hashing is that it assigns equal importance (and storage space) to each grid cell in the feature space. Consider an interest point that is far from the basis set that defines its local coordinate system. Small positional errors in the basis set can cause the coordinate system to be rotated or scaled, which results in large changes in the interest point’s relative coordinates. This means that edge effects become more pronounced in such regions of the feature space. One response to this concern is to increase the size of the grid cells far from the basis set, but then if the interest points are uniformly distributed, these larger hash table bins will contain more features, so every match to these bins will be accompanied by more false positives.

1.32 Finally, a hash table with equal-sized cells will only be uniformly utilized if interest points are uniformly distributed in feature space, but some domains may yield interest points that are far from uniformly distributed. In other domains, it may be beneficial to place constraints on the geometric features that are used (for example, to avoid regions that are prone to large errors, or that are relatively indistinctive).

1.33 There is nothing in the geometric hashing framework that requires a standard hash table to be used for feature matching. Other methods of feature matching are described in the next section.

1.7 Related work in fast feature matching

1.34 For feature matching in a geometric hashing framework, the database of known objects is converted into a set of points in a feature descriptor space, which is a vector space. To find matches to a new feature (extracted from an image in which we want to recognize objects), we need to find all points in the vector space that are within some tolerance. Usually the Euclidean distance metric is applied, so the tolerance is a radius in the vector space.

1.7.1 Bloom filters

1.35 A Bloom filter [12] is a data structure that can answer queries of the form “is feature q a member of the set of known features?” A Bloom filter is a set of n bits, initially zero. A set of k hash functions must be defined which take a feature as input and produce an integer in [0, n) as output. In the indexing phase, we examine each of the known features, and apply each of the hash functions, producing k hash codes. Each code is used to reference a bit and set it (i.e., turn it on). Each known feature results in k bits being set; several features may request that a particular bit be set.

1.36 Given a new feature to match, we run the k hash functions on it and test whether each of the resulting bits is set. If the feature is equal to a feature seen at indexing time (according to the hash functions), then all k of the bits will have been set; thus the Bloom filter does not produce false negatives. If the query feature is not equal to a known feature, then some of the bits may have been set due to collision of the hash functions, but the feature will only be accepted if all k hash functions result in collision; this results in a false positive. The rate of false positives increases as the number of known features increases relative to the number of bits in the filter. The Bloom filter is ideally suited to cases where the number of known features is small, and a verification step can be applied to eliminate any false positives that are produced.
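A compact sketch of this structure (my illustration, assuming seeded SHA-256 digests as the k hash functions; any family of independent hashes mapping features into [0, n) would do):

```python
import hashlib

class BloomFilter:
    def __init__(self, n_bits, k_hashes):
        self.n_bits = n_bits
        self.k_hashes = k_hashes
        self.bits = bytearray((n_bits + 7) // 8)

    def _hash(self, feature, seed):
        digest = hashlib.sha256(f"{seed}:{feature}".encode()).digest()
        return int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, feature):
        # Indexing: set the k bits chosen by the hash functions.
        for seed in range(self.k_hashes):
            i = self._hash(feature, seed)
            self.bits[i // 8] |= 1 << (i % 8)

    def __contains__(self, feature):
        # Every added feature passes (no false negatives); a feature never
        # added passes only if all k bits collide (a false positive).
        for seed in range(self.k_hashes):
            i = self._hash(feature, seed)
            if not (self.bits[i // 8] >> (i % 8)) & 1:
                return False
        return True
```

For example, after `bf = BloomFilter(1 << 20, 5)` and `bf.add("feature-42")`, the test `"feature-42" in bf` is guaranteed true, while unseen features are true only with small probability.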

1.37 For feature matching, we want to know not only whether the query feature q matches a known feature, but which known feature matches. One way to answer such queries is to use a Bloomier filter [14]. The simplest Bloomier filter uses multiple Bloom filters to categorize a given feature into two different categories, and uses slightly more space than two Bloom filters. A set of n such Bloomier filters can be used to categorize a feature into 2^n different categories by assigning one Bloomier filter to each bit of the result.

1.38 The standard Bloom filter only finds exact matches, but we want to find nearby matches. An extension of Bloom filters to allow proximity searches is given by Kirsch and Mitzenmacher [40]. This variant uses locality sensitive hashing (LSH, discussed below) as its hashing function, so it inherits some limitations from LSH. Most importantly, LSH only guarantees with some probability that nearby features will hash to the same bin. As a result, the locality-sensitive Bloom filter can produce false negatives as well as false positives.

1.39 In order to answer feature matching queries (find all known features near a given query feature), one could combine these Bloom filter extensions to build a locality-sensitive Bloomier filter. However, the false negative rate of the locality-sensitive filter would be magnified because the Bloomier filter requires 2⌈log(n)⌉ separate filters to represent n known features, and all of these filters must produce the correct answer. It seems that Bloom filters are not particularly well suited to this task.

1.7.2 Locality Sensitive Hashing

1.40 Locality Sensitive Hashing (LSH) [16, 25, 3] is a technique for answering approximate nearest neighbour queries: it returns a feature whose distance is within a constant factor of the distance to the nearest neighbour. The LSH scheme can be extended to handle exact nearest-neighbour queries efficiently if the set of known features satisfies some bounded growth criteria. It is intended for use in high-dimensional spaces (tens to thousands of dimensions), and ℓp distance metrics for p ∈ (0, 2].

1.41 The core idea of LSH is to use several hash functions with the property that the probability of collision—features being placed in the same bin—is much higher for nearby features than for distant ones. In the indexing stage, we populate the hash table by running each hash function on each known feature. To perform a query, we run the hash functions on the query feature and retrieve the features in the same hash table bin as the query. One of these features is likely to be an approximate nearest neighbour of the query.

1.42 In practice, in [16] each hash function is composed of several (typically 10) simpler hash functions. Each simple hash function performs a random projection in the feature space: the hash value is the discretized value of the dot product of the feature vector with the hash function’s random vector. Combining several of these simple hash functions is equivalent to projecting the feature vectors onto a grid in a 10-dimensional, non-orthogonal, subspace. The hash code is simply the identifier of the grid cell in which the feature falls.
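A sketch of this construction (my illustration; the parameters k, w, and the number of tables are placeholders following the "typically 10" figure above and the multi-table variance-reduction trick described below):

```python
import random

class CompositeHash:
    """One LSH hash function: k random projections, each discretized at
    width w; the k-tuple of cell indices names a cell of a (generally
    non-orthogonal) grid in a k-dimensional subspace."""
    def __init__(self, dim, k=10, w=1.0):
        self.w = w
        self.dirs = [[random.gauss(0.0, 1.0) for _ in range(dim)]
                     for _ in range(k)]
        self.offsets = [random.uniform(0.0, w) for _ in range(k)]

    def key(self, x):
        return tuple(
            int((sum(d * xi for d, xi in zip(direction, x)) + off) // self.w)
            for direction, off in zip(self.dirs, self.offsets))

def build_tables(features, dim, n_tables=30, k=10, w=1.0):
    # Indexing stage: each table uses an independent composite hash.
    tables = []
    for _ in range(n_tables):
        h = CompositeHash(dim, k, w)
        table = {}
        for idx, f in enumerate(features):
            table.setdefault(h.key(f), []).append(idx)
        tables.append((h, table))
    return tables

def query(tables, q):
    """Union of all bins the query lands in; this candidate set is likely
    to contain an approximate nearest neighbour of q."""
    return {idx for h, table in tables for idx in table.get(h.key(q), [])}
```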

1.43 This scheme suffers from edge effects: a query feature may land in a bin other than the bin containing its nearest neighbour due to small positional noise in any of the dimensions. Attempting to reduce this effect by making the bin size (discretization level) larger fails because the number of hash functions must be increased to maintain the same total number of hash bins.

1.44 In order to reduce the number of false negatives, we can choose many (typically 30) different hash functions and place a reference to each feature in the hash table bin chosen by each function. This increases the probability that at least one of the hash functions will select a 10-dimensional subspace in which the query is close to its nearest neighbour.

1.45 The standard LSH scheme does not seem particularly well suited for the purpose of feature matching in a geometric hashing system, since the dimensionality of the feature vectors is usually moderate (2 to 6 in the Astrometry.net system). Since some feature aliasing is expected in geometric hashing systems, the nearest neighbour may not be the correct match: we would prefer to get all neighbours within a given search radius.

Figure 1.2: A kd-tree partitioning of space. Each panel shows the nodes at one level of the tree. The points are 200 samples from a ring with uniformly-distributed radius in (0.8, 1) and uniformly-distributed angle in (0, π/2). Nodes are split along the dimension with the largest range, and the splitting position is chosen so that the two child nodes will contain an equal number of points. Note how the tree adapts to the distribution of the data: each node tightly bounds the data points it owns. This is a slightly different version of the kd-tree than Bentley’s original description.

1.7.3 Kd-trees

1.46 The k-dimensional tree (kd-tree) was introduced by Bentley in 1975 [9], and several extensions have been described [24, 60]. The idea is to build a binary tree over the feature space, where each node “owns” a region of the space which is disjoint from the region owned by its sibling node. Each non-leaf node has an axis-aligned splitting hyper-plane; its left child owns the subspace to the left of the splitting plane, and the right child the subspace to the right. In this way, the kd-tree defines a hierarchical subdivision of the feature space into axis-aligned hyper-rectangles. See figure 1.2.

1.47 For the purposes of indexing the features in a geometric hashing system, the set of features is known beforehand and is static, so we do not need to insert or delete features from the tree, and Bentley’s optimized kd-tree construction method can be used. That is, the tree can be built to adapt to the particular set of features we have.

1.48 Tree construction proceeds by first assigning all features to the root node, then recursively splitting nodes until each leaf node owns at most L features, with L perhaps 10 or 20. Splitting is done by selecting a splitting plane and using it to partition the features. Sproull [60] enumerates several strategies for choosing the splitting hyperplane. Traditionally, kd-trees use axis-aligned splitting hyperplanes, and the splitting dimension is chosen by simply iterating through the dimensions (as in Bentley’s original paper), or by selecting the dimension with the largest range or variance. One can also use an arbitrary splitting hyperplane, for example by finding the principal eigenvector of the covariance matrix of the feature vectors. Typically the splitting hyperplane is positioned so that the two children own an equal number of data points. This leads to a balanced (and therefore short) tree, but in some applications it can be beneficial to bias the splitting location [47].
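A sketch of this construction under one set of those choices (largest-range splitting dimension, equal-count splits, leaf size L = 10; my illustration, not the optimized implementation of chapter 4):

```python
class KDNode:
    def __init__(self, points):
        # Tight axis-aligned bounding box of the points this node owns.
        self.bbox = tuple(
            (min(p[d] for p in points), max(p[d] for p in points))
            for d in range(len(points[0])))
        self.points = points           # kept only at leaves
        self.left = self.right = None

def build(points, leaf_size=10):
    node = KDNode(points)
    if len(points) > leaf_size:
        # Split along the dimension with the largest range, at the median,
        # so both children own an equal number of points (balanced tree).
        widths = [hi - lo for lo, hi in node.bbox]
        d = widths.index(max(widths))
        pts = sorted(points, key=lambda p: p[d])
        mid = len(pts) // 2
        node.left = build(pts[:mid], leaf_size)
        node.right = build(pts[mid:], leaf_size)
        node.points = None
    return node
```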

1.49 Depending on the application, it can be beneficial to store at each node the tight axis-aligned bounding box (and other summary statistics [19]) of the points owned by that node (as shown in figure 1.2). This is particularly useful where there is significant correlation between the dimensions of the data, because splitting the data points along one dimension implies that the bounding box will shrink in other dimensions as well, and this can result in faster searches. If the bounding boxes are stored, then it is no longer strictly necessary to store the splitting hyperplane, though it may be useful to speed up some computations.

1.50 For feature matching, we typically want to do range search: find all features within radius r of a given query feature q. This is easily achieved with a kd-tree using a recursive algorithm. Starting at the root, we must decide whether we need to recurse on each of the node’s children. (How we make this decision is explained below.) Once we reach a leaf, we enumerate the features owned by the leaf and compute the distance to the query feature; any feature with distance less than r is accepted.

1.51 We decide whether we must recurse on a node’s children based on their bounding boxes. We compute the mindist—the lower bound of the distance—between the query point and each child’s bounding box. The mindist is simply the distance to the corner or edge of the bounding box, and can be computed quickly. Any node whose mindist to the query point is less than the query radius r must be visited. If the kd-tree does not explicitly store the tight bounding boxes of the nodes, loose bounding boxes can be constructed during the recursion based on the splitting planes of the ancestors.
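Continuing the sketch above, the mindist bound and the recursive range search might look like this (again an illustration under the same assumptions, not the chapter 4 implementation):

```python
def mindist2(q, bbox):
    """Squared lower bound on the distance from q to any point inside the
    axis-aligned box: the squared distance to its nearest face or corner."""
    return sum(max(lo - x, 0.0, x - hi) ** 2 for x, (lo, hi) in zip(q, bbox))

def range_search(node, q, r, found=None):
    if found is None:
        found = []
    if mindist2(q, node.bbox) > r * r:
        return found                    # prune: box cannot contain a match
    if node.points is not None:         # leaf: test each owned point
        found.extend(p for p in node.points
                     if sum((a - b) ** 2 for a, b in zip(q, p)) <= r * r)
    else:
        range_search(node.left, q, r, found)
        range_search(node.right, q, r, found)
    return found
```

Usage: `tree = build(features); matches = range_search(tree, query, 0.1)`.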

1.52 Chapter 4 presents the kd-tree in more detail.

1.7.4 Other approaches

1.53 In contrast to the kd-tree’s rectangular decomposition of space, there is a family of space decompositions that use hyperspheres. An example is the ball-tree [64], in which each non-leaf node is associated with one of the known features f and a radius r; the left child owns all features within radius r of feature f, and the right child owns the remaining features. Similarly, in the anchors hierarchy [53], each non-leaf node is represented by a feature (called its anchor), and a node owns all the features that are closer to its anchor than the anchor of its sibling. These sphere-based trees are supposed to better capture the structure of high-dimensional data sets, though the results seem to depend quite strongly on the particular data distributions and search parameters.

1.54 For two-dimensional feature spaces, algorithms based on planar point location (eg, [39]) and the Voronoi tessellation of space yield O(n log n) preprocessing time, a data structure of size O(n), and query time of O(log n). Unfortunately, in higher dimensions the space required to store the Voronoi tessellation is O(n^⌊d/2⌋), which quickly becomes untenable [4].

1.55 The design of spatial indexing data structures and search algorithms is itself a large research area. Dozens of different tree structures have been proposed: in two surveys, Böhm [13] and Hjaltason [31] identify B-, B+-, ball-, bisector-, BSP-, DABS-, fq-, gh-, GNA-, hB-, hBπ-, hybrid-, IQ-, kd-, kd-B-, LSDh-, M-, mb-, mb∗-, mvp-, oct-, post-office-, pyramid-, quad-, R-, R∗-, R+-, sa-, slim-, sphere-, SR-, SS-, TV-, vp-, vp∗-, and X-trees. There is an equally mind-boggling variety of exact and approximate search algorithms. Each data structure and algorithm may work well in some region of parameter space, but there is no clear winner for general-purpose use.

1.8 Astrometric calibration as a pattern recognition task

1.56 For modern astronomers, astrometric calibration is often one of the first steps toward getting useful information out of an image of the sky. Aligning a new image with an astrometric reference catalog allows the astronomer to place the image within a standard coordinate frame. This allows stars, galaxies, and other objects (sources) in the new image to be identified with known sources, which in turn allows astronomers to calibrate other properties of the new image, and allows the positions of new sources to be described in a standard reference frame.

Figure 1.3: Top: Input image (credit: Sloan Digital Sky Survey). Middle: The brightest 100 sources extracted from the image. Bottom: Reference sources, transformed into the image coordinate system (crosshairs). Many of the image and reference sources are aligned, but there are many image sources without reference sources. Our system knows about the positions of many objects of interest on the sky, and has labelled the galaxy NGC 3521.

Figure 1.4: The location of the input image on the sky. Left: The whole sky, in Mercator projection. The sinusoid-shaped feature is the Milky Way. The dashed box shows the zoomed-in region. Middle: Zoomed in by a factor of 10. Right: Zoomed in by a factor of 100. The box shows the outline of the input image. These images are renderings of stars from the Tycho-2 catalog [32].

1.57 The task of blind astrometric calibration—automatically finding the astrometric calibration of an image, using only the information in the image pixels—can be seen as a pattern recognition problem. As Bertin [10] notes, “astrometric and photometric calibrations have remained the most tiresome step in the reduction of large imaging surveys,” so this is not only an interesting problem to solve, but one with practical implications for astronomers.

1.58 For the purposes of astrometric calibration, we can think of the sky as a large two-dimensional surface: the stars are very distant, so our viewpoint is effectively fixed. We are moving, as are the stars, but these motions are small relative to the precision at which we typically work. The sky contains many stars, galaxies, and other astronomical sources. The stars and distant galaxies are effectively point sources, while closer galaxies can be resolved. Astrometric reference catalogs list the positions, motions, and brightnesses of these sources and serve as the “ground truth” or database of known (reference) objects. The USNO-B1 catalog [52, 70], for example, lists over one billion objects. As many as a few percent of these are false detections or other artifacts [7], and some objects that should be visible are missing.

1.59 Extra sources can be due to asteroids, comets, satellites, or aircraft. Missing or poorly localized sources can be due to imperfections in the imaging sensor, saturation, interference, or (rarely) occlusion. Errors in the image processing that detects sources can lead to sources being gained or lost. We call sources that appear in only the input image or reference catalog distractors and dropouts, respectively. The existence of distractors and dropouts means that we can never assume that all the objects in the reference catalog will be contained in an image to be recognized, or vice versa.

1.60 The images to be recognized are subregions of the sky. Image sizes range from nearly half the celestial sphere down to 10^−7 of its area and smaller. The input images measure unknown bands of the electromagnetic spectrum, and various nonlinear functions may have been applied to the pixel values. We cannot rely on absolute brightness or color to recognize individual stars or galaxies. At best we can hope that there is some positive correlation in the relative brightness ordering of objects in the image and the corresponding objects in our catalog.

1.61 Blind astrometric calibration is an ideal task for exploring geometric ideas in pattern recognition. Most celestial objects are effectively point sources, and can be found and localized to sub-pixel accuracy using relatively simple image-processing procedures. But since the individual features are characterized only by their positions and brightnesses, we must examine collections of features in order to build distinctive patterns. In chapter 2 we present Astrometry.net, which applies the geometric hashing framework to the task of blind astrometric calibration. An example of our results is shown in figures 1.3 and 1.4.

1.9 Related work in astrometric calibration

1.62 There seem to be two distinct groups of researchers who have worked on astrometric calibration. The first are professional astronomers, whose images are typically of small angular extent, long exposure time, and high quality. They typically assume that there is a good initial guess of the astrometric calibration, and the goal is to produce a very accurate calibration, including image distortion. The second group of researchers are spacecraft engineers who want to use the stars to estimate the attitude of a camera attached to a spacecraft. Here the images are of wide angular extent, have short exposure time, and are very noisy. Primary concerns include weight, power consumption, and robustness (especially in avoiding false positives), while a high degree of accuracy is neither required nor possible given the hardware.

1.9.1 Non-blind astrometric calibration

1.63 Non-blind astrometric calibration requires that an initial estimate of the calibration be provided. Groth [28] presents an algorithm for matching two lists of coordinates (eg, image coordinates in pixels and reference star coordinates on the celestial sphere), assuming that the lists contain a significant proportion of objects in common. With the limited computing resources available at the time, he suggests taking the brightest 20 or 30 objects in each list.

1.64 Once the two lists of objects have been compiled and the brightest objects selected, all sets of three objects are enumerated and the triangles they form are described by a scale-, rotation-, and translation-invariant descriptor, composed of the ratio of the longest to shortest edge lengths, plus the cosine of the angle between these edges and the sense or parity of the triangle. The tolerances associated with these features are computed (by propagating the positional errors through the feature descriptor process) and stored, along with the logarithm of the triangle’s perimeter. Triangles with a large length ratio are rejected since they are relatively indistinctive.
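A sketch of this descriptor (my reading of the construction; Groth's exact parameterization and the tolerance propagation are omitted, and max_ratio is an invented rejection threshold):

```python
import itertools
import math

def triangle_feature(a, b, c, max_ratio=10.0):
    """Similarity-invariant descriptor of one triangle: (longest/shortest
    edge ratio, cosine of the angle between those two edges, parity), plus
    the log-perimeter used later for outlier rejection. Returns None for
    elongated triangles, which are relatively indistinctive."""
    pts = [a, b, c]
    # Edges as (length, endpoint indices); any two edges share one vertex.
    edges = sorted((math.dist(pts[i], pts[j]), i, j)
                   for i, j in [(0, 1), (1, 2), (0, 2)])
    (s_len, si, sj), _, (l_len, li, lj) = edges
    if l_len / s_len > max_ratio:
        return None
    v = ({si, sj} & {li, lj}).pop()     # vertex shared by shortest and longest
    o1 = ({si, sj} - {v}).pop()         # far end of the shortest edge
    o2 = ({li, lj} - {v}).pop()         # far end of the longest edge
    u = (pts[o1][0] - pts[v][0], pts[o1][1] - pts[v][1])
    w = (pts[o2][0] - pts[v][0], pts[o2][1] - pts[v][1])
    cosine = (u[0] * w[0] + u[1] * w[1]) / (s_len * l_len)
    parity = math.copysign(1.0, u[0] * w[1] - u[1] * w[0])
    log_perim = math.log(sum(e[0] for e in edges))
    return (l_len / s_len, cosine, parity), log_perim

def all_triangle_features(points):
    feats = [triangle_feature(a, b, c)
             for a, b, c in itertools.combinations(points, 3)]
    return [f for f in feats if f is not None]
```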

1.65 After triangle features are extracted from both point lists, feature matching is performed by checking whether the distance between each pair of features is less than their corresponding tolerances. This process is accelerated by sorting the features on one of the feature dimensions. If multiple matches are found for a particular feature, only the closest is considered. After all the features have been compared, many correct matches should be found, along with some false matches. For each match, the difference of the log-perimeters of the two triangles is computed. This gives the relative scales of the two triangles and hence their coordinate frames, assuming the match is correct. False matches are rejected by iterative outlier detection: the mean difference of log-perimeters is computed and matches far from the mean are rejected.

1.66 This approach is essentially an application of the geometric hashing method, though instead of using hashing to accelerate feature matching, the features are simply kept in lists that are sorted on one dimension. Much of the subsequent work in this area follows essentially the same path (eg, [65], [21]).

1.67 Pál and Bakos [57] adapt the triangle-matching approach to images containing many more objects (of order 10^4). Since it would become prohibitively expensive to enumerate all triangles in such an image, they reduce the number of triangles created by using only the triangles created by a Delaunay triangulation. This vastly reduces the number of triangles created, but makes the algorithm more sensitive to dropout and distractor stars: a single extra star causes a completely disjoint set of triangles to be created in its neighbourhood. To compensate for this shortcoming, they define an extended Delaunay triangulation: for each point, they select all points at distance ℓ in the Delaunay triangulation, and triangulate this set. This process is repeated for each point. The first extended triangulation (ℓ = 2) skips over the nearest set of stars and builds larger triangles by using the next-further set of stars. The intent is that if some of the nearby stars are distractors, they will be ignored when the extended triangulations are used.

Figure 1.5: The triangle parameterization used by Pál and Bakos [57]. (The figure’s axes are Tx^cont and Ty^cont, each running from −1 to 1.)

1.68 Pál and Bakos introduce a new triangle parameterization which is continuous and sensitive to chirality (parity); see figure 1.5. This two-dimensional parameter space is used as the geometric feature space. When matching triangles between the two images,

they demand a symmetric point match: each triangle must be the other triangle’s nearest neighbour in feature space. Matching two images is done by creating the lists of triangles and attempting to find symmetric point matches. This process is accelerated by sorting each list by one of the coordinates. Each triangle match is considered to be a vote for the correspondence of the three pairs of points composing the triangles. These votes are accumulated in a sparse matrix where element (i, j) contains the number of votes for a correspondence between object i in the first image and object j in the second image. After all matches are considered, the top 40% of the nonzero matrix elements are assumed to contain the true correspondences. A transformation based on these correspondences is computed and the unitarity of the transformation matrix is used to judge whether the match is true or false.

1.69 A different approach is taken by Kaiser et al. [38]. They assume they are given two lists of source positions that differ by a rigid transformation involving scaling, rotation, and translation. In each image, they iterate over each pair of points and compute the vector difference of their positions. For each list, the log-length and angle of the pairwise difference vectors are accumulated in a two-dimensional histogram. Observe that if the whole list of points is scaled up by a constant factor, then the log-distance between each pair of points increases by a constant amount. Similarly, if the whole list is rotated then the angles shift by a constant amount. Once the two histograms have been computed, their cross-correlation is computed. If the two lists contain a significant number of corresponding points, the cross-correlation signal will be strongest at a shift corresponding to the difference in log-scale and rotation between the lists. This process is similar to a Hough transform [22, 5], except that instead of finding the peak of a single parameter-space histogram, we are searching for a peak in the similarity of two histograms as we shift them with respect to each other in parameter space.

1.70 Once the scaling and rotation between the lists has been found, the translation can be found by scaling and rotating one of the lists into the frame of the other list, then histogramming the vector difference between points across the two lists. This is a standard (generalized) Hough transform.
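A sketch of the first, scale-and-rotation stage (my illustration; the bin counts and log-length range are invented, and the FFT-based correlation treats the log-length axis as cyclic, which is only an approximation unless the range is padded):

```python
import itertools
import math
import numpy as np

def log_polar_histogram(points, bins=(64, 64), log_range=(-3.0, 3.0)):
    """2-D histogram of (log length, angle) over all pairwise difference
    vectors. Scaling every point shifts the log-length axis by a constant;
    rotating every point shifts the angle axis (cyclically)."""
    logr, theta = [], []
    for p, q in itertools.permutations(points, 2):
        dx, dy = q[0] - p[0], q[1] - p[1]
        logr.append(0.5 * math.log(dx * dx + dy * dy))
        theta.append(math.atan2(dy, dx))
    hist, _, _ = np.histogram2d(logr, theta, bins=bins,
                                range=(log_range, (-math.pi, math.pi)))
    return hist

def best_shift(hist_a, hist_b):
    """Peak of the circular cross-correlation of the two histograms: the
    bin offsets give the relative log-scale and rotation of the lists."""
    corr = np.fft.ifft2(np.fft.fft2(hist_a) * np.conj(np.fft.fft2(hist_b))).real
    return np.unravel_index(np.argmax(corr), corr.shape)
```

The translation stage would then histogram the cross-list vector differences after applying the recovered scale and rotation, exactly as the paragraph above describes.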

1.9.2 Blind astrometric calibration

1.71 The majority of previous work on blind astrometric calibration is motivated by the problem of spacecraft attitude estimation. Sometimes called the “lost in space” or “stellar gyroscope” problem, the task is to estimate the pose of a spacecraft by using an image of the sky captured by an onboard camera. Although similar in general spirit, the requirements and limitations of this application are quite different from astronomical applications. Mass, power consumption, and robustness of the system are primary concerns, and as a result the optical designs are very different from science-grade astronomical instruments, and the available processor and memory resources are very restricted. Typical fields of view are tens of degrees across, and the exposure time is kept short to allow the system to function while the spacecraft is rotating. As a result, image quality is typically quite poor: often only a handful of the brightest stars are visible. Since the field of view is large, a reference catalog of a few thousand stars is sufficient to ensure that any view of the sky contains many reference stars. Since the system will only process images from a single camera, the whole system can be customized and calibrated to that camera. For example, the nonlinear distortions of the optical system can be measured, and the scale and bandpass of the imaging system are known, so the reference catalog can be tailored to match.

1.72 For example, Liebe et al. [45] describe a system design with a 56 degree field of view and an exposure time of 50 ms. The resulting images contain tens of stars if the spacecraft is not rotating, but on average only three stars will be detectable when the rotation rate is 50 degrees/sec. The paper does not describe a particular algorithm for star identification, with the implication that it is not a particularly difficult problem since absolute brightness information will be available, and the total number of stars that are visible to the camera is only a few hundred.

1.73 In earlier work [44], Liebe describes a star identification system. The reference catalog

is composed of the 1539 brightest stars, with brightness calibrated to the camera used

in the system. The field of view is 30 degrees, and the system uses a feature-matching

approach, using the brightest star in the field and its two nearest neighbours to define a

geometric feature. The feature descriptor is composed of the distance from the brightest

star to its nearest and second-nearest neighbours, along with the angle between these

neighbours. Note that this feature is not scale-invariant: the system is only meant to

recognize images taken by one camera, so the scale is known. An index is constructed by

enumerating all such features that could possibly be detected, given the detection limits

of the camera. These features are coarsely quantized and stored in a table. To account

for noise in the feature descriptors, all neighbouring cells in the quantized feature space

are also stored in the table. This generates 185,000 features. At test time, the features

in the image are enumerated and the table of features is scanned; an exact feature match

in the quantized space is assumed to be correct.


Figure 1.6: A grid-based feature: two sources (labelled A and B here) are used to define

a local coordinate system which is discretized into a grid of cells. Each cell becomes a

bit in the feature descriptor; if a cell is occupied by a star its bit is set. This figure is

inspired by MacKay & Roweis [48].

1.74 This approach is a fairly straightforward geometric hashing technique, except that

it does not use hashing as such, and there is no voting or verification scheme because

feature aliasing is assumed not to happen.

1.75 In contrast to systems that build features from the precise locations of a small number

of stars, Padgett and Kreutz-Delgado [56] present an approach that incorporates infor-

mation from a large portion of the image. Their grid-based feature is defined by first

choosing one star as the reference star to define the center of the grid, then selecting

its nearest neighbour (outside an exclusion radius) to define the orientation of the grid.

Once the center and orientation are determined, a grid is defined, and each remaining

star in the image is assigned to a grid cell. The feature is a bit vector, one bit per grid

cell, where the bit is set only if the cell contains a star. See figure 1.6.
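
A minimal sketch of such a grid feature (our own reconstruction from the description above; the array shapes and the half_width parameter are assumptions, and for brevity we do not exclude the two defining stars from the grid):

    import numpy as np

    def grid_feature(stars, ref, neighbour, grid=40, half_width=1.0):
        # 'stars' is an (N, 2) array of positions; 'ref' is the reference
        # star and 'neighbour' its chosen nearest neighbour (2-vectors).
        v = neighbour - ref
        c, s = v / np.hypot(v[0], v[1])
        R = np.array([[c, s], [-s, c]])        # rotate neighbour onto +x
        local = (stars - ref) @ R.T            # positions in the grid frame
        cells = np.floor((local / half_width + 1.0) * (grid / 2)).astype(int)
        inside = np.all((cells >= 0) & (cells < grid), axis=1)
        bits = np.zeros(grid * grid, dtype=bool)
        bits[cells[inside, 0] * grid + cells[inside, 1]] = True
        return bits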

1.76 The index is constructed by scanning the sky and selecting, for each field of view, the

n brightest stars. For each of these stars a feature is computed and stored. At test time,

they examine the brightest cn stars (for some safety factor c), computing the feature for

each one and searching the index for a match. A pair of features is defined to match if their

dot product is above a threshold. This is equivalent to taking the bitwise AND of the bit vectors and counting the number of bits that are set. This search can be implemented

efficiently by using lookup tables: for each bit, they maintain a list of the features for

which that bit is set. Note that this is equivalent to using a grid-based geometric hashing

approach: each grid cell is equivalent to a discretized relative coordinate vector, which

becomes a hash key, and the “lookup table” is a hash table. After all the features in the

test image have been extracted and matched to the index, the algorithm proceeds to a

voting (or perhaps “consensus”) phase where it rejects matches that propose a field of

view that does not overlap that of the majority.
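
The dot-product search can be sketched with per-bit lookup lists (a hypothetical reconstruction of the scheme described above, with invented helper names):

    import numpy as np
    from collections import Counter, defaultdict

    def build_lookup(features):
        # For each bit, record which catalog features have that bit set.
        table = defaultdict(list)
        for fid, bits in enumerate(features):
            for b in np.flatnonzero(bits):
                table[b].append(fid)
        return table

    def dot_products(table, query_bits):
        # Each time a feature id appears in a scanned list, it shares one
        # set bit with the query, so the final count is the dot product.
        votes = Counter()
        for b in np.flatnonzero(query_bits):
            votes.update(table[b])
        return votes      # feature id -> dot product with the query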

1.77 The system is designed to operate on images of diameter 8 degrees, and performs well

on simulated data. They use a grid size of 40 × 40, and find that on average 25 grid cells

are filled. Their index contains 13,000 patterns, which means that false positives are

quite rare. However, misidentification of the nearest neighbour, or edge effects (assigning

a star to the wrong grid cell due to positional noise) mean that failures to find a match

are not uncommon, and are more likely in regions of the sky with high stellar density.

1.78 In later work, Clouse and Padgett [15] extend this approach by using Bayesian decision

theory instead of a simple threshold on the dot product to define how well features match.

This extension, along with smaller noise levels in the (simulated) imaging system, allows

them to extend the approach down to fields of view 2 degrees in diameter.

1.79 Harvey [30] presents two different approaches, one grid-based and the other shape-

based. The grid-based approach is similar to the Padgett–Kreutz-Delgado and Clouse–

Padgett approaches [56, 15]. A coarser grid is used, so more stars are likely to appear

in each bin. To compensate, a grid cell is only considered “occupied” if it contains more

than some threshold number of stars. The other major change is that he expects a test

image to be an “overexposure” or “underexposure” relative to the reference catalog. This

implies that the test image should contain either a subset or a superset of the stars in

the index, and therefore the image feature vector must be either greater than or less than

an index feature at each bit. This allows simple bit operations to be used to find feature

matches, and feature matching is performed by a linear scan through the index.

1.80 Harvey’s shape-based approach uses constrained n-star constellations in order to ag-

gregate information from a large number of stars without allowing the number of poten-

tial features to grow combinatorially. Specifically, Harvey uses a pair of stars to define

a narrow wedge in the image, then describes the relative positions and angles of a fixed

number of nearby stars within that wedge. Unfortunately, this makes the feature highly

sensitive to distractor and dropout stars, since the feature depends on the stars in the

feature being enumerated in a particular order. As a result, all potential features (allow-

ing any combination of stars to drop out) must be checked; the number of features grows

exponentially as the density of stars increases.

1.81 Harvey makes the useful observation that a cascade of indices can be built, where

each index is designed to recognize images with a particular range of scales.

1.82 MacKay and Roweis [48] point out that a grid-based feature such as that used by

Harvey leads naturally to a hashing-based strategy. Each grid cell is associated with a

bit that is turned on if the cell is occupied. This value is placed in a hash table with a

mapping back to its position on the sky. Since false positives can occur as a result of

feature aliasing or hash collision, a voting scheme is employed: several feature matches

must accumulate before the match is accepted. Since false negatives can occur as a result

of dropouts and distractors (any missing or extra star causes the hash code to change

completely), many features must be extracted for each region of the sky.

1.9.3 Fine-tuning astrometric calibrations

1.83 In order to “co-add” or “stack” astronomical images (combine pixels from differ-

ent images to produce a higher signal-to-noise image), or to do proper-motion studies

(measure the movement of stars over time), it is necessary to fine-tune the astrometric

calibration of the images. This is similar to the bundle adjustment problem in computer

vision [63], in that it involves the simultaneous optimization of the various camera and

telescope parameters and the estimated positions of objects in the world. Fine-tuning

astrometric calibrations is easier because it is essentially two-dimensional, but more dif-

ficult because the camera parameters include polynomial distortion terms to model the

image distortion introduced by telescope optics.

1.84 The software package Scamp by Bertin [10] is a popular tool used by astronomers to fine-tune simultaneously the astrometric calibrations of a large collection of images. For

each image, it is assumed that the center of the image is known to about 25% of the size

of the image, and the scale is known to within a factor of two. The histogram-alignment

method of [38] is used to find the translation, scaling, and rotation between the image

and a reference catalog, which allows the correspondences between image and reference

catalog stars to be determined.

1.85 The core of the Scamp system is a chi-squared minimization of the total weighted distance between the projected positions of all star correspondences among the set of

images and the reference catalog. The parameters to be adjusted are the center, scale,

rotation, and polynomial distortion coefficients of each image. A key feature is the ability

to share image distortion parameters among subsets of the images, since images taken

with the same telescope are expected to share some distortion terms due to the telescope

optics. This reduces the degrees of freedom of the fitting process, resulting in more robust

fits. Scamp can fine-tune thousands of images to sub-pixel accuracy.
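
In our notation (a simplified formalization, not Scamp's exact objective), the quantity being minimized can be written as

    χ² = Σ_s Σ_{i ∈ I(s)} w_{s,i} ‖ P_{θ_i}(x̄_s) − x_{s,i} ‖²

where x_{s,i} is the measured pixel position of star s in image i (or in the reference catalog, treated as an extra “image”), x̄_s is the star's estimated sky position, P_{θ_i} is the projection of image i with parameters θ_i (center, scale, rotation, and polynomial distortion coefficients), and w_{s,i} are inverse-variance weights. Sharing distortion terms among images amounts to constraining some components of the various θ_i to be equal.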

1.10 Summary

1.86 The task of automatically finding the astrometric calibration of an astronomical image—that is, recognizing the stars in the image, or locating the image on the sky—is ideal for a geometric hashing-based approach. Geometric hashing (or more generally, geometric feature matching) is a widely-applicable technique for problems in which it is desirable to build distinctive geometric features out of simple interest-point features.

The approach is ideally suited to the blind astrometric calibration problem, since images of celestial objects can be localized to sub-pixel accuracy but have few other distinguishing features. Indeed, much of the previous work in blind and non-blind astrometric calibration can be placed within a geometric feature matching framework.

Chapter 2

Astrometry.net: recognition of arbitrary astronomical images¹

¹This chapter was originally prepared as a manuscript by me, David W. Hogg, Keir Mierle, Michael Blanton, and Sam Roweis.

2.1 Introduction

2.1 Although there are hundreds of ground- and space-based telescopes currently oper-

ating, and petabytes of stored astronomical images (some fraction of which are available

in public archives), most astronomical research is conducted using data from a single

telescope. Why do we as astronomers limit ourselves to using only small subsets of the

enormous bulk of available data? Three main reasons can be identified: we don’t want

to share our data; it’s hard to share our data; and it’s hard to use data that others have

shared. The latter two problems can be addressed by technological solutions, and as

sharing data becomes easier, astronomers will likely become more willing to do so.

2.2 Historically, astronomical data—and, indeed, important scientific results—have often

been closely guarded. In more recent times, early access to data has been seen as one of

the rewards for joining and contributing to telescope-building collaborations, and propri-

etary data periods typically accompany grants of observing time on facilities such as the Hubble Space Telescope. However, this seems to be changing, albeit slowly. One

of the first large astronomical data sets to be released publicly in a usable form was the

Hubble Deep Field [66]. The Sloan Digital Sky Survey (SDSS) [69] is committed to yearly

public data releases, and the members of upcoming projects such as the Large Synoptic

Survey Telescope (LSST) [35] have recognized that the primary advantage of contributing

to the collaboration is not proprietary access to the data, but rather a deep understand-

ing and familiarity with the telescope and data, and have (in principle) decided to make

the data available immediately.

2.3 Putting aside the issue of willingness to share data, there are issues of our ability to

share data effectively. Making use of large, heterogeneous, distributed image collections

requires fast, robust, automated tools for calibration, vetting, organization, search and

retrieval.

2.4 The Virtual Observatory (VO) [61] establishes a framework and protocols for the organization, search, and retrieval of astronomical images. The VO is structured as a

distributed system in which many “publishers” provide image collections and interfaces

that allow these images to be searched. This distributed framework allows the VO to

scale, but it also means that any property we might want the images published through

the VO to have must be specified by the standards, and we must trust all publishers

to implement the standards correctly. Of particular importance are calibration meta-

data [58]. The draft Simple Image Access Protocol [62] states that “an image should be

a calibrated object frame” and specifies some loose requirements for astrometric meta-

data. However, there is neither a requirement, nor a specified method, for communicating

more detailed information about the calibration processes that have been applied to the

image. A VO user who wants to know exactly how the raw CCD frame was reduced to

the pixel values in the image must use processes outside the VO framework—most likely

by reading papers and contacting the image publisher—and this will take much longer

than finding and retrieving the image.

2.5 The VO cannot be expected to impose a minimum standard of “quality” on images

published through VO protocols, for several reasons. First, doing so would require making

a tradeoff between the quality and quantity of images that are publishable. Since different

users of the VO have different needs, there is no objective way to make this tradeoff. For

example, one researcher might only want images taken during photometric conditions,

while one studying a transient event might want all available imaging, regardless of

quality. Second, there is no objective measure of the quality of an image: different aspects

of an image are important to different users. Finally, even the most well-intentioned

and skilled publisher will occasionally make mistakes, and this effect will become more

pronounced as surveys become more automated and data rates increase. Thus, users of

the VO cannot in general rely on the quality or correctness of image data or calibration

meta-data, and individually hand-checking each image does not scale to the future in

which the VO provides access to any significant fraction of the world’s astronomical

images. For the goals of the VO movement to be achieved, the tools that allow users

to vet, verify, and recalibrate images must be developed, and ideally these tools will be

integrated into the VO system.

2.6 In this chapter we present a system, Astrometry.net, that automatically produces

astrometric meta-data for astronomical images. That is, given an image, our system

produces the pointing, scale, and orientation of the image—the astrometric calibration

meta-data or World Coordinate System (WCS). The system requires no first guess, and

works with the information in the image pixels alone. The success rate is above 99.9% for

contemporary near-ultraviolet and visual imaging survey data, with no false positives.

2.7 Our system enables an immense amount of “lost” astronomical imagery to be used for

scientific purposes. This includes photographic plate archives, an immense and growing

number of images taken by amateur astronomers, as well as data from individual pro-

fessional astronomers and ground-based observatories whose meta-data are non-existent,

lost, or simply wrong. While many modern telescopes do produce correct, standards-

compliant meta-data, many others have control systems that drift relative to the sky,

yielding only approximate astrometric meta-data. Still others produce no meta-data,

or produce it in some idiosyncratic, non-standards-compliant form. Even sophisticated

and highly-automated surveys such as SDSS occasionally have failures in the systems

that produce astrometric calibration information, resulting in perfectly good but “lost”

images. A system that makes these data available for study will effectively recover a sig-

nificant amount of lost observing time, fill in gaps in the astronomical record, and make

observers more productive by eliminating the tedious and often unenlightening task of

fixing the astrometry. Furthermore, a robust, fully-automated system allows the data

to be trusted, because the calibration meta-data, and their associated error estimates,

have been derived from the images themselves, not from some unknown, undocumented,

unverified or untrustworthy external source.

2.8 Our system can be seen as a specialized kind of image-based search. Given an image,

we can identify and label the objects that appear in the image, with a very high success

rate and no false positives. We have achieved the ultimate goal of computer vision, within

the domain of astronomical images. Our system is based solely on the contents of the

image, in sharp contrast to most contemporary image search systems (such as Google

Image Search), which rely on contextual information—the text surrounding the image

on a web page—rather than the information in the image itself.

2.9 In the literature, the task of recognizing astronomical images is known as “blind

astrometric calibration” or the “lost in space” problem, since an early application was

for estimating the attitude of a spacecraft using a camera mounted on the spacecraft.

By identifying the stars that are visible, the pose of the camera can be determined [44].

In such systems, triangles of stars are typically used as geometric features (e.g. [37]).

Triangles are effective in this regime because the images typically span tens of degrees

and contain only dozens of very bright stars: the search space is small. Furthermore,

these systems are not fully “blind” in that they are designed for particular cameras, the

specifications of which are available to the system designers.

2.10 Triangle-based approaches have also been used to fine-tune the astrometry problem

when a good initial estimate is available [57]. Because the search domain is limited to

a small area around the initial estimate, the triangle-based approach is effective. Com-

pletely blind systems have been attempted previously (for example, [30] and references

therein) but none we know of have been able to achieve the scalability and fidelity of our

approach.

2.11 Both the triangle matching approach and ours (described below) are based on “geo-

metric hashing” (for example, [42], [34]). Our system uses the same two-step approach

used by these systems, in which a set of hypotheses is generated from sparse matches,

then a second stage does detailed verification of the hypotheses.

2.12 There are several automated calibration systems that refine the astrometric calibra-

tion of an image to produce a high-precision alignment to a reference catalog given a

good first guess (for example, [65, 51, 10]). These systems are reliable and robust, but

they require a reasonable first guess about the image pointing, orientation, and scale.

Our system can be used to create that good first guess.

2.2 Methods

2.13 Our approach involves four main components. First, when given a query image, we

detect astronomical sources (“stars”, hereafter) by running a number of image-processing

steps. This typically yields a few hundred or more stars localized to sub-pixel accuracy.

Next, the system examines subsets of these stars, producing for each subset a geometric

hash code that describes their relative positions. We typically use subsets of four stars,

which we call “quads.” Having computed the hash code for the query quad, the system

then searches in a large pre-computed index for almost identical hash codes. Each match-

ing hash code that is found corresponds to a hypothesized alignment between the quad

in the query image and the quad in the index, which can be expressed as a hypothesized

location, scale, and orientation of the image on the sky. The final component is a verifica-

tion criterion, phrased as a Bayesian decision problem, which can very accurately decide

if the hypothesized alignment is correct. The system continues generating and testing

hypotheses until we find one that is accepted by the verification process; we then output

that hypothesis as our chosen alignment. In some cases, we never find a hypothesis in

which we are sufficiently confident (or we give up searching before we find one), but our

thresholds are set conservatively enough that we almost never produce a false positive

match. Each component of our approach is outlined below.

2.14 The primary technical contributions of our system include the use of the “geometric

hashing” [42, 67] approach to solve the huge search problem of generating candidate

calibrations; an index-building strategy that takes into account the distribution of images

we wish to calibrate; the verification procedure which determines whether a proposed

astrometric calibration is correct; and good software engineering which has allowed us to

produce a practical, efficient system. All our code is publicly available under the GPL

license, and we are also offering a web service.

2.2.1 Star detection

2.15 The system automatically detects compact objects in each input image and centroids

them to yield the pixel-space locations of stars. This problem is an old one in astronomy;

the challenge here is to perform it robustly, with no human intervention, on the large vari-

ety of input images the system faces. Luckily, because the rest of the system is so robust,

it can handle detection lists that are missing a few stars or have some contaminants.

2.16 The first task is to identify (detect) localized sources of light in the image. Most

astronomical images exhibit sky variations, sensitivity variations, and scattering; we

need to find peaks on top of such variations. First we subtract off a median-smoothed

version of the image to “flatten” it. Next, to find statistically significant peaks, we need

to know the approximate noise level. We find this by choosing a few thousand random

pairs of pixels separated by five rows and columns, calculating the difference in the fluxes

for each pair, and calculating the variance of those differences, which is approximately

twice the variance σ² in each pixel. At this point, we identify pixels which have values in

the flattened image that are > 8 σ, and connect detected pixels into individual detected

objects.
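
A compact sketch of these two steps (our own illustration; the box size, the number of sampled pairs, and the fixed (5, 5) offset are assumptions consistent with the description above):

    import numpy as np
    from scipy.ndimage import label, median_filter

    def detect(img, box=64, nsigma=8.0, npairs=4000):
        # Flatten the background, estimate the pixel noise from differences
        # of pixel pairs offset by five rows and columns, then keep
        # connected groups of pixels brighter than nsigma * sigma.
        flat = img - median_filter(img, size=box)
        ny, nx = flat.shape
        rng = np.random.default_rng(0)
        y = rng.integers(0, ny - 5, npairs)
        x = rng.integers(0, nx - 5, npairs)
        diffs = flat[y, x] - flat[y + 5, x + 5]
        sigma = np.sqrt(0.5 * np.var(diffs))   # Var(a - b) ≈ 2 sigma²
        objects, nobj = label(flat > nsigma * sigma)
        return flat, objects, nobj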

2.17 The second task is to find the peak or peaks in each detected object. This determi-

nation begins by identifying pixels that contain larger values than all of their neighbors.

However, keeping all such pixels would retain peaks that are just due to uncorrelated

noise in the image and individual peaks within a single “object.” To clean the peak list,

we look for peaks that are joined to smaller peaks by saddle points within 3 σ of the

larger peak (or 1% of the larger peak’s value, whichever is greater), and trim the smaller

peaks out of our list.

2.18 Finally, given the list of all peaks, the third task is to centroid the star position at

sub-pixel accuracy. Following previous work [46], we take a 3 × 3 grid around each star’s

peak pixel, effectively fitting a Gaussian model to the nine values, and using the peak of

the Gaussian. Occasionally this procedure produces a Gaussian peak outside the 3 × 3

grid, in which case we default to the peak pixel, although such cases are virtually always

caused by image artifacts. This procedure produces a set of x and y positions in pixel

coordinates corresponding to the position of objects in the image.
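
Fitting a separable Gaussian to the 3 × 3 block reduces, after taking logarithms, to fitting a parabola along each axis; a sketch (our own, with a hypothetical floor value to keep the logarithms defined):

    import numpy as np

    def centroid_3x3(img, py, px):
        # Parabola through the log-fluxes along each axis of the 3x3
        # block around the peak pixel (py, px) gives the sub-pixel offset.
        g = np.log(np.maximum(img[py-1:py+2, px-1:px+2], 1e-9))
        dx = 0.5 * (g[1, 0] - g[1, 2]) / (g[1, 0] - 2*g[1, 1] + g[1, 2])
        dy = 0.5 * (g[0, 1] - g[2, 1]) / (g[0, 1] - 2*g[1, 1] + g[2, 1])
        if abs(dx) > 1.0 or abs(dy) > 1.0:
            # The fitted peak fell outside the grid: fall back to the
            # peak pixel (virtually always an image artifact).
            return float(py), float(px)
        return py + dy, px + dx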

2.19 Compared to other star detection systems such as SExtractor [11], our approach is

simpler and generally more robust given a wide variety of images and no human inter-

vention. For example, while we do a simple median-filtering to remove the background

signal from the image, SExtractor uses sigma-clipping and mode estimation on a grid of

subimages, which are then median-filtered and spline-interpolated.

2.2.2 Hashing of asterisms to generate hypotheses

2.20 Hypotheses about the location of an astronomical image live in the continuous four-

dimensional space of position on the celestial sphere (pointing of the camera’s optical

axis), orientation (rotation of the camera around its axis), and field of view (solid angle

subtended by the camera image). We want to be able to recognize images that span less

than one-millionth the area of the sky, so the effective number of hypotheses is large;

exhaustive search will be impractical. We need a fast search heuristic: a method for

proposing hypotheses that almost always proposes a correct hypothesis early enough

that we have the resources to discover it.

2.21 Our fast search heuristic uses a continuous geometric hashing approach. Given a set of

stars (a “quad”), we compute a local description of the shape—a geometric hash code—

by mapping the relative positions of the stars in the quad into a point in a continuous-

valued, 4-dimensional vector space (“code space”). Figure 2.1 shows this process. Of the

four stars comprising the quad, the most widely-separated pair are used to define a local

coordinate system, and the positions of the remaining two stars in this coordinate system


Figure 2.1: The geometric hash code for a “quad” of stars, A, B, C, and D. Stars A and B define the origin and (1, 1), respectively, of a local coordinate system, in which the positions of stars C and D are computed. The coordinates (x_C, y_C, x_D, y_D) become our geometric hash code that describes the relative positions of the four stars. The hash code is invariant under translation, scaling, and rotation of the four stars.

serve as the hash code. We label the most widely-separated pair of stars “A” and “B”.

These two stars define a local coordinate system. The remaining two stars are called “C”

and “D”, and their positions in this local coordinate system are (x_C, y_C) and (x_D, y_D). The geometric hash code is simply the 4-vector (x_C, y_C, x_D, y_D). We require stars C and D to be within the circle that has stars A and B on its diameter. This hash code has some symmetries: swapping A and B converts the code to (1 − x_C, 1 − y_C, 1 − x_D, 1 − y_D), while swapping C and D converts (x_C, y_C, x_D, y_D) into (x_D, y_D, x_C, y_C). In practice, we break this symmetry by demanding that x_C ≤ x_D and that x_C + x_D ≤ 1; we consider only the permutation (or relabelling) of stars that satisfies these conditions (within noise tolerance).
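
The code computation can be written compactly by treating star positions as complex numbers, so that the map taking A to (0, 0) and B to (1, 1) is a single complex multiplication (a sketch of the construction above; the validity check that C and D lie inside the AB circle is omitted):

    import numpy as np
    from itertools import combinations

    def quad_code(stars):
        # 'stars' is a (4, 2) array. The most widely separated pair is
        # (A, B); the similarity transform z -> (z - A)(1 + 1j)/(B - A)
        # sends A to (0, 0) and B to (1, 1).
        i, j = max(combinations(range(4), 2),
                   key=lambda p: np.sum((stars[p[0]] - stars[p[1]]) ** 2))
        zA, zB = complex(*stars[i]), complex(*stars[j])
        w = [(complex(*stars[k]) - zA) * (1 + 1j) / (zB - zA)
             for k in range(4) if k not in (i, j)]
        C, D = sorted((z.real, z.imag) for z in w)   # enforce x_C <= x_D
        if C[0] + D[0] > 1.0:                        # else swap A and B,
            C, D = sorted([(1 - D[0], 1 - D[1]),     # which maps each
                           (1 - C[0], 1 - C[1])])    # coordinate to 1 - v
        return (C[0], C[1], D[0], D[1])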

2.22 This mapping has several properties that make it well suited to our indexing appli-

cation. First, the code vector is invariant to translation, rotation and scaling of the

star positions so that it can be computed using only the relative positions of the four

stars in any conformal coordinate system (including pixel coordinates in a query image).

Second, the mapping is smooth: small changes in the relative positions of any of the

stars result in small changes to the components of the code vector; this makes the codes

resilient to small amounts of positional noise in star positions. Third, if stars are uni-

formly distributed on the sky (at the angular scale of the quads being indexed), codes will

be uniformly distributed in (and thus make good use of) the 4-dimensional code-space

volume.

2.23 Noise in the image and distortion caused by the atmosphere and telescope optics lead

to noise in the measured positions of stars in the image. In general this noise causes the

stars in a quad to move slightly with respect to each other, which yields small changes

in the hash code (i.e., position in code space) of the quad. Therefore, we must always

match the image hash code with a neighborhood of hash codes in the index.

2.24 The standard geometric hashing “recipe” would suggest using triangles rather than

quads. However, the positional noise level in typical astronomical images is sufficiently

high that triangles are not distinctive enough to yield reasonable performance. An im-

portant factor in the performance of a geometric hashing system is the “oversubscription

factor” of code space. The number of hash codes that must be contained in an index is

determined by the effective number of objects that are to be recognized by the system:

if the goal is to recognize a million distinct objects, the index must contain at least a

million hash codes. Each hash code effectively occupies a volume in code space: since

hash codes can vary slightly due to positional noise in the inputs (star positions), we must

always search for matching codes within a volume of code space. This volume is deter-

mined by the positional noise levels in the input image and the reference catalog. The

oversubscription factor of code space, if it is uniformly populated, is simply the number

of codes in the index multiplied by the fractional volume of code space occupied by each

code. If triangles are used, the fractional volume of code space occupied by a single code

is large, so the code space becomes heavily oversubscribed. Any query will match many

codes by coincidence, and the system will have to reject all of these false matches, which

is computationally expensive. By using quads instead of triangles, we nearly square the

distinctiveness of our features: a quad can be thought of as two triangles that share a

common edge, so a quad essentially describes the co-occurrence of two triangles. A much

smaller fraction of code space is occupied by each quad, so we expect fewer coincidental

(false) matches for any given query, and therefore fewer false hypotheses which must be

rejected.
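
As a rough illustration (our own arithmetic, holding the number of indexed features fixed at the 150 million quads of the index described in section 2.3.1 and using the 0.01 code-space match tolerance from the experiments below):

    import math

    def oversubscription(n_codes, tol, dim):
        # Number of indexed codes times the fraction of the unit code-space
        # volume covered by one matching ball of radius 'tol' (boundary
        # effects and the non-uniform valid region are ignored).
        ball = math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * tol ** dim
        return n_codes * ball

    for dim, name in [(2, "triangles"), (4, "quads"), (6, "quintuples")]:
        print(name, oversubscription(150e6, 0.01, dim))
    # triangles: ~5e4 coincidental matches expected per query;
    # quads: ~7; quintuples: ~8e-4.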

2.25 We could use quintuples of stars, which are even more distinctive than quads. How-

ever, there are two disadvantages to increasing the number of stars in our asterisms. The

first is that the probability that all k of the stars in an indexed asterism appear in the

image and that all k of the stars in a query asterism appear in the index both decrease

with increasing k. For images taken at wavelengths far from the catalog wavelength, or

shallow images, this consideration can become severe. The second disadvantage is that

near-neighbor lookup, even with a kd-tree, becomes more time-consuming with increasing dimensionality. The code space for quintuples is 6-dimensional,

compared to the 4-dimensional code space of quads. We test triangle- and quintuple-

based indices in section 2.3.1.7 below.

2.26 When presented with a list of stars from an image to calibrate, the system iterates

through groups of four stars, treating each group as a quad and computing its hash code.

Using the computed code, we perform a neighborhood lookup in the index, retrieving

all the indexed codes that are close to the query code, along with their corresponding

locations on the sky. Each retrieved code is effectively a hypothesis, which proposes to

identify the four reference catalog stars used to create the code at indexing time with

the four stars used to compute the query code. Each such hypothesis is evaluated as

described below.

2.27 The question of which hypotheses to check and when to check them is a purely

heuristic one. One could choose to wait until a hypothesis has two or more “votes” from

independent codes before checking it or check every hypothesis as soon as it is proposed,

whichever is faster. In our experiments, we find that it is faster, and much less memory-

intensive, to simply check every hypothesis rather than accumulate votes.

2.2.3 Indexing the sky

2.28 As with all geometric hashing systems, our system is based around a pre-computed

index of known asterisms. Building the index begins with a reference catalog of stars.

We typically use an all-sky (or near-all-sky) optical survey such as USNO-B1 [52, 7]

as our reference catalog, but we have also used the infrared 2MASS catalog [59] and

the ultraviolet catalog from GALEX [49], as well as non-all-sky catalogs such as SDSS.

From the reference catalog we select a large number of quads (using a process described

below). For each quad, we store its hash code and a reference to the four stars of which

it is composed. We also store the positions of those four stars. Given a query quad, we

compute its hash code and search for nearby codes in the index. For each nearby code,

we look up the corresponding four stars in the index, and create the hypothesis that

the four stars in the query quad correspond to the four stars in the index. By looking

up the positions of the query stars in image coordinates and the index stars in celestial

coordinates, we can express the hypothesis as a pointing, scale, and rotation of the image

on the sky.

2.29 In order for our approach to be successful, our index must balance several properties.

We want to be able to recognize images from any part of the sky, so we want to choose

quads uniformly over the sky. We want to be able to recognize images with a wide range

of angular sizes, so we want to choose quads of a variety of sizes. We expect that brighter

stars will be more likely to be found in our query images, so we want to build quads out

of bright stars preferentially. However, we also expect that some stars, even the brightest

stars, will be missing or mis-detected in the query image (or the reference catalog), so

we want to avoid over-using any particular star to build quads.

2.30 We handle the wide range of angular sizes by building a series of sub-indices, each of

which contains quads whose scales lie within a small range (for example, a factor

of two). At some level this is simply an implementation detail: we could recombine the

sub-indices into a single index, but in what follows it is helpful to be able to assume that

the sub-index will be asked to recognize query images whose angular sizes are similar to

the size of the quads it contains. Since we have a set of sub-indices, each of which is

tuned to an overlapping range of scales, we know that at least one will be tuned to the

scale of the query image.

2.31 We begin by selecting a spatially-uniform and bright subset of stars from our reference

catalog. We do this by placing a grid of equal-area patches (“HEALPixels” [26]) over

the sky and selecting a fixed number of stars, ordered by brightness, from each grid cell.

The grid size is chosen so that grid cells are a small factor smaller than the query images.

Typically we choose the grid cells to be about a third of the size of the query images,

and select 10 stars from each grid cell, so that most query images will contain about a hundred query stars. Figure 2.2 illustrates this process on a small patch of sky.

Figure 2.2: A small region of sky (about 0.3 × 0.3 degrees centered on (RA, Dec) = (188, 14.45) degrees), showing the HEALPix grid and the brightest 5 stars that we select from each cell. The image shown is from the Sloan Digital Sky Survey.
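
A sketch of the uniform cut (our own; cell_ids is assumed to come from a HEALPix assignment of stars to cells, and brightness is given as magnitudes, so smaller is brighter):

    import numpy as np

    def uniform_cut(cell_ids, mags, per_cell=10):
        # Keep the 'per_cell' brightest stars in each equal-area cell.
        keep, count = [], {}
        for idx in np.argsort(mags):             # brightest first
            c = cell_ids[idx]
            if count.get(c, 0) < per_cell:
                keep.append(idx)
                count[c] = count.get(c, 0) + 1
        return np.array(keep)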

2.32 Next, we visit each grid cell and attempt to find a quad within the acceptable range

of angular sizes and whose center lies within the grid cell. We search for quads starting

with the brightest stars, but for each star we track the number of times it has already

been used to build a quad, and we skip stars that have been used too many times already.

We repeat this process, sweeping through the grid cells and attempting to build a quad

in each one, a number of times. In some grid cells we will be unable to find an acceptable

quad, so after this process has finished we make further passes through the grid cells,

removing the restriction on the number of times a star can be used, since it is better

to have a quad composed of over-used stars than no quad at all. Typically we make a

total of 16 passes over the grid cells, and allow each star to be used in up to 8 quads.

Figures 2.3 and 2.4 show the quads built during the first two rounds of quad-building in

our running example.
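
The sweep logic, in outline (a sketch; try_quad is a hypothetical helper that returns a quad with an acceptable diameter and center inside the cell, or None, and lifting the usage cap only in the final passes is our simplification of the procedure described above):

    def build_quads(cells, passes=16, max_uses=8, uncapped_passes=2):
        # 'cells' maps each grid cell to its stars, brightest first.
        uses, quads = {}, []
        for p in range(passes):
            capped = p < passes - uncapped_passes
            for cell, stars in cells.items():
                usable = [s for s in stars
                          if not capped or uses.get(s, 0) < max_uses]
                q = try_quad(usable, cell)       # hypothetical helper
                if q is not None:
                    quads.append(q)
                    for s in q:
                        uses[s] = uses.get(s, 0) + 1
        return quads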

2.33 In principle, an index is simply a list of quads, where for each quad we store its

geometric hash code, and the identities of the four stars of which it is composed (from

which we can look up their positions on the sky). However, we want to be able to search

quickly for all hash codes near a given query hash code. We therefore organize the hash

codes into a kd-tree data structure, which allows rapid retrieval of all quads whose hash

codes are in the neighborhood of any given query hash code. In order to carry out the

verification step we also keep the star positions in a kd-tree, since for each matched

quad we need to find other stars that should appear in the image if the match is true.

Since none of the available kd-tree implementations were satisfactory for our purposes,

we created a fast, memory-efficient, pointer-free kd-tree implementation. Our kd-tree

implementation is presented in detail in chapter 4.
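
For illustration, the same range search can be written with scipy's stock kd-tree (our own implementation, presented in chapter 4, replaces this in practice; the file name here is hypothetical, and the 0.01 tolerance matches the experiments below):

    import numpy as np
    from scipy.spatial import cKDTree

    codes = np.load("index_codes.npy")        # (N, 4) indexed hash codes
    tree = cKDTree(codes)                     # built once, queried often

    def matching_quads(query_code, tol=0.01):
        # Indices of all quads whose codes lie within 'tol' of the query.
        return tree.query_ball_point(np.asarray(query_code), r=tol)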

2.34 Given a query image, we detect stars as discussed above, and sort the stars by bright-

ness. Next, we begin looking at quads of stars in the image. For each quad, we compute

its geometric hash code and search for nearby codes in the index. For each matching

Figure 2.3: The same region of sky as shown in the previous figure, showing the HEALPix grid, and the quads that are created during the first pass through the grid cells. The quads must have a diameter (the distance between the two most distant stars) within a given range—in this case, 1 to √2 times the side-length of the grid cells. In each grid cell, the system attempts to build a quad whose center (i.e., midpoint of the diameter line)—marked with an × in the figure—is within the grid cell.

Figure 2.4: The same region of sky as in the previous figures, showing the quads that are created after the second round of attempting to build a quad in each grid cell.

Figure 2.5: A sample query image, with the brightest 100 sources our system detects

(circles), and a quad in the image for which our system will search for matches in the index. This quad looks like a triangle because two of its stars are nearly collinear. Image credit: Sloan Digital Sky Survey.

Figure 2.6: Our example index, showing a quad in the index that matches a quad from the query image (shaded solid). The image is shown projected in its correct orientation on the sky.

code, we retrieve the positions of the stars that compose the quad in the index, and

compute the hypothesized alignment—the World Coordinate System—of the match. We

then retrieve other stars in the index that are within the bounds of the image, and run the

verification procedure to determine whether the match is true or false. We stop searching

when we find a true match in which we are sufficiently confident, or we exhaust all the

possible quads in the query image, or we run out of time. See figure 2.5.

2.2.4 Verification of hypotheses

2.35 The indexing system generates a large number of hypothesized alignments of the

image on the sky. The task of the verification procedure is to reject the large number

of false matches that are generated, and accept true matches when they are found.

Essentially, we ask, “if this proposed alignment were correct, where else in the image

would we expect to find stars?” and if the alignment has very good predictive power, we

accept it.

2.36 We have framed the verification procedure as a Bayesian decision process. The system

can either accept a hypothesized match—in which case the hypothesized alignment is

returned to the user—or the system can reject the hypothesis, in which case the indexing

system continues searching for matches and generating more hypotheses. In effect, for

each hypothesis we are choosing between two models: a “foreground” model, in which

the alignment is true, and a “background” model, in which the alignment is false. In

Bayesian decision-making, three factors contribute to this decision: the relative abilities

of the models to explain the observations, the relative proportions of true and false

alignments we expect to see, and the relative costs or utilities of the outcomes resulting

from our decision.

2.37 The Bayes factor is a quantitative assessment of the relative abilities of the two

models—the foreground model F and the background model B—to produce or explain

the observations. In this case the observations, or data, D, are the stars observed in the Chapter 2. Astrometry.net: recognizing astronomical images 53

query image. The Bayes factor p(D | F ) K = (2.1) p(D | B)

is the ratio of the marginal likelihoods. We must also include in our decision-making the

prior p(F )/p(B), which is our a priori belief, expressed as a ratio of probabilities, that

a proposed alignment is correct. Since we typically examine many more false alignments

than true alignments (because we stop after the first true alignment is found), this ratio

will be small.

2.38 The final component of Bayesian decision theory is the utility table, which expresses

the subjective value of each outcome. It is good to accept correctly a true match or reject

correctly a false match (“true positive” and “true negative” outcomes, respectively), and

it is bad to reject a true match or accept a false match (“false negative” and “false

positive” outcomes, respectively). In the Astrometry.net setting, we feel it is very bad

to produce a false positive: we would much rather fail to produce a result than produce a false one, because we want the system to be able to run on large data

sets without human intervention, and we want to be confident in the results.

2.39 Applying Bayesian decision theory given our desired operating characteristics, we

find that we should accept a proposed alignment only if the Bayes factor is exceedingly

large. That is, the foreground model in which the alignment is true must be far better

at explaining the observed positions of stars in the query image than the background

model that the alignment is false. We typically set the Bayes factor threshold to 10^9 or 10^12, but as will be shown in the experiments below, we could set it even higher. This

threshold is computed from our (subjective) desired operating characteristics, so is not

derivable from first principles.
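
One standard way to make the threshold explicit (our formalization of the preceding paragraphs, not a quotation of the system's exact rule): writing U_TP, U_TN, U_FN, and U_FP for the utilities of the four outcomes, accepting maximizes expected utility exactly when

    K > [p(B) / p(F)] × (U_TN − U_FP) / (U_TP − U_FN),

so a small prior ratio p(F)/p(B) and a severe false-positive penalty (U_FP very negative) both drive the threshold toward values like the 10^9 to 10^12 used here.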

2.40 In the foreground model, F , the four stars in the query image and the four stars in

the index are aligned. We therefore expect that other stars in the query image will be

close to other stars in the index. However, we also know that some fraction of the stars

in the query image will have no counterpart in the index, due to occlusions or artifacts in

the images, errors in star detection or localization, differences in the spectral bandpass,

or because the query image “star” is actually a cosmic ray, satellite, asteroid, or some other

non-star, non-galaxy object. True stars can be lost, and false stars can be added. Our

foreground model is therefore a mixture of a uniform probability that a star will be found

anywhere in the image—a query star that has no counterpart in the index—plus a blob

of probability around each star in the index, where the size of the blob is determined by

the combined positional variances of the index and query stars.

2.41 Under the background model, B, the proposed alignment is false, so the query image

is from some unknown part of the sky; the index is not useful for predicting the positions

of stars in the image. Our simple model therefore places uniform probability of finding

stars anywhere in the test image.

2.42 The verification procedure evaluates stars in the query image, in order of bright-

ness, under the foreground and background models. The product of the foreground-to-

background ratios is the Bayes factor. We continue adding query stars until the Bayes

factor exceeds our threshold for accepting the match, or we run out of query stars.

2.43 There are some subtleties in the verification process which are explored in depth in

chapter 3.
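
Stripped of those subtleties, the per-star computation looks like this (a minimal sketch under stated assumptions: index_xy holds the predicted pixel positions of index stars inside the image, sigma2 is the combined positional variance, f_dist the distractor fraction, and the equal weighting of index stars is our simplification):

    import numpy as np

    def verify(query_xy, index_xy, img_area, sigma2,
               f_dist=0.25, threshold=1e12):
        # Background model: uniform density over the image. Foreground:
        # the same uniform term (for distractors) mixed with a Gaussian
        # blob at each predicted index-star position. The running product
        # of per-star foreground/background ratios is the Bayes factor.
        bg = 1.0 / img_area
        K = 1.0
        for q in query_xy:                       # brightest first
            d2 = np.sum((index_xy - q) ** 2, axis=1)
            blobs = np.exp(-0.5 * d2 / sigma2) / (2 * np.pi * sigma2)
            fg = f_dist * bg + (1 - f_dist) * blobs.mean()
            K *= fg / bg
            if K > threshold:
                return K, True                   # accept the alignment
        return K, False                          # out of stars: reject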

2.3 Results

2.3.1 Blind astrometric calibration of the Sloan Digital Sky Survey

2.44 We explored the potential for automatically organizing and annotating a large real-

world data set by taking a sample of images generated by the Sloan Digital Sky Survey

and considering them as an unstructured set of independent queries to our system. For

each SDSS image, we discarded all meta-data, including all positional and rotational

information and the date on which the exposure was taken. We allowed ourselves to look

only at the two-dimensional positions of detected “stars” (most of which were in fact stars

but some of which were galaxies or detection errors) in the image. Normally, our system

would take images as input, running a series of image processing steps to detect stars

and localize their positions. The SDSS data reduction pipeline already includes such

a process, so for these experiments we used these detected star positions rather than

processing all the raw images ourselves. Further experiments have shown that we would

likely have achieved similar, if not better, results by using our own image processing

software.

Figure 2.7: A typical image from the Sloan Digital Sky Survey (SDSS). The range of

quad diameters that we use in the experiments below is shown by the circles. Image

credit: Sloan Digital Sky Survey.

2.45 Each SDSS image has 2048 × 1489 pixels and covers 9 × 13 arcmin², slightly less than

one-millionth the area of the sky. Each image measures one of five bandpasses, called u,

g, r, i, and z, spanning the near-infrared through optical to near-ultraviolet range. Each

band receives a 54-second exposure on a 2.5-meter telescope. A typical image is shown

in figure 2.7.

2.46 The SDSS image-processing pipeline assigns to each image a quality rating: “excel-

lent”, “good”, “acceptable”, or “bad”. We retrieved the source positions (i.e., the list of

objects detected by the SDSS image-processing pipeline) in every image within the main

survey (Legacy and SEGUE footprints), using the public Catalog Archive Server (CAS)

interface to Data Release 7 [1]. We retrieved only sources categorized as “primary” or

“secondary” detections of stars and galaxies, and required that each image contained at

least 300 objects. The images that are excluded by these cuts contain either very bright

stars or peculiarities that cause the SDSS image-processing pipeline to balk. The number

of images in Data Release 7 and our cut is given in the table below.

Quality      Total number of images   Number of images in our cut
excellent    183,359                  182,221
good         101,490                  100,763
acceptable    48,802                   48,337
bad           93,692                   89,219
total        427,343                  420,540

2.3.1.1 Performance on excellent images

2.47 In order to show the best performance achievable with our system, we built an index

that is well-matched to SDSS r-band images. Starting with the red bands from a cleaned

version of the USNO-B catalog, we built an index containing stars drawn from a HEALPix

grid with cell sizes about 4 × 4 arcmin², and 10 stars per cell. We then built quads with

diameters of 4 to 5.6 arcmin. For each grid cell, we searched for a quad whose center was

within the grid cell, starting with the brightest stars but allowing each star to be used

at most 8 times. We repeated this process 16 times. The index contains a total of about

100 million stars and 150 million quads. Figure 2.8 shows the spatial distribution of the

stars and quads in the index.

2.48 We randomized the order of the excellent-quality r-band images, discarded all astro-

metric meta-data—leaving only the pixel positions of the brightest 300 stars—and asked

our system to recognize each one. We allowed the system to create quads from only the

Figure 2.8: Top: Density of sources in the USNO-B catalog (in Hammer-Aitoff projection). Dark colors indicate high density. The north celestial pole is in the center of the image. The dark strip through the center and around the edges is the Milky Way; lower-density dust lanes can be seen. The USNO-B catalog was created by scanning photographic plates, and the places where the plates overlap are clearly visible as concentric rings and spokes of overdensities. Middle: Density of sources in our spatially uniform cut of the USNO-B catalog. Most of the sky is very uniformly covered. A few small bright (low-density) areas are visible, including a line near the bottom. These are areas where the USNO-B catalog is underdense due to defects. Bottom: Density of quads in the index used in most of the experiments presented here. Again, most of the sky is uniformly covered with quads.


Figure 2.9: Results from the excellent r-band SDSS images and USNO-B-based index.

The percentage of images that are recognized correctly (i.e., astrometrically calibrated) with respect to the CPU time spent per image. Many images are recognized rapidly, but there is a heavy tail of “hard” images. Spending more and more CPU results in sharply diminishing returns. After 1 second, over 99.7 % of images are recognized, and after 10 seconds, over 99.97 % are recognized. The steps are due to the finite resolution of the

CPU timer we are using.


Figure 2.10: Results from the excellent r-band SDSS images, continued. Left: The

Bayes factor of the foreground model versus the background model for the hypotheses

that we accept. The dashed line shows the threshold implied by our desired operating

characteristics. These excellent-quality images yield incredibly high Bayes factors—when

we find a correct match it is unequivocal. Right: The number of stars in the query image that had to be examined before the Bayes-factor threshold was reached.


Figure 2.11: Results from the excellent r-band SDSS images, continued. Left: The distance in four-dimensional geometric hash code space between the query quad and the

first correctly-matched index quad. In this experiment we searched for matches within distance 0.01: well into the tail of the distribution. Right: The number of stars in the query image that the system built quads from before finding the first correct match. In a few cases, the brightest 4 stars formed a valid quad which was matched correctly to the index: we got the correct answer after our first guess!


Figure 2.12: Distribution of the excellent-quality r-band SDSS images on the sky. The

gray footprint represents the correctly-recognized images, while the black dots show the

images that were not recognized using the USNO-B-based index. There is a slight over-

density of failures near the beginnings and ends of runs, but otherwise no apparent spatial

structure.

brightest 50 stars. All 300 stars were used during the hypothesis-checking step, but since

the Bayes factors tend to be overwhelmingly large, we would have found similar results

if we had kept only the 50 brightest stars. We also told our system the angular size of

the images to within about 1 %, though we emphasize that this was merely a means of

reducing the computational burden of this experiment: we would have achieved exactly

the same results (after more compute time) had we provided no information about the

scale whatsoever; we show this in section 2.3.1.5 below.

Phase             Images recognized   Unrecognized   Percent recognized
USNO-B: 0.1 s     172,882             9,339           94.87
USNO-B: 1 s       181,826               395           99.78
USNO-B: 10 s      182,158                63           99.97
USNO-B: final     182,160                61           99.97
2MASS             182,211                10           99.99
Original images   182,221                 0          100.00

2.49 The results, shown in the table above and in figures 2.9, 2.10, 2.11, and 2.12, are

that we can successfully recognize over 99.97 % of the images. We then examined our

reference catalog, USNO-B, at the true locations of the images that were unrecognized.

Some of these locations contained unusual artifacts. For example, figure 2.13 shows a

region where the USNO-B catalog contains “worm” features. The cause of these artifacts

is unknown (David Monet, private communication), but they affect several square degrees

of one of the photographic plates that were scanned to create the USNO-B catalog.

2.50 In order to determine the extent to which our failure to recognize images is due to

problems with the USNO-B reference catalog, we built an index from the Two-Micron

All-Sky Survey (2MASS) catalog, using the same process as for the USNO-B index, us-

ing the 2MASS J-band rather than the USNO-B red bands. We then asked our system

to recognize each of the SDSS images that were unrecognized using the USNO-B-based

index. Of the 61 images, 51 were recognized correctly, leaving only 10 images unrecog-

Figure 2.13: “Worms” in USNO-B. We found these unusual artifacts by looking at one of the places where our system failed to recognize SDSS images. The image is of the photographic plate POSS-IE 275, centered on (RA, Dec) = (243, 36) degrees and 15 ×

15 arcmin in size. Image credit: Copyright Palomar Observatory, National Geographic

Society, and California Institute of Technology; courtesy of USNO Image and Catalogue

Archive.

nized. Examining these images, we found that some contained bright, saturated stars

which had been flagged as unreliable by the SDSS image-reduction pipeline. We retrieved

the original data frames and asked our system to recognize them. All 10 were recognized

correctly: our source extraction procedure was able to localize the bright stars correctly,

and with these the indexing system found a correct hypothesis. With these three process-

ing steps, we achieve an overall performance of 100 % correct recognition of all 182,221

excellent images, with no false positives. This took a total of about 80 minutes of CPU

time. The index is about 5 gigabytes in size, and once it is loaded into memory, multiple

CPU cores can use it in parallel, so the wall-clock time can be a fraction of the total

CPU time. During this experiment, a total of over 180 million quads were tried, resulting

in about 77 million matches to quads in the index. Many of these matches were found

to result in image scales that were outside the range we provided to the system, so the

verification procedure was run only 6 million times.

2.51 For completeness, we also checked the images that were rated as excellent but failed

our selection cut. We retrieved the original images and used our source extraction routine

and both the USNO-B- and 2MASS-based indexes. Our system was able to recognize

correctly all 1138 such images.

2.3.1.2 Performance on images of varying bandpass

2.52 In order to investigate the performance of our system when the bandpass of the query

image is different than that of the index, we asked the system to recognize images taken

through the SDSS filters u, g, r, i, and z. We used only the images rated “excellent”.


Figure 2.14: Performance on images taken through different SDSS bandpass filters. Left: The percentage of images recognized after building quads from a given number of the brightest stars in each image. The r-band is the best match to the USNO-B-based index we are using. Generally the recognition rate drops with the distance between the bandpass of the index and the bandpass of the image. The i-band performance in this instance is lower than expected. Right: The percentage of images recognized as the amount of CPU time spent per image increases. The r-band images are most quickly recognized, g-, i-, and z-band images take slightly more effort, and u-band images take considerably more CPU time. The asymptotic recognition rates are nearly identical except for u-band, which is slightly lower.

Percentage of images recognized

CPU time u g r i z

0.1 s 87.80 93.88 94.87 93.59 94.36

1 s 98.58 99.73 99.78 99.73 99.75

10 s 99.82 99.96 99.97 99.96 99.96

Final 99.84 99.96 99.97 99.96 99.96

2.53 The results, shown in the table above and in figure 2.14, demonstrate that as the

difference between the query image bandpass and the index bandpass increases, the

amount of CPU time required to recognize the same fraction of images increases. This

performance drop is more pronounced on the blue (u) side than the red (z) side. After looking at the brightest 50 stars, the system is able to recognize essentially the same fraction of images in every band. As shown by the first experiment in this section, this asymptotic

recognition rate is largely due to defects in the reference catalog from which the index is

built.

2.3.1.3 Performance on images of varying quality

2.54 In order to characterize the performance of the system as image quality degrades, we

asked the system to recognize r-band images that were classified as “excellent”, “good”,

“acceptable”, or “bad” by the SDSS image-reduction pipeline.

Percentage of images recognized

CPU time Excellent Good Acceptable Bad

0.1 s 94.87 94.85 94.57 84.11

1 s 99.78 99.74 99.64 96.58

10 s 99.97 99.94 99.94 99.11

Final 99.97 99.94 99.95 99.18

2.55 The results, shown in the table above and in figure 2.15, show almost no difference in performance between excellent, good, and acceptable images. Bad images show a significant drop in performance, though we are still able to recognize over 99 % of them.


Figure 2.15: Performance of the system given images of varying quality. Left: The percentage of images recognized after looking at a given number of stars in each image, for excellent-, good-, acceptable-, and bad-quality images from SDSS. There is a small drop in performance for good and acceptable images, and a more significant drop for bad ones; all but the bad reach approximately the same asymptotic recognition rate. Right: CPU time per image for each quality rating. All but the bad images show nearly indistinguishable performance.


2.3.1.4 Performance on images of varying angular size

2.56 We investigated the performance of our system with respect to the angular size of the images by cropping out all but the central region of the excellent-quality r-band images and running our system on the sub-images. Recall that the original image size is 13 × 9 arcmin², and that the index we are using contains quads with diameters between 4 and 5.6 arcmin. We cut the SDSS images down to sizes 9 × 9, 8 × 8, 7 × 7, and 6 × 6 arcmin². The 7 × 7 and 6 × 6 images required much more CPU time, so we ran only small random subsamples of the images of these sizes.

Image size (arcmin²)   Percentage of images recognized

13×9 99.97

9×9 99.88

8×8 99.52

7×7 86.53

6×6 24.75

2.57 The results, presented in the table above and in figure 2.16, show that performance degrades slowly for images down to 8 × 8 arcmin², and then degrades sharply. This is not surprising given the size of quads in the index used in this experiment: in the smaller images, only stars near the edges of the image can possibly be 4 to 5.6 arcmin away from another star, so the set of stars that can form the ‘backbone’ of a quad is small. Observe that images below 2.8 × 2.8 arcmin² cannot possibly be recognized by this index, since no pair of stars can be 4 arcmin or more away from each other: the diagonal of a 2.8 × 2.8 arcmin image is 2.8√2 ≈ 4 arcmin.

2.58 This does not imply that small images cannot be recognized by our system: given an index containing smaller quads, we may still be able to recognize them. The point is simply that for any given index there is some threshold of angular size below which the image recognition rate will drop, and another threshold below which the recognition rate will be exactly zero.


Figure 2.16: Performance of the system given images of varying angular sizes. Left: The percentage of images recognized after looking at a given number of stars in each image, for images of the given sizes. Perhaps surprisingly, more of the 8 × 8 images are recognized correctly during the first few milliseconds, but as more time elapses, the larger-scale images are more likely to be recognized. Right: CPU time per image required to recognize images of each angular size. After 10 ms, the larger images are recognized more quickly. The 7 × 7 and 6 × 6 images appear to be reaching asymptotic recognition rates far below 100 %.


2.3.1.5 Performance with varying image scale hints

2.59 In all the experiments above, we told our system the angular scale (in arcseconds per pixel) to within ±1.25 % of the true value, and we used an index containing only quads within a small range of diameters. In this experiment, we show that these hints merely make the recognition process faster without affecting the general results.

2.60 We created a set of sub-indices, each covering a range of √2 in quad diameters.

The smallest-scale sub-index contains quads of 2 to 2.8 arcmin, and the largest contains

quads of about 20 to 30 degrees in diameter. Each sub-index is built using the same

methodology as outlined above for the 4 to 5.6 arcmin index, with the scale adjusted

appropriately. The smallest-scale sub-index contains only 6 stars per HEALPix grid cell

rather than 10 as in the other sub-indices, because the USNO-B catalog does not contain

enough stars: a large fraction of the smallest cells contain fewer than 10 stars. In the

smallest-scale sub-index we then do 9 rounds of quad-building, reusing each star at most

5 times, as opposed to 16 rounds reusing each star at most 8 times as in the rest of the

sub-indices.
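To make this concrete, the following minimal Python sketch generates scale ranges of this form; the end-points and the stopping condition are illustrative assumptions rather than the exact values used to build our sub-indices.

```python
import math

lo = 2.0                          # smallest quad diameter, in arcmin
ranges = []
while lo < 30 * 60:               # stop near 30 degrees (1800 arcmin)
    hi = lo * math.sqrt(2)        # each range spans a factor of sqrt(2)
    ranges.append((lo, hi))
    lo = hi

for lo, hi in ranges[:3]:
    print(f"sub-index with quads of {lo:.1f} to {hi:.1f} arcmin")
# -> 2.0 to 2.8, 2.8 to 4.0, 4.0 to 5.7 arcmin, ...
```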

2.61 Each time our system examines a quad in the image, it searches each sub-index in turn for matching quads, and evaluates each hypothesized alignment generated by this process. The system proceeds in lock-step, testing each quad in the image against each sub-index in turn. The first quad match that generates an acceptably good alignment is taken as the result and the process stops. Note that several of the sub-indices may be able to recognize any given image. Indeed, figure 2.17 shows that every sub-index that contains quads that can possibly be found in SDSS images is first to recognize some of the images in this experiment. Different strategies for ordering the computation—for example, spending equal amounts of CPU time in each sub-index rather than proceeding in lock-step—might result in better overall performance.


Figure 2.17: The sub-index that is first to recognize SDSS images, when all the sub-indices are run in lock-step. Although we used sub-indices containing quads up to 30 degrees in diameter, clearly only the ones that contain quads that can possibly be found in an SDSS image (which are about 16 arcmin across the diagonal) can generate correct hypotheses that will recognize the image. Perhaps surprisingly, each of these sub-indices is first to recognize some subset of the images, though a strong tuning effect is clear.

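The lock-step strategy itself is simple; the sketch below summarizes it in Python, with hypothetical helper names (search, hypothesized_alignment, verify) standing in for the actual implementation.

```python
def lockstep_solve(image_quads, sub_indices, verify):
    """Try each quad from the image against each sub-index in turn,
    returning the first hypothesized alignment that passes verification."""
    for quad in image_quads:                  # image quads, brightest stars first
        for index in sub_indices:             # each scale range in turn
            for match in index.search(quad):  # matching index quads within tolerance
                alignment = match.hypothesized_alignment()
                if verify(alignment):         # accept the first verified hypothesis
                    return alignment
    return None                               # no sub-index recognized the image
```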

2.3.1.6 Performance with varying index quad density

2.62 The fiducial index we have been using in these experiments contains about 16 quads

per HEALPix grid cell. Since each SDSS image has an area of about 7 cells, we expect

each image to contain about 100 quad centers. The number of complete quads (i.e.,

quads for which all four stars are contained in the image) in the image will be smaller,

but we still expect each image to contain many quads. This gives us many chances of

finding a correct match to a quad in the image. This redundancy comes at a cost: the

total number of quads in the index determines the rate of false matches—since the code-

space volume is fixed, packing more quads into the space results in a larger number of

matches to any given query point—which directly affects the speed of each query. By

building indices with fewer quads, we can reduce the redundancy but increase the speed

of each query. This does not necessarily increase the overall speed, however: an index

containing fewer quads may require more quads from the image to be checked before a

correct match is found. In this experiment, we vary the quad density and measure the

overall performance.

Percentage of images recognized

CPU time 16 quads/cell 9 quads/cell 4 quads/cell 3 quads/cell 2 quads/cell

0.1 s 94.44 96.32 95.89 94.30 90.94

1 s 99.76 99.84 99.61 99.36 97.35

10 s 99.96 99.95 99.79 99.65 97.92

2.63 As shown in the table above and figure 2.19, reducing the density from 16 to 9 quads per HEALPix grid cell has almost no effect on the recognition rate but takes only two-thirds as much CPU time. Reducing the density further begins to have a significant effect on the recognition rate, and actually takes more CPU time overall.


Figure 2.18: Performance of the system given varying limits on the image size. Left: CPU time per image required to recognize images, given various limits on the angular size of the image. Our system achieves the same asymptotic recognition rate in each case: giving the system less information about the true scale of the image simply means that it must evaluate and reject more false hypotheses before finding a true one. The “≥ 8 arcmin” hint indicates that we told the system that the image width is between 8 arcmin and 180 degrees. The upper limit has much less impact on performance than the lower limit, since the sub-indices that cover large angular scales contain many fewer quads and are therefore much faster to search, and generate fewer coincidental matches. Right: CPU time per image relative to the ±1.25 % case. We divided the CPU times for each case into percentiles; the mean time within each percentile is plotted. Generally, giving the system less information about the size of the images results in an approximately constant-factor increase in the CPU time required. Although there appears to be a sharp upward trend for the “loosest” four size ranges, this may be an effect of small sample size: since these cases take so long to run, we tested only 1000 images, while for the rest of the cases we tested 10,000 images. The CPU time distribution is heavy-tailed, so the expected variance is large.


Figure 2.19: Performance of the system using indices containing varying densities of quads. (The total number of quads in an index is proportional to the density of quads.) Left: CPU time per image required to recognize images. Right: Relative CPU time to recognize images. We split the set of images into percentiles and have plotted the mean time within each percentile, relative to the 16-quad-per-cell reference index. The indices containing fewer quads are faster to search per quad query, but may require more quads to be tried before a correct match is found. The smaller indices are also able to recognize fewer images, because some images will simply not contain a quad that appears in the index. For the high-quality SDSS images we are using, the smallest of the indices here results in a 2 % drop in recognition rate (from nearly 100 % to about 98 %), but for poorer-quality images the drop could be larger.

2.3.1.7 Performance on indices built from triangles and quints

2.64 We tested the performance of our quad-based index against a triangle-based index and a quintuple-based (“quint”) index. The index-building processes were exactly the same as for our quad-based indices.

2.65 In the experiments above we searched for all matches within a distance of 0.01 in the

quad feature space. Since the triangle- and quint-based indices have feature spaces of

different dimensionalities (2 for triangles, 6 for quints), we first ran our system with the

matching tolerance set to 0.02 in order to measure the distribution of distances of correct

matches in the three feature spaces. We then set the matching tolerance to include 95 %

of each distribution. For the triangle-based index, we found this matching tolerance to

be 0.0064, for the quad-based index it was 0.0095, and for the quint-based index, 0.011.

These experiments were performed on a random subset of 4000 images, because they are

quite time-consuming.

Percentage of images recognized

CPU time Triangles Quads Quints

0.1 s 0.20 57.45 23.35

1 s 28.07 92.20 36.67

10 s 78.58 99.28 72.75

Final 99.97 99.33 96.25

2.66 The results are given in the table above and in figure 2.20. After looking at the brightest 50 stars in each image, the triangle-based index is able to recognize the largest number of images, but both the triangle- and quint-based indices take significantly more time than the quad-based index. It seems that quad features strike the right balance: they are distinctive enough that a given query does not generate too many coincidental (false) matches, as triangle-based queries do, yet contain few enough stars that a feature present in both the image and the index can be found quickly, which is not the case for quints.


Figure 2.20: Performance of the system using indices containing triangles, quads, and quintuples of stars. Left: CPU time per image required to recognize images. The quad-based index vastly outperforms the triangle- and quint-based indices. Right: The number of features tried (i.e., the number of triangles, quads, or quints of stars from the image that were tested), and the number of matches to features in the index that resulted. Each plotted point summarizes 2 % of the correctly-recognized images, sorted by the number of features tried. As expected, the triangle-based index produces many more matches for any given query, because the same number of features are packed into a lower-dimensional feature space. Fewer features have to be tried before the first correct matches are found, because only three corresponding stars have to be found. Quints, on the other hand, are very distinctive: for over 80 % of the images that were correctly recognized, the first matching quint ever found was a correct match. However, many quints from the image have to be tested before this match is found.

2.67 We expect that the relative performance of triangle-, quad-, and quint-based indices

depends strongly on the angular size of the images to be recognized. An index designed

to recognize images of large angular size requires fewer features, so the code space is less

densely filled and fewer false matches are generated. For this reason, we expect that

above some angular size, a triangle-based index will recognize images more quickly than

a quad-based index.
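To make the dimensionality comparison concrete, the sketch below computes a continuous geometric hash code for a group of k stars, using two stars to anchor a translation-, rotation-, and scale-invariant frame; this gives 2(k − 2) code dimensions (2 for triangles, 4 for quads, 6 for quints). The symmetry-breaking details of our actual codes are omitted, so treat it as illustrative.

```python
import numpy as np

def star_code(xy):
    """Hash code for k stars: the positions of the k-2 remaining stars,
    expressed in a frame anchored by the most widely separated pair."""
    z = xy[:, 0] + 1j * xy[:, 1]                        # positions as complex numbers
    sep = np.abs(z[:, None] - z[None, :])
    a, b = np.unravel_index(np.argmax(sep), sep.shape)  # widest pair anchors the frame
    # Similarity transform sending star a -> (0,0) and star b -> (1,1), so
    # the code is invariant to translation, rotation, and scaling.
    w = (1 + 1j) * (z - z[a]) / (z[b] - z[a])
    rest = np.delete(w, [a, b])
    return np.column_stack([rest.real, rest.imag]).ravel()

# A quad gives a 4-dimensional code:
print(star_code(np.array([[0.0, 0.0], [10.0, 10.0], [3.0, 7.0], [6.0, 2.0]])))
```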

2.3.2 Blind astrometric calibration of Galaxy Evolution Explorer data

2.68 To show the performance of our system on significantly larger images, at a bandpass quite far from that of our index, we experimented with data from the Galaxy Evolution Explorer (GALEX). GALEX is a space telescope that observes the sky through near- and far-ultraviolet bandpass filters. Although it is fundamentally a photon-counting device, the GALEX processing pipeline renders images (by collapsing the time dimension and histogramming the photon positions), and these are the data products used by most researchers. The images are circular, with a diameter of about 1.2 degrees. See figure 2.21. In this experiment, rather than using the images themselves, we use the catalogs (lists of sources found in each image) that are released along with the images. These catalogs are produced by running a standard source extraction program (SExtractor) on the images. We retrieved the near-UV catalogs for all 28,182 images in GALEX Release 4/5 that have near-UV exposure.

2.69 We built a series of indices of various scales, each spanning roughly √2 in scale. The

smallest contained quads with diameters from 11 to 16 arcmin, the next contained quads between 16 and 22 arcmin, followed by 22 to 30, 30 to 42, and 42 to 60 arcmin. Each index was built according to the “standard recipe” given above, using stars from the red bands of USNO-B as before. In total these indices contain about 30 million stars and 36 million quads.


Figure 2.21: A sample GALEX field (AIS-101-sg32-nuv), with circles showing the sizes of the quads in the indices we use in this experiment. The images produced by the GALEX pipeline are 3840 × 3840 pixels, and the near-UV field of view is a circle of diameter 1.24 degrees, or about 74.4 arcmin. Image credit: Courtesy NASA/JPL-Caltech.

CPU time Percentage of images recognized

1 s 74.46

10 s 93.56

100 s 98.95

Final 99.74

2.70 In this experiment, we told our system that the images were between 1 and 2 degrees wide, and we allowed it to build quads from the first 100 sources in each image. The results are shown in the table above and figure 2.22. The recognition rate is quite similar to that of the excellent-quality SDSS r-band fields, suggesting that the larger angular size of these images compensates for their near-UV bandpass being quite far from the bandpass of the index, which we would otherwise expect to make the system less successful at recognizing them. The system is significantly slower at recognizing GALEX images, but this is partly because we gave it a fairly wide range of angular scales, and because we used several indices rather than the single one whose scale is best tuned to these images.

2.3.3 Blind astrometric calibration of Hubble Space Telescope data

2.71 In order to demonstrate that there is nothing intrinsic in our method that limits us to

images of a particular scale, we retrieved a set of Hubble Space Telescope (HST) images

and built a custom index to recognize them. We chose the All-wavelength Extended Groth

strip International Survey (AEGIS) [17] footprint because it has many HST exposures

and is within the SDSS footprint.

2.72 To build the custom index for this experiment, we retrieved stars and galaxies from SDSS within a 2 × 2 degree square centered on RA = 215 degrees, Dec = 52.7 degrees with measured r-band brightness between 15 and 22.2 mag, yielding about 57,000 sources.


Figure 2.22: Results on GALEX near-UV images. Left: CPU time used per image. The shape of the curve is very similar to that in the previous SDSS experiments, though the “knee” of diminishing returns occurs after more CPU time (possibly because we gave the system much less information about the correct scale of the images). Right: The number of images recognized by each of the indices (identified by the range of sizes of quads they contain). For each quad in the image, we search for matches in each of the indices, stopping after the first match that is confirmed by the verification test. The histogram therefore shows only which index recognized the image first, rather than which indices might have recognized it given more time. The inset histograms show the distribution of quad sizes within each index (on a linear scale). Generally the larger quads are more successful, except in the largest index, where the size of the largest quads approaches that of the whole image.


Figure 2.23: The diameters of quads in our custom index, built from SDSS stars, for recognizing Hubble Space Telescope Advanced Camera for Surveys (ACS) images in the AEGIS footprint. Although we allowed quads with diameters between 0.5 and 2 arcmin, the size distribution of the quads that were created is heavily biased toward the large end of the range, because the density of SDSS stars—about 4 per square arcminute—is not high enough to build many tiny quads.

We created the index as usual, using a HEALPix grid with cells of size 0.5 arcmin, and building quads with diameters between 0.5 and 2 arcmin. This yielded just over 100,000 quads. Since the density of stars and galaxies in our index is only about 4 sources per square arcminute, the system tended to produce quads with sizes strongly skewed toward the larger end of the allowed range of scales. See figure 2.23.
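The per-cell selection step of this construction can be sketched as follows; the cell_of helper stands in for a real HEALPix lookup (such as healpy.ang2pix) and is an assumption of the sketch.

```python
import numpy as np

def select_uniform_stars(radec, mags, cell_of, n_per_cell=10):
    """Keep only the n brightest stars in each grid cell, so the index
    has roughly constant star density on the sky."""
    kept, counts = [], {}
    for i in np.argsort(mags):          # brightest (smallest magnitude) first
        c = cell_of(*radec[i])          # grid cell containing this star
        if counts.get(c, 0) < n_per_cell:
            kept.append(i)
            counts[c] = counts.get(c, 0) + 1
    return np.array(kept)
```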


Figure 2.24: Left: A typical cutout of a Hubble Space Telescope Advanced Camera for

Surveys (ACS) image as used in our experiment. The overlaid circles show the range of

diameters of the quads in the index we use to recognize these images. ACS images are

about 3.4 arcmin square and 4096×4096 pixels, but the cutouts we use in our experiment

are about 3 arcmin and have been downsampled to 600×600 pixels. The quad diameters

are from 0.5 to 2 arcmin. Image credit: Courtesy NASA/JPL-Caltech. Right: A ∼ 2

square degree region (part of the ∼ 4 square degree region of SDSS from which we built

our index), overlaid with the footprints of the 191 ACS images that were recognized by

our system. There are only 64 unique footprint regions because some of the 191 images

are observations of the same region through different bandpass filters. The grid lines

show RA and Dec.

2.73 We queried the Hubble Legacy Archive [36] for images taken by the Hubble Advanced

Camera for Surveys (ACS) [23] within the AEGIS footprint. Since individual ACS expo-

sures contain many cosmic rays, we requested only “level 2” images, which are created

from multiple exposures and have cosmic rays removed. A total of 191 such images were

found. We retrieved 600×600-pixel JPEG previews—see figure 2.24 for an example—and

gave these images to our system as inputs.

2.74 Our system successfully recognized 100 % of the 191 input images, taking an average

of 0.3 seconds of CPU time per image (not including the time required to perform source

extraction). The footprints of the input images are shown in figure 2.24. Although there

are 191 images, there are only 64 unique footprints, because images were taken through

several different bandpass filters for many of the footprints. Although the index contained

quads with diameters from 0.5 to 2 arcmin, the smallest quad that was used to recognize

a field was about 0.9 arcmin in diameter.

2.3.4 Blind astrometric calibration of other imagery

2.75 The Harvard Observatory archives contain over 500,000 photographic glass plates

exposed between 1880 and 1985. See figure 2.25 for an example. The Digital Access to

a Sky-Century at Harvard (DASCH) project is in the process of scanning these plates

at high resolution [50, 27]. Since the original astrometric calibration for these plates

consists of hand-written entries in log books, a blind astrometric calibration system was

required to add calibration information to the digitized images. DASCH has been using

the Astrometry.net system for the past year to create an initial astrometric calibration

which is then refined by WCSTools [51].

2.76 The DeepSky project [55] is reprocessing the data taken as part of the Palomar-QUEST sky survey and the Nearby Supernova Factory [20, 2]. Since many of the images have incorrect astrometric meta-data, they are using Astrometry.net to do the astrometric calibration. Over 14 million images have been successfully processed thus far (Peter Nugent, personal communication).

Figure 2.25: A sample DASCH scan of part of one of the photographic glass plates of the Harvard Observatory archives. The initial astrometric calibration of these plates is being computed by Astrometry.net. Praesepe, also known as the Beehive cluster or Messier 44, appears in this image. Image credit: DASCH team; Harvard College Observatory.

2.77 We have also had excellent success using the same system for blindly calibrating a wide class of other astronomical images, including amateur telescope shots and photographs from consumer digital SLR cameras, some of which span tens of degrees. We have also calibrated videos posted to YouTube. These images have very different exposure properties, capture light in the optical, infrared and ultraviolet bands, and often have significant distortions away from the pure tangent-plane projection of an ideal camera. Part of the remarkable robustness of our algorithm, which allows it to calibrate all such images using the same parameter settings, comes from the fact that the hash function is scale invariant, so that even if the center of an image and the edges have a significantly different pixel scale (solid angle per pixel), quads in both locations will match properly into the index (although our verification criterion may conservatively decide that a true solution with substantial distortion is not correct). Furthermore, no individual quad or star, either in the query or the index, is essential to success. If we miss some evidence in one part of the image or the sky, we have many more chances to find it elsewhere.

2.3.5 False positives

2.78 Although we set our operating thresholds to be very conservative in order to avoid

false positive matches, images that do not conform to the assumptions in our model can

yield false positive matches at much higher rates than predicted by our analysis. In

particular, we have found that images containing linear features that result in lines of

detected sources are often matched to linear flaws in the USNO-B reference catalog. We

removed linear flaws resulting from diffraction spikes [7], but many other linear flaws

remain. An example is shown in figures 2.26 and 2.27.

2.4 Discussion

2.79 We have described a system that performs astrometric calibration—determination of image pointing, orientation, and plate scale—blind, or without any prior information beyond the data in the image pixels.


Figure 2.26: A false-positive match. Top: The input image contains a linear feature: the International Space Station streaked across the image. Image credit: copyright Massimo Matassi. Bottom: The USNO-B scanned photographic plate has writing on the corner. Image credit: copyright Palomar Observatory, National Geographic Society, and California Institute of Technology; courtesy of USNO Image and Catalogue Archive.


Figure 2.27: A false-positive match (continued). Top: The image, rotated to the alignment that our system found. The circles show the sources that were detected. The linear feature becomes a line of false sources. Bottom: The corresponding region of the USNO-B plate. The USNO-B source detection algorithm identifies many false sources in the region of text. The lines of sources in the two images are aligned in this (false) match.

This system works by using indexed asterisms

to generate hypotheses, followed by quantitative verification of those hypotheses. The

system makes it possible to vet or restore the astrometric calibration information for

astronomical images of unknown provenance, and images for which the astrometric cali-

bration is lost, unknown, or untrustworthy.

2.80 There are several sources of astronomical imagery that could be very useful to

researchers—especially those studying the time domain—if they were consistently and

correctly calibrated. These include photographic archives, amateur astronomers, and a

significant fraction of professional imagery which has incorrect or no astrometric calibra-

tion meta-data. The photographic archives extend the time baseline up to a century,

while amateur astronomers—some of whom are highly-skilled and well-equipped—can

provide a dense sampling of the time domain and can dedicate a large amount of ob-

serving time to individual targets. Our system allows these sources of data to be made

available to researchers by creating trustworthy astrometric calibration meta-data.

2.81 The issue of trust is key to the success of efforts such as the Virtual Observatory to

publish large, heterogeneous collections of data produced by many groups and individuals.

Without trusted meta-data, data are useless for most purposes. Our system allows

existing astrometric calibration meta-data to be verified, or new trustworthy meta-data to

be created, from the data. Furthermore, applying a principled and consistent calibration

procedure to heterogeneous collections of images enables the kinds of large-scale statistical

studies that are made possible by the Virtual Observatory.

2.82 Our experiments on Sloan Digital Sky Survey, Galaxy Evolution Explorer, and Hubble Space Telescope images have demonstrated the capabilities and limitations of our system. The best performance, in terms of the fraction of images that are recognized and the computation effort required, is achieved when the index of known asterisms is well-matched to the images to be recognized. Differences in the bandpasses of the index and

images lead to a small drop in performance across the near-infrared to near-ultraviolet

range. By creating multiple indices across the spectrum we could overcome this limita-

tion, if suitable reference catalogs were available. The image quality has some effect on

the performance, though our experiments using the quality ratings assigned by the SDSS

image-processing pipeline do not fully explore this space since the quality ratings are in

terms of the very high standards of the survey: even the “bad” images are reasonable

and over 99 % of them can be recognized by our system. More experiments on images

with poorly-localized sources would better characterize the behavior of the system on

low-quality images.

2.83 In order to handle images across a wide range of scales, we build a series of indices,

each of which is specialized to a narrow range of scales. Each index is able to recognize

images within, and extending somewhat outside, the range to which it is tuned, but

the drop-off in performance is quite fast: the index we used in the majority of our

experiments works very well on 13 × 9 arcmin² SDSS images and sub-images down to 8 × 8 arcmin², but suffers a serious drop in performance on 7 × 7 arcmin² images. Similarly,

the computational effort required is sharply reduced if the system is given hints about

the scale of the images to be recognized. This is driven by three main factors. First, any

index that contains quads that cannot possibly be found in an image of the given range

of scales need not be examined. Second, only quad features of a given range of scales in

the image need be tested. Finally, every quad in the image that is matched to a quad in

the index implies an image scale, and any scale outside the allowed range can be rejected

without running the verification procedure.
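The third of these filters amounts to a one-line test; the parameter names in this sketch are illustrative rather than the actual implementation's.

```python
def implied_scale_ok(index_quad_diam_arcsec, image_quad_diam_pix,
                     lo_arcsec_per_pix, hi_arcsec_per_pix):
    """A matched quad pair implies a pixel scale for the image; reject the
    match before verification if that scale falls outside the hinted range."""
    implied = index_quad_diam_arcsec / image_quad_diam_pix   # arcsec per pixel
    return lo_arcsec_per_pix <= implied <= hi_arcsec_per_pix
```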

2.84 Using a set of indices, each of which is tuned to a range of scales, is related to the idea of co-visibility constraints in computer vision systems: “closely located objects are likely to be seen simultaneously more often than distant objects” [68]. Each of our indices contains the brightest stars within grid cells of a particular size, and contains quad features at a similar scale, so quads of large angular extent can only be built from

bright stars, while small quads can be built from quite faint stars. This captures the

practical fact that the angular scale of an image largely determines the brightnesses of

the stars it contains. Distant pairs of faint stars are very unlikely to appear in the same

image, and we take advantage of this constraint by only using faint stars to build quads

of small angular size.

2.85 The index used in most of our experiments covers the sky in a dense blanket of quads.

This means that in any image, we have many chances of finding a matching quad, even

if some stars are missing from the image or index. This comes at the cost of increasing

the number of features packed into our code feature space, and therefore the number of

false matches that are found for any given quad in a test image. Reducing the number of

quads means that each query will be faster, but more queries will typically be required

before a match is found.

2.86 Our experiment using indices built from triangles and quintuples of stars shows that,

for SDSS images, our geometric features built from quadruples of stars make a good

tradeoff between being distinctive enough that the feature space is not packed too tightly,

yet having few enough stars that the probability of finding all four stars in both the image

and index is high. We expect that for images much larger in angular size than SDSS

images, a triangle-based index might perform better, and for images smaller than SDSS

images, a quint-based index might be superior.

2.87 Similarly, we found that using a voting scheme—requiring two or more hypotheses to

agree before running the relatively expensive verification step—was slower than simply

running the verification process on each hypothesis, when using a quad-based index and

SDSS-sized images. In other domains (such as triangle-based indices), a voting scheme

could be beneficial.

2.88 Although we have focused on the idea of a system that can recognize images of any scale from any part of the sky, our experiments on Hubble Space Telescope images demonstrate that by building a specialized index that covers only a tiny part of the sky,

we can recognize tiny images that contain only a few stars that appear in even the deepest

all-sky reference catalogs.

2.89 Our system, built on the idea of geometric hashing—generating promising hypotheses

using a small number of stars and checking the hypotheses using all the stars—allows

fast and robust recognition and astrometric calibration of a wide variety of astronomical

images. The recognition rate is above 99.9 % for high-quality images, with no false pos-

itives. Other researchers have begun using Astrometry.net to bring otherwise “hidden”

data to light, and we hope to continue our mission “to help organize, annotate and make

searchable all the world’s astronomical information.”

2.90 All of the code for the Astrometry.net system is available under an open-source license,

and we are also operating a web service. See http://astrometry.net for details.

Acknowledgements

2.91 We thank Jon Barron, Doug Finkbeiner, Chris Kochanek, Robert Lupton, Phil Mar-

shall, John Moustakas, Peter Nugent, and Christopher Stumm for comments on and con-

tributions to the prototype version of the online service Astrometry.net. It is a pleasure

to thank also our large alpha-testing team. We thank Leslie Groer and the University of

Toronto Physics Department for use of their Bigmac computer cluster, and Dave Monet

for maintaining and helping us with the USNO-B Catalog. Lang, Mierle and Roweis were

funded in part by NSERC and CRC. Hogg was funded in part by the National Aero-

nautics and Space Administration (NASA grants NAG5-11669 and NNX08AJ48G), the

National Science Foundation (grants AST-0428465 and AST-0908357), and the Alexan-

der von Humboldt Foundation. Hogg and Blanton were funded in part by the NASA

Spitzer Space Telescope (grants 30842 and 50568).

2.92 This project made use of public SDSS data. Funding for the SDSS and SDSS-II

has been provided by the Alfred P. Sloan Foundation, the Participating Institutions, the

National Science Foundation, the U.S. Department of Energy, the National Aeronautics and Space Administration, the Japanese Monbukagakusho, the Max Planck Soci-

ety, and the Higher Education Funding Council for England. The SDSS Web Site is

http://www.sdss.org/.

2.93 The SDSS is managed by the Astrophysical Research Consortium for the Participating

Institutions. The Participating Institutions are the American Museum of Natural His-

tory, Astrophysical Institute Potsdam, University of Basel, University of Cambridge, Case

Western Reserve University, University of Chicago, Drexel University, Fermilab, the Insti-

tute for Advanced Study, the Japan Participation Group, Johns Hopkins University, the

Joint Institute for Nuclear Astrophysics, the Kavli Institute for Particle Astrophysics and

Cosmology, the Korean Scientist Group, the Chinese Academy of Sciences (LAMOST),

Los Alamos National Laboratory, the Max-Planck-Institute for Astronomy (MPIA), the

Max-Planck-Institute for Astrophysics (MPA), New Mexico State University, Ohio State

University, University of Pittsburgh, University of Portsmouth, Princeton University, the

United States Naval Observatory, and the University of Washington.

2.94 This publication makes use of data products from the Two Micron All Sky Survey,

which is a joint project of the University of Massachusetts and the Infrared Processing and

Analysis Center/California Institute of Technology, funded by the National Aeronautics

and Space Administration and the National Science Foundation.

2.95 Based on observations made with the NASA/ESA Hubble Space Telescope, and ob-

tained from the Hubble Legacy Archive, which is a collaboration between the Space

Telescope Science Institute (STScI/NASA), the Space Telescope European Coordinating

Facility (ST-ECF/ESA) and the Canadian Astronomy Data Centre (CADC/NRC/CSA).

2.96 This research made use of the NASA Astrophysics Data System.

2.97 This research has made use of the USNOFS Image and Catalogue Archive operated

by the United States Naval Observatory, Flagstaff Station

(http://www.nofs.navy.mil/data/fchpix/).

Chapter 3

Verifying a proposed alignment between astronomical images

3.1 Introduction

3.1 Suppose we are given two lists of astronomical sources (“stars”, hereafter) and a

hypothesized astrometric alignment between them. We wish to produce a probabilistic

assessment of whether the alignment is true or false, where a false alignment might

nonetheless show some coincidentally matching stars. For example, one list might be

an astrometric reference catalog and the other a list of stars detected in an image, and

the hypothesized alignment could be a World Coordinate System (WCS) transformation

that maps pixel coordinates in the image to celestial coordinates in the reference catalog.

We then wish to assess whether applying the given transformation to the stars detected

in the image causes them to align with the stars in the reference catalog.

3.2 We assume that the lists of stars include the positions, positional uncertainties, and

an approximate brightness ordering. We assume that the positional errors are drawn

from known probability distributions that are correctly described by the positional un-

certainties. In this chapter we will assume the positional errors are drawn from Gaussian


distributions whose variances are given by the positional uncertainties.

3.3 In true alignments, some fraction of the stars in each list will have no counterpart

in the other list. This could be due to occlusions in the images (from which the lists of

stars are produced), artifacts or errors in detecting stars, or differences in the spectral

bandpass or sensitivity of the two images. True stars can be missing from, and false

stars added to, either list. The lists may be of quite different lengths due to different

sensitivities and error rates.

3.4 Since our goal is to decide whether to accept or reject a proposed alignment, and we

assume a probabilistic model, we can frame the problem in terms of Bayesian decision

theory. This is briefly reviewed in the following section. We then proceed in section 3.3

to develop a simple model, point out the ways in which it fails and modify the model to

handle these concerns. This problem is central to the Astrometry.net system, which uses

geometric hashing to generate a large number of hypotheses which must be checked. We

explore the specifics of the Astrometry.net application in section 3.4.

3.2 Bayesian decision-making

3.5 We wish to decide whether to accept or reject a proposed alignment. In effect, we

are choosing between two models: a “foreground” model, in which the alignment is true,

and a “background” model, in which the alignment is false. In Bayesian decision-making,

three factors contribute to this decision: the relative abilities of the models to explain

the observations, the relative proportions of true and false alignments we expect to see,

and the relative costs or utilities of the outcomes resulting from our decision.

3.6 The Bayes factor K is a quantitative assessment of the relative abilities of the two

models—the foreground model F and the background model B—to produce or explain

the observations. In this case the observations, or data, D, are the two lists of stars. The

Bayes factor

K = p(D | F) / p(D | B) (3.1)

is the ratio of the marginal likelihoods. We must also include in our decision-making the

prior p(F )/p(B), which is our a priori belief, expressed as a ratio of probabilities, that

a proposed alignment is correct. In the Astrometry.net case, for example, we typically

examine many more false alignments than true alignments, so this ratio will be small.

3.7 The final component of Bayesian decision theory is the utility table, which expresses

the subjective value of each outcome. In this problem, there are four possible outcomes:

If the alignment we are considering is true and we decide to accept it, the outcome is

a true positive (“TP”), while if we reject it we generate a false negative (“FN”). If the

alignment is false and we accept it, we produce a false positive outcome (“FP”), while if

we reject it we get a true negative (“TN”). The utility table, as shown below, assigns a

value u(·) to each of these outcomes.

                      True Alignment?
                      Yes       No
Decision   Accept     u(TP)     u(FP)
           Reject     u(FN)     u(TN)

3.8 We make our decision to accept or reject the proposal by computing the expected

utility E [ u ] of each decision. The expected utility of accepting the alignment is:

E [ u | Accept,D ] = u(TP) p(TP | D) + u(FP) p(FP | D) (3.2)

= u(TP) p(F | D) + u(FP) p(B | D) (3.3)

while the expected utility of rejecting the alignment is:

E [ u | Reject,D ] = u(FN) p(FN | D) + u(TN) p(TN | D) (3.4)

= u(FN) p(F | D) + u(TN) p(B | D) . (3.5)

We should accept the hypothesized alignment if:

E [ u | Accept,D ] > E [ u | Reject,D ] (3.6)

u(TP) p(F | D) + u(FP) p(B | D) > u(FN) p(F | D) + u(TN) p(B | D) (3.7)

p(F | D) / p(B | D) > [u(TN) − u(FP)] / [u(TP) − u(FN)] (3.8)

K > [p(B) / p(F)] · [u(TN) − u(FP)] / [u(TP) − u(FN)] (3.9)

where we have applied Bayes’ theorem to get

p(F | D) / p(B | D) = K p(F) / p(B) . (3.10)

3.9 That is, we accept or reject a proposed alignment by thresholding the Bayes factor of

the foreground model to the background model. The threshold is set based on our desired

operating point for the particular application. Increasing the threshold will cause us to

reject more proposals, thus changing some false positives to true negatives (increasing

the utility), while changing some true positives to false negatives (decreasing the utility).

3.10 There is an arbitrary overall scaling of the utility values, since in equation 3.9 only the

ratios of differences of utilities appear. When setting utility values, it may be convenient

to set u(TP) = 1 and choose the other utilities relative to that. In that case, the

u(FP) and u(FN) values may be negative. In the Astrometry.net case, for example, we

want strongly to avoid false positives, so the ratio of utility differences is dominated by

u(FP)  0, and combined with the small prior p(F )/p(B) mentioned earlier, we find

that K must be very large: we demand that the foreground model be an overwhelmingly

better explanation of the data before accepting the hypothesis.

3.3 A simple independence model

3.11 In this section we consider one of the lists to be a fixed “reference” list, and develop

foreground and background models in which each star in the other “test” list is treated as an independent sample from the model.


Figure 3.1: Left: one-dimensional slice of the simple foreground and background probability distributions F1 and B1. The background model is uniform, while the foreground model has a uniform floor plus Gaussian peaks around each reference star. Right: two-dimensional probability density ratio of the foreground to background models—the Bayes factor terrain for a single test star—on a log scale. The contours indicate where the models are equal. In these plots, the distractor fraction d = 0.25, σ²i,j = 10, and the image size is 400 × 400 pixels. This value of σ²i,j is larger than is typical in astronomical images in order to make the figures more clear.

In this case, the Bayes factor (equation 3.1)

becomes:

K1 = p(D | F) / p(D | B) = ∏_{i=1}^{Nt} [ p(ti | F) / p(ti | B) ] (3.11)

where ti is the ith star in the test list, and the reference list is used to define the foreground model. We have added the subscript 1 to K to identify this definition of the Bayes factor: we will use different definitions in later sections.

3.12 In the foreground model, each test star is generated either by a reference star, or

by an object that does not appear in the reference list: possibly a real star, possibly

an artifact. If a test star is generated by a reference star, then its observed position is

distributed according to the positional variances of the reference and test stars. If a test

star is not generated by a reference star, we assume it will be observed with uniform

probability anywhere in the image. We call test stars that are not generated by reference

stars “distractors.” Distractor stars can be generated if the test star corresponds to

a real star that does not appear in the reference list, or if the test star is actually an

artificial satellite, a planet, a cosmic ray, a false positive in the star detection process,

or some other kind of artifact. The distractor component of the model is a catch-all for

artifacts and errors in both the test and reference lists. In general we expect stars to be

both added and removed from both lists, but since it is difficult to determine whether a

distractor is due to a star being added to the test list or removed from the reference list,

we lump these cases together.

3.13 This foreground probability model, F1, is thus a mixture of a uniform distribution of distractor stars, plus a Gaussian distribution around each reference star:

p(ti | F1) = d/A + (1 − d)/Nr ∑_{j=1}^{Nr} N(ti | rj, σ²i,j) (3.12)

where d is the fraction of distractor stars, A is the area of the image, and {rj} are the Nr reference stars. N(x | µ, σ²) is the probability of drawing value x from the Gaussian distribution with mean µ and variance σ². Note that ti and rj are both two-dimensional vectors (positions in the image plane). We have written the combined positional variance of test star i and reference star j as σ²i,j.

3.14 In the background model, test stars are drawn without regard for the positions of the

reference stars. The simplest model, B1, is that the test stars are distributed uniformly across the image:

p(ti | B1) = 1/A (3.13)

where, as before, A is the area of the image. These models are illustrated in figure 3.1.
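Equations 3.11–3.13 are straightforward to evaluate; the sketch below computes the log of K1, with the simplifying assumption of a single shared positional variance in place of the per-pair σ²i,j of the text.

```python
import numpy as np

def log_K1(test_xy, ref_xy, sigma2, d, A):
    """Log Bayes factor of the simple independence model, equations 3.11-3.13."""
    log_K = 0.0
    for t in test_xy:
        r2 = ((ref_xy - t) ** 2).sum(axis=1)                       # squared distances to references
        gauss = np.exp(-0.5 * r2 / sigma2) / (2 * np.pi * sigma2)  # 2-D Gaussian densities
        fg = d / A + (1.0 - d) * gauss.mean()                      # foreground, equation 3.12
        bg = 1.0 / A                                               # uniform background, equation 3.13
        log_K += np.log(fg) - np.log(bg)                           # product of eq. 3.11, in log space
    return log_K
```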

3.15 Observe that the background model probability density sums to unity over the area of the image, while the foreground model yields a value very slightly less than unity since the Gaussians around each reference star have infinite support but the image is finite. Since the positional variances in typical astronomical images are small relative to the image size, this effect is negligible.


Figure 3.2: Left: “background” model: the reference star list R and test star list T are

unrelated: each is generated by sampling from its own underlying probability density of

stars (Ur and Ut, respectively). Right: “foreground” model: star lists R and T are two samples from a common underlying probability distribution of stars, U.


3.3.1 Issues with this model

3.16 We will now present a number of different problems with this simple setting, adjust-

ing the models to address each concern. Some of the scenarios we describe may seem

contrived, but in Astrometry.net we have encountered most of them in real data.

3.3.1.1 Asymmetric model structure

3.17 One might ask why we have chosen a setting that treats reference and test stars

differently, as opposed to a setting such as that pictured in figure 3.2, in which reference

and test stars are treated symmetrically. Instead of asking whether the reference stars

predict the test stars, we could select between two models: in the background model,

the reference and test lists are derived from independent underlying lists (i.e., they are

images of different parts of the sky), while in the foreground model they are derived from

a single shared list (i.e., they are images of the same part of the sky).

3.18 In this symmetric setting, the foreground and background model probabilities become

p(D | Fs) = ∫ ∏_{i=1}^{Nt} p(ti | U, Fs) ∏_{j=1}^{Nr} p(rj | U, Fs) p(U) dU (3.14)

p(D | Bs) = ∫ ∏_{i=1}^{Nt} p(ti | Ut, Bs) p(Ut) dUt · ∫ ∏_{j=1}^{Nr} p(rj | Ur, Bs) p(Ur) dUr (3.15)

where we must marginalize over the shared parameters U in the foreground model and

the independent parameters Ur and Ut in the background model. The parameters U, Ur

and Ut are the positions of stars in the underlying star lists.

3.19 Although it seems considerably more complex, this setting has similar behavior to the

simpler asymmetric setting presented above. Assuming uniform priors over the positions

of stars, we must have

p(U) = ∏_{i=1}^{Nu} p(ui) = ∏_{i=1}^{Nu} (1/A) = (1/A)^{Nu} (3.16)

and similarly for p(Ur) and p(Ut).

3.20 The background model should have one star in the underlying lists for each star in

the reference and test lists, thus we would expect it to be able to explain each observation

perfectly:

∫ p(ti | Ut, Bs) dUt = 1 (3.17)

∫ p(rj | Ur, Bs) dUr = 1 (3.18)

though if we are not careful, it is possible for one star in the underlying list to explain

more than one observed star; we will discuss this in section 3.3.1.4.

3.21 The foreground model can be structured in different ways: one option is to include

the positions of all the reference stars in U, then allow each test star to be described as

either a new star seen only in the test list (requiring another star position to be added

to the parameter set U, with a prior probability of 1/A), or as one of the reference stars

offset by some positional error:

∫ p(ti | U, Fs) p(Ui) dUi = d/A + (1 − d)/Nr ∑_{j=1}^{Nr} N(ti | rj, σ²i,j) (3.19)

where Ui are the parameters that are added to U to explain test star ti. Note the strong similarity to the simple asymmetric foreground model probability (equation 3.12).

3.22 We thus find that this symmetric setting leads to results that parallel the simpler

asymmetric setting presented above. In the remainder of this chapter we will use the

simple asymmetric setting.

3.3.1.2 Uniform background distribution

3.23 Non-uniform distributions of stars occur fairly often. At large angular scales, the

galactic equator has much higher star density than the galactic poles. At smaller scales,

astronomical sources of non-uniformity include star clusters, binary systems, galaxies

and galaxy clusters, while dust features and nebulosity can cause non-uniformities in

star detection.

3.24 Non-uniformity can result in overestimation of the Bayes factor (in favor of the fore-

ground model), possibly resulting in a false positive. To see this, consider a reference

star list that contains the stars in one globular cluster, and a test star list that contains

stars in a different globular cluster, where the cluster is centered in the image in both

lists. Both lists will contain a concentration of stars near the center. If the number of

reference stars or the positional variance is large, the individual peaks in the foreground

probability model blend together, with the result that the foreground probability terrain

has a flat background (the distractor component) plus a smooth peak near the center of

[Figure 3.3 appears here: panels "Reference stars", "Test stars (true alignment)", "Test stars (false alignment)", and "Bayes factor: Uniform background model"; star x and y positions are in pixels, and the Bayes factor is plotted against the number of test stars evaluated.]

Figure 3.3: Globular cluster example. Top-left: Reference stars drawn from a simulated globular cluster: a mixture of 20% uniform plus 80% Gaussian clustered stars. Top-right: Test stars sampled from the reference stars (i.e., a true alignment). Bottom-right: A different draw from the globular cluster distribution (i.e., not a true alignment). Bottom-left: The Bayes factor K1, using the uniform background model, as we evaluate test stars. The true alignment reaches a huge Bayes factor, but the false alignment reaches a very large Bayes factor—above 10^15—resulting in a false positive outcome.

[Figure 3.4 appears here: panels "Adaptive background model probability" (contours in units of 1/A) and "Bayes factor: Adaptive background model" (Bayes factor versus number of test stars evaluated, for the true and false alignments).]

Figure 3.4: Globular cluster example, continued. Left: The adaptive background probability model. The contour line shows where the adaptive model is equal to the uniform model: the adaptive model concentrates its probability where the test star density is high, near the center of the image. Right: The Bayes factor, using the adaptive background model, as we evaluate test stars. With the adaptive background model, the true alignment is correctly accepted and the false alignment is correctly rejected.

[Figure 3.5 appears here: panels "Bayes factor terrain: Uniform background" and "Bayes factor terrain: Adaptive background"; star x and y positions are in pixels, with contours labelled from the distractor level d up to 20.]

Figure 3.5: Globular cluster example, continued. Left: The Bayes factor terrain using the uniform background model (i.e., the ratio of the foreground model to the uniform background model probability). Equality is indicated by contour lines. Nearly the entire central clustered region yields an increase in the Bayes factor in favor of the foreground model. Any test image that contains many stars in the central region is likely to be assigned a large Bayes factor, resulting in a false positive outcome. Right: The Bayes factor terrain using the adaptive background model. The adaptive background model concentrates its probability mass near the central dense region, so the foreground model must more precisely predict the positions of stars to achieve a gain in Bayes factor in that region, and even a perfect prediction yields only a modest gain. Using this model, the true alignment is correctly accepted and the false alignment is correctly rejected.

the image. This is quite a good model for the test stars—much better, in fact, than the

uniform background model! Thus each test star will on average prefer the foreground

model, and given enough test stars, the Bayes factor will slowly grow in favor of the

foreground model. This is shown in figure 3.3.

3.25 This problem can be remedied by using a better background model. One approach is

to use kernel density estimation—estimate the distribution of test points using the test

points themselves. To be fair, the background model for test star i should not include

any information about test star i, but it can use all test stars other than i, because they

are assumed to be drawn independently.

3.26 For example, a flexible model for test star i is a sum of Gaussians around the other

test stars, with variance proportional to the area of the image divided by the number of

test stars:

p(t_i | B_a) = \frac{1}{N_t - 1} \sum_{j=1, j \neq i}^{N_t} N(t_i | t_j, S^2)    (3.20)

S = 2 \sqrt{\frac{A}{N_t \pi}}    (3.21)

so that stars are approximately distance S from their neighbors.
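As a concrete illustration, here is a minimal numpy sketch of this leave-one-out estimate (equations 3.20 and 3.21); the function name and array layout are ours, not part of the Astrometry.net code:

    import numpy as np

    def adaptive_background(t, area):
        """Leave-one-out Gaussian KDE for test stars t, shape (Nt, 2):
        p(t_i | B_a) is the mean of 2-D Gaussians centred on the other
        test stars, with bandwidth S from equation 3.21."""
        Nt = len(t)
        S = 2.0 * np.sqrt(area / (Nt * np.pi))            # equation 3.21
        # squared pairwise distances between all test stars
        d2 = np.sum((t[:, None, :] - t[None, :, :]) ** 2, axis=-1)
        # 2-D Gaussian density evaluated at each pair
        dens = np.exp(-0.5 * d2 / S**2) / (2.0 * np.pi * S**2)
        np.fill_diagonal(dens, 0.0)                       # exclude star i itself
        return dens.sum(axis=1) / (Nt - 1)                # equation 3.20

For stars in the dense core of a cluster this density exceeds the uniform level 1/A, so matching them no longer earns the foreground model an automatic gain.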

3.27 The result of using this adaptive background model in the globular cluster example

is shown in figures 3.4 and 3.5. The adaptive background model can capture the non-

uniformity of the test stars, so the foreground model must make significantly better

predictions in areas of high stellar density in order to be preferred. Using the adaptive

model, the false positive outcome demonstrated in figure 3.3 is eliminated.

3.3.1.3 Number of reference and test stars

3.28 The simple foreground model treats the reference stars as fixed and test stars as

independent random draws from the model. The behavior of the model changes with

the number of reference and test stars. Figures 3.6 and 3.7 show the different effects.

A reference list with few stars will have a strongly peaked probability terrain (because

the total probability mass is split between fewer stars), thus each correctly matched test

star results in a significant gain in Bayes factor. A reference list with many stars makes

“softer” predictions and yields a more slowly-growing Bayes factor given a true alignment.

If the test star list contains stars that cannot appear in the reference star list (for example,

if the test list is a deeper exposure), then as more and more test stars are evaluated,

they will most likely be assigned probability close to the distractor floor, and the Bayes

factor will drop steadily. This suggests that we should select a reasonable number of

reference stars (so that the foreground probability terrain is reasonably informative),

and a reasonable number of test stars (so that we are not asking the foreground model

to make predictions that it cannot be expected to get right). We could instead

marginalize over the number of stars in each list, but for our application this is too

computationally expensive.

3.29 By introducing some approximations we can compute an estimate of the expected

Bayes factor of the foreground model over the background model as we change the number

of reference stars included in the model. That is, we can ask how strongly we will prefer

the foreground model, in expectation, if we are given test stars that are truly drawn from

the foreground model. In this section we use the uniform background model for ease of

computation. The expected (log-) gain ⟨g⟩ is

\langle g \rangle = E_{p(t | F)} [ \log p(t | F) - \log p(t | B) ]    (3.22)

= \int_A p(t | F) [ \log p(t | F) - \log p(t | B) ] \, dt    (3.23)

where A indicates that the integral is over the whole image area. This is equal to the

Kullback–Leibler divergence [41]:

\langle g \rangle = D_{KL}\!\left( p(t | F) \,\|\, p(t | B) \right)    (3.24)

and, if we assume the uniform background model, is equal to a constant minus the entropy

[Figure 3.6 appears here: panels "Bayes factor terrain: 20 reference stars" and "Bayes factor terrain: 200 reference stars"; star x and y positions are in pixels, with contours from d up to the marked maxima.]

Figure 3.6: Informativeness of the foreground model as the number of reference stars changes. Left: Bayes factor terrain (ratio of foreground to background model probabilities) with 20 reference stars. Right: Bayes factor terrain with 200 reference stars. The same scale is used in both figures, and the maximum values attained in each terrain are marked. The model with 200 reference stars has a much smaller peak value: it makes "softer" predictions because it is forced to spread its probability mass among more peaks.

[Figure 3.7 appears here: "Informativeness of model predictions", plotting the Bayes factor against the number of test stars evaluated, for models with 20 and 200 reference stars.]

Figure 3.7: Informativeness of the foreground model as the number of reference stars changes (continued). Plotted is the Bayes factor as test stars drawn from the model are evaluated, for the two foreground models. The model with 20 reference stars yields a much faster-increasing Bayes factor (since the model's predictions are stronger), but of course it is unable to explain test stars past 20 and the Bayes factor drops. The model with 200 reference stars makes softer predictions and thus yields a more slowly-growing Bayes factor.

[Figure 3.8 appears here: panels "Image size (pixels²)" and "Distractor fraction", plotting the expected Bayes factor gain per test star against the number of reference stars in the model.]

Figure 3.8: Informativeness of the foreground model. Left: Expected increase in the Bayes factor in favor of the foreground model, per test star (⟨g⟩ in the text, equation 3.30). The middle curve shows the fiducial parameters used in the previous examples: distractor fraction d = 0.25, image size A = 400×400 pixels, positional error σ = 4 pixels. The other curves show the effect of changing the image size by a factor of two in each dimension. Changing the positional uncertainty by a factor of two in the opposite direction has exactly the same effect, since this effectively just changes the units in which we work. The fact that these curves drop below unity (rather than asymptoting to unity) may indicate that the approximations used to derive them are violated in that regime. Right: Same, but changing the distractor fraction by factors of two.

of the foreground model distribution H_{p(t | F)}:

\langle g \rangle = \int_A p(t | F) \log p(t | F) \, dt - \int_A p(t | F) \log p(t | B_1) \, dt    (3.25)

= -H_{p(t | F)} + \log A    (3.26)

which simply formalizes our notion of the “sharpness” or strength of the predictions made

by the foreground model.

3.30 In the regime where the number of reference stars Nr is small and the reference stars

[Figure 3.9 appears here: panels "Image size (pixels²)" and "Distractor fraction", plotting the total expected Bayes factor against the number of reference stars in the model.]

Figure 3.9: Informativeness of the foreground model. Plotted is the expected total Bayes factor after evaluating as many test stars as there are reference stars in the model (i.e., Nr⟨g⟩). Left: Increasing the size of the image makes the foreground model relatively more informative, because the background model is forced to spread its probability mass over a larger area. Right: Increasing the distractor fraction causes the foreground model to become less informative, because the model spends more of its probability mass "hedging its bets" in the distractor component of the mixture.

are far apart relative to the positional variance σ, the terms in the foreground model

approximately decouple:

\langle g \rangle = E_{p(t | F_1)} [ \log p(t | F_1) - \log p(t | B_1) ]    (3.27)

= \int_A \left[ \frac{d}{A} + \frac{1-d}{N_r} \sum_{j=1}^{N_r} N(t | r_j, \sigma^2) \right] \log \left[ \frac{d}{A} + \frac{1-d}{N_r} \sum_{j=1}^{N_r} N(t | r_j, \sigma^2) \right] dt + \log A    (3.28)

\approx \log A + \int_A \frac{d}{A} \log \frac{d}{A} \, dt + \int_A \left[ \frac{1-d}{N_r} \sum_{j=1}^{N_r} N(t | r_j, \sigma^2) \right] \log \left[ \frac{1-d}{N_r} \sum_{j=1}^{N_r} N(t | r_j, \sigma^2) \right] dt    (3.29)

\langle g \rangle \approx d \log d + (1-d) \left[ \log\!\left( \frac{A(1-d)}{2\pi\sigma^2 N_r} \right) - 1 \right] .    (3.30)

3.31 Since we expect that a model with Nr reference stars should be able to predict about

Nr test stars, we can choose Nr to maximize the total expected Bayes factor after adding

Nt = Nr test stars:

N_r^* = \arg\max_{N_r} N_r \langle g \rangle    (3.31)

N_r^* \approx \frac{A(1-d)}{2\pi\sigma^2} \exp\!\left( \frac{d \log d}{1-d} - 2 \right)    (3.32)

which is plotted for various parameter settings in Figures 3.8 and 3.9.
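These closed-form approximations are cheap to evaluate; the following numpy sketch (our own, with the fiducial parameters of the figures as defaults) computes the per-star log-gain of equation 3.30 and the optimum of equation 3.32:

    import numpy as np

    def expected_gain(Nr, A=400*400, d=0.25, sigma=4.0):
        # equation 3.30: expected log-gain <g> per test star;
        # exp(<g>) is the per-star Bayes factor gain plotted in figure 3.8
        return d*np.log(d) + (1-d)*(np.log(A*(1-d)/(2*np.pi*sigma**2*Nr)) - 1)

    def optimal_Nr(A=400*400, d=0.25, sigma=4.0):
        # equation 3.32: the Nr that maximizes the total gain Nr * <g>
        return A*(1-d)/(2*np.pi*sigma**2) * np.exp(d*np.log(d)/(1-d) - 2)

With the fiducial parameters this gives an optimum of roughly one hundred reference stars; scaling A up to the pixel area of a survey image is what drives the much larger numbers quoted below.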

3.32 This analysis suggests that we should use a rather large number of reference stars,

yielding a modest gain in Bayes factor for each test star evaluated but a huge total Bayes

factor after evaluating many stars. However, in astronomical images, we typically do not

have the suggested number of reference and test stars. For example, for Sloan Digital

Sky Survey images it suggests we should use over 30,000 stars, but typically fewer than

1000 are available. How should we choose the number of stars to use when relatively few

are available?

3.33 If the reference and test lists had exactly the same brightness ordering, and the simple

constant distractor rate model were correct, we would simply choose to keep the number Chapter 3. Verifying an astrometric alignment 112

of stars in the smaller of the two lists. However, neither of these conditions holds in

practice: even if the reference and test list are derived from images taken in identical

conditions—with the same telescope, camera, filter, and exposure time—we still expect

that random measurement noise may lead to different brightness orderings, and different

artifacts and errors in detecting stars may lead to different stars being added and removed.

When the two lists are derived from imaging taken in different wavelength bandpasses,

the true brightness orderings will likely be different, since astronomical objects have a

wide range of spectral energy distributions.

3.34 In practice, we might wish to keep the number of stars in the smaller of the two lists

and add a margin to account for the differences in brightness ordering and distractor

stars. Alternatively, we could marginalize over the numbers of stars. Since the ultimate

goal is a decision rather than a precise assessment of the Bayes factor, we have some

flexibility.

3.3.1.4 Independent samples

3.35 In the simple model setting, nothing prevents multiple test stars from matching to

a single reference star, or vice versa. That is, a test star can be found in a region of

the image where the probability peaks of multiple reference stars overlap, or multiple

test stars can be found within the probability peak of a single reference star. This

can be problematic when one or both of the lists contains structured artifacts. We call

this the “lucky donut” effect, since we first saw it when we were handling out-of-focus

images where each star appeared as a ring in the image. Our source extraction procedure

produced a “donut” of detections, and in some cases these donuts of test stars happened

to occur near a reference star. When this happened, the simple foreground model, which

assumes independent draws, could be strongly preferred when the alignment was false,

resulting in a false positive outcome. See figures 3.10 and 3.11.

3.36 In effect, these donuts increase the variance of our Bayes factor estimation. Each

[Figure 3.10 appears here: panels "Reference stars" and "Test stars, including 'lucky donuts'"; star x and y positions are in pixels.]

Figure 3.10: "Lucky donut" example. Left: Reference stars (with donut centers marked with star symbols). Right: Test stars. The reference stars are drawn from a uniform distribution. The test stars include 4 donuts of 8 stars each, each centered on a reference star and with radius 4 pixels; the remainder of the stars are drawn uniformly. Each list contains 50 stars total. The results are shown in the next figure.

[Figure 3.11 appears here: panels "Independent samples (Bayes factor K1)" and "Sequential samples (Bayes factor K2)", plotting the Bayes factor against the number of test stars evaluated; the right panel shows the MAP θ, near-MAP settings of θ, and their sum.]

Figure 3.11: "Lucky donut" example, continued. Left: The Bayes factor in the simple independent setting, K1, using the adaptive background model. Since each test star that is part of a donut is very close to a reference star, the Bayes factor rises quickly to a large value, though no stars other than the lucky donuts are aligned with the reference stars. Right: The Bayes factor in the sequential draw setting, K2. The maximum a posteriori (MAP) setting of the parameter θ is shown, along with a collection of alternate settings of θ (with probabilities near that of the MAP) and their sum. We kept only those paths that were within a factor of about 1000 of the best path. Since only one of the stars in each "donut" is allowed to match to the reference star, the sequential setting correctly produces a tiny Bayes factor and rejects this false alignment. In this example, the sum of the probabilities given different values of the parameter θ is significantly larger than the MAP setting, because for each donut the model can choose any one of the test stars to be assigned to the reference star, and each of these assignments has approximately equal probability. In most other cases, the MAP comprises a much larger fraction of the sum.

donut draws many samples within a small region of the image, thus the Bayes factor

contribution of each donut is effectively a single sample from the Bayes factor terrain

raised to the power of the number of stars in the donut. Although any false alignment

can have a “lucky” test star that happens to be near a reference star, this yields only a

moderate increase in the Bayes factor, and we are unlikely to see many lucky stars in a

single image. A lucky donut that appears near a reference star, on the other hand, can

yield a very large increase in the Bayes factor, since it receives a moderate increase in

Bayes factor taken to a large power. As a result, our simple model setting will produce

false positives at a much higher rate than predicted.

3.37 We can eliminate this problem by requiring that each reference star match to at most

one test star. Doing this requires reframing the probabilistic question we are asking.

Instead of assessing the likelihood of each test star independently, we assess the likelihood

of a single draw of the set of test stars. Equivalently, we can think of this as a sequential

process: we evaluate the likelihood of a test star given the test stars that have already

been seen, and with the constraint that multiple test stars cannot match a single reference

star.

3.38 In this setting, we can parametrize the foreground model with a vector θ_{1:N_t} with one element per test star. Element θ_i contains either the identity of the reference star to which test star i corresponds, or indicates that the test star is a distractor. Given a parameter vector θ, this new foreground model, F2, has probability:

p(t | \theta, F_2) = \prod_{i=1}^{N_t} \begin{cases} 1/A & \text{if } \theta_i \text{ says } t_i \text{ is a distractor} \\ N(t_i | r_{\theta_i}, \sigma_i^2) & \text{otherwise} \end{cases}    (3.33)

3.39 The Bayes factor in this setting, K2, is

K_2 = \frac{\sum_\theta p(t | \theta, F_2) \, p(\theta | F_2)}{p(t | B)}    (3.34)

where we must ensure that the prior is normalized:

\sum_\theta p(\theta | F_2) = 1 .    (3.35)

The constraint that each reference star be matched to at most one test star can be

expressed by assigning zero probability to any setting of the parameter θ that has repeated

elements (θi = θj for i ≠ j), ignoring distractors.

3.40 The simple foreground model, F1, can be placed in this framework by setting the prior to

p(\theta | F_1) = \prod_{i=1}^{N_t} \begin{cases} d & \text{if } \theta_i \text{ says } t_i \text{ is a distractor} \\ 0 & \text{if } \theta_i = \theta_j \text{ for any } j \neq i \\ (1-d)/N_r & \text{otherwise} \end{cases}    (3.36)

but the zeroed-out terms (the constraint that repeated elements are not permitted) cause the sum of this prior to be less than one. Knowing that repeated terms will be eliminated, we can reapportion the prior mass:

p(\theta | F_2) = \prod_{i=1}^{N_t} \begin{cases} d + (1-d)\,\mu_i/N_r & \text{if } \theta_i \text{ says } t_i \text{ is a distractor} \\ 0 & \text{if } \theta_i = \theta_j \text{ for any } j \neq i \\ (1-d)/N_r & \text{otherwise} \end{cases}    (3.37)

where µi is defined to be the number of matches (non-distractors) preceding element i. This prior has the behavior that if many of the test stars are matched to reference stars,

the distractor probability increases: the foreground model is not penalized strongly if the

last few test stars are labelled as distractors.

3.41 Unfortunately, computing the Bayes factor K2 exactly requires evaluating the foreground model over an exponential number of settings of the parameter θ. However, nearly all of these terms are negligible: if test star t_i is far from reference star r_{θ_i}, then the Gaussian probability will be extremely small and any θ with that setting of θ_i will contribute negligibly to the total foreground model probability. Indeed, for many images we find that a handful of terms, or even the single maximum a posteriori (MAP) term, contributes nearly all the probability. See figure 3.11, for example.
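A simple way to approximate K2 by its dominant term is a greedy pass over the test stars, matching each to its nearest unclaimed reference star whenever that beats the distractor term. The following numpy sketch is ours: it uses the simpler F1-style prior of equation 3.36 rather than the reapportioned prior of equation 3.37, and greedy assignment is not guaranteed to find the exact MAP.

    import numpy as np

    def greedy_log_K2(test, ref, sigma2, A, d=0.25):
        """Approximate log K2 by a single (greedy) setting of theta:
        each test star matches its nearest unclaimed reference star,
        or is labelled a distractor, whichever is more probable.
        Compared against the uniform background p(t_i | B) = 1/A."""
        Nr = len(ref)
        claimed = np.zeros(Nr, dtype=bool)
        logK = 0.0
        for t in test:
            d2 = np.sum((ref - t) ** 2, axis=1)
            d2[claimed] = np.inf      # at most one test star per reference star
            j = np.argmin(d2)
            p_match = (1 - d) / Nr * np.exp(-0.5 * d2[j] / sigma2) / (2*np.pi*sigma2)
            p_distractor = d / A
            if p_match > p_distractor:
                claimed[j] = True
                logK += np.log(p_match * A)
            else:
                logK += np.log(p_distractor * A)
        return logK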

3.4 The Astrometry.net case

3.42 The Astrometry.net system generates many hypothesized matches, then runs a verifi-

cation test to reject false positives. The proposed matches are generated using a “geomet-

ric hashing” technique: the relative positions of a small number of test stars are converted

into a hash code, and by searching for similar codes in a large index, we find sets of

reference stars that may match the test stars. We call the small set of stars a “quad”,

because traditionally we used sets of four stars. The Astrometry.net system proposes

that a quad of stars in the test image matches a quad of stars in the reference set, then

looks up other stars in the index that we would expect to find in the image if the pro-

posed match were correct. In the current implementation of the Astrometry.net system,

we use the simple uniform background model, and the sequential foreground model of

section 3.3.1.4. The current section discusses the modifications and extensions of these

models that are required in the Astrometry.net setting.

3.43 The Astrometry.net system design goals include very low false positive rate, sensitivity

when the test image contains few stars, robustness in the face of artifacts in the test image,

and speed. The low false positive rate, combined with the fact that we usually evaluate

many false hypotheses before encountering a true hypothesis, means that we require the

Bayes factor to be very large. For example, if our utility function is:

                True Alignment      False Alignment
    Accept      u(TP) = +1          u(FP) = −2000
    Reject      u(FN) = −1          u(TN) = +1

and our prior odds (we expect to see about 10^6 false alignments before a true alignment) are p(F)/p(B) = 10^{-6}, then we demand that the Bayes factor be ≳ 10^9 in order to accept the proposal.
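The 10^9 figure follows from maximizing expected utility: accepting beats rejecting exactly when the posterior odds of a true alignment exceed (u(TN) − u(FP)) / (u(TP) − u(FN)) = 2001/2, and dividing by the prior odds gives the required Bayes factor. A one-line sketch of this arithmetic (our own, not part of the system):

    def bayes_factor_threshold(prior_odds, u_tp=1.0, u_fp=-2000.0,
                               u_fn=-1.0, u_tn=1.0):
        # accept iff posterior odds > (u_tn - u_fp) / (u_tp - u_fn);
        # posterior odds = Bayes factor * prior odds
        return (u_tn - u_fp) / (u_tp - u_fn) / prior_odds

    # bayes_factor_threshold(1e-6) == 1000.5 / 1e-6, i.e. about 1e9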

3.44 Fortunately, we typically find that true alignments yield very large Bayes factors, so

we are willing to make some approximations in order to make the process faster, as long

as the approximations are “conservative”: we strongly want to avoid overestimating the

Bayes factor, but we are willing to underestimate it slightly.

3.45 In the Astrometry.net case, we are testing a proposed alignment given that a quad of

reference and test stars have similar relative arrangements. Therefore, we should not be

surprised that the reference and test stars comprising the quad are much closer to each

other than expected by chance. We are not comparing the foreground model against a

purely random background model, but rather the background model given that the quad

matches. We take this into account by simply removing the reference and test stars that

comprise the quad. This is an approximation—it causes the foreground model to ignore

the degree to which the quad stars match—but correctly marginalizing the background

model given that the quad matched is not trivial.

3.46 The fact that we generate hypotheses by aligning a reference quad with a test quad has

another effect: positional errors in the stars comprising the quad cause the whole proposed

alignment to be translated, rotated, scaled or sheared. This error in the alignment matrix

affects all reference stars (if we think of applying the alignment matrix to project reference

stars into the test image coordinate system). The rotation, scaling, and shear terms

result in larger positional errors for stars far from the center of the matched quad. See

figure 3.12 for an example.

3.47 There are two obvious ways of handling this effect. We could add parameters to

the foreground model to correct for errors in the transformation matrix, and attempt to

marginalize over these extra parameters. This approach has the advantage that proposed

alignments suffering from transformation errors will be assigned only a slightly reduced

Bayes factor compared to an alignment with no transformation error. In addition, the

corrections to the proposed alignment that we infer can be propagated to later stages of

processing. We would like to implement this strategy in future versions of Astrometry.net.

The much simpler approach we use currently is to increase the effective positional variance

of reference stars according to their distance from the quad center. We set the variance

[Figure 3.12 appears here: panels showing reference stars with their inflated error disks ("Ref. σ") and test stars, and the resulting Bayes factor terrain with the "Radius of Relevance" marked; star x and y positions are in pixels.]

Figure 3.12: Left: The effect of errors in the matched quad on the rest of the stars. A small rotation of the test stars in a quad that matches a reference quad leads to errors in the predicted positions of reference stars in the image. The size of the errors is large for stars that are far from the center of the quad. We compensate for this effect by increasing the positional error estimate according to the distance from the quad. The disks show the size of the positional error σ that we assign to the reference stars (equation 3.38). Right: The Bayes factor terrain that results from increasing the positional variance for stars far from the quad. The outer circle shows where the peak of the Gaussian around each reference star is equal to the uniform background level. Outside this radius, the Bayes factor can never increase. The inner circle shows the "radius of relevance" (equation 3.40), where the expected Bayes-factor gain for stars drawn from the foreground model is zero. Outside this radius, test stars drawn from the foreground model will tend to result in a decreasing Bayes factor. We therefore select only stars within the radius of relevance.

for reference star r_j to σ_j² based on its distance from the quad center q and the effective radius of the quad, Q:

\sigma_j^2 = \sigma^2 \left( 1 + \frac{(r_j - q)^2}{Q^2} \right) .    (3.38)

This approach has the advantage of simplicity and speed, but results in underestimates

of the Bayes factor for true alignments with transformation matrix errors, since in ef-

fect we are treating the larger-than-expected observed positional errors as independent

improbable events.

3.48 The effective positional variance σ2 grows for reference stars that are far from the

quad center. As a result, the “sharpness” or “informativeness” of the foreground model—

which we can express as the expected gain in Bayes factor given stars that are drawn

from the true foreground model—also decreases for reference stars far from the quad

center. Indeed, at some radius, the expected gain is zero! This is the radius at which

the Gaussian peak around a reference star is broad enough that its expected value is

equal to the background model. We call this distance the “radius of relevance”, because

reference and test stars outside this radius cannot, in this approximation, contribute to

discriminating between the foreground and background models. The radius of relevance

R can be found by setting the expected Bayes-factor gain (equation 3.30) to zero:

\langle g \rangle \approx d \log d + (1-d) \left[ \log\!\left( \frac{A(1-d)}{2\pi\sigma^2(R) N_r} \right) - 1 \right] = 0    (3.39)

R = Q \sqrt{ \frac{A(1-d)}{2\pi\sigma^2 N_r} \exp\!\left( \frac{d \log d}{1-d} - 1 \right) - 1 } .    (3.40)

Before evaluating test stars, we drop any reference and test stars that lie outside the

radius of relevance. This reduces both the effective area of the image and the number of

reference stars Nr. In effect, we find a subimage in which the quad is informative enough to distinguish between the foreground and background models and consider only that

subimage. See figure 3.12.
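Both the inflated variances of equation 3.38 and the radius of equation 3.40 are direct to compute; a short numpy sketch (function names ours):

    import numpy as np

    def scaled_sigma2(ref, q, Q, sigma2):
        # equation 3.38: inflate each reference star's positional variance
        # with its squared distance from the quad center q
        return sigma2 * (1.0 + np.sum((ref - q) ** 2, axis=1) / Q ** 2)

    def radius_of_relevance(A, Nr, Q, sigma2, d=0.25):
        # equation 3.40: radius at which the expected Bayes-factor gain is zero
        arg = A*(1-d)/(2*np.pi*sigma2*Nr) * np.exp(d*np.log(d)/(1-d) - 1) - 1
        return Q * np.sqrt(arg)   # assumes arg > 0 (the quad is informative at all)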

3.49 The reference stars in the Astrometry.net system are designed to have several proper-

ties that we must take into account in the verification test. First, they have been selected

to be spatially uniform and bright in a particular optical bandpass. The spatial unifor-

mity is achieved by placing a HEALPix [26] grid over the sky and selecting the brightest

stars in each grid cell. In addition, we attempt to eliminate spurious stars (false stars

introduced by errors in the imaging or processing steps) by removing any star that is

closer than some minimum distance to a brighter star. We call this “deduplication.” In

effect, deduplication and uniformization force the mean inter-star separation to lie within

a reasonable range. In order for the reference stars to be a good model of the test stars,

we must either adjust the foreground model to take into account the processing that has

been applied to the reference list, or apply the same processing to the test star list. We

take the latter approach: before applying the verification procedure, we uniformize the

test stars at the same scale as the reference stars. We place a grid over the test stars

and reorder them by selecting first the brightest star in each grid cell, then the second-

brightest, and so on. We also perform deduplication of the test stars at the same scale

as the reference stars. As a result of the uniformization and deduplication of reference

stars, the simple uniform background model becomes reasonable again, since stars are

not allowed to be close enough for the problem demonstrated in the globular cluster

example (figure 3.3) to arise.
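A minimal sketch of the test-star uniformization and deduplication just described (a rectangular grid, round-robin selection by brightness, and an O(N²) deduplication loop; all names are ours, and the production code is more careful):

    import numpy as np

    def uniformize(xy, bright_order, nx, ny, W, H):
        """Reorder stars by taking the brightest star in each cell of an
        nx-by-ny grid over the W-by-H image, then the second-brightest,
        and so on.  bright_order lists star indices, brightest first."""
        cx = np.minimum((xy[:, 0] * nx / W).astype(int), nx - 1)
        cy = np.minimum((xy[:, 1] * ny / H).astype(int), ny - 1)
        cells = {}
        for i in bright_order:                # brightest first within each cell
            cells.setdefault((cx[i], cy[i]), []).append(i)
        out = []
        for rank in range(max(len(v) for v in cells.values())):
            for c in sorted(cells):
                if rank < len(cells[c]):
                    out.append(cells[c][rank])
        return np.array(out)

    def deduplicate(xy, bright_order, min_dist):
        """Drop any star closer than min_dist to a brighter star."""
        kept = []
        for i in bright_order:
            if all(np.sum((xy[j] - xy[i]) ** 2) >= min_dist ** 2 for j in kept):
                kept.append(i)
        return np.array(kept)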

3.50 In order to make the uniformization of test stars faster, we use a rectangular grid

rather than the HEALPix grid used for the reference stars. The grid size is chosen so

that the cells are equal-area and have approximately the same area as the reference grid.

Next, during radius of relevance filtering, we compute the distance to the grid cell centers

rather than to each star individually. This also allows us to compute the effective area

by simply counting the number of grid cells within the radius of relevance.

3.51 Another effect of the deduplication of reference and test stars is that it is rare to find

multiple test stars near a reference star, or vice versa. As a result, it is rare for more than one setting of the foreground model parameter vector θ to have significant probability. We have found that it is reasonable to approximate the sum over all θ values

by the single maximum a posteriori element.

3.5 Discussion

3.52 We have developed, and explored the subtleties of, a simple model for assessing the

probability that a proposed match between two sets of stars is true, using the framework

of Bayesian decision theory. This model is key to the success of the Astrometry.net

system, which is built around a heuristic that generates hypotheses that need to be

checked in a robust and efficient manner.

3.53 We have kept the model simple for several reasons. Since the Astrometry.net system

runs the verification check a large number of times, we want it to be computationally

inexpensive. But perhaps more importantly, the simplicity of the model and the few

assumptions it makes about the data make it quite robust to the very wide variety of

images we have encountered during several years of operating the Astrometry.net web

service.

3.54 One problem in the Astrometry.net context that we feel would be remedied by a

more flexible model is that small positional errors in the matched quad are multiplied for

stars that are far from the matched quad. While running the verification procedure we

should be able to detect when the errors in the test star positions are predominantly in

directions that could be explained by small changes to the stars that form the matched

quad. We could then suggest a correction to the hypothesis. This could allow fewer

correct hypotheses to be rejected at relatively little computational cost.

3.55 A related idea is to update the hypothesis based on all the stars that are labelled as

matches by the verification procedure. This leads naturally to an iterative routine similar

to the Expectation-Maximization algorithm [18] for mixture models: we would alternate

steps of optimizing the foreground model parameters (the alignment hypothesis) based

on the stars “assigned” to the foreground model, and re-assigning stars to the foreground

and background models based on the new model parameters. The resulting hypothesis

would be a (locally) maximum-likelihood solution. This EM-like solution is just one

of several optimization algorithms that could be applied to improve the hypothesized

match.

3.56 We have argued that uniformization and deduplication make the uniform background

model sufficient, but using the adaptive background model could still be beneficial. Since

the adaptive model is more flexible, using it will make us more conservative: hypotheses

that are only marginally accepted when using the uniform background model may be

rejected when using the more powerful adaptive model. One way of alleviating the

higher computational cost of the adaptive background model would be to use the uniform

background model to pre-filter hypotheses, then use the adaptive background model on

hypotheses that reach a sufficiently high Bayes factor.

3.57 We have assumed that the fraction of “distractor” stars is known. In the experiments

here we simply fix it at d = 0.25. The model is not particularly sensitive to this parameter,

but in some cases it could be beneficial to optimize or marginalize over its value. For

example, about 20% of the photographic plates in the Harvard College Observatory

archives contain multiple exposures [27]. Since each hypothesized match pertains to only

one exposure, the stars from all other exposures will appear as distractors. For such

images, it would be beneficial to allow the foreground model to adjust the distractor fraction.

Chapter 4

Efficient implementation of the KD-tree data structure¹

¹This chapter was originally prepared as a manuscript by me, Keir Mierle, and Aeron Buchanan.


4.1 Introduction

4.2 The kd-tree is a simple yet versatile data structure for speeding up searches in k-

dimensional data sets. It applies to nearest-neighbour queries, which arise in many

computer vision applications such as vector quantization and template matching. It

also applies to a family of related problems such as range search (finding all points

within a given radius of a query), K-nearest neighbours, and approximate kernel density

estimation. Over its long history the kd-tree has accumulated many extensions and

variations making the literature difficult to comprehend. This chapter presents a modern

view of the kd-tree. It also points out a number of optimizations that can increase the

performance by an order of magnitude over existing publicly available implementations

while also reducing the memory requirements—or increasing the size of the largest kd-tree

that be constructed in a given amount of memory.

4.2 The standard kd-tree

4.3 Kd-trees have accelerated spatial queries for over thirty years. Kd-trees speed up

N-body simulations, search, object recognition, ray tracing, template matching, kernel

density estimation, and more.

4.4 Imagine you have a classification problem. You have a million labelled 2-dimensional

data points (“feature vectors”) in the unit square. You also have a set of a million

unlabelled data points, and a deadline. You decide to run nearest-neighbour classification.

After a moment of thought you realize that if you compare every labelled feature to every

unlabelled feature you will have to do 10^12 comparisons. Your deadline looms. Upon

further reflection, you decide to split the labelled points in half: the ones with x ≤ 1/2 and the ones with x > 1/2. When you want to find the nearest neighbour of an unlabelled point that has x ≤ 1/2, you look in the first set of labelled points. Most of the time, you find that the distance between the unlabelled point and its nearest labelled neighbour is less than the distance to the x = 1/2 boundary. You don’t even need to look at the second set of labelled points!

4.5 While waiting for your algorithm to finish, you remember recursion. You decide to

split the labelled points in half again. And again. And again. And again. You just invented

kd-trees. If it were 1975 you would be the first, and win a prize [9].

4.6 You used a kd-tree to speed up nearest neighbour search (find the closest data point to

the query point). They are also useful for range search (find all data points within some

radius of a query point) and more complex queries such as approximate kernel density

estimation.

4.7 A kd-tree is a type of Binary Space Partition tree. Each node in the tree represents a

portion of space. In kd-trees the portion of space is an axis-aligned bounding box.2 Each

node has two children which split the node’s space in two. This split is usually chosen

so that half the data points go into each child. A node’s axis-aligned bounding box is

either explicitly stored, or implicitly defined by its ancestors’ splitting planes.

4.8 Kd-trees work well in low-dimensional spaces (opinions vary, but say up to 15). In

high dimensions, other data structures may work better [13].

4.9 Below, we first describe the core structure of kd-trees, but note that the foundational

implementation is not the most efficient. We then move to the comprehensive series

of implementation “tricks” that must be considered for optimal performance. Finally,

we compare our implementation, libkd (incorporating the efficiencies described), with

two previously released kd-tree implementations: ANN [4] and Simple kd-trees [54]

(abbreviated to simkd) to show the considerable level of speed-ups that can be attained.

² In the example above, the first node you created had the bounding box x ∈ [0, 1/2] and y ∈ [0, 1].

Figure 4.1: Two varieties of kd-tree. Left: A kd-tree with bounding boxes. Parent-child relationships are shown by gray lines. Right: A kd-tree with splitting planes. The splitting plane at each node is shown by a gray line. The top box is the root node.

Parent-child relationships are not shown in order to avoid cluttering the figure.

4.3 Kd-tree implementation

4.3.1 Data structures

4.10 Figure 4.2 shows pseudocode for kd-tree data structures. There are two types of

kd-tree nodes listed, SplittingNode and BoundingBoxNode. In this chapter, a kd-tree contains nodes of only one type, but of course it is possible to store both the splitting

plane and bounding box of each node if required by the application. BoundingBoxNodes contain two points which define the maximum and minimum corners of the bounding

box. SplittingNodes do not contain an explicit representation of their spatial extent.

Instead, each node contains a split dimension and position. A SplittingNode’s bounding box is implicitly described by the splits of its ancestors.

4.11 The two representations have different performance characteristics on different data

distributions. For example, the benchmark in section 4.5 favours splitting planes. Struc-

tured data sets where the dimensions are correlated and the data point distribution is

clumpy work well with bounding box nodes.

4.12 Please note that these data structures are only for explaining the kd-tree; for practical

implementation details see section 4.4.

4.3.2 Construction

4.13 Kd-tree construction starts with a set of points. Construction begins by creating a

root node which contains all of the data points. The root node is split into two halves

by choosing a splitting dimension (usually the dimension with the largest spread) and a

splitting position (usually the median value of the points along that dimension). Then,

the set of points whose position along the splitting dimension is less than the splitting

position are put in the left child. The remaining points are put in the right child.

Construction proceeds recursively until the nodes at the lowest level of the tree contain

some small number of points (perhaps 10). See figure 4.3 for pseudocode.

    class KDTree:
        Node root

    class Node:
        # empty superclass for leaf and internal nodes

    class InternalNode extends Node:
        # superclass of bounding box and splitting plane nodes
        Node left_child
        Node right_child

    class BoundingBoxNode extends InternalNode:
        # lower and upper corners of the bounding box
        point lower
        point upper

    class SplittingNode extends InternalNode:
        # splitting plane dimension and value
        int split_dimension
        real split_position

    class LeafNode extends Node:
        point[] points_owned

Figure 4.2: Data structures for kd-tree nodes. Each LeafNode contains a set of D-

dimensional points. All nodes in a kd-tree are either of type BoundingBoxNode or

SplittingNode. In practice, more efficient structures should be used; see section 4.4.

4.3.3 Distance bounds

4.14 Fundamentally, kd-tree algorithms are about pruning nodes that don’t need to be

examined. The primary tool to achieve this is the mindist function, which provides a lower bound on the distance between a point and any point that lives in the region owned

by a node. Intuitively, this is the distance from a point to the closest corner or closest

edge of the node.

4.15 The form of the mindist function depends on the representation of a kd-tree node.

For kd-trees that have bounding boxes, one computes the mindist to an individual node. For kd-trees that only store the splitting plane, it makes more sense to compute

the mindist to each of the node’s two children. An illustration is given in figure 4.4 and pseudocode in figure 4.5.

    def build_kdtree(point[] p, int nleaf, bool b_boxes):
        t = new KDTree()
        t.root = build_node(p, nleaf, b_boxes)
        return t

    # Build a kd-tree node from a set of points.
    def build_node(point[] p, int nleaf, bool boxes):
        # if the set of points is small enough...
        if p.size() <= nleaf:
            # create a leaf node.
            return new LeafNode(p)
        # else, choose a splitting dimension...
        split_dim = choose_split_dimension(p)
        # and split the points along that dimension.
        (p_left, p_right, split_pos) = partition(p, split_dim)
        if boxes:
            n = new BoundingBoxNode()
            n.lower = min(p)
            n.upper = max(p)
        else:
            n = new SplittingNode()
            n.split_dimension = split_dim
            n.split_position = split_pos
        # recurse on the two point sets to create the child nodes.
        n.left_child = build_node(p_left, nleaf, boxes)
        n.right_child = build_node(p_right, nleaf, boxes)
        return n

    # Typical: split the dimension with largest range.
    def choose_split_dimension(points):
        return argmax(max(points) - min(points))

    # Typical: split at the median value.
    def partition(points, dim):
        # find the median value in the splitting dimension
        split_val = median(points, dim)
        # grab points on each side of the partition.
        p_left = [p in points where p[dim] <= split_val]
        p_right = [p in points where p[dim] > split_val]
        return (p_left, p_right, split_val)

Figure 4.3: Pseudocode for kd-tree construction.

Figure 4.4: Left: mindist for a kd-tree node with bounding boxes. The query point is shown as a gray dot, and the mindists to the two bounding boxes are shown by the arrows. Right: mindist for a kd-tree node with only the splitting dimension and value. The mindist is zero for the child on the same side of the splitting plane as the query point; for the other child the mindist is simply the distance in the splitting dimension from the query to the splitting plane.

    def mindist_to_box(point p, BoundingBoxNode n):
        mindist = 0
        for each dimension d:
            if p[d] > n.upper[d]:
                # p is above the bounding box in this dimension
                diff = n.upper[d] - p[d]
            else if p[d] < n.lower[d]:
                # p is below the bounding box in this dimension
                diff = p[d] - n.lower[d]
            else:
                # p is within the bounding box in this dimension
                diff = 0
            mindist += diff * diff
        return sqrt(mindist)

    def mindist_to_left_child(point p, SplittingNode n):
        d = n.split_dimension
        if p[d] > n.split_position:
            # point is on the right side of node's splitting plane
            return p[d] - n.split_position
        return 0

    def mindist_to_right_child(point p, SplittingNode n):
        d = n.split_dimension
        if p[d] < n.split_position:
            # point is on the left side of node's splitting plane.
            return n.split_position - p[d]
        return 0

Figure 4.5: Pseudocode for mindist. Top: for kd-trees with bounding boxes. Bottom: for kd-trees with splitting planes.

4.3.4 Nearest neighbour

4.16 Given a set of data points and a query point, solving the nearest neighbour problem

consists of finding the data point that is closest to the query. Specifically, given query

point q, find

x_{NN} = \arg\min_i \| x_i - q \|_2 .    (4.1)

The algorithm is of the branch and bound variety [43]: the tree is traversed, maintaining

a pruning distance—the distance to the nearest data point found so far. In addition,

the set of nodes that might contain the solution are stored in a stack or priority queue

ordered by mindist. If a node is encountered whose mindist to the query point is larger than the pruning distance, it is discarded because it cannot contain the solution. When

a leaf node is encountered, its data points are examined individually. If any data point is

closer than the pruning distance, that point becomes the nearest neighbour found so far,

and the pruning distance is set to the point’s distance to the query point. The algorithm

begins by setting the pruning distance to +∞ and placing the root node on the stack.

See the pseudocode in figure 4.6.

4.3.5 Rangesearch

4.17 In the rangesearch problem, the goal is to find all points that are within a certain

radius of the query point. Specifically, given query point q, radius r, and a set of points

X, find the set of points

\{ \, x : \| x - q \|_2 \leq r, \; x \in X \, \} .    (4.2)

Range search is simpler than nearest neighbour search because the pruning distance is

fixed. This implies that the list of nodes that must be inspected does not depend on the

order of tree traversal. At each node the mindist to the query point is computed. If it is larger than the search radius, it is pruned because it cannot possibly contain any data

points within the radius. See the pseudocode in figure 4.7.

    # Input: root of a kd-tree, and a query point q
    # Output: (x, distance(x, q)) such that x is the
    #         nearest neighbour of q in the kd-tree
    def nearest_neighbour(KDTree t, point q):
        nodestack.push(t.root)
        closest_so_far = +inf
        nearest_neighbour = null
        while nodestack is not empty:
            n = nodestack.pop()
            if mindist(n, q) >= closest_so_far:
                # this node can be pruned.
                continue
            if n.is_leaf():
                # leaf node: examine individual points
                for p in n.points_owned:
                    d = distance(p, q)
                    if d < closest_so_far:
                        # nearest neighbour found so far
                        closest_so_far = d
                        nearest_neighbour = p
            else:
                # find the mindists to the child nodes
                mindist_L = mindist_to_left_child(n, q)
                mindist_R = mindist_to_right_child(n, q)
                # visit the closest child first
                # (by stacking the farther child first)
                if mindist_L > mindist_R:
                    nodestack.push(n.left_child)
                    nodestack.push(n.right_child)
                else:
                    nodestack.push(n.right_child)
                    nodestack.push(n.left_child)
        return (nearest_neighbour, closest_so_far)

Figure 4.6: Pseudocode for nearest neighbour search. The algorithm maintains a stack of nodes to explore and a pruning distance (closest_so_far), which is the distance to the nearest data point found so far. Any node whose mindist is greater than the pruning distance is pruned.

    # Input: A kd-tree, a query point q, and a range
    # Output: The set [(x, distance(x, q)), ...]
    #         such that distance(x, q) < range
    #         for all x in the kd-tree
    def range_search(KDTree t, point q, real range):
        results = []  # empty list
        range_search_helper(t.root, q, range, results)
        return results

    def range_search_helper(Node n, point q, real range, list res):
        if is_leaf(n):
            for each p in n.points_owned:
                d = distance(p, q)
                if d < range:
                    res.append( (p, d) )
            return
        if mindist_to_left_child(n, q) < range:
            range_search_helper(n.left_child, q, range, res)
        if mindist_to_right_child(n, q) < range:
            range_search_helper(n.right_child, q, range, res)

Figure 4.7: Pseudocode for range search.

4.3.6 Approximations

4.18 Kd-trees are applicable to approximate as well as exact algorithms. A prominent

example is the Best Bin First (BBF) algorithm used by many SIFT-based vision systems

[8]. BBF is essentially kd-tree nearest-neighbour search where the tree traversal is ordered

by a priority queue. By terminating the search after exploring some fixed number of leaf

nodes, an approximate nearest neighbour is produced quickly.

4.19 There is an elegant solution to approximate kernel density estimation which looks

similar to the nearest-neighbour algorithm. It refines upper and lower bounds on the

kernel density with a series of mindist and maxdist calculations. The bounds start loose and tighten as the tree is traversed. The algorithm terminates when the interval is

small enough.

Figure 4.8: Nearest-neighbour search. Top: With bounding boxes. Bottom: With splitting planes. The node currently being examined is shown in bold. Nodes that are queued to be examined are shown with thin black borders. Pruned nodes are shown with thin gray borders. In this example, the first leaf node explored contains the nearest neighbour, which is close enough that all the queued nodes can be pruned.

4.4 Efficient Implementation Tricks

4.20 We consider the common case in which the kd-tree is built from a static data set and

then queried millions of times. Although there are some applications in which the data

set must be updated online, a number of opportunities arise when points are not added

or removed after the kd-tree is built.

4.21 This section presents a series of optimization “tricks” taken from the battle-hardened

kd-tree implementation at the heart of Astrometry.net. These tricks yield an order

of magnitude speed increase and vastly reduce the memory required compared to the

alternatives available.

4.22 In this section, the dimension of the data points is D, the number of data points is

N, and the number of nodes in the kd-tree is M. All trees are binary and complete. N

is large relative to D; for example in the authors’ application D is 3 or 4, and N is 10

to 100 million. For us, the maximum tree size that fits in a 32-bit address space is an

important design parameter.

4.23 Not all of these implementation “tricks” apply in all cases, but each one helps. Al-

though most of the tricks are concerned with reducing the memory requirements of the

kd-tree, this often has the effect of increasing the speed as well.

4.4.1 Store the data points as a flat array.

4.24 Use a one-dimensional array of size ND, where coordinate j of data point i is stored

at location iD + j. For example, in three dimensions the array will contain

(x_0, y_0, z_0, x_1, y_1, z_1, ...).

4.25 It may be tempting to store the data points as an N ×D two-dimensional array. This

incurs a memory overhead of N pointers, and at runtime requires a pointer dereference.

To store one million points in 3 dimensions (N = 10^6, D = 3), our libkd requires

an overhead of 4 bytes, while the existing implementations ANN and simkd use over 4 megabytes (8 megabytes on 64-bit) and over 20 megabytes (40 on 64-bit), respectively.

4.4.2 Create a complete tree.

4.26 Instead of splitting nodes until each leaf node contains some maximum number of

points, fix the number of tree levels and create a complete tree. The huge advantage of

this trick is that the number of nodes is known in advance, which allows the next trick

to be used without wasting any memory.

4.27 There are disadvantages to this trick. First, the number of data points in the leaf

nodes is only adjustable by factors of two. Second, if nodes are not split at the median

value, then some leaf nodes will contain more data points than others.

4.4.3 Don’t use pointers to connect nodes.

4.28 Instead, put the nodes in a single array. Use the heap indexing strategy shown below

instead of explicit pointers. The root is node 0. The left child of node i is node 2i + 1,

and the right child is node 2i + 2.

              0
          1       2
        3   4   5   6

4.29 For a tree with M nodes, this saves about M pointers, and places sibling nodes—

which are likely to be accessed at the same time—next to each other in memory. This

locality of reference improves cache performance.
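With this layout the tree links become pure index arithmetic; a tiny sketch (the function names are ours):

    def left_child(i):  return 2 * i + 1
    def right_child(i): return 2 * i + 2
    def parent(i):      return (i - 1) // 2

    def is_leaf(i, M):
        # a node whose left child would fall outside the M-node array is a
        # leaf; for a complete tree this matches the i >= M/2 - 1 test below
        return 2 * i + 1 >= M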

4.30 A convenient side-effect of using this trick is that the kd-tree is position-independent:

the array of nodes can be moved in memory without changing any of the contents. Indeed,

it can be written directly to disk in ‘live’ form and restored at a later date. By using the

mmap() system call, the live on-disk representation is immediately available for queries; the time required to read it from disk will be amortized over the subsequent queries.

Regions of space that are never queried may never be read from disk.
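A toy illustration of this position independence, using numpy's memmap (the real libkd does the analogous thing in C with mmap(); the file name and dtype here are placeholders):

    import numpy as np

    # write the node array straight to disk in 'live' form...
    nodes = np.arange(1023, dtype=np.float64)   # stand-in for a node array
    nodes.tofile("tree.kd")

    # ...and map it back later with no parsing or copying; pages are read
    # lazily on first access, so subtrees that are never queried are never
    # loaded from disk
    mapped = np.memmap("tree.kd", dtype=np.float64, mode="r")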

4.4.4 Pivot the data points while building the tree.

4.31 When building the tree, pivot the data along the splitting dimension so that the data

points owned by the child nodes are contiguous in memory. Represent the set of data

points owned by a node as leftmost and rightmost offsets into the data array (L and R).

If necessary, keep a permutation array to map from the final array indices back to the

original indices.

    Input data array:       x: 6 0 7 3 2 9
                            y: 4 8 0 5 5 5
    Root node:              L=0, R=5
    Permutation array:      0 1 2 3 4 5

    Split on first dimension:

    New data array:         x: 2 0 3 7 6 9
                            y: 5 8 5 0 4 5
    New permutation array:  4 1 3 2 0 5
    Child nodes:            L=0, R=2 and L=3, R=5

4.32 Once this has been done, the data points owned by leaf nodes are contiguous in mem-

ory. In addition, leaf nodes that are nearby in space are likely to have their data points

stored nearby in memory. This increases locality of reference and therefore performance.
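One split step of this pivoting, sketched with numpy (np.argpartition does the median partition; the real implementation pivots in place without the temporary index array, and the names here are ours):

    import numpy as np

    def split_node(data, perm, L, R, dim):
        """Partition data[L:R+1] about the median of dimension dim, keeping
        the permutation array in sync.  data has shape (N, D); perm maps
        final positions back to original indices.  Returns the child ranges."""
        n = R - L + 1
        k = n // 2
        order = np.argpartition(data[L:R+1, dim], k)   # left half <= right half
        data[L:R+1] = data[L:R+1][order]
        perm[L:R+1] = perm[L:R+1][order]
        return (L, L + k - 1), (L + k, R)

Applying this recursively to the returned child ranges leaves each leaf's points contiguous, exactly as in the worked example above.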

4.4.5 Don’t use C++ virtual functions.

It may be tempting to define a class hierarchy with an abstract base class Node and

classes InternalNode and LeafNode which inherit from it (as in section 4.3.1). Compilers

implement this by adding a vtable pointer to each InternalNode and LeafNode object. This enlarges each node by the size of a pointer. This wastes 32 (or 64) bits of memory

per node, since the node type is completely determined by its position in the tree. Using

the node layout shown above, node i is a leaf if i ≥ M/2−1 (where M is the total number

of nodes). Using inheritance also incurs an indirect function call for each virtual function,

which itself costs time and prevents the compiler from performing inlining optimizations.

4.4.6 Consider discarding the permutation array.

4.34 Apply the permutation array to auxiliary data to make the kd-tree’s data order the

canonical order. For example, you might have a set of data points where each point has

a class label such as “cat” or “dog”. After creating a kd-tree from the data points, the

points are permuted so that the ith data point no longer corresponds to label i. However,

if you apply the inverse of the kd-tree’s permutation array to the labels, they will again

correspond. You can then discard the kd-tree’s permutation array. This eliminates a

level of indirection and a significant amount of memory (N integers).

Original label array:   cat  dog  robin  horse  parrot  budgie
Permutation array:      4    1    3      2      0       5
Permuted label array:   parrot  dog  horse  robin  cat  budgie
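A sketch in C (string labels are assumed for illustration; the function name is ours):

    #include <stdlib.h>
    #include <string.h>

    /* Re-order labels into the kd-tree's data order so that label i again
       describes data point i; afterwards, the permutation array can be
       discarded.  perm[i] is the original index of the point now stored
       at position i. */
    void permute_labels(const char** labels, const int* perm, int N) {
        const char** tmp = malloc(N * sizeof(const char*));
        for (int i = 0; i < N; i++)
            tmp[i] = labels[perm[i]];
        memcpy(labels, tmp, N * sizeof(const char*));
        free(tmp);
    }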

4.4.7 Store only the rightmost offset of points owned by a node.

4.35 The L and R offsets describe the range of data points owned by a node. Within a level of the kd-tree, the L offset of a node is simply one greater than the R offset of

the node just to the left, or zero if there is no node to the left. Since the L value can

be trivially computed from the R value, there is no need to store both. This saves M

integers.

Level in the tree:    1  |  2       |  3
Node index (i):       0  |  1    2  |  3   4    5    6
Rightmost (R):       19  |  9   19  |  4   9   14   19
Leftmost (L):         0  |  0   10  |  0   5   10   15

(Within each level, each L is one greater than the R of the node to its left; the leftmost node on each level has L = 0.)
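Since each L is determined by a neighbour's R, a lookup needs only the stored R array. A sketch in C (the bit trick rounds i + 1 down to a power of two to find the first node on i's level; the function name is ours):

    /* Leftmost offset of node i, given only the stored R array. */
    static int node_L(const int* R, int i) {
        int first = i + 1;
        while (first & (first - 1))  /* clear low bits: round down to 2^h */
            first &= first - 1;
        first -= 1;                  /* index of leftmost node on i's level */
        return (i == first) ? 0 : R[i - 1] + 1;
    }

For the table above, node_L(R, 4) = R[3] + 1 = 5, and node_L(R, 3) = 0 because node 3 starts its level.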

4.4.8 Don’t store the R offsets of internal nodes.

4.36 Only leaf nodes need to store the R offsets. The leftmost data point L of a non-leaf

node is just the L value of its leftmost leaf. Similarly, the rightmost data point R of a

non-leaf node is R of its rightmost leaf. Try saying that ten times fast! This saves M/2

integers.

4.4.9 With median splits, don’t store the R offsets.

4.37 Compute them instead. If the kd-tree is built by splitting at the median value (the

common case), then the number of data points owned by each node in the tree is a func-

tion of the total number of data points, independent of the values of the data points.

Computing the offsets costs O(log M)³ if exact median splits are used and a fixed round-

ing direction is chosen (i.e., the right subtree gets the extra data point when the number

of data points is odd). However, by choosing the rounding direction carefully—by mov-

ing the splitting position by one place—the data points can be distributed so that the

computation is O(1).

³There may be an O(1) algorithm, though the authors haven’t found it.
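A sketch in C of the O(log M) computation for the fixed rounding direction (the right child gets the extra point); the function name is ours:

    /* Rightmost offset R of node i, for a tree over N points built
       with exact median splits. */
    static int node_R(int i, int N) {
        int path[64];
        int depth = 0;
        for (int m = i; m > 0; m = (m - 1) / 2)  /* record root-to-i path */
            path[depth++] = m;
        int L = 0;      /* leftmost offset of the current node */
        int n = N;      /* number of points it owns */
        while (depth-- > 0) {
            int child = path[depth];
            if (child % 2 == 1) {
                n = n / 2;          /* left child keeps floor(n/2) points */
            } else {
                L += n / 2;         /* right child skips the left block */
                n = n - n / 2;      /* and keeps ceil(n/2) points */
            }
        }
        return L + n - 1;
    }

With N = 20 this reproduces the offsets tabulated in section 4.4.7; for example, node_R(5, 20) = 14 and node_R(3, 20) = 4.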

4.4.10 Consider transposing the data structures.

4.38 Instead of creating an array of node structures, pull the node contents directly into

the kdtree structure, creating a separate array for each element.

Before (an array of node structures):

    struct node {
        double splitval;
        uint8_t splitdim;
    };
    struct kdtree {
        node* nodes;
    };

After (transposed, one array per member):

    struct kdtree {
        double* splitvals;
        uint8_t* splitdims;
    };

4.39 When a compiler encounters a structure such as node above, it pads the structure so

that its total size is a multiple of the size of the largest element. For node, the structure will be padded to 16 bytes, even though only 9 bytes are used. It is possible to force

the compiler to pack the structure tightly, but this results in unaligned memory accesses

(many of the double values will not start on a multiple of eight bytes), which incurs significant overhead.
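A quick standalone check of the padding (this snippet is ours, not part of libkd):

    #include <stdint.h>
    #include <stdio.h>

    struct node {
        double splitval;   /* 8 bytes, must be 8-byte aligned */
        uint8_t splitdim;  /* 1 byte, plus 7 bytes of padding */
    };

    int main(void) {
        /* Prints 16 on typical compilers: 9 bytes used, 7 wasted. */
        printf("sizeof(struct node) = %zu\n", sizeof(struct node));
        return 0;
    }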

4.40 The advantage of transposing the data structures is that the memory required can

be minimized without destroying the structure alignment. The disadvantage is a loss of

locality of reference: typically the members of a structure are accessed at the same time,

and transposing them results in the members being dispersed in memory.

4.4.11 Consider using a smaller data type.

4.41 For most applications, using double values to represent points in [0, 1] is not an effective use of bits. Consider transforming the data to a smaller integer representation.

If the data points live in a small bounded space, then converting double values to 32-bit integers saves a factor of two in space with very little loss of precision. Converting to

16- or 8-bit integers saves more memory but the effect of introducing this approximation

should be considered in the context of the problem at hand.

4.42 The boundary between the kd-tree software library and the application is an ideal

place to do this transformation: the application need not know that the kd-tree library

represents the data points in a smaller format. The kd-tree library converts query values

into the smaller format on the way into the library, and converts results back to the orig-

inal format on the way out. The only user-visible changes should be that the data points

occupy much less memory, queries are faster, and there may be small approximation

errors.
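A sketch in C of the conversion for 16-bit integers, given per-dimension bounds lo and hi (the names and the rounding choice are ours):

    #include <stdint.h>

    /* Map x in [lo, hi] to the full 16-bit range, rounding to nearest. */
    static uint16_t to_fixed(double x, double lo, double hi) {
        double s = (x - lo) / (hi - lo) * 65535.0;
        if (s < 0.0)     s = 0.0;      /* clamp queries outside the box */
        if (s > 65535.0) s = 65535.0;
        return (uint16_t)(s + 0.5);
    }

    /* Map a 16-bit value back to the original coordinate system. */
    static double from_fixed(uint16_t q, double lo, double hi) {
        return lo + ((double)q / 65535.0) * (hi - lo);
    }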

4.4.12 Consider bit-packing the splitting value and dimension.

4.43 Since kd-trees are only suitable for low-dimensional data, the splitting dimension only

requires a few bits. Instead of storing it in a separate array, store it in the bottom few

bits of the splitting value. For example, with four-dimensional data points, the splitting

dimension can be stored in the bottom two bits of a 16-bit integer, leaving 14 bits to

store the splitting position.

    bit:  15 ............ 2   1  0
         +------------------+------+
         |  split position  | dim  |
         +------------------+------+
                    uint16
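A sketch in C of the uint16 layout shown above (the function names are ours):

    #include <stdint.h>

    /* Pack a 14-bit split position and a 2-bit dimension (D <= 4). */
    static uint16_t pack_split(uint16_t pos, uint8_t dim) {
        return (uint16_t)((pos << 2) | (dim & 0x3));  /* pos must fit in 14 bits */
    }
    static uint16_t split_pos(uint16_t packed) { return packed >> 2; }
    static uint8_t  split_dim(uint16_t packed) { return (uint8_t)(packed & 0x3); }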

4.44 In general, this trick conflicts with the trick of not storing the R offsets (section 4.4.9),

since that trick requires precise control over the splitting plane location, while this trick

involves sacrificing some precision to save space. For some data point sets, this conflict

may not arise, and both tricks may be used simultaneously.

4.4.13 Consider throwing away the data.

4.45 If your data points are associated with some other information (such as labels), and

your application can tolerate some false positive results, consider the tradeoffs of dis-

carding the data and keeping the kd-tree. By using the other tricks in this section, the

memory requirements of a kd-tree can be reduced to a small fraction of the size of the

data points themselves. In some applications, the benefit of being able to search more

data points might outweigh the cost of not being able to say exactly where the data

points were.

4.46 Given N data points, we can build a kd-tree with N − 1 total splitting plane nodes.

Each node contains a splitting value and possibly the splitting dimension: it is at most

slightly larger than a single data value. The entire tree requires about 1/D as much

memory as the data points themselves.

4.47 The kd-tree search algorithms then produce bounded approximations. For nearest-

neighbour search, the algorithm can produce the index of the data point that lives in

the same box as the query point, an upper bound on the distance to that point, and a

lower bound on the distance to the true nearest neighbour. It is also possible to produce

a small list of points that is guaranteed to contain the nearest neighbour.

4.5 Speed Comparison

4.48 This section gives empirical evidence that the tricks presented above can yield sig-

nificant improvements. We compare our implementation, libkd, with two previously

released kd-tree implementations: ANN [4] and simkd (Simple kd-trees) [54]. We used these packages ‘out of the box’, using the default compiler settings and choices about

how to build the kd-tree.

4.49 We chose a very simple benchmark test: the data points are 5 million samples drawn

uniformly from the unit cube (N = 5 × 10⁶, D = 3). The number of points owned by a

leaf node was set to 14 for ANN and simkd in order to make the number of nodes in the three implementations approximately equal. The query points are one million samples

from the same distribution. We tested the one-nearest neighbour algorithm. We would

have preferred to use larger N, since this is the regime in which our Astrometry.net

project operates, but the other libraries were not able to build such large trees. We

Implementation     Speed (k q/sec)     Memory: data + tree = total (Mbytes)
                   32-bit   64-bit     32-bit                64-bit
simkd                 47       39      120 + 250 = 370       120 + 366 = 486
ANN                   71       90      120 +  67 = 187       120 + 101 = 221
libkd-d-box          127      144              120 + 52 = 172
libkd-d-split        231      284              120 +  6 = 126
libkd-d-noR          239      293              120 +  5 = 125
libkd-i-split        242      311               60 +  4 =  64
libkd-i-noR          240      326               60 +  3 =  63
libkd-s-split        328      386               30 +  3 =  33
libkd-s-noR          307      396               30 +  2 =  32

Table 4.1: Benchmark results. Speed is measured in thousands of queries per second (k

q/sec), memory in megabytes. The memory values include the memory required to store

the data points themselves, which is 120 MB when doubles are used. For the libkd

entries, d indicates that the data points are stored as double, i indicates 32-bit integers,

and s indicates 16-bit integers (using trick 4.4.11). box means that bounding-boxes are

created; otherwise splitting-planes are used. The noR entries indicate that we avoid storing the offset arrays (trick 4.4.9). We have also assumed that trick 4.4.6 is applicable,

so the permutation arrays are not stored.

present results for a 32-bit machine⁴ and a 64-bit machine⁵ to show how the memory

requirements differ.

4.50 Table 4.1 and figure 4.9 show our results. Our implementation, libkd, always takes

less memory and produces faster results. The memory requirements of ANN and simkd

⁴Intel Xeon (SL72G) at 3.06 GHz with 4 GB of RAM.
⁵AMD Opteron 8220 SE at 2.8 GHz with 32 GB of RAM.

[Figure 4.9: two bar charts comparing the implementations. Left: memory required (megabytes), split into data and kd-tree components, on the 32-bit and 64-bit machines. Right: speed (thousands of queries per second) on both machines.]

Figure 4.9: Benchmark results. Left: Memory requirements of the various implementations. The simkd and ANN methods use different amounts of memory on 32- and 64-bit machines because pointers are larger on 64-bit machines, and these methods use many pointers. libkd uses no pointers internally, so it uses the same amount of memory on 32-bit and 64-bit machines. The memory usage of libkd is significantly smaller than the competing implementations. When using 32-bit or 16-bit integers (i and s variants, respectively) to store the data, rather than doubles, the memory requirements are reduced even further, though at the expense of some approximation error. Right: Speed (in thousands of queries per second) of the various implementations. libkd is always faster than the competition. The splitting-plane variant is significantly faster than the bounding-box variant. The noR trick seems to help on the 64-bit machine but has little effect on the 32-bit machine. Note that the 64-bit CPU is an AMD while the 32-bit CPU is an Intel, so these differences are likely due to the many differences between the chips rather than the word size per se. Given these differences, practitioners are urged to perform benchmarks on their own hardware and their own data.

increase when using a 64-bit machine, while libkd stays fixed because we use no extraneous pointers. Note that the data points themselves require 120 MB of space, so the

memory overhead of a libkd tree is only a few percent, compared to 50 to 300 percent

for ANN and simkd.

4.51 In this test, splitting planes are much faster than bounding boxes. Using trick 4.4.9

to avoid storing the offset arrays both reduces the memory required and slightly increases

the speed. Using trick 4.4.11 to use smaller data types (32- and 16-bit integers) reduces

the memory footprint by a factor of two or four, with a corresponding increase in speed.

It also incurs some approximation error. In this test, we found that when using 32-bit

integers, we still always produced the correct nearest neighbour, and the absolute error

in the distance estimates was never more than 10⁻⁹. When using 16-bit integers, we

produced the correct nearest neighbour 99.5% of the time, and the absolute distance

error was less than 10⁻⁴.

4.52 In summary, these results show that, on this benchmark, by using the tricks presented here, libkd is three times faster than its closest competitor and incurs one-fifteenth the memory overhead. While benchmarks are not listed among the three canonical

types of lies (“lies, damned lies, and statistics”), we urge practitioners to test these

implementations on their own data. Our testing and benchmarking software is available

for download.

4.6 Conclusion

4.53 Kd-trees are a fun and easy data structure for searching in k-dimensional space. A careful implementation can be orders of magnitude faster than a naïve one. Readers wishing to reap the benefits of such a careful implementation, without building it themselves, are encouraged to download the GPL’d libkd. Patches are welcome!

Chapter 5

Conclusion

5.1 Contributions

5.1 The Astrometry.net system applies the framework of geometric hashing to the as-

tronomical problem of blind astrometric calibration of images. This can be seen as an

instance of object recognition in which the individual objects to be recognized—stars and

galaxies—are almost completely indistinctive at the resolution of typical images. The ge-

ometric relationships between the objects, however, can be used to build very distinctive

features. The problem is made easier by the fact that the stars are very distant, so the

viewpoint is essentially fixed, and while the stars do move, their motions are small enough

that their geometric relationships change very little. The problem is difficult largely for

its sheer scale: typical images cover one-millionth of the surface area of the sky or less,

and errors of various types mean that stars are lost and gained in both the image and

the reference catalog of stars.

5.2 Chapters 2, 3 and 4 highlight the main aspects of the system: a geometric feature-

indexing method that is able to generate hypothesized matches; a robust probabilistic

scheme for testing these hypotheses; and a data structure implementation that allows

the whole system to operate at a speed that is acceptable for it to be used as a practical

tool.

5.3 The specific contributions include the following:

• The design of our “quad” geometric hash code, including the constraints that the

hashed stars be within the circle defined by the “framing” stars, and the constraints

that break the symmetries of the hash code;

• The idea of using a kd-tree rather than a hash table to store the geometric hash

codes;

• A justified probabilistic framework for the verification of the hypotheses generated

by the geometric hashing system, including the use of Bayesian decision theory to

set thresholds in a rational way;

• A method for constructing indices that tiles the sky uniformly with geometric

features but avoids over-reliance on any individual star;

• Extensive evaluation of the Astrometry.net system, including a large-scale study

using the Sloan Digital Sky Survey, and studies using a variety of other imagery to

explore edge cases;

• In particular, an evaluation of the performance of indices using triangles, quadru-

ples, and quintuples of stars; and

• A time- and space-efficient implementation of the kd-tree data structure.

5.2 Future work

5.2.1 Tuning of the Astrometry.net system

5.4 While the Astrometry.net system as presented here is successful at recognizing a wide

variety of astronomical images, and is reasonably fast, there are several aspects of the

system that could be tuned to improve overall performance in practice.

5.5 As detailed in chapter 2, when attempting to recognize an image we simply build

each valid quad (or triangle or quintuple) using stars in the image, starting with the

brightest stars. Different strategies for ordering the quads we test could lead to improved

performance. When an image contains a bright “distractor”—an object that looks like

a star but isn’t a real star—the system will typically build a very large number of quads

that include the distractor, none of which can possibly produce a correct result. Instead

of building quads strictly based on brightness ordering, we could choose to build only a

limited number of quads from any star or pair of stars (perhaps queueing these “over-

used” stars for further examination later) in order to examine more faint stars sooner.

Alternatively, we could sample quads from the image in a probabilistic manner, taking

into account the brightness of each star and perhaps the number of times it has already

been sampled.

5.6 Typically, the Astrometry.net system has several indices in which it must search for

matches to each quad in the image. Currently, we simply proceed in lock-step: for each

quad, we search each index in series. This is an embarrassingly parallel problem, so

distributing the indices across multiple processor cores (and multiple machines) would

lead to speedups. More importantly, however, we suspect that it would be beneficial not

to proceed in lock-step but to split the indices by angular scale and let the wide-angle

indices examine more quads. Since searching in wide-angle indices is fast, this would tend

to balance the amount of computational effort (rather than the number of quads) applied

to each scale. More generally, the computational effort could be distributed based on the

(expected or measured) properties of the images to be recognized, with the computation

being ordered so that the most successful indices are checked first. Indeed, we could

build relatively sparse indices that could be checked quickly to recognize “easy” images,

and denser, slower indices to recognize “hard” images.

5.7 Our experiments in section 2.3.1.7 showed that for Sloan Digital Sky Survey images,

a quad-based index was faster than either a triangle- or quintuple-based index. We

expect that the relative speeds will change as a function of image scale, and that at some large angular scale a triangle-based index will be faster. We have not measured

this, partly because we lack a large uniform test set of wide-angle images. Similarly, we

found that for SDSS images and a quad-based index, using a voting scheme—waiting for

multiple agreeing hypotheses to accumulate before running the verification process—was

slower and very memory-intensive (since our implementation stored every hypothesis).

With a triangle-based index, a voting scheme could be beneficial, because many more

false hypotheses are generated, and a voting scheme is supposed to reduce the number

of times the relatively expensive verification procedure is run. In order to decrease the

memory requirements, we could use a light-weight voting scheme that stores only some

summary information such as the healpix containing the image center; we would then

run the verification procedure on any new hypothesis whose image center healpix (or any

of its healpix neighbours) had already been hit by a previous hypothesis.

5.8 In principle, decisions about which indices to use (triangles, quads, or quints; or

density of features), whether or not to use voting, the number of votes required before

running the verification procedure, and other search parameters, could be chosen at run-

time based on their relative costs (in terms of CPU time or memory) and the properties

of the images to be recognized and the desired operating characteristics (speed versus

recognition rate, for example). We suspect that the optimal structural aspects of the

search (which indices to use, and whether or not to use voting) are determined almost

exclusively by the angular scale of the images to be recognized, and can thus be optimized

off-line. Trading speed for recognition rate, in contrast, can be achieved by, for example,

using indices containing fewer features, or decreasing the feature-space matching distance.

Using a sparser index simply decreases the number of potential matches for each image,

but increases the search speed. Decreasing the feature-space matching distance means

that some true matches will not be found because positional noise of the stars in the image

or index moves the feature further than the distance threshold in feature space. We have

not investigated how to trade off between these approaches so as to optimize the speed

of the system at a given target recognition rate, but it could be done by simulating the

positional errors of stars in the image and the differences in brightness ordering between

the image and index.

5.2.2 Additions to the Astrometry.net system for practical

recognition of astronomical images

5.9 The Astrometry.net system described here is intended to be equally able to recognize

images from any part of the sky and of any scale, using only the information contained

in the image pixels. In many settings, however, the images to be recognized are not

uniformly distributed, and other information is available. This section mentions a few

extensions to the system that could improve its speed or recognition rate in such real-

world settings.

5.10 We have focused on building a system that is equally able to recognize images from

any part of the sky, but the coverage of astronomical imaging is highly non-uniform.

Among both amateur astrophotographers and professional astronomers, images of spe-

cific astronomical objects (such as Messier objects and NGC/IC galaxies) are far more

common than images of “blank sky”. Yet the USNO-B reference catalog often contains

errors in regions of the sky containing bright stars, nebulosity, or high stellar density, so

an index built from that reference catalog will tend to perform worse in the parts of the

sky that are most commonly imaged. Building a specialized index (using an appropriate

reference catalog) that is able to recognize the most commonly imaged regions of the sky

would improve the overall speed and recognition rate of the system.

5.11 A related problem is that the brightest stars in an image are often poorly localized

due to saturation, diffraction spikes, halos, and bleeding of the CCD; our brightness-

ordering heuristics should take this into account, perhaps by preferring to build quads

from stars within a (scale- and location-dependent) range of brightnesses, rather than

simply preferring the brightest stars.

5.12 Our approach presumes that individual stars and unresolved galaxies are completely

indistinctive and cannot be used individually to recognize images. We therefore ignore

the appearance of astronomical sources and focus on their relative geometry. Yet re-

solved galaxies and nebulae are commonly imaged, and their appearances can be quite

distinctive, as indicated by the evocative names of the Sombrero, Whale, Whirlpool, and

Sunflower galaxies and the Horsehead, Running Chicken, Helix, Cat’s Paw, and Eagle

nebulae. Since resolved galaxies and regions of nebulosity are often problematic both in

reference catalogs and for source extraction routines, a pattern-recognition system that

could recognize these astronomical objects based on their appearance would be a good

complement to the geometry-based approach of Astrometry.net.

5.13 Currently, the Astrometry.net web service uses a static set of indices. Ideally, however,

it would learn about the sky and constantly update its reference catalog and indices based

on the stars that it finds in the images it encounters. A system that did this would be

able to achieve some remarkable feats: it could patch the USNO-B reference catalog in

regions where the catalog has flaws. It could deepen the reference catalog, since many

images contain stars that do not appear in the catalog. By building new indices from its

deeper reference catalog, it would be able to recognize narrow-field images in regions that

had been previously imaged many times. In this way, it could adapt to the non-uniform

interest of astronomers in different regions of the sky. For example, if the system had

access to all astronomical imaging, it would easily be able to recognize the narrow-field

images from the Hubble Space Telescope, because every Hubble image is preceded by

ground-based imaging. How exactly a system would update and maintain an astrometric

reference catalog based on a huge amount of imaging is a question on which we have

speculated, in vague terms, in our “theory of everything” paper [33].

5.14 Many of the images submitted to the Astrometry.net web service are produced by

consumer-grade digital cameras and contain EXIF (exchangeable image file format) tags,

which include meta-data about the image such as the camera model, lens, focal length,

exposure time, aperture, and date. We currently ignore these meta-data, but they could

be quite useful. Given the focal length of the lens and the camera model (thus the sensor

geometry), we can compute the pixel scale (in arcseconds per pixel), which makes recog-

nition of the image easier. Perhaps more interestingly, we could build up statistics on the

distortion patterns of different lenses, and given a new image, apply the inverse distortion

before attempting to recognize the image. Of course, we should never completely trust

the information in the EXIF headers, but we could use it as a possibly-correct hint about

the image.

5.2.3 Other kinds of calibration

5.15 After recognizing (i.e., astrometrically calibrating) an image, we can match stars

in the image with stars in a reference catalog. This enables other kinds of calibration

and meta-data creation. For example, we can estimate the photometric calibration and

bandpass filter of the image, or estimate the date the image was taken.

5.16 “Blind” photometric calibration and bandpass estimation proceeds by matching stars

in the image to stars in the reference catalog, and finding the best-fitting parameters that

allow us to convert between “instrumental” fluxes in the image and magnitudes in one or

several of the bands in the reference catalog. If we find that a single band in the reference

catalog fits the image well, it is likely that the bandpass filters of the reference catalog

and the image are similar; if more than one band in the reference catalog is required

to yield a satisfactory fit, then the bandpass of the image is likely different than any of

those in the reference catalog.

5.17 “Blind-dating”—calibrating the date on which an image was taken—proceeds by com-

puting the positions of reference catalog stars at a given date (using their tabulated proper

motions) and finding the date when they best match those of the image stars [6]. Per-

haps surprisingly, the distribution of proper motions of stars is such that even over a time

baseline of 50 to 100 years, enough of the stars move little enough that we are still able

to recognize the images based on their relative positions, yet enough of the stars move

enough that their changes in position contain sufficient information to constrain the date

on which the image was taken. For typical historical photographic plates, the date on

which the image was taken is constrained to within a few years using this method. By

adding photometric information about variable stars, and about transients that might

appear in the image (planets, comets, satellites), we could fine-tune the date estimate to

much greater precision.

5.2.4 Using heterogeneous images for science

5.18 The Astrometry.net system makes a large amount of heterogeneous data readily avail-

able for scientific investigation. How can large collections of images be analyzed to an-

swer scientific questions? This is a surprisingly ill-explored research area in astronomy.

We have been doing some early experiments building generative models of astronomi-

cal images, and doing probabilistic inference in these models, and we feel this is a very

promising avenue for combining the information in images with very different resolutions,

sensitivities, and bandpasses. This is an exciting area for future work.

Bibliography

[1] K. N. Abazajian et al. The Seventh Data Release of the Sloan Digital Sky Survey.

Astrophysical Journal Supplement Series, 182:543–558, June 2009. arXiv:0812.0649.

[2] G. Aldering, G. Adam, P. Antilogus, P. Astier, R. Bacon, S. Bongard, C. Bon-

naud, Y. Copin, D. Hardin, F. Henault, D. A. Howell, J.-P. Lemonnier, J.-M. Levy,

S. C. Loken, P. E. Nugent, R. Pain, A. Pecontal, E. Pecontal, S. Perlmutter, R. M.

Quimby, K. Schahmaneche, G. Smadja, and W. M. Wood-Vasey. Overview of the

Nearby Supernova Factory. In J. A. Tyson and S. Wolff, editors, Society of Photo-

Optical Instrumentation Engineers (SPIE) Conference Series, volume 4836, pages

61–72, December 2002.

[3] Alexandr Andoni and Piotr Indyk. Near-optimal hashing algorithms for approximate

nearest neighbor in high dimensions. In Foundations Of Computer Science, pages

459–468. IEEE Computer Society, 2006.

[4] Sunil Arya, David M. Mount, Nathan S. Netanyahu, Ruth Silverman, and Angela Y.

Wu. An optimal algorithm for approximate nearest neighbor searching fixed dimen-

sions. J. ACM, 45(6):891–923, 1998.

[5] D. H. Ballard. Generalizing the Hough transform to detect arbitrary shapes. Pattern

Recognition, 13(2):111–122, 1981.

[6] J. T. Barron, D. W. Hogg, D. Lang, and S. Roweis. Blind Date: Using Proper Mo-


tions to Determine the Ages of Historical Images. Astronomical Journal, 136:1490–

1501, October 2008. arXiv:0805.0759.

[7] Jonathan T. Barron, Christopher Stumm, David W. Hogg, Dustin Lang, and Sam

Roweis. Cleaning the USNO-B catalog through automatic detection of optical arti-

facts. The Astronomical Journal, 135(1):414–422, 2008.

[8] Jeffrey S. Beis and David G. Lowe. Shape indexing using approximate nearest-

neighbour search in high-dimensional spaces. In Conference on Computer Vision

and Pattern Recognition, pages 1000–1006, 1997.

[9] Jon Louis Bentley. Multidimensional binary search trees used for associative search-

ing. Communications of the ACM, 18(9):509–517, 1975.

[10] E. Bertin. Automatic astrometric and photometric calibration with SCAMP. In

Astronomical Data Analysis Software and Systems 15, 2005.

[11] E. Bertin and S. Arnouts. SExtractor: Software for source extraction. Astronomy

and Astrophysics Supplement, 117:393–404, June 1996.

[12] Burton H. Bloom. Space/time trade-offs in hash coding with allowable errors. Com-

munications of the ACM, 13(7):422–426, 1970.

[13] Christian Böhm, Stefan Berchtold, and Daniel A. Keim. Searching in high-

dimensional spaces: Index structures for improving the performance of multimedia

databases. ACM Computing Survey, 33(3):322–373, 2001.

[14] Bernard Chazelle, Joe Kilian, Ronitt Rubinfeld, and Ayellet Tal. The Bloomier filter:

an efficient data structure for static support lookup tables. In SODA ’04: Proceedings

of the fifteenth annual ACM-SIAM symposium on Discrete algorithms, pages 30–39,

Philadelphia, PA, USA, 2004. Society for Industrial and Applied Mathematics.

[15] D. S. Clouse and C. W. Padgett. Small field-of-view star identification using Bayesian

decision theory. IEEE Transactions on Aerospace and Electronic Systems, 36(3):773–

783, Jul 2000.

[16] Mayur Datar, Nicole Immorlica, Piotr Indyk, and Vahab S. Mirrokni. Locality-

sensitive hashing scheme based on p-stable distributions. In Jack Snoeyink and Jean-

Daniel Boissonnat, editors, Symposium on Computational Geometry, pages 253–262.

ACM, 2004.

[17] M. Davis et al. The All-Wavelength Extended Groth Strip International Sur-

vey (AEGIS) Data Sets. Astrophysical Journal Letters, 660:L1–L6, May 2007.

arXiv:astro-ph/0607355.

[18] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incom-

plete data via the EM algorithm. Journal of the Royal Statistical Society. Series B

(Methodological), 39(1):1–38, 1977.

[19] Kan Deng and Andrew Moore. Multiresolution instance-based learning. In Proceed-

ings of the Twelfth International Joint Conference on Artificial Intellingence, pages

1233–1239, San Francisco, 1995. Morgan Kaufmann.

[20] S. G. Djorgovski, A. Mahabal, A. Drake, C. Donalek, E. Glikman, M. Graham,

R. Williams, T. Morton, A. Bauer, C. Baltay, D. Rabinowitz, R. Scalzo, P. Nugent,

and Palomar-Quest Survey Team. Exploration of the Time Domain With Palomar-

Quest Survey. In Bulletin of the American Astronomical Society, volume 41 of

Bulletin of the American Astronomical Society, pages 258–+, January 2009.

[21] Ying Dong, Fei Xing, and Zheng You. Brightness independent 4-star matching algo-

rithm for lost-in-space 3-axis attitude acquisition. Tsinghua Science & Technology,

11(5):543–548, 2006.

[22] R. O. Duda and P. E. Hart. Use of the Hough transformation to detect lines and

curves in pictures. Communications of the ACM, 15:11–15, 1972.

[23] H. C. Ford et al. Advanced camera for the Hubble Space Telescope. In P. Y. Bely

and J. B. Breckinridge, editors, Society of Photo-Optical Instrumentation Engineers

(SPIE) Conference Series, volume 3356 of Presented at the Society of Photo-Optical

Instrumentation Engineers (SPIE) Conference, pages 234–248, August 1998.

[24] Jerome H. Freidman, Jon Louis Bentley, and Raphael Ari Finkel. An algorithm

for finding best matches in logarithmic expected time. ACM Trans. Math. Softw.,

3(3):209–226, 1977.

[25] Aristides Gionis, Piotr Indyk, and Rajeev Motwani. Similarity search in high dimen-

sions via hashing. In Malcolm P. Atkinson, Maria E. Orlowska, Patrick Valduriez,

Stanley B. Zdonik, and Michael L. Brodie, editors, VLDB’99, Proceedings of 25th

International Conference on Very Large Data Bases, September 7-10, 1999, Edin-

burgh, Scotland, UK, pages 518–529. Morgan Kaufmann, 1999.

[26] K. M. Górski, A. J. Banday, E. Hivon, and B. D. Wandelt. HEALPix — a Framework

for High Resolution, Fast Analysis on the Sphere. In D. A. Bohlender, D. Durand,

and T. H. Handley, editors, Astronomical Data Analysis Software and Systems XI,

volume 281 of Astronomical Society of the Pacific Conference Series, pages 107–+,

2002.

[27] J. Grindlay, S. Tang, R. Simcoe, S. Laycock, E. Los, D. Mink, A. Doane, and

D. Champine. DASCH to Measure (and preserve) the Harvard Plates: Opening the

100-year Time Domain Astronomy Window. Astronomical Society of the Pacific

Conference Proceedings, in press, 2009.

[28] E. J. Groth. A pattern-matching algorithm for two-dimensional coordinate lists.

Astronomical Journal, 91:1244–1248, May 1986.

[29] J. E. Gunn et al. The Sloan Digital Sky Survey Photometric Camera. Astronomical

Journal, 116:3040–3081, December 1998. arXiv:astro-ph/9809085.

[30] Chris Harvey. New algorithms for automated astrometry. Master’s thesis, University

of Toronto, 2004.

[31] Gisli R. Hjaltason and Hanan Samet. Index-driven similarity search in metric spaces

(survey article). ACM Trans. Database Syst., 28(4):517–580, 2003.

[32] E. Høg et al. The Tycho-2 catalogue of the 2.5 million brightest stars. Astronomy

and Astrophysics, 355:L27–30, 2000.

[33] D. W. Hogg and D. Lang. Astronomical imaging: The theory of everything. In

C. A. L. Bailer-Jones, editor, Classification and Discovery in Large Astronomi-

cal Surveys, volume 1082 of AIP Conference Proceedings, pages 331–338, 2008.

arXiv:0810.3851.

[34] Daniel P. Huttenlocher and Shimon Ullman. Recognizing solid objects by alignment

with an image. Int. J. Comput. Vision, 5(2):195–212, 1990.

[35] Z. Ivezić, J. A. Tyson, R. Allsman, J. Andrew, and R. Angel. LSST: from Sci-

ence Drivers to Reference Design and Anticipated Data Products. arXiv:astro-

ph/0805.2366, May 2008.

[36] H. Jenkner, R. E. Doxsey, R. J. Hanisch, S. H. Lubow, W. W. Miller, III, and R. L.

White. Concept for the Hubble Legacy Archive. In C. Gabriel, C. Arviset, D. Ponz,

and S. Enrique, editors, Astronomical Data Analysis Software and Systems XV,

volume 351 of Astronomical Society of the Pacific Conference Series, pages 406–+,

July 2006.

[37] J. L. Junkins, C. C. White, III, and J. D. Turner. Star pattern recognition for

real time attitude determination. Journal of the Astronautical Sciences, 25:251–270,

1977.

[38] N. Kaiser, G. Wilson, G. Luppino, and H. Dahle. A photometric study of the

supercluster MS0302 with the UH8K CCD camera: Image processing and object

catalogs. arXiv:astro-ph/9907229v1, 1999.

[39] David G. Kirkpatrick. Optimal search in planar subdivisions. SIAM Journal on

Computing, 12(1):28–35, 1983.

[40] A. Kirsch and M. Mitzenmacher. Distance-sensitive Bloom filters. In Proceedings of

the Eighth Workshop on Algorithm Engineering and Experiments (ALENEX), 2006.

[41] S. Kullback and R. A. Leibler. On information and sufficiency. The Annals of

Mathematical Statistics, 22(1):79–86, 1951.

[42] Y. Lamdan, J. T. Schwartz, and H. J. Wolfson. Affine invariant model-based object

recognition. IEEE Transactions on Robotics and Automation, 6(5):578–589, oct

1990.

[43] A. H. Land and A. G. Doig. An automatic method of solving discrete programming

problems. Econometrica, 28(3):497–520, Jul 1960.

[44] Carl Christian Liebe. Pattern recognition of star constellations for spacecraft appli-

cations. IEEE Aerospace and Electronic Systems Magazine, 8(1):31–39, 1993.

[45] Carl Christian Liebe, Konstantin Gromov, and David Matthew Meller. Toward a

stellar gyroscope for spacecraft attitude determination. Journal of Guidance, Con-

trol, and Dynamics, 27(1):91–99, 2004.

[46] R. Lupton, J. E. Gunn, Z. Ivezić, G. R. Knapp, and S. Kent. The SDSS Imaging

Pipelines. In F. R. Harnden, Jr., F. A. Primini, and H. E. Payne, editors, Astro-

nomical Data Analysis Software and Systems X, volume 238 of Astronomical Society

of the Pacific Conference Series, pages 269–+, 2001.

[47] David J. MacDonald and Kellogg S. Booth. Heuristics for ray tracing using space

subdivision. The Visual Computer, 6(3):153–166, 1990.

[48] David MacKay and Sam Roweis. Astronomical image recognition.

http://www.inference.phy.cam.ac.uk/mackay/presentations/stars.

[49] C. Martin and GALEX Science Team. The Galaxy Evolution Explorer (GALEX).

In Bulletin of the American Astronomical Society, volume 35 of Bulletin of the

American Astronomical Society, pages 1363–+, December 2003.

[50] D. Mink, A. Doane, R. Simcoe, E. Los, and J. Grindlay. The Harvard Plate Scan-

ning Project. In M. Tsvetkov, V. Golev, F. Murtagh, and R. Molina, editors, Virtual

Observatory: Plate Content Digitization, Archive Mining and Image Sequence Pro-

cessing, pages 54–60, April 2006.

[51] D. J. Mink. WCSTools 4.0: Building Astrometry and Catalogs into Pipelines. In

C. Gabriel, C. Arviset, D. Ponz, and E. Solano, editors, Astronomical Data Anal-

ysis Software and Systems XV, volume 15 of Astronomical Society of the Pacific

Conference Series, pages 204–+, 2006.

[52] D. G. Monet et al. The USNO-B Catalog. Astronomical Journal, 125:984–993,

February 2003.

[53] Andrew Moore. The Anchors Hierarchy: Using the triangle inequality to survive

high dimensional data. Technical Report CMU-RI-TR-00-05, Robotics Institute,

Carnegie Mellon University, Pittsburgh, PA, February 2000.

[54] Andrew Moore and John Ostlund. Simple kd-tree source code.

http://www.autonlab.org.

[55] P. E. Nugent, Palomar Transient Factory, and DeepSky Project. DeepSky and PTF:

A New Era Begins. In Bulletin of the American Astronomical Society, volume 41 of

Bulletin of the American Astronomical Society, pages 419–+, January 2009.

[56] C. Padgett and K. Kreutz-Delgado. A grid algorithm for autonomous star identi-

fication. IEEE Transactions on Aerospace and Electronic Systems, 33(1):202–213,

1997.

[57] A. Pál and G. Á. Bakos. Astrometry in Wide-Field Surveys. The Publications of

the Astronomical Society of the Pacific, 118:1474–1483, October 2006.

[58] D. Schade. Astrometry and the Virtual Observatory. In P. K. Seidelmann and

A. K. B. Monet, editors, Astrometry in the Age of the Next Generation of Large

Telescopes, volume 338 of Astronomical Society of the Pacific Conference Series,

pages 156–+, October 2005.

[59] M. F. Skrutskie et al. The Two Micron All Sky Survey (2MASS). Astronomical

Journal, 131:1163, 2006.

[60] Robert F. Sproull. Refinements to nearest-neighbor searching in k-dimensional trees.

Algorithmica, 6(1):579–589, 1991.

[61] A. S. Szalay. The National Virtual Observatory. In F. R. Harnden, Jr., F. A.

Primini, and H. E. Payne, editors, Astronomical Data Analysis Software and Systems

X, volume 238 of Astronomical Society of the Pacific Conference Series, pages 3–+,

2001.

[62] D. Tody and R. Plante. Simple Image Access Specification Version 1.0, May 2009.

http://ivoa.net/Documents/PR/DAL/PR-SIA-1.0-20090521.html.

[63] Bill Triggs, P. McLauchlan, Richard Hartley, and A. Fitzgibbon. Bundle

adjustment—a modern synthesis. In B. Triggs, A. Zisserman, and R. Szeliski, ed-

itors, Vision Algorithms: Theory and Practice, volume 1883 of Lecture Notes in

Computer Science, pages 298–372. Springer-Verlag, 2000.

[64] J. K. Uhlmann. Metric trees. Applied Mathematics Letters, 4(5):61–62, 1991.

[65] F. G. Valdes, L. E. Campusano, J. D. Velasquez, and P. B. Stetson. FOCAS Au-

tomatic Catalog Matching Algorithms. Publications of the Astronomical Society of

the Pacific, 107:1119–, November 1995.

[66] R. E. Williams et al. The Hubble Deep Field: Observations, Data Reduction, and

Galaxy Photometry. Astronomical Journal, 112:1335–+, October 1996. arXiv:astro-

ph/9607174.

[67] H. J. Wolfson and I. Rigoutsos. Geometric hashing: an overview. IEEE Computa-

tional Science and Engineering, 4(4):10–21, 1997.

[68] T. Yairi, K. Hirama, and K. Hori. Fast and simple topological map construction

based on cooccurrence frequency of landmark observation. In Intelligent Robots and

Systems, 2001. Proceedings. 2001 IEEE/RSJ International Conference on, volume 3,

pages 1263–1268 vol.3, 2001.

[69] D. G. York et al. The Sloan Digital Sky Survey: Technical Summary. Astronomical

Journal, 120:1579–1587, September 2000. arXiv:astro-ph/0006396.

[70] N. Zacharias, D. G. Monet, S. E. Levine, S. E. Urban, R. Gaume, and G. L. Wycoff.

The Naval Observatory Merged Astrometric Dataset (NOMAD). In American As-

tronomical Society 205th Meeting, 2005.