A Close Look at the Transient Sky in a Neighbouring Galaxy

Kiran Tikare

Space Engineering, master's level (120 credits) 2020

Luleå University of Technology Department of Computer Science, Electrical and Space Engineering Master Thesis

A close look at the transient sky in a neighboring galaxy

Kiran Tikare

Erasmus Mundus SpaceMaster Program, Department of Computer Science, Electrical and Space Engineering, Luleå University of Technology, Kiruna, Sweden

Supervisors: Prof. Ariel Goobar, Dr. Rahul Biswas
Examiner: Prof. Anita Enmark

To my beloved motherland Bharat and to the ancient and modern Gurus...

Om Asatomaa Sadgamaya, Tamasomaa Jyotirgamaya, Mrtyormaa Amrtamgamaya, Om Shaantih Shaantih Shaantih

Lead us from the unreal to the real Lead us from darkness to light Lead us from death to immortality Aum peace, peace, peace!

– Brihadaranyaka Upanishad (1.3.28)

If I have seen further it is by standing on ye sholders of Giants.

– Isaac Newton in a letter to Robert Hooke

One thing I have learned in a long life: that all our science, measured against reality, is primitive and childlike — and yet it is the most precious thing we have.

– Albert Einstein

Bear in mind that the wonderful things you learn in your schools are the work of many generations, produced by enthusiastic effort and infinite labour in every country of the world. All this is put into your hands as your inheritance in order that you may receive it, honour it, add to it, and one day faithfully hand it on to your children. Thus do we mortals achieve immortality in the permanent things which we create in common. If you always keep that in mind you will find a meaning in life and work and acquire the right attitude toward other nations and ages.

– Albert Einstein, Ideas and Opinions

Abstract

The study of time-variable sources and phenomena in astrophysics provides us with important insights into stellar evolution, galactic evolution, stellar population studies and cosmological studies such as the number density of dark massive objects. The study of these sources and phenomena forms the basis of time domain surveys, in which telescopes scanning the sky regularly over a period of time provide us with positional and temporal data on various astrophysical sources and phenomena happening in the sky. Our vantage point within the galaxy greatly limits studying our galaxy in its entirety. In such a scenario our nearest neighbour, the Andromeda Galaxy (M31), proves to be an excellent choice, as its proximity and inclination allow us to resolve millions of stars using space based telescopes. The Zwicky Transient Facility (ZTF) is a new optical time domain survey at the Palomar Observatory, which has collected data in the direction of M31 for over 6 months using multiple filters. This thesis involves the exploitation of this rich data set. Stars in M31 are not resolved in ZTF, as it is a ground based facility. This requires us to use the large public catalogue of stars observed with the Hubble Space Telescope (HST): the Panchromatic Hubble Andromeda Treasury (PHAT). The PHAT catalogue provides us with stellar coordinates and observed brightness for millions of resolved stars in the direction of M31 in multiple filters.

Processing the large volumes of data generated by time domain surveys requires us to develop new data processing pipelines, utilize statistical techniques for determining various statistical features of the data, and use machine learning algorithms to classify the data into different categories. The end result of such processing is astronomical catalogues of various astrophysical sources and phenomena, together with their light curves. In this thesis we have developed a data processing and analysis pipeline based on the forced aperture photometry technique.
Since the stars are not resolved in ZTF, we performed photometry at the pixel level. Only a small portion of the ZTF dataset has been analyzed, and photometric light curves have been generated for a few interesting sources. In our preliminary investigations we have used a machine learning algorithm to classify the resulting time series data into different categories. We also performed cross comparisons with data from other studies in the region of the Andromeda galaxy.

Contents

List of Figures
List of Tables
Chapter 1 – Introduction
  1.1 Our work
  1.2 Outline of the Thesis
  1.3 Scientific Context
    1.3.1 The Realm of the Nebulae - Island Universe
    1.3.2 Golden Era of Sky Surveys and Big Data
    1.3.3 Statistical and Machine Learning Techniques
    1.3.4 Time Domain
      Stellar Variability
      Transients
      Gravitational Lensing
      Strong Lensing
      Weak Lensing
      Microlensing
      Applications of Gravitational Lensing
  1.4 Our Testbed - The Andromeda Galaxy
Chapter 2 – Studying Stars
  2.1 Magnitude System
    2.1.1 Apparent Magnitude - m
    2.1.2 Absolute Magnitude - M
    2.1.3 Bolometric Magnitude
  2.2 Photon Detection using Charge Coupled Devices (CCDs)
    2.2.1 Key parameters of CCDs
  2.3 Errors and Noise
    2.3.1 Dark Current
    2.3.2 Photon Noise
    2.3.3 Readout Noise
    2.3.4 Pixel Saturation
    2.3.5 Cosmic Rays
    2.3.6 Sky Background
    2.3.7 Other sources of Noise
    2.3.8 Crowding
  2.4 Photometric Filter System
  2.5 Stellar Photometry
    2.5.1 Aperture Photometry
      Extinction correction and Photometric Calibration
      Uncertainties
    2.5.2 Light Curves
Chapter 3 – Observational Datasets
  3.1 Panchromatic Hubble Andromeda Treasury (PHAT)
  3.2 Zwicky Transient Facility (ZTF)
Chapter 4 – Data Processing Approach and Pipeline Development
  4.1 Software Tool Chain
    4.1.1 Python Programming Language
    4.1.2 Python Packages
    4.1.3 Git as Version Control System
    4.1.4 Imaging and data visualization
  4.2 HPC Computing Resource - NSC's Tetralith Supercomputer
  4.3 Exploratory Data Analysis
    4.3.1 Data used for processing and Analysis
    4.3.2 Quality Filtering of ZTF images
  4.4 Our Approach
    4.4.1 Pixel Photometry
    4.4.2 Source Measurement and Estimation
    4.4.3 Background Measurement and Estimation
  4.5 Methodology and Workflow
  4.6 Use of Machine Learning Algorithm for Classification
    4.6.1 Principal Component Analysis (PCA)
    4.6.2 Random Forest Algorithm (RFA)
  4.7 Validating the Pipeline
Chapter 5 – Results and Data Analysis
  5.1 Time Series Data Analysis
    5.1.1 Visual Assessment
    5.1.2 Lomb Scargle Periodogram Analysis for Variability
  5.2 Machine Learning Classification
    5.2.1 Microlensing Detections
    5.2.2 Variable Detections
    5.2.3 Cataclysmic Variable Detections
    5.2.4 Constant Source Detections
  5.3 Cross Comparisons
    5.3.1 Comparison with Hubble Catalog of Variables (HCV)
    5.3.2 Comparison with ZTF Caltech team
    5.3.3 Comparison with Caldwell in M31
  5.4 Summary and Conclusion
  5.5 Outlook and Future Work
Appendix A – Terminology
Appendix B – zwindromeda - python pipeline
  Python Packages
  7.1 Zwindromeda - Pipeline
References

Acknowledgments

It was in 1999 that I first came to know about gravitational lensing, during one of the thought-provoking classes taught by my physics teacher Shri Arun Pujer. That I remember it even after 20 years is a testimony to the impression it created on me and to my subsequent curiosity about the phenomenon. Coming back to 2019, while looking for a Master Thesis I came across the Caltech webpage of Prof. Shrinivas Kulkarni, about whom I had read during my school and college days in various Indian regional and national print media. I was excited to write to him to seek his guidance for my Master Thesis. Following our email exchange, Prof. Shrinivas Kulkarni suggested that I do my Master Thesis under his collaborator Prof. Ariel Goobar at The Oskar Klein Center for Cosmoparticle Physics, Stockholm University in Stockholm, Sweden. I readily accepted the thesis topic proposed by Prof. Ariel Goobar, as I was excited to work on it.

Coincidentally, 2019 also marked the 100th anniversary of the observation of the deflection of light by gravity, which was predicted by Albert Einstein in his General Theory of Relativity and observationally confirmed by Sir Arthur Eddington during a solar eclipse on 29 May 1919. Apart from this there are other important historical anniversaries which make this a rather exciting period: the 100th anniversary of the Saha equation (named after Meghnad Saha), the 50th anniversary of the discovery of the CCD, and the 100th anniversary of the Great Debate (April 26, 1920) between astronomers Harlow Shapley and Heber Curtis. Without those works I would not be doing this thesis work today.

Every thesis reflects the subject matter understanding (at the time of writing) of its author, or the lack thereof.
As this work is in the domain of science, its ideas are put to the test rigorously, and any gaps in understanding, wrong directions or misinterpretations will be found out in subsequent work by the author or by other researchers working on the same problems, and may require revision at some point after the thesis has been published. A look at the history of science makes it evident that scientific exploration and the insights gained are not linear; they have undergone rigorous tests, revisions and modifications, and only those works remain which agree with experiment.¹ As one goes through this thesis, one may notice mention of historical developments and some quotes from works which I have read, and which had an impact on me

¹ "It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong." – Richard Feynman

and have remained a source of inspiration in all my limited scientific adventures up to this point in time. Although some health issues and residence permit issues affected me during this time, I thoroughly enjoyed working on this thesis and reading invaluable scientific literature. At the final stages of preparing the thesis report, the COVID-19² outbreak began, causing loss of life and impacting the economy across the world; several countries have been forced to lock down. The impact of this pandemic on the health of people and on economies across the world will be tremendous. It also affected my health, and I had to self-quarantine, which caused some delay with the thesis work.

I am greatly indebted to Prof. Shrinivas Kulkarni for suggesting that I work on my Master Thesis under Prof. Ariel Goobar. I sincerely thank my thesis advisor Prof. Ariel Goobar for providing me with the opportunity to work on a fascinating topic under his advice and guidance, and for his endless encouragement during this thesis. I sincerely thank Dr. Rahul Biswas for his valuable support, discussions and continuous guidance, in particular on High Performance Computing (HPC), Python, data visualization and statistics, and Dr. Seméli Papadogiannakis for advice and discussions. It was also wonderful to attend the various colloquium seminars organized at The Oskar Klein Center, which were insightful and thought provoking; I will always cherish these seminars. I sincerely thank the internal examiner Prof. Anita Enmark, whose lectures during other courses I have greatly enjoyed. I thank Prof. Victoria Barabash, head of the Erasmus Mundus SpaceMaster Program at Luleå University of Technology (LTU), Kiruna, Sweden, for her constant support throughout my Master's studies. I cannot thank enough Ms Maria Winnebäck for all her support with administrative work and for making our life easier at Luleå University of Technology, Kiruna, Sweden during my studies.
I would also like to thank Anette Snällfot-Brändström for her support with administrative work. I am indebted to all those teachers, lecturers, professors, scientists and creators of online resources across the world from whom I have learnt various things; they have kept me going and have shaped me. Finally, I owe significant thanks to my family and friends who stood by my side and supported me in various ways in pursuing these Master's studies. Without their encouragement, moral and financial support this journey would not have been possible. I would also like to acknowledge the NASA ADS System³ and arXiv⁴, which provided invaluable scientific literature, and various open source software such as Python, LaTeX, Ubuntu Linux OS and GitHub, which were of enormous use during the course of the Master Thesis.

This work is based on observations obtained with the Samuel Oschin 48-inch Telescope at the Palomar Observatory as part of the Zwicky Transient Facility project. ZTF is supported by the National Science Foundation under Grant No. AST-1440341 and a collaboration

² https://www.who.int/emergencies/diseases/novel-coronavirus-2019
³ https://ui.adsabs.harvard.edu/
⁴ https://arxiv.org/

including Caltech, IPAC, the Weizmann Institute for Science, the Oskar Klein Center at Stockholm University, the University of Maryland, the University of Washington, Deutsches Elektronen-Synchrotron and Humboldt University, Los Alamos National Laboratories, the TANGO Consortium of Taiwan, the University of Wisconsin at Milwaukee, and Lawrence Berkeley National Laboratories. Operations are conducted by COO, IPAC, and UW.

The computations and data handling were enabled by resources provided by the Swedish National Infrastructure for Computing (SNIC) at the National Supercomputer Centre (NSC), Linköping University, partially funded by the Swedish Research Council through grant agreement no. SNIC 2019/3-575.

This work is also based on observations made with the NASA/ESA Hubble Space Telescope, obtained from the MAST Data Archive (https://archive.stsci.edu/prepds/phat/) at the Space Telescope Science Institute, which is operated by the Association of Universities for Research in Astronomy, Inc., under NASA contract NAS 5-26555. These observations are associated with the Panchromatic Hubble Andromeda Treasury Multi-cycle Program.

This research has made use of the SVO Filter Profile Service (http://svo2.cab.inta-csic.es/theory/fps/), supported by the Spanish MINECO through grant AYA2017-84089.

List of Figures

1.1 Cosmic Inventory
1.2 Comparison of Field of View of various Sky Survey Telescopes
1.3 The ray diagram showing the geometry of strong lensing
1.4 The artistic impression showing the geometry of microlensing
1.5 Magnification of source due to gravitational microlensing
1.6 Picture of Andromeda Galaxy (M31) taken from Zwicky Transient Facility (ZTF)
1.7 Major Variability Surveys of M31
2.1 Workings of a CCD Camera
2.2 Filter transmission for the ZTF g, r, and i-band filters
2.3 Aperture Photometry Technique
2.4 Aperture Photometry Plots
2.5 Light curves of (a) Transient events, (b) Variables
3.1 Location and Alignment of PHAT Bricks
3.2 HST PHAT Composite color image
3.3 RGB image from PHAT Brick 11 ACS-WFC
3.4 Comparison of Field of View of various meter class Sky Survey Telescopes
3.5 Specifications of the ZTF Observing System
3.6 ZTF CCD readout Channel Layout
4.1 PHAT Brick 09 and Brick 11 Overlap Region
4.2 Plot showing F814W Vega vs F814W SNR for Brick 11
4.3 Histogram Plot showing number of stars vs F814W Vega Magnitude
4.4 ZTF Science images of M31
4.5 Distribution of ZTF Observations under fieldid 695 and r band filter
4.6 Plot for Quality Cut
4.7 Blending and overlapping ra/dec
4.8 Data Processing Workflow
5.1 Detections classified as Microlensing from Machine Learning Algorithm LIA
5.2 False Positives due to artifacts. North is up and East is left of the image
5.3 Plot of Variable Star detection through Machine Learning classification
5.4 Plot of Cataclysmic Variable Detections through Machine Learning classification
5.5 Plot of Constant Source Detections through Machine Learning classification

5.6 PHAT Brick 11 over Hubble Catalogues of Variables (HCV) data of M31 region
5.7 Cross Match with ZTF Caltech Team data
5.8 Cross Match with Caldwell Star Catalogue to identify Star Clusters

List of Tables

1.1 Variables and Transients
1.2 Sky Surveys towards M31 - Some of these are specific to the search for microlensing events, while some surveys have different scientific objectives
3.1 Approximate Corners of PHAT Bricks
3.2 Downloaded ZTF Science Images listed per field id, filter id, ccdid and qid
4.1 Number of Stars in each Band in each Brick of PHAT
4.2 Column Names used in the final dataset
4.3 List of 47 Statistical features computed using LIA for classification

Chapter 1 – Introduction

History of astronomy displaces us from cosmic importance.

– Unknown

One of the most remarkable and surprisingly unexpected discoveries of recent years - based on Type Ia supernovae - was that the expansion of the universe is accelerating¹ [1,2,3], and we do not yet know much about the source responsible for this cosmic acceleration. It is generally accepted that this accelerated expansion of the universe began approximately 4 billion years ago, during the dark energy dominated era [4,5]. The important thing to know is that without the expansion and cooling of the universe, the atoms, molecules, terrestrial planets, stars, and all the large scale structures, including life on Earth, would not have come into existence. How we arrived at this understanding of the universe from ancient times is an exciting story of human curiosity and ingenuity.

In this chapter we start with the outline of the Master Thesis work. We provide introductory information about the astrophysical landscape, tracing the journey from the beginnings of astronomy to contemporary times, and on to the sky surveys and statistical techniques used for data processing. After this we take a look at time domain studies of transients and variability, including a somewhat detailed explanation of microlensing. Finally, we take a look at the Andromeda Galaxy and the previous time domain studies of this galaxy.

1.1 Our work

The scope of this thesis work is limited to the following tasks: i) develop a software pipeline for processing the data gathered over 6 months in the optical band from ground based observations using the Zwicky Transient Facility (ZTF) towards the Andromeda Galaxy; ii) process a small part of the imaging data using the developed software pipeline and

¹ The accelerating expansion of the universe is the observation that the velocity at which a distant galaxy is receding from the observer is continuously increasing with time.

generate time series data; and iii) generate light curves from the time series data. We have also made an attempt to demonstrate the use of machine learning techniques for classification and cross comparisons with other catalogues.

The scientific motivation behind the thesis work is that the imaging data from the Zwicky Transient Facility (ZTF) optical survey can be mined to find astrophysical variability and transients in our neighbouring galaxy, the Andromeda Galaxy. For this we need stellar positional data, so we tap into the rich data set of the Hubble Space Telescope's Panchromatic Hubble Andromeda Treasury (PHAT) catalogue, which provides us with observed brightness and stellar coordinates for millions of resolved stars in the direction of the Andromeda Galaxy. It is important to note that stars in the Andromeda Galaxy are not resolved in ZTF, as it is a ground based facility, but they are resolved by the Hubble Space Telescope. We perform forced aperture photometry on ZTF images at the stellar coordinates provided by HST's PHAT catalogue. Studying this imaging data will not only help us with a census of transient events and variable sources but also lead to an understanding of the distribution of such sources and phenomenology in the Andromeda Galaxy, and aid in the understanding of stellar and galactic evolution.

The technical motivations are that this project involves processing a large volume of data, and that it uses data generated by both space based and ground based telescopes. Another important aspect is that, as with other sciences, astronomy is largely becoming a data-intensive science, so the modern astronomer requires knowledge of computer science and statistics. In the coming days software engineering and statistics will be essential tools for astronomers to do large scale processing of data, which requires the development of different automation techniques.
The bulk of the thesis work involved one such approach: a semi-automated, supervised processing of the data, generation of light curves, preliminary exploratory data analysis, and the use of a machine learning algorithm for classification of the detections.
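To make the forced-photometry step concrete, the sketch below shows the core measurement in plain NumPy: at a fixed catalogue position, sum the counts inside a small circular aperture and subtract a sky level estimated from a surrounding annulus. This is an illustrative toy, not the thesis pipeline itself; the function name, the aperture and annulus radii, and the synthetic image are invented for this example.

```python
import numpy as np

def forced_aperture_photometry(image, x, y, r_ap=3.0, r_in=5.0, r_out=8.0):
    """Sum the flux in a circular aperture at a fixed ("forced") pixel position,
    subtracting the median sky level estimated in a surrounding annulus."""
    yy, xx = np.indices(image.shape)
    dist = np.hypot(xx - x, yy - y)           # distance of each pixel from the source
    ap_mask = dist <= r_ap                    # pixels inside the aperture
    ann_mask = (dist >= r_in) & (dist <= r_out)  # pixels in the sky annulus
    sky = np.median(image[ann_mask])          # per-pixel background estimate
    return image[ap_mask].sum() - sky * ap_mask.sum()

# Toy image: a flat sky of 10 counts/pixel plus a 100-count point source.
img = np.full((21, 21), 10.0)
img[10, 10] += 100.0
flux = forced_aperture_photometry(img, 10, 10)
print(round(flux, 1))  # 100.0 - the sky is removed, the source flux remains
```

In the real pipeline the (x, y) positions would come from transforming the PHAT (RA, Dec) coordinates onto each ZTF image, and the aperture size would be matched to the seeing; repeating the measurement over many epochs yields the time series from which light curves are built.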

1.2 Outline of the Thesis

The subsequent chapters describe the scientific basis, the scientific methods and astrophysical techniques used, as well as the data processing pipeline development, the data analysis and the results obtained.

Chapter 1 begins with an introduction for the layman and briefly outlines the thesis. The chapter gives a very generic overview of the beginnings of astronomy, contemporary developments and the cosmic inventory. The attention is then turned towards the realm of the galaxies, followed by a brief explanation of how sky surveys and their datasets are turning out to be a gold mine for astronomers, where important discoveries are being made, and how the growth of Big Data science is making use of statistical techniques by adopting machine learning techniques. We then take a look at one important kind of sky survey through which time domain studies are carried out to study astrophysical variability and transients. We discuss stellar variability, various classification schemes, the various mechanisms involved, and transients. We also introduce the subject of gravitational lensing. Since gravitational lensing was of particular interest for this thesis work, a somewhat detailed explanation is given. An attempt has been made to describe some brief historical developments, which I believe helps in getting a broader overview of the topic of interest and gives a glimpse of the decisive turning points which shaped this important tool in use today in astrophysics research. This also serves, in a way, as a tribute to the few giants - to paraphrase Newton - on whose shoulders we climb to look further. We then provide details about the Andromeda Galaxy and briefly list the previous time domain studies carried out in the direction of M31.

Chapter 2 covers the basic theory and practical aspects of the astrophysical techniques utilised in this thesis.
The chapter begins with the nature of stars and how we study them. Following this, the magnitude system is introduced, which helps in quantifying the stellar measurements carried out on observational data. A brief introduction to CCD detectors is then provided, along with the various errors and noise sources that affect the measurements. In the subsequent sections, details of various photometric filter systems are explained; then the stellar photometry technique called aperture photometry is described, followed by a description of light curves and of how photometry helps us gain an understanding of the astrophysical aspects of the sources that we measure and quantify.

In Chapter 3, we provide details of the datasets from the Hubble Space Telescope and the Zwicky Transient Facility, which act as the foundation for this entire thesis.

Chapter 4 deals with the data processing techniques, the software and computing resources used, the pipeline development, the verification and validation of the data processing steps, and the statistical techniques used. This chapter discusses the investigations carried out by the author during the course of the Master Thesis. If carried out for a few more months, this investigation would probably lead to the discovery of interesting objects.

Finally, Chapter 5 discusses the results of the work carried out for the thesis as well as the intended future work.

1.3 Scientific Context

Astronomy started when the ancients began visual observation of the cosmos. What one could see in the night sky with the naked eye was: a few heavenly bodies moving around the sky in front of what appeared to be a fixed background of stars; some stars appearing bright while others appeared faint; stars with blue, white or yellow/orange colors; twinkling stars; stars appearing in different patterns, which were named constellations; solar and lunar eclipses; occasionally comets; and, rarely, new stars appearing for several days and then disappearing without a trace. This gave the impression that the universe is made of a collection of stars and other wandering objects.

Through such observations our ancestors were trying to understand what was going on in the sky above the Earth. They looked at the sky for various purposes: for omens of war and peace, for time keeping, for predicting the weather, for determining suitable times for agriculture, for the meaning of life, and for religious purposes.

The next wave of astronomical observations saw tremendous progress as astronomers began to glimpse the undiscovered universe through the newly invented optical telescope. This had a tremendous impact not only on the physical sciences but also on society. Physical laws of motion and gravitation provided a way to understand the underlying governing physical principles. Following this, the invention of the camera allowed astronomers to take images of the night sky, store them and process them. More recently, developments in digital imaging, super fast computers and large data storage systems have allowed astronomers to store and share the collected data among other astronomers and to perform analysis and study after carrying out observations. Modern developments in space and ground based telescopes are enabling us to study the cosmos in different spatial and temporal aspects.

Cosmology - the study of the Universe in its entirety, from its birth through its evolution to its fate - is rapidly changing at tremendous scale due to the rise in High Performance Computing (HPC) and digital storage resources (used for simulations and data processing), and the unprecedented growth in precision observational datasets produced by different sky survey programs utilizing the higher sensitivity and higher resolution offered by modern telescopes. This growth in high quality datasets may help observational cosmologists answer some of the fundamental questions.

Contemporary developments have led us to discover even more astrophysical objects and astrophysical phenomenology. We now know about different structures on a vast range of scales, each at a different evolutionary stage. Fig. 1.1 shows the cosmic inventory. Our current scientific understanding is limited to only about 5% of the constituents of our Universe; this luminous matter (or baryonic matter²) comprises all that we know and see, from stars, planets, moons, comets and asteroids to gas, dust and people. The other 95% of the constituents of the Universe - dark matter³ and dark energy - remain hidden from our understanding. Our continued observations of the sky reveal a great deal of mysteries yet to be explained by physical laws.

Figure 1.1: Cosmic Inventory

² Only baryonic matter produces the light that we observe in stars and galaxies.
³ "Dark" refers to an exotic form of non-luminous matter (non-baryonic matter) that does not absorb or emit light, and thus remains dark, but shows gravitational interaction.

Observations carried out over millennia have enriched our knowledge and understanding of the universe. We have come a long way, and today we live in an era of multi-messenger astrophysics: an interdisciplinary field that combines data from many different instruments that probe the universe using different cosmic messengers. Cosmic messengers comprise electromagnetic waves (gamma rays, X-rays, UV, visual, infrared, microwaves, radio waves), neutrinos, cosmic rays and, recently, gravitational waves, and they play an important role in understanding our Universe. There are many more areas to explore, including the origin and fate of the Universe [6]. The ongoing research in many areas will continue to improve our understanding of the cosmos in the years to come.

1.3.1 The Realm of the Nebulae⁴ - Island Universe⁵

As we are used to call the appearance of the heavens, where it is surrounded with a bright zone, the Milky Way, it may not be amiss to point out some other very remarkable Nebulae which cannot well be less, but are probably much larger than our own system; and, being also extended, the inhabitants of the planets that attend the stars which compose them must likewise perceive the same phenomena. For which reason they may also be called milky ways by way of distinction.

– William Herschel [8]

Our understanding of the scale of the universe has undergone tremendous change since ancient times. Up until the 15th century it was believed, based on the Aristotelian view, that the Earth was at the center of the Universe (geocentrism) and that a few astronomical objects (the planets, the Moon, the Sun and comets were the only known astronomical objects) revolved around it in front of a fixed sphere of stars. This view was challenged by Copernicus and others, who proposed a Sun-centered universe (heliocentrism), which was confirmed by the further observations that were carried out. Later, Thomas Digges expanded the ideas of Copernicus and was probably the first to suggest an infinite universe, with stars located at varying distances rather than fixed to a sphere [9]. Up to the beginning of the 18th century the known universe was largely limited to the solar system; then, by the late eighteenth century, astronomers found that the solar system was part of a much larger group called a galaxy. Our solar system is located in a spiral arm of a galaxy⁶ - containing a vast number of stars (∼10^11) appearing like a great band of scattered light stretching around the sky - which we call the Milky Way galaxy, our own cosmic home.

The stars that we see in the night sky with the naked eye belong to this Milky Way galaxy, and our Galaxy has been studied extensively. Our home galaxy belongs to a group called the Local Group [10] - a group of galaxies within 1 Mpc of the Milky Way which are gravitationally bound and hence relatively close to each other. The Local Group contains 35 other galaxies, including the Andromeda Galaxy, the Large and Small Magellanic Clouds, and the Triangulum Galaxy (M33). The Milky Way galaxy together with the Andromeda galaxy dominates the Local Group, and the two account for 90% of its luminosity. Most of the other galaxies in the Local Group are clustered around these two galaxies. By modern astronomical convention, concentrations of about ∼10 to 40 galaxies are called "groups", and have masses of the order of ∼10^13-10^14 M☉; concentrations of about ∼40 to 1000 galaxies are called "clusters", and have masses greater than ∼10^14 M☉. Note that these boundaries are arbitrarily defined. Since there are no galaxies between 1.3 Mpc and 2.4 Mpc around our galaxy, our Local Group can be thought of as an island-like region within the universe, and several such groups of galaxies have been observed in the universe. Telescopic observations have revealed that the universe contains millions of galaxies clustered around each other, and some of these clusters are themselves grouped around other clusters, forming what are known as "superclusters", with sizes ranging from 30 to 60 Mpc.

⁴ Title of the book by Edwin Hubble [7].
⁵ Astronomer Heber Doust Curtis's Island Universe theory, suggesting that the Milky Way was just one of many galaxies within the universe.
⁶ A gravitationally bound system of a vast number of astrophysical sources such as stars, planets, moons, asteroids, comets, gas, dust and mysterious dark matter. Typically galaxies contain ∼10^11 stars.

1.3.2 Golden Era of Sky Surveys and Big Data

It is important to have a census of astronomical objects, their distribution, and how they change over time. Once we gather information about many objects we can study them to understand the underlying governing principles as well as to measure basic physical properties of the universe. Some astronomical objects and the phenomenology associated with them are so rare that to find them we must look at millions of objects. This requires mapping the universe over large areas of the sky, to great depths and over a wide wavelength range. Sky surveys are the best technique in this regard for discovering new objects and phenomenology. Since the early 20th century the astronomy community has witnessed large volumes of scientific data being produced by astronomical surveys. The large Field of View (FoV) offered by modern telescopes helps us scan vast areas of the sky in a few pointings; the FoVs of various existing and planned sky-survey telescopes are compared in fig. 1.2. Upcoming astronomical surveys will increase the volume of data even further. Improvements in computing power, storage and other infrastructure have enriched astronomical image processing and data-management capabilities, and these large data sets are processed and maintained in astronomical catalogues.

Figure 1.2: Comparison of the Field of View of various sky-survey telescopes. The Moon and the Andromeda Galaxy are shown to scale. Credits: Joel Johansson

Astronomy is undergoing a revolution that is changing the way we probe the universe and the way we answer fundamental questions. We are in a golden age of ambitious sky surveys, in which telescopes, both ground based and space based, have been scanning (and will continue to scan) large areas of the night sky, penetrating ever deeper into the universe and generating large volumes of science data: what is known today as Big Data7 science. Innovative detectors are opening new windows on the universe, creating unprecedented volumes of high-quality data, and, coupled with this, computing technology is driving a shift in the way scientific research is done in astronomy and astrophysics. Frontiers of survey astronomy include time-domain studies (of transients and variability), a census of the solar system (NEOs, MBAs, comets, KBOs, the Oort Cloud), and dark energy and dark matter studies (gravitational lensing, gravitational waves). Survey data sets have been an invaluable resource to the astronomy community, providing a wealth of information, allowing changes in large numbers of astrophysical sources to be observed and analysed over time, enabling continuous monitoring of astrophysical phenomena, and helping to produce catalogues of a variety of astrophysical objects and phenomena. Large surveys help in constructing catalogues and maps of objects across the entire sky. These catalogues are filled with rich data on the positions, magnitudes, shapes, profiles and temporal behaviour of objects, which gives astronomers an initial classification and candidate objects for further follow-up. From such studies astronomers can understand the details of the astrophysical processes that govern these objects, and extrapolation leads to an understanding of the entire class of such objects.

7 Big data is generally characterised by the "3 V's": i.) Volume (data sizes ranging from terabytes to zettabytes), ii.) Variety (data sources such as astrometric, photometric, spectroscopic, multi-wavelength and multi-messenger data, in both structured and unstructured form) and iii.) Velocity (speed of change, e.g. generating more than terabytes of data per observation or per night).

Astronomical archival data is any kind of organised, systematised information about the sky above us. Historically, archives were very small, containing tens to hundreds of objects, so astronomy was a data-starved science. One of the earliest catalogues is the star catalogue of Hipparchus, on which Ptolemy's Almagest (c. 138 AD) was based; each of these catalogues revolutionised astronomy and physics. During the early telescopic era several star catalogues were produced. Johann Bayer produced a star catalogue and star chart in 1603, in which he grouped stars into constellations and named the brightest star in each constellation α, the 2nd brightest β, and so on. More than a century later, in 1771, Charles Messier produced a catalogue [11] of around 100 extended nebulous objects,8 intended to provide the comet hunters of his time with a list of well-known nebulous objects that were frequently being confused with new comets. This was an important catalogue and is still used today, mainly by amateur astronomers. In 1786 William Herschel, with the assistance of his sister, compiled the Catalogue of Nebulae and Clusters of Stars (CN) [12, 13, 14], which was later expanded into the General Catalogue of Nebulae and Clusters of Stars (GC) [15] by his son, John Herschel. This was expanded again in 1888 by John Louis Emil Dreyer, who compiled 7840 deep-sky objects in the New General Catalogue of Nebulae and Clusters of Stars (NGC) [16]. Apart from general catalogues there are specialised catalogues, which do not list all the stars in the sky but instead highlight a particular type of star, such as variable stars. The earlier catalogues were small and printed; modern catalogues are stored on the world wide web,9 which makes them easy to search and available to other researchers.
Since the earliest days astronomers have gathered and systematised data in catalogues and made it useful for purposes ranging from farming to gaining an understanding of the universe. With the availability of large volumes of data from large sky surveys in recent years, astronomy has become a data-intensive domain, and it will remain so. Next-generation projects such as the Wide Field InfraRed Survey Telescope, WFIRST (IR); the James Webb Space Telescope, JWST (IR); the Vera Rubin Observatory [formerly the Large Synoptic Survey Telescope, LSST] (optical); the Extremely Large Telescope, ELT (optical); the Atacama Large Millimeter/submillimeter Array, ALMA (millimetre); the Square Kilometre Array, SKA (radio); the Laser Interferometer Gravitational-Wave Observatory, LIGO (gravitational waves); and the Laser Interferometer Space Antenna, LISA (gravitational waves), with their extraordinary capabilities, will lead to unprecedented discoveries about our universe. Each of these projects will generate terabytes of science data every night. New sky surveys will continue to explore ever-larger survey volumes in the times to come and will discover new transient and variable phenomena, enriching our knowledge. Technology is a key driver of many recent discoveries in astronomy: new technologies enable us to study phenomena that were inaccessible with old technology and help us develop new approaches for probing the universe.

8 Today we know that some of them are gaseous nebulae, and others are galaxies and star clusters, both open and globular.
9 For modern astronomical catalogues one can refer to the CDS site, http://cdsarc.u-strasbg.fr/cats/Cats.htx

1.3.3 Statistical and Machine Learning Techniques

The data generated by sky surveys provide important, large statistical data sets. Surveys of the sky over many wavelengths of light can be analysed statistically for hidden correlations and explanations, leading to new discoveries. The scale of these data sets requires us to develop innovative, increasingly automated, and increasingly effective ways to mine scientific knowledge. An emerging branch of computational statistics known as Machine Learning (ML) comprises a plethora of algorithms and statistical techniques that perform tasks such as regression, classification and clustering using numerical inferences drawn from statistical patterns in the data, without explicit instructions. The volume of data is far too large for any single person, or even a group of people, to analyse; instead, computer algorithms are written so that a computer can process all the data and find correlations on its own, outputting results that can be analysed further. The real power comes from the fact that these algorithms commonly feed their results back into themselves, learning how to analyse the data even better and produce more accurate results.
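To make the idea of ML classification concrete, here is a minimal sketch, not taken from the thesis pipeline, of a nearest-centroid classifier that sorts toy light curves into "variable" vs "transient" using two hand-picked statistical features. All names, labels and numbers are invented for illustration; real surveys use far richer feature sets and dedicated libraries.

```python
# Illustrative sketch (not the thesis pipeline): a minimal
# nearest-centroid classifier on two light-curve features.
import math
import statistics

def features(mags):
    """Summarise a light curve (list of magnitudes) by two
    statistics: overall amplitude and standard deviation."""
    return (max(mags) - min(mags), statistics.pstdev(mags))

def train(labelled):
    """Compute one centroid per class from (label, light_curve) pairs."""
    sums = {}
    for label, mags in labelled:
        f = features(mags)
        s = sums.setdefault(label, [0.0, 0.0, 0])
        s[0] += f[0]; s[1] += f[1]; s[2] += 1
    return {lbl: (s[0] / s[2], s[1] / s[2]) for lbl, s in sums.items()}

def classify(centroids, mags):
    """Assign the class whose centroid is nearest in feature space."""
    f = features(mags)
    return min(centroids, key=lambda lbl: math.dist(f, centroids[lbl]))

# Toy training data: low-amplitude periodic variables vs a
# one-off brightening event (transient).
training = [
    ("variable",  [18.0, 18.2, 18.0, 18.2, 18.0, 18.2]),
    ("variable",  [17.5, 17.8, 17.5, 17.8, 17.5, 17.8]),
    ("transient", [19.0, 19.0, 16.0, 17.5, 19.0, 19.0]),
]
centroids = train(training)
print(classify(centroids, [18.1, 18.3, 18.1, 18.3, 18.1, 18.3]))  # "variable"
```

The feedback loop mentioned above corresponds, in real systems, to iteratively refitting such models as new labelled data arrive.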

1.3.4 Time Domain Astronomy

Deceptively, the night sky appears unchanging, apart from the twinkling of stars caused by atmospheric effects. However, with careful observations over time, whether with the naked eye or with telescopes (from narrow to wide field of view) across all wavelengths of the EM spectrum, one starts to notice the changing nature of the sky. There are many time-varying astronomical phenomena, periodic and aperiodic, as well as sudden one-time events, occurring both in the local universe and in the distant universe.

Time Domain Astronomy is the study of astrophysical objects that change in brightness and/or position over time. Time-domain studies of stellar origin involve variable stars (such as pulsating stars and eclipsing binaries) and transients (such as supernovae, novae, kilonovae and microlensing events). Time-domain studies of non-stellar origin involve asteroids, which change position and show brightness variations due to rotation, and quasars/Active Galactic Nuclei (AGN), which show stochastic changes in brightness. Time-domain studies are extremely important in astrophysics: they touch all areas of modern astrophysics and help us understand stellar populations, stellar and galactic evolution, and cosmology.

Stellar Variability

Some stars are known to change in brightness over timescales ranging from seconds to years; such stars are known as variable stars. Variable stars are periodic and hence predictable, and they serve as "experimental laboratories" for stellar physics. The earliest record of a variable star dates to 1596, when the German astronomer David Fabricius accidentally noted the variable star o Ceti (of Mira10 type) in the constellation Cetus, which he could not find in the star catalogues of his time. In 1603 Johann Bayer rediscovered this star and listed it as Omicron in his atlas. Almost two centuries later, in 1782, John Goodricke discovered variability in β Persei (also known as Algol) as well as in δ Cephei. These discoveries led to systematic searches for stellar variability. In 1905, Henrietta Leavitt, working as a "computer" and studying stellar variability in the Small Magellanic Cloud (SMC) and Large Magellanic Cloud (LMC) [17, 18], discovered the period-luminosity relation for the group of variable stars known as Cepheids. This important relation, the Leavitt Law, subsequently helped with the determination of distances to the SMC and LMC and to other galaxies, and contributed to our understanding of the structure and scale of the universe.

During its evolution, each star radiates energy in the form of photons with varying intensity through various mechanisms. The cause of a change in light intensity is either intrinsic or extrinsic to the star. If the variability is intrinsic, it may be due to changes in physical properties such as the stellar radius (alternating expansion and contraction, either rapid or slow), driven by the internal mechanisms of energy production (such as nuclear reactions) and energy transport (radiation, convection, conduction), or due to rapid rotation or mass loss of the star. All of these mechanisms affect stellar evolution.
If the cause of the variability is extrinsic, the reason might be an orbiting planet, a companion star, or absorption by the interstellar medium; such mechanisms give us clues about the interstellar medium and the immediate vicinity of the star. Variable stars are divided into different classes, sub-classes and types, listed in table 1.1. Pulsating stars, eclipsing binaries, rotation-induced variables, eruptive stars and cataclysmic variables each have different time scales and amplitudes of variation, from milli-magnitudes to several magnitudes,11 and the variability may be periodic or quasi-irregular. Cepheids are young variable stars located in spiral galaxies; because they obey the period-luminosity relation, they have been used as standard candles for extragalactic distance measurements. In a Cepheid the stellar envelope alternately expands and contracts, which causes the changes in brightness. Flares are sudden explosions caused by the rearranging of magnetic fields, producing flashes of increased stellar brightness; they are similar to solar flares, but some can be 2 or 3 magnitudes stronger than those of the Sun.

10 Mira means "the wonderful".
11 Magnitude is an inverse logarithmic scale used to measure the brightness of astronomical bodies.
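To make the magnitude footnote concrete, here is a short illustrative sketch (not from the thesis) of the defining relation m1 − m2 = −2.5 log10(F1/F2): smaller magnitudes mean brighter objects, and 5 magnitudes correspond to a factor of exactly 100 in flux.

```python
# Illustrative sketch of the astronomical magnitude scale:
# m1 - m2 = -2.5 * log10(F1 / F2). Fluxes below are arbitrary.
import math

def delta_magnitude(flux1, flux2):
    """Magnitude difference m1 - m2 for two measured fluxes."""
    return -2.5 * math.log10(flux1 / flux2)

def flux_ratio(m1, m2):
    """Flux ratio F1/F2 implied by two magnitudes."""
    return 10 ** (-0.4 * (m1 - m2))

print(delta_magnitude(100.0, 1.0))  # 100x brighter => 5 mag smaller: -5.0
print(flux_ratio(19.0, 20.0))       # 1 mag brighter => ~2.512x the flux
```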

| No | Category   | Group     | Class                        | Sub Class             | Type               |
|----|------------|-----------|------------------------------|-----------------------|--------------------|
| 1  | Variables  | Intrinsic | Pulsating Stars              | Cepheids              | Type I Classical   |
| 2  |            |           |                              |                       | Type II W Virginis |
| 3  |            |           |                              | RR Lyrae              |                    |
| 4  |            |           |                              | RV Tauri              |                    |
| 5  |            |           |                              | Long Period Variables | Mira Type          |
| 6  |            |           |                              |                       | Semi Regular       |
| 7  |            |           | Eruptive (Cataclysmic Stars) | Recurrent Novae       |                    |
| 8  |            |           |                              | Dwarf Novae           |                    |
| 9  |            |           |                              | Symbiotic Stars       |                    |
| 10 |            |           |                              | R Coronae Borealis    |                    |
| 11 |            | Extrinsic | Eclipsing Binaries           |                       |                    |
| 12 |            |           | Rotating Variables           |                       |                    |
| 13 | Transients |           | Novae                        |                       |                    |
| 14 |            |           | Supernovae                   |                       |                    |
| 15 |            |           | Gravitational Microlensing   |                       |                    |
| 16 |            |           | Gamma Ray Bursts             |                       |                    |

Table 1.1: Variables and Transients

We also find Cataclysmic Variables (CVs): binary stars in which a compact primary accretes matter from its less massive companion star, creating an accretion disc around itself. As matter falls onto the primary, the accretion disc can become unstable and brighten by 2 or 3 orders of magnitude above its original brightness within a few hours, falling back to the original level within a few days. Historically, CVs were discovered because of this large-amplitude variability. A sub-class of CVs, the Classical Novae (a white dwarf interacting with a late-type companion star), also show large-amplitude variability: they can brighten by 6 to 19 magnitudes for periods of several days to years before fading back to their original brightness [19, 20]. The basic data in stellar-variability studies are photometric observations taken over a period of time, known as time-series data. The light curves generated from these photometric observations are an important source for the interpretation of stellar variability. Coupled with spectroscopic follow-up observations, they help us identify the mechanisms and causes behind the variability.
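As a concrete, illustrative example of working with such time-series data (not taken from the thesis pipeline), the sketch below phase-folds a light curve on a trial period, a standard first step when examining a candidate periodic variable; all data are synthetic.

```python
# Illustrative sketch: phase-folding a light curve. Each observation
# time is mapped to a phase in [0, 1) for a trial period; a truly
# periodic signal then lines up when sorted by phase.
import math

def phase_fold(times, period, t0=0.0):
    """Map observation times onto phases in [0, 1)."""
    return [((t - t0) / period) % 1.0 for t in times]

# Synthetic sinusoidal variable: mag(t) = 18 + 0.3 sin(2*pi*t / 2.5)
period = 2.5  # days (invented)
times = [0.0, 1.1, 3.7, 6.2, 9.4, 12.8, 15.3]
mags = [18.0 + 0.3 * math.sin(2 * math.pi * t / period) for t in times]

phases = phase_fold(times, period)
for ph, m in sorted(zip(phases, mags)):
    print(f"phase {ph:.3f}  mag {m:.3f}")
```

In practice the trial period itself is found with a period-search method (e.g. a periodogram) before folding.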

Transients

Some events/phenomena are sudden: the signal (an increase in brightness) appears suddenly and then fades away. Such sudden astronomical events or phenomena are known as "transients". They are short-lived and unpredictable, with durations ranging from a few seconds to several months; with constant monitoring of the night sky at multiple wavelengths we can catch such events as they occur. Examples of transients are gravitational lensing events, supernovae, novae and gamma-ray bursts. The study of transients helps us understand the various mechanisms involved in the evolution of the universe and also helps in determining the dark matter content of the universe. At this point let us take a brief look at a special class of transients, gravitational lensing, which results from the deflection of light by gravity.

Gravitational Lensing

Nature and Nature’s laws lay hid in night: God said, Let Newton be! and all was light.

- Alexander Pope, Poet

It did not last: the Devil howling "Ho! Let Einstein be!" restored the status quo.

- Sir J.C. Squire, 1926

Can gravity12 affect light? If so, in what ways, and how can we make use of such a phenomenon? These questions have captured the imagination of many natural philosophers and physicists, from Newton, Cavendish, Laplace and Soldner to Einstein, Zwicky and modern cosmologists. Einstein's General Relativity (GR) provides a successful framework for studying and answering some of these questions. In everyday life we perceive reality in terms of Euclidean space: a three-dimensional grid of space with static time. We are familiar with the idea that light travels in straight lines; when you look at a star you might think the light from that star reaches your eyes along a straight line, but across the universe this light follows the curvature of spacetime, as predicted by GR. As light travels it is stretched (shear), distorted, deflected, shifted and magnified (convergence) by the nature of the spacetime fabric and its curvature around astronomical objects. Many of the things we see are therefore not where we perceive them to be. A good analogy is the wavy, fun-house-style party mirror that makes you look stretched, twisted and bent, giving an illusory view of reality.

In 1783 the Rev. John Michell, an elected member of the Royal Society in London, delivered a lecture on the gravity of stars in which he reasoned that a massive star's gravity might be so strong that no light could escape from its surface.13 Following this, in 1796, Pierre-Simon Laplace wrote an essay14 suggesting an effect of the attractive force of heavenly bodies on light [23]. Isaac Newton, while actively researching the nature of light and working on his corpuscular theory, did not speculate on the effect of gravity on light ("Hypotheses non fingo"15 was his attitude); instead he listed this among many other unsolved queries at the end of his second major treatise, Opticks.16 Later, in 1801, Johann Georg von Soldner proposed the deflection of light by gravity and, based on Newtonian gravitational theory, calculated the deflection angle at the solar limb, eq. 1.1 [26]. Observations were not carried out, for two reasons: first, the deflection angle is so small that it was beyond the capability of early-19th-century astronomical instruments; second, the wave theory of light prevailed at the time over the corpuscular theory. In this lens-like picture, a light ray travels along a hyperbola near a spherical gravitating mass, with the mass at its focus; the two asymptotes intersect at a deflection angle α given by,

12 According to General Relativity, gravity is nothing but the manifestation of the curvature of the spacetime fabric. In ordinary Euclidean space, without curvature, light travels in a straight line; in curved Riemannian space, light travels along geodesics.
13 He also wrote a letter to Henry Cavendish in 1784 suggesting the existence of "dark stars" [21].
14 "A luminous star, of the same density as the earth, and whose diameter should be two hundred and fifty times larger than that of the sun, would not, in consequence of its attraction, allow any of its rays to arrive at us; it is therefore possible that the largest luminous bodies in the universe may, through this cause, be invisible." - Laplace in Systeme du Monde, Book 5, Chap. VI [22, 23]; English translation in

α = 2GM/(c²R) = 0.″875    (1.1)

where M is the mass of the Sun, R is the radius of the Sun, G is Newton's gravitational constant, and c is the speed of light; 2GM/c² is the Schwarzschild radius. Looking at this equation, one quickly realises that because c² is enormous compared to GM/R, the deflection angle is very small, and for this reason it was long thought almost impossible to observe. Almost a century later, around 1911, Albert Einstein predicted the deflection of light by the Sun's gravity as a consequence of his General Theory of Relativity (GR). Einstein's initial calculations were similar to Soldner's, but he revised them in 1915 [27, 28] and arrived at a deflection angle double the value calculated by Soldner (eq. 1.2). This followed from the realisation that not only is space bent, but time is distorted as well, slowing down near massive objects. This "time dilation" creates an additional deflection of light on top of the geometric curvature of space. To observe this deflection, Arthur Eddington and others mounted an expedition to observe the solar eclipse of 29 May 1919, during which the Sun was favourably located in front of the Hyades star cluster, allowing measurements of the positions and deflections of the background stars [29]. The expedition set out to answer two questions: i.) does gravity have any effect on light, as if light had mass? and ii.) if so, does the angle of deflection follow Newton's

14 (cont.) Appendix A of Hawking and Ellis, "The Large Scale Structure of Space-Time" [24].
15 Principia, General Scholium, third edition, page 943.
16 "When I made the foregoing observations, I designed to repeat most of them with more care and exactness, .... But I was then interrupted, and cannot now think of taking these things into further consideration. And since I have not finished this part of my design, I shall conclude with proposing only some queries, in order to a farther search to be made by others. Query 1. Do not Bodies act upon Light at a distance, and by their action bend its Rays; and is not this action (caeteris paribus) strongest at the least distance?" [25]

gravitational laws or Einstein's General Relativity? The observations revealed that the deflection angle was indeed as predicted by Einstein. The answers to these questions captured public attention and made Einstein famous across nations. Subsequent observations with microwaves have further improved the accuracy of this deflection angle.

α = 4GM/(c²R) = 1.″75    (1.2)

where M is the mass of the Sun, R is the radius of the Sun, G is Newton's gravitational constant, and c is the speed of light. This bending of light from astronomical objects produces a lens-like effect, now known as Gravitational Lensing. Around 1924 the Russian physicist Orest Danilovich Khvolson (Orest Chwolson) arrived at the same value independently, but he did not think of the effect as a lens [30]. Although Einstein had considered the lens-like effect of this deflection of light, he did not pursue it until 1936, when a Czech engineer, Rudi W. Mandl, visited him and persuaded him to publish.17 Einstein reluctantly published his calculations, concluding that there was no great chance of observing such a phenomenon [31]. Soon afterwards the Swiss-American astronomer Fritz Zwicky developed the idea of lensing by galaxies and predicted that such lenses would be observable,18 as the deflection angle is larger than for point-like sources [33, 32, 34, 35]. The astronomers Sydney Liebes and Sjur Refsdal provided the essential equations for analysing gravitational lensing, and Refsdal was the first to propose using gravitational lensing to measure the expansion rate of the universe, i.e., the Hubble constant [36, 37]. Later, Bohdan Paczynski proposed using gravitational lensing to search for dark matter, brown dwarfs, planets and planetary-mass objects in the galactic bulge, the halo and the Local Group [38, 39, 40, 41]. It was only in 1979 that the first gravitationally lensed quasar was discovered [42, 43]. Then, in 1988, the first partial Einstein ring, MG1131+0456, was discovered, followed in 1998 by the first complete Einstein ring, B1938+666 [44]. Since then gravitational lensing has been used by astrophysicists to discover many lensing events.
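The two deflection-angle formulas, eqs. 1.1 and 1.2, are easy to verify numerically. The sketch below (illustrative; constants rounded to four significant figures) evaluates both at the solar limb and converts from radians to arcseconds:

```python
# Numerical check of light deflection at the solar limb:
# Newtonian alpha = 2GM/(c^2 R) vs Einstein's GR value 4GM/(c^2 R).
import math

G = 6.674e-11        # Newton's gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.989e30     # solar mass [kg]
R_SUN = 6.957e8      # solar radius [m]
C = 2.998e8          # speed of light [m/s]
RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0

alpha_newton = 2 * G * M_SUN / (C**2 * R_SUN) * RAD_TO_ARCSEC
alpha_gr = 4 * G * M_SUN / (C**2 * R_SUN) * RAD_TO_ARCSEC

print(f"Newtonian deflection: {alpha_newton:.3f} arcsec")  # ~0.875
print(f"GR deflection:        {alpha_gr:.3f} arcsec")      # ~1.75
```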
Gravitational lensing can occur in a variety of configurations: individual stars, binary systems, massive compact objects, galaxies, clusters of galaxies, and the filamentary structure of the cosmic web. Depending on the positions of the source, the lens and the observer, and on the mass and shape of the lens, gravitational lensing occurs in three different regimes: strong lensing, weak lensing and microlensing. A gravitational lensing system involves four elements: the source, the lens, the observer and the image(s)/brightness variation. The source is the object being observed; it can be a quasar, a galaxy, the cosmic web, the cosmic microwave background (CMB), a

17 "...It would be, according to my view, in the interest of science to begin with these experiments as soon as possible..." - Rudi W. Mandl in a letter to Albert Einstein, 23 April 1936.
18 "...the probability that nebulae which act as gravitational lenses will be found becomes practically a certainty" [32].

star cluster or a star. The lens can be any compact concentration of mass or energy; it deflects the light of the source by an amount proportional to its mass or energy. The lens lies between the observer and the source, either along the line of sight or close to it. The observer is the one observing the source (usually the detector of a telescope). Depending on the lens, the configuration of source, lens and observer, and the path of the light, the observer sees either brightness variations or single or multiple images of the source. In the next sections we give brief details of each of the gravitational lensing regimes.

Strong Lensing

Gravitational lensing by extended sources such as galaxies is known as strong lensing. Fig. 1.3 shows an artistic impression of the geometry of strong lensing by an extended object such as a galaxy: the extended object that distorts the light is marked as the lens, and the background extended object is the source. In such a configuration we see either multiple images of the background source, or luminous rings (Einstein rings) and luminous arcs; fig. 1.5 illustrates these rings and arcs. A single lens produces two unresolved images; a binary lens produces three or five unresolved images, depending on the location of the source relative to the lens. A stronger lensing effect produces greater distortions, from which we can pinpoint the concentrations of mass responsible for producing them.

Figure 1.3: Artistic impression showing the geometry of strong lensing. The figure is not to scale, and the distortion shown is greatly exaggerated relative to real astronomical systems.

Weak Lensing

Gravitational lensing by the large-scale structure of the universe across cosmic distances is known as weak lensing. In this regime the lensing effect is so small that it can only be studied statistically, by averaging over a large number of galaxies. This allows us to measure the mass distribution of galaxy clusters as well as the properties of the cosmic web.

Microlensing

Gravitational lensing by compact sources in which the angular separation between the images is of the order of micro-arcseconds is known as microlensing19 (illustrated in fig. 1.4). The separation between the two images is of micro-arcsecond order for a solar-mass lens at cosmological distances (hence the name microlensing), and of milli-arcsecond order for galactic stars. A microlensing event is a spacetime geometric effect generated by an object (the lens) passing near the line of sight between the observer (our telescope) and a background source (a star). The passing object deforms spacetime, causing the light rays coming from the background star to be deflected towards it. Instead of seeing multiple images of the background source, we see an apparent increase in its brightness.

Figure 1.4: Artistic impression showing the geometry of microlensing. The figure is not to scale (the "lens" star is representational; it can be any compact object) and the distortion shown is greatly exaggerated relative to real astronomical systems.

In fig. 1.4 the light from the source is deflected around a point-like object, another star, which we call the lens; this deflection causes the source to appear in a different location. Fig. 1.5 shows an artistic impression of the apparent increase in brightness as a compact object passes between the source and the observer along the line of sight. The separation between the images produced by the microlensing effect is so small that we cannot resolve them by imaging; what we see instead is an increase in the brightness of the source star as the lens moves between observer and source. In the accompanying light-curve plot, time is on the x-axis and the brightness of the microlensing event is on the y-axis: the brightness rises suddenly and then returns to normal. This effect allows us to find dark objects.

19 Bohdan Paczynski suggested the name microlensing to describe gravitational lensing that can be detected by measuring the intensity variation of a macro-image made up of any number of unresolved micro-images.

Figure 1.5: Artistic impression of the magnification of a source due to gravitational microlensing. As the lens (L) and source (S) align along the line of sight of the observer, the observer notices increased brightness if the source is point-like, or rings and arcs (distorted images) in the case of an extended source such as a galaxy or quasar. The arrow indicates the direction of motion of the lens.

Consider the simple configuration shown in fig. 1.4, in which a single point-like compact object acts as a lens on a single point-like source. The magnification A of the source luminosity at a given time t is given by,

A(t) = (u(t)² + 2) / ( u(t) √(u(t)² + 4) )    (1.3)

where u(t) is the impact parameter, the distance of the lens star from the line of sight to the source star (marked with a dashed line), expressed in units of the Einstein radius. The impact parameter u(t), and hence the magnification of the source, changes with time.

If the distance between observer and source is D_OS, the distance between observer and lens is D_OL, the distance between lens and source is D_LS, and the lens mass is M, then when the source is directly behind the lens the Einstein ring radius R_E is given by,

R_E = √( (4GM/c²) · (D_LS D_OL / D_OS) )    (1.4)

And the Einstein-radius crossing time scale t_E of a lensing event is given by,

t_E = R_E / v⊥    (1.5)

where v⊥ is the transverse velocity of the lens with respect to the line of sight to the source. As the lens moves relative to the line of sight, the magnification changes with time. For a lens moving at a constant relative transverse velocity v⊥, reaching its minimum distance u₀ (the impact parameter) from the undeflected line of sight at time t₀, u(t) is given by,

u(t) = √( u₀² + ((t − t₀)/t_E)² )    (1.6)
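The point-lens model of eqs. 1.3 and 1.6 is straightforward to evaluate numerically. The sketch below (illustrative, with invented event parameters, not the thesis pipeline) computes the magnification light curve of a single event:

```python
# Illustrative point-source point-lens microlensing model
# (eqs. 1.3 and 1.6). Event parameters below are invented.
import math

def impact_parameter(t, u0, t0, tE):
    """u(t) = sqrt(u0^2 + ((t - t0)/tE)^2), in Einstein radii."""
    return math.sqrt(u0**2 + ((t - t0) / tE)**2)

def magnification(u):
    """A(u) = (u^2 + 2) / (u * sqrt(u^2 + 4))."""
    return (u**2 + 2) / (u * math.sqrt(u**2 + 4))

u0, t0, tE = 0.3, 50.0, 20.0   # made-up event parameters (days)
for t in range(0, 101, 10):
    u = impact_parameter(t, u0, t0, tE)
    print(f"t = {t:3d} d  u = {u:5.2f}  A = {magnification(u):5.2f}")
```

Note the characteristic symmetric rise and fall around t₀, with the magnification tending to 1 far from the peak.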

Microlensing events are rare, short-lived and unique: a given star may be microlensed at most once in a human lifetime. The magnification is a function of time and depends only on (u₀, t₀, t_E). If we assume no binary stars or planetary systems around the stars involved, microlensing events are achromatic, since lensing is independent of wavelength: the source's colour should not change during the event.20 This means the ratio of the flux change in different filter bands should be constant in time:

ΔF_g(t) / ΔF_r(t) = constant    (1.7)

Microlensing events can be detected if we have a sufficient time baseline and high-cadence monitoring. Galactic bulge regions are particularly interesting: owing to the high density of both background and foreground stars along the line of sight, microlensing is guaranteed to occur there. Compared to resolved, bright stars, however, crowded regions such as the bulge of a galaxy pose a serious challenge in the analysis of microlensing events because of blending. Blending also occurs due to atmospheric seeing, which merges several stars together, of which typically only one is lensed. These effects limit the determination of the microlensing parameters and the detection efficiency. Microlensing events that exhibit a detectable photometric signature provide constraints on the lens mass, and most of the surveys used for discovering microlensing events have been photometric monitoring programs. Note also that all stars at a given distance have the same probability of being lensed, so the sample of lensed stars should be representative of the monitored population at that distance, particularly with respect to the observed colour and magnitude distributions. Since the probability of a microlensing event is low, it is necessary to monitor a large number of stars for a long period of time.
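The achromaticity condition of eq. 1.7 can be sketched with a toy simulation (illustrative only; baseline fluxes and magnifications are invented): because lensing magnifies all wavelengths equally, the ratio of flux changes in two bands stays constant in time.

```python
# Illustrative check of eq. 1.7: with a common, wavelength-independent
# magnification A(t), the ratio of flux changes in two bands is constant.
F0_g, F0_r = 120.0, 200.0                             # baseline fluxes (arbitrary units)
magnifications = [1.0, 1.2, 2.5, 3.4, 2.5, 1.2, 1.0]  # toy A(t) samples

ratios = []
for A in magnifications:
    if A == 1.0:
        continue  # no flux change at baseline; ratio undefined there
    dF_g = F0_g * (A - 1)   # flux change in g band
    dF_r = F0_r * (A - 1)   # flux change in r band
    ratios.append(dF_g / dF_r)

print(ratios)  # every entry is ~F0_g / F0_r = 0.6
```

A colour change during an event (a non-constant ratio) would therefore flag blending or an intrinsically variable source rather than clean microlensing.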

Applications of Gravitational Lensing

Gravitational lensing was recently used to make a portrait of a black hole located at the center of the galaxy known as M87, providing visual evidence for the glowing disk of gas, the radiation ring [61]. With the help of gravitational lensing we will learn more about the mass function of stars and brown dwarfs, binary systems and their mass ratios, detect planets and planetary-mass objects, study stellar evolution and the distribution of dark matter, and it has also been suggested

as a tool for searching for intelligent life. This will also allow us to test Einstein's general theory of relativity ever more rigorously.

Microlensing has become one of the most powerful tools for advancing several areas of astrophysics. Using microlensing we can learn about massive objects surrounding stars, in the galaxy, and in clusters of galaxies. Some probable applications of gravitational lensing in different regimes are listed below.

1. Understanding the distribution of dark matter (MACHOs). The nature of dark matter still remains unknown nearly 8 decades after it was first proposed: MACHOs do not produce any EM radiation but do show gravitational interaction. Paczynski was the first to suggest using microlensing to search for them; he also calculated the probability of finding microlensing events along the line of sight to the Magellanic Clouds to be $10^{-2}$.
2. Discovery of [40, 62, 63]
3. Measurement of the Hubble constant [37]: by measuring the path length we can measure distances, which tells us the rate of expansion of the universe.
4. Search for extraterrestrial intelligence [64, 65, 66, 67]
5. Microlensing by cosmic strings [68]
6. Discovery of double stars and eclipsing binaries [40]
7. Population studies of local isolated neutron stars and black holes
8. Discovery and mass measurement of nearby dwarf stars, brown dwarfs [69] and planets
9. Interstellar communication [70, 71] / intergalactic communication tool [72]
10. Imaging exoplanets and extended sources with the solar gravitational lens (SGL) [73]
11. Detection of non-luminous matter in astrophysics in the form of stellar- and planetary-mass objects [74]
12. Detection of primordial black holes [75]

Footnote 20: Chromaticity may also arise due to differential amplification for a limb-darkened extended source and blending, as suggested in [60].

1.4 Our Testbed - The Andromeda Galaxy

“Like a candle seen through a horn”

—Simon Marius, 1612[76]

Under a dark and clear moonless night sky we can see a fuzzy, nebulous-looking object in the constellation of Andromeda, which is in fact a collection of stars known as the Andromeda Galaxy. As it is visible to the naked eye, up until the 1920s it was believed that this nebulous object was part of the Milky Way. Several spectroscopic studies (revealing stellar signatures) [77] and radial velocity studies [78] had indicated it to be extragalactic. These studies led to great debates [79] and were instrumental in revolutionising our understanding of the shape and extent of the cosmos. Edwin Hubble's [80] observations with the Mount Wilson Observatory telescopes proved beyond doubt that the Andromeda Galaxy is extragalactic, outside the Milky Way [81, 82]. Later, Walter Baade carried out observational studies and was able to define different stellar populations in the Andromeda Galaxy. This stellar population categorization continues to be used to this day, as it forms an essential feature used in studies of stellar and galactic evolution.

In the Messier catalogue the Andromeda Galaxy is designated as object M31. Under the Hubble galaxy classification scheme, based on galactic morphology, it is classified as "Sb" (intermediate spiral) [81, 83], and as "SA(s)b" according to de Vaucouleurs' Third Reference Catalogue of Bright Galaxies [84]; modern IR observations have revealed a bar-like structure with an estimated length of 4-5 kpc in its center [85]. The Andromeda Galaxy is the nearest major spiral galaxy, located at around r = 770 kpc (∼ 0.77 ± 0.04 Megaparsec, or 2.5 million light years; in terms of redshift, z = −0.001 (footnote 21)). Its apparent size is 3°.1 × 1°.25 and it has an absolute magnitude of −21.1 mag. The distance to Andromeda was determined using the period-luminosity relation of Cepheids. It has an estimated inclination of 77° relative to Earth, a 13° angle between the plane of the galaxy and the line of sight, and lies 22° away from the Galactic plane.
Even at 770 kpc it is about 2.5° wide on the sky (roughly 5 times the Moon's diameter). Like other spiral galaxies, Andromeda has a bulge, a disc and a halo, with different populations of stars distributed in the bulge and disc regions. The disc shows two spiral arms marked by dark dust lanes and young star clusters. M31 has a high dust content belonging to 3 different components: a foreground component related to Milky Way extinction, a mid-plane component related to M31's internal extinction, and a differential component [86]. Later observations carried out by the Hubble Space Telescope (HST) revealed a double central region, indicating a galactic merger. Most galaxies show redshift, meaning they are moving away from the Milky Way, but M31 shows the opposite. So, interestingly, our own Milky Way and the Andromeda galaxy are on a collision course, approaching each other with a velocity of v = 110 km

Footnote 21: Negative sign, as it is moving towards us.

s$^{-1}$ (or 402,000 km/h), and in approximately $t_c = r/v \approx 6.3$ billion years the two may merge to form a giant elliptical galaxy [87].
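The quoted timescale follows from the simple estimate t_c = r/v. The sketch below checks the order of magnitude; the unit-conversion constants and variable names are our own, and this straight-line estimate ignores gravitational acceleration:

```python
# Back-of-the-envelope merger timescale t_c = r / v for the Milky Way and M31,
# using r ~ 770 kpc and v ~ 110 km/s from the text.
KPC_IN_KM = 3.0857e16    # kilometres per kiloparsec
SEC_PER_YR = 3.1557e7    # seconds per Julian year

r_km = 770 * KPC_IN_KM   # separation in km
v_kms = 110.0            # approach speed in km/s

t_c_yr = r_km / v_kms / SEC_PER_YR   # a few billion years
```

The result comes out near 7 billion years, the same order as the ~6.3 billion years quoted; detailed dynamical models refine this with the galaxies' mutual acceleration.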

Figure 1.6: Picture of Andromeda Galaxy (M31), Credit: ZTF/D. Goldstein and R. Hurt (Caltech) [88]

A composite image of the Andromeda Galaxy, made by combining three bands of visible light, is shown in fig.1.6. The image covers 2.9 square degrees, one-sixteenth of ZTF's full field of view. Due to its naked-eye visibility and proximity, M31's role as an important test bed was realized very early, and several studies of M31 have provided us with important insights in multiple areas of astrophysics (footnote 22). Our vantage point within the Milky Way galaxy

greatly limits studying our galaxy in its entirety, as dust clouds hide most of its structure. In such a scenario M31 proves to be an excellent choice, and as such it has been responsible for breakthroughs in our understanding of the evolution of stars, the rotation of galaxies and the scale of distances in the universe. What also makes it important is its proximity and morphological similarity to the Milky Way galaxy. Its proximity allows us to resolve stars as well as to perform spectroscopic follow-ups. One limitation to note, however, is that M31 is close to being an edge-on galaxy.

M31 and the Milky Way may be similar in size and shape, and both have supermassive black holes (SMBH) in their cores (footnote 23). If both share this similarity, then studying the Andromeda Galaxy in its entirety allows us to understand our own Milky Way galaxy. M31 offers many excellent advantages. Barring the high stellar surface density and crowding, its proximity allows us to resolve faint stars; this also helps avoid confusion between foreground and background stars, as there is a general understanding of the distances involved from studies of Cepheids. M31 hosts different stellar populations in its spiral arms, bulge and halo, and M31's halo can be studied globally.

The halo region is home to very old stars, while the disk contains a mixture of stars of different age groups; the disk is also home to interstellar matter, so it is a region of active star formation. It is an ideal location for the hunt for varied kinds of variable stars and transient events, and hence is a kind of observational testbed for stellar physics, galaxy formation and evolution, as well as cosmology. This enables us to study the various physical processes that govern stellar and galactic evolution in their full galactic context, as the entire galaxy including its halo is visible for astrophysical and cosmological studies. We can probe the dark halo not only of the Milky Way along different lines of sight (LOS) but also of M31. Its high inclination provides a strong gradient in the spatial distribution of microlensing events. Microlensing studies of M31 will complement studies of the Milky Way halo using the Large and Small Magellanic Clouds. M31 has different metallicity and star-forming regions. M31 has a dense stellar field for microlensing studies and is rich in variable stars, and the capability to directly observe their variable phenomena makes it an interesting target for testing various astrophysical stellar theories. Since both M31's halo and our own galaxy's halo can be probed, we might be able to study galactic dark matter composed of compact objects, such as black holes, faint stars, brown dwarfs and Jupiter-like planets, known as MACHOs ("MAssive Compact Halo Objects") (footnote 24), and its distribution. MACHOs emit little or no radiation; as they are not luminous, they are hard to detect. MACHOs may explain the apparent presence of dark matter in galaxy halos. Various time domain studies have been conducted, and more are being planned, to study the Andromeda Galaxy.

Footnote 22: Hubble's study of Cepheid variable stars in the Andromeda Galaxy helped determine the distance to the Andromeda Galaxy and its extragalactic nature, as well as the rate of expansion of the universe (Hubble constant). Studies conducted by Baade [89] helped us understand the different populations of stars (young Population I stars, old Population II stars); these populations enabled the further identification of Population I Cepheids (classical Cepheids) and Population II Cepheids, which helped refine the Cepheid period-luminosity relationships, improving distance measurements. Rubin and Ford's studies [90] of the rotation curves of the Andromeda Galaxy indicated the presence of "dark matter".
Footnote 23: The SMBH in our Milky Way is known as Sgr A* and the one at the center of M31 is known as M31* [91].
Footnote 24: The term MACHO was coined by astrophysicist Kim Griest.

Figure 1.7: Major variability surveys of M31, reproduced from [50]. The figure shows the different surveys in different colors; the shape of each symbol indicates the type of variables.

Time domain studies and surveys specifically directed towards M31 are listed in Table 1.2 and shown in fig.1.7. Most of these surveys were carried out to improve the distance determination to our nearest neighbouring spiral galaxy using Cepheids, and to detect eclipsing binaries and microlensing events. They used different techniques, among them point spread function (PSF) profile fitting [92] using the DoPHOT software [93], and pixel lensing based on difference imaging analysis (DIA) techniques [94].

Table 1.2: Sky surveys towards M31. Some of these are specific searches for microlensing events, while other surveys have different scientific objectives. The surveys include the Wendelstein Calar Alto Pixellensing Project (WeCAPP), the Pixel-lensing Observations with the Isaac Newton Telescope-Andromeda Galaxy Amplified Pixels Experiment (POINT-AGAPE), the Pixel Observations of M31 with MEgacam (POMME), the Optical Gravitational Lensing Experiment (OGLE), the Pan-STARRS 1 (PS1) survey of Andromeda (PAndromeda), the Microlensing Exploration of the Galaxy and Andromeda (MEGA), the Nainital Microlensing Survey (NMS), MACHO, the Pixel Lensing Andromeda survey (PLAN), the Expérience de Recherche d'Objets Sombres (EROS), AGAPE, and Microlensing Observations in Astrophysics (MOA) [45-59].

Chapter 2: Studying Stars

In this chapter we begin with an understanding of what a star is, the nature of radiation, the magnitude scale, the CCD as a photon detector and the various sources of noise; this is followed by stellar photometry, where we describe the aperture photometry technique and the various photometric filter systems, and provide a brief description of photometric light curves of transients and variables.

A star is an astrophysical body formed from the collapse of a molecular cloud after it exceeds a critical mass known as the Jeans mass; over the course of millions of years this collapsing cloud reaches a state of equilibrium between its own gravity and internal radiation pressure. Stars appear spherical because gravity is a spherically symmetric force field. The luminosity of a star comes from the energy released by thermonuclear fusion reactions at its core. Stars radiate energy at different wavelengths of the EM spectrum, and the luminosity output of a star varies as its interior undergoes thermonuclear reactions. As stars release energy they undergo changes in size and composition, and depending on the mass of the star its lifetime can range from millions to billions of years, during which the star slowly becomes brighter and hotter before running out of hydrogen. At their end stages, depending on their mass and whether they go through cataclysmic or supernova explosions, stars may become white dwarfs or black holes.

Most of what we know about stars is based on the study of different aspects and effects of light, or electromagnetic (EM) radiation. Although seemingly contradictory, EM radiation has a dual nature, i.e., it has characteristics of both waves and particles (packets of energy called photons). An EM wave of wavelength λ and frequency ν = c/λ can only have specific allowed energy values that are integral multiples of a minimum wave energy; this minimum energy is known as a quantum of energy, or photon.
Each photon has a wavelength, frequency and energy associated with it, and these properties define the interaction of radiation with matter. The relation between the wavelength λ, frequency ν and energy E of a photon is given by eq.2.1

$E = h\nu = \frac{hc}{\lambda}$ (2.1)

where h is Planck's constant.
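For a sense of scale, Eq. (2.1) can be evaluated for a visible-light photon. The CODATA constants and the 550 nm wavelength below are illustrative choices, not values from the text:

```python
# Energy of a single 550 nm (green) photon via E = h*c/lambda, Eq. (2.1).
h = 6.62607015e-34      # Planck constant, J s
c = 2.99792458e8        # speed of light, m/s
wavelength = 550e-9     # metres, an illustrative visible-light wavelength

E = h * c / wavelength  # roughly 3.6e-19 J (about 2.25 eV)
```

Such tiny per-photon energies are why astronomical detectors must accumulate many photons before a source stands out from the noise.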


Most stars emit light isotropically and are, to a good approximation, blackbodies in thermodynamic equilibrium. For a blackbody, the total energy $E_{\mathrm{Total}}$ radiated per unit surface area per unit time at absolute temperature T is given by eq.2.2,

$E_{\mathrm{Total}} = \sigma T^4$ (2.2)

where σ is the constant of proportionality known as the Stefan-Boltzmann constant.

Stars differ in their brightness: some stars are bright and some are faint. This can be due to 3 reasons: (i) their luminosity output, (ii) their distance from us, or (iii) the interstellar medium absorbing or re-emitting the light. We need precise and quantitative definitions to describe the strength of radiation and how it varies with the distance between source and observer. To understand how bright the stars are, we need to understand both the apparent brightness (how bright a star appears to us as seen from Earth) and the absolute (or intrinsic) brightness of stars (how bright the star is in reality).

The absolute (or intrinsic) brightness of a star is known as its luminosity. It is the total energy radiated isotropically per second, which is essentially the power output of the star. Since stars are not perfect blackbodies, we define the effective temperature Te as the temperature of a blackbody that has the same surface flux as the star. For a spherical star of radius R radiating isotropically, the luminosity L distributed evenly over a spherical surface of area 4πR² is given by the Stefan-Boltzmann eq.2.3

$L = 4\pi R^2 \sigma T_e^4$ (2.3)

It is important to realize that the energy a star produces each second (its luminosity L) is spread out over the surface of a sphere whose radius is the distance to the star.
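As a sanity check of Eq. (2.3), plugging in the nominal solar radius and effective temperature (standard reference values, assumed here rather than taken from the text) should recover roughly the Sun's luminosity of about 3.8 × 10^26 W:

```python
import math

# Eq. (2.3) evaluated with nominal solar values.
SIGMA = 5.670374e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
R_sun = 6.957e8       # nominal solar radius, m
T_eff = 5772.0        # nominal solar effective temperature, K

L = 4 * math.pi * R_sun**2 * SIGMA * T_eff**4   # roughly 3.8e26 W
```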

To understand apparent brightness, let us consider the concept of flux, which is what the eye measures. Assuming the light is not absorbed before it reaches a given unit area, the flux (or radiant flux) F is the total amount of energy at all wavelengths crossing a unit area (collecting surface) oriented perpendicular to the direction of the light per unit time. The flux is thus the total energy from a star that hits each square meter of a detector aimed at the star per second, and is measured in W/m².

Consider a star with luminosity L, surrounded by a spherical shell of radius r. The flux F measured at distance r, known as the inverse square law for light, is given by eq. (2.4),

$F(r) = \frac{L}{4\pi r^2}$ (2.4)
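Applying Eq. (2.4) at the Earth-Sun distance gives a number close to the measured solar constant of about 1361 W/m². The nominal solar luminosity and the value of 1 AU below are standard constants assumed for illustration:

```python
import math

# Inverse square law, Eq. (2.4), at the Earth-Sun distance.
L_sun = 3.828e26     # nominal solar luminosity, W
r_au = 1.496e11      # 1 astronomical unit, m

F = L_sun / (4 * math.pi * r_au**2)   # close to the ~1361 W/m^2 solar constant
```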

2.1 Magnitude System

For most of history, humans have used their unaided eyes for observations of the night sky. The Greek astronomer Hipparchus of Nicea (∼ 160-127 BCE), while preparing his catalogue of stars, developed a ranking system to denote how bright or dim a star appears to the naked eye. His ranking was such that the brightest stars were given apparent magnitude m = 1 and the dimmest stars, barely visible to the naked eye under ideal observing conditions, were set to m = 6. The human eye is approximately logarithmic in its response to brightness, so estimates of stellar brightness are given in terms of an inverse logarithmic scale known as magnitudes. With the advent of the telescope one could see even fainter stars, and it became difficult to place them into just 6 magnitude classes. Following developments in detector technologies and photographic techniques, more objective methods of measurement were employed. In 1856, N. Pogson proposed the method that is in use today.

2.1.1 Apparent Magnitude - m

The apparent magnitude (or apparent brightness) of an object is a measure of how bright the object appears to an observer or a detector. It reflects the amount of energy from an astronomical object that reaches a unit area of a detector each second. Magnitudes are dimensionless quantities, related to ratios of fluxes. In mathematical terms, for two stars with apparent magnitudes m1 and m2 and fluxes F1 and F2, the relation is given by,

$m_2 - m_1 = 2.5\log_{10}\left(\frac{F_1}{F_2}\right)$ (2.5)
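A quick numerical reading of Eq. (2.5): a flux ratio of exactly 100 corresponds to exactly 5 magnitudes, which is why Hipparchus' m = 1 to m = 6 range spans roughly a factor of 100 in brightness. The function name below is our own:

```python
import math

def delta_mag(F1, F2):
    """Magnitude difference m2 - m1 for fluxes F1 and F2, Eq. (2.5)."""
    return 2.5 * math.log10(F1 / F2)

# Star 1 is 100 times brighter than star 2, so star 2's magnitude
# is larger (fainter) by exactly 5.
dm = delta_mag(100.0, 1.0)
```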

2.1.2 Absolute Magnitude - M

Since astronomers needed a way to compare the intrinsic, or absolute, brightness of celestial objects, they defined the absolute magnitude as the apparent magnitude a star would have if located at a distance of 10 parsecs. A star at 10 parsecs has a parallax of 0.1" (100 milliarcseconds) (footnote 1). From the above eq.2.5,

$\frac{F_{tp}}{F} = 100^{(m-M)/5} = \left(\frac{d}{10\,\mathrm{pc}}\right)^2$ (2.6)

Footnote 1: Parsec comes from PARallax of one SECond of arc; 1 parsec is defined as the distance at which the mean radius of the Earth's orbit subtends an angle of one second of arc. The British astronomer Herbert Hall Turner, FRS, is credited with coining the word parsec. 1 parsec is about 3.26 light years.

where $F_{tp}$ is the flux the star would have at a distance of 10 pc, and d is the star's distance in parsecs.

Rewriting the eq.,(2.6) for d, measured in parsecs, allows us to measure the distance to a star,

$\mu = m - M = -5 + 5\log_{10}(d) = 5\log_{10}\left(\frac{d}{10\,\mathrm{pc}}\right)$ (2.7)

or

$\mu = m - M = -5\log_{10}(p) - 5$ (2.8)

where μ is the distance modulus of the star and p is the annual parallax in arcseconds.
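Evaluating Eq. (2.7) for the M31 distance quoted in Section 1.4 (d ≈ 770 kpc) gives the distance modulus commonly adopted for Andromeda, μ ≈ 24.4 mag:

```python
import math

def distance_modulus(d_pc):
    """Distance modulus mu = m - M for a distance d in parsecs, Eq. (2.7)."""
    return 5 * math.log10(d_pc / 10.0)

# M31 at ~770 kpc = 770,000 pc.
mu_m31 = distance_modulus(770e3)   # about 24.4 mag
```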

2.1.3 Bolometric Magnitude

The magnitudes described above refer to particular wavelengths or bands. One can also define a magnitude scale over the entire wavelength range of the EM spectrum: the bolometric magnitude. Mathematically, the relation between the apparent bolometric magnitude and the absolute bolometric magnitude (that of the same star at a distance of 10 pc) is given by,

$m_{bol} = M_{bol} + 5\log_{10}\left(\frac{d}{10\,\mathrm{pc}}\right)$ (2.9)

The luminosity of a star is all the radiation emitted in all directions per second, whereas flux is the fraction of the luminosity that crosses a unit area. With this understanding we can start by measuring the flux F and apparent magnitude m of a star, and then, using the distance d measured with stellar parallax, determine the star's intrinsic properties. By measuring the distance to stars by means of stellar parallax, we will know whether faint stars are big stars lying at a great distance from us or weak stars nearby. The star's actual luminosity L is determined using the inverse-square law for light. By measuring the luminosity and the temperature of a star, and using eq.2.3, we can calculate the radius of the star. We also know that the luminosity of main sequence stars is roughly proportional to the fourth power of their mass, i.e., L ∝ M⁴; this allows us to estimate the mass of the star.
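The chain of reasoning above (parallax to distance, apparent to absolute magnitude via Eq. (2.8), magnitude to luminosity, Eq. (2.3) for the radius, and L ∝ M⁴ for the mass) can be sketched end to end. The solar reference values and the function are our own illustration, and the mass step is only the rough main-sequence approximation quoted in the text:

```python
import math

SIGMA = 5.670374e-8     # Stefan-Boltzmann constant, W m^-2 K^-4
L_SUN = 3.828e26        # nominal solar luminosity, W
M_BOL_SUN = 4.74        # solar absolute bolometric magnitude

def star_properties(parallax_arcsec, m_bol, T_eff):
    """Derive distance, absolute magnitude, luminosity, radius and rough mass."""
    d_pc = 1.0 / parallax_arcsec                          # distance from parallax
    M_bol = m_bol + 5 - 5 * math.log10(d_pc)              # Eq. (2.8) rearranged
    L = L_SUN * 10 ** (-0.4 * (M_bol - M_BOL_SUN))        # luminosity from M_bol
    R = math.sqrt(L / (4 * math.pi * SIGMA * T_eff**4))   # radius from Eq. (2.3)
    mass = (L / L_SUN) ** 0.25                            # L ∝ M^4, solar masses
    return d_pc, M_bol, L, R, mass

# A Sun-like star placed at 10 pc (parallax 0.1"): there m_bol equals M_bol,
# so the derived values should match the solar inputs.
d, M, L, R, mass = star_properties(0.1, 4.74, 5772.0)
```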

2.2 Photon Detection using Charge Coupled Devices (CCDs)

CCDs are highly sensitive solid state photon detectors. After their invention in the 1960s [95], CCDs have been used extensively for imaging purposes in astronomy since the late 1970s, mainly because of important characteristics such as their stability and linearity, their high quantum efficiency, and their ease of use compared to photographic plates. They can detect a broad range of wavelengths and can reach greater depths with short exposure times. The most important characteristics of an astronomical image are the field of view (FoV) and the pixel size; the FoV determines how much sky the CCD can capture at once.

Figure 2.1: (a) Artistic impression of the working of a CCD (b) ZTF CCD camera [96]

The CCD is divided into a large number of small light-sensitive silicon diodes, known as pixels, arranged in a rectangular 2D array. With CCDs we measure flux, the rate at which photons arrive, but we cannot measure it directly; instead we count how many photons accumulate during an interval called the integration or exposure time. The image is thus formed from an accumulation over time, not instantaneously. A photon falling on a pixel is converted into one (or more) electrons, in proportion to the light intensity falling on that pixel, and the electric charge accumulates there; until the exposure is complete the free electrons keep accumulating at each pixel. The charges are held in place by voltages applied by the CCD control electronics. Through a control circuit the CCD is clocked out, and the number of electrons in each pixel is read. The last capacitor in the array dumps its charge into a charge amplifier, which converts the charge into a voltage. By repeating this process, the controlling circuit converts the entire contents of the array to a sequence of voltages. In a digital device, these voltages are then sampled, digitized, and usually stored in memory. CCDs can only count up to a certain maximum level; if the number of counts in a pixel exceeds this value, that pixel becomes saturated.

2.2.1 Key parameters of CCDs

1. Quantum Efficiency (QE) is the percentage of incident photons that are detected by the detector. QE varies with wavelength. Our eyes have a QE of about 20%, whereas CCDs can have a QE of around 90%. A QE of 100% means one count (or electron) per photon.

2. Linearity is the ability to respond linearly to incident light, i.e., the number of electrons is directly proportional to the number of photons. This is an essential parameter, as it lets us avoid additional steps to determine the true intensity of the objects in an image. Assuming 100% QE, if 100 photons fall on the CCD and the CCD detects all of them, they are converted to 100 electrons. The linearity of CCDs only holds over a certain range of signals: CCDs tend to saturate on bright stars, where the response is reduced. Usually stars brighter than about 12 mag may be saturated, which happens when a pixel holds more than a certain number of electrons.

3. Dynamic Range is the span between the brightest and faintest sources that a detector can accurately capture in the same image; in a CCD it is set by the minimum and maximum number of electrons that can be stored in a pixel. If the maximum number of electrons is exceeded, the pixel becomes saturated; for most CCDs this is around 150,000 electrons. The minimum is not 1 electron but rather 2 to 4 electrons per pixel, due to the electronic noise associated with the readout (the conversion of the electrons in each pixel to a voltage) as well as thermal noise (known as dark current). Readout noise thus determines the dynamic range of the CCD.

4. Wavelength Range - CCDs are sensitive over a wide wavelength range, from soft X-rays through the visible to the near infrared.

Astronomical data taken with CCDs must undergo several calibration (or data reduction) steps before they are suitable for astronomical analysis. These pre-processing steps reduce or remove random noise, thermal noise (dark current) and other systematic defects (dead pixels, hot pixels, etc.), as well as cosmic rays, all of which may alter pixel values in the CCD. This is done by capturing dark and flat frames, which helps convert the raw CCD data into useful images for further analysis.
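A minimal sketch of the dark and flat calibration just described, using invented toy frames; real pipelines first combine many bias, dark and flat exposures into master frames:

```python
import numpy as np

def calibrate(raw, master_dark, master_flat):
    """Dark-subtract and flat-field a raw science frame."""
    flat_norm = master_flat / np.median(master_flat)   # unit-median flat field
    return (raw - master_dark) / flat_norm

# Toy 4x4 frames: a uniform 100-count sky, a 10-count dark level, and one
# pixel that is 10% less sensitive than the rest.
dark = np.full((4, 4), 10.0)
flat = np.full((4, 4), 1.0)
flat[0, 0] = 0.9                       # the less-sensitive pixel
raw = dark + 100.0 * flat              # what the CCD would record
sci = calibrate(raw, dark, flat)       # uniform ~100 counts everywhere
```

The calibrated frame recovers the flat 100-count sky: the dark level is removed and the insensitive pixel is corrected back up by the flat field.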

2.3 Errors and Noise

Every astronomical measurement has some noise and measurement uncertainty, and this limits the precision of any measurement made. It is therefore important to be able to estimate the noise in the observations and the measurement uncertainties. Let us take a look at the various sources of noise.

2.3.1 Dark Current

Thermal noise in the camera generates a dark current even in the absence of any light source, when the CCD is in total darkness. This requires cooling of the camera, but too much cooling can reduce the sensitivity of the CCD, so the temperature must be kept constant to obtain consistent data.

2.3.2 Photon Noise

In digital sensors the photoelectric effect is used to convert photons into electrons. The photons arrive randomly, so to collect enough photons to make a good image we must integrate the flux over time (from seconds to minutes); this is called an exposure. Note that integrating over some duration can obscure real variations in the flux of photons from the source. The uncertainty associated with measuring the discrete photons incident on an image sensor over a time interval constitutes the dominant source of image noise, known as photon noise. It is independent of the other sources of noise. Photon noise follows a random temporal distribution, so photon counting is a Poissonian process: the number of photons N measured by the sensor over a time interval t follows the discrete Poisson probability distribution, and photon noise is therefore also known as Poisson noise. The accuracy of the photon detection is limited by the square root of the number of detected photons, i.e., σ = √N, due to Poisson fluctuations. Photon noise sets the ultimate limit to how much we can learn about faint astronomical objects.
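The σ = √N behaviour is easy to verify numerically by drawing Poisson-distributed photon counts; the photon rate and sample size below are arbitrary choices for illustration:

```python
import numpy as np

# Simulate repeated exposures of a source delivering, on average,
# 10,000 photons per exposure, and compare the scatter with sqrt(N).
rng = np.random.default_rng(42)
mean_photons = 10_000
counts = rng.poisson(mean_photons, size=100_000)

measured_sigma = counts.std()               # empirical scatter
predicted_sigma = np.sqrt(mean_photons)     # Poisson prediction, = 100
snr = counts.mean() / measured_sigma        # signal-to-noise, ~ sqrt(N)
```

The measured scatter matches √N to well under a percent, illustrating why the photometric signal-to-noise ratio of a photon-limited measurement grows only as the square root of the collected counts.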

2.3.3 Readout Noise

The pre-amplifiers can introduce noise into the data while it is read out from the CCD. This sets a limit on the faintest detectable signal: if a signal is weaker than the readout noise, it is indistinguishable from the noise.

2.3.4 Pixel Saturation

Each pixel can store a limited number of electrons, after which the pixel becomes saturated and the charge overflows into neighbouring pixels.

2.3.5 Cosmic Rays

High energy particles can also produce bright dots, generating unwanted electrons in the CCD and thus contributing to the noise. Cosmic rays are usually easy to recognize, because they are much sharper than stars (affecting just a couple of pixels). To remove them we can take several images of the same field of view and take the median of the images.

2.3.6 Sky Background

The sky background is an important noise source affecting astronomical observations. It limits the ability to detect faint stars in long CCD exposures and degrades the signal-to-noise ratio. Various sources contribute to the sky background: light pollution, airglow, zodiacal light, and the sky brightness due to other stars. The sky brightness is expressed in magnitudes per square arcsecond for a given bandwidth (U, B, V, R and I standards). Sky background estimation is a difficult task; as we shall see later, for stellar photometry we will make use of an annulus region around the astronomical object to determine the sky background. This sky background correction is applied to the source measurement to obtain an accurate measurement of the magnitude of the source.
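A sketch of the annulus-based sky estimate described above, using invented radii and a toy image; the median over the annulus gives a robust sky level, which is then subtracted from each pixel inside the circular source aperture:

```python
import numpy as np

def sky_subtracted_counts(image, x0, y0, r_ap=3.0, r_in=5.0, r_out=8.0):
    """Sum counts in a circular aperture minus the median annulus sky level."""
    yy, xx = np.indices(image.shape)
    r = np.hypot(xx - x0, yy - y0)                       # distance from source
    sky = np.median(image[(r >= r_in) & (r < r_out)])    # robust sky estimate
    aperture = r <= r_ap
    return image[aperture].sum() - sky * aperture.sum()  # net source counts

# Toy frame: a flat 50-count sky plus a 500-count "star" in one pixel.
img = np.full((21, 21), 50.0)
img[10, 10] += 500.0
net = sky_subtracted_counts(img, 10, 10)   # recovers the 500 source counts
```

The median (rather than the mean) is the usual choice for the annulus because it resists contamination from faint neighbouring stars falling inside the ring.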

2.3.7 Other sources of Noise

The noise sources are not limited to those described above; there are several others, such as bad pixels due to manufacturing defects, saturated pixels due to bright stars, and blended stars due to crowding. Various techniques have been developed to remove these from the measurements.

2.3.8 Crowding

High stellar densities (crowding) in the bulges of galaxies pose a challenge for photometry of images from ground based telescopes. The central and other very dense regions of galaxies are regions where stars are tightly crowded together. Crowding increases the noise in the data in two ways: it becomes difficult to decide whether a given photon comes from one star, two stars or many stars, so individual stars are hard to detect and the star counts become inaccurate; this in turn makes accurate measurement of the luminosity difficult.

2.4 Photometric Filter System

A filter is simply a precisely manufactured piece of colored glass placed between the telescope and the detector. Color filters allow a certain range of wavelengths of light to pass through while absorbing wavelengths above and below the bandpass: the U filter transmits ultraviolet light, the B filter transmits blue light, the V filter passes visible (yellow-green) light, the R filter passes red light, the I filter passes infrared light, and so on. This yields a higher signal-to-noise ratio at the specific wavelengths, improving the detail and contrast of the object being observed. The set of color filters used to make measurements in astronomical observations is called a photometric system. Fig.2.2 shows the filter passbands of the Zwicky Transient Facility (ZTF) photometric system [96]. In astronomy, color is the difference in brightness between two bands. The amount of energy emitted by an astrophysical source in different wavelength bands contains valuable information that helps in understanding the physical processes occurring in the astrophysical object. As astronomical instruments can only measure the intensity of incident light but not its wavelength, filters are used to make measurements in two different bands to determine the color of the astronomical source. How bright a star is through a particular filter is shown in a color-magnitude diagram. Stars differ in colour, and measuring these stellar colors quantitatively involves comparing the yellow (visual) magnitude mV of a star with its magnitude measured through a blue filter, mB. A color index is used with a photometric filter system: after making measurements in each of the filters and taking the difference between two of them, say between the U and B filters or the B and V filters, a color index can be assigned. This color index is helpful in determining the surface temperature of stars.
Hot, blue stars appear brighter through the blue filter, while the opposite is true for cooler, red stars. In the UBV magnitude system the colour bands are centred around 365 nm for U, 440 nm for B and 550 nm for V, and the colour index B − V is given by,

B − V = mB − mV (2.10)

Similarly for U-B,

U − B = mU − mB (2.11)

The temperature of a star can be determined directly by utilizing the colour-colour diagram. Because of the logarithmic magnitude scale, the smaller the colour index the bluer (hotter) the object is, and the larger the colour index the redder (cooler) the object is.
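As a minimal illustration of eqs. (2.10)-(2.11), the colour index is just a difference of apparent magnitudes. The magnitude values below are hypothetical, chosen only to mimic a hot Vega-like star and a cooler Sun-like star:

```python
def color_index(m_blue, m_visual):
    """B - V color index from two apparent magnitudes (eq. 2.10)."""
    return m_blue - m_visual

# Hypothetical measurements:
# a hot, Vega-like star: B ~ 0.03, V ~ 0.03  -> B - V ~ 0.0
# a cooler, Sun-like star: B ~ 5.4, V ~ 4.8  -> B - V ~ 0.6
hot = color_index(0.03, 0.03)
cool = color_index(5.4, 4.8)
print(hot, cool)  # the smaller (bluer) index corresponds to the hotter star
```

Because magnitudes are logarithmic, the difference corresponds to a flux ratio in the two bands, which is why the index tracks surface temperature.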

Figure 2.2: Filter transmission for the ZTF g, r, and i-band filters [97]

It is faster to determine colour indices by means of a colour-magnitude diagram built from the apparent magnitudes measured with different filters. Unlike apparent magnitudes, stellar colours are independent of distance. As explained earlier, the colour-colour diagram is a way to compare the apparent magnitudes of stars at different wavelengths. The colour of a celestial object tells us its temperature. If stars were perfect blackbodies their colour-colour diagram would be a straight line (indicated by dashed lines in such plots), and only the hot stars (A0 to B0)2 appear to be close to blackbodies.

2.5 Stellar Photometry

Stellar photometry is the measurement of the brightness of stars. The primary information extracted from photometry is the astrophysical parameters and, if the measurements are carried out over a period of time, the variability characteristics in both brightness and colour. Hence, the classification and astrophysical characterization of objects is largely based on photometric measurements, which also provide the basic tools for classifying objects as stellar or non-stellar sources.
Photometric detectors, in our case CCDs, record the incoming photons and produce images. These images contain a large set of point-like sources, the stars. We extract the positions of the stars, perform photometry on them, and build photometric catalogues. The photometric method is simply adding up the photon counts in all the pixels that contain the light from a star and subtracting the sky background to get the net signal from that star. Stellar objects are point sources, but due to various effects (atmospheric and instrumental) the photons of a star are scattered over a certain range of pixels [98, 99]. An astronomical image has two components: the sky background, with various other noise sources, and the data from the astronomical objects.

2These designations are based on the spectral types of stars and come from the Harvard spectral classification scheme.

Fstar = Pcount + Sbkg (2.12)

Where Fstar is the total flux of the star per second, Pcount is the photon count of the star and Sbkg is the sky background. Converting flux into magnitude is then simply,

m = C − 2.5 log10(Fstar) (2.13)

Where m is the apparent magnitude, Fstar is in counts/second, and C is a constant. To answer fundamental questions about brightness, one can pick a reference star and then compare other stars against it. Vega (α Lyrae) is used as the standard reference star, and its magnitude is set to zero. This requires observations to be transformed from the instrumental magnitude system of the observer to the standard-star based magnitude system. Different magnitude systems have different zero points.
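The flux-to-magnitude conversion and the role of the zero point can be sketched as follows; the zero-point value of 25 is an arbitrary illustrative choice, and the sign convention follows eq. (2.17) below:

```python
import math

def flux_to_mag(flux_counts_per_s, zero_point):
    """Apparent magnitude from a flux in counts/second (cf. eq. 2.17)."""
    return -2.5 * math.log10(flux_counts_per_s) + zero_point

# By definition of the zero point, a source producing 1 count/s has m = ZP
assert flux_to_mag(1.0, 25.0) == 25.0
# A source 100x brighter is 5 magnitudes brighter (i.e. a smaller m)
print(flux_to_mag(100.0, 25.0))  # -> 20.0
```

Note the inverted scale: larger fluxes give smaller (brighter) magnitudes, which is why the logarithm carries a minus sign.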

2.5.1 Aperture Photometry

How do we measure the flux of a star in an image? As explained earlier, the photometric method is simply adding up the photon counts in all the pixels that contain the light from a star and subtracting the sky background to get the net signal from that star. This essentially requires us to define a circular region on a 2D image and sum the pixel values in that region to get the photon count for the star; this process is called aperture photometry. Within which area should we measure the flux? How much background is added to this flux? How do we correct for the background? These questions are answered in the photometric analysis, which consists of a series of stages, such as determining the aperture size for the flux measurement and the inner and outer radii of the annulus for the background estimation. An aperture is chosen such that the photons falling within it belong to the star being measured, and the result is expressed on a magnitude scale. The sky background is estimated around each star by means of an annulus. It is assumed that the histogram of intensity values in an image containing only sky background is Gaussian. There are various estimators of the background sky; the sky background is a mixture of contributions from unresolved stars and the wings of resolved stars.
Fig. 2.3 (Aperture Photometry Technique) shows an aperture around a star (the inner circle) whose photometry is being performed, together with the annulus region around it. In this simple technique an aperture is selected such that it encompasses all the photons of the star, and the photons falling within that aperture are added together. An annulus is chosen with an inner and outer radius, and the sky background is estimated from it. This sky background is subtracted from each pixel within the aperture, and the resulting photon count is the stellar flux.
Although the technique of aperture photometry is simple, it is computationally very demanding when applied to hundreds of millions of stars due to the number of floating-point operations. Mathematically, aperture photometry is given by,

F = Σij Fij − npix · (sky/pixel) (2.14)

Where F is the total counts in the aperture from the source, Fij are the counts in each pixel in the aperture, and npix is the number of pixels in the aperture. Or, in other words,

DNstar = DNap − DNbkg (2.15)

where DNstar is the count from the star, DNap the counts in the aperture, and DNbkg the counts in the background. The signal-to-noise ratio is defined as,

S/N = DNstar / σ(DNstar) (2.16)
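Equations (2.14)-(2.15) can be sketched with plain NumPy on a synthetic image; the star flux, sky level and PSF width below are arbitrary illustrative values:

```python
import numpy as np

# Synthetic 64x64 image: a flat sky of 100 counts/px plus one 5000-count star
sky_level, star_flux, sigma = 100.0, 5000.0, 2.0
image = np.full((64, 64), sky_level)
yy, xx = np.mgrid[0:64, 0:64]
r2 = (xx - 32.0) ** 2 + (yy - 32.0) ** 2
psf = np.exp(-r2 / (2 * sigma**2))          # Gaussian PSF, width sigma px
image += star_flux * psf / psf.sum()

# Aperture: all pixels within radius 3*sigma of the star
aperture = r2 <= (3 * sigma) ** 2
n_pix = aperture.sum()
dn_ap = image[aperture].sum()               # DN_ap: counts in the aperture
# Sky per pixel estimated from the median in an annulus around the star
annulus = (r2 > (5 * sigma) ** 2) & (r2 <= (8 * sigma) ** 2)
sky_per_pix = np.median(image[annulus])
# Eq. (2.15): DN_star = DN_ap - DN_bkg, with DN_bkg = n_pix * sky/pixel
dn_star = dn_ap - n_pix * sky_per_pix
print(round(float(dn_star)))                # close to the injected 5000 counts
```

The ~1% shortfall relative to the injected flux is the light falling outside the 3σ aperture, which is precisely what an aperture correction would recover.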

The magnitude m, is given by

m = −2.5 log10(F) + ZP (2.17)

where ZP is the zero point. The zero point refers to a count rate (DN/exposure): it is the magnitude of an object that produces 1 count (or data number, DN) per second, so a larger zero point indicates a more sensitive system (i.e., a larger aperture, higher throughput, etc.). The magnitude of an arbitrary object producing DN counts in an observation of exposure time texp is given by,

m = −2.5 log10(DN/texp) + ZP (2.18)

It is the setting of the ZP that determines the connection between observed counts and the standard photometric system, and in turn between counts and astrophysically interesting measurements such as the flux incident on the telescope. The ZP gives a measure of the system sensitivity.
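A minimal sketch of eq. (2.18); the zero-point value used here is hypothetical:

```python
import math

def mag_from_counts(dn, t_exp, zp):
    """Eq. (2.18): magnitude from total counts DN over exposure t_exp (s)."""
    return -2.5 * math.log10(dn / t_exp) + zp

# By the definition of the zero point, 1 DN/s gives m = ZP
assert mag_from_counts(30.0, 30.0, 26.0) == 26.0
# Doubling the exposure of the same source doubles DN but leaves m unchanged
m1 = mag_from_counts(1000.0, 30.0, 26.0)
m2 = mag_from_counts(2000.0, 60.0, 26.0)
assert abs(m1 - m2) < 1e-9
```

This is why the zero point, rather than the raw counts, carries the calibration: it absorbs the exposure time and the system sensitivity into a single constant.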

Zmag = mag + 2.5log10(F ) − 2.5log10(DN) (2.19)

Extinction Correction and Photometric Calibration

Besides the interstellar medium (ISM), starlight is affected by absorption, or extinction, in the Earth's atmosphere. A star near the zenith suffers the least atmospheric absorption, but as the star moves away from the zenith the path length through the atmosphere increases, and with it the extinction. This extinction is caused by molecular absorption due to O3, O2, CO2, H2O, etc., and by Rayleigh scattering from dust and aerosols; Rayleigh scattering is strongest in the UV and in the blue.

Figure 2.4: Aperture Photometry Plots

To perform the extinction correction and to calibrate photometric observations, standard stars with constant brightness are used. These standard stars are observed at airmasses similar to those of the target stars.
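A first-order sketch of the extinction correction, assuming the plane-parallel airmass approximation X = sec(z); the extinction coefficient k and the magnitudes below are hypothetical illustrative values:

```python
import math

def airmass(zenith_angle_deg):
    """Plane-parallel approximation X = sec(z); valid away from the horizon."""
    return 1.0 / math.cos(math.radians(zenith_angle_deg))

def extinction_corrected(m_obs, k, zenith_angle_deg):
    """First-order extinction correction: m0 = m_obs - k * X."""
    return m_obs - k * airmass(zenith_angle_deg)

# Hypothetical r-band coefficient k = 0.1 mag/airmass. The same star observed
# at the zenith (X = 1) and at z = 60 deg (X = 2) appears 0.1 mag different,
# but both measurements recover the same above-atmosphere magnitude ~14.9.
m_zenith = extinction_corrected(15.0, 0.1, 0.0)
m_low = extinction_corrected(15.1, 0.1, 60.0)
print(m_zenith, m_low)
```

In practice k is determined per filter and per night from the standard stars mentioned above, observed over a range of airmasses.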

Uncertainties

Uncertainty is the amount of variation observed when a measurement is repeated over and over again; anything that changes between measurements can cause variation, so all photometric measurements have some uncertainty associated with them. Systematic uncertainties can be due to blending, and to uncertainty in the zero point and the extinction correction. Uncertainties cannot be eliminated entirely but they can be reduced; improving the spatial resolution, for example, helps minimize uncertainties due to blending.

2.5.2 Light Curves

By performing aperture photometry on the stars in a region of interest across multiple observations we obtain the flux/magnitude variation over a period of time. Once observational data for an astronomical object have been gathered over a period of time, we can construct a plot of flux versus time or magnitude versus time: a "light curve". This helps in determining the periodicity of variables as well as in detecting transients, which are one-time rapid events. Fig. 2.5 shows an artistic view of the light curves of transients and variable sources. The light curves of some transients, for example microlensing events (also known as Paczynski light curves), are symmetric bell shapes with respect to the peak (the point of maximum amplification).
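A light curve of this kind can be plotted from arrays of epochs and magnitudes; the sinusoidal "variable" below is synthetic, purely for illustration:

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend, e.g. on a compute node
import matplotlib.pyplot as plt

# Hypothetical epochs (MJD) and magnitudes for a periodic variable star
mjd = np.linspace(58300, 58400, 50)
mag = 18.0 + 0.3 * np.sin(2 * np.pi * mjd / 10.0)  # 10-day period

fig, ax = plt.subplots()
ax.errorbar(mjd, mag, yerr=0.05, fmt=".k", ecolor="gray")
ax.set_xlabel("MJD")
ax.set_ylabel("magnitude")
ax.invert_yaxis()  # brighter (smaller magnitude) is plotted upward
fig.savefig("light_curve.png", dpi=150)
```

Inverting the y-axis is the usual convention for magnitude-based light curves, since smaller magnitudes mean brighter sources.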

Figure 2.5: Shown here are the artistic impression of Light curves of (a) Transient events, (b)Variables

Photometric variability can be used to predict stellar parameters such as effective temperature, surface gravity, and metallicity. Photometry coupled with spectroscopy of stars and stellar systems leads us to understand the evolution of stars, the creation of elements in stars and in supernova explosions, and the formation and evolution of galaxies. To determine the mass, composition and age of stars, the visual magnitude and colour of a star are converted into a bolometric magnitude (total absolute flux) and an effective temperature (related to the surface temperature). These are used with a colour-magnitude or luminosity-versus-temperature Hertzsprung-Russell (HR) diagram3 to trace the position occupied by a star and the path it takes, in order to understand stellar evolution. They can also be used to study star clusters and galaxies and their evolution through stellar population studies, based on the age and composition of the constituent stars. With the help of photometry we can gather data on stellar brightness and colour, and using colour-magnitude diagrams we can test ideas of stellar structure and evolution and continue to develop physical theories.

3One important outcome of the study of large samples of stars and their luminosities and magnitudes is the HR diagram; this scatter plot of stars helped not only in understanding stellar evolution but also in finding the relation between the luminosity and effective temperature of stars.

Chapter 3 Observational Datasets

Every astronomical observation generates data associated with the object or phenomenon being observed, along with metadata describing information such as the observation time, exposure time, telescope pointing, atmospheric details and calibration details. This metadata helps in understanding the data and aids in data processing. In this chapter we take a look at the observational datasets used in this thesis work.

3.1 Panchromatic Hubble Andromeda Treasury (PHAT)

The Hubble Space Telescope (HST), with its 2.4 meter mirror, offers higher angular resolution than ground-based telescopes thanks to the excellent advantages of being in space (substantially lower sky background and no atmospheric distortions). The faintest objects detectable with HST have magnitudes of ∼28 in the r/i bands. It can therefore resolve stars in the Local Group of galaxies, especially the Andromeda Galaxy, making it possible to study stars in their different stages of evolution, from progenitor to descendant, and to study stellar populations across the galaxy; by analyzing the colour-magnitude diagrams of the resolved stars, one can derive the age, extinction and metallicity.
The Panchromatic Hubble Andromeda Treasury (PHAT) was a multi-cycle program utilizing HST for a wide-area survey of the disk of M31. This survey covered 0.5 deg2, or 1/3 of the Andromeda galaxy, in 828 orbits with 432 pointings. The disk of M31 was resolved into more than 100 million stars, and stellar photometry of 117 million resolved stars was measured using the Wide Field Camera 3 (WFC3) and the Advanced Camera for Surveys (ACS) in near-ultraviolet (F275W, F336W), optical (F475W, F814W), and near-infrared (F110W, F160W) bands, enabling a wide range of scientific endeavours [100, 101]. What makes this survey remarkable is that the stars of a galaxy other than our own Milky Way have been resolved into distinct stars, enabling astronomers to study detailed physical processes on parsec scales. This rich dataset has already proven to be a tremendous asset for the astronomical community and has resulted in many


Figure 3.1: Location and alignment of PHAT bricks and their footprint. North is up and East is to the left of the image. [100]

publications, several of which are relevant to this thesis and have been referenced in this work [100]. In our thesis work we utilize the stellar positions of the resolved stars from this survey for transient and variability studies.

Fig. 3.1 shows the PHAT observations, which are divided into 23 rectangular sub-regions called bricks. Each brick covers a projected region of ∼1.5 kpc x 3 kpc, and each brick is further divided into 18 fields, each with a projected size of ∼500 pc x 500 pc. Each brick was observed as two 3x3 half-bricks with two different pointings, the WFC3 pointing being primary and the ACS observations falling in the adjacent half-brick. The brick numbering on the northern strip shows that, starting from the bulge of Andromeda, the odd numbers run along the major axis up to the outer disk. Table 4.1 lists the number of stars in each band in each brick [101]. The PHAT survey data is archived at the Space Telescope Science Institute (STScI) and can be accessed through their website1. The approximate PHAT brick corners are listed in table 3.1.

The PHAT repository contains a survey-wide catalogue and brick-wise directories with binary tables stored in FITS files. We used the Version 2 star files (*b?? *st.fits), which contain brick-wide photometry of all astrometry-corrected objects classified as stars [100]. The naming conventions and column names for the PHAT v2 star files are documented in the HST

1https://archive.stsci.edu/prepds/phat/

Brick PID R.A.1 Decl.1 R.A.2 Decl.2 R.A.3 Decl.3 R.A.4 Decl.4
1 12058 10.87969 41.26532 10.63671 41.34631 10.5761 41.24387 10.81892 41.1629
2 12073 11.12029 41.18549 10.8779 41.26699 10.81699 41.16468 11.05921 41.08321
3 12109 10.95216 41.36298 10.7089 41.44413 10.64809 41.34173 10.89118 41.26061
4 12107 11.19357 41.28269 10.95092 41.36434 10.8898 41.26207 11.1323 41.18044
5 12074 11.04482 41.45377 10.80133 41.53511 10.74029 41.43276 10.98361 41.35144
6 12105 11.28646 41.37401 11.04358 41.45586 10.98223 41.35363 11.22495 41.27182
7 12113 11.13774 41.5451 10.89403 41.62664 10.83276 41.52434 11.0763 41.44284
8 12075 11.37848 41.46474 11.13539 41.54679 11.07381 41.44462 11.31675 41.3626
9 12057 11.2298 41.63532 10.98587 41.71706 10.92436 41.61481 11.16813 41.5331
10 12111 11.4741 41.55495 11.23079 41.63719 11.16898 41.53508 11.41213 41.45287
11 12115 11.28791 41.71003 11.02353 41.74143 10.99982 41.63066 11.26404 41.59933
12 12071 11.54277 41.65341 11.29918 41.73581 11.23717 41.63373 11.4806 41.55137
13 12114 11.31717 41.8195 11.05228 41.85054 11.02883 41.73975 11.29355 41.70886
14 12072 11.57892 41.78777 11.3143 41.81941 11.29041 41.70867 11.55487 41.67718
15 12056 11.37644 41.92498 11.11115 41.95616 11.08756 41.84538 11.35268 41.81435
16 12106 11.63914 41.8931 11.37413 41.9249 11.3501 41.81416 11.61495 41.78253
17 12059 11.48855 42.02423 11.22291 42.05568 11.19909 41.94492 11.46456 41.91363
18 12108 11.75272 41.99292 11.48736 42.02497 11.4631 41.91427 11.7283 41.88238
19 12110 11.62005 42.12072 11.35409 42.15246 11.33001 42.04174 11.59581 42.01015
20 12112 11.88344 42.08995 11.61776 42.1223 11.59324 42.01163 11.85876 41.97944
21 12055 11.68571 42.22514 11.41935 42.25704 11.39512 42.14633 11.66131 42.1146
22 12076 11.95062 42.19503 11.68455 42.22754 11.65987 42.1169 11.92579 42.08456
23 12070 11.86064 42.31678 11.59401 42.34909 11.56944 42.23843 11.83591 42.20629

Table 3.1: Reproduced from [100]; approximate corners of the PHAT bricks. Note: all coordinates are given in J2000. Corners are given for the approximate limits of the WFC3/IR coverage; WFC3/UVIS and ACS/WFC images extend beyond these limits.

Figure 3.2: HST PHAT Composite color image. Credits: NASA, ESA, J. Dalcanton, B.F. Williams, and L.C. Johnson (University of Washington), the PHAT team, and R. Gendler

PHAT document on the website2, as well as being stored in the headers of each binary FITS file. Important columns include the x, y pixel positions, the stellar coordinates in the ra and dec columns (both in decimal degrees), and the f814w vega column, which contains the Vega magnitude of the source. As one can notice from the Andromeda galaxy picture shown in fig. 3.2 3, crowding is severe in the bulge region from brick 01 to brick 07, it is moderate between brick 07 and

2https://archive.stsci.edu/pub/hlsp/phat/hlsp_phat_hst_v2photometry_readme.txt
3https://imgsrc.hubblesite.org/hvi/uploads/image_file/image_attachment/26733/web_print.jpg

brick 16, and crowding is less severe from brick 17 up to brick 23, as this region is far from the central bulge of the Andromeda Galaxy. Fig. 3.3 shows an RGB image from PHAT Brick 11 ACS-WFC taken from the PHAT catalogue website4; it shows a dense star field with resolved stars, and a few star clusters of the Andromeda galaxy can also be noticed in the image.

Figure 3.3: An RGB image from 15 different fields of data from PHAT Brick 11 ACS-WFC; a few star clusters can be noticed in the image.

4https://archive.stsci.edu/prepds/phat/brick11_acs-wfc_display.html

3.2 Zwicky Transient Facility (ZTF)

ZTF5 is an ongoing next-generation robotic synoptic time-domain survey for transient discovery, and is the successor to the intermediate Palomar Transient Factory (iPTF) [102], which was itself the successor to the PTF [103]. ZTF began science operations in March 2018 and will continue until 2020. It is a high-cadence, wide-area survey with the largest field of view of any meter-class camera, and it has faster readout times than iPTF. A comparison of ZTF with other meter-class cameras is shown in fig. 3.4.

Figure 3.4: Comparison of Field of View of various meter class Sky Survey Telescopes. Credits: Eric Bellm, Project Scientist for ZTF, Caltech

ZTF uses the Palomar 48-inch Schmidt Telescope at Palomar Observatory [104] with a field of view (FoV) of 47 deg2 and a 600 MP6 camera with a CCD readout time of 8 seconds, which enables a fast scan of the sky. The ZTF CCDs fill the entire focal plane of the Palomar 48-inch Schmidt telescope. The camera has 16 CCDs, as shown in fig. 2.1 b, each with 6k x 6k pixels; each pixel is 15 μm in size and corresponds to close to one arcsecond. Each of the 16 CCDs is further divided into 4 readout quadrants, as illustrated in fig. 3.6. ZTF uses three filters (g, r, i), and observations in the g and r bands are separated by roughly 1 hour. Fig. 3.5 shows the ZTF observing system specifications. ZTF can scan the entire northern visible sky down to −30° declination every three nights (a cadence of 3 nights) at rates of ∼3760 square degrees/hour to median depths of g ≈ 20.8 and r ≈ 20.6 mag (AB, 5σ in 30 s), and it is focused on the low-redshift regime (z < 0.1). It also performs observations of the Galactic plane every night in

5https://www.ztf.caltech.edu/
6Megapixel

Figure 3.5: Specifications of the ZTF Observing System [96]

g and r passbands. The ZTF discovery rate of transients brighter than 21st mag is higher than that of the upcoming LSST [105]. The large field of view means it is well suited to finding bright astrophysical sources, which can later be followed up spectroscopically using 1 to 2 m class telescopes [96]. Its observation time is split between public (40%), collaboration and partnership (40%) and private Caltech time (20%), and the data is hosted at the Infrared Processing and Analysis Center (IPAC)7. In addition to the public Galactic plane survey, ZTF has conducted a dedicated high-cadence survey of selected Galactic plane fields to study the ultra-short variable sky [106, 107]. ZTF does suffer from blending of stars. With the R∼21 mag sensitivity of ZTF, researchers can track variable stars brighter than 4 mag. Some of these stars represent young stellar populations. ZTF will detect known variables, which can be used to evaluate detected unknown objects, but it may also detect rare and interesting objects with relatively brief lifetimes. ZTF's greater survey speed will provide an unprecedented variability catalogue and break new ground in the study of transients, and will thus enable great science. Due to the wide range of survey parameters involved it is difficult for one generic time-domain survey to fit all purposes, so specialized surveys will continue to be used and generic time-domain surveys will continue to be limited to a few science goals. The ZTF design was optimized under such limitations and was designed for the study of explosive transients [106]. ZTF will be a stepping stone to the next-generation Large Synoptic Survey Telescope (LSST)8.

7https://www.ipac.caltech.edu/ 3.2. Zwicky Transient Facility (ZTF) 49

No Field ID Filter ID CCD ID Q ID No. of Images
1  695   zr  c10  q2  2
              c11  q1  358
              c15  q4  14
          zg  c11  q1  247
              c15  q4  7
          zi  c11  q1  24
              c15  q4  2
2  1735  zr  c01  q1  128
                   q2  127
              c05  q3  7
                   q4  129
          zg  c01  q1  35
                   q2  35
              c05  q3  2
                   q4  34
          zi  c01  q1  1
                   q2  1
              c05  q4  1
3  1736  zr  c04  q2  1
              c08  q3  14
          zg  c04  q2  1
              c08  q3  9
          zi  c08  q3  1

Table 3.2: Downloaded ZTF science images listed per field ID, filter ID, CCD ID and quadrant ID.

Figure 3.6: ZTF CCD readout Channel Layout [108]

Fig. 3.6 shows the ZTF focal-plane CCD/readout-channel layout and numbering, viewed from behind the focal plane onto the sky. Green numbers (01..16) are CCD IDs; the labels q1, q2, q3, q4 denote the readout-channel ordering per CCD (the same for all CCDs), packaged in this order in the multi-extension raw-image FITS files; red numbers (00..63) denote the 64 readout-channel IDs (used for internal database bookkeeping). The x, y vectors denote the directions of increasing pixel coordinates that all readout quadrants have when the FoV is oriented on the sky as shown [108]. Table 3.2 lists the number of ZTF files downloaded per field ID and filter ID. These are the images downloaded by passing the 23 PHAT brick corner positions.
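The CCD/quadrant to readout-channel numbering can be sketched as follows, assuming the conventional mapping in which the four quadrants of CCD 1 take channels 0-3, CCD 2 takes 4-7, and so on; this matches the 00..63 counting described above, but the exact ordering should be checked against [108]:

```python
def readout_channel_id(ccd_id, quadrant_id):
    """Map (CCD ID 1..16, quadrant 1..4) to a 0..63 readout-channel ID,
    assuming quadrants are numbered consecutively within each CCD."""
    if not (1 <= ccd_id <= 16 and 1 <= quadrant_id <= 4):
        raise ValueError("ccd_id must be in 1..16, quadrant_id in 1..4")
    return 4 * (ccd_id - 1) + (quadrant_id - 1)

assert readout_channel_id(1, 1) == 0    # first quadrant of the first CCD
assert readout_channel_id(16, 4) == 63  # last quadrant of the last CCD
# 16 CCDs x 4 quadrants cover all 64 channel IDs exactly once
assert len({readout_channel_id(c, q)
            for c in range(1, 17) for q in range(1, 5)}) == 64
```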

8https://www.lsst.org/ Recently renamed the Vera C. Rubin Observatory.

Chapter 4 Data Processing Approach and Pipeline Development

Software is a central part of modern scientific discovery. ... The public availability of code is corner stone of the scientific method. ... Today, software is to scientific research what Galileo’s telescope was to astronomy: a tool, combining science and engineering. It lies outside the central field of principal competence among the researchers that rely on it...it builds upon scientific progress and shapes our scientific vision.

– Christophe Pradal [109]

Essentially, all models are wrong, but some are useful.

– George Box

In this chapter we explain the software and hardware environment used for pipeline development and subsequent data processing and analysis.

4.1 Software Tool Chain

This section describes the software used for pipeline development, data processing and analysis. Most of this software tool chain is used on the Linux operating system.

4.1.1 Python Programming Language

The photometric data processing pipeline was developed using the open-source Python programming language1. Python is increasingly being adopted in astronomy thanks to its community support and ease of use, and, being open source, it is free. Open-source tools were used as this leads to open science and helps other researchers reproduce the results obtained.

1https://www.python.org/


4.1.2 Python Packages

We also used Python packages such as Astropy2, Pandas3, Numpy4, Matplotlib5 and Photutils6. Astropy is a collection of libraries essential for astronomers. Pandas is an important library used for data manipulation and analysis. Numpy is a numerical library that provides various functions for numerical data processing. Photutils is an Astropy-affiliated package for source detection and photometry. For the generation of plots we used the 2D plotting library Matplotlib.

4.1.3 Git as Version Control System

During software development, a Version Control System (VCS) is an important tool, as it manages changes to files, documents and programs, and maintains the different versions of the code as development progresses. We used git as our VCS, and the code is hosted on GitHub7, so it is publicly available.

4.1.4 Imaging and data visualization

The SAO DS9 software8 and fv, the FITS Viewer software9, are the main imaging and data visualization tools used by the astronomy community; they provide various functionalities to display images, including FITS image/table files, and allow one to perform various analyses by connecting to different astronomy catalogues.

4.2 HPC Computing Resource - NSC's Tetralith Supercomputer

We used the National Supercomputer Centre's (NSC) largest High Performance Computing (HPC) cluster, the Tetralith supercomputer, for our data processing10. It has 1908 compute nodes, each with two Intel Xeon Gold 6130 CPUs of 16 cores, i.e. 61056 CPU cores in total. The supercomputer has 1832 thin nodes with 96 GB of RAM and a local 200 GB SSD disk per thin node, and 60 fat nodes with 384 GB RAM and a 900 GB local SSD disk

2https://www.astropy.org/
3https://pandas.pydata.org/
4https://numpy.org/
5https://matplotlib.org/
6https://photutils.readthedocs.io/en/stable/
7https://github.com/. GitHub is an online code repository.
8http://ds9.si.edu/site/Home.html
9https://heasarc.gsfc.nasa.gov/docs/software/ftools/fv/
10In June 2019, Tetralith was placed at the 74th position in the top 500 supercomputers list https://www.top500.org/list/2019/06

each. HPC resources run on the Linux operating system. It is important to know how to schedule11, run, monitor and end batch jobs.
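As an illustration, a job on a Slurm-managed cluster such as Tetralith is typically submitted with a batch script along these lines; the job name, resource numbers and script name are placeholders, not taken from this work:

```shell
#!/bin/bash
# Hypothetical Slurm batch script for one photometry job.
# Submit with `sbatch`, monitor with `squeue`, cancel with `scancel`.
#SBATCH -J phot-pipeline        # job name
#SBATCH -N 1                    # one compute node
#SBATCH -n 16                   # 16 tasks (one per core on a thin node)
#SBATCH -t 04:00:00             # wall-time limit (hh:mm:ss)

python run_photometry.py --brick 11 --filter zr
```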

4.3 Exploratory Data Analysis

We started with an exploratory data analysis approach to analyze the datasets and understand their main characteristics. For this we utilized Anaconda's Jupyter Notebook, which provides a nice interface for creating and sharing documents that contain live code, equations and text descriptions.

4.3.1 Data used for processing and Analysis

For our processing and data analysis we used the HST PHAT catalogue data, as it provides the stellar positions and observed brightness of millions of resolved stars in the direction of M31. The imaging data spanning 6 months, on which we perform photometry, was gathered from ZTF.

Figure 4.1: PHAT Brick 09 and Brick 11 Overlap Region

As explained in the previous chapter, PHAT provides data in bricks; there are 23 bricks covering different regions of the Andromeda Galaxy. We used the PHAT Brick 11 dataset for stellar positions. This choice was made because the region is a smooth inter-arm disk and it also overlaps with Brick 09, as shown in fig. 4.1, which allows us to validate the data processing through a sanity check of the data. This overlap is due to the re-purposing of HST orbits and the HST observing strategy. In terms of

11Their scheduling policy, which provides important information, can be read at https://www.nsc.liu.se/support/batch-jobs/tetralith/.

coverage, Brick 11 is missing only a small triangular area. Note also that several star clusters have been discovered in the Brick 09 region [101].

Figure 4.2: Plot showing F814W Vega vs F814W SNR for Brick 11

PHAT Brick 11 contains stellar positional data and magnitudes in different filters. We used the F814W Vega magnitudes, as this band contains the greatest number of stars, around 48,179,678, and it essentially covers depths down to 25 mag, reaching the deepest colour-magnitude diagram (CMD) features [110].

Figure 4.3: Histogram Plot showing number of stars vs F814W Vega Magnitude.

Table 4.1 lists the number of stars in each brick and for each filter. One thing to note, as mentioned in [100], is that a dataset this large encountered glitches during observation, so some data was missing in some of the bricks during the initial pointings, though some of it was gathered later.

Brick Total F275W F336W F475W F814W F110W F160W
1 7739068 18310 268608 6587303 7276966 5605734 4547719
2 5389720 32745 150598 4064105 4635696 3753250 2740961
3 7420074 25280 126152 6296160 6969507 5327618 4161662
4 5699925 38868 165470 4430452 4966957 3949133 2947135
5 7017006 27125 121143 5871771 6513479 5070041 4000621
6 5737431 35452 154325 4413826 4969404 3928435 2898493
7 6548176 31481 110925 5307224 5917345 4654753 3696269
8 5619644 33584 151109 4256729 4817534 3841721 2787050
9 6107219 38838 144052 4589795 5361598 4255312 3282843
10 5286304 37461 114350 3826009 4445694 3666566 2604794
11 2944424 16185 41962 2257404 2560174 2007530 1490975
12 5037810 32830 130383 3715558 4286834 3433973 2421219
13 5701760 37060 100222 4178398 4886377 3907437 2844294
14 5381346 37474 139786 3945878 4583746 3649396 2565388
15 5506459 47311 201660 3579821 4638748 3876490 2831302
16 4978208 36893 135707 3453638 4152284 3371484 2324596
17 4928251 35865 144164 3299131 4072788 3462695 2398277
18 3988686 31269 87305 2588023 3199622 2691382 1724250
19 4030587 27500 76068 2638976 3259447 2740509 1762750
20 3116348 38093 70767 1843219 2390934 2064556 1273885
21 3207788 31108 90925 1897030 2535071 2228104 1414521
22 2718684 25772 55894 1495505 2002241 1739825 1063914
23 2756854 22383 53262 1613141 2134913 1780956 1114743

Table 4.1: Reproduced from [101]; lists the number of stars in each band in each brick of PHAT.

Figure 4.4: ZTF science images of M31. Each circle represents a different field ID, observed at different times; field ID 695 is the primary field ID, and the images gathered in this field were used in this thesis work.

The observational data from ZTF was gathered from November 2017 to May 2019 through the g, r and i ZTF filters. From ZTF we used the imaging data from primary field ID 695, as it has the largest set of images and thus provides the longest baseline of data, which is beneficial for detecting variability and transients. Fig. 4.4 shows the 3 different field IDs on the CCD quadrant; the ZTF images under each field ID were captured at different periods of time. Field ID 695 is the primary field ID with the largest baseline of data, so it was used for our analysis.

4.3.2 Quality Filtering of ZTF images

Figure 4.5: Distribution of ZTF observations under field ID 695 with the r-band filter. The red lines indicate the first light and the first data release dates.

Fig. 4.5 shows a histogram of the number of observations carried out in the r filter under primary field ID 695; the red dotted lines indicate the duration from first light to the first data release, and the figure also shows the distribution of the data per epoch. For field ID 695, CCD 11 quadrant 1 contained all the data: 358 ZTF images in the r filter and 247 in the g filter. The number of ZTF science images represents the number of data points for each object.

We used the PHAT brick corners listed in table 3.1 to retrieve the ZTF imaging data hosted at IPAC. The data was gathered under different field IDs; field ID 695 was the primary field ID and we used this data for our processing and analysis. The quality of the observations varies with time due to seeing, airmass, moon phase and other factors. From the point of view of ZTF science image quality there are two important things to note: the keywords "infobits" and "seeing" in the ZTF science image headers indicate the quality. "infobits" is a 16-bit integer that encodes the status of the processing and instrumental calibration steps; if any specific calibration step fails, infobits takes a non-zero value. To ensure the quality of the photometric measurements, we removed the ZTF images with non-zero "infobits" values. Table 3.2 lists the number of images per field ID; in total there were 1180 images with infobits of 0. Fig. 4.6 shows the number of files per filter with infobits = 0, and with infobits = 0 and seeing less than 4. The total number of ZTF images was 1248; after the infobits-based quality cut this was reduced to 1140, comprising images with different field IDs.
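The infobits/seeing quality cut described above can be sketched as a simple header check with Astropy. The function name is our own; FITS keyword lookup in Astropy is case-insensitive, so the uppercase spellings below match the "infobits" and "seeing" keywords either way:

```python
from astropy.io import fits

def passes_quality_cut(path, max_seeing=4.0):
    """Keep only ZTF science images whose INFOBITS header value is zero
    and whose SEEING value (FWHM, arcsec) is below max_seeing."""
    header = fits.getheader(path, ext=0)
    # Missing keywords default to failing values, so such images are rejected
    return header.get("INFOBITS", 1) == 0 and header.get("SEEING", 99.0) < max_seeing

# good_images = [p for p in image_paths if passes_quality_cut(p)]
```

Reading only the header (rather than the full image) keeps this filter cheap even over the full set of downloaded science images.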

Figure 4.6: Number of Files versus Filter ID, used for the quality cuts.

4.4 Our Approach

4.4.1 Pixel Photometry

In various microlensing surveys, the photometric approaches employed for stellar magnitude extraction were either Point Spread Function (PSF) profile fitting [92] using the DoPHOT software [93], or pixel lensing based on difference imaging analysis (DIA) techniques [94]. Since our focus here is only to detect and measure brightness variations, and we do not aim to produce the best-quality light curves, we have followed a different yet simple and straightforward approach as part of this thesis work.

Owing to its resolution and its location in space, HST (and hence the PHAT catalogue) suffers least from blending of stars in a crowded field compared with the ground-based ZTF. At the distance of M31, most stars are unresolved from ground-based telescopes. In such a crowded field, and at the resolution offered by ZTF, each CCD pixel may receive photons from multiple stars, with the brightest and most massive stars dominating the pixel and thus driving the photon count. If we notice variability or a transient at a pixel, we can assume that the brightest star is likely driving the variation. In such a scenario, instead of performing aperture photometry on individual unresolved stars, we can perform aperture photometry on each pixel of the image and produce pixel light curves. This way we avoid performing photometry multiple times at the almost overlapping RA/Dec positions of unresolved, blended stars. Fig. 4.7 shows the effect of blending and overlap of PHAT stellar positions plotted on ZTF images using the SAOImage DS9 imaging and data visualization software12.

12http://ds9.si.edu/site/Home.html

Figure 4.7: Blending and overlapping RA/Dec positions

4.4.2 Source Measurement and Estimation

The first step involves computing the total flux of the star. For this we integrate the photon counts over the pixels that fall within a chosen aperture radius; we used fixed aperture radii of 3 px, 1.5 x seeing and 3 x seeing. We could use a large aperture radius for estimating the total flux, but this is not optimal in crowded fields. A better way is to perform aperture photometry with a small radius and then compute an aperture correction. Photutils offers several methods for photometry: we use CircularAperture to define the circular aperture and the aperture_photometry method from photutils to sum the photons within it. Note that at some radius we attain the highest Signal-to-Noise Ratio (SNR) and minimal errors. The choice of an aperture radius is difficult because the star's profile extends to infinity, so an approximation must be made to choose an appropriate aperture. The difficulty arises in two cases: if a star is bright and covers more pixels than the chosen aperture, we lose important information; on the other hand, if a star is faint and covers fewer pixels than the chosen aperture, the measurement is contaminated by the sky background.

4.4.3 Background Measurement and Estimation

In the next step we perform background estimation by defining an annulus region around the chosen star and summing all the pixel values within it. The background is subtracted from the aperture values to obtain the total flux of the star. For background measurement and estimation, we selected an annulus region with an inner radius of 10 pixels and an outer radius of 15 pixels. We then estimated the background mean, median and standard deviation from all the pixels that fall within the annulus region, using the following equation. In this background annulus region the median of all pixels should ideally be zero; the percentile-based estimator below provides a robust estimate of the uncertainty:

σ_bkg = (P_84 - P_16) / 2    (4.1)

where P_84 and P_16 are the 84th and 16th percentiles of the background pixel values. Error bars help in assessing the quality and significance of the measurements. Brick 11 does not lie in a smooth inter-ring region; the region is dusty, which makes the sky background difficult to estimate there.
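Eq. (4.1) can be written directly with numpy percentiles. A minimal sketch follows; the Gaussian test data is illustrative, chosen because for normally distributed noise half the 16th-to-84th percentile spread recovers the standard deviation:

```python
import numpy as np

def robust_background_sigma(annulus_pixels):
    """Robust background scatter from eq. (4.1): half the spread between
    the 16th and 84th percentiles of the annulus pixel values."""
    p16, p84 = np.percentile(annulus_pixels, [16, 84])
    return 0.5 * (p84 - p16)

# For Gaussian noise this estimator approximately recovers the standard deviation,
# while being far less sensitive to outliers (e.g. stars in the annulus) than np.std.
rng = np.random.default_rng(0)
pixels = rng.normal(loc=0.0, scale=1.0, size=100_000)
sigma = robust_background_sigma(pixels)
```

The percentile form is preferred over a plain standard deviation precisely because stray sources or cosmic rays inside the annulus would otherwise inflate the error estimate.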

4.5 Methodology and Workflow

In this section we describe the methodology we followed for the data processing; the pipeline development was carried out based on this workflow. The pipeline performs automatic forced aperture photometry on the Zwicky Transient Facility (ZTF) imaging data. Forced photometry requires prior measurements of stellar positions and stellar brightness; we use the stellar positions and observed brightness of stars in the Andromeda Galaxy resolved under the Panchromatic Hubble Andromeda Treasury (PHAT) catalogue in the F814W Vega filter.

We define a fixed aperture radius of 5 px and an annulus with inner radius 10 px and outer radius 15 px. We also define two further aperture radii, namely atmospheric seeing x 1.5 and atmospheric seeing x 3, where the atmospheric seeing is taken from the individual ZTF image headers. The images are assumed to have been calibrated through data reduction (flat fielding, bias and dark subtraction) to such an extent that the intensity scale is linear. The pipeline makes use of some techniques mentioned in [111]. Below we describe the structure of the pipeline, its inputs, procedures and deliverables.

At the initial stage, while performing forced photometry at the HST-resolved stellar positions on the ZTF images, the data processing turned out to be computationally expensive because of the resources and time needed to run the pipeline over the millions of floating-point stellar positions from PHAT. We therefore decided to perform the photometry at the pixel level, which reduced the computational complexity considerably and sped up the data processing; the pixel photometry section above gives a brief account of this approach. Since we made several assumptions, some of them could potentially introduce systematic errors into the results we obtained.

We performed the parallel computation using the Python multiprocessing library. This allowed us to distribute the photometric computation across multiple processors within a computational node of the Tetralith supercomputer facility. Since there are 32 cores per node, we created 12 folders and moved the ZTF images (r and g band) and our Python-based pipeline code into those folders, with each folder containing 32 images. The idea was to run the parallel script so that each ZTF image is processed based on the stellar positions taken from PHAT, generating a Comma Separated Values (CSV) file with the raw photometric data. At this stage the CSV files contained raw science data (aperture sum, background standard deviation, background median, background mean) and no magnitude computations were performed. The CSV files were later combined per filter, and additional computations were performed to convert the raw measurement values into magnitudes; the result was stored in another CSV file. We also applied the selection criterion of keeping only data with Signal-to-Noise Ratio (SNR) > 5. Finally we had two CSV files for Brick 11, one for the g filter and one for the r filter, which were then passed through the machine learning pipeline LIA for classification.

Sl. No  Column Name                   Details
1       unique_id                     Autogenerated ID
2       imagename                     Name of ZTF image
3       ra                            Right Ascension
4       dec                           Declination
5       x_discrete                    Discrete pixel X
6       y_discrete                    Discrete pixel Y
7       xcenter                       x-pixel centre
8       ycenter                       y-pixel centre
9       gain                          ZTF gain
10      objcount_def                  Aperture sum (5 px)
11      objcount_1_5                  Aperture sum (seeing x 1.5)
12      objcount_3                    Aperture sum (seeing x 3)
13      bkg                           Sky background
14      bkg_standev                   Background standard deviation
15      bkg_mean                      Background mean
16      bkg_median                    Background median
17      bkg_max                       Background max
18      bkg_min                       Background min
19      zeropoint                     Zero point
20      zeropoint_uncert              Zero point uncertainty
21      filterid                      ZTF filter ID
22      fieldid                       ZTF field ID
23      programid                     ZTF program ID
24      seeing                        Seeing
25      ccdid                         ZTF CCD ID
26      qid                           ZTF CCD quadrant ID
27      moonra                        Moon RA
28      moondec                       Moon Dec
29      moonillf                      Moon illumination fraction
30      moonphas                      Moon phase
31      airmass                       Airmass
32      num_of_agg_pix_count          Star count in a pixel
33      mjd                           Modified Julian Date
34      phat_f814W_vega_mag_mean      Mean F814W Vega magnitude of the PHAT stellar objects
35      readnoise                     Read noise
36      area_1_5                      Total number of pixels in aperture of radius seeing x 1.5
37      area_3                        Total number of pixels in aperture of radius seeing x 3
38      area_def                      Total number of pixels in aperture of radius 5 px
39      flux_def                      Total flux in aperture of radius 5 px
40      scr_cor_var_fn_without_bkg    Source (without background)
41      yerr_without_bkg              Background y-error
42      mag_def                       Magnitude (for aperture of 5 px radius)
43      objcount_def_median           Median of source aperture sum (5 px)
44      mag_def_median                Median of the magnitude for the source aperture sum (5 px)
45      delta_flux                    ∆F (flux_def - flux_def_median)
46      delta_flux_by_yrr             ∆F / y-error
47      snr                           Signal-to-Noise Ratio
48      flux_def_max                  Maximum total flux in aperture
49      flux_def_median               Median total flux in aperture
50      mag_def_max                   Maximum magnitude for aperture of radius 5 px

Table 4.2: Column names used in the final dataset.

Fig. 4.8 shows the data processing workflow. The following steps describe the data processing workflow of the pipeline script.

1. Get the list of ZTF FITS images in a folder.

2. Read the infobits information from the header of each ZTF image; if the infobits value is non-zero, move the file to a different folder. This infobits parameter in the FITS header provides the quality status of the image: non-zero values indicate that the image might not be reliable.

3. Read the RA/Dec positions from a PHAT brick for F814W Vega < 25; this gives the number of stars on which to perform aperture photometry.

4. Take a reference ZTF image (we used ztf_20190524468449_000695_zr_c11_o_q1_sciimg.fits).

5. Read the x, y pixel positions on the reference ZTF image for the PHAT RA/Dec positions and round each floating-point position down to the nearest integer using the numpy floor method.

6. Group the data based on these discrete pixel positions, count the entries sharing the same discrete pixel position, and compute the mean of the F814W Vega magnitude.

7. Add 0.5 to the x-discrete and y-discrete pixel positions and convert these pixel positions back to RA/Dec; this gives the centre of each pixel.

Steps 5 to 7 speed up the aperture photometry, as they avoid performing photometry separately for multiple stars overlapping each other due to the low resolution of ZTF, where the pixel size is 1.012 arcsec, compared with about 0.13 arcsec for HST.
8. After step 7, store the resulting data in a separate CSV file (referenceimage.csv). When the script is run in parallel, it looks for this file; once it has been generated, subsequent runs use the generated CSV file.

9. Find out in which images the RA/Dec targets can be found. If they are not found, move that image to a different folder and log this in the log file.

10. At this stage manual intervention is needed to move the 1140 ZTF files, grouped into sets of 32 files, into separate folders; this is required for the parallelization of the aperture photometry step. The number 32 corresponds to the 32 processor cores available on a single compute node of the Tetralith supercomputer. We can only parallelize within a compute node (across its 32 processors) and not across compute nodes, so each node processes 32 files, with each file passed to one processor of the node.

11. The ZTF images with the target RA/Dec are now stored in different folders and the parallel processing can begin. Get the number of CPUs available (32 in our case for a compute node on the supercomputer).

12. Load the CSV file generated in step 8 containing the new object coordinates.

13. Select the aperture radii: 3 px, 1.5 x seeing and 3 x seeing. The annulus radii selected are an inner radius of 10 px and an outer radius of 15 px.

14. Pass all the required parameters for photon counting.

15. The method extracts gain, obsmjd, filterid, magzp, seeing, programid, fieldid, ccdid, qid, moonra, moondec, moonillf, moonphas, airmass and readnoise from each ZTF file; their values are later saved into the output file.

16. Define the columns required in the output file.

17. Create a log file and start logging to it.

18. The first iteration is over the number of objects to process.

19. Read the RA/Dec, open the ZTF image and perform aperture photometry. Calculate the various parameters: the source aperture sum for the three different aperture radii, the background annulus sum, the background uncertainty using eq. (4.1), and the aggregated pixel count.

20. This generates a CSV file for each ZTF image.

21. We then combine these individual CSV files into a single CSV file containing the photon count, the background statistics (median, standard deviation, max, min, mean) and other values from the ZTF headers. We did not compute flux, magnitude and other derived values here because the generated file was huge and the computation was not possible within the available memory, so we divided the computations into different stages.

22. We then load the CSV file generated in the previous stage, compute the flux, y-error, magnitudes, delta flux and SNR, and save the results into another CSV file.

23. The pipeline produced photometric data for 200,000 objects (pixels).

24. An SNR cut-off (SNR > 5) was applied to keep only the data with SNR higher than 5. This gave us 63,838 unique objects; a further cut-off of ∆F/σ > 7 was applied to obtain 10,922 objects. Objects here refer to the pixels and not to the stars. Table 4.2 lists the columns present at the end of all the processing; most of them are self-explanatory.

25. The final data are used to generate light curves for these 10,922 objects, which are subsequently used for analysis.
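Steps 5-7 above (discretising the PHAT positions onto the ZTF pixel grid and aggregating the stars per pixel) can be sketched with pandas. The coordinate and magnitude values here are made up for illustration; the first two stars deliberately blend into the same ZTF pixel:

```python
import numpy as np
import pandas as pd

# Hypothetical floating-point pixel positions of PHAT stars on the
# reference ZTF image; the first two stars fall in the same pixel.
x = np.array([10.2, 10.7, 33.1])
y = np.array([5.9, 5.4, 8.0])
f814w = np.array([21.0, 23.0, 20.0])

df = pd.DataFrame({
    "x_discrete": np.floor(x).astype(int),   # step 5: round down to pixel indices
    "y_discrete": np.floor(y).astype(int),
    "f814w_vega": f814w,
})

# Step 6: one row per distinct pixel, with star count and mean magnitude.
pixels = (df.groupby(["x_discrete", "y_discrete"], as_index=False)
            .agg(num_of_agg_pix_count=("f814w_vega", "size"),
                 phat_f814w_vega_mag_mean=("f814w_vega", "mean")))

# Step 7: pixel centres, ready to be converted back to RA/Dec via the image WCS.
pixels["xcenter"] = pixels["x_discrete"] + 0.5
pixels["ycenter"] = pixels["y_discrete"] + 0.5
```

Forced photometry is then performed once per pixel centre instead of once per star, which is what makes the pixel-level approach tractable.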
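The manual file-splitting in step 10 amounts to chunking the 1140 surviving images into groups of 32 (one group per compute node, one image per core). A sketch of the bookkeeping, with hypothetical file names and the actual folder moves and multiprocessing.Pool call omitted:

```python
# Hypothetical file names standing in for the 1140 quality-cut ZTF images.
files = [f"ztf_image_{i:04d}_sciimg.fits" for i in range(1140)]

CORES_PER_NODE = 32  # Tetralith: parallelism is only available within a node

# Split into per-node batches of 32 files; each file in a batch is then handled
# by one core, e.g. via multiprocessing.Pool(CORES_PER_NODE).map(process_image, batch).
batches = [files[i:i + CORES_PER_NODE] for i in range(0, len(files), CORES_PER_NODE)]

print(len(batches), len(batches[-1]))  # 36 batches; the last one is partial
```

Note that 1140 is not a multiple of 32, so the final batch is smaller and its node is partially idle.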

Figure 4.8: Data Processing Workflow

4.6 Use of Machine Learning Algorithm for Classification

Classification of astronomical objects is an important part of research in population studies, and a wide variety of information must be assessed to classify an object reliably. This includes spatial morphology in multiple colors, photometry in multiple colors, time-dependent behaviour and astrometric motion. There are three ways to detect variability: the first is to blink two or more images to see the variation in brightness visually; the second is difference imaging, i.e. subtracting two images; and the third is to use Machine Learning (ML) algorithms based on statistical methods. ML algorithms have been demonstrated to perform well at classification and can therefore be adopted for identifying patterns of variability. On the basis of variability, sources can be classified and characterized using machine learning techniques into non-variable and variable sources and transient events. Furthermore, the best classifications will make use of surveys in other wavelength regimes and of spectral information where available. Experience from many surveys has shown that no single algorithm does a good job on all objects; rather, good algorithms tend to be specialized, i.e. limited to particular object classes such as eclipsing binaries or supernovae. The use of ML for transient classification is relatively recent and important, and ML techniques will continue to be essential in ZTF as well as in future surveys. Another point to note is that most astronomical research groups have so far focused on using ML for non-microlensing transient classification, but with the upcoming surveys the astronomy community will increasingly apply ML to microlensing classification as well.

A major aspect of machine learning classification is knowing which class an observation belongs to. Although the scope of this thesis was limited to producing time series data and generating light curves, to demonstrate how machine learning can help classify the data we used a machine-learning-based gravitational lensing classifier: the Lens Identifying Algorithm (LIA) implementation13, a machine learning algorithm for single-lens microlensing event detection developed by Daniel Godines [112]. The LIA classifier sorts detections into four classes: constants, variables, dwarf novae and microlensing. The pipeline generates a training dataset for each of the classes it aims to detect. The features that are extracted are listed in Table 4.3.

Sl. No  Feature Name              Sl. No  Feature Name
1       shannon_entropy           25      peak_detection
2       normal_gauss              26      abs_energy
3       inv_gauss                 27      abs_sum_changes
4       con                       28      auto_corr
5       con2                      29      c3
6       kurtosis                  30      complexity
7       skewness                  31      count_above
8       vonNeumannRatio           32      count_below
9       stetsonJ                  33      first_loc_max
10      stetsonK                  34      first_loc_min
11      stetsonL                  35      check_for_duplicate
12      median_buffer_range       36      check_for_max_duplicate
13      median_buffer_range2      37      check_for_min_duplicate
14      std_over_mean             38      check_max_last_loc
15      amplitude                 39      check_min_last_loc
16      median_distance           40      longest_strike_above
17      above1std                 41      longest_strike_below
18      above3std                 42      mean_change
19      above5std                 43      mean_abs_change
20      below1std                 44      mean_second_derivative
21      below3std                 45      ratio_recurring_points
22      below5std                 46      sample_entropy
23      medianAbsDev              47      time_reversal_asymmetry
24      root_mean_squared

Table 4.3: List of 47 Statistical features computed using LIA for classification

Due to thesis time constraints we decided to explore only one ML implementation. The main steps in LIA include generating a training set based on the gravitational lensing models from pyLIMA14, and using the Random Forest Algorithm with Principal Component Analysis transformations. LIA computes 47 statistical features from the time series data, after which Principal Component Analysis performs feature dimensionality reduction. In the next step the PCA output is supplied to the Random Forest Algorithm (RFA), which performs the classification.

13https://github.com/dgodinez77/LIA
14An open-source program for microlensing modelling: https://github.com/ebachelet/pyLIMA

4.6.1 Principal Component Analysis (PCA)

The Principal Component Analysis (PCA) technique is used in multivariate analysis and is based on the covariance matrix. PCA is used for reducing the dimensionality of a dataset with minimal loss of information. It preserves the variability of the original dataset by finding new variables, called principal components, which are linear functions of the variables in the original dataset; there are as many principal components as there are variables. In a two-dimensional dataset, PCA can be visualized as a set of orthogonal coordinates, where each axis represents one variable or principal component. PCA simply performs a linear coordinate transformation that rotates the x and y axes (which are perpendicular to each other and remain so after the transformation) such that the direction of first maximum variation, i.e. the axis with the highest eigenvalue of the data, lies along the x axis, and the second maximum variation along the y axis.
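A sketch of such a dimensionality reduction with scikit-learn follows. This is an assumed illustration, not LIA's own code: the synthetic 47-feature matrix only mimics the size of the LIA feature set, with five underlying degrees of freedom plus a little noise:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# 200 "light curves" described by 47 correlated features that really live
# on a 5-dimensional subspace, plus a small amount of noise.
latent = rng.normal(size=(200, 5))
features = latent @ rng.normal(size=(5, 47)) + 0.01 * rng.normal(size=(200, 47))

pca = PCA(n_components=5)
reduced = pca.fit_transform(features)

# Five principal components capture essentially all of the variance.
print(reduced.shape, pca.explained_variance_ratio_.sum())
```

Because the features are highly correlated, a handful of components carries nearly all the information, which is exactly the situation the 47 LIA features are expected to be in.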

4.6.2 Random Forest Algorithm (RFA)

Random Forests are an ensemble learning method for classification and regression. An ensemble combines multiple models (of any ML type) to produce a joint output. There are two important ensemble concepts: the first is called bagging and the second boosting. The Random Forest classifier/regressor is an example of the bagging technique. The Random Forest (Bootstrap Aggregation) classifier algorithm uses multiple decision trees: each decision tree is trained on a bootstrap sample of the data, i.e. a subset drawn from the training set with replacement, so that an individual data point may appear several times in one tree's sample and not at all in another's. Each decision tree produces a class prediction, and during training each of these decision trees attains a different accuracy; the mean accuracy over all the decision trees is the accuracy of the random forest. When new data is given, we check how many decision trees output 1 or 0, and the majority output is taken as the output of the model; this final step is called aggregation. It is important to know that in bootstrap aggregation, decision trees are very sensitive to the data they are trained on: any change in the training dataset causes a significantly different decision tree structure. Owing to its superiority over other methods in terms of accuracy, speed and relative immunity to irrelevant features, the use of the Random Forest (RF) classifier has been explored earlier [113]. It is suggested that any anomaly in our magnitudes or magnitude errors can yield unpredictable results, so the authors suggest using pyLIMA, a microlensing modelling algorithm, as a second test.
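The PCA-then-Random-Forest arrangement used by LIA can be sketched with scikit-learn. This is an illustrative toy problem with two classes and synthetic features, not LIA's four classes or its actual training set:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(1)

# Toy feature matrix: 300 objects x 47 correlated features (the size of the
# LIA feature set), generated from a 5-dimensional latent space.
latent = rng.normal(size=(300, 5))
X = latent @ rng.normal(size=(5, 47)) + 0.1 * rng.normal(size=(300, 47))
y = (latent[:, 0] > 0).astype(int)   # label depends on one latent direction

# Reduce dimensionality with PCA, then classify with a bagged forest of trees.
clf = make_pipeline(PCA(n_components=5),
                    RandomForestClassifier(n_estimators=200, random_state=0))
clf.fit(X, y)

# Like LIA, the classifier reports a probability per class for each object.
proba = clf.predict_proba(X[:5])
```

Each tree in the forest sees a bootstrap resample of the PCA-transformed data, and the reported probabilities are the aggregated votes of the trees.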

4.7 Validating the Pipeline

The pipeline was applied to an extragalactic region, and the resulting light curves were visually verified to confirm that the pipeline produces the desired light curves for less crowded regions with known stars. This step was essential to convince ourselves that the pipeline was working as expected.

Chapter 5

Results and Data Analysis

Statistics: the mathematical theory of ignorance.

– Morris Kline

In this chapter we describe the preliminary analysis carried out on the output of the pipeline. From the 10,922 light curves, after applying the machine learning classification algorithm we selected only a few to include in this thesis. It is important to note that further analysis of the light curves is required.

5.1 Time Series Data Analysis

The photometric data we produced through our pipeline are time series data. Various time series analysis techniques can be applied to them, but we limit ourselves to visual assessment and the Lomb-Scargle periodogram. Below we discuss some of our initial analysis; an in-depth analysis can be carried out in the future.

5.1.1 Visual Assessment

After we generated light curves for the 10,922 objects (pixels), we carried out a visual assessment to identify any interesting objects. Several light curves were selected for further analysis to be carried out in the future.

5.1.2 Lomb Scargle Periodogram Analysis for Variability

We make use of the Lomb-Scargle algorithm implementation in the Astropy [114] Python package to analyse the generated light curves for variability and to find any possible periodicity in the time series data of each object (pixel). If periodicity is found, the data can be folded in phase space to obtain the period. We applied the Lomb-Scargle periodogram to the 10,922 objects (pixels) to generate plots for finding periodicity. We generated several plots for the data with Signal-to-Noise Ratio (SNR) less than 1, SNR between 3 and 7, and SNR greater than 7. This helps in identifying the objects and categorizing them into non-variable and variable sources and further sub-classes of variables, and also in finding transients. Further investigation of the Lomb-Scargle periodogram is needed to understand its various parameters and how to optimize them to find the periodicity of variable stars.

5.2 Machine Learning Classification

Applications of machine learning to variable and transient detection have been explored earlier [113, 115, 116]. Here we make use of the Lens Identifying Algorithm (LIA) implementation described in the previous chapter. The training set required by LIA is generated under the assumption of a year-long survey, a magnitude range between 15 (min mag) and 21 (max mag), a Gaussian noise source and 500 objects per class (n_class); the trained models produced are the labelled datasets. These trained models were used together with the magnitudes and magnitude errors of our dataset as input to LIA. The algorithm produces the classification (microlensing, constant, variable and cataclysmic variable) and also provides the probability of the classification for each of the categories. The classifier identified 77 microlensing events, 27 constant sources, 9898 variable sources and 231 cataclysmic variables among the 10,922 objects (pixels). It is important to remember that the classifier assigns a probability to each of the classes; the more training data is supplied, the better the classification will be. Several representative light curves are shown in the next sections.

5.2.1 Microlensing Detections

The 77 microlensing event classifications were selected for further analysis. The detections were plotted using their RA/Dec as shown in Fig. 5.1. Further analysis revealed that the majority of these detections correspond to artifacts on a single image, as shown in Fig. 5.2. Such artifacts hinder the automatic detection of transients. Detecting these false positives at this stage allows us to characterize such artifacts using their features; they can then be removed from the detections automatically by retraining the machine learning models with the new features, which helps us focus on true detections and minimizes false positives. This also shows that partial human intervention is still needed, even though human intervention is impractical, non-scalable and introduces some delay into the research work. Nevertheless, experts need to visually scan the remaining detection candidates to identify false detections or other problems and to identify solutions to overcome these shortcomings. The false detections might be due to blending, artifacts, residual cosmic rays and calibration errors.

Figure 5.1: Detections classified as Microlensing from Machine Learning Algorithm LIA

Figure 5.2: False Positives due to artifacts. North is up and East is left of the image.

5.2.2 Variable Star Detections

The LIA classifier identified 9898 variable sources in the data that we supplied. We then selected a few of the detections for investigation. Fig. 5.3 shows the light curves for one of the detections. The vertical black and red lines indicate new moon and full moon dates; this helps us determine whether the variability is driven by the phase of the moon. From our initial analysis it appears that the full moon does not drive the variability, but further analysis will be necessary.

Figure 5.3: Flux vs MJD and Magnitude vs MJD plots of a variable star detection through machine learning classification.

5.2.3 Cataclysmic Variable Detections

The LIA classifier classified 231 objects (pixels) as cataclysmic variables. We then selected a few of the detections for investigation. Fig. 5.4 shows the light curve for one of the detections; as we can see, a single data point appears to be driving this detection to be classified as a cataclysmic variable. The vertical black and red lines indicate new moon and full moon; this helps us determine whether the variability is driven by the phase of the moon. From our initial analysis it appears that the full moon does not drive the variability, but further analysis will be necessary.

Figure 5.4: Plot of a cataclysmic variable detection through machine learning classification.

5.2.4 Constant Source Detections

The classifier classified 27 objects (pixels) as constant sources, which do not show any variation in their luminosity; such stars can be used as standard stars. Fig. 5.5 shows the light curve for one of the detections. The vertical black and red lines indicate new moon and full moon; this helps us determine whether the variability is driven by the phase of the moon. From our initial analysis it appears that the full moon does not drive the variability, but further analysis will be necessary.

Figure 5.5: Plot of a constant source detection through machine learning classification.

5.3 Cross Comparisons

In this section we also demonstrate the kind of research that can be carried out once we produce a catalogue of detections from the analysis of the complete dataset. Cross comparisons help us validate our own findings, and they either add to the catalogue of objects or improve the data associated with the findings.

5.3.1 Comparison with Hubble Catalog of Variables (HCV)

While this work was being carried out, a group of researchers published their variable sources as the Hubble Catalog of Variables (HCV)1. We used this catalogue to generate the plot shown in Fig. 5.6, which shows the region of PHAT Brick 11 overlapping the HST HCV catalog [117]. In future work we can cross-correlate ZTF variability with other similar catalogues.

Figure 5.6: PHAT Brick 11 over Hubble Catalogues of Variables (HCV) data of M31 region

5.3.2 Comparison with ZTF Caltech team

The 10,922 objects (pixels) that resulted from our data processing were sent to the ZTF team at Caltech, who provided us with 48 variable star detections. Fig. 5.7 shows the plot of our detections overlapped with the ZTF Caltech detections.

1http://hst.esac.esa.int/hcv-explorer/ and https://archive.stsci.edu/hlsp/hcv

Figure 5.7: Cross Match with Caltech Team

5.3.3 Comparison with Caldwell Star Catalogue in M31

Figure 5.8: Cross Match with Caldwell Star Catalogue to identify Star Clusters. [118]

Noticing the overlapping regions seen in Fig. 5.8, and knowing that there are several star clusters in the PHAT Brick 11 coverage area as shown in Fig. 3.3, we decided to explore whether these are star clusters or artifacts. Since we used stellar coordinates from PHAT, it is possible that clustering present in the stellar coordinates is reflected in our data processing. For this we used the star cluster catalogue by Caldwell [118].

5.4 Summary and Conclusion

This thesis contributed the development of a data processing pipeline based on the forced aperture photometry technique. The developed pipeline was validated by applying it to an extragalactic region with known stars and generating light curves for some of the stars in that region. The ZTF data within PHAT's Brick 11 coverage region were then processed, time series data were generated and a preliminary analysis was carried out. We also demonstrated the use of a machine learning algorithm for classification, and through cross comparisons we demonstrated the kind of research one can perform on the final data.

Microlensing events are much rarer than other types of transients. To detect microlensing events we have to sift through the time series data of millions of stars, and only then can we detect a handful of them. To perform forced photometry we need prior measurements of source positions and source brightness, so we used the Hubble Space Telescope (HST) Panchromatic Hubble Andromeda Treasury (PHAT) catalogue, which provided the resolved stellar coordinates and observed brightness of stars in the Andromeda Galaxy.

We performed forced aperture photometry at these PHAT locations on the Zwicky Transient Facility (ZTF) imaging data relevant to the Andromeda Galaxy, obtained over a period of more than six months in three different filters. Various selection criteria were used to filter out objects. Since ZTF cannot resolve individual stars in the Andromeda Galaxy, we performed the forced aperture photometry at the pixel level; the objects here are therefore pixel locations, each of which receives contributions from the multiple stars within its area. From the PHAT catalogue we obtained the count of stars falling within each ZTF pixel area. The resulting data had ~200,000 objects with a high Signal-to-Noise Ratio (SNR). We selected only those pixels with SNR greater than 7 and obtained ~10,922 objects (pixels). We generated the light curves of these 10,922 objects and inspected several of them visually. We then supplied the time series data of these objects as input to a machine learning algorithm, which classified them into 77 microlensing events (most of which turned out to be related to artifacts), 27 constant sources, 9898 variable sources and 231 cataclysmic variables. The quality of the results can be improved with a larger training set. We also explored the use of Lomb-Scargle periodogram analysis to determine variability, and cross-compared our detections with those of the ZTF Caltech team and with the Caldwell star cluster catalogue. To identify possible enhancements to the pipeline and improve the results, further in-depth analysis is required, which is beyond the scope of this thesis work due to time constraints.

As data volumes grow from terabytes to petabytes and beyond, processing becomes computationally challenging and requires the development of new algorithms. Researchers can tap into contemporary advances in computing and storage capabilities to find more efficient ways of processing these growing data volumes. This means that modern astronomers need to continually equip themselves with modern statistical and computational tools and techniques.

Photometrically monitoring millions of stars over long periods of time with multi-colour lensing surveys provides valuable time-series data from which we can detect isolated transient events as well as identify different types of variable stars. A census of such sources and their distribution in the galaxy helps us understand stellar and galactic evolution and the galactic regions (bulge, halo, disk); the colour-magnitude diagrams constructed from such datasets provide information about the age distribution and relative numbers of different types of stars; and the detection of lensing sources in the stellar or planetary mass regime, and of transients generally, helps constrain the nature of dark matter. Time-domain studies therefore find applications across astrophysics and cosmology.
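One standard tool for pulling periodic variables out of such irregularly sampled time series is the Lomb-Scargle periodogram used earlier in this work. Below is a minimal NumPy implementation of the classical form (astropy.timeseries also provides an optimised version); the 5-day period, sampling, and noise level are synthetic test values, not survey data.

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classical Lomb-Scargle periodogram for unevenly sampled data."""
    y = y - y.mean()
    power = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        w = 2.0 * np.pi * f
        # The offset tau makes the periodogram invariant to time shifts.
        tau = np.arctan2(np.sin(2 * w * t).sum(),
                         np.cos(2 * w * t).sum()) / (2 * w)
        c, s = np.cos(w * (t - tau)), np.sin(w * (t - tau))
        power[i] = 0.5 * ((y @ c) ** 2 / (c @ c) + (y @ s) ** 2 / (s @ s))
    return power

# Recover an injected 5-day period from noisy, unevenly sampled epochs.
rng = np.random.default_rng(42)
t = np.sort(rng.uniform(0.0, 100.0, 200))            # irregular epochs (days)
y = np.sin(2 * np.pi * t / 5.0) + 0.1 * rng.normal(size=200)
freqs = np.linspace(0.01, 1.0, 2000)                 # cycles per day
best_period = 1.0 / freqs[np.argmax(lomb_scargle(t, y, freqs))]
```

Unlike a Fourier transform, this least-squares formulation needs no regular sampling, which is why it suits ground-based survey cadences with weather and seasonal gaps.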

5.5 Outlook and Future Work

During this thesis work several interesting research topics were encountered that can be pursued in the future. Some of them are listed below.

1. Analyzing the full data set for the remaining 22 PHAT brick regions, so that the major part of the data, which is yet to be analysed, is covered. Most of the Brick 23 region was not analysed because it falls on a ZTF chip gap.

2. The processed data is currently in CSV files; a more convenient option would be to move it into a central relational database system such as MySQL. This would allow an online user interface through which the data can be queried with SQL commands, and web-based data visualization could be built on top. Queries in the time domain (Source table) are likely to be of equal importance to those in the spatial domain.

3. Correlating the detections with the phase of the Moon, to determine whether moon phase drives any of the observed brightness variations.

4. Using data from iPTF [119] to cross-compare and find the variable stars that fall within the Brick 11 region.

5. Re-analyzing the data with point spread function (PSF) profile fitting [92] using the DoPHOT software [93], or with pixel lensing based on difference imaging analysis (DIA) techniques [94], both for comparison with our approach and to determine the optimal technique. PSF photometry is more robust in crowded fields such as the central regions; a PSF-based reprocessing of the data could then be compared against the aperture photometry used in this thesis work.

6. ZTF will continue to gather survey data, giving a longer baseline that will help identify even more variables and transients with longer periods.

7. Automatic classification using machine learning (ML) algorithms is an ongoing effort; the more training data the algorithms have, the better they become at identifying variability and transients.

Appendix A - Terminology

Figure 6.1: Credits: Sidney Harris

Accretion Disk - A disk of matter rotating around and accreting towards a central black hole, where the rotation speed and temperature rise sharply inward. The disk can be composed of atomic gas, ionized gas (plasma) and dust.
Arc Second - One 3600th of a degree.
Astrometry - The technique of measuring the positions of astronomical objects.
Baryons - Protons and neutrons, the basic constituents of luminous objects, which take part in nuclear reactions.
Bolometric Magnitude - A measure of the total radiative output of a star, integrated over all wavelengths.
Blackbody - A body in thermal equilibrium at some given temperature that absorbs or emits radiation with 100% efficiency at all wavelengths.
Cadence - The sequence of pointings, visit exposures, and exposure durations performed over the course of a survey.


Deblending - The act of inferring the intensity profiles of two or more overlapping sources from a single footprint within an image. Source footprints may overlap in crowded fields, where astrophysical phenomena intrinsically overlap (e.g., a supernova embedded in an external galaxy), or by spatial coincidence (e.g., an asteroid passing in front of a star). Deblending may make use of a priori information from images (e.g., deep co-adds or visit images obtained in good seeing), from catalogs, or from models.
Detector Noise - The intrinsic noise of the detector.
Difference Image - The result formed from the pixel-by-pixel difference of two images of the sky, after warping them to the same pixel grid and scaling them to the same photometric scale. The pixels in a difference image thus formed should be zero (apart from noise) except for sources that are new, or that have changed in brightness or position.
Extinction - Dimming of the light from astrophysical objects (in our case stars) due to absorption or scattering at a given wavelength by gas and dust in the interstellar medium (ISM).
Epoch - Reference time for a celestial coordinate system (RA, Dec), needed because the coordinates drift slightly every year. The current epoch is J2000.0 (coordinates of celestial objects relative to their positions in the year 2000); the previous epoch was B1950.0 (celestial coordinates as they were in 1950).
Forced Photometry - Photometry that employs prior measurements of source positions and brightness: a measurement of the photometric properties of a source, or expected source, with one or more parameters held fixed. Most often this means fixing the location of the centre of the brightness profile (which may be known or predicted in advance) and measuring other properties such as total brightness, shape, and orientation.
Field of View - The widest angular span on the sky that can be imaged distinctly by the optics.
Halo - The extended, generally low surface brightness, roughly spherical outer region of a galaxy, usually populated by globular clusters, Population II stars and hot gas.
High Cadence - The frequency with which the same region of the sky is revisited: the same part of the sky is imaged either multiple times in a single night or every night regularly for a period of time.
Interstellar Reddening - The result of scattering of starlight by dust in the ISM.
Light Curves - Plots of magnitude versus time.
Magnitude Scale - A logarithmic scale of intensities on which 5 magnitudes correspond to a factor of 100; thus one magnitude is a factor of 2.512. Sources fainter than the zero point

(0 magnitude) have positive magnitudes, and sources brighter than 0 mag have negative magnitudes. The scale can be defined at any wavelength.
Microarcsecond - One millionth of an arc second.
Photometry - A technique for the measurement of light intensities.
Population I and II Stars - Stars in the galaxy are classified into two general categories. Population II stars formed at the time of galaxy formation, are of low metallicity, and are usually found in the halo or in globular clusters. Population I stars, like the Sun, are younger and more metal-rich and form the disk of the galaxy.
Seeing - A measure of the disturbance of an image seen through the atmosphere, ordinarily expressed as the angular size, in arc seconds, of a point source (a distant star) seen through the atmosphere, i.e., the angular size of the blurred source. Turbulence in the Earth's atmosphere causes point sources such as stars to be smeared out and to twinkle.
Signal to Noise Ratio - A measure of the signal of interest compared with everything else, including noise and measurement uncertainties.
Sky Coverage - The entire sky spans about 41,000 square degrees, of which roughly 20,000 square degrees are in the night sky; the horizon limits us to around 17,000 square degrees at any one time. Scanning 4,000 square degrees per hour, the entire visible night sky could be covered in about 4 hours. This means the sky can be imaged rapidly over and over again, multiple times per night, night after night; one can then make a movie of the sky, look for change, and discover new objects and phenomena.
SpaceTime - The 4-dimensional manifold composed of ordinary 3-dimensional space plus time as the 4th dimension.
Visual Magnitude - A measure of luminous intensity in the visual band.
Wide Field - Covering large parts of the sky in each observation.

Appendix B - zwindromeda - python pipeline

We used the Anaconda Python distribution for the initial analysis during the data exploration stage.

Python Packages

The following packages were utilized during the pipeline development:

1. Pandas
2. NumPy
3. Matplotlib
4. Astropy
5. Photutils
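As an illustration of how NumPy and Pandas fit together in such a pipeline, the sketch below computes simple per-object variability features (mean, scatter, and the von Neumann ratio η) from light-curve tables, the kind of statistics fed to a classifier. The table layout and column names are illustrative, not the pipeline's actual CSV schema.

```python
import numpy as np
import pandas as pd

# Hypothetical per-epoch photometry table with two pixel "objects":
# one near-constant, one smoothly variable.
rng = np.random.default_rng(0)
epochs = np.sort(rng.uniform(58300, 58480, 50))  # MJD of the 50 visits
lc = pd.DataFrame({
    "object_id": np.repeat(["px_0001", "px_0002"], 50),
    "mjd": np.tile(epochs, 2),
    "mag": np.concatenate([
        20.0 + 0.02 * rng.normal(size=50),             # constant + noise
        19.5 + 0.5 * np.sin(np.linspace(0, 12, 50)),   # smooth variable
    ]),
})

def features(g):
    """Per-object features: a low von Neumann ratio eta indicates
    smooth, correlated variability rather than white noise."""
    m = g.sort_values("mjd")["mag"].to_numpy()
    eta = np.mean(np.diff(m) ** 2) / m.var()
    return pd.Series({"mean_mag": m.mean(), "std_mag": m.std(), "eta": eta})

feats = lc.groupby("object_id").apply(features)
```

For pure white noise η is close to 2, while a smooth periodic signal drives it well below 1, so even this two-number summary already separates the constant pixel from the variable one.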

7.1 Zwindromeda - Pipeline

The data processing pipeline will be stored on GitHub at https://github.com/PhiTheta/zwindromeda

Anaconda: https://www.anaconda.com/
Pandas: https://pandas.pydata.org/
NumPy: https://numpy.org/
Matplotlib: https://matplotlib.org/
Astropy: https://www.astropy.org/
Photutils: https://photutils.readthedocs.io/en/stable/

References

[1] Adam G. Riess et al. “Observational Evidence from Supernovae for an Accelerating Universe and a Cosmological Constant”. In: AJ 116.3 (Sept. 1998), pp. 1009–1038. doi: 10.1086/300499. arXiv: astro-ph/9805201 [astro-ph].
[2] S. Perlmutter et al. “Measurements of Ω and Λ from 42 High-Redshift Supernovae”. In: ApJ 517.2 (June 1999), pp. 565–586. doi: 10.1086/307221. arXiv: astro-ph/9812133 [astro-ph].
[3] Gerson Goldhaber. “The Acceleration of the Expansion of the Universe: A Brief Early History of the Supernova Cosmology Project (SCP)”. In: American Institute of Physics Conference Series. Ed. by David B. Cline. Vol. 1166. Sept. 2009, pp. 53–72. doi: 10.1063/1.3232196. arXiv: 0907.3526 [astro-ph.CO].
[4] J. A. Frieman, M. S. Turner, and D. Huterer. “Dark energy and the accelerating universe.” In: ARA&A 46 (Sept. 2008), pp. 385–432. doi: 10.1146/annurev.astro.46.060407.145243. arXiv: 0803.0982 [astro-ph].
[5] P. J. Peebles and Bharat Ratra. “The cosmological constant and dark energy”. In: Reviews of Modern Physics 75.2 (Apr. 2003), pp. 559–606. doi: 10.1103/RevModPhys.75.559. arXiv: astro-ph/0207347 [astro-ph].
[6] Fred C. Adams and Gregory Laughlin. “A dying universe: the long-term fate and evolution of astrophysical objects”. In: Reviews of Modern Physics 69.2 (Apr. 1997), pp. 337–372. doi: 10.1103/RevModPhys.69.337. arXiv: astro-ph/9701131 [astro-ph].
[7] E. P. Hubble. Realm of the Nebulae. 1936.
[8] Laurence A. Marschall. The supernova story. 1988.
[9] Francis R. Johnson, Sanford V. Larkey, and Thomas Digges. “Thomas Digges, the Copernican System, and the Idea of the Infinity of the Universe in 1576”. In: The Huntington Library Bulletin 5 (1934), pp. 69–117. issn: 19350708. url: http://www.jstor.org/stable/3818095.
[10] Igor D. Karachentsev et al. “A Catalog of Neighboring Galaxies”. In: AJ 127.4 (Apr. 2004), pp. 2031–2068. doi: 10.1086/382905.


[11] Charles Messier. Catalogue des Nébuleuses et des Amas d'Étoiles (Catalog of Nebulae and Star Clusters). Tech. rep. Jan. 1781, pp. 227–267.
[12] William Herschel. “Catalogue of One Thousand New Nebulae and Clusters of Stars. By William Herschel, LL.D. F. R. S.” In: Philosophical Transactions of the Royal Society of London Series I 76 (Jan. 1786), pp. 457–499.
[13] William Herschel. “Catalogue of a Second Thousand of New Nebulae and Clusters of Stars; With a Few Introductory Remarks on the Construction of the Heavens. By William Herschel, LL.D. F. R. S.” In: Philosophical Transactions of the Royal Society of London Series I 79 (Jan. 1789), pp. 212–255.
[14] William Herschel. “Catalogue of 500 New Nebulae, Nebulous Stars, Planetary Nebulae, and Clusters of Stars; With Remarks on the Construction of the Heavens”. In: Philosophical Transactions of the Royal Society of London Series I 92 (Jan. 1802), pp. 477–528.
[15] John Frederick William Herschel. “A General Catalogue of Nebulae and Clusters of Stars”. In: Philosophical Transactions of the Royal Society of London Series I 154 (Jan. 1864), pp. 1–137.
[16] J. L. E. Dreyer. “A New General Catalogue of Nebulæ and Clusters of Stars, being the Catalogue of the late Sir John F. W. Herschel, Bart, revised, corrected, and enlarged”. In: MmRAS 49 (Jan. 1888), p. 1.
[17] Henrietta S. Leavitt. “1777 variables in the Magellanic Clouds”. In: Annals of Harvard College Observatory 60 (Jan. 1908), pp. 87–108.3.
[18] Henrietta S. Leavitt and Edward C. Pickering. “Periods of 25 Variable Stars in the Small Magellanic Cloud.” In: Harvard College Observatory Circular 173 (Mar. 1912), pp. 1–3.
[19] J. S. Gallagher and S. Starrfield. “Theory and observations of classical novae.” In: ARA&A 16 (Jan. 1978), pp. 171–214. doi: 10.1146/annurev.aa.16.090178.001131.
[20] Mario Livio. “Are classical novae and dwarf novae the same systems?” In: Comments on Astrophysics 12 (Jan. 1987), pp. 87–97.
[21] John Michell. “On the Means of Discovering the Distance, Magnitude, &c. of the Fixed Stars, in Consequence of the Diminution of the Velocity of Their Light, in Case Such a Diminution Should be Found to Take Place in any of Them, and Such Other Data Should be Procured from Observations, as Would be Farther Necessary for That Purpose. By the Rev. John Michell, B. D. F. R. S. In a Letter to Henry Cavendish, Esq. F. R. S. and A. S.” In: Philosophical Transactions of the Royal Society of London Series I 74 (Jan. 1784), pp. 35–57.
[22] Arthur S. Eddington. “Survey of the Problem”. In: The Internal Constitution of the Stars. Cambridge Science Classics. Cambridge University Press, 1988, pp. 1–26. doi: 10.1017/CBO9780511600005.003.
[23] Pierre-Simon Marquis de Laplace. Exposition du système du monde. 1798.

[24] S. W. Hawking and G. F. R. Ellis. The large-scale structure of space-time. 1973.
[25] Isaac Newton. Opticks. Dover Press, 1704.
[26] Jaume Giné. “On the origin of the deflection of light”. In: Chaos Solitons and Fractals 35.1 (Jan. 2008), pp. 1–6. doi: 10.1016/j.chaos.2007.06.097. arXiv: physics/0512121 [physics.gen-ph].
[27] H. von Klüber. “The determination of Einstein's light-deflection in the gravitational field of the sun”. In: Vistas in Astronomy 3.1 (Jan. 1960), pp. 47–77. doi: 10.1016/0083-6656(60)90005-2.
[28] T. Sauer. “Nova Geminorum 1912 and the origin of the idea of gravitational lensing”. In: Archive for History of Exact Sciences 62.1 (Jan. 2008), pp. 1–22. arXiv: 0704.0963 [physics.hist-ph].
[29] F. W. Dyson, A. S. Eddington, and C. Davidson. “A Determination of the Deflection of Light by the Sun's Gravitational Field, from Observations Made at the Total Eclipse of May 29, 1919”. In: Philosophical Transactions of the Royal Society of London Series A 220 (Jan. 1920), pp. 291–333. doi: 10.1098/rsta.1920.0009.
[30] O. Chwolson. “Über eine mögliche Form fiktiver Doppelsterne”. In: Astronomische Nachrichten 221 (June 1924), p. 329.
[31] Albert Einstein. “Lens-Like Action of a Star by the Deviation of Light in the Gravitational Field”. In: Science 84.2188 (Dec. 1936), pp. 506–507. doi: 10.1126/science.84.2188.506.
[32] F. Zwicky. “On the Probability of Detecting Nebulae Which Act as Gravitational Lenses”. In: Physical Review 51.8 (Apr. 1937), pp. 679–679. doi: 10.1103/PhysRev.51.679.
[33] F. Zwicky. “Nebulae as Gravitational Lenses”. In: Physical Review 51.4 (Feb. 1937), pp. 290–290. doi: 10.1103/PhysRev.51.290.
[34] F. Zwicky. “Die Rotverschiebung von extragalaktischen Nebeln”. In: Helvetica Physica Acta 6 (Jan. 1933), pp. 110–127.
[35] Heinz Andernach and Fritz Zwicky. “English and Spanish Translation of Zwicky's (1933) The Redshift of Extragalactic Nebulae”. In: arXiv e-prints, arXiv:1711.01693 (Nov. 2017). arXiv: 1711.01693 [astro-ph.IM].
[36] Sidney Liebes. “Gravitational Lenses”. In: Physical Review 133.3B (Feb. 1964), pp. 835–844. doi: 10.1103/PhysRev.133.B835.
[37] S. Refsdal. “The gravitational lens effect”. In: MNRAS 128 (Jan. 1964), p. 295. doi: 10.1093/mnras/128.4.295.
[38] B. Paczynski. “Gravitational Microlensing by the Galactic Halo”. In: ApJ 304 (May 1986), p. 1. doi: 10.1086/164140.
[39] B. Paczynski. “Gravitational Microlensing of the Galactic Bulge Stars”. In: ApJ 371 (Apr. 1991), p. L63. doi: 10.1086/186003.

[40] Shude Mao and Bohdan Paczynski. “Gravitational Microlensing by Double Stars and Planetary Systems”. In: ApJ 374 (June 1991), p. L37. doi: 10.1086/186066.
[41] Bohdan Paczynski. “Gravitational Microlensing in the Local Group”. In: ARA&A 34 (Jan. 1996), pp. 419–460. doi: 10.1146/annurev.astro.34.1.419. arXiv: astro-ph/9604011 [astro-ph].
[42] J. Richard Gott III and James E. Gunn. “The Double Quasar 1548+115a,b as a Gravitational Lens”. In: ApJ 190 (June 1974), p. L105. doi: 10.1086/181517.
[43] D. Walsh, R. F. Carswell, and R. J. Weymann. “0957+561 A, B: twin quasistellar objects or gravitational lens?” In: Nature 279 (May 1979), pp. 381–384. doi: 10.1038/279381a0.
[44] J. N. Hewitt et al. “Unusual radio source MG1131+0456: a possible Einstein ring”. In: Nature 333.6173 (June 1988), pp. 537–540. doi: 10.1038/333537a0.
[45] C.-H. Lee et al. “The Wendelstein Calar Alto Pixellensing Project (WeCAPP): the M 31 nova catalogue”. In: A&A 537, A43 (Jan. 2012). doi: 10.1051/0004-6361/201117068. arXiv: 1109.6573 [astro-ph.GA].
[46] C.-H. Lee et al. “Microlensing events from the 11-year Observations of the Wendelstein Calar Alto Pixellensing Project”. In: ApJ 806.2, 161 (June 2015), p. 161. doi: 10.1088/0004-637X/806/2/161. arXiv: 1504.07246 [astro-ph.GA].
[47] M. Aurière et al. “A Short-Timescale Candidate Microlensing Event in the POINT-AGAPE Pixel Lensing Survey of M31”. In: ApJ 553.2 (June 2001), pp. L137–L140. doi: 10.1086/320681. arXiv: astro-ph/0102080 [astro-ph].
[48] Eamonn Kerins. “Pixel Lensing Towards Andromeda: the POINT-AGAPE Survey”. In: Identification of Dark Matter. Ed. by Neil J. C. Spooner and Vitaly Kudryavtsev. Jan. 2001, pp. 269–274. doi: 10.1142/9789812811363_0029.
[49] Y. Tsapras et al. “The POINT-AGAPE survey: comparing automated searches of microlensing events towards M31”. In: MNRAS 404.2 (May 2010), pp. 604–628. doi: 10.1111/j.1365-2966.2010.16321.x. arXiv: 0912.2696 [astro-ph.CO].
[50] J. Fliri and D. Valls-Gabaud. “First results from the POMME survey of M31”. In: Ap&SS 341.1 (Sept. 2012), pp. 57–64. doi: 10.1007/s10509-012-1079-5.
[51] R. Savalle et al. “POMME: Exploring Time Domain Astronomy in the Andromeda Galaxy”. In: Astronomical Data Analysis Software and Systems XXIV (ADASS XXIV). Ed. by A. R. Taylor and E. Rosolowsky. Vol. 495. Astronomical Society of the Pacific Conference Series. 2015, p. 219.
[52] A. Udalski et al. “The Optical Gravitational Lensing Experiment”. In: Acta Astron. 42 (Oct. 1992), pp. 253–284.
[53] C.-H. Lee et al. “PAndromeda—First Results from the High-cadence Monitoring of M31 with Pan-STARRS 1”. In: AJ 143.4, 89 (Apr. 2012), p. 89. doi: 10.1088/0004-6256/143/4/89. arXiv: 1109.6320 [astro-ph.GA].

[54] J. T. A. de Jong et al. “MACHOs in M 31? Absence of evidence but not evidence of absence”. In: A&A 446.3 (Feb. 2006), pp. 855–875. doi: 10.1051/0004-6361:20053812. arXiv: astro-ph/0507286 [astro-ph].
[55] C. Alcock et al. “The MACHO Project Large Magellanic Cloud Microlensing Results from the First Two Years and the Nature of the Galactic Dark Halo”. In: ApJ 486.2 (Sept. 1997), pp. 697–726. doi: 10.1086/304535. arXiv: astro-ph/9606165 [astro-ph].
[56] Y. C. Joshi et al. “First microlensing candidate towards M 31 from the Nainital Microlensing Survey”. In: A&A 433.3 (Apr. 2005), pp. 787–795. doi: 10.1051/0004-6361:20042357. arXiv: astro-ph/0412550 [astro-ph].
[57] S. Calchi Novati et al. “The M31 Pixel Lensing PLAN Campaign: MACHO Lensing and Self-lensing Signals”. In: ApJ 783.2, 86 (Mar. 2014), p. 86. doi: 10.1088/0004-637X/783/2/86. arXiv: 1401.2989 [astro-ph.GA].
[58] C. Afonso. “Discovery and photometry of the binary-lensing caustic-crossing event EROS-BLG-2000-5”. In: arXiv e-prints, astro-ph/0303647 (Mar. 2003). arXiv: astro-ph/0303647 [astro-ph].
[59] I. A. Bond et al. “Real-time difference imaging analysis of MOA Galactic bulge observations during 2000”. In: MNRAS 327.3 (Nov. 2001), pp. 868–880. doi: 10.1046/j.1365-8711.2001.04776.x. arXiv: astro-ph/0102181 [astro-ph].
[60] Cheongho Han, Seong-Hong Park, and Jang-Hae Jeong. “Chromaticity of gravitational microlensing events”. In: MNRAS 316.1 (July 2000), pp. 97–102. doi: 10.1046/j.1365-8711.2000.03485.x. arXiv: astro-ph/9911375 [astro-ph].
[61] Event Horizon Telescope Collaboration et al. “First M87 Event Horizon Telescope Results. I. The Shadow of the Supermassive Black Hole”. In: ApJ 875.1, L1 (Apr. 2019), p. L1. doi: 10.3847/2041-8213/ab0ec7. arXiv: 1906.11238 [astro-ph.GA].
[62] Andrew Gould and Abraham Loeb. “Discovering Planetary Systems through Gravitational Microlenses”. In: ApJ 396 (Sept. 1992), p. 104. doi: 10.1086/171700.
[63] G. Ingrosso et al. “Pixel lensing as a way to detect extrasolar planets in M31”. In: MNRAS 399.1 (Oct. 2009), pp. 219–228. doi: 10.1111/j.1365-2966.2009.15184.x. arXiv: 0906.1050 [astro-ph.SR].
[64] Frank Drake. “Stars as gravitational lenses.” In: IAU Colloq. 99: Bioastronomy - The Next Steps. Ed. by George Marx. Vol. 144. Astrophysics and Space Science Library. 1988, pp. 391–394. doi: 10.1007/978-94-009-2959-3_64.
[65] Mission to exploit the gravitational lens of the sun for astrophysics and SETI. Oct. 1993.
[66] Richard Factor. “Gravitational Lensing Extends SETI Range”. In: Searching for Extraterrestrial Intelligence. Ed. by H. Paul Shuch. 2011, p. 227. doi: 10.1007/978-3-642-13196-7_13.

[67] Sohrab Rahvar. “Gravitational Microlensing Events as a Target for the SETI project”. In: ApJ 828.1, 19 (Sept. 2016), p. 19. doi: 10.3847/0004-637X/828/1/19. arXiv: 1509.05504 [astro-ph.IM].
[68] Konrad Kuijken, Xavier Siemens, and Tanmay Vachaspati. “Microlensing by cosmic strings”. In: MNRAS 384.1 (Feb. 2008), pp. 161–164. doi: 10.1111/j.1365-2966.2007.12663.x. arXiv: 0707.2971 [astro-ph].
[69] P. Baillon et al. “Detection of brown dwarfs by the micro-lensing of unresolved stars.” In: A&A 277 (Sept. 1993), pp. 1–9. arXiv: astro-ph/9211002 [astro-ph].
[70] V. R. Eshleman. “Gravitational Lens of the Sun: Its Potential for Observations and Communications over Interstellar Distances”. In: Science 205.4411 (Sept. 1979), pp. 1133–1135. doi: 10.1126/science.205.4411.1133.
[71] Michael Hippke. “Interstellar communication. II. Application to the solar gravitational lens”. In: Acta Astronautica 142 (Jan. 2018), pp. 64–74. doi: 10.1016/j.actaastro.2017.10.022. arXiv: 1706.05570 [astro-ph.EP].
[72] Nathan Cohen. “The gravitational lens as an intergalactic communication tool”. In: Acta Astronautica 26.3 (Jan. 1992), pp. 249–251. doi: 10.1016/0094-5765(92)90106-S.
[73] Slava G. Turyshev and Viktor T. Toth. “Imaging extended sources with the solar gravitational lens”. In: Phys. Rev. D 100.8, 084018 (Oct. 2019), p. 084018. doi: 10.1103/PhysRevD.100.084018. arXiv: 1908.01948 [gr-qc].
[74] Shuji Deguchi and William D. Watson. “Electron Scintillation in Gravitationally Lensed Images of Astrophysical Radio Sources”. In: ApJ 315 (Apr. 1987), p. 440. doi: 10.1086/165149.
[75] T. Naderi, A. Mehrabi, and S. Rahvar. “Primordial black hole detection through diffractive microlensing”. In: Phys. Rev. D 97.10, 103507 (May 2018), p. 103507. doi: 10.1103/PhysRevD.97.103507. arXiv: 1711.06312 [astro-ph.CO].
[76] P. W. Hodge. “The Andromeda galaxy”. In: Scientific American 244 (Jan. 1981), pp. 92–97. doi: 10.1038/scientificamerican0181-92.
[77] J. Scheiner. “On the spectrum of the great nebula in Andromeda.” In: ApJ 9 (Mar. 1899), pp. 149–150. doi: 10.1086/140564.
[78] V. M. Slipher. “The radial velocity of the Andromeda Nebula”. In: Lowell Observatory Bulletin 1 (Jan. 1913), pp. 56–57.
[79] Harlow Shapley and Heber D. Curtis. “The Scale of the Universe”. In: Bulletin of the National Research Council 2.11 (May 1921), pp. 171–217.
[80] G. E. Christianson. Edwin Hubble. Mariner of the nebulae. 1997.
[81] E. P. Hubble. “Extragalactic nebulae.” In: ApJ 64 (Dec. 1926), pp. 321–369. doi: 10.1086/143018.
[82] Knut Lundmark. “The determination of the curvature of space-time in de Sitter's world”. In: MNRAS 84 (June 1924), pp. 747–770. doi: 10.1093/mnras/84.9.747.

[83] E. P. Hubble. “A spiral nebula as a stellar system, Messier 31.” In: ApJ 69 (Mar. 1929), pp. 103–158. doi: 10.1086/143167.
[84] Gerard de Vaucouleurs et al. Third Reference Catalogue of Bright Galaxies. 1991.
[85] Alexia R. Lewis et al. “The Panchromatic Hubble Andromeda Treasury. XI. The Spatially Resolved Recent Star Formation History of M31”. In: ApJ 805.2, 183 (June 2015), p. 183. doi: 10.1088/0004-637X/805/2/183. arXiv: 1504.03338 [astro-ph.GA].
[86] B. T. Draine et al. “Andromeda's Dust”. In: ApJ 780.2, 172 (Jan. 2014), p. 172. doi: 10.1088/0004-637X/780/2/172. arXiv: 1306.2304 [astro-ph.CO].
[87] T. J. Cox and Abraham Loeb. “The collision between the Milky Way and Andromeda”. In: MNRAS 386.1 (May 2008), pp. 461–474. doi: 10.1111/j.1365-2966.2008.13048.x. arXiv: 0705.1170 [astro-ph].
[88] ZTF/D. Goldstein and R. Hurt (Caltech). https://www.ztf.caltech.edu/image/andromeda. Last accessed on 21 January 2020.
[89] W. Baade. “The Resolution of Messier 32, NGC 205, and the Central Region of the Andromeda Nebula.” In: ApJ 100 (Sept. 1944), p. 137. doi: 10.1086/144650.
[90] Vera C. Rubin and W. Kent Ford Jr. “Rotation of the Andromeda Nebula from a Spectroscopic Survey of Emission Regions”. In: ApJ 159 (Feb. 1970), p. 379. doi: 10.1086/150317.
[91] Alan Dressler and Douglas O. Richstone. “Stellar Dynamics in the Nuclei of M31 and M32: Evidence for Massive Black Holes”. In: ApJ 324 (Jan. 1988), p. 701. doi: 10.1086/165930.
[92] J. N. Heasley. “Point-Spread Function Fitting Photometry”. In: Precision CCD Photometry. Ed. by Eric R. Craine, David L. Crawford, and Roy A. Tucker. Vol. 189. Astronomical Society of the Pacific Conference Series. 1999, p. 56.
[93] Paul L. Schechter, Mario Mateo, and Abhijit Saha. “DoPHOT, A CCD Photometry Program: Description and Tests”. In: PASP 105 (Nov. 1993), p. 1342. doi: 10.1086/133316.
[94] C. Alard. “Analysis of Microlensing Observations Using Purely Differential Methods”. In: Cosmological Physics with Gravitational Lensing. Ed. by J. Tran Thanh Van, Yannick Mellier, and Marc Moniez. Jan. 2001, p. 69.
[95] Willard S. Boyle. “Nobel Lecture: CCD—An extension of man's view”. In: Reviews of Modern Physics 82.3 (July 2010), pp. 2305–2306. doi: 10.1103/RevModPhys.82.2305.
[96] Eric C. Bellm et al. “The Zwicky Transient Facility: System Overview, Performance, and First Results”. In: PASP 131.995 (Jan. 2019), p. 018002. doi: 10.1088/1538-3873/aaecbe. arXiv: 1902.01932 [astro-ph.IM].

[97] Carlos Rodrigo, Enrique Solano, and Amelia Bayo. SVO Filter Profile Service Version 1.0. Tech. rep. Oct. 2012, p. 1015. doi: 10.5479/ADS/bib/2012ivoa.rept.1015R.
[98] Zeljko Ivezic, Lynne Jones, and Robert Lupton. The LSST Photon Rates and SNR Calculations, v1.2. http://faculty.washington.edu/ivezic/Teaching/Astr511/LSST_SNRdoc.pdf. Last accessed on 21 January 2020. 2010.
[99] G. S. Da Costa. “Basic Photometry Techniques”. In: Astronomical CCD Observing and Reduction Techniques. Ed. by Steve B. Howell. Vol. 23. Astronomical Society of the Pacific Conference Series. 1992, p. 90.
[100] Julianne J. Dalcanton et al. “The Panchromatic Hubble Andromeda Treasury”. In: ApJS 200.2, 18 (June 2012), p. 18. doi: 10.1088/0067-0049/200/2/18. arXiv: 1204.0010 [astro-ph.CO].
[101] Benjamin F. Williams et al. “The Panchromatic Hubble Andromeda Treasury. X. Ultraviolet to Infrared Photometry of 117 Million Equidistant Stars”. In: ApJS 215.1, 9 (Nov. 2014), p. 9. doi: 10.1088/0067-0049/215/1/9. arXiv: 1409.0899 [astro-ph.GA].
[102] Arne Rau et al. “Exploring the Optical Transient Sky with the Palomar Transient Factory”. In: PASP 121.886 (Dec. 2009), p. 1334. doi: 10.1086/605911. arXiv: 0906.5355 [astro-ph.CO].
[103] Nicholas M. Law et al. “The Palomar Transient Factory: System Overview, Performance, and First Results”. In: PASP 121.886 (Dec. 2009), p. 1395. doi: 10.1086/648598. arXiv: 0906.5350 [astro-ph.IM].
[104] R. G. Harrington. “The 48-inch Schmidt-type Telescope at Palomar Observatory”. In: PASP 64.381 (Dec. 1952), p. 275. doi: 10.1086/126494.
[105] Eric C. Bellm. “Volumetric Survey Speed: A Figure of Merit for Transient Surveys”. In: PASP 128.966 (Aug. 2016), p. 084501. doi: 10.1088/1538-3873/128/966/084501. arXiv: 1605.02081 [astro-ph.IM].
[106] E. Bellm. “The Zwicky Transient Facility”. In: The Third Hot-wiring the Transient Universe Workshop. Ed. by P. R. Wozniak et al. Jan. 2014, pp. 27–33. arXiv: 1410.8185 [astro-ph.IM].
[107] Eric Christopher Bellm, Shrinivas R. Kulkarni, and ZTF Collaboration. “The Zwicky Transient Facility”. In: American Astronomical Society Meeting Abstracts #225. Vol. 225. Jan. 2015, p. 328.04.
[108] Frank J. Masci et al. The ZTF Science Data System (ZSDS) Explanatory Supplement - Pipelines, Definitions, Data Products, Access & Usage. https://irsa.ipac.caltech.edu/data/ZTF/docs/ztf_pipelines_deliverables.pdf. Version 4.0. Dec. 2019.

[109] Christophe Pradal, Gaël Varoquaux, and Hans Petter Langtangen. “Publishing scientific software matters”. In: Journal of Computational Science 4.5 (2013), pp. 311–312. doi: 10.1016/j.jocs.2013.08.001. url: https://hal.inria.fr/hal-00858663.
[110] Jacob E. Simones et al. “The Panchromatic Hubble Andromeda Treasury. VI. The Reliability of Far-ultraviolet Flux as a Star Formation Tracer on Subkiloparsec Scales”. In: ApJ 788.1, 12 (June 2014), p. 12. doi: 10.1088/0004-637X/788/1/12. arXiv: 1404.4981 [astro-ph.GA].
[111] Yuhan Yao et al. “ZTF Early Observations of Type Ia Supernovae. I. Properties of the 2018 Sample”. In: ApJ 886.2, 152 (Dec. 2019), p. 152. doi: 10.3847/1538-4357/ab4cf5. arXiv: 1910.02967 [astro-ph.HE].
[112] Daniel Godines. LIA - Lens Identification Algorithm. https://github.com/dgodinez77/LIA. Version 1.0. Jan. 2019. doi: 10.5281/zenodo.2541465.
[113] Joseph W. Richards et al. “On Machine-learned Classification of Variable Stars with Sparse and Noisy Time-series Data”. In: ApJ 733.1, 10 (May 2011), p. 10. doi: 10.1088/0004-637X/733/1/10. arXiv: 1101.1959 [astro-ph.IM].
[114] Astropy Collaboration et al. “The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package”. In: AJ 156.3, 123 (Sept. 2018), p. 123. doi: 10.3847/1538-3881/aabc4f. arXiv: 1801.02634 [astro-ph.IM].
[115] K. V. Sokolovsky et al. “Comparative performance of selected variability detection techniques in photometric time series data”. In: MNRAS 464.1 (Jan. 2017), pp. 274–292. doi: 10.1093/mnras/stw2262. arXiv: 1609.01716 [astro-ph.IM].
[116] Ilya N. Pashchenko, Kirill V. Sokolovsky, and Panagiotis Gavras. “Machine learning search for variable stars”. In: MNRAS 475.2 (Apr. 2018), pp. 2326–2343. doi: 10.1093/mnras/stx3222. arXiv: 1710.07290 [astro-ph.IM].
[117] A. Z. Bonanos et al. “The Hubble Catalog of Variables (HCV)”. In: A&A 630, A92 (Oct. 2019). doi: 10.1051/0004-6361/201936026. arXiv: 1909.10757 [astro-ph.SR].
[118] Nelson Caldwell et al. “Star Clusters in M31. I. A Catalog and a Study of the Young Clusters”. In: AJ 137.1 (Jan. 2009), pp. 94–110. doi: 10.1088/0004-6256/137/1/94. arXiv: 0809.5283 [astro-ph].
[119] Monika D. Soraisam et al. “Variability of Massive Stars in M31 from the Palomar Transient Factory”. In: ApJ 893.1, 11 (Apr. 2020), p. 11. doi: 10.3847/1538-4357/ab7b7b. arXiv: 1908.02439 [astro-ph.SR].

I do not know what I may appear to the world, but to myself I seem to have been only like a boy playing on the sea-shore, and diverting myself in now and then finding a smoother pebble or a prettier shell than ordinary, whilst the great ocean of truth lay all undiscovered before me.

– Isaac Newton