Universiteit van Amsterdam

MSc Artificial Intelligence Master Thesis

Characterizing Transits and Stellar Activity with Scalable Gaussian Processes

by Victoria Foing 11773391

October 17, 2020

36 ECTS February 2020 - October 2020

Main Supervisor: Ana M. Heras (European Space Agency)
Co-Supervisor: Patrick Forré (University of Amsterdam)

Abstract

Exoplanets are planets orbiting stars other than our sun. The field of exoplanet science has boomed since the 1990s [1]. Space missions like CoRoT, Kepler and TESS, and ground-based observatories have collected millions of lightcurves and detected more than 3,200 exoplanets using photometry, an approach that involves measuring the brightness of a star over time and looking for periodic dips in the brightness caused by a planet repeatedly passing in front of the star. Once a transit is detected, astronomers seek to characterize the transit signal in the lightcurve to learn about properties of the planet. This can be challenging due to noise from stellar activity, namely rotational modulation patterns in the lightcurve caused by starspots and plages moving in and out of view as the host star rotates. In the past decade, Gaussian Processes have become popular for modelling both instrumental and astrophysical noise in lightcurves to aid the detection and characterization of exoplanet transits. Gaussian Processes are well-suited to this task because they are flexible and interpretable, and because probabilistic estimates of the model parameters can be obtained with Markov Chain Monte Carlo sampling. However, they have been held back by slow computation, which has precluded their application to the vast datasets provided by Kepler and TESS. Recently, a new software package, celerite, has introduced special kernel functions that are not only well-suited to model stellar activity but also scale linearly with the size of the dataset by taking advantage of semi-separable matrices [2]. In this work, we apply scalable Gaussian Process models using the software celerite in order to characterize exoplanet transits and stellar activity in Kepler and TESS lightcurves. By "characterizing", we mean the retrieval of accurate transit and rotation parameters, with a focus on the radius of the planet and the rotation period of the star.
We develop a pipeline for preprocessing the lightcurves, defining prior distributions using Physics techniques, building a Gaussian Process model with a transit mean function, a quasi-periodic Rotation kernel, and a noise kernel, and sampling from the model with Markov Chain Monte Carlo. To assess the benefits of jointly modelling the rotation and the transits, we compare the performance of three Gaussian Process models: RotGP (only rotation), ExoGP (only transit), and ExoRotGP (rotation and transit). The models are applied to 15 TESS stars and 9 Kepler stars with confirmed planets and rotation periods, and the parameter estimates obtained are compared to those from refereed publications. For the lightcurves classified as having unambiguous rotation periods by previous authors, our best GP models obtain rotation periods within one day of the reference rotation periods. For four of the TESS lightcurves classified as having ambiguous rotation periods by previous authors, the GP models are able to find rotation periods, outperforming analyses by other researchers that used traditional techniques. For the recovery of planet radii, our best GP models estimate the planet radii within one standard deviation of the reference radii for all Kepler planets and all but two TESS planets. Moreover, we highlight the importance of preprocessing decisions, demonstrating that the pipeline correction method can worsen the retrieval of the rotation period and that temporal binning can dramatically worsen the retrieval of planet radii. Finally, we discover that joint modelling improves rotation period estimates for a few stars, due to the extra noise kernel, but only marginally improves the planet radii estimates, suggesting that celerite's noise kernel can be effective for removing stellar activity. Our results further indicate that the joint model could be a promising tool for the characterization of small exoplanets, with sizes between Neptune and Earth.
Our method provides a solid basis for a future application and extension of the model to these objects.

Code: https://github.com/victoriafoing/ExoRotGP

Acknowledgements

I would like to express my sincere gratitude to my supervisor at the European Space Agency, Ana M. Heras, for her wonderful guidance and excellent support throughout the research project. I would also like to thank my supervisor at the University of Amsterdam, Patrick Forré, for the great feedback about the Artificial Intelligence side of the project. I would like to thank my family for their unwavering support and advice, in particular Bernard Foing. Furthermore, I would like to thank the authors of the celerite software, which I used throughout the project, in particular Dan Foreman-Mackey, who was very helpful in answering my questions and guiding me in the right direction. Lastly, I would like to thank my boyfriend Adam, my friends, and my classmates in the AI master's program for teaching me a lot throughout the program and making this experience very enjoyable.

Contents

1 Executive Summary

2 Introduction
  2.1 Problem Statement
  2.2 Motivation
  2.3 Research questions
  2.4 History of Exoplanet Science
    2.4.1 Exoplanet Transits
    2.4.2 Stellar activity
  2.5 Challenges
    2.5.1 Detection of Signals
    2.5.2 Noise: photon, instrumental, astrophysical
    2.5.3 Stellar Limb Darkening

3 Literature Review
  3.1 Traditional Methods in Physics
    3.1.1 Transit Fitting
    3.1.2 Recovering Rotation Periods
  3.2 Applications in Machine Learning
    3.2.1 Gaussian Processes
  3.3 Position in the State of the Art

4 Methodology
  4.1 Bayesian Inference
  4.2 Gaussian Processes
  4.3 Celerite
    4.3.1 Stochastically-Driven Damped Harmonic Oscillator (SHO) Kernel
    4.3.2 Rotation Kernel
    4.3.3 Semi-Separable Matrices
  4.4 Transit models
  4.5 Markov Chain Monte Carlo

5 Algorithm
  5.1 Algorithm
    5.1.1 Initial estimates
    5.1.2 Three Models: ExoGP, RotGP, ExoRotGP
    5.1.3 Sampling

6 Experiments
  6.1 Data
  6.2 Selection of Targets
  6.3 Preprocessing
  6.4 Experimental Design
  6.5 Experimental Evaluation

7 Results and Discussion
  7.1 Simulated Experiment
  7.2 Rotation Periods
    7.2.1 Kepler
    7.2.2 TESS
    7.2.3 Observation Window
  7.3 Transit Characterization
    7.3.1 Kepler
    7.3.2 TESS
  7.4 Computation Time
  7.5 Discussion
    7.5.1 Kepler vs. TESS
    7.5.2 How could our method be improved?
    7.5.3 Challenges

8 Conclusion and Future Work
  8.1 Contributions
  8.2 Future work
  8.3 Concluding Remarks

References

Appendix A
  A.1 Summary Statistics
  A.2 Power spectrum of Celerite kernel and SHO
  A.3 Example of Document Outputted by Pipeline: WASP-173 A

CHAPTER 1 Executive Summary

In this work, we apply scalable Gaussian Process models using the software celerite in order to characterize exoplanet transits and stellar activity in Kepler and TESS lightcurves. Our goal is to use this machine learning method to obtain accurate characteristics of the planet and its orbit, and of the stellar light variations, with a focus on the radius of the planet and the rotation period of the star. We begin by presenting our research questions: (1) How accurately can we model the stellar activity with Gaussian processes in Kepler and TESS lightcurves? (2) How accurately can we model the exoplanet transits with Gaussian processes in Kepler and TESS lightcurves? (3) Does joint modelling improve the characterization of rotation and transit parameters in Kepler and TESS lightcurves?

In the motivation section, we outline the motivation for studying the transit and rotation signals in Kepler and TESS lightcurves: to help astronomers study and understand the planetary systems observed by these missions and to investigate one of the most fundamental questions in science: is our solar system unique? We further highlight how important accurate transit parameters are for deriving the properties of planets, which can give clues about their habitability. Modelling stellar activity can provide valuable insights about stellar physics (e.g. the evolution, age, and magnetic fields of stars) and the potential habitability of orbiting planets. Stellar activity models can also be used to detrend lightcurves, which can improve the detection of smaller, Earth-sized planets, which are more likely to be habitable, and improve the characterization of transit parameters. Next, we explore the motivation for using Gaussian Processes to solve the problem of finding accurate transit and rotation parameters.
In the past decade, Gaussian Processes have become popular for modelling both instrumental and astrophysical noise in lightcurves because they are flexible and interpretable. However, they have been largely limited by slow computation speeds. Recently, a new software package, celerite, has introduced special kernel functions that are not only well-suited to model stellar activity but also scale linearly with the size of the dataset by taking advantage of semi-separable matrices [2].

We provide a brief summary of the history of exoplanet science and an introduction to the two concepts that we will study with our machine learning method: the exoplanet transits and the stellar activity. We discuss the boom in exoplanet science and the importance of space missions like Kepler and TESS, which have collected millions of lightcurves. Furthermore, we emphasize how the lightcurves produced by the Kepler and TESS missions can be studied to provide valuable insights about the detected exoplanets and their host stars. In particular, the transit signals in the lightcurves can be characterized to derive the radii of the planets, while the stellar activity can be characterized to infer the rotation periods of the stars.

In the literature review section, we outline traditional techniques in Physics for estimating rotation periods, such as the Lomb-Scargle periodogram, and transit models for fitting lightcurves. We give an overview of the machine learning literature on the application of Gaussian Processes to stellar lightcurve analysis and explain why they are well-suited to this problem. Finally, we define our position in the state of the art, highlighting that our approach is innovative in that we work with novel TESS data, explore the joint modelling of transit and stellar signals, and develop a general pipeline that can be applied to many targets.

In the method section, we explain the theory behind our approach.
We explain how Gaussian Processes work, how they can be used to model instrumental and stellar astrophysical noise, how they are used in a Bayesian setting to find parameter estimates, and how the models can be optimized. Following that, we break down the kernel functions of the software celerite and explain how they can be used to model stellar variability and why they are fast to compute [2]. We further describe the transit model we used to fit the transits and the various features that are used to define it: limb darkening, orbit types, and exposure time integration. Finally, we describe Markov Chain Monte Carlo sampling and how it can be used to derive uncertainty estimates from the Gaussian Process posterior [3].

In the algorithm section, we explain how we implemented the theory using the software exoplanet, pymc3, starry, and celerite [4][5][6], with the help of pseudocode examples. Essentially, we develop a pipeline for preprocessing the lightcurves, defining prior distributions using traditional Physics techniques, building a Gaussian Process model with a transit mean function and a quasi-periodic Rotation kernel, and sampling from the model with Markov Chain Monte Carlo. With certain arguments, this flexible pipeline can run experiments for three Gaussian Process models: RotGP (only rotation), ExoGP (only transit), and ExoRotGP (rotation and transit). We summarize the prior distributions that we used for our models and the mean functions and kernel functions used for each Gaussian Process model.

In the experiments section, we begin by giving an overview of the data from the Kepler and TESS missions and how they differ. Next, we describe the selection of targets and visualize them so the reader can refer to them when interpreting the results. We describe the preprocessing steps, in particular the lightcurve correction methods, and we set up the experiments for investigating the research questions. To assess the benefits of jointly modelling the rotation and the transits, we compare the performance of the three Gaussian Process models. We evaluate our method by first performing a simulated experiment, and then applying the models to 9 Kepler stars and 15 TESS stars. For the real data, we calculate the mean and quantiles of the parameter estimates and compare them to values from refereed publications. Moreover, we assess convergence of the model using the number of effective samples and the Gelman-Rubin statistic, and assess the efficiency of the model by looking at the computation time.

In the results and discussion section, we present the results of our experiments on simulated and real lightcurves. Our model is validated using a simulated experiment, where it finds the injected rotation period and the injected planet radius in a synthetic lightcurve.
We discuss how, with our GP models, we were able to estimate rotation periods within 1 day of the reference values for all the Kepler targets and for all the TESS targets with unambiguous rotation, with periods ranging from 1.2 days to 28.5 days. For four TESS lightcurves which were deemed to have “dubious rotation” or “ambiguous variability” by Martins et al., we find rotation periods, one of which is a new rotation period [7]. Our method was able to recover planet radii ranging from small super Earths (0.1 R_jup) to large Jupiters (1.5 R_jup). We highlight the decrease in performance when the correction method flattens the rotation signal and when temporal binning smears the transit curve. Moreover, we conclude that the joint ExoRotGP performs only marginally better than the modular RotGP and ExoGP models, at the expense of much longer MCMC computation time. Thus, we conclude that the joint model should only be used when there is a clear rotation signal present and it strongly interferes with the transit signal (i.e. when the stellar rotation period and the planet orbital period overlap, or when the planet is very small).

In the conclusion, we wrap up the contributions of the thesis, which are the answers to our research questions and the development of a software tool which automates multiple stages of data analysis: correcting lightcurves, detecting transits, defining prior distributions based on the data, modelling the stellar activity and exoplanet transits with a Gaussian Process, and sampling from the model with MCMC. For each star, our code outputs a document with figures and tables summarizing different stages of the pipeline. The tool we developed and the documents produced will be used by astronomy students for the analysis of more stars and planets.
We have also published results at two conferences: the first, a virtual poster at the European Astronomical Society (EAS), and the second, a recorded talk at the Europlanet Science Congress (EPSC) (see Appendix).

CHAPTER 2 Introduction

2.1 Problem Statement

“Three decades ago, astronomers could not say reliably whether there were planets around other stars”, and now we know that “the universe is home to more planets than stars, with billions of potentially habitable planets just in our own galaxy” [8]. Since the first exoplanet discovery in 1992, the field of exoplanet science has boomed and over 4,000 exoplanets have been discovered [1]. These discoveries have been facilitated by improved instrumentation for ground-based telescope observations and the launch of exoplanet-hunting missions, which have collected vast amounts of lightcurve data. Lightcurves are time series which measure the brightness ("flux") of a star over time and can be used to detect exoplanet transits: periodic dips in the brightness caused by a planet passing in front of the star as it orbits [9]. To date, the most successful exoplanet-hunting mission is Kepler/K2, which observed 530,000 stars and was responsible for discovering three quarters of the known exoplanets using the transit method [8]. In 2018, the baton was passed to a new space mission called the Transiting Exoplanet Survey Satellite (TESS). In contrast to the Kepler mission, TESS is performing a transit survey of the full sky and is observing 200,000 stars that are closer and brighter. To date, TESS has discovered 79 exoplanets and is expected to discover several thousand more [10]. The lightcurves produced by the Kepler and TESS missions can provide valuable insights about the detected exoplanets and their host stars. In particular, the transit signals in the lightcurves can be characterized to derive various properties of the planet, such as the radius, while the stellar activity in the lightcurves can be characterized to infer the rotation periods of the stars. There is a flood of astronomy data, and a need to develop reliable, scalable and automated methods for characterizing the stars and planets observed by these missions.
The characterization of exoplanet transits in lightcurves is a non-trivial problem that is plagued by instrumental and astrophysical noise. Thus, we need methods capable of modelling noise and disentangling signals. In the past decade, Gaussian Processes have become popular for modelling both instrumental and astrophysical noise in lightcurves because they are flexible and interpretable, but they have been largely limited by slow computation speeds. Recently, a new software package, celerite, has introduced special kernel functions that are not only well-suited to model stellar activity, but also scale linearly with the size of the dataset by taking advantage of semi-separable matrices. Few studies have worked with real TESS data and no studies have applied Gaussian Processes for modelling stellar activity in TESS data. Thus, the objective of this research is to jointly characterize exoplanet transits and stellar activity in TESS and Kepler lightcurves using scalable Gaussian Processes. Simply put, we want to demonstrate that this machine learning method can obtain accurate estimates of the transit and rotation parameters, with a focus on the planet radii and stellar rotation periods, and that it can be applied to many star-planet systems.

2.2 Motivation

The motivation for characterizing exoplanet transits and stellar activity is to help astronomers study and understand the planetary systems observed by these missions and to continue the "quest to end humankind’s cosmic loneliness" [8]. The study of exoplanets allows us to investigate one of the most fundamental questions in science: is our solar system unique? By studying exoplanets, we can learn about the evolution of other planetary systems and better understand the history of our own solar system and its place in the universe. Finding accurate transit parameters is important for deriving the properties of planets. When the radius and the mass of a planet are obtained, astronomers can calculate the density of the planet, which can provide insights about the composition and the atmosphere of the planet and whether it is habitable [11]. Modelling stellar activity can provide valuable insights about stellar physics (e.g. the evolution, age, and magnetic fields of stars) and the potential habitability of orbiting planets. According to Martins et al. (2020), it is valuable to study stellar rotation because it is "a fundamental observable driving stellar and planetary evolution, including planetary atmospheres and impacting on habitability conditions and the genesis of life around stars" [7]. Stellar activity models can also be used to detrend the lightcurves, which can improve the detection of Earth-sized planets, which are more likely to be habitable, and improve the characterization of their transit parameters.

The motivation for using scalable Gaussian Processes is that they are well-equipped to deal with the problem of characterizing stellar activity and exoplanet transits without the disadvantage of being limited to small datasets. First, they are non-parametric and flexible, allowing them to model both instrumental and astrophysical processes, which can help separate signals. Second, they provide probabilistic uncertainty estimates for parameters, which is useful for hierarchical modeling in astronomy (e.g. making "scientific inferences about a population" of stars and planets) [12].

Few studies have jointly modelled exoplanet transits and stellar activity. We explore the joint modelling of stellar activity, noise, and exoplanet transits, as it has been suggested to give better, more holistic results [13]. Typically, these are treated as two separate problems and one signal is removed to amplify the other. Rotation periods can be obtained by first removing the transit signals and then studying the parts of the lightcurve that are out of transit. Transit parameters can be obtained by first detrending the lightcurve to remove long-term astrophysical signals and then fitting the transit parameters. Though these approaches have proven to be successful, they remove information from the data which can be important for characterization. By jointly modelling the components, the model can work with all the data available and learn how the rotation and transit parameters influence each other, ultimately leading to better characterization.
Additionally, joint modelling permits the study of two objects at once: the planet and the star.

The motivation for applying this method to TESS lightcurves is that the data from the TESS mission is extremely novel. Few studies have worked with real TESS data and no studies have applied Gaussian Processes for modelling stellar activity in TESS data. This is because the data from the TESS mission is new, having been released in increments. This presents an opportunity to test new methods on the data and make new scientific discoveries which can be published. For example, the stellar rotation periods of the majority of the stars observed by TESS have not been established, and a method which can accurately derive them would be useful for studying the stars and detrending the lightcurves. The motivation for working with Kepler lightcurves is to validate and fine-tune our method. Though less novel, Kepler lightcurves are less noisy and have been studied extensively, so many reference radii and rotation periods are available in the literature for comparison.

The motivation for turning to machine learning is that there is a strong need for reliable, scalable and automated methods for analyzing these large datasets. The datasets provided by both Kepler and TESS are vast, and there will be many more missions launched in the near future. Manually searching through these datasets will be a challenge for astronomers. Few studies have worked on developing a generalizable method for transit characterization which can be applied to multiple planets. From a machine learning perspective, we want to develop a pipeline which can work on a handful of targets. The reduction in computation time offered by the scalable kernels introduced by the celerite software offers an opportunity to characterize multiple star-planet systems.
All in all, with the advent of new exoplanet-hunting missions and the flood of astronomy data, we are truly living at one of the peaks of exoplanet science. Our aim is to contribute to the field of astronomy during these exciting times, using artificial intelligence as a tool.

2.3 Research questions

To the best of our knowledge, no studies have applied Gaussian Process methods to TESS data for the task of modelling stellar activity and few studies have examined the joint modelling of exoplanet transits and stellar activity. Considering the novelty of the TESS data and the research gaps in the scientific literature, this thesis focuses on characterizing exoplanet transits and stellar activity in TESS and Kepler lightcurves with scalable Gaussian Processes. We propose three research questions:

• RQ1: How accurately can we model the stellar activity with Gaussian processes in Kepler and TESS lightcurves?
• RQ2: How accurately can we model the exoplanet transits with Gaussian processes in Kepler and TESS lightcurves?
• RQ3: Does joint modelling improve the characterization of rotation and transit parameters in Kepler and TESS lightcurves?

2.4 History of Exoplanet Science

We provide a brief summary of the history of exoplanet science and a brief overview of the two astronomy concepts that we will study with our machine learning method: the exoplanet transits and the stellar activity. We discuss the boom in exoplanet science and the importance of space missions like Kepler and TESS, which have collected millions of lightcurves. Furthermore, we emphasize how the lightcurves produced by the Kepler and TESS missions can be studied to provide valuable insights about the detected exoplanets and their host stars. In particular, the transit signals in the lightcurves can be characterized to derive the radii of the planets, while the stellar activity can be characterized to infer the rotation periods of the stars.

Figure 2.1: Total number of exoplanet detections from 1989 to 2016 by different methods. Source: NASA Exoplanet Archive [14]

The field of exoplanet science has boomed since the 1990s and has revolutionized our view of the universe. Since the first exoplanet discovery in 1992 (around a pulsar), over 4,000 exoplanets have been discovered. These discoveries have been facilitated by improved instrumentation for ground-based telescope observations and the launch of exoplanet-hunting missions, which have collected vast amounts of lightcurve data.

Kepler and TESS

The most successful mission was Kepler/K2, a space telescope in operation from 2009 to 2018. The Kepler mission observed 530,000 stars in a distant and narrow section of the sky using the transit method and was responsible for discovering three quarters of the known exoplanets, namely 2,662 exoplanets [8]. In 2018, the baton was passed to a new space mission called the Transiting Exoplanet Survey Satellite (TESS). In contrast to the Kepler mission, TESS is performing a transit survey of the full sky and is observing 200,000 stars that are closer and brighter. To date, TESS has discovered 79 exoplanets and is expected to discover several thousand more [10]. In the upcoming decade, several new missions will go into operation. Following the JWST, the PLAnetary Transits and Oscillations of stars (PLATO) mission will be launched in 2026. The PLATO mission will not only study exoplanets in the habitable zone of solar-like stars but also study the host stars to learn more about planetary systems [10].

2.4.1 Exoplanet Transits

Figure 2.2: The transit method measures the brightness of a star over time and checks for periodic dips in the brightness caused by an exoplanet passing in front of it. Source: PyPi Exotic [15]

Exoplanets are lightyears away and their signals can be very faint. Researchers have devised a variety of indirect methods to detect exoplanets from the ground and from space. The most successful method to date is the transit method, which measures the brightness of a star over time and checks if there is a periodic dip caused by an exoplanet transiting in front of it. Once a transit signal is detected, astronomers seek to characterize the signal as accurately as possible using transit parameters. Some transit parameters can be directly measured from the geometry of the transit: transit period (how often it appears), transit duration (how long it is visible), and transit depth (how much of the star’s light it blocks). Other, more complex parameters can be derived using physical equations and fitting models: the radius of the planet, the distance to the star, the eccentricity of the orbit, and the impact parameter, which indicates whether the planet crosses the center of the star or grazes its edge. Based on the transit depth, the radius of the planet can be determined if the radius of the star is known. Once the radius is found, the transit method is often followed up by the Radial Velocity method, which determines the mass of the planet and allows the density of the planet to be calculated [16].
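
To make the link between transit depth and planet radius concrete, the following is a minimal sketch. It assumes the simplest geometric approximation, in which the fractional transit depth equals the squared planet-to-star radius ratio, and it ignores limb darkening and grazing geometries. The function name and the example values are illustrative only and are not taken from the thesis pipeline.

```python
import math

# Nominal radii in kilometres (IAU nominal values).
R_SUN_KM = 695_700.0
R_JUP_KM = 71_492.0


def planet_radius_from_depth(depth, r_star_rsun):
    """Planet radius in km from a fractional transit depth.

    Assumes depth = (R_planet / R_star)**2, i.e. a dark planet crossing
    a uniformly bright stellar disk (no limb darkening, not grazing).
    `r_star_rsun` is the stellar radius in solar radii.
    """
    if not 0.0 <= depth < 1.0:
        raise ValueError("depth must be a fraction in [0, 1)")
    return math.sqrt(depth) * r_star_rsun * R_SUN_KM


# Hypothetical example: a 1% dip in the flux of a Sun-sized star.
r_km = planet_radius_from_depth(0.01, 1.0)
print(f"planet radius: {r_km:.0f} km = {r_km / R_JUP_KM:.2f} R_jup")
```

Because the depth scales with the square of the radius ratio, halving the planet radius reduces the depth by a factor of four, which is one reason small planets are much harder to characterize than Jupiter-sized ones.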

2.4.2 Stellar activity

Figure 2.3: Rotational modulation patterns in a lightcurve caused by a starspot on the surface of the star moving in and out of view as the star rotates. Source: Armagh Observatory [17]

Stellar activity refers to magnetic field activity on the surface of stars, shown by the emergence and decay of active regions such as starspots (dark spots), faculae (bright spots), and plages (bright regions). When the star rotates, these active regions move in and out of view, resulting in variations in the stellar flux called rotational modulation. Active regions vary in size, location, and lifetime, leading to a variety of complex rotational modulation patterns. If rotational modulation is treated as a wave, it can be described by the following parameters: rotation period, amplitude of rotation, and quality factor, which indicates how quickly the modulating signal dies out.
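
As an illustration of how period, amplitude, and quality factor shape a quasi-periodic signal, the sketch below evaluates the autocovariance of a single stochastically driven, underdamped harmonic oscillator. This is a simplified stand-in, assumed here for illustration; it is not the Rotation kernel used in the thesis models, and the function name and parameter values are hypothetical. The covariance is normalised so that its value at zero lag equals the squared amplitude.

```python
import numpy as np


def sho_autocovariance(tau, period, amplitude, quality):
    """Autocovariance of an underdamped, stochastically driven oscillator.

    tau       : time lag(s), same units as `period`
    period    : undamped oscillation period
    amplitude : standard deviation of the process, so k(0) = amplitude**2
    quality   : quality factor Q; must exceed 0.5 (underdamped regime)
    """
    if quality <= 0.5:
        raise ValueError("this sketch only covers the underdamped case Q > 1/2")
    tau = np.abs(np.asarray(tau, dtype=float))
    omega0 = 2.0 * np.pi / period
    eta = np.sqrt(1.0 - 1.0 / (4.0 * quality**2))
    # Exponential envelope: larger Q means slower loss of coherence.
    envelope = np.exp(-omega0 * tau / (2.0 * quality))
    oscillation = (np.cos(eta * omega0 * tau)
                   + np.sin(eta * omega0 * tau) / (2.0 * eta * quality))
    return amplitude**2 * envelope * oscillation


# Hypothetical 10-day rotation period, evaluated at a lag of one period.
for q in (1.0, 10.0, 100.0):
    k = sho_autocovariance(10.0, period=10.0, amplitude=1.0, quality=q)
    print(f"Q = {q:5.1f}: covariance after one period = {float(k):.3f}")
```

A high quality factor keeps the signal strongly correlated with itself one period later, as for long-lived starspots, whereas a low quality factor lets the correlation die out within roughly one rotation.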

2.5 Challenges

2.5.1 Detection of Signals

The problem of recovering accurate transit and rotation parameters is non-trivial and comes with many challenges. To begin with, the transit signal and/or the rotation signal must be detected in the data. An exoplanet can have a size ranging from smaller than the Earth to larger than Jupiter (which is 11.2 Earth radii) and the orbital period of an exoplanet can range from a few hours to thousands of years. A transit signal may not be detected if the signal is too faint (i.e. the planet is too small and/or the instrumental noise is too high) or if the transit does not appear in the observation window (i.e. the orbital period is too long). The situation is similar for the rotation signal, which can have a period ranging from shorter than one day to longer than 100 days. A rotation signal may not be detected in the data if the amplitude of rotation is too low (e.g. the starspots on the surface of the star are small) or if the rotation period is too long for the time frame studied. In general, rotation signals are hard to characterize because even though every star has a rotation period, a rotation signal is not always present or consistent. There may be no starspots, or many starspots covering both sides of the star, leading to the lack of a rotational modulation signal. The starspots are also constantly growing and shrinking, leading to changes in the pattern of rotational modulation. Furthermore, since stars are fluid and not solid, they exhibit differential rotation, which means that different parts of the surface of the star rotate at different velocities. In short, there is a huge diversity of cases if we consider the range of sizes and periods. This makes it difficult to develop a robust method for finding rotation and transit parameters that works for all stars.

2.5.2 Noise: photon, instrumental, astrophysical

The main challenge in detecting and characterizing signals in lightcurves, however, is the presence of noise in the data, which interferes with the signals. There are various kinds of noise: 1) photon noise, caused by varying amounts of photons being collected by the telescope at different intervals, 2) instrumental noise, caused by instability of the instrument (e.g. spacecraft drift or jumps), and 3) astrophysical noise (stellar activity), caused by the magnetic activity of the star. Noise is present in lightcurves and affects the retrieval of both transit and rotation parameters. For transit characterization, noise is a problem because it can mimic exoplanet signals, leading to false positives, and can also disguise exoplanet signals, in particular the signals of smaller Earth-sized planets with longer orbital periods. Ford (2016) argues that with the improvement of instruments, the main obstacle for transit characterization is stellar activity [18]. All in all, noise and the entanglement of many signals make it difficult to recover accurate transit and rotation parameters [19][20]. In our case the stellar activity can be either signal, when we want to detect it in the joint model to determine the rotation, or noise, when we want to eliminate it in the transit-only model.

2.5.3 Stellar Limb Darkening

(a) Limb Darkening (b) Effects of Limb Darkening on the shape of the Transit Curve (c) Limb Darkening Laws

Figure 2.4: Sources: Left: Richmond [21]; Middle: University of Toronto [22]; Right: Kreidberg [23]

In order to obtain accurate planetary parameters, researchers have to account for stellar limb darkening. Limb darkening is a phenomenon that causes the outer edges of the star (called the limb) to appear darker than the inner part of the disk. This means that from the perspective of the viewer, the light is not uniform across the surface of the star, the reason being that the light near the limb of the star is emitted from colder regions. Limb darkening affects the shape of the downwards and upwards slopes of the transit curve, resulting in a transit shape that is round instead of rectangular. There are several laws for capturing limb darkening effects in lightcurves (i.e. uniform, linear, quadratic, non-linear) which account for different transit shapes [22].

CHAPTER 3 Literature Review

In the literature review section, we outline the traditional techniques in Physics for estimating rotation periods and the traditional techniques for detecting and fitting transits. We give an overview of the machine learning literature on the application of Gaussian Processes to stellar lightcurve analysis and explain why they are well-suited to this problem. Finally, we define our position in the state of the art, highlighting that our approach is innovative in that we work with novel TESS data, explore the joint modelling of transit and stellar signals, and develop a general pipeline that can be applied to many targets.

3.1 Traditional Methods in Physics

3.1.1 Transit Fitting

There are several approaches in Physics which are used to detect and characterize transits in lightcurves.

Figure 3.1: Left: Transit Least Squares; Right: Box Least Squares; Source: https://github.com/hippke/tls

Box Least Squares (BLS)

Since 2002, the Box Least Squares (BLS) method has been commonly used to detect transit signals in lightcurves [24]. The BLS method detects transit signals by fitting a periodic box shape (or an "upside down top hat") to the data [25]. The box shape inherently describes four parameters: the orbital period (time between boxes), the duration of the transit (width of the box), the depth of the transit (height of the box), and the reference time (the first time the box appears) [25]. A disadvantage of the BLS method is that the box shape does not accurately capture the curved shape of a transit signal caused by limb darkening of the star.

Transit Least Squares (TLS)

In 2018, Hippke et al. introduce the Transit Least Squares (TLS) method, an improved version of BLS which uses a curved transit shape instead of a box shape to search for transits. The TLS method retrieves the limb darkening coefficients of the star from a catalog and then calculates the impact of the coefficients on the shape of the transit. As a result, TLS is more reliable than BLS for detecting transit signals and obtaining accurate transit parameters, especially for those belonging to smaller planets [26].


Transit Fitting Models

The derivation of parameters describing the geometry of the transit (i.e. depth, shape, duration) is a straightforward task. However, the derivation of other planet parameters (e.g. radius, impact parameter, orbital inclination) requires an advanced transit fitting model based on analytical solutions. Furthermore, in order to get an accurate estimate of the area of light that is blocked out by the planet transiting the star, limb-darkening laws need to be incorporated in the analytical solutions [27]. In 2002, Mandel and Agol presented "exact analytic formulae for the eclipse of a star described by quadratic or nonlinear limb darkening" [28]. In 2006, Gimenez et al. also present equations to compute transit lightcurves based on the "fractional radii of the planet and the parent star, the inclination of the orbit, and the limb-darkening coefficients of the star" [27]. Since then, many tools for modelling and fitting exoplanet light curves have been developed by the astronomy community which are based on the analytical solutions of Mandel or Gimenez. These include but are not limited to: batman (BAsic Transit Model cAlculatioN in Python) [29], pytransit [30], and starry [31].

3.1.2 Recovering Rotation Periods

There are several traditional methods in Physics for recovering rotation periods from lightcurves.

(a) Lomb-Scargle (LS) Periodogram (b) Autocorrelation Function (ACF)

Figure 3.2

Fourier Transform

The Fourier transform converts the lightcurve from the time domain to the frequency domain and analyzes the spectral characteristics of the lightcurve. When the lightcurve is converted to the frequency domain, the rotation period can be found by identifying peaks in the power spectrum.
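As a minimal sketch of this idea, the snippet below builds a synthetic, evenly sampled lightcurve and reads the period off the strongest peak of its power spectrum. The cadence, the injected 5-day signal, and the noise level are assumptions chosen for illustration only, not values taken from Kepler or TESS data.

```python
import numpy as np

cadence = 0.02                                    # days between samples (assumed)
t = np.arange(0.0, 40.0, cadence)
rng = np.random.default_rng(0)
flux = 1.0 + 0.01 * np.sin(2 * np.pi * t / 5.0) + 0.002 * rng.normal(size=t.size)

# Power spectrum of the mean-subtracted lightcurve.
power = np.abs(np.fft.rfft(flux - flux.mean())) ** 2
freqs = np.fft.rfftfreq(t.size, d=cadence)        # cycles per day

# Skip the zero-frequency bin and convert the strongest peak to a period.
peak = 1 + np.argmax(power[1:])
period_estimate = 1.0 / freqs[peak]
```

Note that the fast Fourier transform assumes even sampling, which is one reason the Lomb-Scargle periodogram is often preferred for real lightcurves with gaps.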

Lomb-Scargle (LS) Periodogram

The Lomb-Scargle (LS) periodogram is a method for "detecting and characterizing periodic signals in unevenly sampled data" [32]. This method can find the rotation period by using least squares to find the sinusoid that best fits the lightcurve.
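The least-squares idea can be sketched as follows: at each trial period, fit a sinusoid plus a constant offset to an unevenly sampled lightcurve and record how much of the variance the fit explains. The function name `ls_power`, the synthetic signal, and the period grid are assumptions made for this example. It is a simplified stand-in for a full Lomb-Scargle implementation, not the method used in the cited works.

```python
import numpy as np

def ls_power(t, y, periods):
    """Least-squares sinusoid fit at each trial period.

    Returns the fraction of variance explained by the best-fitting
    sinusoid (plus constant offset) for every period in `periods`.
    """
    y = y - y.mean()
    total = np.sum(y ** 2)
    power = np.empty(len(periods))
    for i, p in enumerate(periods):
        phase = 2.0 * np.pi * t / p
        # Design matrix: sine, cosine and constant offset.
        A = np.column_stack([np.sin(phase), np.cos(phase), np.ones_like(t)])
        coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coeffs
        power[i] = 1.0 - np.sum(resid ** 2) / total
    return power

rng = np.random.default_rng(0)
t = np.sort(rng.uniform(0.0, 27.0, 400))          # uneven sampling over 27 days
true_period = 5.0
y = 0.01 * np.sin(2 * np.pi * t / true_period) + 0.002 * rng.normal(size=t.size)

periods = np.linspace(1.0, 13.0, 1201)
best_period = periods[np.argmax(ls_power(t, y, periods))]
```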

Autocorrelation Function (ACF)

The autocorrelation function can find the rotation period by measuring the covariance between different time points in the lightcurve "at a distance or lag k, for all different values of k" [33].

\[ \mathrm{ACF}(k) = \mathrm{cov}(x_n, x_{n+k}) = E\left[(x_n - \mu)(x_{n+k} - \mu)\right] \]
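A small sketch of how the ACF can be turned into a period estimate is given below. It assumes an evenly sampled synthetic lightcurve, and the choice to skip the first 100 lags when searching for the peak is an ad hoc assumption of this example rather than part of the published methods.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocovariance of an evenly sampled series, normalised by lag 0."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = x.size
    cov = np.array([np.sum(x[: n - k] * x[k:]) / n for k in range(max_lag + 1)])
    return cov / cov[0]

cadence = 0.02                                    # days between samples (assumed)
t = np.arange(0.0, 30.0, cadence)
rng = np.random.default_rng(1)
flux = np.sin(2 * np.pi * t / 4.0) + 0.1 * rng.normal(size=t.size)

rho = acf(flux, max_lag=400)                      # lags up to 8 days
# The rotation period is read off the first ACF peak after lag zero;
# here the first 100 lags (2 days) are skipped to avoid the lag-zero peak.
k_peak = 100 + np.argmax(rho[100:])
period_estimate = k_peak * cadence
```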

Rotation Periods in Literature

In the past decade, many papers have been published about finding the rotation periods of many Kepler stars at once using the traditional physics methods mentioned above. In 2013, Nielsen et al. measure the rotation periods of 12,151 stars observed by the Kepler mission using the LS Periodogram [34]. They confirm rotation periods by ensuring that the rotation periods for the majority of the sections of the lightcurve are within one day of the median rotation period [34]. In the same year, McQuillan et al. release three important papers about finding the rotation periods in stars observed by the Kepler mission using the autocorrelation function. In the first, they find rotation periods for 34,030 Kepler stars, which is the "largest sample of stellar rotation periods" to date [35]. In the second, they find the rotation periods for 737 main sequence Kepler stars that are known to host exoplanets [36]. In the third paper, McQuillan et al. find the rotation periods of 1570 Kepler stars and observe that the distribution of rotation periods is bimodal, where the most common rotation periods occur at 19 and 33 days [37]. More recently, in 2020, Martins et al. try to find the rotation periods of 1000 TESS stars by comparing rotation periods retrieved from Fast Fourier Transform, LS Periodogram, and wavelet techniques [7]. Out of the 1000 stars, they find 131 stars with "unambiguous rotation" periods and 31 stars with "dubious rotation" periods, while the remaining 838 stars are classified as displaying ambiguous variability or being too noisy. The rotation periods found are noticeably short, ranging from 0.321 to 13.219 days, which is likely due to the fact that the sectors of TESS are short (27 days) and many stars have only one or two sectors of data available [7].

3.2 Applications in Machine Learning

3.2.1 Gaussian Processes

A Gaussian Process (GP) is a probability distribution over functions. A GP is defined by a mean function and a covariance ("kernel") function which models the covariance structure of the data [38]. A GP is non-parametric and flexible, so it can be suitable for modelling various kinds of processes. The most important decision when creating a GP model is which kernel function to use, and this depends on the data available and the process to be modelled. One of the most well-known kernels is the squared exponential kernel, which assumes "data points that are nearby each other in input space are highly correlated, and data points that are far from each other in input space are relatively uncorrelated", making it good for modelling instrumental noise [39]. Another well-known kernel is the quasi-periodic kernel, which is good for modelling periodic processes [12].

In the last decade, many astronomers have used GPs to model instrumental noise or stellar activity in lightcurves. In 2012, Gibson et al. were the first to apply GPs to transit lightcurves for instrumental correction, by using a squared exponential kernel [39]. In 2016, Aigrain et al. use GPs to jointly model the instrumental systematics and astrophysical variability in K2 stars using squared-exponential and quasi-periodic kernels. They show that joint modelling of instrumental and stellar systematics leads to better overall systematics correction and that their detrending method leads to more transit detections than other correction methods [13]. Angus et al. (2017) use GPs with a quasi-periodic kernel function to infer stellar rotation periods for 1102 Kepler stars [12]. They demonstrate that the GP method performs well on simulated and real data and performs better than the LS periodogram and the autocorrelation function. The disadvantage of all these studies is that the GPs used are not scalable. In 2017, this limitation was addressed for the exoplanet community.
Foreman-Mackey et al. (2017) introduce the software celerite, which enables fast and scalable GP modeling with applications for light curve data, including the probabilistic inference of stellar rotation periods, stellar oscillations, and transiting planet parameters [2]. Their novel method for GP modeling scales linearly with the size of the dataset rather than as the cube of the number of datapoints. This works because they restrict the form of the kernels to a mixture of exponentials, which yields semi-separable covariance matrices that can be exploited for faster computation. Foreman-Mackey et al. define celerite kernels to model a variety of stellar signals, from granulation, to oscillation, to rotation [2]. Since then, celerite has been adopted by many astronomers to aid the modelling of stellar activity in transit lightcurves. In 2019, David et al. use the celerite GP model to jointly model the stellar variability and the transit signals in a specific star. Using the celerite rotation kernel and a transit mean model, they recover transit parameters for the small planets orbiting the star [40]. In 2019, Pereira et al. use the celerite GP model to model granulation and oscillations in artificial TESS lightcurves and real Kepler lightcurves of red-giant stars. They use celerite's granulation, oscillation, and white noise kernels and see that their method is able to retrieve a good estimate of the oscillation frequency and improves the recovery of the "planetary and stellar radius", highlighting the advantages of celerite [41]. In 2020, Barros et al. use the celerite GP to model stellar variability in transit lightcurves from the CoRoT mission [42]. They combine multiple celerite kernels to capture different types of variability (i.e. oscillation, granulation, and rotational modulation) and then remove the variability model from the lightcurve to retrieve transit parameters. 
They show that the multi-component GP model (with oscillation, granulation, and rotation kernels) outperforms single component GPs and non-GP methods, and that transit parameters are more precise when the stellar variability model is better [42].

Random Forests and Neural Networks

Other machine learning methods such as random forests and deep learning methods have been applied to stellar lightcurves in the past two years. Hinners et al. (2018) apply machine learning to predict and classify stellar properties from noisy and sparse time series data [43]. They are one of the first studies to work with real lightcurves and compare a representation learning approach with a feature engineering approach. They demonstrate that a Recurrent Neural Network (RNN) struggles due to data quality while the feature engineering manages to make good predictions for some parameters when using a Random Forest ensemble model [43]. Breton et al. (2019) use a Random Forest to classify Kepler targets into different star categories, and a second classifier on the targets categorized as rotating main sequence (sun-like) stars to determine what is the best instrumental filter to get the rotation periods [44]. In the last month of the thesis, two articles were released on the application of machine learning to the problem of finding rotation periods in Kepler and TESS lightcurves. In the first article, Lu and Angus et al. use a machine learning method involving random forests called Astraea, which tries to predict long rotation periods from short lightcurves. They train a Random Forest classifier to first determine whether a rotation period is "measurable" or "unmeasurable" in a lightcurve, and if it is, they then use a Random Forest regressor to predict rotation periods up to 60 days on TESS's short sectors by learning the relationship between stellar parameters and changes in brightness of the star [45]. They apply the method to Kepler and TESS data and get 9% uncertainty in Kepler and 55% uncertainty in TESS, pointing out that it is hard to measure rotation periods in TESS with traditional physics methods because the sectors are too short [45]. In the second paper, Angus et al. use a convolutional neural network on Kepler photometric data to recover stellar properties, such as the rotation period for main sequence stars (with an uncertainty of 5.2 days) [46]. Though these are exciting applications, in the thesis we focus on Gaussian Processes as they allow us to explore the joint characterization of stellar activity and exoplanet transits and provide probabilistic estimates of the rotation and transit parameters.

3.3 Position in the State of the Art

Despite the success of the aforementioned applications, there are research gaps and opportunities for improvement in the scientific literature that we seek to address with our research. First, few studies have worked with real TESS data and no studies have applied Gaussian Processes for modelling stellar activity in TESS data. This is because the data from the TESS mission is extremely novel, currently being released in increments. This presents an opportunity to test new methods on the data and make new scientific discoveries which can be published. For example, the stellar rotation periods of the majority of the stars observed by TESS have not been established and testing a method which can derive them would be useful for studying the stars and detrending the lightcurves. In this work, we apply our method to real TESS data. Second, few studies have jointly modelled the exoplanet transits and stellar activity. Typically, these are treated as two separate problems and one signal is removed to amplify the other. Rotation periods can be obtained by first removing the transit signals and then studying the parts of the lightcurve that are out of transit. Transit parameters can be obtained by first detrending the lightcurve to remove long-term astrophysical signals and then fitting the transit parameters. Though these approaches have proven to be successful, they remove information from the data which can be important for the task. By jointly modelling the components, the model can work with all the data available and learn how the rotation and transit parameters influence each other, ultimately leading to better characterization. However, joint modelling leads to more parameters, which can increase computation time. In this work, we test the joint method and investigate the trade-off between speed and accuracy. Third, few studies have worked on developing a generalizable method for transit characterization which can be applied to multiple stars. 
Many papers in astrophysics will focus on one star-planet system and all steps of the method (i.e. instrumental correction, initialization, sigma clipping) are specifically tuned to that star. From a machine learning perspective, we want to develop a pipeline which can work on a handful of targets. The reduction in computation time offered by the scalable kernels introduced by the celerite software offers an opportunity to characterize multiple star-planet systems. In this work, we develop a pipeline for automating the method. Using the pipeline, we apply the method to 24 targets and assess how well the method can be generalized.

In summary, no studies have applied Gaussian Process methods to TESS data for the task of modelling stellar activity and few studies have examined the joint modelling of exoplanet transits and stellar activity. Considering the novelty of the TESS data and the research gaps in scientific literature, this thesis will focus on characterizing exoplanet transits and stellar activity in TESS and Kepler lightcurves with scalable Gaussian Processes. The final research questions are:

• RQ1: How accurately can we model the stellar activity with Gaussian processes?
• RQ2: How accurately can we model the exoplanet transits with Gaussian processes?
• RQ3: Does joint modelling improve the characterization of rotation and transit parameters?

To address these questions, we use three Gaussian Process models: (1) ExoGP, which uses a transit mean function and a noise kernel, (2) RotGP, which uses a zero mean function and a rotation kernel, and (3) ExoRotGP, which uses a transit mean function and a rotation and noise kernel. We compare the parameter results obtained by the models to those from refereed publications.

CHAPTER 4 Methodology

In the method section, we explain the theory behind our approach. We provide background information on Gaussian Processes, how they can be used to model instrumental and stellar astrophysical noise, how they use Bayesian settings to find parameter estimates, and how the models can be optimized to maximize the posterior. Following that, we break down the kernel functions of the software celerite and explain how they can be used to model stellar variability and why they are fast to compute. We move on to describing the transit model that we use to fit the transits and the various features that define it: limb darkening, orbit types, exposure time integration. Finally, we explain how Markov Chain Monte Carlo Sampling can be used to provide uncertainty estimates for the transit and rotation parameters.

4.1 Bayesian Inference

Bayesian inference involves making probabilistic estimates of "unobserved quantities condition on observed data" using probabilistic models [3]. We can make estimates of parameters, or fit our model to the data, by "infer[ring] the joint probability distribution for the model parameters" based on the data and the prior beliefs of the parameters [3]. The joint probability distribution is "the posterior distribution (posterior) and it is obtained by updating a prior distribution (prior) with a sampling distribution (likelihood)" [3]. The posterior is "the information about the model parameters given the prior information and the likelihood from observations" [3].

\[ P(\theta \mid y) = \frac{P(\theta)\, P(y \mid \theta)}{P(y)} \]

where $P(\theta)$ is the prior distribution, $P(y \mid \theta)$ is the likelihood, and $P(y)$ is the model evidence. The prior represents our "assumptions about a model parameter" and the likelihood is the probability that the observed data follows from our model given the model parameters [3]. When observations are passed to the model, the prior is "updated by the likelihood to produce a posterior distribution" [3]. Priors can be "informative", meaning they are "based on previous research" and are tightly constrained. For example, for the rotation period parameter, we can define the prior as a normal distribution where the mean and the standard deviation are based on estimates by other researchers [3]. Priors can also be uninformative, which is useful if we have no previous knowledge about the parameter. Uninformative priors can be defined so that they do not have a strong effect on the posterior, "allowing the data to 'speak for itself'" [3]. If we want to estimate the posterior distribution for a specific model parameter $\theta_i$ (e.g. the radius or the rotation period), we have to integrate the joint posterior "over all other parameters" (i.e. marginalization) [3].

\[ P(\theta_i \mid y) = \int P(\theta \mid y)\, d\theta_{j \neq i} \]
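To make the two equations above concrete, the following sketch evaluates a posterior on a parameter grid for a toy problem: Gaussian data with unknown mean `mu` and standard deviation `sigma`. The data, the grid ranges, and the prior (normal on `mu`, flat on `sigma`) are assumptions made for this illustration; the thesis models have many more parameters, which is why sampling is used there instead of a grid.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(loc=10.0, scale=2.0, size=50)      # synthetic "observations"

# Parameter grid: theta = (mu, sigma).
mu = np.linspace(5.0, 15.0, 201)
sigma = np.linspace(0.5, 5.0, 181)
MU, SIGMA = np.meshgrid(mu, sigma, indexing="ij")

# Log prior: informative normal prior on mu, flat (uninformative) prior on sigma.
log_prior = -0.5 * ((MU - 9.0) / 3.0) ** 2

# Log likelihood of the data for every (mu, sigma) pair, up to an additive constant.
resid = y[None, None, :] - MU[..., None]
log_like = np.sum(
    -0.5 * (resid / SIGMA[..., None]) ** 2 - np.log(SIGMA[..., None]), axis=-1
)

# Posterior is proportional to prior times likelihood; normalising over the
# grid plays the role of dividing by the evidence P(y).
log_post = log_prior + log_like
post = np.exp(log_post - log_post.max())
post /= post.sum()

# Marginal posterior of mu: sum (integrate) the joint posterior over sigma.
post_mu = post.sum(axis=1)
mu_mean = np.sum(mu * post_mu)
```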

In our case, the model is a Gaussian Process (GP), and we can perform parameter estimation using the following steps. First, we specify the GP model with a mean function and a kernel function, which gives us the prior distributions. Second, we condition the GP model on the data (i.e. lightcurve), which gives us the likelihood. Third, we fit the model to the data (i.e. find the posterior distributions of the parameters of interest). Since our model is complex, it is not feasible to "calculate the posterior estimates analytically" and we have to rely on other methods [47]. The first step of model fitting involves finding the maximum a posteriori (MAP) parameters "using optimization methods" [47]. The MAP parameters will represent the "mode of the posterior distribution", which may be biased and can only give a point estimate of each parameter [47]. In our case, we use L-BFGS-B optimization to find the MAP parameters. The second step of model fitting involves drawing samples from the posterior distribution with Markov Chain Monte Carlo (MCMC) [47]. This allows us to quantify the uncertainty of the model [47]. In our case, we use the No-U-Turn Sampler (NUTS) to approximate the posterior.
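The MAP step can be illustrated with SciPy's `minimize` routine and its `L-BFGS-B` method. The sketch below covers only the optimization step, not the MCMC step. The sinusoidal toy model, the fixed noise level, the prior on the period, the starting point, and the bounds are all assumptions made for this example and are not the thesis pipeline.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
t = np.linspace(0.0, 10.0, 200)
y = 1.0 + 0.5 * np.sin(2 * np.pi * t / 3.0) + 0.1 * rng.normal(size=t.size)

def neg_log_posterior(params):
    """Negative log posterior for a sinusoid with known noise level (0.1)."""
    offset, amp, period = params
    model = offset + amp * np.sin(2 * np.pi * t / period)
    log_like = -0.5 * np.sum(((y - model) / 0.1) ** 2)
    # Informative normal prior on the period, centred on a hypothetical literature value.
    log_prior = -0.5 * ((period - 3.2) / 0.5) ** 2
    return -(log_like + log_prior)

# Maximising the posterior is the same as minimising its negative logarithm.
result = minimize(
    neg_log_posterior,
    x0=[0.9, 0.4, 3.1],
    method="L-BFGS-B",
    bounds=[(0.0, 2.0), (0.0, 2.0), (2.0, 4.0)],
)
map_offset, map_amp, map_period = result.x
```

The result is a single point estimate per parameter; it carries no uncertainty information, which is what the sampling step adds.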

4.2 Gaussian Processes

A Gaussian Process is a stochastic model which consists of a mean function $\mu_\theta(x)$ and a covariance function $k_\alpha(x_n, x_m)$, where $\theta$ and $\alpha$ are parameters of the model and $x_n$ and $x_m$ are the coordinates of two datapoints in the dataset [2] [38]. When we perform GP regression, we want to choose a kernel that describes the covariance structure of the data and find "the set of parameters that best represent the observed data" [41]. Given a dataset $y = (y_1, ..., y_N)$ with coordinates $X = (x_1, ..., x_N)$, we can define the following log likelihood equation:

\[ \ln L(\theta, \alpha) = \ln p(y \mid X, \theta, \alpha) = -\frac{1}{2} r_\theta^T K_\alpha^{-1} r_\theta - \frac{1}{2} \ln \det K_\alpha - \frac{N}{2} \ln(2\pi) \]

where $r_\theta = y - \mu_\theta(X)$ is the vector of residuals and $K_\alpha$ is the covariance matrix. The covariance (kernel) function $k$ maps the time points to a covariance value and defines the elements of the covariance matrix:

\[ K_{i,j} = k(x_i, x_j) + \sigma^2 \delta_{ij} \]

where $\sigma^2$ is white noise and $\delta_{ij}$ is the Kronecker delta function (1 if the inputs are equal, 0 if they are not), which essentially means that white noise is added to the diagonal of the covariance matrix [3]. Once we have the log likelihood equation, we can maximize it to get the values for the model parameters $\theta$ and $\alpha$ that maximize the probability of drawing the dataset. We can then quantify "the uncertainties on θ and α ... by multiplying the likelihood by a prior p(θ,α) and using a Markov Chain Monte Carlo (MCMC) algorithm to sample the joint posterior probability density" [2]. Typically, Gaussian Process models are trained on a set of training data to perform classification and regression on new inputs. In our case, we use the Gaussian Process to describe the data directly, namely the lightcurve of the star, for the task of parameter estimation. Instead of computing a predictive posterior distribution based on new inputs, we use the posterior defined by the data and prior beliefs to obtain estimates of the model parameters and their uncertainty intervals. We are interested in the model parameters because we can define the Gaussian Process model in such a way that the model parameters summarize information about the stellar activity and the exoplanet transits in the lightcurve. We define the mean function of the model as the transit model, which is the noise-free part of the lightcurve that captures the transits of the planet. We set up the covariance function to model stellar activity.
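The log likelihood above can be evaluated directly with dense linear algebra, as in the sketch below. This is the slow, general-purpose route (cubic in the number of datapoints), shown only to make the terms of the equation explicit. The function name `gp_log_likelihood`, the zero mean, and the single exponential kernel are assumptions of this example.

```python
import numpy as np

def gp_log_likelihood(t, y, mean, kernel, sigma):
    """Log likelihood ln p(y | t) of a GP, computed the direct O(N^3) way."""
    r = y - mean(t)                                   # residual vector
    tau = np.abs(t[:, None] - t[None, :])             # pairwise time lags
    K = kernel(tau) + sigma ** 2 * np.eye(t.size)     # white noise on the diagonal
    L = np.linalg.cholesky(K)                         # K = L L^T
    alpha = np.linalg.solve(L, r)                     # so r^T K^{-1} r = alpha^T alpha
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * alpha @ alpha - 0.5 * log_det - 0.5 * t.size * np.log(2.0 * np.pi)

# Hypothetical example: zero mean and a single exponential kernel term.
t = np.linspace(0.0, 10.0, 50)
y = np.sin(t)
ll = gp_log_likelihood(
    t,
    y,
    mean=lambda t: np.zeros_like(t),
    kernel=lambda tau: 1.0 * np.exp(-0.5 * tau),
    sigma=0.1,
)
```

In the thesis setting, `mean` would be the transit model and `kernel` a celerite term, with the dense solve replaced by the scalable solver.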

Advantages and Disadvantages

The advantages of Gaussian Process models are that they can directly quantify the uncertainty in predictions and that they are non-parametric, which means "we do not have to worry about whether it is possible for the model to fit the data" [38]. Gaussian Processes are versatile and flexible, so we can adapt them to the data and the problem by choosing special priors and kernel functions. As a result, they can model a variety of processes, ranging from "aperiodic, periodic, and quasiperiodic behavior" to various kinds of noise signals [3]. The disadvantage of Gaussian Process models is that the "computation time scales with the number of observations cubed" [2]. As a result, Gaussian Processes are "generally limited to small datasets", which is a problem because many of the datasets produced by space missions such as TESS and Kepler are enormous [2]. In order for Gaussian Processes to be useful for this application, they need to be scalable. Luckily, special covariance functions and solvers have been designed to address the problem of scaling.

4.3 Celerite

Normally, the "cost of computing the inverse and determinant of a general matrix Kα" in the log likelihood equation of Section 4.2 is $O(N^3)$ [2]. Foreman-Mackey et al. introduce a software package, celerite, that addresses the "cubic scaling" by limiting the choice of kernels to certain forms that are faster to compute [2]. However, a drawback of these kernel functions is that they only work with "one-dimensional datasets" [2]. In our case, this is not a problem, since we are working with one-dimensional time series data.

What is the celerite kernel?

The celerite kernel is a "mixture of exponential functions" [41]:

\[ k_\alpha(\tau_{nm}) = \sigma^2 \delta_{nm} + \sum_{j=1}^{J} a_j \exp(-c_j \tau_{nm}) \]

where $\alpha = \{a_j, c_j\}$ and $\tau_{nm} = |t_n - t_m|$ is the time lag between two observations. Given the relations between the exponential, cosine and sine functions, we can rewrite the exponentials in the celerite kernel as "sums of sines and cosines" [41]. If we also incorporate complex kernel parameters "aj → aj ± ibj and cj → cj ± idj" then the celerite kernel becomes "a mixture of quasi-periodic oscillators" [41]:

\[ k_\alpha(\tau_{nm}) = \sum_{j=1}^{J} \left[ a_j \exp(-c_j \tau_{nm}) \cos(d_j \tau_{nm}) + b_j \exp(-c_j \tau_{nm}) \sin(d_j \tau_{nm}) \right] \]

where $\alpha = \{a_j, b_j, c_j, d_j\}$. The power spectrum of the celerite kernel is:

\[ S(\omega) = \sum_{j=1}^{J} \sqrt{\frac{2}{\pi}} \, \frac{(a_j c_j + b_j d_j)(c_j^2 + d_j^2) + (a_j c_j - b_j d_j)\,\omega^2}{\omega^4 + 2 (c_j^2 - d_j^2)\,\omega^2 + (c_j^2 + d_j^2)^2} \]
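The quasi-periodic form of the celerite kernel is simple to evaluate directly. The sketch below is a plain NumPy transcription of that sum, not a call into the celerite library, and the function name `celerite_kernel` and the coefficient values are assumptions chosen for illustration.

```python
import numpy as np

def celerite_kernel(tau, a, b, c, d):
    """Evaluate sum_j [a_j cos(d_j tau) + b_j sin(d_j tau)] exp(-c_j tau)."""
    tau = np.abs(np.asarray(tau, dtype=float))[..., None]   # trailing axis for the J terms
    a, b, c, d = (np.asarray(v, dtype=float) for v in (a, b, c, d))
    return np.sum(np.exp(-c * tau) * (a * np.cos(d * tau) + b * np.sin(d * tau)), axis=-1)

# Covariance matrix of five time stamps for a single quasi-periodic term (J = 1).
t = np.linspace(0.0, 10.0, 5)
K = celerite_kernel(t[:, None] - t[None, :], a=[1.0], b=[0.2], c=[0.3], d=[2.0])
```

At zero lag the sine terms vanish, so the diagonal of `K` equals the sum of the `a` coefficients, and setting `b = d = 0` recovers the purely exponential mixture.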

Relation between celerite kernel and stochastically-driven, damped harmonic oscillator

Foreman-Mackey et al. demonstrate that the power spectrum of the celerite kernel function matches the power spectrum of a "stochastically-driven, damped harmonic oscillator" (SHO) [41][2].

Figure 4.1: Source: https://www.matlab-monkey.com/ODE/resonance/resonance.html

A stochastically-driven damped harmonic oscillator (SHO) is a system that exhibits repetitive variations around some central point. The variations are driven by an external force that is stochastic by nature, and they slowly dissipate over time as energy is lost from the system. The SHO can be used to model all kinds of sinusoidal vibrations and waves. The power spectrum of the SHO is as follows:

\[ S(\omega) = \sqrt{\frac{2}{\pi}} \, \frac{S_0\, \omega_0^4}{(\omega^2 - \omega_0^2)^2 + \omega_0^2 \omega^2 / Q^2} \]

As we can see, it is characterized by the following parameters:

• $\omega_0$: the angular frequency of the undamped oscillator (2π divided by the period)
• $Q$: the quality factor of the oscillator (which describes how little the oscillator is damped)
• $S_0$: the normalization constant of the power spectrum, which "is proportional to the power at ω = ω0". When the time lag, τ, is 0, the kernel is $k(0) = S_0 \omega_0 Q$, which represents the variance [2]

The power spectrum of the SHO matches the power spectrum of the celerite kernel if we set the following values (see calculation in Appendix):

\[ a_j = S_0 \omega_0 Q, \qquad b_j = \frac{S_0 \omega_0 Q}{\sqrt{4Q^2 - 1}}, \qquad c_j = \frac{\omega_0}{2Q}, \qquad d_j = \frac{\omega_0}{2Q} \sqrt{4Q^2 - 1} \]

Thus, we can find a relation between celerite kernel parameters and SHO parameters [41].
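The correspondence between the two power spectra can also be checked numerically. The sketch below transcribes both spectra for a single term and compares them on a frequency grid. The function names and the test values of S0, ω0 and Q are assumptions of this example, and the mapping is only real-valued for Q > 1/2.

```python
import numpy as np

def sho_to_celerite(S0, w0, Q):
    """Map SHO parameters to celerite coefficients (a, b, c, d); assumes Q > 1/2."""
    root = np.sqrt(4.0 * Q ** 2 - 1.0)
    a = S0 * w0 * Q
    b = a / root
    c = w0 / (2.0 * Q)
    d = c * root
    return a, b, c, d

def sho_psd(omega, S0, w0, Q):
    """Power spectrum of the stochastically-driven, damped harmonic oscillator."""
    denom = (omega ** 2 - w0 ** 2) ** 2 + w0 ** 2 * omega ** 2 / Q ** 2
    return np.sqrt(2.0 / np.pi) * S0 * w0 ** 4 / denom

def celerite_psd(omega, a, b, c, d):
    """Power spectrum of a single celerite term."""
    num = (a * c + b * d) * (c ** 2 + d ** 2) + (a * c - b * d) * omega ** 2
    den = omega ** 4 + 2.0 * (c ** 2 - d ** 2) * omega ** 2 + (c ** 2 + d ** 2) ** 2
    return np.sqrt(2.0 / np.pi) * num / den

omega = np.linspace(0.01, 10.0, 500)
S0, w0, Q = 1.3, 2.0, 4.0
psd_match = np.allclose(
    sho_psd(omega, S0, w0, Q), celerite_psd(omega, *sho_to_celerite(S0, w0, Q))
)
```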

How is the celerite kernel related to stellar variability?

4.3.1 Stochastically-Driven Damped Harmonic Oscillator (SHO) Kernel

Celerite has a SHO Kernel, which is a kernel that represents the stochastically-driven, damped harmonic oscillator and is characterized by the same parameters mentioned above [2]:

• Angular frequency (ω0)
• Quality factor (Q)
• Normalization constant (S0)

When the quality factor is equal to $1/\sqrt{2}$, the power spectrum of the SHO can be reduced to:

\[ S(\omega) = \sqrt{\frac{2}{\pi}} \, \frac{S_0}{(\omega / \omega_0)^4 + 1} \]

And the kernel function becomes:

\[ k(\tau) = S_0 \omega_0 \exp\left(-\frac{\omega_0 \tau}{\sqrt{2}}\right) \cos\left(\frac{\omega_0 \tau}{\sqrt{2}} - \frac{\pi}{4}\right) \]

In a seminal astrophysics paper by Harvey (1985), it was demonstrated that this equation can model stellar granulation (the effect of motions of convective cells at the surface of the star that are hot and brighter at their center and darker at their downflow edges) [48]. The SHO kernel with a quality factor of $1/\sqrt{2}$ can model stellar granulation, but it can also detect other types of variability or noise coming from the star and the instrument, making it very versatile [42].
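As a short consistency check (a sketch, not taken from the cited papers), the granulation kernel follows from the general SHO coefficients by substituting $Q = 1/\sqrt{2}$ and using the identity $\cos x + \sin x = \sqrt{2}\cos(x - \pi/4)$:

\[
\begin{aligned}
Q = \tfrac{1}{\sqrt{2}} \;&\Rightarrow\; \sqrt{4Q^2 - 1} = 1, \qquad a = b = \frac{S_0 \omega_0}{\sqrt{2}}, \qquad c = d = \frac{\omega_0}{\sqrt{2}}, \\
k(\tau) &= a\, e^{-c\tau} \left[ \cos(d\tau) + \sin(d\tau) \right]
= \sqrt{2}\, a\, e^{-c\tau} \cos\left(d\tau - \frac{\pi}{4}\right) \\
&= S_0 \omega_0 \exp\left(-\frac{\omega_0 \tau}{\sqrt{2}}\right) \cos\left(\frac{\omega_0 \tau}{\sqrt{2}} - \frac{\pi}{4}\right).
\end{aligned}
\]

At $\tau = 0$ this gives $k(0) = S_0 \omega_0 / \sqrt{2} = S_0 \omega_0 Q$, in agreement with the variance of the general SHO kernel.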

4.3.2 Rotation Kernel

The Rotation kernel in celerite is "a mixture of two SHO terms that can be used to model stellar rotation" [49]. The kernel looks at two waves "in Fourier space: one at period and one at 0.5 * period" [49]. Why look at two waves? When we look in Fourier space, a strong periodic signal will have a peak at its given period, but there will also likely be a peak at the harmonic of the signal, at 1/2 of the period. By looking at two waves, where one period is the harmonic of the other, we can infer a stellar rotation period. The parameters of the Rotation Kernel are as follows:

• Amplitude (amp): the amplitude of the rotation, reflecting the size of the starspot
• Rotation period (period): the rotation period of the main wave
• Quality factor (Q0): the quality factor of the second wave, which indicates how quickly the signal dies out
• Difference between Quality Factors (deltaQ): the difference between the quality factors of the first and second waves
• Fractional amplitude (mix): the fractional amplitude of the second wave compared to the first wave. The second wave should have an amplitude that is greater than 0 but less than that of the first wave.

With these parameters, we can model two oscillations, which can be represented as SHO kernels. One SHO term with a period of period:

\[ Q_1 = 0.5 + Q_0 + \mathrm{deltaQ}, \qquad \omega_1 = \frac{4 \pi Q_1}{\mathrm{period} \cdot \sqrt{4 Q_1^2 - 1}}, \qquad S_1 = \frac{\mathrm{amp}}{\omega_1 Q_1} \]

Another SHO term at half the period:

\[ Q_2 = 0.5 + Q_0, \qquad \omega_2 = \frac{8 \pi Q_2}{\mathrm{period} \cdot \sqrt{4 Q_2^2 - 1}}, \qquad S_2 = \frac{\mathrm{mix} \cdot \mathrm{amp}}{\omega_2 Q_2} \]

Then the RotationTerm is the sum of the following two SHO terms:

RotationTerm = SHOTerm(S0 = S1, ω0 = ω1, Q = Q1) + SHOTerm(S0 = S2, ω0 = ω2, Q = Q2)
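The parameter mapping above can be written out as a short NumPy sketch. This is a transcription of the formulas for illustration, not the library's own implementation. The function names and the example parameter values are assumptions, and the SHO kernel is written in its (a, b, c, d) form, which requires Q > 1/2 (that is, Q0 > 0).

```python
import numpy as np

def rotation_term_sho_params(amp, period, Q0, deltaQ, mix):
    """Return (S0, w0, Q) for the two SHO terms of the rotation kernel."""
    Q1 = 0.5 + Q0 + deltaQ
    w1 = 4.0 * np.pi * Q1 / (period * np.sqrt(4.0 * Q1 ** 2 - 1.0))
    S1 = amp / (w1 * Q1)
    Q2 = 0.5 + Q0
    w2 = 8.0 * np.pi * Q2 / (period * np.sqrt(4.0 * Q2 ** 2 - 1.0))
    S2 = mix * amp / (w2 * Q2)
    return (S1, w1, Q1), (S2, w2, Q2)

def sho_kernel(tau, S0, w0, Q):
    """SHO kernel for Q > 1/2, written in the celerite (a, b, c, d) form."""
    root = np.sqrt(4.0 * Q ** 2 - 1.0)
    a, c = S0 * w0 * Q, w0 / (2.0 * Q)
    d = c * root
    tau = np.abs(tau)
    return a * np.exp(-c * tau) * (np.cos(d * tau) + np.sin(d * tau) / root)

def rotation_kernel(tau, amp, period, Q0, deltaQ, mix):
    """Sum of the two SHO terms, one at `period` and one at half of it."""
    first, second = rotation_term_sho_params(amp, period, Q0, deltaQ, mix)
    return sho_kernel(tau, *first) + sho_kernel(tau, *second)

# Hypothetical parameter values, chosen only to exercise the functions.
variance = rotation_kernel(0.0, amp=2.0, period=10.0, Q0=1.0, deltaQ=0.5, mix=0.3)
```

With these definitions the oscillation frequencies of the two terms work out to 2π/period and 4π/period, and the variance at zero lag is amp · (1 + mix), which gives a quick sanity check on the parameterization.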

4.3.3 Semi-Separable Matrices

How come the celerite kernel is faster to compute? Foreman-Mackey et al. (2017) introduce a direct solver for the celerite covariance matrices which "exploits the semiseparable structure of these matrices to compute the Cholesky factorization in $O(NJ^2)$" operations, where J refers to the number of kernels [2]. To be able to understand the solver, let us break down the concepts.

What are semi-separable matrices? Semi-separable matrices are matrices that can be described by a diagonal matrix with dimension N x N and two smaller matrices with dimensions N x R. R refers to the rank of the semi-separable matrix, which means that "all submatrices which can be taken out of the lower (upper) triangular part of the matrix have rank ≤ r" [50]. If K is a semi-separable matrix with rank R, then the elements of the matrix are:

$$K_{n,m} = \begin{cases} \sum_{r=1}^{R} U_{n,r} V_{m,r}, & \text{if } m < n \\ A_{n,n}, & \text{if } m = n \\ \sum_{r=1}^{R} U_{m,r} V_{n,r}, & \text{otherwise} \end{cases} \tag{4.1}$$

An easier way to visualize it may be as the sum of a diagonal matrix $A$, a strictly lower triangular matrix $\mathrm{tril}(UV^T)$ and a strictly upper triangular matrix $\mathrm{triu}(VU^T)$ (Note: tril (triu) is a function for selecting the strictly lower (upper) triangular part of a matrix):

$$K = A + \mathrm{tril}(UV^T) + \mathrm{triu}(VU^T)$$

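For example, if R = 1 and N = 3, we have the following matrices:

$$A = \begin{pmatrix} a_{1,1} & 0 & 0 \\ 0 & a_{2,2} & 0 \\ 0 & 0 & a_{3,3} \end{pmatrix}, \quad U = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}, \quad V = \begin{pmatrix} v_1 \\ v_2 \\ v_3 \end{pmatrix}, \quad UV^T = \begin{pmatrix} u_1v_1 & u_1v_2 & u_1v_3 \\ u_2v_1 & u_2v_2 & u_2v_3 \\ u_3v_1 & u_3v_2 & u_3v_3 \end{pmatrix}$$

And a semi-separable matrix K can be defined as a sum of the following matrices:

$$K = \begin{pmatrix} 0 & 0 & 0 \\ u_2v_1 & 0 & 0 \\ u_3v_1 & u_3v_2 & 0 \end{pmatrix} + \begin{pmatrix} a_{1,1} & 0 & 0 \\ 0 & a_{2,2} & 0 \\ 0 & 0 & a_{3,3} \end{pmatrix} + \begin{pmatrix} 0 & u_2v_1 & u_3v_1 \\ 0 & 0 & u_3v_2 \\ 0 & 0 & 0 \end{pmatrix}$$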
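The same R = 1, N = 3 construction can be reproduced numerically. The sketch below uses NumPy's `tril` and `triu` with offsets of -1 and +1 to select the strictly lower and strictly upper parts; the function name `semiseparable` and the numbers are made up for illustration.

```python
import numpy as np


def semiseparable(a, U, V):
    """Build K = A + tril(U V^T) + triu(V U^T) from its generators.

    a: diagonal entries (length N); U, V: generator matrices (N x R).
    Hypothetical helper for illustration only.
    """
    UVt = U @ V.T
    return np.diag(a) + np.tril(UVt, k=-1) + np.triu(UVt.T, k=1)


a = np.array([10.0, 20.0, 30.0])
U = np.array([[1.0], [2.0], [3.0]])
V = np.array([[4.0], [5.0], [6.0]])
K = semiseparable(a, U, V)
```

Only the 3N numbers in `a`, `U` and `V` are needed to describe the full N x N matrix, which is what the fast algorithms exploit.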

The advantage of semi-separable matrices is that they allow for linear "matrix transformation algorithms", meaning that "a system of linear equations Ax = b can be solved with linear complexity in the size of the matrix" [51]. One of the matrix transformation algorithms that can be computed in O(N) operations is Cholesky factorization.

What is Cholesky factorization? Cholesky factorization is the "decomposition of a Hermitian, positive-definite matrix into the product of a lower triangular matrix and its conjugate transpose" [52]. Here we focus on the LDL Cholesky factorization, which breaks down a matrix K into a product of a lower triangular matrix with a unit diagonal L, a diagonal matrix D, and the conjugate transpose of L:

$$K = LDL^T$$

If we make the mathematical assumption that L can be defined by the identity matrix I, the matrix U defining the semi-separable matrix, and a new unknown matrix W with dimensions N x R, then we get:

$$L = I + \mathrm{tril}(UW^T)$$

Substituting this expression for L into $K = LDL^T$ gives a second equation defining K. We can set the two equations equal to each other to determine the elements in the unknown matrix W:

$$A + \mathrm{tril}(UV^T) + \mathrm{triu}(VU^T) = [I + \mathrm{tril}(UW^T)]\,D\,[I + \mathrm{tril}(UW^T)]^T$$

From this equation, we can derive a recursive algorithm which computes the Cholesky factorization of a semi-separable matrix in $O(NR^2)$ operations [2].

Cholesky Factorization Algorithm

To calculate the elements of D and W, the algorithm introduces another quantity, $S_n$, which is a symmetric R x R matrix for each data point. The base case of the recursive algorithm is n = 1, where every element $S_{1,j,k} = 0$. For n > 1:

$$S_{n,j,k} = S_{n-1,j,k} + D_{n-1,n-1}\,W_{n-1,j}\,W_{n-1,k}$$

Elements of the diagonal matrix D can be calculated using:

$$D_{n,n} = A_{n,n} - \sum_{j=1}^{R}\sum_{k=1}^{R} U_{n,j}\,S_{n,j,k}\,U_{n,k}$$

Elements of the matrix W can be calculated using:

$$W_{n,j} = \frac{1}{D_{n,n}}\left[V_{n,j} - \sum_{k=1}^{R} U_{n,k}\,S_{n,j,k}\right]$$

Calculating these elements for each n requires $O(R^2)$ operations. Thus, when the algorithm is applied to the whole dataset, it has a run time of $O(NR^2)$.
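The recursion for S, D and W translates almost line by line into code. The following is a minimal NumPy sketch of the generic semi-separable factorization, without the numerically stable reparametrization used in the actual celerite solver; `semiseparable_ldlt` and the random test matrices are assumptions made for illustration.

```python
import numpy as np


def semiseparable_ldlt(a, U, V):
    """Factorize K = diag(a) + tril(U V^T) + triu(V U^T) as L D L^T.

    Returns the diagonal D (length N) and W (N x R), where
    L = I + tril(U W^T). Hypothetical helper; cost is O(N R^2).
    """
    N, R = U.shape
    D = np.empty(N)
    W = np.empty((N, R))
    S = np.zeros((R, R))  # base case: S_1 = 0
    for n in range(N):
        if n > 0:
            S = S + D[n - 1] * np.outer(W[n - 1], W[n - 1])
        D[n] = a[n] - U[n] @ S @ U[n]
        W[n] = (V[n] - S @ U[n]) / D[n]
    return D, W


# Small random example with a dominant diagonal.
rng = np.random.default_rng(0)
N, R = 6, 2
U = 0.3 * rng.normal(size=(N, R))
V = 0.3 * rng.normal(size=(N, R))
a = 5.0 + rng.uniform(size=N)
K = np.diag(a) + np.tril(U @ V.T, k=-1) + np.triu(V @ U.T, k=1)

D, W = semiseparable_ldlt(a, U, V)
L = np.eye(N) + np.tril(U @ W.T, k=-1)
K_rebuilt = L @ np.diag(D) @ L.T
```

The dense matrices `K` and `L` are formed here only to check the result; the factorization itself never builds an N x N matrix.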

Inverse of K
Once the factorization is known, the inverse of a semi-separable matrix can be applied to a vector y in $O(NR)$ operations.

$$z = K^{-1}y = (L^T)^{-1}D^{-1}L^{-1}y$$

Forward substitution can be used for lower triangular matrices. Let's apply it here to solve $z' = L^{-1}y$ with the base case $f_{0,j} = 0$ (terms with indices outside 1, ..., N are taken to be zero):

$$f_{n,j} = f_{n-1,j} + W_{n-1,j}\,z'_{n-1}$$

$$z'_n = y_n - \sum_{j=1}^{R} U_{n,j}\,f_{n,j}$$

Back substitution can be used for upper triangular matrices. Let's apply it here to solve the final equation $z = (L^T)^{-1}D^{-1}z'$, using the solution $z' = L^{-1}y$, with the base case $g_{N+1,j} = 0$:

$$g_{n,j} = g_{n+1,j} + U_{n+1,j}\,z_{n+1}$$

$$z_n = \frac{z'_n}{D_{n,n}} - \sum_{j=1}^{R} W_{n,j}\,g_{n,j}$$

For each n, the run time is $O(R)$. As a result, we can apply the inverse in $O(NR)$ operations.

Log Determinant of K
The log determinant of a semi-separable matrix can be computed in $O(N)$ operations.

$$\ln\det K = \ln(\det(LDL^T)) = \ln\det(L) + \ln\det(D) + \ln\det(L^T)$$

The determinant of a triangular matrix, like that of a diagonal matrix, is the product of its diagonal elements. Since L has a unit diagonal, $\det(L) = \det(L^T) = 1$, so:

$$\ln\det K = \ln(1) + \ln\det(D) + \ln(1) = \sum_{n=1}^{N} \ln D_{n,n}$$
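Given the factors U, W and D, the forward and backward substitutions and the log determinant can be sketched as follows. To keep the example self-contained, the factors are drawn at random and the dense matrix K is assembled from them purely for checking; the function names are hypothetical.

```python
import numpy as np


def ldlt_solve(U, W, D, y):
    """Apply K^{-1} to y in O(N R), where K = L D L^T and L = I + tril(U W^T)."""
    N, R = U.shape
    z = np.empty(N)

    # Forward substitution: z' = L^{-1} y
    f = np.zeros(R)
    for n in range(N):
        if n > 0:
            f = f + W[n - 1] * z[n - 1]
        z[n] = y[n] - U[n] @ f

    # Back substitution: z = (L^T)^{-1} D^{-1} z'
    g = np.zeros(R)
    for n in range(N - 1, -1, -1):
        if n < N - 1:
            g = g + U[n + 1] * z[n + 1]
        z[n] = z[n] / D[n] - W[n] @ g
    return z


def ldlt_logdet(D):
    """Log determinant of K: the sum of the logs of the diagonal of D."""
    return np.sum(np.log(D))


# Random factors (assumed values), with the dense K built only for comparison.
rng = np.random.default_rng(1)
N, R = 6, 2
U = 0.3 * rng.normal(size=(N, R))
W = 0.3 * rng.normal(size=(N, R))
D = rng.uniform(1.0, 2.0, size=N)
L = np.eye(N) + np.tril(U @ W.T, k=-1)
K = L @ np.diag(D) @ L.T

y = rng.normal(size=N)
z = ldlt_solve(U, W, D, y)
logdet = ldlt_logdet(D)
```

Each loop iteration touches only R numbers, so both passes together cost $O(NR)$, and the log determinant is a single sum over N values.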

Celerite Solver
Recall that the celerite kernel function "can be written as a mixture of quasi-periodic oscillators" [41]:

$$k_\alpha(\tau_{nm}) = \sum_{j=1}^{J}\left[a_j \exp(-c_j\tau_{nm})\cos(d_j\tau_{nm}) + b_j \exp(-c_j\tau_{nm})\sin(d_j\tau_{nm})\right]$$

where $\alpha = \{a_j, b_j, c_j, d_j\}$ and $\tau_{nm} \equiv |t_n - t_m|$. The covariance matrix K associated with this kernel function can be written as a semi-separable matrix with rank R = 2J if:

$$U_{n,2j-1} = a_j e^{-c_j t_n}\cos(d_j t_n) + b_j e^{-c_j t_n}\sin(d_j t_n)$$

$$U_{n,2j} = a_j e^{-c_j t_n}\sin(d_j t_n) - b_j e^{-c_j t_n}\cos(d_j t_n)$$

$$V_{m,2j-1} = e^{c_j t_m}\cos(d_j t_m)$$

$$V_{m,2j} = e^{c_j t_m}\sin(d_j t_m)$$

$$A_{n,n} = \sigma_n^2 + \sum_{j=1}^{J} a_j$$

Note that the exponentials in V grow with time while those in U decay, so that their product depends only on the time difference $t_n - t_m$. In order to overcome issues related to numerical stability caused by the exponential function, Foreman-Mackey et al. reparametrize these equations for the final celerite solver. The full adjustment of the algorithm can be found in the original celerite paper [2]. All in all, the celerite solver can compute the log determinant of K in $O(N)$ operations and apply the inverse of K in $O(NJ)$ operations once the factorization is available [2].
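As a numerical check of this construction, the sketch below builds U, V and A for a two-term kernel and compares the resulting semi-separable matrix with the kernel evaluated densely. It takes the columns of V with a growing exponential $e^{c_j t_m}$, so that the product with the decaying exponential in U gives $e^{-c_j\tau}$, and it assumes the times are sorted in increasing order. The function names and parameter values are made up for illustration, and the sketch does not include the reparametrization that celerite uses for numerical stability.

```python
import numpy as np


def celerite_generators(t, a, b, c, d, sigma):
    """Generators (U, V, A) of the celerite covariance matrix (rank 2J).

    t: sorted times (N); a, b, c, d: kernel coefficients (J); sigma: noise (N).
    Hypothetical helper; columns 0, 2, ... hold the cosine-type terms.
    """
    tt_ = np.asarray(t, dtype=float)[:, None]
    J = len(a)
    U = np.empty((tt_.shape[0], 2 * J))
    V = np.empty_like(U)
    decay, grow = np.exp(-c * tt_), np.exp(c * tt_)
    cos, sin = np.cos(d * tt_), np.sin(d * tt_)
    U[:, 0::2] = a * decay * cos + b * decay * sin
    U[:, 1::2] = a * decay * sin - b * decay * cos
    V[:, 0::2] = grow * cos
    V[:, 1::2] = grow * sin
    A = sigma**2 + np.sum(a)
    return U, V, A


def dense_kernel(t, a, b, c, d):
    """Evaluate the celerite kernel on all pairs of times (O(N^2) memory)."""
    tau = np.abs(t[:, None] - t[None, :])[..., None]
    terms = a * np.exp(-c * tau) * np.cos(d * tau) + b * np.exp(-c * tau) * np.sin(d * tau)
    return terms.sum(axis=-1)


rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0.0, 5.0, size=8))
a, b = np.array([1.0, 0.5]), np.array([0.2, 0.1])
c, d = np.array([0.3, 0.8]), np.array([2.0, 5.0])
sigma = 0.1 * np.ones_like(t)

U, V, A = celerite_generators(t, a, b, c, d, sigma)
K_semisep = np.diag(A) + np.tril(U @ V.T, k=-1) + np.triu(V @ U.T, k=1)
K_dense = dense_kernel(t, a, b, c, d) + np.diag(sigma**2)
```

For long time baselines the factor `np.exp(c * t)` overflows, which illustrates why the published solver rescales these quantities.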

4.4 Transit models

As mentioned in the literature review, there are many approaches to define a transit model. A transit model is essentially a calculation of a lightcurve with transit signals, and usually consists of 7 to 11 parameters [3]. Light curves are not easy to compute quickly, because the computation has to consider the planet's size, its position in the sky, and the limb darkening coefficients [29]. We use the transit model from the package starry and define priors for some of the transit parameters using various built-in functions in the software exoplanet, which are based on scientific papers. We assume that there is one planet in the lightcurve and that the orbital period of the planet is between 0.1 days and 50 days. Modelling a second planet would double the number of transit parameters, which greatly slows down the computation. Starry uses the following parameters to compute the transit model [5][49].

Orbit
The orbit of the planet is defined using the following parameters:

• Stellar mass (m_star): the mass of the star in solar units
• Stellar radius (r_star): the radius of the star in solar units
• Orbital period (period): the time it takes for the planet to complete one orbit around the star, in days
• Reference time (t0): the first time the transit appears in the lightcurve
• Impact parameter (b): the projected distance between the center of the planet and the center of the star, in units of the stellar radius; this indicates whether the planet is crossing the center of the star (b = 0) or grazing near the edge of the star (b = 1)
• Eccentricity (ecc): the extent to which the orbit of the planet is not circular
• Omega (omega): the location in the orbit where the planet is closest to the star

Limb Darkened Lightcurve
The limb darkened light curve is computed using starry and is calculated based on the orbit calculated above, the time points t, and the following parameters:

• Limb darkening coefficients (u_star): the coefficients which determine how the shape of the transit is affected by limb darkening
• Radius of planet (r_pl): the radius of the planet in units of the radius of the star
• Exposure time integration (texp): this makes sure the transit model is computed using the same exposure time as the data [53]

The radius of the planet can be determined using the transit depth and the radius of the star. The transit depth represents the fraction of the star's light that the planet blocks, i.e. the ratio of the area of the planet's disk to the area of the star's disk.

$$\mathrm{transit\_depth} = \frac{\mathrm{r\_planet}^2}{\mathrm{r\_star}^2}$$

$$\mathrm{r\_planet} = \sqrt{\mathrm{transit\_depth}} \cdot \mathrm{r\_star}$$
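As a quick worked example of this relation, the conversion from a fractional transit depth to a planet radius is a one-liner. The function name and the example values are hypothetical.

```python
import numpy as np


def planet_radius(transit_depth, r_star):
    """Planet radius from a fractional transit depth and the stellar radius.

    The result is in the same units as r_star. Hypothetical helper that
    only applies r_planet = sqrt(transit_depth) * r_star.
    """
    return np.sqrt(transit_depth) * r_star


# A 1% dip in brightness around a star of one solar radius
r_pl_example = planet_radius(transit_depth=0.01, r_star=1.0)
```

In this example a 1% depth corresponds to a planet with one tenth of the stellar radius.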

For the impact parameter, we use exoplanet's built-in prior ImpactParameter, which is a uniform distribution from 0 to 1 + the ratio of the planet radius to the stellar radius. The eccentricity and the limb darkening coefficients are hard to infer from the transit lightcurve. To deal with this, we use exoplanet's built-in beta distribution prior for the eccentricity parameter and the built-in uninformative prior for the limb darkening coefficients, which are based on scientific papers by Kipping et al. [54].

4.5 Markov Chain Monte Carlo

Markov Chain Monte Carlo (MCMC) sampling is necessary for parameter estimation when it is not feasible to sample from the posterior directly. MCMC refers to a group of sampling algorithms. The standard MCMC method draws samples by "iteratively constructing a Markov chain with the posterior distribution as its equilibrium distribution" [3]. Essentially, the MCMC sampler starts the chain at the point in the parameter space that corresponds to the MAP parameters that we pass to it and then builds a chain as it moves around. The sampler "proposes a move to another point, ... accepts or rejects the move based on the posterior density ratios between the current and proposed locations", and adds the location to the chain [3].

In our work, we use a more recent sampler, the No-U-Turn Sampler (NUTS), which explores the posterior space more effectively and requires fewer posterior evaluations [3]. The NUTS sampler "avoids the random walk behavior" of standard MCMC sampling methods, and it does this by using "first-order gradient information" to inform the steps in the parameter space [55]. As a result, the NUTS sampler is usually able to "converge to high-dimensional target distributions" more efficiently than other MCMC sampling methods that employ a random walk [55]. The sampler is defined so that the distribution of the samples in the chain is guaranteed to converge to the posterior. In our work it was necessary to use an efficient sampler because we have a complex posterior with many parameters describing the planets, stellar activity, and noise in the lightcurves.

CHAPTER 5 Algorithm

5.1 Algorithm

In the algorithm section, we explain how we implemented the theory using the software exoplanet [4], pymc3 [6], starry [5], and celerite, with the help of pseudocode examples. Essentially, we develop a pipeline for preprocessing the lightcurves, defining prior distributions using traditional physics techniques, building a Gaussian Process model with a transit mean function and a quasi-periodic Rotation kernel, and sampling from the model with Markov Chain Monte Carlo. Based on the arguments passed by the user, this flexible pipeline can run experiments for three Gaussian Process models: RotGP (only rotation), ExoGP (only transit), and ExoRotGP (rotation and transit). We summarize the prior distributions used for each parameter and describe the steps for building the mean functions and kernel functions.

Figure 5.1: Method

5.1.1 Initial estimates

In order to initialize the informative prior distributions of the parameters in the Gaussian Process model, initial estimates of the parameter values must be obtained. The Transit model requires estimates for three parameters: bls_period (the period of the planet), t0 (the reference transit time), and bls_depth (the transit depth in percentage). Ground truth values are also needed for two parameters, m_star (the mass of the star in solar units) and r_star (the radius of the star in solar units), which are obtained from the NASA Exoplanet Archive. The Rotation kernel requires an initial estimate for rot_period (the rotation period of the star). Typically, the selection of parameter initializations is done on a star by star basis and astronomers will inspect the data of the star carefully to determine the best starting points. Since our goal is to develop an approach that is robust and general, we develop a pipeline that can automatically determine these initial estimates by learning from the data with traditional physics methods.


The initial estimates of the transit parameters are obtained using a combination of Transit Least Squares (TLS) and Box Least Squares (BLS). The combination is necessary because Transit Least Squares is better at detecting the transit signal and Box Least Squares is more suitable for measuring the transit depth. As mentioned in the literature review, Transit Least Squares uses a transit shape rather than a box shape to search for transits, making it better at detecting transits of small planets and planets in the noisy short cadence data from TESS [26]. However, Transit Least Squares can only search for transits in a flat detrended lightcurve and will return a missing value if it is not confident about its detection. So, it can only be applied to a flattened version of the lightcurve, which leads to underestimation of the transit depth. The BoxLeastSquares algorithm from astropy, on the other hand, works on the normal, unflattened lightcurve. It always returns an estimate, even if it is false, by simply selecting the maximum point in the BLS periodogram. Thus, the period detected with the Transit Least Squares algorithm is used to constrain the period range for BoxLeastSquares. BoxLeastSquares then computes a periodogram on a grid of 5000 periods from tls_period - 0.1 to tls_period + 0.1 and returns estimates for the transit period, reference time, and transit depth for the normal lightcurve [56].

Next, we apply the Lomb-Scargle (LS) Periodogram algorithm to the lightcurve using the functionality from exoplanet, setting the minimum period to 1 day and the maximum period to 100 days. The LS Periodogram can detect a periodic signal by fitting a sinusoid to the lightcurve at various frequencies [57]. From the periodogram, we extract an initial estimate of the rotation period by selecting the peak. Though sometimes the peak of the periodogram corresponds to the harmonic of the rotation period or the transit period, the Lomb-Scargle Periodogram is a reliable automated approach for making the first guess. As a result, we end up with initial estimates for the following parameters, which are passed to the GP model as a dictionary:

```python
init = {
    "m_star": m_star,
    "r_star": r_star,
    "period_guess": transit_period,
    "t0_guess": transit_t0,
    "depth_guess": transit_depth,
    "rotperiod_guess": rot_period,
}
```
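To illustrate the idea behind the periodogram step, the sketch below fits a sinusoid by least squares at each trial frequency and picks the period with the highest power. It is a plain NumPy illustration of the principle and not the exoplanet or astropy implementation used in the pipeline; the function name, the frequency grid and the synthetic lightcurve are all assumptions.

```python
import numpy as np


def sinusoid_periodogram(time, flux, min_period=1.0, max_period=100.0, n_freq=2000):
    """Least-squares sinusoid fit on a grid of trial frequencies.

    Returns the trial periods and, for each one, the fraction of the
    variance explained by the best-fitting sinusoid. Hypothetical helper.
    """
    time = np.asarray(time, dtype=float)
    y = np.asarray(flux, dtype=float) - np.mean(flux)
    total = np.sum(y**2)
    freqs = np.linspace(1.0 / max_period, 1.0 / min_period, n_freq)
    power = np.empty(n_freq)
    for i, freq in enumerate(freqs):
        phase = 2 * np.pi * freq * time
        X = np.column_stack([np.sin(phase), np.cos(phase), np.ones_like(time)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        power[i] = 1.0 - np.sum((y - X @ coef) ** 2) / total
    return 1.0 / freqs, power


# Synthetic lightcurve (assumed values): 7.3-day rotation signal plus noise
rng = np.random.default_rng(3)
time = np.arange(0.0, 90.0, 0.05)
flux = 2.0 * np.sin(2 * np.pi * time / 7.3) + rng.normal(0.0, 0.5, time.size)

periods, power = sinusoid_periodogram(time, flux)
rot_period_guess = periods[np.argmax(power)]
```

On real lightcurves the highest peak can sit at a harmonic of the rotation period or at the transit period, so the value is only used as a starting point for the prior.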

5.1.2 Three Models: ExoGP, RotGP, ExoRotGP

Figure 5.2: The three models

To build the Gaussian Process (GP) model, we use the software packages PyMC3 [6] (which is built on theano [58]), exoplanet [4], celerite, and starry [5], along with several other dependencies [54][59][60][61]. The code is based on tutorials by the exoplanet developers for fitting TESS data and modelling stellar variability with GPs [62][63][64]. The lightcurve consists of a time array, containing the time points, and a flux array, containing the flux values of the star at each time point. To create the GP model, the x (time) and y (flux) arrays and the initial parameter guesses are passed to the GP_model function, along with Boolean values for rot and transit. Depending on the values for rot and transit, this function can build three types of GPs:

• ExoRotGP: A GP that models both the transits and the rotation, where the mean function is a transit model and the kernel function consists of a rotation kernel and a noise kernel added together.
• RotGP: A GP that models only the rotation, where the mean function is zero and the kernel consists of only a rotation kernel.
• ExoGP: A GP that models only the transits, where the mean function is a transit model and the kernel consists of only a noise kernel.

Thus, before we can define the GP, we have to define the mean function and the kernel function. After the GP model is defined, we optimize it to find the Maximum a Posteriori (MAP) parameters.

```python
import pymc3 as pm
import exoplanet as xo
import theano.tensor as tt


def GP_model(time, flux, init, rot=True, transit=True):

    with pm.Model() as model:

        # 1. Define mean function
        ...
        # 2. Define kernel functions
        ...
        # 3. Define GP with kernels and mean model
        ...
        # 4. Optimize parameters of GP
        map_params = ...  # placeholder for the optimized (MAP) parameters

    return model, map_params


model, map_params = GP_model(time, flux, init, rot=True, transit=True)
```

For most of the parameters, we use PyMC3 (pm) to define prior distributions. For example, a parameter x can be defined as having a Normal prior distribution with mean 0 and a standard deviation 1. It is common practice to optimize the log of a variable for numerical stability and later calculate the true value using the exponential function. This is made easy in PyMC3, where deterministic variables can be defined to keep track of transformed variables [6]. For example, a variable y can track the transformed variable exp(logy).

```python
import pymc3 as pm
import theano.tensor as tt

with pm.Model():
    # Normal prior with mean 0 and standard deviation 1
    x = pm.Normal("x", mu=0, sd=1)

    # Prior on the log of a variable
    logy = pm.Normal("logy", mu=0, sd=1)

    # Deterministic variable that tracks y = exp(logy)
    y = pm.Deterministic("y", tt.exp(logy))
```

Some of the transit parameters require special prior distributions, which are defined using built-in functions from exoplanet [4].

1. Define mean function

If we do not model the transits, we use a constant mean function whose value has a Normal prior centered on zero with standard deviation 10.

| Parameter | Prior | Software |
|---|---|---|
| mean | N(0, 10) | PyMC3 |

If we do model the transits, however, we simulate a transit model and add it to the zero mean function. The transit model requires 11 parameters, seven of which define the orbit. Most of the orbital parameters have a Normal prior distribution where the mean is set to the initial parameter estimate and the standard deviation is 1. The prior for the radius of the planet requires a special calculation based on the transit depth and the radius of the star. Recall that the transit depth is the ratio of the surface area of the planet disk to the surface area of the star disk, i.e. r_pl^2 / r_star^2. Since we multiplied the flux values by 1e3 when normalizing the lightcurve, we multiply the transit depth by 1e-3 to return to the original flux values. The radius of the planet can then be defined as:

$$\frac{\mathrm{r\_pl}^2}{\mathrm{r\_star}^2} = 10^{-3} \cdot \mathrm{transit\_depth}$$

$$\mathrm{r\_pl} = \left(10^{-3} \cdot \mathrm{transit\_depth}\right)^{1/2} \cdot \mathrm{r\_star}$$

Furthermore, the impact parameter (b), eccentricity (ecc), and quadratic limb darkening coefficients (u_star) are defined with exoplanet's built-in prior distributions ImpactParameter, eccentricity.kipping13, and QuadLimbDark, which are based on the papers mentioned in the theory section. Once the priors for all the orbital parameters are defined, we create the orbit using exoplanet's built-in class KeplerianOrbit.

| Parameter | Prior | Software |
|---|---|---|
| mean | N(0, 10) | PyMC3 |
| period | N(period_guess, 1) | PyMC3 |
| t0 | N(t0_guess, 1) | PyMC3 |
| duration | N(dur_guess, 1) | PyMC3 |
| r_pl | N(sqrt(1e-3 * depth_guess) * r_star, 1) | PyMC3 |
| ror | N(r_pl / r_star, 1) | PyMC3 |
| b | ImpactParameter(ror) | exoplanet |
| ecc | eccentricity.kipping13 | exoplanet |
| u_star | QuadLimbDark() | exoplanet |

If we want to use a transit model with a fixed circular orbit (where eccentricity = 0), we define the orbit using the following parameters:

```python
if transit:
    # Define orbit of transit model (circular: eccentricity is not passed)
    orbit = xo.orbits.KeplerianOrbit(period=period, t0=t0, b=b, duration=duration)
```

If we want to use a transit model with a potentially eccentric orbit, where the eccentricity is a free parameter, we define the orbit using the following parameters:

```python
if transit:
    # Define orbit of transit model (eccentricity and omega are free parameters)
    orbit = xo.orbits.KeplerianOrbit(
        r_star=r_star,
        m_star=m_star,
        period=period,
        t0=t0,
        b=b,
        ecc=ecc,
        omega=omega,
    )
```

Finally, the transit model can be defined with exoplanet's LimbDarkLightCurve class, which uses starry to compute a limb darkened light curve at times t using the parameters for the limb darkening coefficients u_star, the orbit orbit, the radius of the planet r_pl, and the exposure time texp.

```python
# Transit model mean function
if transit:

    def transit_model(t):
        # Limb darkening coefficients
        u_star = xo.QuadLimbDark("u_star")

        # Light curve of each planet, scaled by 1e3 like the normalized flux
        light_curves = pm.Deterministic(
            "light_curves",
            xo.LimbDarkLightCurve(u_star).get_light_curve(
                orbit=orbit, r=r_pl, t=t, texp=texp
            )
            * 1e3,
        )

        # Sum over the planet axis and add the constant mean
        return tt.sum(light_curves, axis=-1) + mean
```

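2. Define Kernels

To define the noise (SHO) kernel, we first have to define the priors for the parameters: the normalization constant (S0, sampled here through log_Sw4, the log of S0 w0^4), the quality factor (Q), and the frequency of the undamped oscillator (w0). The normalization constant is based on the variance of the flux.

| Parameter | Prior | Software |
|---|---|---|
| log_w0 | N(0, 10) | PyMC3 |
| log_Sw4 | N(log(var(y)), 10) | PyMC3 |
| Q | 1/√2 (fixed) | PyMC3 |

```python
import numpy as np

# Noise kernel: SHO term with the quality factor fixed to 1/sqrt(2)
noise_kernel = xo.gp.terms.SHOTerm(log_Sw4=log_Sw4, log_w0=log_w0, Q=1 / np.sqrt(2))
```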

To define the rotation kernel, we first have to define the priors for the parameters: amplitude (amp), period (period), quality factor (Q0), difference between the quality factors (deltaQ), and the fractional amplitude (mix). The amplitude is based on the variance of the flux.

| Parameter | Prior | Software |
|---|---|---|
| log_amp | N(log(var(y)), 5) | PyMC3 |
| log_rotperiod | BoundedNormal(log(rotperiod_guess), 5), lower=0, upper=log(50) | PyMC3 |
| log_q0 | N(1, 10) | PyMC3 |
| log_deltaQ | N(2, 10) | PyMC3 |
| mix | UnitUniform | exoplanet |

```python
# Rotation kernel
if rot:
    rot_kernel = xo.gp.terms.RotationTerm(
        log_amp=logamp,
        period=rotperiod,
        log_Q0=logQ0,
        log_deltaQ=logdeltaQ,
        mix=mix,
    )
```

3. Define GP with kernel and mean model

The final parameter is the variance (s2), which is meant to capture white noise and is initialized using the variance of the flux values.

| Parameter | Prior | Software |
|---|---|---|
| logs2 | N(log(var(y)), 10) | PyMC3 |

The GP model can now be defined using the exoplanet package and the arguments kernel, x, diag, and mean. The argument x is a 1D array of time points and the argument diag specifies how much variance should be added to the diagonal of the covariance matrix, i.e. to each data point. We then perform GP regression by treating the observed data (flux) as the sum of the GP model and Gaussian noise.

$$f(x) \sim \mathcal{GP}(m(x), k(x, x')), \qquad \epsilon \sim \mathcal{N}(0, \Sigma), \qquad y = f(x) + \epsilon$$

```python
# ExoRotGP
if rot and transit:
    gp = xo.gp.GP(
        kernel=rot_kernel + noise_kernel,
        x=time,
        diag=tt.exp(logs2),
        mean=transit_model(time),
    )

# RotGP
if rot and not transit:
    gp = xo.gp.GP(kernel=rot_kernel, x=time, diag=tt.exp(logs2), mean=mean)

# ExoGP
if transit and not rot:
    gp = xo.gp.GP(
        kernel=noise_kernel, x=time, diag=tt.exp(logs2), mean=transit_model(time)
    )

# Implement GP regression, where the observed data are
# the sum of the GP model and Gaussian noise
gp.marginal("gp", observed=flux)
```

4. Optimize parameters of GP

The goal of optimization is to find the maximum a posteriori (MAP) parameters. The parameters are first optimized in groups (i.e. noise parameters, rotation parameters, transit parameters) and then optimized all together. According to the exoplanet developers, this leads to better results. After optimization, different components of the model can be plotted to assess how well the model describes the stellar activity and the exoplanet transits.

Figure 5.3: Top: Stellar activity; Middle: Exoplanet transits; Bottom: Residuals

Clip outliers
The model is further improved by applying sigma clipping, as there are often still outliers present. Residuals are obtained by subtracting the stellar activity and the exoplanet transits from the flux values, and the root mean square of the residuals is calculated to assess which points are outliers (more than 5 sigma). The outliers are removed and the GP model is run again, using the current MAP parameters as a starting point, so that better MAP parameters can be obtained.
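The clipping step can be sketched in a few lines of NumPy. The sketch below assumes that the model prediction (stellar activity plus transits) is already available as an array; the function name `sigma_clip_mask` and the synthetic data are hypothetical.

```python
import numpy as np


def sigma_clip_mask(flux, model_prediction, n_sigma=5.0):
    """Boolean mask that is True for points to keep.

    Points whose residual exceeds n_sigma times the root mean square
    of the residuals are flagged as outliers. Hypothetical helper.
    """
    resid = np.asarray(flux) - np.asarray(model_prediction)
    rms = np.sqrt(np.mean(resid**2))
    return np.abs(resid) < n_sigma * rms


# Synthetic example (assumed values): unit noise with three injected outliers
rng = np.random.default_rng(4)
prediction = np.zeros(1000)
flux = prediction + rng.normal(0.0, 1.0, 1000)
flux[[10, 500, 900]] += 50.0

mask = sigma_clip_mask(flux, prediction)
clean_flux = flux[mask]
```

The masked time and flux arrays can then be passed back to the model, starting from the current MAP parameters.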

5.1.3 Sampling

Now that we have a strong GP model and MAP parameters, pymc3's NUTS Markov Chain Monte Carlo method is used to sample from the posterior distribution with a special tuning schedule. After experimenting with different settings, the following MCMC settings were chosen because they achieved good results in a feasible amount of time:

```python
np.random.seed(42)
with model:
    # NUTS_sampler stands in for the NUTS sampling call used in the pipeline
    trace = NUTS_sampler(
        tune=350,
        draws=2000,
        start=map_params,
        cores=2,
        chains=2,
        target_accept=0.95,
    )
```

At each stage of the pipeline, our code automatically generates figures and tables and adds them to a document. Each target star has a document that is very useful for data analysis and debugging.

CHAPTER 6 Experiments

In the experiments section, we begin by giving an overview of the data from the Kepler and TESS missions and how they differ. Next, we describe the process of selecting the targets and the steps for preprocessing the data. We plot the targets with different correction methods so the reader can refer to them when interpreting the results. Following that, we define the experiments for investigating the research questions. To assess the benefits of jointly modelling the rotation and the transits, we compare the performances of three Gaussian Process models: RotGP (only rotation), ExoGP (only transit), and ExoRotGP (rotation and transit). Moreover, we summarize the techniques for evaluating the convergence, accuracy, and effectiveness of the models.

6.1 Data

The Kepler/K2 mission was in operation for 9.6 years and observed a narrow and distant part of the sky. It observed each star for 17 quarters, where the quarters were approximately 90 days each (4 years total). The Kepler spacecraft took images every 30 minutes with a telescope that had a diameter of 0.95 meters.

The TESS mission has collected data for more than two years, covering the southern hemisphere of the sky in the first year and the northern hemisphere in the second. The mission has been extended for two years until summer 2022. So far, the mission has observed 26 sectors, where each sector is a 24 degree by 96 degree patch of sky, and spent about 30 days per sector [65]. The TESS spacecraft takes images every two minutes and sends them down to Earth every 13.7 days to be analyzed. During the downlink period the data acquisition is interrupted, resulting in data gaps in the middle of each sector. In the TESS mission, not all observed stars have the same amount of data available: the majority of stars have only one or two sectors of data. The telescopes of TESS are smaller than the Kepler telescope, with a diameter of 10.5 cm, which results in noisier lightcurves than Kepler for stars of the same magnitude. Furthermore, since the TESS mission is so new, the instrumental correction methods are not as developed as those for Kepler.

Throughout the missions, the spacecraft take images of the stars and send them down to Earth. Once the data reaches the ground, a Science Processing pipeline is responsible for processing the data. The pipeline produces TargetPixelFiles (TPF) and LightCurveFiles (LCF) for each sector that a target star is observed in. TargetPixelFiles are stacks of images centered around the target star. For Kepler, each image in the stack corresponds to one thirty-minute timestamp in the quarter (90 days), while for TESS each image corresponds to one two-minute timestamp in the sector (26 days). TargetPixelFiles are the rawest form of data and are used to extract lightcurves [66].

LightCurveFiles contain flux time series data for a target star, which comes in two versions: Simple Aperture Photometry (SAP) and Pre-search Data Conditioning (PDCSAP). The SAP lightcurve is the raw photometric lightcurve extracted from target pixel data. The PDCSAP lightcurve is the pipeline corrected lightcurve where long-term trends have been removed using Cotrending Basis Vectors (CBVs). Usually, the PDCSAP lightcurve is flatter and contains less noise than the SAP lightcurve, but stellar astrophysical signals may have been removed in its derivation. After the data is processed and validated, it is archived in the Mikulski Archive for Space Telescopes (MAST) portal. The MAST portal is accessible to the public and contains millions of astronomical observations from space missions [65][66].

6.2 Selection of Targets

The first step was to find target stars with confirmed planets and confirmed rotation periods so that we could have ground truth transit and rotation parameters to compare with our method. To begin, we obtained a list of confirmed host stars with planets that were observed by the TESS mission from the NASA Exoplanet Archive. To ensure that the lightcurves contained a rotation signal and a transit signal, we filtered out planets which did not have a radius value and host stars that did not have a rotation period in the database. However, just because these targets had confirmed planets and rotation did not mean that the signals were detectable in the TESS data. To check which targets had detectable signals, we did a preliminary analysis with the LS periodogram and Transit Least Squares. From the list, we selected 9 Kepler targets and 15 TESS targets. As mentioned in the literature review, Martins et al. recently tried to find the rotation periods of 1000 TESS targets and grouped them into the categories "unambiguous rotation", "dubious rotation", "ambiguous variability" and "noisy lightcurve" [7]. We made sure to select only TESS lightcurves that were also observed by Martins et al., so we could see if our method could find the same rotation periods, or better results for stars that they classified as having "dubious rotation", "ambiguous variability", or being a "noisy lightcurve" [7].

6.3 Preprocessing

Downloading
LightCurveFiles for a star can easily be downloaded using the lightkurve package, which retrieves the files based on the Target ID of the star and the mission [67].

```python
import lightkurve as lk

lcf = lk.search_lightcurvefile(target_id, mission="TESS").download_all()
```

Normalizing
The lightcurves are normalized by dividing the flux by its median, subtracting 1, and multiplying by 1000, so that the data is centered around 0.

```python
import numpy as np

flux = (lcf.flux / np.median(lcf.flux) - 1) * 1e3
```

Stitching
For TESS targets which have rotation periods longer than 13 days (1/2 of the sector) and more than one consecutive sector, we stitch sectors together by concatenating the time arrays and the flux arrays.
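The normalization and stitching steps can be sketched together with NumPy. The sketch assumes that each sector is normalized by its own median before the arrays are concatenated, which is one possible reading of the procedure; the function name `stitch_sectors` and the toy numbers are hypothetical.

```python
import numpy as np


def stitch_sectors(times, fluxes):
    """Normalize each sector and concatenate the sectors in time order.

    times, fluxes: lists with one array per sector. Each flux array is
    divided by its median, shifted by -1 and scaled by 1e3 (assumed to be
    done per sector). Hypothetical helper.
    """
    normalized = [(f / np.median(f) - 1) * 1e3 for f in fluxes]
    time = np.concatenate(times)
    flux = np.concatenate(normalized)
    order = np.argsort(time)
    return time[order], flux[order]


# Two toy sectors with different flux levels
t1, f1 = np.array([0.0, 1.0, 2.0]), np.array([100.0, 101.0, 99.0])
t2, f2 = np.array([30.0, 31.0, 32.0]), np.array([200.0, 202.0, 198.0])
time, flux = stitch_sectors([t1, t2], [f1, f2])
```

After normalization both toy sectors are centered around 0 on the same scale, so the offset between their raw flux levels does not appear as a jump in the stitched lightcurve.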

Correcting
Ideally we want a correction method which corrects for the instrumental noise but preserves the astrophysical noise (stellar activity). The TESS pipeline correction method (PDCSAP) is effective at removing the instrumental noise but also removes long-term astrophysical trends. This makes sense as the pipeline is focused on exoplanet detection and does not care about preserving the astrophysical signals. We incorporate an alternative correction method, RegressionCorrector, which focuses on removing scattered light and motion from the space satellite using linear regression and Principal Component Analysis on the Target Pixel File [68]. The RegressionCorrector works by:

• Selecting the pixels outside of the aperture (which typically represent instrumental noise)
• Creating a design matrix (matrix of regressors) using the pixels
• Reducing the design matrix using Principal Component Analysis (PCA) with 5 principal components
• Using linear regression to detrend the lightcurve with the principal components

```python
import lightkurve as lk

# Extract lightcurve from target pixel file
aper = tpf_2min.pipeline_mask
raw_lc = tpf_2min.to_lightcurve()

# Make a design matrix (matrix of regressors) from the pixels outside the aperture
dm = lk.DesignMatrix(tpf_2min.flux[:, ~aper], name="pixels").pca(5).append_constant()

# Apply Regression Correction to lightcurve
reg = lk.RegressionCorrector(raw_lc)
lc_reg = reg.correct(dm)
```

For the Kepler targets, we use the pipeline corrected lightcurve because it captures the rotational modulation signal. For the TESS targets, we use the regression corrected lightcurves, except for cases where it is clear the pipeline corrected lightcurve is superior.

Figure 6.1: 9 Kepler Targets: We display the Pipeline Corrected lightcurve. Panels: (a) Kepler-107, (b) Kepler-155, (c) Kepler-17, (d) Kepler-39, (e) Kepler-43, (f) Kepler-45, (g) Kepler-75, (h) Kepler-78, (i) Kepler-96.

Figure 6.2: 14 TESS Targets: We display the Pipeline Corrected lightcurve (Blue) and the Regression Corrected lightcurve (Green). Panels: (a) CoRoT-18, (b) HAT-P-11, (c) HATS-16, (d) HATS-18, (e) HATS-47, (f) HIP 65 A, (g) Qatar-1, (h) WASP-140, (i) WASP-166, (j) WASP-167, (k) WASP-173, (l) WASP-19, (m) WASP-8, (n) WASP-93.

6.4 Experimental Design

As mentioned in previous sections, stellar activity and transit signals are usually characterized separately, with one signal removed to amplify the other. Rotation periods can be obtained by first masking the transits and then fitting the out-of-transit parts of the lightcurve. Transit parameters can be obtained by first detrending the lightcurve to remove long-term signals and then fitting the transit model. Though these approaches have proven to be successful, they discard part of the data, which may remove information that is important for the task. By jointly modelling the components, the model can work with all the data and learn how the rotation and transit parameters influence each other, which can improve characterization. One drawback, however, is that joint modelling is considerably more computationally expensive because the model has to consider many more parameters.

To evaluate the benefits of the joint model, we apply three GP models to the 15 TESS and 9 Kepler targets: RotGP with a rotation kernel, ExoGP with a transit mean model, and ExoRotGP with both rotation and transit components. Our hypothesis is that a joint model will lead to better rotation and transit characterization.

Experiment 1: RotGP
RQ1: How accurately can we model stellar activity in TESS and Kepler lightcurves with Gaussian Processes?

For this experiment, we mask the transits in the lightcurves, using TransitLeastSquares to find them. We then obtain a first estimate of the rotation period using a Lomb-Scargle (LS) periodogram. We apply the RotGP model to 14 TESS targets and 9 Kepler targets and sample from it with MCMC to obtain estimates of the rotation parameters. Since it is difficult to inspect all the parameters of all the stars, we use the rotation period as a proxy for the accuracy of the stellar activity model. Finally, we compare the rotation period estimates obtained by RotGP to reference rotation periods from the literature. Within this experiment, we also examine whether the RotGP model obtains different rotation periods for the Pipeline Corrected (PDCSAP) and Regression Corrected lightcurves of specific TESS targets.
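The initial period estimate described above can be illustrated with a small periodogram written in NumPy. This is a sketch of a floating-mean least-squares periodogram rather than the routine used in the thesis; the function name `ls_periodogram`, the toy signal, and the known transit mask are assumptions. In practice the mask would come from the transit search and a library periodogram would normally be used.

```python
import numpy as np

def ls_periodogram(t, y, periods):
    """Fraction of variance explained by a sinusoid plus constant at each trial period."""
    total = np.sum((y - y.mean()) ** 2)
    power = np.empty(len(periods))
    for i, p in enumerate(periods):
        arg = 2 * np.pi * t / p
        A = np.column_stack([np.sin(arg), np.cos(arg), np.ones_like(t)])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        power[i] = 1.0 - np.sum((y - A @ coef) ** 2) / total
    return power

rng = np.random.default_rng(1)
t = np.arange(0, 90, 0.02)
y = 2.0 * np.sin(2 * np.pi * t / 12.0) + rng.normal(0, 0.5, t.size)

# Inject box-shaped transits (period 4.41 d) and mask them before the period search
p_orb, t0, dur = 4.41, 1.0, 0.2
in_transit = np.abs((t - t0 + 0.5 * p_orb) % p_orb - 0.5 * p_orb) < 0.5 * dur
y[in_transit] -= 5.0
mask = ~in_transit

periods = np.linspace(1, 40, 400)
power = ls_periodogram(t[mask], y[mask], periods)
p_init = periods[np.argmax(power)]   # first guess for the rotation period
```

The value `p_init` would then serve only as a starting point for the rotation period parameter of the GP, which is refined by MCMC.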

Experiment 2: ExoGP
RQ2: How accurately can we model exoplanet transits in TESS and Kepler lightcurves with Gaussian Processes?

For this experiment, we use Transit Least Squares to detect the planet and Box Least Squares to get an initial estimate of the transit period, time of reference, and depth. We run the ExoGP model on 14 TESS targets and 9 Kepler targets and sample from it with MCMC to obtain estimates of the transit parameters. Since it is difficult to inspect all the parameters of all the planets, we use the planet radius as a proxy for the accuracy of the transit model. Within this experiment, we also examine the influence of various design choices on the transit model:

• Bin size: For TESS lightcurves, can binning the data lead to the same results while drastically reducing computation time? The TESS data comes in short cadence (2 minutes) and in sectors of 27 days, which results in almost 20,000 data points per sector. To answer this question, we compare the results of the model on regular TESS 2 minute cadence data and on TESS data binned to a 30 minute cadence.

• Exposure time: For Kepler lightcurves, what is the impact of temporal binning on the recovery of transit parameters, and can it be fixed by incorporating the exposure time? The Kepler lightcurves have long cadence (30 minutes), which can smear the transit shape and lead to the recovery of incorrect transit parameters. Incorporating information about the exposure time can fix this issue. To answer this question, we compare the results for two transit models: one that incorporates the exposure time and one that does not.

• Fixed vs. free orbits: Is it better to treat the eccentricity as a free parameter, which could converge to an approximate value based on the other parameters, or is it better to assume a circular orbit? To answer this question, we compare the results for two transit models: one assuming a potentially eccentric orbit, where eccentricity is a free parameter, and one assuming a circular orbit, where there is no eccentricity parameter.
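The bin-size comparison above relies on averaging consecutive 2 minute samples into 30 minute bins. A minimal sketch of such a binning step is shown below; the function name `bin_lightcurve` is hypothetical, and the sketch assumes an evenly sampled lightcurve without gaps.

```python
import numpy as np

def bin_lightcurve(t, flux, factor):
    """Average consecutive groups of `factor` points (15 x 2 min = 30 min)."""
    n = (len(t) // factor) * factor          # drop an incomplete final bin
    t_b = t[:n].reshape(-1, factor).mean(axis=1)
    f_b = flux[:n].reshape(-1, factor).mean(axis=1)
    return t_b, f_b

# One 27-day sector at 2 minute cadence: 19,440 points
t = np.arange(19440) * 2.0 / 1440.0
rng = np.random.default_rng(2)
flux = rng.normal(0.0, 1.0, t.size)

t_b, f_b = bin_lightcurve(t, flux, factor=15)   # 1,296 points at 30 minute cadence
```

Binning by a factor of 15 reduces the number of points per sector from 19,440 to 1,296 and lowers the white-noise scatter, at the cost of smoothing features shorter than the bin width.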

Experiment 3: ExoRotGP
RQ3: Does joint modelling improve characterization of rotation and transit parameters?

For this experiment, we use Transit Least Squares to detect the planet, BoxLeastSquares to get an initial estimate of the transit period, time of reference, and depth, and the LS periodogram to get an initial estimate of the rotation period. We run the ExoRotGP model on 14 TESS targets and 9 Kepler targets. We then sample from the ExoRotGP with MCMC to obtain estimates of the transit and rotation parameters. Again, we focus on the rotation period and the planet radius.
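The role of the box search in providing initial values for the period, reference time, and depth can be illustrated with a brute-force grid search. This is a simplified stand-in written for illustration, not the Box Least Squares or Transit Least Squares algorithm itself; the function name `box_search`, the fixed transit duration, and the score `depth * sqrt(n_in)` are assumptions.

```python
import numpy as np

def box_search(t, y, periods, duration, n_phase=100):
    """Grid search over period and mid-transit time for a box-shaped dip."""
    best = (-np.inf, None, None, None)       # score, period, t0, depth
    for p in periods:
        phase = t % p
        for t0 in np.linspace(0.0, p, n_phase, endpoint=False):
            dist = np.abs((phase - t0 + 0.5 * p) % p - 0.5 * p)
            inside = dist < 0.5 * duration
            n_in = inside.sum()
            if n_in < 3:
                continue
            depth = y[~inside].mean() - y[inside].mean()
            score = depth * np.sqrt(n_in)    # crude signal-to-noise proxy
            if score > best[0]:
                best = (score, p, t0, depth)
    return best[1:]

# Toy 27-day lightcurve at 30 minute cadence with a 4.41-day planet
rng = np.random.default_rng(3)
t = np.arange(1296) * 30.0 / 1440.0
y = rng.normal(0.0, 0.3, t.size)
true_p, true_t0, dur = 4.41, 1.3, 0.15
y[np.abs((t - true_t0 + 0.5 * true_p) % true_p - 0.5 * true_p) < 0.5 * dur] -= 5.0

p_init, t0_init, depth_init = box_search(t, y, np.arange(3.0, 6.0, 0.01), dur)
```

The returned values are coarse because they are limited by the grid spacing, which is sufficient for initializing a sampler.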

Experiment 4: Sector Length
What is the influence of sector length on the quality of results?

For this experiment, we focus on one specific TESS star, WASP-62, which has one year of data available but no confirmed rotation period. We run the ExoRotGP on stitched segments of increasing length, 30 days, 60 days, 90 days, and see if we can establish an accurate rotation period for the star.

6.5 Experimental Evaluation

Finding accurate parameter estimates for real stars and planets is inherently challenging because one cannot guarantee that the parameter estimates are true. As mentioned earlier, exoplanets are extremely far away and unfortunately, we cannot get close to them and measure them directly like we do with objects on or near Earth. Thus, the convention in astronomy is to validate methods using simulated lightcurves, then apply the methods to real lightcurves, and finally, compare the parameter mean and uncertainty intervals to those obtained by other researchers in refereed publications. The consistency in results from different astronomers, who have used different methods, gives strong support to the accuracy of the parameter estimate. Conventionally, a parameter value is consistent with another value if it is within one standard deviation of the value.
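The one-standard-deviation convention can be written as a simple interval check. The helper below is a hypothetical illustration that tests whether an estimate falls inside the reference interval; the example numbers are the Kepler-17 and Kepler-96 rotation periods quoted in Table 7.1.

```python
def is_consistent(value, reference, ref_err_minus, ref_err_plus):
    """True if `value` lies inside the one-sigma interval of the reference."""
    return reference - ref_err_minus <= value <= reference + ref_err_plus

# Kepler-17: estimate 12.0 d vs. reference 12.01 +/- 0.16 d
kepler17_ok = is_consistent(12.0, 12.01, 0.16, 0.16)

# Kepler-96: RotGP estimate 9.4 d vs. reference 20.98 (-6.05, +4.61) d
kepler96_ok = is_consistent(9.4, 20.98, 6.05, 4.61)
```

A stricter or looser variant could also take the uncertainty of the estimate itself into account; the version here uses only the reference interval.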

Simulated Experiment
We perform experiments with simulated data to validate the ExoRotGP model. This involves generating a lightcurve with random noise and injecting a rotation signal and a transit signal, which provides true parameter values that we can compare to the results of our model.
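A simulated lightcurve of this kind can be sketched as the sum of a rotation signal, a transit signal, and white noise. The sketch below is not the simulation used in the thesis: the function name `simulate_lightcurve`, the box-shaped transit without limb darkening, the harmonic amplitude, and the noise level are assumptions. The rotation period (10 days), orbital period (4.41 days), and radius ratio (0.108) follow the values quoted in Section 7.1.

```python
import numpy as np

def simulate_lightcurve(t, p_rot=10.0, amp=2.0, p_orb=4.41, t0=1.0,
                        ror=0.108, duration=0.12, sigma=0.3, seed=0):
    """Return (flux, rotation, transit) in parts per thousand."""
    rng = np.random.default_rng(seed)
    # Rotation signal: fundamental plus first harmonic (quasi-periodic shape)
    rot = amp * (np.sin(2 * np.pi * t / p_rot)
                 + 0.4 * np.sin(4 * np.pi * t / p_rot + 0.7))
    # Box transit of depth ror**2, ignoring limb darkening and ingress/egress
    phase = (t - t0 + 0.5 * p_orb) % p_orb - 0.5 * p_orb
    transit = np.where(np.abs(phase) < 0.5 * duration, -1e3 * ror ** 2, 0.0)
    noise = rng.normal(0.0, sigma, t.size)
    return rot + transit + noise, rot, transit

t = np.arange(19440) * 2.0 / 1440.0          # 27 days at 2 minute cadence
flux, rot, transit = simulate_lightcurve(t)
```

Because the injected components are returned separately, the recovered rotation and transit parameters can be compared directly with the known inputs.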

Comparison to Refereed Publications
We obtained rotation periods and planet radii from the NASA Exoplanet Archive. For some targets, different values have been obtained by different research groups. We consider values from the three most recent refereed publications. For the Kepler lightcurves, comparison to literature is straightforward because the majority of the reference rotation periods and planet radii have been derived from the same lightcurves as used in this study. Because of the novelty of TESS data, many of the TESS lightcurves have not been studied, but many of the stars observed by TESS have been observed by ground-based observatories and do have reference rotation periods in the literature. Since these reference rotation periods were obtained by other missions using different lightcurves and different techniques (i.e. not photometry), it can be more challenging to recover the same rotation periods in TESS lightcurves than in Kepler lightcurves. For example, an activity cycle could be at its minimum when TESS observed the star, while it was at its maximum when the star was observed by the other observatory.

In order to deal with this ambiguity, we reference the study by Martins et al., who have analyzed the same TESS lightcurves we have and labelled them as having either unambiguous rotation, dubious rotation, ambiguous variability, or being a noisy lightcurve [7]. For the stars that were classified as having unambiguous rotation signals, we consider the rotation period estimate provided by Martins et al. For the stars that were not classified as having clear rotation, we used the rotation periods from the NASA Exoplanet Archive, which were obtained using data from other surveys such as WASP (Wide Angle Search for Planets), CoRoT (Convection, Rotation and planetary Transits), HAT-P (Hungarian Automated Telescope Network, HATNet), HATS (Hungarian-made Automated Telescope Network-South), and HIP (Hipparcos).

Accuracy of the model
• Mean and Quantiles: Once the MCMC samples are obtained, we calculate the mean and the 16th and 84th percentiles for each parameter, which allows us to calculate +/- one standard deviation. The parameter estimates can be compared with the references, and the size of the errors can indicate the confidence of the model.
• Residual analysis: We obtain the residuals by subtracting the stellar activity and transit model components from the data and analyze whether the histogram of the residuals is Gaussian.
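Both accuracy checks can be sketched on arrays of posterior samples and residuals. The helper names `summarize` and `residual_shape` are hypothetical, and using skewness and excess kurtosis as a numerical complement to the histogram inspection is an assumption of this sketch.

```python
import numpy as np

def summarize(samples):
    """Mean and the distances to the 16th and 84th percentiles."""
    mean = np.mean(samples)
    lo, hi = np.percentile(samples, [16, 84])
    return mean, mean - lo, hi - mean

def residual_shape(residuals):
    """Skewness and excess kurtosis; both are close to 0 for Gaussian residuals."""
    z = (residuals - np.mean(residuals)) / np.std(residuals)
    return np.mean(z ** 3), np.mean(z ** 4) - 3.0

rng = np.random.default_rng(4)
samples = rng.normal(10.0, 0.5, 20000)       # stand-in for MCMC samples of Prot
mean, err_minus, err_plus = summarize(samples)

residuals = rng.normal(0.0, 0.3, 20000)      # stand-in for data minus model
skew, excess_kurtosis = residual_shape(residuals)
```

For a Gaussian posterior the two interval half-widths are nearly equal and close to one standard deviation; strongly asymmetric values signal a skewed posterior.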

Convergence of the model
• Gelman-Rubin statistic (rhat): We use this statistic, which "evaluates MCMC convergence by analyzing the difference between multiple Markov chains" [69]. The Gelman-Rubin statistic assumes MCMC has not converged if there are large differences between "the estimated between-chains and within-chain variances for each model parameters" [69] [70]. If the Gelman-Rubin statistic is below 1.1 for all parameters, this indicates that the model has converged.
• Effective sample size (neff): We use this diagnostic to assess how well the MCMC converges. When the number of effective samples is low, it usually indicates that there is high autocorrelation in the Markov chain, which means the sampler did not explore the parameter space effectively [71] [70].
• Corner plots: We visually assess convergence by inspecting corner plots, which display the covariances between the parameters.
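The comparison of between-chain and within-chain variances behind the Gelman-Rubin statistic can be sketched as follows. This is the classic form of the statistic for a single parameter; sampling libraries typically report refined variants (for example split or rank-normalized versions), so values may differ slightly from the ones in the Appendix. The function name `gelman_rubin` is hypothetical.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic R-hat for one parameter; chains has shape (n_chains, n_draws)."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    B = n * chains.mean(axis=1).var(ddof=1)       # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()         # mean within-chain variance
    var_hat = (n - 1) / n * W + B / n             # pooled variance estimate
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(5)
mixed = rng.normal(0.0, 1.0, (4, 2000))                   # chains agree
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])    # two chains offset

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Chains that sample the same distribution give a value close to 1, while chains that settle in different regions push the statistic well above the 1.1 threshold.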

Effectiveness of the model
• Generalization: We assess whether the model is robust and performs well for many stars and planets.
• MCMC computation time: We assess the trade-off between accuracy and computation time.

CHAPTER 7 Results and Discussion

In this section, we present the results of the experiments on simulated lightcurves and on real TESS and Kepler lightcurves. For rotation period analysis, we discuss the influence of different correction methods and observation windows on the estimation of rotation periods. For transit analysis, we discuss the influence of temporal binning on the retrieval of the planet radii.

7.1 Simulated Experiment

Figure 7.1: Simulated Lightcurve with Rotation Period 10 days and Orbital Period 4.41 days

We ran experiments with simulated data to validate the model. This involved generating a lightcurve with random noise and injecting a rotation signal and a transit signal, providing true parameter values that we could compare to the results of our model. We injected a rotation period of 10 days and a planet with a planet to star radius ratio (ror) of 0.108. We obtained small uncertainty intervals for the orbital period and the ror, both of which contain the true values. We observed that the recovered eccentricity was slightly higher than the ground truth, but this did not affect the planet radius. This can be explained by the fact that the eccentricity is a degenerate parameter that is difficult to infer from the transit lightcurve.

| Parameter | Injected Value | Recovered Value |
|---|---|---|
| rotperiod | 10 | 10.7877 (-0.84609, +0.05547) |
| ror | 0.1085 | 0.1086 (-0.00016, +0.00143) |
| period | 4.411953 | 4.412 (-0.00026, +0.00004) |
| eccentricity | 0.0 | 0.09649 |


7.2 Rotation Periods

7.2.1 Kepler

Figure 7.2: In this figure, we plot the estimated rotation periods from our GP models and the reference rotation periods; The grey rectangles on the diagonal represent the reference rotation periods and their error bars

Table 7.1: Stellar Rotation Periods for Kepler Stars

| star | corr | sector | RotGP Prot (days) | ExoRotGP Prot (days) | Reference 1 Prot (days) | Reference 2 Prot (days) |
|---|---|---|---|---|---|---|
| Kepler-107 | pdcsap | [2] | 11.1(−1.4,+3.0) | 13.6(−1.5,+2.6) | 14 | < 20.3+3.3 |
| Kepler-155 | pdcsap | [5] | 27.2(−1.3,+1.4) | 27.5(−1.0,+1.2) | 26.43±1.32 | |
| Kepler-17 | pdcsap | [1] | 12±0.2 | 12(−0.2,+0.1) | 12.01±0.16 | 12.36±1.5 |
| Kepler-39 | pdcsap | [3] | 4.5±0.1 | 4.5±0.1 | 4.5±0.07 | |
| Kepler-43 | pdcsap | [1] | 12.8(−0.7,+0.9) | 13.3(−0.7,+1.0) | 12.95±0.25 | |
| Kepler-45 | pdcsap | [1] | 16.7(−0.6,+0.9) | 15.4(−0.6,+0.5) | 15.8±0.2 | |
| Kepler-78 | pdcsap | [1] | 13.2±0.5 | 12.7±0.2 | 12.588±0.03 | |
| Kepler-75 | pdcsap | [2] | 19.6±0.3 | 19.5±0.2 | 19.18±0.25 | |
| Kepler-96 | pdcsap | [2] | 9.4(−0.7,+1.0) | 15.9±0.4 | 15.3 | 20.98(−6.05,+4.61) |

For Kepler stars, the RotGP model (red) estimates rotation periods that are within one day of the reference rotation periods, with the exception of Kepler-96, where the RotGP model underestimates the rotation period at 9.4 days instead of 15.3 days. ExoRotGP (green) outperforms RotGP by estimating rotation periods that are within one day of all the reference rotation periods. From the plot, it can be observed that our method obtains smaller errors than the reference rotation periods for some stars (Kepler-17, Kepler-107, and Kepler-96). When analyzing the summary statistics of the models (see Appendix), we see that the number of effective samples neff is low for some stars, indicating that the posterior space was not explored effectively. However, the value of the Gelman-Rubin statistic rhat is below 1.1 for all stars, indicating convergence of the models.

7.2.2 TESS

Figure 7.3: The rotation periods estimated by RotGP (red) and ExoRotGP (green) for the 7 TESS stars classified as having "unambiguous rotation", plotted with the reference rotation periods.

Table 7.2: Stellar Rotation Periods for TESS Stars with Unambiguous Rotation

| star | corr | sector | RotGP Prot (days) | ExoRotGP Prot (days) | Reference 1 Prot (days) | Reference 2 (Martins) Prot (days) |
|---|---|---|---|---|---|---|
| CoRoT-18 | pdcsap | [6] | 5.2(−0.5,+0.6) | 5.7(−0.5,+0.6) | 5.4±0.4 | 5.248±0.765 |
| HAT-P-11 | reg | [14, 15] | 28.5(−2.6,+2.2) | 28.6(−1.7,+1.7) | 30.5(−3.2,+4.1) | 10.189±1.104 |
| HIP 65 A | pdcsap | [1] | 12.0(−0.6,+0.7) | 12.1(−0.7,+0.8) | 13.2(−1.4,+1.9) | 13.219±1.859 |
| WASP-140 | pdcsap | [4, 5] | 14.7(−4.4,+15.5) | 10.1(−0.1,+0.1) | 10.4±0.1 | 10.229±1.217 |
| WASP-167 | reg | [10] | 1.2(−0.0,+0.1) | 1.1(−0.0,+0.0) | < 1.81 | 1.073(−0.029,+0.029) |
| WASP-173 A | pdcsap | [2] | 8.1(−0.2,+0.3) | 8.2(−0.1,+0.1) | | 7.765±1.37 |
| WASP-8 | pdcsap | [2] | 7.2(−0.7,+1.2) | 10.8(−2.3,+5.1) | | 7.247±1.094 |

The RotGP and ExoRotGP models estimate rotation periods that are within one day of the reference rotation periods from Martins et al. and other references, with the exception of two cases: WASP-8 and WASP-140. Interestingly, these two cases demonstrate differences in performance between RotGP and ExoRotGP. For the star WASP-8, RotGP returns a more accurate and precise rotation period than ExoRotGP. The lightcurve of WASP-8 in the Experiments section contains only three transits, of which the middle one is cut off, which could explain why incorporating a transit model worsens the results. For the star WASP-140, ExoRotGP obtains a very precise and accurate rotation period while RotGP does not. The lightcurve of WASP-140 in the Experiments section shows that instrumental noise remains. Since the ExoRotGP model has a noise kernel and the RotGP model does not, it could be that the noise kernel absorbs the instrumental noise, allowing the rotation kernel to perform better.

Another interesting result is obtained for the star HAT-P-11, which has two different rotation periods in the literature: 30.5 and 10.1 days. When using the Regression Corrected lightcurve, we obtain a longer rotation period than Martins et al. When comparing the two corrections of the lightcurve of HAT-P-11 in the Experiments section, it is evident that the correction method we use is better at preserving the rotational modulation signal and that the pipeline correction flattens the lightcurve.

When analyzing the summary statistics of the models (see Appendix), we see that the number of effective samples neff is low for two targets, indicating that the posterior space was not explored effectively. The value of the Gelman-Rubin statistic rhat is above 1.1 for WASP-167 and CoRoT-18, which is surprising because the correct rotation periods are found for those stars. This could be addressed by increasing the number of tuning steps.

Figure 7.4: The rotation periods estimated by RotGP (red) and ExoRotGP (green) for the 7 TESS stars classified as being noisy, having ambiguous variability, or having dubious rotation by Martins et al.; We plot with the reference rotation periods based on lightcurves from other missions.

Table 7.3: Stellar Rotation Periods for TESS Stars with Ambiguous Rotation or Noise

| star | corr | sector | RotGP Prot (days) | ExoRotGP Prot (days) | Reference 1 Prot (days) | Variability |
|---|---|---|---|---|---|---|
| HATS-16 | reg | [2] | 19.4(−5.7,+17.0) | 12.9(−2.4,+1.6) | 12.35±0.02 | Dubious |
| HATS-18 | reg | [10] | 12.3(−1.2,+3.3) | 11.6(−2.0,+4.5) | 9.8±0.4 | Noisy |
| HATS-47 | reg | [13] | 7.4(−0.5,+0.7) | 7.8(−0.4,+1.6) | 6.42±0.28 | Ambiguous |
| Qatar-1 | reg | [24, 25] | 25.4(−8.4,+5.0) | 25.9(−8.2,+6.1) | 30.0±7.0 | Ambiguous |
| WASP-166 | reg | [8] | 3.8(−0.4,+1.8) | | 12.1±0.9 | Noisy |
| WASP-19 | reg | [9] | 12.6(−0.5,+3.5) | 14.2(−0.5,+10.6) | 10.5±0.2 | Dubious |
| WASP-93 | pdcsap | [17] | 11.0(−2.6,+12.0) | | 1.45±0.4 | Noisy |

Figure 7.4 shows the rotation periods estimated by RotGP (red) and ExoRotGP (green) for the 7 TESS stars classified as having "dubious rotation", "ambiguous variability", or being a "noisy lightcurve" [7]. The reference rotation periods used are based on lightcurves from other missions, and it was not guaranteed that the rotation signal was present in the TESS lightcurve. For the noisy lightcurves, WASP-93, HATS-18, and WASP-166, our method does not pick up the rotation signal, confirming that these TESS lightcurves are indeed too noisy to analyze rotation. For the dubious star, HATS-16, and the ambiguous stars, HATS-47 and Qatar-1, our method performs better than that of Martins et al. by confirming rotation periods within one day of the reference rotation periods (12 days, 7 days and 26 days, respectively). Similar to the case of HAT-P-11, our method finds the correct rotation period for these stars because we used the Regression Corrected lightcurve rather than the Pipeline Corrected lightcurve. The Gelman-Rubin statistic rhat is below 1.1 for all models, indicating convergence of MCMC.

7.2.3 Observation Window

Figure 7.5: Sectors 6-12 of the star WASP-62.

In addition to observing a group of stars, we also focus on one individual star to study the effect of increasing the observation window on the quality of rotation period recovery. The star we examine is WASP-62, which has a strong transit signal, with an orbital period of 4.41 days and a planet radius of 1.31 Rjup, but a less clear rotational modulation signal. The rotation period has not been confirmed but is estimated to have an upper limit of 7.5 days. The estimates of the upper limit are plotted in the figures below. Martins et al. classify this star as having ambiguous variability and do not find a rotation period [7].

Figure 7.6: When the model is applied to individual sectors of around 27 days, the uncertainty of the rotation period is quite high for WASP-62.

Figure 7.7: For stitched segments of 60 days, the uncertainties of the model decrease and the rotation period estimates converge around 6-7 days. There is a high error for the stitched segment [7,8].

Figure 7.8: For stitched segments of 90 days, the uncertainties of the model decrease and the rotation period estimates seem to converge around 6-7 days. Again, the results are harmed by Sector 8.

For WASP-62, we observe that increasing the number of sectors generally leads to better results, as the model has more information about the rotation signal. However, this does not hold for every part of the lightcurve. A single noisy sector can harm the result for the entire segment, as demonstrated by Sector 8. Furthermore, as the observation window increases, it becomes more likely that the rotational modulation signal changes, due to the shrinking and growing of starspots. Thus, the rotation period signal can become inconsistent, which can also lead to a larger error. With the case study of WASP-62, we have demonstrated an approach in which Gaussian Processes can be used to find new rotation periods. Based on the consistency of the results for various segments, we interpret a rotation period of approximately 6 days for WASP-62.

7.3 Transit Characterization

7.3.1 Kepler Exposure Time Integration

Figure 7.9: Influence of Exposure Time Integration on Retrieval of Planet Radii for Kepler Planets


Table 7.4: Radii for the planets observed by Kepler in Jupiter Radius (Rjup)

| planet | corr | sector | ExoGP (Rjup) | ExoRotGP (Rjup) | Reference 1 (Rjup) | Classification |
|---|---|---|---|---|---|---|
| Kepler-107 e | pdcsap | [2] | 0.2724(−0.0101,+0.0103) | 0.2709(−0.0101,+0.0113) | 0.259±0.003 | Neptune |
| Kepler-155 b | pdcsap | [5] | 0.1578(−0.0126,+0.0135) | 0.1571(−0.0119,+0.0128) | 0.154(−0.024,+0.025) | Neptune |
| Kepler-17 b | pdcsap | [1] | 1.3828(−0.038,+0.0457) | 1.3761(−0.0406,+0.0445) | 1.33±0.04 | Jupiter |
| Kepler-39 b | pdcsap | [3] | 1.1811(−0.0889,+0.0879) | 1.1853(−0.0875,+0.0833) | 1.24(−0.1,+0.09) | Jupiter |
| Kepler-43 b | pdcsap | [1] | 1.1706(−0.0545,+0.0504) | 1.169(−0.0536,+0.0511) | 1.16(−0.03,+0.04) | Jupiter |
| Kepler-45 b | pdcsap | [1] | 0.9902(−0.1371,+0.1304) | 0.9878(−0.1372,+0.1359) | 0.96±0.11 | Jupiter |
| Kepler-78 b | pdcsap | [1] | 0.1098(−0.0099,+0.0113) | 0.1091(−0.0094,+0.0111) | 0.1±0.01 | super Earth |
| Kepler-75 b | pdcsap | [2] | 1.0462(−0.0587,+0.052) | 1.0508(−0.0605,+0.0496) | 1.05±0.03 | Jupiter |
| Kepler-96 b | pdcsap | [2] | 0.228(−0.0229,+0.0225) | 0.2284(−0.0228,+0.0216) | 0.238±0.02 | Neptune |

As mentioned in the methodology section, temporal binning can smear the transit shape for data with longer cadence (e.g. 30 minutes). Including exposure time information in the computation of the transit model led to a drastic improvement in the recovery of small and large planet radii in Kepler lightcurves. Both models with exposure time estimate planet radii that are within 1 sigma of the reference radii. When comparing the ExoRotGP and ExoGP models with exposure time, the joint model obtains radii estimates that are marginally closer to the reference radii than the model without rotation. The Gelman-Rubin statistic is under 1.1 for all models and the number of effective samples is above 100 for all models, indicating strong confidence in the predictions.
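The smearing effect and its correction can be illustrated by averaging a transit model over the duration of each exposure. The sketch below uses a box-shaped transit and a simple midpoint rule; the function names, the oversampling factor of 15, and the toy transit parameters are assumptions, and the transit model used in this work is more detailed.

```python
import numpy as np

def box_transit(t, t0=0.0, duration=0.1, depth=1.0):
    """Instantaneous box-shaped transit model."""
    return np.where(np.abs(t - t0) < 0.5 * duration, -depth, 0.0)

def integrate_exposure(model, t, texp, oversample=15):
    """Average `model` over `oversample` sub-exposures spanning each cadence."""
    offsets = (np.arange(oversample) + 0.5) / oversample - 0.5   # units of texp
    fine_t = t[:, None] + texp * offsets[None, :]
    return model(fine_t).mean(axis=1)

texp = 30.0 / 1440.0                       # 30 minute exposure in days
t = np.arange(-0.2, 0.2, texp)
instant = box_transit(t)                           # model evaluated at mid-exposure
smeared = integrate_exposure(box_transit, t, texp) # model as seen with long cadence
```

The integrated model has partially filled cadences at ingress and egress, so the transit appears broader than the instantaneous model while the total missing flux stays the same. Fitting the integrated model to long-cadence data is what avoids the biased parameters.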

7.3.2 TESS Binning

Figure 7.10: Binning TESS data from 2 minutes to 30 minutes decreases the accuracy of the radius, as demonstrated by the lighter colors in the figure.

Though binning saves computation time and can reduce noise, we observe that it leads to significantly worse estimates of the planet radii in TESS lightcurves. As visualized in the plots, the radii obtained from data binned to a cadence of 30 minutes are less accurate than those obtained using the regular two minute cadence. In particular, when binning is used, the results for the stars HATS-47, HATS-18, WASP-140, and HIP 65 A are either far from the ground truth diagonal or have a very large uncertainty. This echoes the finding from the Kepler example that a longer cadence can smear the transit shape and lead to the recovery of inaccurate transit parameters. When the lightcurves are not binned and all ∼20,000 points are used, the ExoGP model and the ExoRotGP model obtain better estimates of the radii that are more consistent with literature and converge better. For some stars, the binned lightcurve leads to the same results. Thus, the trade-off between speed and accuracy might be worthwhile for some targets.

Figure 7.11: TESS Planets. Panels: (a) Smallest planets, (b) Largest planet.

| planet | corr | sector | ExoGP (Rjup) | ExoRotGP (Rjup) | Reference 1 (Rjup) | Classification |
|---|---|---|---|---|---|---|
| CoRoT-18 b | reg | [6] | 1.2215(−0.1314,+0.1434) | 1.2254(−0.1211,+0.1433) | 1.31±0.18 | Jupiter |
| HAT-P-11 b | reg | [14, 15] | 0.3672(−0.0087,+0.0095) | 0.3665(−0.0088,+0.0102) | 0.389±0.005 | Neptune |
| HATS-16 b | reg | [2] | 1.1742(−0.1047,+0.1079) | 1.177(−0.1066,+0.1014) | 1.3±0.15 | Jupiter |
| HATS-18 b | reg | [10] | 1.3442(−0.0676,+0.0732) | 1.3428(−0.0706,+0.0725) | 1.337(−0.049,+0.102) | Jupiter |
| HATS-47 b | reg | [13] | 0.769(−0.0818,+0.1062) | 0.8624(−0.0992,+0.1082) | 1.117±0.014 | Jupiter |
| HIP 65 A b | pdcsap | [1] | 1.5499(−0.355,+0.8413) | 1.5539(−0.378,+0.8258) | 2.03(−0.49,+0.61) | Jupiter |
| Qatar-1 b | reg | [24, 25] | 1.0906(−0.0558,+0.0623) | 1.0917(−0.0585,+0.0601) | 1.18±0.09 | Jupiter |
| WASP-140 b | pdcsap | [4, 5] | 1.2688(−0.0642,+0.0711) | 1.2641(−0.0717,+0.07) | 1.44(−0.18,+0.42) | Jupiter |
| WASP-166 b | reg | [8] | 0.6254(−0.0318,+0.0342) | | 0.63±0.03 | Neptune |
| WASP-167 b | reg | [10] | 1.6236(−0.0518,+0.0493) | 1.6264(−0.0515,+0.0492) | 1.58±0.05 | Jupiter |
| WASP-173 A b | pdcsap | [2] | 1.205(−0.0526,+0.0556) | 1.2087(−0.0517,+0.0545) | 1.2±0.06 | Jupiter |
| WASP-19 b | reg | [9] | 1.2872(−0.0605,+0.0559) | 1.2893(−0.0586,+0.0567) | 1.31±0.06 | Jupiter |
| WASP-8 b | pdcsap | [2] | 1.2034(−0.0521,+0.0484) | 1.2038(−0.0512,+0.043) | 1.13±0.05 | Jupiter |
| WASP-93 b | pdcsap | [17] | 1.4914(−0.1471,+0.0886) | | 1.597±0.077 | Jupiter |

The ExoGP and ExoRotGP models estimate radii that are within 1 sigma of the reference radii for all planets except HAT-P-11 b and HATS-47 b. There is no discernible difference in performance between the ExoGP and ExoRotGP models. The Gelman-Rubin statistic is under 1.1 for all models.

7.4 Computation Time

(c) TESS (d) Kepler

Markov Chain Monte Carlo sampling with the joint ExoRotGP model is significantly slower because there are more kernel parameters to marginalize over when approximating the posterior distributions. For TESS, the fastest model is the binned ExoGP, followed by a tie between RotGP and ExoGP, and the slowest is the joint ExoRotGP model. For Kepler, the fastest model is RotGP, followed by ExoGP, and the slowest is the joint ExoRotGP model. For some targets (e.g. WASP-8, Kepler-17), we do not see such a dramatic increase in the sampling times of the joint model. This suggests that joint modelling can be useful for specific targets where the rotation signal is very consistent.

7.5 Discussion

RQ1: How accurately can we model the stellar activity with Gaussian Processes within the Kepler and TESS lightcurves?

In this work, we were able to estimate rotation periods within 1 day of the reference values for all Kepler lightcurves and for all TESS lightcurves with unambiguous rotation, ranging from 1.2 days to 28.5 days. For four TESS lightcurves which were deemed “dubious” or “ambiguous” by Martins et al., we find rotation periods, one of which is a new rotation period (WASP-62) [7]. For some lightcurves (e.g. WASP-140, Kepler-96), the joint ExoRotGP model performs better than the RotGP model alone, which we attribute to the ExoRotGP model having a noise kernel that absorbs the instrumental systematic noise. For one lightcurve (WASP-8), RotGP performs better than the joint model, and by studying the lightcurve, we attribute this result to a weak transit model.

Throughout the experiments, we observe several limitations regarding the nature of the rotation kernel. The first limitation is that the rotation kernel picks up double the rotation period when the rotation signal closely resembles a sinusoid. A spotted star rarely produces purely sinusoidal variation in the lightcurve unless the inclination of the rotation axis is very high; its lightcurve is usually better described by a quasi-periodic signal. Therefore, the rotation kernel was designed based on quasi-periodic signals [12]. As mentioned in the method section, the rotation kernel looks for two waves, one with period p and one with period 1/2 p. The idea behind this approach is that the power spectrum of a quasi-periodic signal will have a strong peak at the period, but also at the harmonic of the period (1/2 of the period). However, a pure sinusoid will not produce these two peaks. We observed this phenomenon for the star CoRoT-18, where the method found 10.8 days instead of 5.4 days for the Regression Corrected lightcurve. When we analyzed the lightcurve, we observed that it was very sinusoidal, and when we analyzed the LS periodogram, we observed that there was no peak at the harmonic (1/2) of the period of 5 days, explaining why the GP model returned double the period. However, when we applied our model to the Pipeline Corrected version of the lightcurve, which was more quasi-periodic, the model was able to find the correct rotation period of 5 days.

Another limitation we observe is that the method is not able to obtain precise estimates for rotation periods that are longer than 1/2 of the observation window. This is consistent with the common rule of thumb, often motivated by analogy with the Nyquist-Shannon sampling theorem, that a period can only be measured reliably if the observation window is at least twice as long as the period. For example, when analyzing the stars Kepler-96 and Kepler-75, which have rotation periods of 15 and 19 days, our model estimates the correct values of 15 and 19 days with error bars under 1 day for long 90 day segments, while it derives a false mean and a large error, 18 days (-5.7, +13.4) and 17 days (-2.8, +1.2), for short 35 day segments. Luckily, the majority of Kepler quarters are 90 days long and all stars have 17 quarters of data available. For TESS stars, however, the 1/2 observation window rule introduces a bias because many of the stars have only one sector (27 days) of data available, which limits them to the study of rotation periods under 15 days.

Extending on the topic of the observation window, we also observed with the special case of WASP-62 that analyzing more sectors (stitched together) led to better convergence of the rotation period, except when a noisy sector was included or the activity cycle was changing. We conclude that Gaussian Processes can be used to find new periods for TESS stars in this way. When looking at just one sector, the applicability of the rotation kernel is limited to relatively fast rotators (<15 days). However, for the TESS stars where multiple sectors of data are available, a new rotation period can be found with ExoRotGP by analyzing different stitched segments of the lightcurve and assessing the consistency of the results. With this approach, we interpret that WASP-62, which has an unconfirmed rotation period, likely has a rotation period of around 6 days.

All in all, the rotation kernel struggled in cases where there were leftover instrumental jumps, where the transit model was weak, where the rotation signal was sinusoidal, or where the observation window was too short. From these experiments, we learn that the best RotGP model would consist of a rotation kernel and a noise kernel, and that it should be applied to lightcurves where the signals are quasi-periodic.
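The two-wave structure of the rotation kernel discussed above can be sketched as the sum of two damped oscillator covariances, one at period p and one at 1/2 p. This is a simplified illustration and not the exact parameterization of the rotation kernel in celerite [2]: the function names, the use of the undamped period, the shared quality factor `Q`, and the `mix` amplitude ratio are assumptions.

```python
import numpy as np

def sho_kernel(tau, amp, period, Q):
    """Underdamped (Q > 0.5) damped-oscillator covariance with k(0) = amp."""
    w0 = 2 * np.pi / period
    eta = np.sqrt(1.0 - 1.0 / (4.0 * Q ** 2))
    tau = np.abs(tau)
    return amp * np.exp(-w0 * tau / (2 * Q)) * (
        np.cos(eta * w0 * tau) + np.sin(eta * w0 * tau) / (2 * eta * Q))

def rotation_kernel(tau, amp, period, Q, mix):
    """Sum of a mode at `period` and a weaker mode at half the period."""
    return (sho_kernel(tau, amp, period, Q)
            + sho_kernel(tau, mix * amp, 0.5 * period, Q))

tau = np.linspace(0.0, 30.0, 301)
k = rotation_kernel(tau, amp=1.0, period=10.0, Q=5.0, mix=0.5)

k_zero = rotation_kernel(0.0, 1.0, 10.0, 5.0, 0.5)    # total variance
k_half = rotation_kernel(5.0, 1.0, 10.0, 5.0, 0.5)    # lag of half a period
k_full = rotation_kernel(10.0, 1.0, 10.0, 5.0, 0.5)   # lag of one full period
```

In this sketch the covariance is positive at a lag of one full period and negative at half a period. If the data contain a single pure sinusoid with no harmonic, a fit may assign it to the half-period mode, which would explain why the reported period is doubled.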

RQ2: How accurately can we model the exoplanet transits with Gaussian Processes within Kepler and TESS lightcurves?

In this work, our method was able to recover planet radii ranging from small super Earths (0.1 R_jup) to large Jupiters (1.5 R_jup). We focused on the radii of planets, but in order to calculate accurate radii, a complex transit model had to be defined, consisting of many other parameters describing the transit shape: the limb darkening coefficients, the eccentricity of the orbit, the impact parameter, the period, and the reference time. For some of these parameters, we defined special prior distributions based on scientific papers. With our experiments, we demonstrate that the ExoGP model was able to estimate radii that were within one sigma of the reference radii for all Kepler planets, even small super Earths which block only 10^-3 of the star’s light. For TESS planets, ExoGP estimates radii within one sigma of the reference radii for all planets except two. The models slightly underestimated the radii for these two TESS planets, HATS-47 and HAT-P-11, because the lightcurve correction method used preserves the stellar activity at the expense of decreasing the transit signal. We did not observe a strong difference in performance between the joint ExoRotGP model and the ExoGP model. Additionally, we have highlighted the significance of temporal binning effects with our experiments. This refers to the effect of analyzing the data at a longer cadence, for example 30 minutes instead of 2 minutes. When a longer cadence is used, the “long cadence (LC) data smears out the transit light curve into a broader shape which will lead to an erroneous retrieval” [53]. The Kepler data we use is long cadence (30 minutes) and the TESS data we use is short cadence (2 minutes). With our experiments on long-cadence Kepler lightcurves, we show that accounting for the “temporal binning” effect with exposure time integration leads to the retrieval of much more accurate transit parameters. Furthermore, with our experiments on TESS lightcurves, we demonstrate that binning the data to a 30-minute cadence leads to the retrieval of worse parameters. In particular, binning strongly affects the radii of the TESS planets HATS-47, WASP-140, and WASP-93, which are grazing planets (where the planet's disk only partially overlaps the stellar disk during transit) and therefore have a "V"-shaped transit, which leads to a higher uncertainty in the determination of the transit depth and affects the retrieval of the impact parameter. Thus, our results support the findings presented in the paper “Binning is sinning” [53]. From this we learn that in order to achieve accurate transit characterization, exposure time correction must be included for Kepler 30-minute data and binning should be avoided for 2-minute TESS data.
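
The smearing effect of a long exposure can be illustrated with a toy calculation. The sketch below is not the transit model used in our pipeline (which is limb-darkened); it assumes an idealized box-shaped transit with an arbitrary depth and duration, and the function names are hypothetical. It averages the instantaneous model over each exposure, which is the basic idea behind exposure time integration.

```python
def box_transit(t_hours, depth=0.01, duration_hours=2.0, t0_hours=0.0):
    """Instantaneous relative flux of an idealized box-shaped transit."""
    in_transit = abs(t_hours - t0_hours) < duration_hours / 2.0
    return 1.0 - depth if in_transit else 1.0


def exposure_integrated(model, t_hours, exposure_hours, oversample=15):
    """Average `model` over one exposure centred on t_hours (midpoint rule)."""
    step = exposure_hours / oversample
    start = t_hours - exposure_hours / 2.0
    values = [model(start + (k + 0.5) * step) for k in range(oversample)]
    return sum(values) / oversample


def samples_in_transit(flux, tolerance=1e-9):
    """Count samples whose flux is measurably below the out-of-transit level."""
    return sum(1 for f in flux if f < 1.0 - tolerance)


# Evaluate on a 1-minute grid from -3 h to +3 h around mid-transit.
times = [minute / 60.0 for minute in range(-180, 181)]
instantaneous = [box_transit(t) for t in times]
short_cadence = [exposure_integrated(box_transit, t, 2.0 / 60.0) for t in times]
long_cadence = [exposure_integrated(box_transit, t, 30.0 / 60.0) for t in times]

print("samples in transit (instantaneous):", samples_in_transit(instantaneous))
print("samples in transit (2 min exposure):", samples_in_transit(short_cadence))
print("samples in transit (30 min exposure):", samples_in_transit(long_cadence))
```

In this toy case the 30-minute exposure widens the apparent transit noticeably while the 2-minute exposure barely changes it. A model that ignores the exposure time would have to distort its parameters to match the broadened shape, which is why the integration step matters for long-cadence data.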

RQ3: Does joint modelling improve the characterization of transit and rotation parameters in Kepler and TESS lightcurves?

We have investigated the advantages and disadvantages of jointly modelling the exoplanet transits and stellar activity with ExoRotGP, a GP model that consists of a transit mean model and rotation and noise kernels. For the recovery of rotation periods, we demonstrate that ExoRotGP improves estimates, thanks to its additional noise kernel, but sometimes worsens rotation period estimates when the transit model is not well defined. For the recovery of planet radii, we demonstrate that ExoRotGP decreases the uncertainty only slightly in comparison to our baseline model ExoGP, which consists of a transit mean function and a noise kernel. Considering the strong and clear rotational modulation signals in our data, this suggests that the noise kernel is capable of modelling the rotational modulation signal and that ExoGP already does a good job of separating the stellar variability from the exoplanet transits. This highlights the versatility of the noise kernel and demonstrates that a stochastically-driven damped harmonic oscillator is capable of modelling many signals, from instrumental noise to stellar granulation, all the way to large rotational modulation signals, making it a robust kernel for various kinds of data. Furthermore, we show the dramatic increase in MCMC computation time as a result of the additional parameters in the joint model. Thus, we conclude that the joint model should only be used when there is a clear rotation signal present and it strongly interferes with the transit signal, which was not the case for many of the TESS stars we examined. On the other hand, we propose that the joint model is useful for stellar systems where the rotation period of the star and the orbital period of the planet are very similar.
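
The versatility of the noise kernel can be seen from its power spectrum. The sketch below evaluates the power spectral density of a stochastically-driven damped harmonic oscillator in the form given in the celerite paper [2]; the function name and the parameter values are arbitrary choices for illustration and are not taken from our experiments.

```python
import math


def sho_psd(omega, s0, omega0, quality):
    """Power spectral density of a stochastically-driven damped harmonic
    oscillator: sqrt(2/pi) * S0 * w0^4 / ((w^2 - w0^2)^2 + w0^2 w^2 / Q^2)."""
    numerator = s0 * omega0 ** 4
    denominator = (omega ** 2 - omega0 ** 2) ** 2 + (omega0 * omega / quality) ** 2
    return math.sqrt(2.0 / math.pi) * numerator / denominator


omega0 = 2.0 * math.pi / 5.0  # angular frequency of a 5-day oscillation

# Low quality factor: broad, aperiodic power (noise-like behaviour).
broad = [sho_psd(w, 1.0, omega0, 1.0 / math.sqrt(2.0)) for w in (0.0, omega0)]

# High quality factor: power concentrated near omega0 (quasi-periodic signal).
peaked = [sho_psd(w, 1.0, omega0, 10.0) for w in (0.0, omega0)]

print("Q = 1/sqrt(2): PSD at 0 and at omega0:", broad)
print("Q = 10:        PSD at 0 and at omega0:", peaked)
```

With a low quality factor the spectrum is flat at low frequencies and falls off beyond omega0, whereas a high quality factor produces a sharp peak at omega0. A single functional form therefore covers both noise-like and nearly periodic variability, depending on the fitted parameters.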

7.5.1 Kepler vs. TESS

We demonstrated that our method is robust enough to work on two datasets with very different systematics: Kepler and TESS. As mentioned in the Experiments section, Kepler lightcurves have 17 quarters of data available, each 90 days long (four years in total). TESS does not have the same amount of data for each lightcurve; many targets have only one 27-day sector available. The TESS data comes in short cadence (2 minutes), which results in almost 20,000 data points per sector, while the Kepler data comes in long cadence (30 minutes), which results in almost 4,500 data points per quarter. Our hypothesis was that the models would perform better on Kepler lightcurves because they have a better signal-to-noise ratio. What we observed was that, for both Kepler lightcurves and TESS lightcurves with unambiguous rotation, our model estimated rotation periods that were within one day of the reference rotation periods. The model was not able to recover some of the reference rotation periods for the TESS data, but this was because those reference rotation periods were based on spectroscopic data from ground-based observatories and the rotation signals were not present in the photometry provided by TESS. For recovering radii, we observed that the model performed well on Kepler but also performed well on TESS, despite noisier data, because TESS’s high-cadence data provides a lot of information about the transit curve.
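
The data volumes quoted above follow directly from the segment length and the cadence. The short calculation below is a sketch that ignores data gaps, so it gives an upper bound on the number of data points per segment; the function name is hypothetical.

```python
MINUTES_PER_DAY = 24 * 60


def n_cadences(span_days, cadence_minutes):
    """Upper bound on the number of samples in a gap-free segment."""
    return (span_days * MINUTES_PER_DAY) // cadence_minutes


tess_sector = n_cadences(27, 2)       # 27-day sector at 2-minute cadence
kepler_quarter = n_cadences(90, 30)   # 90-day quarter at 30-minute cadence

print("TESS points per sector:", tess_sector)
print("Kepler points per quarter:", kepler_quarter)
print("ratio:", tess_sector / kepler_quarter)
```

A single TESS sector therefore holds several times more data points than a Kepler quarter, which is where the linear scaling of celerite becomes important for sampling times.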

7.5.2 How could our method be improved?

As described in the algorithm section, we have developed a pipeline which runs several stages of data analysis: data correction, transit detection using Transit Least Squares, making first estimates of transit parameters and rotation parameters using Box Least Squares and the LS periodogram, building a Gaussian Process model, and sampling from the model with Markov Chain Monte Carlo. Each stage of the pipeline is important for the results and could be its own research topic. Here, we address some of the limitations of each stage of the pipeline and how they could be improved. To begin with, we observed that preprocessing had a significant influence on the results. Our method could be improved by using a correction method which preserves both the rotation signal and the transit signal. In our case, we observed that the pipeline-corrected lightcurve optimized the transit signal at the expense of the stellar activity, while the regression correction method preserved the stellar activity at the expense of the transit signal. This had a big influence on the recovery of the rotation periods and the planet radii. For example, using the regression correction we were able to recover rotation periods that Martins et al. were not able to recover using the pipeline-corrected lightcurves (e.g. HATS-47, HAT-P-11, HATS-16, and WASP-62). For the recovery of planet radii, using the regression-corrected lightcurve led to the underestimation of the radii of two planets, HATS-47 b and HAT-P-11 b. Second, our method could be improved by obtaining better initial estimates of the parameters or defining more informative priors. We observed that the priors of the transit parameters could drastically influence the model. For example, if the incorrect orbital period was found with Transit Least Squares, then the GP model would not converge. As we looked at many stellar systems, we were limited to using more general priors for each stellar system, although the means of the priors were initialized based on the data. Third, the architecture of the GP model could be extended to incorporate more kernels, allowing it to model more signals. Currently, we use a combination of a noise kernel and a rotation kernel; however, one could also consider an oscillation kernel or the use of multiple noise kernels, even though this would increase the MCMC sampling time. Finally, the settings for the MCMC sampler could be improved so that more of the parameter space is explored.
For some stars, we observed that the number of effective samples was low, indicating that the parameter space was not explored well. As we were running many targets and using many kernel parameters, we did not use a high number of tuning steps in order to keep the experiments feasible, and this meant that the MCMC did not discard many samples. Increasing the number of tuning steps could lead to more effective samples and more precise estimates. In the conclusion and future work section, we elaborate on specific solutions regarding how these stages of the pipeline could be extended.
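
To illustrate what a low number of effective samples means, the sketch below estimates the effective sample size of a single chain from its autocorrelation. This is a deliberately simplified single-chain estimator written for illustration, not the diagnostic reported by the sampler in our pipeline, and the two synthetic chains are assumptions chosen to contrast a well-mixed chain with a strongly correlated one.

```python
import random


def effective_sample_size(chain):
    """Simplified estimate: N / (1 + 2 * sum of positive autocorrelations),
    truncating the sum at the first non-positive autocorrelation."""
    n = len(chain)
    mean = sum(chain) / n
    centred = [x - mean for x in chain]
    variance = sum(c * c for c in centred) / n
    tau = 1.0
    for lag in range(1, n // 2):
        cov = sum(centred[i] * centred[i + lag] for i in range(n - lag)) / n
        rho = cov / variance
        if rho <= 0.0:
            break
        tau += 2.0 * rho
    return n / tau


rng = random.Random(0)
n_samples = 4000

# Well-mixed chain: independent draws.
independent = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

# Poorly mixed chain: AR(1) process with strong correlation between steps.
correlated = []
state = 0.0
for _ in range(n_samples):
    state = 0.9 * state + rng.gauss(0.0, 1.0)
    correlated.append(state)

print("ESS, independent chain:", round(effective_sample_size(independent)))
print("ESS, correlated chain:", round(effective_sample_size(correlated)))
```

Both chains contain the same number of draws, but the correlated chain carries far fewer independent pieces of information, so parameter estimates derived from it are less precise.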

7.5.3 Challenges

A thesis is not without its challenges, and we encountered several. One challenge was the lack of objective truth values for the rotation periods and planet radii. In astronomy, we cannot get close to the objects we are studying, and the convention is to assess the consistency between the values obtained by different astronomers. Thus, to validate our model, we first used a simulated experiment, and then, for the actual stellar observations, we compared our results with those in refereed publications. This challenge aside, our work was able to support the accuracy of the estimates in the refereed publications by demonstrating that we were able to reproduce them using different methods and different data. Another challenge was the computation time of the Markov Chain Monte Carlo. There currently exists no good GPU support for the PyMC3 sampler used in this project, and this led to slow computation times of up to three hours for some stars. Moreover, a recurring theme throughout the thesis was the trade-off between developing a general method and obtaining strong results for specific stars and planets. In this work, we sought to develop a general pipeline and a robust method for characterization that could be applied to many stellar systems. This was difficult because there is a large diversity in cases (which can already be inferred when looking at the target lightcurves in our Experiments section). Each stellar system has its own parameters and different characteristics in terms of variability (or character). Exoplanets are very diverse: they range from small super Earths to large Jupiters, and their orbital periods can be as short as 0.35 days or as long as 1000 days. As a result, the success of each step of the pipeline is different for each star. Because we built a general pipeline, we had to sacrifice the success of certain stars by using methods that worked for all the stars.
However, we developed the code so that the user could run it stage by stage and tune the settings to a specific star if needed. The final challenge was the merging of two disciplines: Astrophysics and AI. The former is more centered around physically motivated models and understanding the workings of the model, while the latter is typically more black-box and data-driven. Fortunately, Gaussian Processes provided a happy medium between Astrophysics and AI, offering a machine learning method that is interpretable with physically motivated kernels, and became an example of synergy between the two fields.

CHAPTER 8 Conclusion and Future Work

8.1 Contributions

In this work, we have applied machine learning to transit lightcurves from the space missions Kepler and TESS. We investigated three questions:

RQ1: How accurately can we model stellar activity with Gaussian Processes in Kepler and TESS?

To answer this question, we investigated the performance of RotGP, a GP model that consists of a zero mean function and a rotation kernel, and the performance of ExoRotGP, a GP model that consists of a transit mean function and a rotation and noise kernel. With these models, we were able to estimate rotation periods within 1 day of the reference rotation periods for all Kepler lightcurves and all TESS lightcurves with unambiguous rotation, ranging from 1.2 days to 28.5 days. For four TESS lightcurves which were deemed to have “dubious rotation” or “ambiguous variability” by Martins et al. [7], we find rotation periods, one of which is a new rotation period (WASP-62). Throughout the experiments, we observe that the modelling of stellar activity with our GP models is limited when there is high instrumental noise, when the rotation signal is more sinusoidal than quasiperiodic, and when the observation window is too short. From these experiments, we learn that the rotation kernel should be supplemented with a noise kernel and that it should be applied to lightcurves where the signals are quasi-periodic.

RQ2: How accurately can we model exoplanet transits with Gaussian Processes in Kepler and TESS lightcurves?

To answer this question, we investigated the performance of ExoGP, a GP model that consists of a transit mean function and a noise kernel, and the performance of ExoRotGP, a GP model that consists of a transit mean function and a rotation and noise kernel. In this work, our method was able to recover planet radii ranging from small super Earths (0.1 R_jup) to large Jupiters (1.5 R_jup). We focused on the radii of planets, but in order to calculate accurate radii, a complex transit model had to be defined, consisting of many other parameters describing the transit shape: the limb darkening coefficients, the eccentricity of the orbit, the impact parameter, the period, and the reference time. With our experiments, we demonstrate that the ExoGP model was able to estimate radii that were within one sigma of the reference radii for all Kepler planets, even small super Earths which block only 10^-3 of the star’s light. For TESS planets, ExoGP estimates radii within one sigma of the reference radii for all planets except two. The models slightly underestimated the radii for the two TESS planets, HATS-47 and HAT-P-11, because the correction method used preserves the stellar activity at the expense of decreasing the transit signal. Throughout the experiments, we observe that the modelling of exoplanet transits is limited by temporal binning, that is, the effect of analyzing the data at a longer cadence, for example 30 minutes instead of 2 minutes. When a longer cadence is used, the “long cadence (LC) data smears out the transit light curve into a broader shape which will lead to an erroneous retrieval” of planet parameters [53]. With our method, we observed that this is especially relevant for grazing planets that cross the edges of the star.
From these experiments, we learn that in order to achieve accurate transit characterization, exposure time correction must be included for Kepler 30 minute data and binning should be avoided for two minute TESS data.

RQ3: Does joint modelling improve the characterization of transit and rotation parameters in Kepler and TESS lightcurves?

To answer this question, we investigated the advantages and disadvantages of jointly modelling the exoplanet transits and stellar activity by assessing the relative performances of ExoRotGP, RotGP, and ExoGP. For the recovery of rotation periods, we demonstrate that ExoRotGP improves estimates, thanks to its additional noise kernel, but sometimes worsens rotation period estimates when the transit model is not well defined. For the recovery of planet radii, we demonstrate that ExoRotGP decreases the uncertainty only slightly in comparison to our baseline model ExoGP. Throughout these experiments, we highlight the versatility of the noise kernel, which uses a stochastically-driven damped harmonic oscillator to model variability, and demonstrate that it is capable of modelling many signals, from instrumental noise to large rotational modulation signals, making it a robust kernel for various kinds of data. Furthermore, we show the dramatic increase in MCMC sampling times as a result of the additional parameters in the joint model. Thus, we conclude that the joint model should only be used when there is a clear rotation signal present and it strongly interferes with the transit signal (i.e. when the stellar rotation period and the planet orbital period overlap).

In the quest to answer these questions, we have made several contributions. To begin with, we have developed a tool using the software exoplanet, pymc3, celerite, and starry, which builds upon code by the developers of exoplanet for fitting TESS data and modelling stellar variability in lightcurves. Our tool automates multiple stages of data analysis, from preprocessing to parameter estimation. The first stage is preprocessing, which corrects the instrumental effects in the lightcurves.
The second stage is getting initial estimates of the planet parameters and the rotation period to define the prior distributions. In this stage, we use a combination of the Transit Least Squares algorithm and the Box Least Squares algorithm to detect and measure the transit, and the Lomb-Scargle periodogram to make an initial estimate of the rotation period. The third stage is the modelling of the stellar activity and exoplanet transits with a Gaussian Process. At this stage, the user has the option to build and optimize three types of GP models using simple arguments: RotGP (rotation), ExoGP (transit), and ExoRotGP (rotation and transit). The fourth stage is the sampling of the posterior distributions of the Gaussian Process with Markov Chain Monte Carlo to get the parameter estimates. Once the samples are obtained, the code calculates the means and quantiles of the samples and visualizes the parameter results. Special features of our code are: the option to automate the whole process, the use of the Regression correction method (which tends to preserve more stellar activity than the pipeline correction), the use of the Transit Least Squares method for detection (which is better at detecting smaller planets than the Box Least Squares method), the choice of building three different GP models, and the automatic generation of a document with information about each star. For each target star, the code outputs a document with statistics about the star (i.e. stellar mass, stellar radius, effective temperature) and figures and tables derived from different stages of the pipeline. This includes plots of the corrected lightcurve, the Box Least Squares periodogram, the Lomb-Scargle periodogram, the predictions of the GP model, the residuals of the model, and the covariances of the parameters.
The document also includes tables summarizing the statistical convergence of the MCMC and results tables summarizing the mean and uncertainty estimates for each parameter in comparison with the literature value. These documents are useful for astronomers because they can be used for data analysis and can provide a good sanity check for the pipeline (see Appendix). The pipeline can be run automatically by setting experimental conditions at the beginning. The user also has the option to run the code in stages, in case they want to adjust certain settings for a star. Furthermore, the tool is carefully documented and user-friendly, and it is planned to be used by astronomy students for the analysis of more stars and planets. We have also presented results at two conferences: first as a virtual poster at the European Astronomical Society (EAS) meeting and second as a recorded talk at the Europlanet Science Congress (EPSC) (see Appendix).
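
The final summary step of the pipeline, turning posterior samples into a mean with asymmetric uncertainties, can be sketched as follows. This is a minimal illustration rather than the code of our tool: the function names are hypothetical, and the choice of the 16th and 84th percentiles as the lower and upper quantiles is an assumption made here for the example.

```python
def percentile(sorted_values, q):
    """Percentile q (0 to 100) by linear interpolation between sorted samples."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def summarize(samples, lower_q=16.0, upper_q=84.0):
    """Return (mean, minus, plus): the mean and its distances to two quantiles."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    mean = sum(ordered) / len(ordered)
    minus = mean - percentile(ordered, lower_q)
    plus = percentile(ordered, upper_q) - mean
    return mean, minus, plus


def format_estimate(samples, unit="days"):
    """Format an estimate in the style 'mean unit (-minus, +plus)'."""
    mean, minus, plus = summarize(samples)
    return f"{mean:.1f} {unit} (-{minus:.1f}, +{plus:.1f})"


# Synthetic, evenly spaced samples standing in for an MCMC trace.
trace = [float(v) for v in range(0, 101)]
print(format_estimate(trace))
```

Reporting the lower and upper distances separately preserves the asymmetry of skewed posteriors, such as those obtained for rotation periods on short segments.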

8.2 Future work

Every step of the pipeline is an interesting and complex problem on its own, and could lead to its own thesis project. To begin with, the instrumental correction applied to the lightcurves can be investigated further. In our work, we looked at the pipeline-corrected lightcurve (PDCSAP) and the Regression-corrected lightcurve. Currently, the pipeline-corrected lightcurves are optimized for transit detection and remove long-term astrophysical signals, and so the modelling of long-term stellar activity based on them is limited. The Regression correction method we used was able to preserve more of the astrophysical signal, but at the expense of shallower transit depths in some lightcurves, and it was not consistently good for all lightcurves. Since the TESS data is quite new, these instrumental correction methods are still being developed. TASOC (TESS Asteroseismic Science Operations Center) is working on obtaining lightcurves which keep the stellar activity signatures while minimising the instrumental noise, so that the stellar activity can be studied. Combining their correction method with our pipeline could improve the characterization results. Additionally, novel methods such as Recurrent Neural Networks (RNNs) or aggregated periodograms could be applied to many lightcurves of different stars observed in the same sector to learn the patterns of instrumental drifts, and these patterns could then be subtracted from the data. Second, the detection of transit and rotation signals can be enhanced before running the GP model. The initial estimates of the transit model have to be quite precise for the GP model to perform well, especially the orbital period. Currently, we use the Transit Least Squares method to find the orbital period, which works very well. However, we use Box Least Squares to find the depth of the transit, which uses a box shape instead of a transit shape. A better estimate of the transit depth might be obtained if Transit Least Squares is adapted so that it can find the transit depth in a normally modulating lightcurve. Currently, novel methods such as Random Forests and Convolutional Neural Networks (e.g. AstroNet) are being used for detecting exoplanets. Once these methods are adapted to also measure the depth and reference time of the transit, they could be combined with the pipeline, and they might be better at spotting smaller planets. Another idea would be to use random forests to detect whether a rotation period is measurable, as described in the literature review. This could be incorporated as a check before running the Gaussian Process. Third, further experiments can be performed with the architecture of the Gaussian Process model. A kernel function for a Gaussian Process can be a combination of many kernels. In this work, we experimented with a background noise (stochastically-driven damped harmonic oscillator) kernel and a rotation kernel from celerite, which led to good results. If astronomers want to focus on specific stars, without caring about longer computation times, they can introduce more kernels to increase the complexity of the model. For example, Barros et al.
define Gaussian Process models with a rotation kernel, an oscillation kernel, and up to five granulation kernels. Adding more granulation kernels, for example, can increase the capability of modelling background noise and enhance the transit signal and rotation signal. In addition to experimenting with more kernels, one can experiment with different prior distributions for the kernel function and mean function parameters and use more constrained priors. For example, one can incorporate additional information about the star and planet that might have already been obtained from other methods (e.g. Radial Velocity measurements), such as the limb darkening parameters and the eccentricity of the orbit, and fix these in the model rather than treating them as free parameters. Or one can incorporate certain physical constraints based on the star. For example, for some stars the rotational velocity, which is obtained from spectroscopic analysis, can define a supposed upper limit of the rotation period, and this can be used to constrain the upper bound of the rotation period prior on a star-by-star basis. One can also adapt the transit model to include multiple planets. Fourth, Markov Chain Monte Carlo sampling is slow in the current pymc3 implementation and is currently not compatible with GPUs. Once GPU integration becomes possible, the whole process will be dramatically sped up and can be applied to many more stars. Finally, there are many more interesting targets that can be studied with the pipeline we presented. So far, we have focused on stars with confirmed rotation periods and confirmed planets. The NASA Exoplanet Archive has a list of TESS planet candidates which have not been confirmed, and this list is continuously updated as more stars are observed. We can use this tool to characterize them. Soon, the TESS space telescope will observe the stars that were observed by the Kepler/K2 mission.
Another very interesting analysis could involve applying the tool to K2 and TESS lightcurves of the same target stars and comparing the characterization results. Finally, the flood of astronomy data from transit missions and future observations will not stop. The ESA PLATO mission, planned to be launched in 2026, will provide transit data of thousands of exoplanets, with its main focus being on the detection and accurate characterisation of Earth-size planets orbiting up to the habitable zone of Sun-like stars. The techniques to find and model these small and long-period exoplanet transits must carefully account for the effects of instrument systematics, stellar activity, granulation, and oscillations present in the light curves. Gaussian Processes provide promising tools to achieve this objective.
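
The physical constraint from the rotational velocity mentioned above can be written down explicitly. Since the measured projected velocity v sin i can never exceed the true equatorial velocity, the rotation period satisfies P <= 2 * pi * R / (v sin i). The sketch below evaluates this bound; the function name and the example values are illustrative assumptions, not measurements from our target list.

```python
import math

R_SUN_KM = 695_700.0        # nominal solar radius in km
SECONDS_PER_DAY = 86_400.0


def max_rotation_period_days(radius_rsun, vsini_km_s):
    """Upper limit on the rotation period (days) from the stellar radius
    (solar radii) and the projected rotational velocity v sin i (km/s)."""
    if radius_rsun <= 0 or vsini_km_s <= 0:
        raise ValueError("radius and v sin i must be positive")
    circumference_km = 2.0 * math.pi * radius_rsun * R_SUN_KM
    return circumference_km / vsini_km_s / SECONDS_PER_DAY


# Illustrative values: a Sun-sized star with v sin i = 2 km/s.
upper_bound = max_rotation_period_days(1.0, 2.0)
print(f"rotation period prior upper bound: {upper_bound:.1f} days")
```

Such a bound could be used as the upper limit of the rotation period prior for a given star, with some margin to allow for the uncertainties in the radius and in v sin i.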

8.3 Concluding Remarks

“Three decades ago, astronomers could not say reliably whether there were planets around other stars. . . ”, and now we know that “the universe is home to more planets than stars, with billions of potentially habitable planets just in our own galaxy” [8]. Each stellar system is a challenge; in this work we have characterized 22 stellar systems with scalable Gaussian Processes. We have demonstrated that Gaussian Processes work well for the problem of modelling noise and stellar activity and producing accurate probabilistic estimates of stellar rotation periods and planet radii. Thanks to their versatile and interpretable nature, Gaussian Processes can disentangle many different signals (from the planet, the star, and the instrument) and produce probabilistic estimates of the signal parameters (of the planet radius, stellar rotation, and instrument drift), making them very suitable for the analysis of stellar lightcurves. That being said, Gaussian Processes represent just one of the many relevant applications of machine learning that can contribute to the field of astronomy. With the existing data from past missions like Kepler, which can be resurfaced and mined, and the incoming flood of big data from new missions like TESS, PLATO, and ARIEL, there are many AI opportunities for detection and characterization which will revolutionize our view of the universe. With these tools, we will be able to uncover the rotation periods of many stars, and in turn study the evolutionary stages of many stellar systems and the habitability conditions of their orbiting planets. With these AI tools, we will also be able to produce more accurate estimates of planet properties, which will form the backbone of further studies on the atmosphere, composition, and habitability of the planets, and all this can be done using the “small whisper of an exoplanet’s nature” lightyears away [53].

References

[1] Jack J. Lissauer, Rebekah I. Dawson, and Scott Tremaine. Advances in exoplanet science from kepler. Nature, 513(7518):336–344, Sep 2014. ISSN 1476-4687. doi: 10.1038/nature13781. URL http://dx.doi. org/10.1038/nature13781. [2] Daniel Foreman-Mackey, Eric Agol, Sivaram Ambikasaran, and Ruth Angus. Fast and scalable gaussian process modeling with applications to astronomical time series. The Astronomical Journal, 154(6):220, nov 2017. doi: 10.3847/1538-3881/aa9332. URL https://doi.org/10.3847%2F1538-3881%2Faa9332. [3] H. Parviainen. Bayesian methods for exoplanet science. arXiv: Instrumentation and Methods for Astro- physics, page 149, 2018. [4] Daniel Foreman-Mackey, Rodrigo Luger, Ian Czekala, Eric Agol, Adrian Price-Whelan, Timothy D. Brandt, Tom Barclay, and Luke Bouma. exoplanet-dev/exoplanet v0.4.0, October 2020. URL https://doi.org/ 10.5281/zenodo.1998447. [5] R. Luger, E. Agol, D. Foreman-Mackey, D. P. Fleming, J. Lustig-Yaeger, and R. Deitrick. starry: Analytic Occultation Light Curves. , 157:64, February 2019. doi: 10.3847/1538-3881/aae8e5. [6] John Salvatier, Thomas V Wiecki, and Christopher Fonnesbeck. Probabilistic programming in python using pymc3. PeerJ Computer Science, 2:e55, 2016. [7] Bruno L. Canto Martins, Roseane L. Gomes, Yuri S. Messias, Suzierly R. de Lira, Izan C. Leão, Leonardo A. Almeida, Márcio A. Teixeira, Maria L. das Chagas, Jenny P. Bravo, Asnakew Bewketu Belete, and José R. De Medeiros. A search for rotation periods in 1000 tess objects of interest, 2020. [8] Dennis Overbye. Kepler, the little nasa spacecraft that could, no longer can, Oct 2018. URL https: //www.nytimes.com/2018/10/30/science/nasa-kepler-exoplanet.html. [9] Pat Brennan, 2020. URL https://exoplanets.nasa.gov/resources/280/ light-curve-of-a-planet-transiting-its-star/. [10] ESA. Exoplanet mission timeline, 2020. URL https://sci.esa.int/web/exoplanets/-/ 60649-exoplanet-mission-timeline. [11] Simon Fraser University. Calculating exoplanet properties. 
URL https://www.sfu.ca/colloquium/PDC_ Top/astrobiology/discovering-exoplanets/calculating-exoplanet-properties.html. [12] Ruth Angus, Timothy Morton, Suzanne Aigrain, Daniel Foreman-Mackey, and Vinesh Rajpaul. Inferring probabilistic stellar rotation periods using Gaussian processes. Monthly Notices of the Royal Astronomical Society, 474(2):2094–2108, 09 2017. ISSN 0035-8711. doi: 10.1093/mnras/stx2109. URL https://doi. org/10.1093/mnras/stx2109. [13] S. Aigrain, H. Parviainen, and B. J. S. Pope. K2SC: flexible systematics correction and detrending of K2 light curves using Gaussian process regression. Monthly Notices of the Royal Astronomical Society, 459 (3):2408–2419, 04 2016. ISSN 0035-8711. doi: 10.1093/mnras/stw706. URL https://doi.org/10.1093/ mnras/stw706. [14] . URL http://exoplanetarchive.ipac.caltech.edu/. [15] exotic. URL https://pypi.org/project/exotic/0.17.4/. [16] Andrew Vanderburg. The transit light curve. URL https://www.cfa.harvard.edu/~avanderb/ tutorial/tutorial2.html. [17] Christopher Duffy. Stellar flares and star spots unexpectedly found to be uncorrelated, Apr 2020. URL https://armaghplanet.com/ stellar-flares-and-star-spots-unexpectedly-found-to-be-uncorrelated.html.

54 REFERENCES REFERENCES

[18] Eric B. Ford. Future of high-dimensional data-driven exoplanet science. Journal of Physics: Conference Series, 699:012007, mar 2016. doi: 10.1088/1742-6596/699/1/012007. URL https://doi.org/10.1088%2F1742-6596%2F699%2F1%2F012007.
[19] A. Santerne, R. F. Díaz, J. M. Almenara, A. Lethuillier, M. Deleuil, and C. Moutou. Astrophysical false positives in exoplanet transit surveys: why do we need bright stars?, 2013.
[20] Mahmoudreza Oshagh. Noise sources in photometry and radial velocities. Asteroseismology and Exoplanets: Listening to the Stars and Searching for New Worlds, page 239–249, Jul 2017. ISSN 1570-6605. doi: 10.1007/978-3-319-59315-9_13. URL http://dx.doi.org/10.1007/978-3-319-59315-9_13.
[21] Michael Richmond. Limb darkening. URL http://spiff.rit.edu/classes/phys440/lectures/limb/limb.html.
[22] University of Toronto. Ast326 limb darkening. URL http://www.astro.utoronto.ca/astrolab/files/AST326_LimbDarkening_2017.pdf.
[23] Laura Kreidberg. Tutorial, 2015. URL https://www.cfa.harvard.edu/~lkreidberg/batman/tutorial.html.
[24] G. Kovács, S. Zucker, and T. Mazeh. A box-fitting algorithm in the search for periodic transits. Astronomy & Astrophysics, 391(1):369–377, Jul 2002. ISSN 1432-0746. doi: 10.1051/0004-6361:20020802. URL http://dx.doi.org/10.1051/0004-6361:20020802.
[25] Astropy Developers. Box least squares (bls) periodogram, 2020. URL https://docs.astropy.org/en/stable/timeseries/bls.html.
[26] Michael Hippke and René Heller. Optimized transit detection algorithm to search for periodic transits of small planets. Astronomy & Astrophysics, 623:A39, Feb 2019. ISSN 1432-0746. doi: 10.1051/0004-6361/201834672. URL http://dx.doi.org/10.1051/0004-6361/201834672.
[27] A. Giménez. Equations for the analysis of the light curves of extra-solar planetary transits. Astronomy & Astrophysics, 450(3):1231–1237, 2006. doi: 10.1051/0004-6361:20054445. URL https://doi.org/10.1051/0004-6361:20054445.
[28] Kaisey Mandel and Eric Agol. Analytic light curves for planetary transit searches. The Astrophysical Journal, 580(2):L171–L175, Dec 2002. ISSN 1538-4357. doi: 10.1086/345520. URL http://dx.doi.org/10.1086/345520.
[29] Laura Kreidberg. batman: Basic transit model calculation in python. Publications of the Astronomical Society of the Pacific, 127(957):1161–1165, Nov 2015. ISSN 1538-3873. doi: 10.1086/683602. URL http://dx.doi.org/10.1086/683602.
[30] Hannu Parviainen. pytransit: fast and easy exoplanet transit modelling in python. Monthly Notices of the Royal Astronomical Society, 450(3):3233–3238, May 2015. ISSN 1365-2966. doi: 10.1093/mnras/stv894. URL http://dx.doi.org/10.1093/mnras/stv894.
[31] Rodrigo Luger, Eric Agol, Daniel Foreman-Mackey, David P. Fleming, Jacob Lustig-Yaeger, and Russell Deitrick. starry: Analytic occultation light curves. The Astronomical Journal, 157(2):64, Jan 2019. ISSN 1538-3881. doi: 10.3847/1538-3881/aae8e5. URL http://dx.doi.org/10.3847/1538-3881/aae8e5.
[32] Jacob T. VanderPlas. Understanding the Lomb–Scargle periodogram. The Astrophysical Journal Supplement Series, 236(1):16, May 2018. ISSN 1538-4365. doi: 10.3847/1538-4365/aab766. URL http://dx.doi.org/10.3847/1538-4365/aab766.
[33] Piet M. T. Broersen. Automatic autocorrelation and spectral analysis. Springer, 2010.
[34] M. B. Nielsen, L. Gizon, H. Schunker, and C. Karoff. Rotation periods of 12000 main-sequence Kepler stars: Dependence on stellar spectral type and comparison with v sin i observations. Astronomy & Astrophysics, 557:L10, Aug 2013. ISSN 1432-0746. doi: 10.1051/0004-6361/201321912. URL http://dx.doi.org/10.1051/0004-6361/201321912.
[35] A. McQuillan, T. Mazeh, and S. Aigrain. ROTATION PERIODS OF 34,030 KEPLER MAIN-SEQUENCE STARS: THE FULL AUTOCORRELATION SAMPLE. The Astrophysical Journal Supplement Series, 211(2):24, mar 2014. doi: 10.1088/0067-0049/211/2/24. URL https://doi.org/10.1088%2F0067-0049%2F211%2F2%2F24.
[36] A. McQuillan, T. Mazeh, and S. Aigrain. STELLAR ROTATION PERIODS OF THE KEPLER OBJECTS OF INTEREST: A DEARTH OF CLOSE-IN PLANETS AROUND FAST ROTATORS. The Astrophysical Journal, 775(1):L11, sep 2013. doi: 10.1088/2041-8205/775/1/l11. URL https://doi.org/10.1088%2F2041-8205%2F775%2F1%2Fl11.
[37] A. McQuillan, S. Aigrain, and T. Mazeh. Measuring the rotation period distribution of field M dwarfs with Kepler. Monthly Notices of the Royal Astronomical Society, 432(2):1203–1216, Apr 2013. ISSN 1365-2966. doi: 10.1093/mnras/stt536. URL http://dx.doi.org/10.1093/mnras/stt536.
[38] Carl Edward Rasmussen and Christopher K. I. Williams. Gaussian processes for machine learning. Adaptive computation and machine learning. MIT Press, 2006. ISBN 026218253X.
[39] N. P. Gibson, S. Aigrain, S. Roberts, T. M. Evans, M. Osborne, and F. Pont. A Gaussian process framework for modelling instrumental systematics: application to transmission spectroscopy. Monthly Notices of the Royal Astronomical Society, 419(3):2683–2694, 01 2012. ISSN 0035-8711. doi: 10.1111/j.1365-2966.2011.19915.x. URL https://doi.org/10.1111/j.1365-2966.2011.19915.x.
[40] Trevor J. David, Erik A. Petigura, Rodrigo Luger, Daniel Foreman-Mackey, John H. Livingston, Eric E. Mamajek, and Lynne A. Hillenbrand. Four newborn planets transiting the young solar analog V1298 Tau. The Astrophysical Journal, 885(1):L12, Oct 2019. ISSN 2041-8213. doi: 10.3847/2041-8213/ab4c99. URL http://dx.doi.org/10.3847/2041-8213/ab4c99.
[41] Filipe Pereira, Tiago L Campante, Margarida S Cunha, João P Faria, Nuno C Santos, Susana C C Barros, Olivier Demangeon, James S Kuszlewicz, and Enrico Corsaro. Gaussian process modelling of granulation and oscillations in red giant stars. Monthly Notices of the Royal Astronomical Society, 489(4):5764–5774, Sep 2019. ISSN 1365-2966. doi: 10.1093/mnras/stz2405. URL http://dx.doi.org/10.1093/mnras/stz2405.
[42] S. C. C. Barros, O. Demangeon, R. F. Díaz, J. Cabrera, N. C. Santos, J. P. Faria, and F. Pereira. Improving transit characterisation with Gaussian process modelling of stellar variability. Astronomy & Astrophysics, 634:A75, Feb 2020. ISSN 1432-0746. doi: 10.1051/0004-6361/201936086. URL http://dx.doi.org/10.1051/0004-6361/201936086.
[43] Trisha A. Hinners, Kevin Tat, and Rachel Thorp. Machine learning techniques for stellar light curve classification. The Astronomical Journal, 156(1):7, Jun 2018. ISSN 1538-3881. doi: 10.3847/1538-3881/aac16d. URL http://dx.doi.org/10.3847/1538-3881/aac16d.
[44] S. N. Breton, L. Bugnet, A. R. G. Santos, A. Le Saux, S. Mathur, P. L. Palle, and R. A. Garcia. Determining surface rotation periods of solar-like stars observed by the Kepler mission using machine learning techniques, 2019.

[45] Yuxi Lu, Ruth Angus, Marcel A. Agüeros, Kirsten Blancato, Melissa Ness, Jason L. Curtis, and Sam Grunblatt. Astraea: Predicting long rotation periods with 27-day light curves, 2020.
[46] Kirsten Blancato, Melissa Ness, Daniel Huber, Yuxi Lu, and Ruth Angus. Data-driven derivation of stellar properties from photometric time series data using convolutional neural networks, 2020.
[47] John Salvatier, Thomas V. Wiecki, Christopher Fonnesbeck, Bastien F, Lamblin P, Pascanu R, Bergstra J, Goodfellow I, Bergeron A, Bouchard N, and et al. Probabilistic programming in python using pymc3, Apr 2016. URL https://peerj.com/articles/cs-55/.
[48] J. Harvey. High-Resolution Helioseismology. In Erica Rolfe and Bruce Battrick, editors, Future Missions in Solar, Heliospheric & Space Plasma Physics, volume 235 of ESA Special Publication, page 199, June 1985.
[49] Dan Foreman-Mackey. Api documentation, 2020. URL https://exoplanet.readthedocs.io/en/stable/user/api/.
[50] Raf Vandebril. Definition and representation of semiseparable matrices, 2004. URL https://people.cs.kuleuven.be/~raf.vandebril/homepage/publications/papers_html/stoss/node2.html.
[51] Zhifeng Sheng, Patrick Dewilde, and Shivkumar Chandrasekaran. Algorithms to solve hierarchically semi-separable systems, Jan 1970. URL https://link.springer.com/chapter/10.1007/978-3-7643-8137-0_5.
[52] Jan 2019. URL https://www.geeksforgeeks.org/cholesky-decomposition-matrix-decomposition/.
[53] David M. Kipping. Binning is sinning: morphological light-curve distortions due to finite integration time. Monthly Notices of the Royal Astronomical Society, 408(3):1758–1769, Aug 2010. ISSN 0035-8711. doi: 10.1111/j.1365-2966.2010.17242.x. URL http://dx.doi.org/10.1111/j.1365-2966.2010.17242.x.
[54] D. M. Kipping. Efficient, uninformative sampling of limb darkening coefficients for two-parameter laws. Monthly Notices of the Royal Astronomical Society, 435:2152–2160, November 2013. doi: 10.1093/mnras/stt1435.
[55] Matthew D. Hoffman and Andrew Gelman. The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo, 2011.
[56] Box least squares (bls) periodogram, 2020. URL https://docs.astropy.org/en/stable/timeseries/bls.html.
[57] Lomb-scargle periodograms. URL https://docs.astropy.org/en/stable/timeseries/lombscargle.html.
[58] Theano Development Team. Theano: A Python framework for fast computation of mathematical expressions. arXiv e-prints, abs/1605.02688, May 2016. URL http://arxiv.org/abs/1605.02688.
[59] Astropy Collaboration, T. P. Robitaille, E. J. Tollerud, P. Greenfield, M. Droettboom, E. Bray, T. Aldcroft, M. Davis, A. Ginsburg, A. M. Price-Whelan, W. E. Kerzendorf, A. Conley, N. Crighton, K. Barbary, D. Muna, H. Ferguson, F. Grollier, M. M. Parikh, P. H. Nair, H. M. Günther, C. Deil, J. Woillez, S. Conseil, R. Kramer, J. E. H. Turner, L. Singer, R. Fox, B. A. Weaver, V. Zabalza, Z. I. Edwards, K. Azalee Bostroem, D. J. Burke, A. R. Casey, S. M. Crawford, N. Dencheva, J. Ely, T. Jenness, K. Labrie, P. L. Lim, F. Pierfederici, A. Pontzen, A. Ptak, B. Refsdal, M. Servillat, and O. Streicher. Astropy: A community Python package for astronomy. Astronomy & Astrophysics, 558:A33, October 2013. doi: 10.1051/0004-6361/201322068.

[60] Astropy Collaboration, A. M. Price-Whelan, B. M. Sipőcz, H. M. Günther, P. L. Lim, S. M. Crawford, S. Conseil, D. L. Shupe, M. W. Craig, N. Dencheva, A. Ginsburg, J. T. VanderPlas, L. D. Bradley, D. Pérez-Suárez, M. de Val-Borro, T. L. Aldcroft, K. L. Cruz, T. P. Robitaille, E. J. Tollerud, C. Ardelean, T. Babej, Y. P. Bach, M. Bachetti, A. V. Bakanov, S. P. Bamford, G. Barentsen, P. Barmby, A. Baumbach, K. L. Berry, F. Biscani, M. Boquien, K. A. Bostroem, L. G. Bouma, G. B. Brammer, E. M. Bray, H. Breytenbach, H. Buddelmeijer, D. J. Burke, G. Calderone, J. L. Cano Rodríguez, M. Cara, J. V. M. Cardoso, S. Cheedella, Y. Copin, L. Corrales, D. Crichton, D. D’Avella, C. Deil, É. Depagne, J. P. Dietrich, A. Donath, M. Droettboom, N. Earl, T. Erben, S. Fabbro, L. A. Ferreira, T. Finethy, R. T. Fox, L. H. Garrison, S. L. J. Gibbons, D. A. Goldstein, R. Gommers, J. P. Greco, P. Greenfield, A. M. Groener, F. Grollier, A. Hagen, P. Hirst, D. Homeier, A. J. Horton, G. Hosseinzadeh, L. Hu, J. S. Hunkeler, Ž. Ivezić, A. Jain, T. Jenness, G. Kanarek, S. Kendrew, N. S. Kern, W. E. Kerzendorf, A. Khvalko, J. King, D. Kirkby, A. M. Kulkarni, A. Kumar, A. Lee, D. Lenz, S. P. Littlefair, Z. Ma, D. M. Macleod, M. Mastropietro, C. McCully, S. Montagnac, B. M. Morris, M. Mueller, S. J. Mumford, D. Muna, N. A. Murphy, S. Nelson, G. H. Nguyen, J. P. Ninan, M. Nöthe, S. Ogaz, S. Oh, J. K. Parejko, N. Parley, S. Pascual, R. Patil, A. A. Patil, A. L. Plunkett, J. X. Prochaska, T. Rastogi, V. Reddy Janga, J. Sabater, P. Sakurikar, M. Seifert, L. E. Sherbert, H. Sherwood-Taylor, A. Y. Shih, J. Sick, M. T. Silbiger, S. Singanamalla, L. P. Singer, P. H. Sladen, K. A. Sooley, S. Sornarajah, O. Streicher, P. Teuben, S. W. Thomas, G. R. Tremblay, J. E. H. Turner, V. Terrón, M. H. van Kerkwijk, A. de la Vega, L. L. Watkins, B. A. Weaver, J. B. Whitmore, J. Woillez, V. Zabalza, and Astropy Contributors. The Astropy Project: Building an Open-science Project and Status of the v2.0 Core Package. The Astronomical Journal, 156:123, September 2018. doi: 10.3847/1538-3881/aabc4f.
[61] Eric Agol, Rodrigo Luger, and Daniel Foreman-Mackey. Analytic Planetary Transit Light Curves and Derivatives for Stars with Polynomial Limb Darkening. The Astronomical Journal, 159(3):123, March 2020. doi: 10.3847/1538-3881/ab4fee.

[62] Fitting tess data. URL https://gallery.exoplanet.codes/en/latest/tutorials/tess/.
[63] Quick fits for tess light curves. URL https://gallery.exoplanet.codes/en/latest/tutorials/quick-tess/.
[64] Gaussian process models for stellar variability. URL https://gallery.exoplanet.codes/en/latest/tutorials/stellar-variability/.
[65] Jon M. Jenkins, Joseph D. Twicken, Sean McCauliff, Jennifer Campbell, Dwight Sanderfer, David Lung, Masoud Mansouri-Samani, Forrest Girouard, Peter Tenenbaum, Todd Klaus, Jeffrey C. Smith, Douglas A. Caldwell, A. D. Chacon, Christopher Henze, Cory Heiges, David W. Latham, Edward Morgan, Daryl Swade, Stephen Rinehart, and Roland Vanderspek. The TESS science processing operations center. In Gianluca Chiozzi and Juan C. Guzman, editors, Software and Cyberinfrastructure for Astronomy IV, volume 9913 of Society of Photo-Optical Instrumentation Engineers (SPIE) Conference Series, page 99133E, August 2016. doi: 10.1117/12.2233418.
[66] Tess science support center. URL https://heasarc.gsfc.nasa.gov/docs/tess/data-access.html.
[67] Lightkurve Collaboration, J. V. d. M. Cardoso, C. Hedges, M. Gully-Santiago, N. Saunders, A. M. Cody, T. Barclay, O. Hall, S. Sagear, E. Turtelboom, J. Zhang, A. Tzanidakis, K. Mighell, J. Coughlin, K. Bell, Z. Berta-Thompson, P. Williams, J. Dotson, and G. Barentsen. Lightkurve: Kepler and TESS time series analysis in Python. Astrophysics Source Code Library, December 2018.
[68] Regressioncorrector, 2020. URL https://docs.lightkurve.org/api/lightkurve.correctors.RegressionCorrector.html.
[69] Nikolay Balov. Gelman–Rubin convergence diagnostic using multiple chains, 2016. URL https://blog.stata.com/2016/05/26/gelman-rubin-convergence-diagnostic-using-multiple-chains/.
[70] George Ho. Cookbook - bayesian modelling with pymc3, Jun 2018. URL https://eigenfoo.xyz/bayesian-modelling-cookbook/.
[71] Colin Carroll. Pragmatic probabilistic programming, 2019. URL https://colcarroll.github.io/hmc_tuning_talk/.

APPENDIX A

A.1 Summary Statistics

Table A.1: Summary Statistics for Kepler Rotation Periods (Prot)

| star | corr | sector | RotGP Prot neff | RotGP Prot rhat | ExoRotGP Prot neff | ExoRotGP Prot rhat |
|---|---|---|---|---|---|---|
| Kepler-107 | pdcsap | [2] | 434 | 1.001732 | 713 | 1.000793 |
| Kepler-155 | pdcsap | [5] | 363 | 1.002341 | 351 | 1.000044 |
| Kepler-17 | pdcsap | [1] | 2434 | 1.000449 | 133 | 1.011821 |
| Kepler-39 | pdcsap | [3] | 1777 | 1.000684 | 2079 | 0.999797 |
| Kepler-43 | pdcsap | [1] | 273 | 1.002061 | 38 | 1.021115 |
| Kepler-45 | pdcsap | [1] | 16 | 1.090170 | 2179 | 0.999874 |
| Kepler-78 | pdcsap | [1] | 2307 | 0.999762 | 2426 | 0.999785 |
| Kepler-75 | pdcsap | [2] | 735 | 1.008040 | 1018 | 1.001703 |
| Kepler-96 | pdcsap | [2] | 1599 | 0.999750 | 909 | 1.001857 |

Table A.2: Summary Statistics for Unambiguous TESS Rotation Periods (Prot)

| star | corr | sector | RotGP Prot neff | RotGP Prot rhat | ExoRotGP Prot neff | ExoRotGP Prot rhat |
|---|---|---|---|---|---|---|
| CoRoT-18 | pdcsap | [6] | 152 | 1.007700 | 8 | 1.182371 |
| HAT-P 11 | reg | [14, 15] | 524 | 0.999844 | 1565 | 0.999844 |
| HIP-65 A | pdcsap | [1] | 2119 | 1.001054 | 2390 | 1.000286 |
| WASP-140 | pdcsap | [4, 5] | 132 | 1.000452 | 2478 | 1.000727 |
| WASP-167 | reg | [10] | 5 | 1.143132 | 699 | 0.999900 |
| WASP-173 A | pdcsap | [2] | 1895 | 0.999940 | 26 | 1.025491 |
| WASP-8 | pdcsap | [2] | 48 | 1.024741 | 106 | 1.000023 |

Table A.3: Summary Statistics for Ambiguous TESS Rotation Periods (Prot)

| star | corr | sector | RotGP Prot neff | RotGP Prot rhat | ExoRotGP Prot neff | ExoRotGP Prot rhat |
|---|---|---|---|---|---|---|
| HATS-16 | reg | [2] | 493 | 1.000816 | 141 | 0.999754 |
| HATS-18 | reg | [10] | 487 | 1.001161 | 228 | 1.013279 |
| HATS-47 | reg | [13] | 94 | 1.018849 | 50 | 1.002597 |
| Qatar-1 | reg | [24, 25] | 556 | 1.000078 | 1294 | 0.999874 |
| WASP-166 | reg | [8] | 514 | 0.999760 | NaN | NaN |
| WASP-19 | reg | [9] | 107 | 1.000251 | 126 | 1.001053 |
| WASP-93 | pdcsap | [17] | 109 | 1.000347 | NaN | NaN |

Table A.4: Summary Statistics for Radii of Kepler Planets (Rjup)

| planet | corr | sector | ExoGP Rjup neff | ExoGP Rjup rhat | ExoRotGP Rjup neff | ExoRotGP Rjup rhat |
|---|---|---|---|---|---|---|
| Kepler-107 e | pdcsap | [2] | 1180 | 1.000103 | 2040 | 0.999818 |
| Kepler-155 b | pdcsap | [5] | 1361 | 1.000431 | 2265 | 0.999824 |
| Kepler-17 b | pdcsap | [1] | 349 | 1.000426 | 406 | 0.999761 |
| Kepler-39 b | pdcsap | [3] | 1597 | 0.999921 | 1710 | 1.000192 |
| Kepler-43 b | pdcsap | [1] | 990 | 1.000297 | 1409 | 1.000207 |
| Kepler-45 b | pdcsap | [1] | 1078 | 0.999863 | 897 | 1.000380 |
| Kepler-78 b | pdcsap | [1] | 209 | 0.999896 | 328 | 0.999752 |
| Kepler-75 b | pdcsap | [2] | 116 | 1.024303 | 1171 | 1.000372 |
| Kepler-96 b | pdcsap | [2] | 2610 | 1.000187 | 2016 | 1.000044 |


Table A.5: Summary Statistics for Radii of TESS Planets (Rjup)

| planet | corr | sector | ExoGP (bins=15) Rjup neff | ExoGP (bins=15) Rjup rhat | ExoRotGP (bins=15) Rjup neff | ExoRotGP (bins=15) Rjup rhat | ExoGP Rjup neff | ExoGP Rjup rhat | ExoRotGP Rjup neff | ExoRotGP Rjup rhat |
|---|---|---|---|---|---|---|---|---|---|---|
| CoRoT-18 b | reg | [6] | 794 | 0.999752 | 775 | 1.000461 | 2262 | 1.001800 | 2521 | 1.002270 |
| HAT-P 11 b | reg | [14, 15] | 746 | 1.003343 | 272 | 1.001698 | 578 | 0.999989 | 918 | 0.999891 |
| HATS-16 b | reg | [2] | 2536 | 0.999779 | 708 | 1.001517 | 2857 | 0.999792 | 3037 | 1.000029 |
| HATS-18 b | reg | [10] | 60 | 1.010132 | 837 | 1.001517 | 2743 | 0.999757 | 3074 | 1.000146 |
| HATS-47 b | reg | [13] | NaN | NaN | 181 | 1.016623 | 724 | 1.000079 | 76 | 1.025100 |
| HIP-65 A b | pdcsap | [1] | 290 | 1.003998 | 145 | 1.002125 | 406 | 1.000625 | 325 | 0.999894 |
| Qatar-1 b | reg | [24, 25] | 1743 | 0.999806 | 2390 | 1.000229 | 1327 | 1.000013 | 1843 | 0.999767 |
| WASP-140 b | pdcsap | [4, 5] | 996 | 0.999862 | 620 | 1.000319 | 2052 | 0.999783 | 285 | 1.016361 |
| WASP-166 b | reg | [8] | 849 | 1.000569 | 1283 | NaN | 1949 | NaN | NaN | NaN |
| WASP-167 b | reg | [10] | 1962 | 0.999916 | 2385 | 0.999954 | 2315 | 0.999793 | 2521 | 1.000479 |
| WASP-173 A b | pdcsap | [2] | 2034 | 0.999902 | 1968 | 1.000787 | 2086 | 0.999806 | 2795 | 0.999949 |
| WASP-19 b | reg | [9] | 2936 | 1.000040 | 2149 | 0.999850 | 2172 | 0.999752 | 1901 | 1.000084 |
| WASP-8 b | pdcsap | [2] | 33 | 1.018239 | 2728 | 1.000394 | 2241 | 0.999988 | 2285 | 0.999796 |
| WASP-93 b | pdcsap | [17] | 175 | 0.999789 | 243 | 1.012911 | 81 | 1.004389 | NaN | NaN |
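The rhat columns in Tables A.1 to A.5 are Gelman–Rubin-style convergence diagnostics: values close to 1 indicate that the chains agree, while values such as 1.18 (CoRoT-18, ExoRotGP) flag chains that have not mixed. The classic form of the statistic can be sketched as below. This is a minimal NumPy illustration, not the pipeline's code: the function name `gelman_rubin` and the synthetic chains are hypothetical, and the implementation used by PyMC3 may differ in detail.

```python
import numpy as np

def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for an array of shape (n_chains, n_samples)."""
    chains = np.asarray(chains, dtype=float)
    _, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()       # W: mean within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)    # B: scaled variance of the chain means
    var_hat = (n - 1) / n * within + between / n     # pooled estimate of the posterior variance
    return np.sqrt(var_hat / within)

# Synthetic example: four well-mixed chains, and the same chains with two of them offset.
rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 2000))
stuck = mixed + np.array([[0.0], [0.0], [3.0], [3.0]])

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
print(f"well-mixed chains: R-hat = {rhat_mixed:.4f}")
print(f"offset chains:     R-hat = {rhat_stuck:.4f}")
```

Because of the (n - 1)/n factor, this form of the statistic can fall slightly below 1 for well-mixed chains, which is consistent with entries such as 0.9998 in the tables.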

A.2 Power spectrum of Celerite kernel and SHO

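Recall that the power spectrum of the celerite kernel in equation (x) is:

$$S(\omega) = \sum_{j=1}^{J} \sqrt{\frac{2}{\pi}} \, \frac{(a_j c_j + b_j d_j)(c_j^2 + d_j^2) + (a_j c_j - b_j d_j)\,\omega^2}{\omega^4 + 2(c_j^2 - d_j^2)\,\omega^2 + (c_j^2 + d_j^2)^2}$$

We set the following values:

$$a_j = S_0 \omega_0 Q, \qquad b_j = \frac{S_0 \omega_0 Q}{\sqrt{4Q^2 - 1}}, \qquad c_j = \frac{\omega_0}{2Q}, \qquad d_j = \frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}$$

We plug the values into the power spectrum of the celerite kernel:

$$(a_j c_j + b_j d_j)(c_j^2 + d_j^2) = \left( (S_0 \omega_0 Q)\left(\frac{\omega_0}{2Q}\right) + \left(\frac{S_0 \omega_0 Q}{\sqrt{4Q^2 - 1}}\right)\left(\frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}\right) \right) \left( \left(\frac{\omega_0}{2Q}\right)^2 + \left(\frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}\right)^2 \right) = S_0 \omega_0^4$$

$$(a_j c_j - b_j d_j)\,\omega^2 = \left( (S_0 \omega_0 Q)\left(\frac{\omega_0}{2Q}\right) - \left(\frac{S_0 \omega_0 Q}{\sqrt{4Q^2 - 1}}\right)\left(\frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}\right) \right) \omega^2 = 0$$

$$2(c_j^2 - d_j^2)\,\omega^2 = 2\left( \left(\frac{\omega_0}{2Q}\right)^2 - \left(\frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}\right)^2 \right) \omega^2 = \frac{\omega^2 \omega_0^2}{Q^2} - 2\,\omega^2 \omega_0^2$$

$$(c_j^2 + d_j^2)^2 = \left( \left(\frac{\omega_0}{2Q}\right)^2 + \left(\frac{\omega_0}{2Q}\sqrt{4Q^2 - 1}\right)^2 \right)^2 = \omega_0^4$$

We obtain the power spectrum of the stochastically-driven damped harmonic oscillator (SHO):

$$S(\omega) = \sqrt{\frac{2}{\pi}} \, \frac{S_0 \omega_0^4 + 0}{\omega^4 + \frac{\omega_0^2 \omega^2}{Q^2} - 2\,\omega^2 \omega_0^2 + \omega_0^4} = \sqrt{\frac{2}{\pi}} \, \frac{S_0 \omega_0^4}{(\omega^2 - \omega_0^2)^2 + \frac{\omega_0^2 \omega^2}{Q^2}}$$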
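The substitution above can also be checked numerically. The sketch below evaluates a single term of the general celerite power spectrum with the SHO coefficients and compares it with the closed-form SHO spectrum. The function names and the test values of S0, ω0 and Q are arbitrary choices made for illustration (with Q > 1/2 so that the square root is real); they are not taken from the thesis.

```python
import numpy as np

def celerite_psd(omega, a, b, c, d):
    """Power spectrum of a single celerite term with coefficients a, b, c, d."""
    w2 = omega**2
    numerator = (a * c + b * d) * (c**2 + d**2) + (a * c - b * d) * w2
    denominator = w2**2 + 2.0 * (c**2 - d**2) * w2 + (c**2 + d**2) ** 2
    return np.sqrt(2.0 / np.pi) * numerator / denominator

def sho_coefficients(S0, w0, Q):
    """Coefficients (a, b, c, d) that turn a celerite term into an SHO term (Q > 1/2)."""
    root = np.sqrt(4.0 * Q**2 - 1.0)
    return S0 * w0 * Q, S0 * w0 * Q / root, w0 / (2.0 * Q), w0 * root / (2.0 * Q)

def sho_psd(omega, S0, w0, Q):
    """Closed-form power spectrum of the stochastically driven damped harmonic oscillator."""
    return np.sqrt(2.0 / np.pi) * S0 * w0**4 / ((omega**2 - w0**2) ** 2 + w0**2 * omega**2 / Q**2)

# Arbitrary illustrative values, not fitted parameters.
S0, w0, Q = 1.3, 2.0, 3.0
omega = np.linspace(0.01, 10.0, 500)

general = celerite_psd(omega, *sho_coefficients(S0, w0, Q))
closed_form = sho_psd(omega, S0, w0, Q)
max_rel_diff = np.max(np.abs(general - closed_form) / closed_form)
print(f"maximum relative difference: {max_rel_diff:.2e}")
```

The two spectra should agree to floating-point precision for any Q > 1/2, which is the regime in which the coefficients above are real.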

A.3 Example of Document Outputted by Pipeline: WASP-173 A

WASP-173 A
tic_id: TIC 77031414
sy_snum: 2
sy_pnum: 1
st_spectype: G3
st_age: 6.78
st_teff: 5800.0
rotperiod: 7.765
martins_variability_type: unambiguous_rotation
period: 1.38665318
r_pl_jup: 1.2
ecc: 0.0
b: 0.4

Stitched Lightcurve

Box Least Squares

Transit Least Squares

LS Periodogram

Autocorrelation

Outliers

MAP Parameters
deltaQ: 425.1052
Q0: 4.3817
rotperiod: 14.0883
amp: 97.5902
w0: 62.6499
Sw4: 5450.7142
s2: 3.9037
omega: 1.5435
ecc: 0.0549
r_pl: 0.1253
r_pl_jup: 1.2188
ror: 0.1131
b: 0.2397
t0: 1355.1957
period: 1.3867
r_star: 1.1076
m_star: 1.0559

Computation Time: (307.0, 37.43838400000095)

Summary Statistics

| param | mean | sd | mc_error | hpd_2.5 | hpd_97.5 | n_eff | Rhat |
|---|---|---|---|---|---|---|---|
| logdeltaQ | 4.590668 | 9.511877 | 0.299745 | -15.039249 | 22.045737 | 822.123311 | 0.99991 |
| logQ0 | 5.680951 | 4.954875 | 0.352366 | 0.611034 | 16.481772 | 96.019511 | 0.999759 |
| rotperiod | 8.206696 | 1.027926 | 0.09641 | 7.719286 | 8.332843 | 26.947864 | 1.025491 |
| logamp | 4.009899 | 0.839737 | 0.024094 | 2.503701 | 5.739687 | 984.896011 | 1.000782 |
| logw0 | -1.282285 | 3.339007 | 0.248705 | -7.211138 | 4.967853 | 79.991573 | 1.015819 |
| logSw4 | -1.701533 | 4.252762 | 0.203335 | -9.166141 | 2.530074 | 300.653382 | 1.001006 |
| logs2 | 1.366413 | 0.010436 | 0.00016 | 1.345374 | 1.386187 | 5057.767243 | 0.999807 |
| omega | 0.772399 | 1.723311 | 0.036413 | -2.994627 | 3.120486 | 2211.792391 | 0.999984 |
| ecc | 0.147002 | 0.124252 | 0.003906 | 3.8e-05 | 0.418518 | 1015.798055 | 1.003226 |
| r_pl | 0.124205 | 0.005436 | 0.000111 | 0.113718 | 0.134642 | 2795.477166 | 0.999949 |
| ror | 0.112947 | 0.000774 | 1.8e-05 | 0.111484 | 0.114461 | 1948.245011 | 0.99979 |
| b | 0.186951 | 0.097663 | 0.002467 | 0.002727 | 0.344361 | 1580.989783 | 0.999758 |
| t0 | 1355.195716 | 0.00024 | 6e-06 | 1355.195257 | 1355.196192 | 1951.328479 | 1.000253 |
| period | 1.386658 | 1.8e-05 | 0.0 | 1.386623 | 1.386693 | 5198.115122 | 0.999751 |
| r_star | 1.099654 | 0.046997 | 0.000936 | 1.010088 | 1.19234 | 2881.772177 | 0.999932 |
| m_star | 1.061854 | 0.07779 | 0.001475 | 0.904434 | 1.208179 | 3468.553177 | 0.999863 |
| u_star__0 | 0.280515 | 0.086945 | 0.001807 | 0.108933 | 0.44168 | 2281.550885 | 1.000097 |
| u_star__1 | 0.095316 | 0.169147 | 0.003822 | -0.212426 | 0.406911 | 1963.602632 | 0.99988 |
| mean | -0.315829 | 4.870473 | 0.228118 | -12.37907 | 9.140438 | 410.305351 | 1.001661 |

Results

| param | gp_mean | gp_error | ref_mean | ref_error | ref_name |
|---|---|---|---|---|---|
| deltaQ | 98.56029 | (48872.70152, 6060.08864) | | | |
| Q0 | 293.22826 | (3.93068, 1371.45992) | | | |
| rotperiod | 8.2067 | (0.0981, 0.10144) | 7.765 | (1.37, 1.37) | Martins et al. |
| amp | 55.14132 | (2.08188, 2.46412) | | | |
| w0 | 0.2774 | (7.47954, 4.00825) | | | |
| Sw4 | 0.1824 | (46.00484, 49.45206) | | | |
| s2 | 3.92126 | (1.01096, 1.01045) | | | |
| omega | 0.7724 | (1.37878, 1.79047) | | | |
| ecc | 0.147 | (0.07852, 0.14269) | 0.0 | | Hellier et al. 2019 |
| r_pl_jup | 1.20871 | (0.05167, 0.05449) | 1.2 | (0.06, 0.06) | Hellier et al. 2019 |
| ror | 0.11295 | (0.00079, 0.00079) | | | |
| b | 0.18695 | (0.11862, 0.09956) | 0.4 | (0.08, 0.08) | Hellier et al. 2019 |
| t0 | 1355.19572 | (0.00024, 0.00022) | | | |
| period | 1.38666 | (2e-05, 2e-05) | 1.38665318 | (0.0, 0.0) | Hellier et al. 2019 |
| r_star | 1.09965 | (0.04588, 0.0494) | 1.11 | (0.05, 0.05) | Hellier et al. 2019 |
| m_star | 1.06185 | (0.08247, 0.07971) | 1.05 | (0.08, 0.08) | Hellier et al. 2019 |
| u_star__0 | 0.28052 | (0.09392, 0.08693) | | | |
| u_star__1 | 0.09532 | (0.16884, 0.18193) | | | |
| mean | -0.31583 | (3.96915, 2.28509) | | | |

Parameter Plots

Model Components

Residuals

Corner Plots
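In the WASP-173 A output above, the gp_mean values that the Results table reports for deltaQ, Q0, amp, w0, Sw4 and s2 appear to be the exponentials of the posterior means of the corresponding log-parameters in the Summary Statistics. The short check below illustrates this relationship using the numbers printed in the two tables. It is an observation about the printed values, not a description of the pipeline's code, and the dictionary names are hypothetical.

```python
import math

# Posterior means of the log-parameters, copied from the Summary Statistics.
log_means = {
    "deltaQ": 4.590668,
    "Q0": 5.680951,
    "amp": 4.009899,
    "w0": -1.282285,
    "Sw4": -1.701533,
    "s2": 1.366413,
}

# gp_mean values copied from the Results table.
reported = {
    "deltaQ": 98.56029,
    "Q0": 293.22826,
    "amp": 55.14132,
    "w0": 0.2774,
    "Sw4": 0.1824,
    "s2": 3.92126,
}

# Exponentiate each log-parameter mean and compare with the reported value.
converted = {name: math.exp(value) for name, value in log_means.items()}
for name in log_means:
    rel_diff = abs(converted[name] - reported[name]) / reported[name]
    print(f"{name:>7}: exp(log mean) = {converted[name]:.5f}, reported = {reported[name]}, rel. diff = {rel_diff:.1e}")
```

Note that the exponential of a posterior mean in log space is not the posterior mean in linear space, so these values are best read as point estimates in the transformed parameterization.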

MACHINE LEARNING TO IMPROVE THE CHARACTERISATION OF STELLAR ACTIVITY AND EXOPLANET TRANSITS IN NOISY LIGHTCURVES FROM TESS AND KEPLER

VICTORIA FOING¹ (VICTORIA.FOING@STUDENT.UVA.NL), ANA HERAS², BERNARD FOING² (¹UVA, ²ESTEC)

ABSTRACT

Gaussian Processes (GPs) can be useful for the task of modelling stellar and instrumental systematics in lightcurves because they can identify patterns in data without prior knowledge of the functional form of the model [1][2]. Few studies have applied Gaussian Processes to describe the stellar activity and the transit signals in TESS data. Considering the novelty of the TESS data and previous experience from Kepler, this thesis work is focusing on applying this machine learning method to TESS and Kepler lightcurves. We seek to answer the following research questions:

RQ1: How accurately can we model the stellar activity and transit signals in TESS and Kepler lightcurves with Gaussian Processes?
RQ2: To what extent can this method interpret the rotation periods of the stars?
RQ3: To what extent can this method improve transit exoplanet characterization?

METHODOLOGY & RESULTS

A Gaussian Process (GP) is used to model the stellar activity, background granulation, and exoplanet transits simultaneously.
The GP is implemented using the exoplanet and PyMC3 toolkits, which allow for fast and scalable computation [3][4][5].
The GP consists of:
- A RotationTerm kernel to describe rotational modulation [1][2][3].
- A SHOTerm kernel to describe background granulation [1][2][3].
- A mean transit model to describe transit signals [6][7][8].
The GP parameters are initialized with traditional methods (see Fig. 2).
The GP is optimized to find the MAP parameters (see Fig. 3).
Markov Chain Monte Carlo (MCMC) is run to approximate the posterior distributions of the parameters of interest (see Fig. 4) [4]:
- Rotation parameters: rotation period, amplitude, quality factor
- Planet parameters: transit period, radius of planet, time of first transit, impact parameter, eccentricity

TESS lightcurve of WASP-62 (TIC 149603524)

Figure 1: TESS lightcurve of WASP-62, Sector 1 (~30 days), at 30-minute cadence (binsize=15) [9]. WASP-62 exhibits rotational modulation and hosts a large planet of 1.32 Jupiter radii that orbits it every 4.41 days. The green, regression-corrected version of the lightcurve is selected because it contains the least instrumental noise while preserving the stellar activity signatures.
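The caption's "30 minute cadence (binsize=15)" is consistent with averaging 15 consecutive points of a 2-minute-cadence lightcurve. A minimal sketch of such binning is given below. It assumes evenly sampled data without gaps and that binsize counts consecutive points; the function name `bin_lightcurve` and the synthetic data are hypothetical and do not reproduce the Lightkurve routine used for the poster.

```python
import numpy as np

def bin_lightcurve(time, flux, binsize=15):
    """Average consecutive groups of `binsize` points; a trailing remainder is dropped."""
    time = np.asarray(time, dtype=float)
    flux = np.asarray(flux, dtype=float)
    n_bins = len(time) // binsize
    keep = n_bins * binsize
    binned_time = time[:keep].reshape(n_bins, binsize).mean(axis=1)
    binned_flux = flux[:keep].reshape(n_bins, binsize).mean(axis=1)
    return binned_time, binned_flux

# Synthetic one-day lightcurve at 2-minute cadence (assumed input cadence).
cadence_days = 2.0 / (60.0 * 24.0)
time = np.arange(720) * cadence_days
flux = 1.0 + 1e-3 * np.sin(2.0 * np.pi * time / 0.5)

binned_time, binned_flux = bin_lightcurve(time, flux, binsize=15)
binned_cadence_minutes = np.diff(binned_time) * 24.0 * 60.0
print(len(binned_time), "bins, spacing of", binned_cadence_minutes[0], "minutes")
```

Binning reduces the number of points the GP has to process at the cost of smearing short-timescale features such as transit ingress and egress.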

Initializing the parameters of the GP

Lomb-Scargle (LS) Periodogram Box Least Squares (BLS) Autocorrelation Function (ACF)

Figure 2a (left): LS is applied to the lightcurve to get the initial rotation period estimate: rotation period = 2.20 days (note: other peaks at 4.41 days and 5.88 days) [3].
Figure 2b (center): BLS is applied to the lightcurve to get the initial transit parameter estimates: transit period = 4.41 days, time of first transit = 1326, transit depth = 0.0132 [3].
Figure 2c (right): ACF is applied to the lightcurve to show that the transit period estimate is probably correct but the 2.20-day rotation period estimate is probably incorrect, because it is a harmonic (half) of the 4.41-day transit period. The 5.88-day peak in the LS periodogram is a more suitable rotation period estimate.
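The reasoning in Figure 2c, that a periodogram peak at an integer fraction of the transit period is suspect, can be expressed as a simple check. The sketch below is illustrative only: the function name `harmonic_order`, the 2% tolerance, and the maximum order of 4 are assumptions rather than values from the poster, and only integer ratios are tested.

```python
def harmonic_order(candidate, reference, max_order=4, rtol=0.02):
    """Return n if candidate is close to reference / n or reference * n, else None."""
    for n in range(1, max_order + 1):
        for target in (reference / n, reference * n):
            if abs(candidate - target) <= rtol * target:
                return n
    return None

transit_period = 4.41  # days, from the BLS estimate in Figure 2b

print(harmonic_order(2.20, transit_period))  # 2: half of the transit period
print(harmonic_order(5.88, transit_period))  # None: not an integer harmonic
```

A peak flagged in this way is not necessarily wrong, but it suggests that the periodogram is responding to the transits rather than to rotational modulation, so another peak may be a better starting value.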

Optimizing the GP to find the MAP parameters

Sampling the posterior with MCMC

Figure 3: After optimization, the model can separate stellar activity (green) and transit signals (blue) [3][4].
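The additive structure described under Methodology (a mean transit model plus GP kernels for stellar variability) can be illustrated with a small dense-matrix sketch. The code below is not the exoplanet or celerite implementation: it uses an O(N³) Cholesky factorization instead of the scalable solver, a hypothetical box-shaped transit in place of a limb-darkened model, and a single SHO kernel at the rotation period as a stand-in for the RotationTerm. All parameter values are made up for the example.

```python
import numpy as np

def sho_kernel(tau, S0, w0, Q):
    """SHO covariance function for the underdamped case Q > 1/2."""
    tau = np.abs(tau)
    eta = np.sqrt(1.0 - 1.0 / (4.0 * Q**2))
    return S0 * w0 * Q * np.exp(-w0 * tau / (2.0 * Q)) * (
        np.cos(eta * w0 * tau) + np.sin(eta * w0 * tau) / (2.0 * eta * Q)
    )

def box_transit(t, period, t0, duration, depth):
    """Hypothetical box-shaped mean model: flux drops by `depth` during transit."""
    phase = (t - t0 + 0.5 * period) % period - 0.5 * period
    return np.where(np.abs(phase) < 0.5 * duration, -depth, 0.0)

def gp_log_likelihood(t, flux, flux_err, mean, kernels):
    """Gaussian log-likelihood with a dense covariance matrix (not scalable)."""
    tau = t[:, None] - t[None, :]
    K = sum(k(tau) for k in kernels) + np.diag(flux_err**2)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L, flux - mean)
    return (-0.5 * alpha @ alpha
            - np.sum(np.log(np.diag(L)))
            - 0.5 * len(t) * np.log(2.0 * np.pi))

# Synthetic data: a box transit plus white noise (illustrative values only).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 10.0, 300)
sigma = 1e-3
truth = box_transit(t, period=4.41, t0=1.0, duration=0.3, depth=0.0132)
flux = truth + rng.normal(0.0, sigma, t.size)
flux_err = np.full(t.size, sigma)

kernels = [
    lambda tau: sho_kernel(tau, S0=1e-7, w0=2.0 * np.pi / 5.88, Q=5.0),            # rotation-like term
    lambda tau: sho_kernel(tau, S0=1e-8, w0=2.0 * np.pi / 0.5, Q=1.0 / np.sqrt(2)),  # granulation-like term
]

ll_transit = gp_log_likelihood(t, flux, flux_err, truth, kernels)
ll_flat = gp_log_likelihood(t, flux, flux_err, np.zeros_like(t), kernels)
print(f"log-likelihood with transit mean: {ll_transit:.1f}")
print(f"log-likelihood with flat mean:    {ll_flat:.1f}")
```

In this toy setup the likelihood is expected to favour the mean model that contains the transit, which is the sense in which the mean function and the kernels share the job of explaining the lightcurve.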

REFERENCES

[1] Foreman-Mackey et al., 2017, The Astronomical Journal 154, 1538-3881
[2] Barros et al., 2020, Astronomy & Astrophysics 634
[3] Foreman-Mackey et al., 2020, 10.5281/zenodo.1998447
[4] Salvatier et al., 2016, PeerJ CS 2, e55
[5] Theano, 2016, arXiv 1605.02688
[6] Luger et al., 2019, aj 157, 64
[7] Kipping, D.M., 2013, mnras 435, 2152-2160
[8] Agol et al., 2019, arXiv, 1908.03222
[9] Lightkurve Collaboration, 2018, ascl, 1812.013
[10] Astropy Collaboration, 2015, aap 558, A33

Figure 4: Planet parameters after MCMC sampling [3][4][10]. Good convergence for transit period, planet radius (r_pl_jup), first transit (t0). Weaker convergence for impact parameter (b) and eccentricity (ecc).

Acknowledgements: I thank my supervisors at the ESA and the UvA (Patrick Forre) for their support;