Vartools: a Program for Analyzing Astronomical Time-Series Data✩ J.D

Astronomy and Computing 17 (2016) 1–72 Contents lists available at ScienceDirect Astronomy and Computing journal homepage: www.elsevier.com/locate/ascom Full length article Vartools: A program for analyzing astronomical time-series dataI J.D. Hartman ∗, G.Á. Bakos Princeton University, Department of Astrophysical Sciences, 4 Ivy Lane, Princeton, NJ, 08544, USA article info a b s t r a c t Article history: This paper describes the Vartools program, which is an open-source command-line utility, written Received 3 March 2016 in C, for analyzing astronomical time-series data, especially light curves. The program provides a Accepted 21 May 2016 general-purpose set of tools for processing light curves including signal identification, filtering, light Available online 31 May 2016 curve manipulation, time conversions, and modeling and simulating light curves. Some of the routines implemented include the Generalized Lomb–Scargle periodogram, the Box-Least Squares transit search Keywords: routine, the Analysis of Variance periodogram, the Discrete Fourier Transform including the CLEAN Methods: data analysis algorithm, the Weighted Wavelet Z-Transform, light curve arithmetic, linear and non-linear optimization Methods: statistical Time of analytic functions including support for Markov Chain Monte Carlo analyses with non-trivial Techniques: photometric covariances, characterizing and/or simulating time-correlated noise, and the TFA and SYSREM filtering algorithms, among others. A mechanism is also provided for incorporating a user's own compiled processing routines into the program. Vartools is designed especially for batch processing of light curves, including built-in support for parallel processing, making it useful for large time-domain surveys such as searches for transiting planets. Several examples are provided to illustrate the use of the program. ' 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). 1. Introduction a need for specialized software to analyze it. This need is expected to grow even larger as new time-domain surveys (e.g., LSST, TESS, Measuring the variation over time in the apparent brightness PLATO, HATPI, etc.) continue to come online. of a distant object is one of the primary techniques that an This paper describes Vartools, a program which provides a astronomer may use to study the universe. A few classic examples suite of tools for processing light curves, and which is designed in include using supernovae or pulsating variable stars as standard particular for handling large datasets. An early version of Vartools, candles in setting the cosmic distance ladder (e.g. Hubble, 1929; which included only a handful of period-finding tools, was briefly Riess et al., 1998; Perlmutter et al., 1999), using eclipsing described in Hartman et al.(2008) who used it in analyzing the data binaries to measure the fundamental physical properties of from a variability survey of the open cluster M37. Since that time stars (e.g. Kopal, 1959; Torres et al., 2010), observing microlensing the program has been significantly altered, motivating the more events to constrain the population of MAssive Compact Halo detailed and up-to-date description provided here. Objects (MACHOs) in the Galaxy or to identify extrasolar planets In addition to the numerous standalone software packages available which perform particular analysis tasks on light curves (e.g. Paczynski, 1986; Bond et al., 2004), and finding extrasolar (e.g., transiting planet and eclipsing binary identification and planets by searching for stars that show periodic transit events modeling software; Kovács et al., 2002; Devor, 2005), there are (e.g. Henry et al., 1999; Charbonneau et al., 2000; Konacki et al., several other software suites available for general astronomical 2003). light curve analysis. Notable examples include PyKE (Still and Developments in technology, together with scientific interest in Barclay, 2012),1 the Peranso period analysis program (Paunzen finding rare types of variability, have led to numerous massive time and Vanmunster, 2016), the Period04 program (Lenz and Breger, domain surveys, especially in the optical and infrared bandpasses 2005), Wqed which was developed for the Whole Earth Telescope (see for example Bakos et al., 2013, and references therein). Adding project (Thompson and Mullally, 2009), and the open-source VStar the time dimension to a survey leads to a massive flow of data, and program provided by the American Association of Variable Star Observers.2 I This code is registered at the ASCL with the code entry ascl:1208.016. ∗ 1 Corresponding author. http://keplerscience.arc.nasa.gov/PyKE.shtml. E-mail address: [email protected] (J.D. Hartman). 2 http://www.aavso.org/vstar-overview. http://dx.doi.org/10.1016/j.ascom.2016.05.006 2213-1337/' 2016 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/ 4.0/). 2 J.D. Hartman, G.Á. Bakos / Astronomy and Computing 17 (2016) 1–72 The Vartools programs differs from these in a few key ways. (beginning with ``Name'') are the output of the program. Colors are First, it provides more general purpose analysis routines than used to highlight commands and options passed to Vartools. the other programs. Second, unlike the other programs, it is 1 prompt > vartools -i EXAMPLES/1 -rms\ designed specifically for processing large numbers of light curves. -oneline This includes the ability to seamlessly pass light curves between 3 commands without requiring intermediate temporary files, not Name =EXAMPLES/1 requiring any human interaction to process each light curve, and 5 Mean_Mag_0 = 10.24745 native support for parallel processing of light curves. RMS_0 = 0.15944 General signal processing packages exist for data analysis 7 Expected_RMS_0 = 0.00101 platforms such as matlab, R, idl, or python, but these are Npoints_0 = 3122 typically not tailored for astronomical purposes, and in general are Alternatively, we can apply this to a list of light curves stored in collections of functions (like PyKE), requiring the user to solve a ``EXAMPLES/lc_list'' (each line in this list contains the name of a variety of bookkeeping problems, and in-so-doing to write a fair light curve file to process), and run this using 32 parallel processes amount of code, to apply them to large astronomical time domain (listing 2). A number of more complicated examples are provided datasets. All of these platforms also allow executing system in Section3. commands, so may be incorporated into pipelines built Vartools The next subsection describes the general operation of the for these platforms as well. program, paying special attention to those features relevant for The following section provides an overview of the program, batch processing. The processing algorithms which are included including a general description of its operation, and discussions of as ``commands'' in Vartools are discussed in Section 2.2, built- the processing commands and control options that are provided; in methods for extending Vartools with a user's own processing in Section3 several examples are provided; performance tests routines are discussed in Section 2.3, and options which may are presented in Section4; possible areas for future development be used to control the operation of the program are discussed are discussed in Section5. The appendices include comprehensive in Section 2.4. Performance benchmark tests are provided in listings of the input and output syntax and data formats. Section4. 2. The Vartools program 2.1. General operation The basic operation of Vartools is to read-in a light curve, 2.1.1. Input apply one or more ``commands'' (i.e. processing routines) to the A user may provide either a single light curve or an ascii list of light curve, and output resulting statistics to an ASCII table for light curve files as input to the program. Each light curve is stored later analysis. The program is written in ansi c, with regular in a separate file, either in ascii text format (one measurement per 3 updates provided on the web. This article describes version row, with separate white-space delimited columns for the time, 1.33 of the program, a static copy of which is preserved on magnitude or flux, uncertainties, and other optional information), the website. To allow maximum portability, it can be compiled or as a binary FITS table. The input list file may also be used to with basic functionality without any non-standard external provide light-curve dependent data (e.g., this could be the right libraries, however a number of features have library dependencies ascension and declination coordinates of each star) as needed by 4 (including the cfitsio library, Pence, 1999 ; the GNU scientific the various processing commands invoked by the user. There is 5 library, or GSL, Galassi et al., 2009 ; and the JPL NAIF CSPICE library, flexibility for the user to specify the format of the input files. The 6 Acton, 1996 ). Compilation and installation are carried out using user may also provide a string giving a set of commands that 7 the GNU Build System and has been successfully carried out on a are passed to the shell and applied to the input light curve files number of system architectures (Linux, Mac and Windows). While before being read-in by the program (e.g., this may be used to Vartools may be used to analyze an individual light curve of process compressed light curve files without having to store the interest, it has been developed primarily to be included in pipelines uncompressed files on the disk). which conduct batch processing of large numbers of light curves. As a simple example, listing 1 shows the use of Vartools to 2.1.2. Data flow calculate the average, standard deviation, and expected standard The input light curves are then sent to the commands in deviation of the magnitudes in the light curve stored in the file the order that they are invoked on the command-line.

Load more