BayesVP: a Bayesian Voigt Profile Fitting Package

Mon. Not. R. Astron. Soc. 000, 1–?? (2015). Printed 30 October 2017 (MN LaTeX style file v2.2)
arXiv:1710.09852v1 [astro-ph.GA] 26 Oct 2017

Cameron J. Liang1* and Andrey V. Kravtsov1,2
1 Department of Astronomy & Astrophysics, and Kavli Institute for Cosmological Physics, University of Chicago, Chicago IL 60637
2 Enrico Fermi Institute, The University of Chicago, Chicago, IL 60637, USA
* E-mail: [email protected]

30 October 2017

ABSTRACT

We introduce a Bayesian approach for modeling Voigt profiles in absorption spectroscopy and its implementation in the Python package BayesVP, publicly available at https://github.com/cameronliang/BayesVP. The code fits the absorption line profiles within specified wavelength ranges and generates posterior distributions for the column density, Doppler parameter, and redshifts of the corresponding absorbers. The code uses publicly available, efficient parallel sampling packages to sample the posterior and thus can be run on parallel platforms. BayesVP supports simultaneous fitting of multiple absorption components in a high-dimensional parameter space. We provide other useful utilities in the package, such as explicit specification of priors on model parameters, a continuum model, Bayesian model comparison criteria, and a posterior sampling convergence check.

Key words: methods: data analysis – quasars: absorption lines

1 INTRODUCTION

Absorption line spectroscopy is undoubtedly one of the most important tools for probing the physical properties of absorbing gas in a variety of astrophysical environments. Spectral analyses are ubiquitous in astrophysics and have a long and rich history in studies of stars, galaxies, gas, cosmology and more.

One of the basic tasks in such analyses is the identification and characterization of spectral absorption lines. The most common way to do the latter is to fit a Voigt profile (VP) to the absorption profile and then use its parameters to extract basic properties of the absorbing gas, such as column density, Doppler parameter, and central velocity/redshift.

A number of codes for modeling absorption lines with the Voigt profile are publicly available. For example, VPFIT is a popular χ²-based code that has been used in the community for many years (Carswell et al. 1991). autoVP is another χ² grid-search based code, an automated Voigt profile fitting procedure that chooses the best number of absorption components, designed for Lyα forest statistics (Davé et al. 1997).

While these codes have provided immense value to the community, methods based on χ² grid search have fundamental limitations in Voigt profile fitting. For example, such methods cannot constrain model parameters of non-detections in a statistically rigorous manner. The methods make very specific assumptions about the likelihood and the nature of the flux errors. In addition, it is challenging to capture the degeneracy between column density and Doppler parameter when absorption lines are saturated. Therefore, uncertainties of the best-fit parameters can be under- or overestimated on a coarse grid of the parameter space. Furthermore, the grid search becomes prohibitively expensive computationally, or impossible, when fitting multiple absorption components, because the number of parameters, and thus the number of dimensions of the parameter space, is large. Thus, one cannot reliably and rigorously find the best-fit model for the physical properties of the gas using the optimal number of absorption components. This can affect our physical interpretation of the kinematics of the gas (i.e., the number of velocity components and the total dispersion) and the column density distribution function (for example, see the Appendix in Gurvich et al. 2017).

Motivated by these problems, we developed a Bayesian approach to Voigt profile fitting combined with efficient algorithms for sampling the posterior distribution of parameter values. We introduce an implementation of this method in the Python package BayesVP, which simultaneously addresses these issues and provides additional benefits. These include explicit control of the priors on parameters and constraints in the form of posterior distributions, which allows uniform treatment of detections and non-detections. Using efficient parallel sampling methods, BayesVP can fully explore a high-dimensional parameter space, providing more accurate parameter uncertainties and their degeneracies. This allows us to reliably find the best number of absorption components and to include an additional continuum model. One can also explicitly compare models using Bayesian model comparison criteria. In section 2, we describe the Bayesian approach to absorption line fitting and our specific implementation of this approach in the BayesVP package. We include a short tutorial on the basic usage of the package in the Appendix.

2 THE PACKAGE: BAYESVP

2.1 Posterior Distribution and sampling

In the Bayesian approach we constrain the posterior distribution, $\pi(\vec{\theta})$, of the vector of model parameters (column density, Doppler broadening factor, and redshift), $\vec{\theta} = \{N, b, z\}$, given the vector of spectral fluxes and their uncertainties, $\vec{F}$. The joint posterior distribution is simply proportional to the product of the likelihood $\mathcal{L}(\vec{\theta}|\vec{F})$ and the prior pdf of the parameters:

$$ p(\vec{\theta}|\vec{F}) \propto \mathcal{L}(\vec{\theta}|\vec{F})\, p(\vec{\theta}), \tag{1} $$

where the total likelihood $\mathcal{L}$ is the product of the likelihoods of the individual pixels within a spectral segment of interest:

$$ \mathcal{L}(\vec{\theta}|\vec{F}) = \prod_i \ell_i(\vec{\theta}|F_i). \tag{2} $$

Using methods that allow sampling of unnormalized distributions, such as Markov Chain Monte Carlo (MCMC), we can sample the unnormalized posterior:

$$ \pi(\vec{\theta}|\vec{F}) = \mathcal{L}(\vec{\theta}|\vec{F})\, p(\vec{\theta}). \tag{3} $$

The individual likelihood $\ell_i$ of a given pixel $i$ with flux $F_i$ depends on the noise model. The specific noise characteristics depend on the properties of telescopes and their instruments. In the limit of a large number of photons, we usually assume that $\ell_i$ is a Gaussian:

$$ \ell_i(\vec{\theta}|F_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[ -\frac{\left(\hat{F}_i(\vec{\theta}) - F_i\right)^2}{2\sigma_i^2} \right], \tag{4} $$

where $\hat{F}_i(\vec{\theta})$ is the model flux at pixel $i$ and $\sigma_i$ is the uncertainty of the flux $F_i$ at pixel $i$. However, the noise model can be modified and a different likelihood can be used for low photon counts (e.g., a Poisson distribution). Although BayesVP currently implements only the Gaussian likelihood, this can be changed by supplying a different input log-likelihood function to the Posterior class in Likelihood.py. In BayesVP, we also provide an easy way to specify explicit priors based on knowledge of the specific problem in mind.

We sample the posterior distribution defined by eq. 3 using an efficient sampling method. This is particularly useful in Voigt profile fitting because it is common to fit multiple absorption components simultaneously. The number of parameters can thus grow quickly, since each absorption component adds three parameters (N, b, z). In addition, many sampling methods can be easily parallelized, which allows efficient sampling of the high-dimensional parameter space. Due to blending between components and the relationship between Doppler parameters and column densities in the saturated regime, Voigt profile parameters can be highly correlated. The simplest Metropolis-Hastings MCMC algorithm, which samples the parameter space with isotropic steps, is thus not optimal. We therefore use two MCMC sampling algorithms designed to efficiently sample narrow distributions of highly correlated parameters in high-dimensional spaces. In particular, BayesVP is set up to use the affine-invariant ensemble sampling algorithm of Goodman & Weare (2010) implemented in the emcee python package (Foreman-Mackey et al.

2.2 Voigt Profile Model

The Voigt profile is parameterized by the absorber column density N, Doppler parameter b, and redshift z. The normalized flux at frequency ν (or wavelength λ) is

$$ F(\nu|N, b, z) = \exp[-\tau(\nu)], \tag{5} $$

where the optical depth τ at frequency ν of a system with (N, b, z) is

$$ \tau(\nu|N, b, z) = N \sigma_0 f_{\rm osc}\, \Phi(\nu|b, z), \tag{6} $$

and Φ(ν) is the convolution of a Gaussian profile (due to Doppler broadening) and a Lorentzian profile (due to pressure broadening). For computational speed, we compute the Voigt profile function Φ(ν) via the real part of the Faddeeva function w(x + iy), using the routine in the SciPy package:

$$ \Phi(\nu|b, z) = \frac{\Re[w(x + iy)]}{\sqrt{\pi}\,\Delta\nu_D}, \tag{7} $$

where $x = \Delta\nu/\Delta\nu_D$ and $y = \Gamma/(4\pi\Delta\nu_D)$. Also, $\Delta\nu = \nu - \nu_0 (1 + z)^{-1}$ and $\Delta\nu_D = b\nu/c$. In addition, $\sigma_0 = \sqrt{\pi}\, e^2/(m_e c^2)$ is the cross section.

The constants $\{e, m_e, f_{\rm osc}, \nu_0, \Gamma\}$ are, respectively, the charge and mass of the electron, and the oscillator strength, rest-frame frequency and damping coefficient of an atomic transition. We include atomic parameters from the UV to optical wavelength regimes compiled from Morton (2003). Any additional transitions can be easily added to the supplementary data file included in BayesVP.

2.3 Continuum model

We implement a polynomial continuum model in addition to the Voigt profile model fitting. The addition of the continuum model moves some of the error budget from systematic uncertainties to statistical uncertainties. The continuum is parametrized as a polynomial in BayesVP:

$$ C(\lambda|a_i) = \sum_i a_i (\lambda - \bar{\lambda})^i + \bar{F}, \tag{8} $$

$$ F_{\rm model}(\lambda) = C(\lambda|a_i) \times F_{\rm Voigt}(\lambda|N, b, z), \tag{9} $$

where $\{a_i\}$ are the continuum parameters. Although BayesVP can fit a high-order polynomial, in our experience a linear or quadratic polynomial is sufficient to model the continuum accurately over the wavelength ranges typically used in absorption line fits. Thus, we recommend fitting up to a quadratic polynomial (i.e., three continuum parameters). In any case, the degree of the polynomial should be considerably smaller than the number of spectrum pixels dominated by continuum.

Note that we have pivoted the polynomial fit from the observed wavelength λ to $(\lambda - \bar{\lambda})$, where $\bar{\lambda}$ and $\bar{F}$ are the median wavelength and flux of the input data. In Figure 1, we show a test that includes a continuum model in a synthetic absorption line, where BayesVP successfully captures the correct input parameters
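
As a rough illustration of how equations 4 to 9 fit together, the following Python sketch evaluates a Voigt absorption model with a polynomial continuum and its Gaussian log-likelihood. It is not the BayesVP source code: the function names (voigt_phi, voigt_flux, continuum, gaussian_lnlike) are hypothetical, the product N σ0 f_osc of eq. 6 is passed as a single number called strength so that no atomic constants are hard-coded, the polynomial index in eq. 8 is assumed to start at i = 0, and the Lyα numbers in the demonstration are illustrative inputs only.

```python
import numpy as np
from numpy.polynomial import polynomial as poly
from scipy.special import wofz  # Faddeeva function w(z)

C_KMS = 299792.458           # speed of light in km/s
C_ANGSTROM = 2.99792458e18   # speed of light in Angstrom/s


def voigt_phi(nu, b_kms, z, nu0, gamma):
    """Eq. 7: Phi = Re[w(x + iy)] / (sqrt(pi) * dnu_D), frequencies in Hz."""
    dnu = nu - nu0 / (1.0 + z)        # offset from the redshifted line centre
    dnu_d = b_kms * nu / C_KMS        # Doppler width, b * nu / c
    x = dnu / dnu_d
    y = gamma / (4.0 * np.pi * dnu_d)
    return wofz(x + 1j * y).real / (np.sqrt(np.pi) * dnu_d)


def voigt_flux(nu, strength, b_kms, z, nu0, gamma):
    """Eqs. 5-6, with strength = N * sigma0 * f_osc supplied by the caller."""
    return np.exp(-strength * voigt_phi(nu, b_kms, z, nu0, gamma))


def continuum(wave, coeffs, wave_pivot, flux_pivot):
    """Eq. 8, assuming the sum runs over i = 0 .. len(coeffs) - 1."""
    return poly.polyval(wave - wave_pivot, coeffs) + flux_pivot


def gaussian_lnlike(model_flux, flux, sigma):
    """Natural log of eq. 2 using the Gaussian pixel likelihood of eq. 4."""
    resid = (model_flux - flux) / sigma
    return -0.5 * np.sum(resid**2 + np.log(2.0 * np.pi * sigma**2))


# Demonstration on a synthetic Ly-alpha line (all numbers are illustrative).
wave0, gamma = 1215.67, 6.265e8           # rest wavelength (Angstrom), damping (1/s)
wave = np.linspace(1214.0, 1217.5, 200)   # observed wavelengths in Angstrom
nu, nu0 = C_ANGSTROM / wave, C_ANGSTROM / wave0

absorber = voigt_flux(nu, strength=3.0e11, b_kms=20.0, z=0.0, nu0=nu0, gamma=gamma)
model = continuum(wave, np.array([0.0, 0.01, 0.0]), np.median(wave), 1.0) * absorber  # eq. 9

rng = np.random.default_rng(0)
sigma = np.full_like(wave, 0.02)
flux = model + rng.normal(0.0, 0.02, size=wave.size)   # noisy mock spectrum
print("ln L at the input parameters:", gaussian_lnlike(model, flux, sigma))
```

Adding the log of a prior to gaussian_lnlike gives the log of the unnormalized posterior in eq. 3, which is the kind of function an ensemble sampler such as emcee takes as its log-probability callable.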
