Cosmic Shear: Inference from Forward Models
Total Page:16
File Type:pdf, Size:1020Kb
Cosmic Shear: Inference from Forward Models Peter L. Taylor,1, ∗ Thomas D. Kitching,1 Justing Alsing,2, 3, 4 Benjamin D. Wandelt,2, 5, 6, 7 Stephen M. Feeney,2 and Jason D. McEwen1 1Mullard Space Science Laboratory, University College London, Holmbury St. Mary, Dorking, Surrey RH5 6NT, UK 2Center for Computational Astrophysics, Flatiron Institute, 162 5th Ave, New York City, NY 10010, USA 3Oskar Klein Centre for Cosmoparticle Physics, Stockholm University, AlbaNova, Stockholm SE-106 91, Sweden 4Imperial Centre for Inference and Cosmology, Department of Physics, Imperial College London, Blackett Laboratory, Prince Consort Road, London SW7 2AZ, UK 5Sorbonne Universit´e,CNRS, UMR 7095, Institut d'Astrophysique de Paris, 98 bis boulevard Arago, 75014 Paris, France 6Sorbonne Universit´e,Institut Lagrange de Paris (ILP), 98 bis boulevard Arago, 75014 Paris, France 7Department of Astrophysical Sciences, Princeton University, Princeton, NJ 08540, USA Density-estimation likelihood-free inference (DELFI) has recently been proposed as an effi- cient method for simulation-based cosmological parameter inference. Compared to the standard likelihood-based Markov Chain Monte Carlo (MCMC) approach, DELFI has several advantages: it is highly parallelizable, there is no need to assume a possibly incorrect functional form for the likelihood and complicated effects (e.g the mask and detector systematics) are easier to handle with forward models. In light of this, we present two DELFI pipelines to perform weak lensing parameter inference with lognormal realizations of the tomographic shear field { using the C` summary statis- tic. The first pipeline accounts for the non-Gaussianities of the shear field, intrinsic alignments and photometric-redshift error. We validate that it is accurate enough for Stage III experiments and estimate that O(1000) simulations are needed to perform inference on Stage IV data. By comparing the second DELFI pipeline, which makes no assumption about the functional form of the likelihood, with the standard MCMC approach, which assumes a Gaussian likelihood, we test the impact of the Gaussian likelihood approximation in the MCMC analysis. We find it has a negligible impact on Stage IV parameter constraints. Our pipeline is a step towards seamlessly propagating all data- processing, instrumental, theoretical and astrophysical systematics through to the final parameter constraints. I. INTRODUCTION { even using an efficient code such as TREECORR [15] { is extremely computationally demanding. Weak lensing by large scale structure offers some Apart from [16, 17], existing studies of the shear two- of the tightest constraints on cosmological parameters. point statistics [4{8] use a Gaussian likelihood analysis Over the next decade data from Stage IV experiments to infer the cosmological parameters. This approach has including Euclid1 [1], WFIRST2 [2] and LSST3 [3] will drawbacks. For example, with the improved statistical begin taking data. Extracting as much information from precision of next generation data, we will need to propa- these ground-breaking data sets, in an unbiased way, gate complicated `theoretical systematics' (e.g. reduced presents a formidable challenge. shear [18]) and detector effects [14] into the final cosmo- The majority of cosmic shear studies to date focus logical constraints. It is difficult to derive the expected on extracting information from two-point statistics and impact of these effects as is required for a likelihood in particular the correlation function, ξ(θ), in config- analysis. It is much easier to produce forward model realizations. uration space and the lensing power spectrum, C`, in spherical harmonic space [4{8]. While the non-Gaussian It has also recently been claimed that because the true information in the shear field is accessed with higher- lensing likelihood is left-skewed, not Gaussian, parame- ter constraints from correlation functions are biased low arXiv:1904.05364v2 [astro-ph.CO] 29 Jul 2019 order statistics [9, 10], peak counts [11, 12] or machine learning [13], the impact of systematics on the two-point in the σ8 −Ωm plane [19, 20]. The same argument given functions have been extensively studied [14]. For this in these papers applies to the C` statistic. More will be reason we will focus on these statistics and leave the said about this in Section III. higher-order information to a future study. In particu- To overcome these issues, a new method lar we focus on the C` statistic because computing cor- called density-estimation likelihood-free inference relation functions from catalogues with billions galaxies (DELFI) [21{27] offers a way forward. DELFI is a `likelihood-free' method similar to approximate Bayesian computation (ABC) [28], but much more computationally efficient. Using summary statistics ∗ [email protected] (the C` statistic, in this case) generated from full 1 http://euclid-ec.org forward models of the data at different points in cos- 2 https://www.nasa.gov/wfirst mological parameter space, DELFI is used to estimate 3 https://www.lsst.org the posterior distribution. 2 Performing inference on realizations of the data may non-Gaussianities of the shear field. We also estimate seem computationally challenging, but using efficient the number of simulations required for a Stage IV ex- data compression [25, 29] most applications require only periment and check to confirm that we recover the input O(1000) simulations [27]. This is less than the number cosmology from a DA1 anlysis on mock data. Next we of simulations already required to produce a valid es- discuss the feasibility of the DA1 analysis for Stage III timate of the inverse covariance matrix in a Stage IV data in Section V. In Section VI we test the impact likelihood analysis. DELFI is also highly parallelizable. of the Gaussian likelihood approximation by comparing Two additional likelihood-free methods are intro- the DA2 analysis to the LA analysis. In Section VII duced in [30, 31]. The aims of these methods respec- we discuss future prospects for DELFI in cosmic shear tively are to optimally choose points in parameter space, studies, before concluding in Section IX. reducing the number of simulations, and to infer a larger number of parameters. Nevertheless we choose to work with DELFI because cosmic shear simulations are ex- II. COSMIC SHEAR FORMALISM AND THE pensive, so we want to take advantage of parallelism LOGNORMAL FIELD APPROXIMATION and we only need to infer a small number of cosmolog- ical parameters. Methods to deal with a large number A. The Lensing Spectrum of nuisance parameters in DELFI are discussed in [32]. The goals of this paper are threefold: Assuming the Limber [33, 34], spatially-flat Uni- verse [35], flat sky [34] and equal-time correlator ap- • To develop a more realist forward model of the ij proximations [36], the lensing spectrum, C , is given shear field than the one presented in [27], includ- `;GG ing the impact of intrinsic alignments and non- by [4]: Gaussinities of the field, and determine whether Z rH ij qi(r)qj(r) ` this changes the number of simulations needed to C = dr P ; r ; (1) `;GG r2 r perform inference with DELFI. 0 where P (k; r) is the matter power spectrum and the • To test the impact of the Gaussian-likelihood as- lensing efficiency kernel, qi is defined as: sumption used in nearly all cosmic shear studies 2 Z rH 0 and in so doing test whether the data compres- 3H0 Ωm r 0 0 r − r qi(r) = 2 dr ni(r ) 0 ; (2) sion of the C` summary statistic used in DELFI is 2c a(r) r r lossless. 0 and we generate the Ntomo tomographic bins, ni(r ), by • To validate that the forward model presented in dividing the radial distribution function: this paper will be accurate enough to perform in- 2 2 − (z−0:7) − (z−1:2) ference on today's stage III data sets. a1 b2 d2 n (zp) / e 1 + e 1 ; (3) c1 To achieve these aims we develop two cosmic shear for- ward model pipelines for DELFI, using the publicly with (a1=c1; b1; d1) = (1:5=0:2; 0:32; 0:46) [37] into bins available pydelfi4 implementation, summarized in Fig- with an equal number of galaxies per bin. To account ure 1. Pipeline I takes full advantage of the benefits of for photometric redshift error, each bin is smoothed by forward modelling and is intended for application to real the Gaussian kernel: (z−c z +z )2 data-sets, while Pipeline II is intended only for compar- − cal p bias 1 2σz ison with the standard likelihood analysis. We also con- p (zjzp) ≡ e p ; (4) 2πσ (z ) sider 3 different analyses summarized in Table I. DA1 z p is a DELFI analysis using shear Pipeline I. Meanwhile with ccal = 1, zbias = 0 and σzp = A (1 + zp) with A = we compare the DELFI analysis, DA2, to the likelihood 0:05 [38]. analysis, LA, to test the impact of the Gaussian likeli- hood approximation. It is useful for the reader to refer back to Table I and Figure 1 throughout the text. B. Intrinsic Alignments The structure of this paper is as follows. The for- malism of cosmic shear and cosmological parameter in- The tidal alignment of galaxies around massive ha- ference is reviewed in Sections II-III. While DELFI has los adds two additional terms to the lensing spectrum. already been applied to cosmic shear in a simple Gaus- An `II term' accounts for the intrinsic tidal alignment sian field setting [27], in Section IV we go beyond this of galaxies around massive dark matter halos, while and present a more realistic forward model (Pipeline I) a `GI term' accounts for the anti-correlation between which includes the impact of intrinsic alignments and tidally aligned galaxies at low redshifts and weakly lensed galaxies at high redshift. We model this effect using the non-linear alignment (NLA) model [4, 39].