STATISTICAL ANALYSIS OF WIDE-ANGLE GALAXY SURVEYS

Sophie Maurogordato
Centre National de la Recherche Scientifique, UA 173
Laboratoire d'Astrophysique Extragalactique et de Cosmologie
Observatoire de Paris-Meudon, 92195 Meudon, France

Abstract

Our knowledge of large-scale clustering has improved very rapidly in the last decade owing to the increasing width and depth of galaxy redshift surveys. I will review hereafter the main statistical tools which have been applied to quantify this clustering. However, three main problems have to be solved in order to reconstruct the mass density field from observations and compare it to the predictions of models: redshift distortions, biasing, and non-linear effects. High-order correlations are necessary to describe the present-day galaxy distribution, and they show a hierarchical relation which is predicted for the mass distribution resulting from gravitational evolution of Gaussian initial fluctuations. In the future, the next generation of galaxy surveys, combined with small-angle measurements of the CMB fluctuations, will allow the models of galaxy formation to be constrained more tightly.

1 Introduction

The history of large-scale clustering in the Universe has undergone a rapid evolution in the last decades. Several groups have focused on the most salient high-density structures evidenced from the angular distribution on the sky and verified their reality by intensive redshift measurements. Under-dense regions have been discovered too, the most spectacular one being the 'Bootes Void' (Kirshner et al. 1981), which spreads over a diameter of 6000 km/s. On the other side, the statistical analysis of the projected galaxy distribution under the care of Peebles began to give substantial results (Hauser and Peebles 1973, Davis and Peebles 1977, Fry and Peebles 1978).
The need for representative three-dimensional samples became urgent and led to the completion of flux-limited redshift catalogs which should allow a real quantitative analysis of the frequency and size of structures in the Universe. The CfA redshift survey was then completed (Huchra et al. 1983), followed by the Southern Sky Redshift Survey (da Costa et al. 1991), the Pisces-Perseus survey (Giovanelli and Haynes 1991) and their respective extensions to fainter magnitudes (de Lapparent et al. 1986, Geller and Huchra 1989, da Costa et al. 1994). The APM 2D catalog (Maddox et al. 1990) has provided the positions of 2 million galaxies on the sky, and is followed by the Stromlo-APM redshift survey (sparse sampling of 1 in 10, Loveday et al. 1992). From the IRAS point source catalog, "whole sky" redshift surveys have been progressively completed at fainter limiting flux: the 2 Jy (Strauss et al. 1990), 1.2 Jy (Fisher et al. 1992), and the 1-in-6 sparse-sampled QDOT survey limited at 0.6 Jy (Lawrence et al. 1995). While the wide-angle surveys give a detailed vision of the structure of the nearby Universe, the 'pencil-beam' surveys, complete to much fainter magnitudes on a very small area of the sky, allow very deep regions of the Universe to be probed (see the review by Valerie de Lapparent, this conference). The recent development of distance indicators independent of redshift gives access to the velocity field and to the dynamical density field. The statistical indicators computed from the different surveys are then confronted with models of galaxy formation, allowing constraints to be set on the nature of the dark matter ingredients, on the value of Ω, and on the 'biasing' mechanism.
I will not enter into details about the description of the galaxy catalogs available today in this paper, as it was covered previously (Maurogordato 1994); I will rapidly address the statistical analysis and evoke the difficulties encountered along the chain of processes needed to compare the data to the predictions of models. For a deeper and complete analysis, see the detailed review by George Efstathiou (1995). On flux-limited 2D and 3D catalogs, various statistical tools are applied in order to characterize the density distribution. From the data, one has access to the distribution of galaxies in redshift space. In order to test various models which give predictions on the mass distribution, the first task to achieve is to understand the relation between the galaxy and the mass density distributions. As a first approximation, the fluctuations of the density fields of mass and galaxies are often assumed to be related by a linear relation: δρ_g/ρ_g = b δρ_M/ρ_M (linear biasing). This assumption embodies the standard model for biased galaxy formation, but several studies in which feedback mechanisms cause the efficiency of galaxy formation to be modulated by environmental effects could lead to a more complicated relation between the mass and the galaxy density fields. A second problem to deal with is that the positions of galaxies are derived via the Hubble law from the radial component of their velocity. Thus, the peculiar motions of galaxies will produce systematic errors on the positions and hence distortions in the clustering pattern. I will first focus on the methods of estimating the well-known two-point correlation function, and give some of its determinations both from angular and redshift catalogs. In the following sections, I will discuss the effects of redshift distortions and of biasing mechanisms, which are essential to model in order to reach the mass distribution.
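The linear-biasing relation above can be made concrete with a toy numerical sketch. The Gaussian field, its grid size, and the bias value b = 1.5 below are illustrative assumptions, not values taken from the text:

```python
import numpy as np

# Toy Gaussian mass-overdensity field (illustrative parameters).
rng = np.random.default_rng(0)
delta_M = rng.normal(0.0, 0.1, size=(32, 32, 32))

# Linear biasing: the galaxy overdensity is assumed proportional
# to the mass overdensity, delta_g = b * delta_M.
b = 1.5  # assumed bias factor (hypothetical value)
delta_g = b * delta_M

# Under linear biasing the two-point statistics scale as b**2,
# i.e. xi_g(r) = b**2 * xi_M(r); the field variances obey the same relation.
print(delta_g.var() / delta_M.var())  # close to b**2 = 2.25
```

The variance ratio recovering b² is the one-point analogue of the statement that the galaxy correlation function is amplified by b² relative to the mass correlation function.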
I will then shift to Fourier space with a confrontation of recent estimates of the power spectrum for different catalogs with the predictions of various models of galaxy formation, once their amplitude is normalized to COBE measurements. High-order correlations will then be addressed through related statistics such as the void probability function and the counts in cells. These promising tools can hopefully set constraints on the type of initial conditions and on the various biasing mechanisms.

2 The two-point correlation function

The two-point correlation function has been the most popular statistical indicator since it was introduced by Peebles in the 70's to analyze the galaxy distribution. The spatial two-point correlation function ξ(r) is directly connected to the power spectrum by a Fourier transform. On 2D catalogs, one can measure the projected angular two-point correlation function w(θ). As it has the property to scale with the depth of the survey D as w(θ) = D⁻¹ W(θD), where W is a function only of the shape of the catalog, a comparison of w(θ) for 2D catalogs with different limits in flux is possible. Moreover, under some specific assumptions, it can be de-projected via the Limber equation to derive the spatial correlation function ξ(r). Although some dilution of the statistics does occur because of the projection on the sky, the 2D estimates are measured with great accuracy, as the number of objects used for the statistical analysis is large, and they have the great advantage of being independent of redshift distortions (see next section). The first estimate of w(θ) was performed on the Lick catalog of counts by Groth and Peebles in 1977, who found a clear power-law behavior w(θ) ∝ θ^(1-γ) with γ = 1.77. The APM catalog (Maddox et al. 1990), providing the angular positions of about 2 million galaxies, has allowed the precision on w(θ) to be improved and, thanks to its depth, to measure it up to large angular separations (θ ≈ 20°).
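The depth-scaling relation quoted above can be checked numerically for a pure power law: rescaling a power-law w(θ) by D⁻¹ W(θD) leaves the logarithmic slope unchanged. The angular grid and the depth factor below are toy assumptions for illustration:

```python
import numpy as np

gamma = 1.77
theta = np.logspace(-2, 0, 50)      # toy angular grid (degrees)
w1 = theta ** (1.0 - gamma)         # w(theta) at reference depth D = 1

# Scaling law w(theta) = D**-1 * W(theta * D) for a survey twice as deep.
D = 2.0
w2 = (1.0 / D) * (theta * D) ** (1.0 - gamma)

# For a pure power law the rescaled w is again a power law with the
# same slope; recover it with a log-log least-squares fit.
slope = np.polyfit(np.log(theta), np.log(w2), 1)[0]
print(slope)  # 1 - gamma = -0.77
```

Only the amplitude of w(θ) shifts with depth, which is why catalogs with different flux limits can be compared through this scaling.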
The power law is confirmed up to 3°, with a slope γ = 1.668. Above 3°, a break is apparent in w(θ). At large scales, the data show an excess of power which is one of the most serious challenges to the standard CDM model.

The availability of flux-limited redshift catalogs has allowed a direct estimate of the direction-averaged redshift correlation function ξ(s), where s = (1/H_0) (V_1^2 + V_2^2 - 2 V_1 V_2 cos θ_12)^(1/2). The two-point correlation function in redshift space is commonly estimated by weighted counts of the galaxy-galaxy pairs DD(s) and of the galaxy-random pairs DR(s) in a random catalog with the same geometry as the galaxy one, in order to account directly for edge effects: DD(s) = Σ_i Σ_j w_i w_j / N_DD and DR(s) = Σ_i Σ_j w_i w_j / N_DR. Then 1 + ξ(s) = DD(s)/DR(s) (Davis and Peebles 1983). However, in presently available catalogs, the mean density is still fluctuating up to scales of the order of the sample size. With the previous estimator, the uncertainty on ξ(s) grows as fast as the relative over-density δ = δρ/ρ. The correlation function is then very difficult to measure correctly at large scales, where fluctuations are small compared to the mean and the accuracy is limited by the uncertainty on the mean density. Hamilton (1993a) introduced a new estimator, 1 + ξ(s) = DD(s) RR(s)/DR(s)^2 (RR(s) is the number of pairs in the random catalog, defined as above), whose dependence on the uncertainty in the mean density is quadratic and not linear at lowest order. In this case, the statistical uncertainty is limited by pair counts. Besides the choice of the most accurate estimator, several weighting schemes have been advocated: the uniform weighting, the selection function weighting, and the minimum variance weighting. The uniform weighting is generally used when working on complete sub-samples limited both in distance and in absolute magnitude, hence of roughly constant density.
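The two estimators above can be sketched with uniform weights and brute-force pair counting in a small one-dimensional toy volume; the catalog sizes, the separation bin, and the 1-D geometry are illustrative assumptions, not a realistic survey configuration:

```python
import numpy as np

# Toy "galaxy" and random catalogs with the same geometry (unit interval).
rng = np.random.default_rng(1)
data = rng.uniform(0.0, 1.0, 200)
rand = rng.uniform(0.0, 1.0, 200)

def pair_counts(a, b, smin, smax):
    """Count pairs with separation in [smin, smax); a is b means auto-pairs."""
    if a is b:
        seps = np.abs(a[:, None] - a[None, :])[np.triu_indices(len(a), k=1)]
    else:
        seps = np.abs(a[:, None] - b[None, :]).ravel()
    return np.count_nonzero((seps >= smin) & (seps < smax))

smin, smax = 0.0, 0.05          # one toy separation bin
nd, nr = len(data), len(rand)

# Normalized pair counts (uniform weights w_i = 1).
DD = pair_counts(data, data, smin, smax) / (nd * (nd - 1) / 2)
DR = pair_counts(data, rand, smin, smax) / (nd * nr)
RR = pair_counts(rand, rand, smin, smax) / (nr * (nr - 1) / 2)

xi_dp = DD / DR - 1.0           # Davis & Peebles (1983): 1 + xi = DD/DR
xi_ham = DD * RR / DR**2 - 1.0  # Hamilton (1993a): 1 + xi = DD*RR/DR**2
print(xi_dp, xi_ham)            # both near 0 for an unclustered catalog
```

Because both catalogs here are unclustered Poisson samples, both estimators should scatter around zero; on real data the Hamilton form is preferred at large separations for the reason given above, its quadratic rather than linear sensitivity to the mean-density error.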
When analyzing magnitude-limited samples, one has to take into account the fall-off of the density with radial distance. In this case, the uniform weighting gives too much weight to foreground galaxies. A classical weighting is to multiply the contribution of each galaxy by the inverse of the selection function, w_i = 1/φ(r) (Davis and Peebles 1983). This, however, gives quite a high weight to distant pairs and increases the white noise at small separations.
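The selection-function weighting can be illustrated with a toy φ(r); the functional form, scale r0, and distances below are hypothetical assumptions, not the selection function of any survey discussed above:

```python
import numpy as np

def phi(r, r0=50.0, alpha=4.0):
    """Toy selection function: fraction of galaxies visible at distance r (Mpc)."""
    return 1.0 / (1.0 + (r / r0) ** alpha)

r = np.array([10.0, 50.0, 150.0])   # toy galaxy distances in Mpc
w = 1.0 / phi(r)                    # w_i = 1/phi(r_i) (Davis & Peebles 1983)
print(w)                            # weights grow steeply with distance
```

The steep growth of the weights with distance is exactly the drawback noted above: a few distant, sparsely sampled pairs can dominate the estimate and inflate the noise.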