<<

The Astrophysical Journal, 591:575–598, 2003 July 10 # 2003. The American Astronomical Society. All rights reserved. Printed in U.S.A.

A FAST GRIDDED METHOD FOR THE ESTIMATION OF THE POWER SPECTRUM OF THE COSMIC MICROWAVE BACKGROUND FROM INTERFEROMETER DATA WITH APPLICATION TO THE COSMIC BACKGROUND IMAGER S. T. Myers National Radio Astronomy Observatory, P.O. Box O, Socorro, NM 87801 C. R. Contaldi, J. R. Bond, U.-L. Pen, D. Pogosyan,1 and S. Prunet2 Canadian Institute for Theoretical Astrophysics, 60 St. George Street, Toronto, ON M5S 3H8, Canada and J. L. Sievers, B. S. Mason, T. J. Pearson, A. C. S. Readhead, and M. C. Shepherd Owens Valley Radio Observatory, California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125 Received 2002 May 23; accepted 2002 October 8

ABSTRACT We describe an algorithm for the extraction of the angular power spectrum of an intensity field, such as the cosmic microwave background (CMB), from interferometer data. This new method, based on the gridding of interferometer visibilities in the aperture plane followed by a maximum likelihood solution for band powers, is much faster than direct likelihood analysis of the visibilities and deals with foreground radio sources, multiple pointings, and differencing. The gridded aperture-plane estimators are also used to construct Wiener-filtered images using the signal and noise covariance matrices used in the likelihood analysis. Results are shown for simulated data. The method has been used to determine the power spectrum of the CMB from observations with the Cosmic Background Imager, and the results are given in companion papers. Subject headings: cosmic microwave background — methods: data analysis

1. INTRODUCTION Recent detections of the first few of these ‘‘ acoustic peaks ’’ at l < 1000 in the power spectrum (Lange et al. 2001; The technique of interferometry has been widely used in Hanany et al. 2000; Lee et al. 2001; Halverson et al. 2002; radio astronomy to image the sky using arrays of antennas. Netterfield et al. 2002) have supported the standard infla- By correlating the complex voltage signals between pairs of antennas, the field of view of a single element can be subdi- tionary cosmological model with tot 1. Measurement of the higher l peaks and troughs, as well as the damping tail vided into ‘‘ synthesized beams ’’ of higher angular resolu- due to the finite thickness of the last scattering surface, is the tion. In the small-angle approximation, the interferometer forms the Fourier transform of the sky convolved with the next observational step. Interferometers are well suited to the challenge of mapping out features in the CMB power autocorrelation of the aperture voltage patterns. In stan- spectrum, with a given antenna pair probing a characteristic dard radio interferometric data analysis, as described, for l proportional to the baseline length in units of the observing example, in the text by Thompson, Moran, & Swenson wavelength (a 100 projected baseline corresponds to (1986) and the proceedings of the NRAO Synthesis Imaging l 628; see 3). School (Taylor, Carilli, & Perley 1999), the correlations or x There are many papers in the literature on the analysis of visibilities are inverse Fourier transformed back to the CMB anisotropy measurements, estimation of power spec- image plane. However, there are applications such as esti- tra, and the use of interferometry for CMB studies. General mation of the angular power spectrum of fluctuations in the issues for analysis of CMB data sets are discussed in Bond, cosmic microwave background (CMB) where it is the distri- bution of and correlation between visibilities in the aperture Jaffe, & Knox (1998, 2000). Hobson, Lasenby, & Jones (1995) present a Bayesian method for the analysis of CMB or (u, v)-plane that is of most interest. interferometer data, using the three-element Cosmic Aniso- In standard cosmological models, the CMB is assumed to tropy Telescope data as a test case. A description of analysis be a statistically homogeneous Gaussian random field techniques for interferometric observations from the Degree (Bond & Efstathiou 1987). In this case, the spherical har- Angular Scale Interferometer (DASI) was presented in monics of the field are independent and the statistical prop- White et al. (1999a, 1999b), while Halverson et al. (2002) erties are determined by the power spectrum C , where l l report on the power spectrum results from the first season of labels the component of the Legendre polynomial expan- DASI observations. Ng (2001) discusses CMB interferome- sion (and is roughly in inverse radians). Bond & Efstathiou try with application to the proposed AMIBA instrument. (1987) showed that in cold dark matter–inspired cosmologi- cal models, there would be features in the CMB power spec- Hobson & Maisinger (2002) have recently presented an approach similar to ours and demonstrate their technique trum that reflected critical properties of the cosmology. on simulated (VSA) data; a brief compari- son of their algorithm with ours is given in Appendix C. In this paper we describe a fast gridded method for the (u, 1 Department of Physics, University of Alberta, Edmonton, AB T6G 2J1, Canada. v)-plane analysis of large interferometric data sets. The basis 2 Institut d’Astrophysique de Paris, 98 bis Boulevard Arago, F-75014 of this approach is to grid the visibilities and perform maxi- Paris, France. mum likelihood estimation of the power spectrum on these 575 576 MYERS ET AL. Vol. 591 compressed data. Our use of gridded estimators is signifi- the temperature field T(x) in direction x as its Fourier cantly different from what has been done previously. In transform T~ ðuÞ (Bond & Efstathiou 1987), where u is the addition to power spectrum extraction, this procedure has conjugate variable to x. We adopt the Fourier convention the ability to form optimally filtered images from the Z Z gridded estimators and may be of use in interferometric F~ðuÞ¼ d2x FðxÞe2iu x x , FðxÞ¼ d2u F~ðuÞe2iu x x observations of radio sources in general. We have applied our method to the analysis of data from ð1Þ the Cosmic Background Imager (CBI). The CBI is a planar interferometer array of 13 individual 90 cm Cassegrain of Bracewell (1986), Thompson et al. (1986), and Taylor antennas on a 6 m pointable platform (Padin et al. 2002). It et al. (1999). In terms of the multipoles l, covers the frequency range 26–36 GHz in 10 contiguous 1 DE ~ 2 1 GHz channels, with a thermal noise level of 2 lK in 6 hr and TðuÞ Cl; l þ 2 2jju ; ð2Þ a maximum resolution of 40 limited by the longest baselines. The CBI baselines probe l in the range of 500–3900. The 90 which we simplify to l ¼ 2jju for the l > 100 of interest in cm antenna diameters were chosen to maximize sensitivity, this paper. For the low levels of anisotropy seen in the CMB but their primary beam width of 45<2 (FWHM) at 31 GHz on these scales, it is convenient to give T in units of lK, and 2 limits the instantaneous field of view, which in turn limits thus Cl will be in units of lK . the resolution in l. This loss of aperture plane resolution can Because the CMB is assumed to be a statistically homoge- be overcome by making mosaic observations, i.e., observa- neous Gaussian random field, the components of its Fourier tions in which a number of adjacent pointings are combined transform are independent Gaussian deviates: (Ekers & Rots 1979; Cornwell 1988; Cornwell, Holdaway, ~ ~ 0 2 0 & Uson 1993; Sault, Staveley-Smith, & Brouw 1996). In the TðuÞT ðÞu ¼ CðjjÞu ðÞu u ; ð3Þ CBI observations, mosaicking a field several times larger where CðjjÞu ¼ C2jju . Because TðxÞ is real, its transform than the primary beam has resulted in an increase in resolu- must be Hermitian, with T~ðuÞ¼T~ ðuÞ, and therefore tion in l by more than a factor of 3, sufficient to discern features in the power spectrum. T~ ðuÞT~ ðÞu0 ¼ T~ ðuÞT~ ðÞu0 ¼ CðjjÞu 2ðÞu þ u0 : ð4Þ The first CBI results were presented in Padin et al. (2001, hereafter Paper I), using earlier versions of the software that Note that it is common to write the CMB power spectrum did not make use of (u, v)-plane gridding, and were far too Cl in a form slow to be used on larger mosaicked data sets. It was there- 2 lðl þ 1Þ l 2 fore essential to develop a more efficient analysis method Cl ¼ Cl Cl , CðjjÞu 2jju CðjjÞu ð5Þ that would be fast enough to carry out extensive tests on the 2 2 CBI mosaic data. The software package described below (White et al. 1999a; Bond et al. 1998, 2000). Constant C has been used to process the first year of CBI data. In the corresponds to equal power in equal intervals of log l. companion papers by Mason et al. (2003, hereafter Paper Although the power spectrum Cl is defined in units of II) and Pearson et al. (2003, hereafter Paper III), the results brightness temperature, the interferometer measurements 26 from passing CBI deep field data and mosaic data, respec- carry the units of flux density S (jansky, 1 Jy ¼ 10 W 2 1 tively, through the pipeline are presented. This paper is m Hz ). In particular, the intensity field on the sky IðxÞ Paper IV in the series. The output from this pipeline is then has units of specific intensity (W m2 Hz1 sr1,orJysr1), used to derive constraints on cosmology (Sievers et al. 2003, and thus to convert between I and T we use hereafter Paper V). Finally, analysis of the excess of power IðxÞ¼fT ðÞTðxÞ with the factor at high l seen in results shown in Paper II in the context of 2 2 x the Sunyaev-Zeldovich effect is carried out, again using the 2 kBgð;T0Þ x e h fT ðÞ¼ 2 ; gð;T0Þ¼ 2 ; x ¼ ; method presented here, in Bond et al. (2003, hereafter c ðÞex 1 kBT0 Paper VI). An introduction to the properties of the CMB power ð6Þ spectrum, the response of an interferometer to the incoming where g corrects for the blackbody spectrum. Note that we radiation, and the computation of the primary beam are have treated the temperature T as small fluctuations about given in xx 2, 3, and 4, respectively. The gridding process is the mean CMB temperature T0 ¼ 2:725 K (Mather et al. presented in x 5, followed by a description of the likelihood 1999), and thus the g appropriate to T0 is used with g 0:98 function and construction of the various covariance matri- at ¼ 31 GHz. ces in x 6. Details on the maximum likelihood solution and We are not restricted to modeling the CMB. For example, the calculation of window functions and component band we might wish to determine the power spectrum of fluctua- powers are given in x 7, while x 8 presents our method for tions in a diffuse galactic component such as synchrotron making optimally filtered images from the gridded estima- emission or thermal dust emission. In this case, one might tors. Finally, a description of the CBI implementation of wish to express I in Jy sr1 but take out a power-law spectral this method and the performance of the pipeline, including shape demonstrations using simulated CBI data sets, is given in x 9, followed by a summary and conclusions in x 10. I ¼ f0ðÞI0; f0ðÞ¼ ; ð7Þ 0 2. THE CMB POWER SPECTRUM where is the spectral index and f0() is the conversion At small angles, the curvature of the sky is negligible and factor that normalizes to the intensity I0 at the fiducial we can approximate the spherical harmonic transform of frequency 0. Note that this normalization is particularly No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 577 useful for fitting out centimeter-wave foreground emission, and thus which tends to have a power-law spectral index in the range Z 1 <<1 that is significantly different from that for the V ¼ d2v P ðvÞT~ ðvÞþe ; ð13Þ thermal CMB ( 2). In addition, foregrounds will also k k k tend to have a power spectrum shape different from that of CMB, which must be included in the analysis (see x 6.4). where fk ¼ fT ðkÞ is the Planck conversion factor for the CMB given in equation (6). It is easiest to write these in operator notation, with 3. RESPONSE OF THE INTERFEROMETER V PT~ e ; 14 A visibility Vk formed from the correlation of an interfer- ¼ þ ð Þ ometer baseline between two antennas with projected sepa- ration (in the plane perpendicular to the source direction) b where V and e are the visibility and noise vectors, respec- ~ meters observed at wavelength meters measures (in the tively, P is our kernel, and T is the transform of the temper- ~ absence of noise) the Fourier transform of the sky intensity ature field. In this representation T can be thought of as a modulated by the response of the antennas (Thompson vector of cells in (u, v)-space. et al. 1986) Z b VðuÞ¼ d2x AðxÞIðxÞe2iu x x; u ¼ ; ð8Þ 4. THE PRIMARY BEAM In order to determine the response of the array to the where AðxÞ is the primary beam and u ¼ðu; vÞ is the conju- radiation field, we need to know the primary beam AðxÞ of gate variable to x. For angular coordinates x in radians, u the antenna elements and its Fourier transform A~ðuÞ.In has the dimensions of the baseline or aperture in units of the general, for each frequency channel, each baseline has a pri- wavelength. The Fourier domain is also referred to as the (u, mary beam that is the Fourier transform of the cross-corre- v)-plane or aperture plane in interferometry for this reason. lation of the voltage illumination functions across the We define the direction cosines aperture of each antenna (see Thompson et al. 1986 for a detailed treatment of the interferometer response). For a xk ¼ðDxk; DykÞ; Dxk ¼ cos k sinðk 0Þ ; real and symmetric primary beam that is identical between Dyk ¼ sin k cos 0 cos k antennas, then the transforms are symmetric and real, and we can ignore the differences between cross-correlation and sin cosð Þð9Þ 0 k 0 convolution and write between the point at right ascension and declination k, k 2 A~ðuÞ¼g^ ? g^ , AðxÞ¼ g~ ð15Þ and the center of the mosaic 0, 0. For the CBI, data are taken keeping the phase center fixed on the pointing center for the voltage illumination function g^ðr;Þ across the x by shifting the phase with the beam and rotating the plat- k radius of the aperture r ¼ jjr at frequency , where g~ is the form to maintain constant parallactic angle during a scan, Fourier transform of g^. The CBI beams have been measured so that the response to a point source at the center of the and are nearly identical and symmetric, and thus we will use field IðxÞ¼2ðx x Þ is constant, and thus k a single mean primary beam and its transform for the array. 2iuk x xk For a heterogeneous array, the individual beams can be AðxÞ¼Akðx xkÞe ð10Þ used with some added complication. in equation (8), where Ak is the normalized primary beam For most antennas, such as those used in the CBI, the pri- response at the observing frequency of visibility k. Then, by mary beam width scales linearly with observing wavelength, application of the Fourier shift theorem, it is easy to show and thus g^ðrÞ is approximately constant with wavelength. that Then, we can define G(r) as the normalized aperture auto- Z correlation function and write 2 2iuk x ðxxkÞ Vk ¼ d x Akðx xkÞIk ðxÞe þ ek Z 1 A~ kðuÞ¼ Gðjju kÞð16Þ x A 2 ~ ~ 2iv xk 0 ¼ d v Akðuk vÞIk ðvÞe þ ek ; ð11Þ for a channel centered at wavelength k, with ~ where Ak is the Fourier transform of the primary beam Ak Z Z and I ðxÞ is the sky brightness field (expressed in units such 2 1 A ¼ d2u Gðjju Þ¼ rdrGðrÞð17Þ as Jy sr1) with transform ~I ðvÞ. The instrumental noise on 0 k 2 k 0 the complex visibility measurement is represented by ek. The (u, v)-plane resolution of an interferometer in a single normalizing the response to give unity gain on sky at the pointing is thus limited by the convolution with A~. How- beam center [Að0Þ1]. If gðrÞ¼g^ðrÞ=gð0Þ, then GðrÞ¼ ever, these subaperture spatial frequencies can be recovered g ? g. by using the phase gradient in the exponential expðÞ 2iv x xk The two-dimensional primary beam response, AðxÞ,is from a raster of mosaic pointings fxkg, provided that the determined by means of measurements of a bright radio spacing of the pointings is sufficiently small to avoid aliasing source over a two-dimensional grid of offset pointings cen- as discussed in Appendix A. tered on the source. The central lobe of AðxÞ for the CBI is To aid us later on, we introduce a convolution kernel well approximated by a circular Gaussian, which is charac- terized by its dispersion , which is related to the FWHM 2iv x xk x PkðvÞ¼fkA~ kðuk vÞe ð12Þ 1=2 ax by x ¼ ax=ðÞ8ln2 . The Fourier transform of an 578 MYERS ET AL. Vol. 591 infinite circular Gaussian is given by mosaic pointings. Thus, a direct sum of visibilities taken at the same u but over the whole mosaic x will result in an esti- 2 2 1 2 2 1 x =2x ~ jju =2u mator that has a higher effective resolution in the (u, v)- AðxÞ¼e , AðuÞ¼ 2 e ;u ¼ ; 2u 2x plane. The result is that we can speed up the likelihood esti- ð18Þ mation at the cost of complicating the covariance matrix. In general, this matrix can be computed relatively quickly as it where u is the Gaussian dispersion in Fourier space. The is an N2 process, and thus this is a worthwhile trade-off ver- function G(r) is therefore sus the N3 cost of calculating the likelihood. The estimators

r2=2r2 derived in Appendix A are not orthogonal combinations of GðrÞ¼e g ð19Þ the original visibilities, and thus some information loss is expected. However, the tests performed in 9.1 show that for Gaussian radius r ¼ . For the CBI the measured pri- x g u these estimators are unbiased, and comparisons to results mary beam (see Paper III) has a ¼ 45<2ð31 GHz=Þ,so x obtained using the visibilities directly show that there is no ¼ 28:50 at 31 GHz ( ¼ 0:967 cm), which corresponds to u noticeable loss in sensitivity. Thus, our gridding can be con- rg ¼ 27:56 cm. For the CBI software pipeline, instead of using a sidered to be an efficient form of (lossy) compression using Gaussian approximation to G(r), we have chosen to model the beam as a signal template. the antenna illumination g(r) as a Gaussian truncated at In Appendix A we argue that a Di formed by a linear com- bination of visibilities will give an estimate of the weighted both the dish edge and the secondary blockage radius ~ ~ 8 average of I or T around (u, v) locus ui. We have from > ; r r ; equation (A13) > 0 jj inner <> ðÞr=s 2 D e ; rinner < jjr < ; D ¼ QV þ QQV ; ð21Þ gðrÞ¼> 2 ð20Þ > :> D where the kernel Q is defined in equation (A13) and the ker- 0; jjr ; 2 nel for the conjugate visibilities Q is defined in equation (A17). In particular, where for the CBI antennas rinner ¼ 7:75 cm. Note that if !k x g(r) and G(r) were infinite circular Gaussians, then s ¼ rg.A ~ 2iui xk Qik ¼ Ak ðuk uiÞe ; best-fit taper parameter s is obtained using the measured zi !k primary beam A, giving s ¼ 30:753 cm or an edge taper of ~ 2iui x xk Qik ¼ Ak ðuk uiÞe ; ð22Þ 0.118 (18.6 dB of power) at the dish edge. We then numeri- zi cally tabulate the autocorrelation G(r) assuming s ¼ 30:753 cm, which is then interpolated in the code when A~ is where zi is the normalization factor given in equation (A21) ! 2 required. This model is a better fit to the observed beam and k ¼ k is the visibility weight given in equation (A19). than a pure Gaussian beam (see Fig. 1 in Paper III for a plot By operating with the gridding kernel on equation (14), of this model). we get The resolution in (u, v)- or l-space is set by the width of D ¼ RT~ þ n; R ¼ QP þ Q P; n ¼ Qe þ QQe ; ð23Þ A~ kðuÞ. For a Gaussian approximation to the beam, the dis- persion in multipole l is l ¼ 2u ¼ 1=x, and the FWHM where we define R as the convolution kernel that operates is al ¼ 8ln2=ax. For ax ¼ 45<2 at 31 GHz we have FWHM on the transform of the intensity (the gridded version of P) al ¼ 422 (l ¼ 179). Given that features are expected in the and n is the gridded noise. The conjugate to P defined in power spectrum of widths significantly less than this, it is equation (12) is given by highly desirable to reduce the effective resolution width of ~ 2iv x xk the CBI by at least a factor of 3 using mosaicking. PkðvÞ¼fkAkðuk vÞe ; ð24Þ which gathers the conjugate visibilities under the transfor- mation uk !uk. 5. GRIDDED ESTIMATORS Although it is not necessary to do so, it is convenient to The principal problem in using likelihood (see x 6) to construct the Di on a regular lattice in ui with a spacing du. determine confidence limits on the power spectrum for CBI Thus, the grid ‘‘ cells ’’ represented by the Di represent an data is the large number of visibilities compounded by the interpolation using Q of the visibilities onto the (u, v)-plane. large number of mosaic pointings (typically 7 6 or larger). This will be useful when using the estimators to form filtered Even a modest reduction in the number of matrix elements images (x 8). passed to the likelihood calculation will greatly aid the com- If it is desired that the visibilities be used directly, for putation. This suggests that we grid the visibilities before example, when the data sets are small, then the ungridded computing the likelihood function. For an effective resolu- case can be recovered by setting Qik ¼ ik and Qik ¼ 0, giv- tion in the aperture plane determined by the primary beam ing D ¼ V and R ¼ P, with no loss of generality in the and mosaic size, there is little use in sampling below this derivations. smearing scale, and we can define an optimum gridding scheme that minimizes the quantity of data and information loss (the gridding is a form of compression). 6. THE LIKELIHOOD FUNCTION We implement this by defining estimators DðuÞ for the To carry out the power spectrum estimation, we form the true complex brightness transform that are linear combina- likelihood of the data given covariance matrices for the sig- tions of the measured visibilities. These estimators bin nal, noise, and foregrounds. Since the estimators are linear together data from the different frequency bands and combinations of the visibilities, which we assume are made No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 579 up of Gaussian noise and Gaussian signal components, we v can use a multivariate Gaussian probability distribution to Support for Support for A (u −v) D j j * describe the estimators also. Because is complex, it is eas- V V <>Vi Vj ier to deal with the real and imaginary parts by packing <>ij them together in a double-length real vector 90cm ReD d ¼ ð25Þ A (u +v) A (u −v) ImD i i i i written here as a column vector of length 2Nest. The log likelihood function for a real multivariate Gaussian probability distribution is u

1 1 t 1 ln LðxjqÞ¼Nest ln 2 2 lnðdet CÞ2 d C d ; ð26Þ where dt is the transpose of d and 100cm hiReDReDt hiReDImDt C ¼ ð27Þ hiImDReDt hiImDImDt is a block matrix of the real and imaginary covariances. The vector q represents the parameters of the model or theory against which the data are being measured (see below). These parameters are contained in C. Fig. 1.—Graphical representation of the regions of support in the aper- In terms of D and D , we can write turep planeffiffiffi for the correlation between visibilities on short baselines B < 2D. Note that both Vi and its conjugate Vi ¼ VðuiÞ have overlap- 1 1 ping support for visibility V , and this must be taken into account in ReD ¼ D þ D ; ImD ¼ D D ð28Þ j 2 2i computing the covariance matrix element. and therefore model parameters to be determined by maximizing the like- t 1 y t lihood. The factor q is the amplitude of the covariance hiReDReD ¼ 2 Re DD þ hiDD ; res due to a residual Gaussian foreground, and q is the ampli- hi¼ImDImDt 1 Re DDy hiDDt ; src 2 tude of the covariance contributed by known point sources; hiReDImDt ¼1 Im DDy hiDDt ; there may be more than one of each of these types of fore- 2 t 1 y t ground covariance matrices. The qsrc and qres can be treated hiImDReD ¼ 2 Im DD þ hiDD ; ð29Þ as adjustable parameters to be determined by maximum t likelihood, or they can be held fixed at a priori values, in where Dy ¼ðDÞ is the Hermitian transpose of D (a row which case C src and Cres are constraint matrices with their vector containing the complex conjugate of a column vec- corresponding terms in equation (30) behaving like tor) and DDy is the tensor or outer product of D and Dy, additional noise terms. which is a matrix with elements DiDj . t In the following sections we consider each of the terms It is important to include the covariances of DD , as well N S src res y C , CB, C , and C in turn. If we write as those for DD . Normally, only one of a given visibility Vk y t or its conjugate Vk pwillffiffiffi correlate with Vk0 . However, for M ¼ DD ; M ¼ hiDD ; ð31Þ short baselines b < 2D (less than 127.3 cm for the 90 cm CBI dishes), there may be overlap between the support for a then in each case we calculate the contributions to the given visibility and both another visibility and its conjugate, covariance matrix for the real and imaginary parts of the as shown in Figure 1, and thus both may be nonzero. Note estimators using equations (27) and (29): that the correlation between distant conjugate pairs is small, "# since the overlap occurs far out in the antenna response A~, 1 Re M þ M 1 Im M M C ¼ 2 2 ; ð32Þ although it is nonnegligible on the shortest CBI baselines 1 Im M þ M 1 Re M M where the overlap occurs at the 0.57D point (illustrated in 2 2 Fig. 1) for perpendicular 1 m baselines withp theffiffiffi beam with the individual covariance matrices given by insertion response 30%. Outside the baseline range b > 2D one of of the appropriate contribution to M and M for that com- S S hVk Vk0 i or hVkVk0 i will be zero. ponent, e.g., MB and MB to compute the block elements of S The covariance matrix C can be split into a sum of inde- CB. pendent contributions from instrumental noise C N , the CMB signal C S, and foreground signals C src and C res.We S S further split C into a sum of terms C B from separate l 6.1. The Noise Covariance Matrix bands of the power spectrum, X The instrumental noise correlations are assumed to be Gaussian and independent between different baselines and C ¼ CN þ q CS þ q C src þ q C res : ð30Þ B B src res frequency channels. For the CBI, tests have been carried B out on the data that show this to be true to a high level of The factors fqB; B ¼ 1; ...; NBg are the ‘‘ band powers ’’ accuracy. In this case, the noise contributions to the real (Bond et al. 1998) for bins with centers at l ¼ lB and are the and imaginary parts of the visibilities are independent 580 MYERS ET AL. Vol. 591 zero-mean normal deviates with We furthermore write the radial integral over $ ¼ l=2

2 as a sum with respect to Cl of equation (5): Ree Ree 0 Ime Ime 0 0 ; Ree Ime 0 0 ; hik k ¼ hik k ¼ k kk hik k ¼ X Wlij l ð33Þ MS ¼ C W ¼ W ; ij l l lij ij 2 l and thus we can write X W lij l M S ¼ C W ¼ W ; ð40Þ eey ¼ E; hieet ¼ 0 ð34Þ ij l l lij ij 2 l 2 for real noise matrix E, where E 0 ¼ 2 0 . kk k kk where Wlij is the variance window function (e.g., Knox 1999). It can be shown that the noise contributions n to the esti- We define the band powers fqB; B ¼ 1; ...; NBg by con- mators defined in equation (23) have the contributions to shape structing Cl piecewise with respect to a fiducial shape C , the covariance elements M and M defined in equation (31) X l shape given by Cl ¼ qBCl Bl ; ð41Þ B MN ¼ nny ¼ QEQy þ QQE Q y ; N t t t where M ¼ hinn ¼ QEQ þ QQEQ ; ð35Þ  1 l 2 B using the covariances of e given in equation (34). This is Bl ¼ ð42Þ N l B assembled into the covariance matrix C using equation 0 62 (32). In general, the gridding kernel Q will map a given visi- breaks the power spectrum into non-overlapping bands. The N bility to more than one estimator, and thus C will have standard choice for the shape is Cshape ¼ 1 for equal power nonzero off-diagonal elements. Furthermore, if there are l per log l interval, with qB then giving the band powers in units noise correlations between baselines or channels, then the of T2. Then, to calculate S , we construct band versions of N C B structure of C will be even more complicated. the covariance matrix elements in equation (36), X X 6.2. The CMB Signal Covariance Matrix W l W l MS ¼ Cshape ; M S ¼ Cshape ; ð43Þ B l l Bl B l l Bl The CMB contribution to the visibility covariance matrix l l ~ P P is given by the covariance of the RT term in equation (23), S S S S where M ¼ B qBMB and M ¼ B qBMB. These are MS ¼ R T~T~ y Ry; M S ¼ R T~T~ t Rt ; ð36Þ then combined following the prescription in equation (32) S to assemble the C B. ~ ~ y ~ ~ t where hTT i and hTT i are given in equations (3) and The variance window function WijðvÞ is the convolution S S (4), respectively. Then, the elements of M and M for of the A~ iðvÞ and A~ jðvÞ, and thus its width is characteristic of estimators i and j are the square of theffiffiffi Fourier transform primary beam, or Z Z p FWHM Dl al= 2. Thus, we would expect in a single field S 2 Mij ¼ d v CðjjÞv RiðvÞRj ðvÞ¼2 d$$Cð$ÞWijð$Þ ; to be able to achieve a limiting resolution of Dl 300 for Z Z al ¼ 422 at 31 GHz. This will be increased by the mosaick- ing by a factor roughly equal to the extent of the half-power M S ¼ d2v CðjjÞv R ðvÞR ðvÞ¼2 d$$Cð$ÞW ð$Þ ; ij i j ij width of the mosaic relative to that of a single field. In prac- ð37Þ tice, the limiting useful width for the l bins for the band powers will be set by the band-band correlations introduced with in the maximum likelihood estimation procedure (see xx 7 Z and 9.1 for further discussion and examples). 1 2 W ð$Þ¼ d R ð$; ÞRð$; Þ ; ij 2 i j 6.3. Known Point-Source Constraint Matrices Z0 2 1 Consider a set of Nc point sources at positions xc with flux W ijð$Þ¼ d Rið$; ÞRjð$; Þ ; ð38Þ 2 0 densities Sc()(c ¼ 1; ...; Nc). The intensity field at fre- quency is then given by where to aid in breaking up the CMB covariance matrices X 2 into bands we write the integrations in terms of polar Four- IðxÞ¼ ScðÞ ðx xcÞ ; ð44Þ ier coordinates ðu; vÞ!ð$; Þ ($ ¼ jjv ). c As an illustration, consider the case without gridding. which is assumed to be uncorrelated with other intensity Then, R P, and using equation (12) in equation (37), we ¼ components like the CMB. The effect V src on the visibilities get k Z Vk (e.g., eq. [11]) is then given by the sum over sources X x S 2 ~ ~ 2iv ðxkxk0 Þ src 2iuk x ðxcxkÞ Mkk0 ¼ fkfk0 d v CðjjÞv Akðuk vÞAk0 ðÞuk0 v e ; Vk ¼ Vck; Vck ¼ ScðkÞAkðxc xkÞe ; Z c x S 2 ~ ~ 2iv ðxkxk0 Þ Mkk0 ¼ fkfk0 d v CðjjÞv Akðuk vÞAk0 ðÞuk0 þ v e ð45Þ

ð39Þ where Vck is the contribution to visibility k of source c.We assume that the positions of the sources can be determined for the covariance matrix element between visibilities Vk with negligible uncertainty through radio surveys and that and Vk0 . the errors are due to uncertainties in the measurements of No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 581

src the flux densities. Then, the covariance between the source only the vector rc for each source is needed. We can grid contributions to visibilities k and k0 is this onto the estimators X X src src src src src Vk Vk0 ¼ hiScðkÞSc0 ðÞk0 Akðxc xkÞAk0 Dc ¼ Qrc þ Qrc ð54Þ c c0 and then 2iukðxcxkÞ 2iu 0 x ðx 0 x 0 Þ ðÞxc0 xk0 e e k c k ; ð46Þ X X src src src;y src src src;t M ¼ Dc Dc ; M ¼ Dc Dc ; ð55Þ where S S 0 0 is the flux density covariance matrix h cð kÞ c ð k Þi c c 0 between sources c and c at frequencies k and k0 , respec- src src src tively. There is a similar covariance matrix hVk Vk0 i. which are used to build C . These can be passed through the gridding procedure using There are two components to the source flux density equation (21) to make uncertainties Sc(k), one from the uncertainties on the source frequency spectrum, and the other from the uncer- src src src D ¼ QV þ QQV ð47Þ tainties on the flux density measurements and any extrapo- lation of the measured flux densities to the observing and used to construct the covariance elements frequencies k (using the estimated source spectrum). As an Msrc ¼ DsrcDsrc;y ; M src ¼ hiDsrcDsrc;t ð48Þ example, consider a source with a flux density Sc(0) mea- sured with standard deviation S0 at frequency 0 and a using equation (31). known power-law frequency spectrum with spectral index This covariance matrix can be greatly simplified if we can , subtract off the mean source flux densities, leaving a zero- k k k mean residual error. Let the true source flux density Sc()be ScðkÞ¼Scð0Þf ; ; f ; ¼ : ð56Þ obs the sum of the measured flux density Sc ðÞ and an error 0 0 0 S (). If our measurements of these foreground sources are c Then, it is easy to show that accurate, then the residuals Sc() should be uncorrelated between sources (they are due to measurement errors) ScðkÞ S0 and have zero mean. In this case, we can make corrected ¼ ; ð57Þ cor ScðkÞ Scð0Þ visibilities Vk , X X with the fractional uncertainty in the flux density /S cor obs obs Sc c Vk ¼ Vk Vck ¼ Sc ðkÞ remaining independent of the frequency. c c On the other hand, consider the case in which there is 2iuk x ðxcxkÞ Akðxc xkÞe ; ð49Þ now an uncertainty in the spectral index between k and 0. Then, our extrapolation factor f ð=0;Þ, which we to be used in place of V in subsequent analysis. Then, we are write as left with the fluctuating component X lnð=0Þ V src ¼ V src V obs ; f ; ¼ e ; ð58Þ k k ck 0 c V S A e2iuk x ðxcxkÞ ; propagates to the extrapolated flux density as ck ¼ cð kÞ kðxc xkÞ ð50Þ ScðkÞ which we must deal with statistically. The covariance ¼ ln ; ð59Þ between the source error contributions to the visibilities, ScðkÞ 0 assuming that the flux density errors are independent which can be negative: for two channels flanking the fiducial between sources (but not between frequency channels for frequency (e.g., < <0) the errors will be anticorre- the same source), is given by 0 X lated. Note that we have approximated the resulting distri- src src bution as Gaussian. In general it is not, e.g., for a Gaussian Vk Vk0 ¼ hiScðkÞScðÞk0 Akðxc xkÞAk0 c distribution in we find a lognormal distribution in S(). Although the dominant spectral error is due to the extrap- 2iuk x ðxcxkÞ 2iu 0 x ðxcx 0 Þ ðÞxc xk0 e e k k olation from a frequency 0 outside the range of the CMB ð51Þ instrument, there is an additional error due to an error in the spectral index over the frequency channels of the visi- src src k and similarly for hVk Vk0 i. Finally, if the covariance is bilities. This is as if you extrapolated using one spectrum separable, e.g., appropriate for the band center of the instrument, but obs 0 0 when the flux densities Sc ðkÞ are extrapolated from band hiScðÞScðÞ ¼ ScðÞScðÞ ; ð52Þ obs center Sc ðÞ, there is an error from using the wrong over then we can write the band. This is handled using equation (59) with another X appropriate to the uncertainty in the spectral index over src src src src Vk Vk0 ¼ ck ck0 ; the k. c For the CBI analysis, we have approximated both the flux src 2iuk x ðxcxkÞ density error and the spectral extrapolation error as a single ck ¼ ScðkÞAkðxc xkÞe : ð53Þ equivalent flux density error. For the CBI, the frequency src src The other covariance hVk Vk0 i can be computed in the span (26–36 GHz, or d= ¼16%) is small enough that same way. Because we have assumed that the covariance is we can approximate the spectral index uncertainty as an separable, we can speed up the covariance calculation as effective flux density uncertainty c extrapolated to band 582 MYERS ET AL. Vol. 591

res res center from 0 using 0, using a frequency factor fk ¼ f ðkÞ appropriate to the  res foreground in question. The matrix C is then obtained by ð Þf ; ; substitution of Mres and M res as usual using equation (32). Sc k c () Although it is possible to break up the Gaussian foreground 2 2 2 2 2 2 component into bands as we did the CMB, it is preferable to c ¼ f ;0 S0 þ S0 ln ; ð60Þ 0 0 compute the foreground covariance matrix in a single band res using its shape Cl , to reduce the degeneracy with the where need not equal 0 and should reflect the spectral CMB; you cannot distinguish between the two in narrow l index over the observing band, not the one used for bands where the shapes are unimportant. extrapolation from 0. An example of a foreground that strongly affects the CBI In principle, if the true mean flux densities for the sources data is that from point sources below the limit for subtrac- are correctly subtracted from the visibilities and the cova- tion contaminating the CBI fields. This residual statistical riance matrix Csrc is built using the correct elements background, in the limit where there are many sources per hScðkÞSc0 ðk0 Þi, then inclusion of this as a noise term in C field, can be modeled as a white-noise Gaussian field with using qsrc ¼ 1 would remove the effects of these sources from constant angular power spectrum and power-law frequency our power spectrum estimation in a statistical sense. How- spectrum. Each individual source has a flux density drawn ever, there are a number of factors that make this difficult. If from a differential number count distribution dN(S)/dS, the source flux density measurements have a calibration which represents the number of sources per steradian with error, then the errors will not be independent between sour- flux densities between S and S þ dS at observing fre- ces. In addition, the fainter sources (which are still signifi- quency . The angular clustering in these sources is very cant contributors to the signal) have flux densities that are small and can be neglected. often extrapolated from much lower frequencies (e.g., the The contribution of a source c to visibility Vk was given ‘‘ NVSS ’’ sources in Papers II and III). Furthermore, since by Vck in equation (45). The sources are independently dis- there are a relatively small number of discrete sources con- tributed in flux density and on the sky, so  tributing to a given field, it is not clear that we are in the stat- X istical limit where the flux density covariance is an accurate VkVk0 ¼ ScðkÞScðÞk0 Akðxc xkÞ description of what is happening to the data. For these rea- c  sons, for the CBI analysis we treat the covariance matrix x x src 2iuk ðxcxkÞ 2iuk0 ðxcxk0 Þ C constructed using the approximation in equation (60) Ak0 ðxc xk0 Þe e as a constraint matrix for the nuisance parameters due to *+ X the sources and set qsrc to a high enough amplitude to project 1 ¼ S ð ÞS ðÞ 0 B 0 out the contaminated modes in the data. Because the source c k c k kk modes are spread out by the effect of the synthesized beam c res (the ‘‘ point-spread function ’’ [PSF] in imaging terms), set- ¼ C ðÞk;k0 Bkk0 ; ð63Þ ting q to too high a value will start to down-weight modes src where the angular average can be written as an integral over that are not significantly contaminated, while too low a Z value will eat into the noise and CMB signal power in those 2 2iuk x ðxxkÞ modes without down-weighting them sufficiently, thus bias- Bkk0 ¼ d x Akðx xkÞAk0 ðÞx xk0 e ing the affected band powers low. The exact values to be 2iu 0 x ðxx 0 Þ used thus depend on the signal and noise levels in the data; e k k ; ð64Þ we refer the reader to Papers II and III for descriptions of what was chosen for the CBI analysis. See Bond et al. (1998) with as the normalizing solid angle. This integral is just a Fourier transform, and so for a description of the constraint matrix formalism and the Z technique of projection. x 2 ~ ~ 2iv ðxkxk0 Þ Bkk0 ¼ d v Akðuk vÞAk0 ðÞuk0 v e ; ð65Þ

6.4. Gaussian Foregrounds and Residual Point Sources which is the same as the CMB visibility covariance matrix In x 2 it was mentioned that a single foreground compo- Mkk0 in equation (39) with fk ¼ fk0 ¼ 1 and CðvÞ¼1. Simi- nent could be modeled with a modified covariance matrix, larly, the other covariance hVkVk0 i reduces to M kk0 . Thus, in power spectrum shape, and frequency dependence. As long the stochastic limit the residual source background behaves as these foregrounds can be treated as a Gaussian random as a Gaussian random field and can thus be treated as we do field, they can be processed in the same manner as the CMB. the CMB signal in x 6.2 but with a power spectrum shape res Therefore, once the amplitude and shape of the foreground Cl ¼ C ðk;k0 Þ, which is constant over l for a given pair of fluctuation power spectrum CresðvÞ are input, we compute frequency channels. the foreground covariance matrix elements The amplitude of the covariance matrix is the ensemble average of the source power per solid angle, which is X W res X W res Mres ¼ l Cres; M res ¼ l Cres ; ð61Þ obtained by integration over the flux density and spectral l l l l l l index distributions Z Z res res Smax 1 where the variance window functions W and W are res 2 dNðSÞ C ðÞ¼ ; 0 dS S d pðjS; Þ given by substituting for R in equation (38) a new Rres built k k dS 0 Smin 1 from a kernel 0 k k ; ð66Þ res res ~ 2iv x xk 2 Pk ðvÞ¼fk Akðuk vÞe ð62Þ 0 No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 583 where we have again assumed that the spectrum is a power take this differencing into account in our correlation law with spectral index over the range of interest for the k analysis. This effectively modifies the window function, and integrate over the number counts over the flux density quenching low spatial frequencies further. range from Smin to Smax. We also assume that there is a large Let us write number of these faint sources over the solid angles of inter- sw main trail trail est (e.g., the CBI primary beam), and thus the Poisson con- Vk ¼ Vk Vk ; xk ¼ xk þ Dxk ð68Þ tribution to the probability can be ignored and we can use m for switching offset Dxk (e.g., 8 in R:A: 2 for the CBI the mean source density given by the number counts dN/dS fields near the celestial equator). Then, from equation (11) at the fiducial frequency 0 for which S is given. The spectral we find index distribution as a function of flux density p S; Z ð j 0Þ must be estimated from radio surveys, although it can be V sw ¼ f d2v A~ ðu vÞT~ ðvÞe2iv x xk 1 e2iv x Dxk þ esw ; uncertain at the high frequencies and faint levels at which k k k k k the CMB experiments are carried out. If pðjS;0Þ¼ ð69Þ pðj0Þ and thus is independent of flux density, then it can sw main trail be shown (e.g., Appendix B) that the integrals in equation where the switched noise ek ¼ ek ek . In terms of the (66) can be evaluated at a single frequency in the band and kernel of equation (12), Z scaled using an effective spectral index eff,  sw 2 sw sw V ¼ d v P ðvÞT~ ðvÞþe ; res res eff eff eff k eff k k k C ðÞ¼k;k0 C fk fk0 ; fk ¼ ; ð67Þ sw 2iv x Dxk P ðvÞ¼PkðvÞ 1 e ; ð70Þ res k where C is the amplitude of the fluctuation power per solid angle (in units of Jy2 sr1)at. In terms of the logarith- and thus for our switched visibilities we compute everything res 2 res sw mic average C for the CMB, C ¼ l C =2, which rises at as before, but substituting Pk for Pk. high l with respect to the CMB. See Appendix B for an Note that if the trail field offsets Dx were constant in arc example analytic calculation using power-law source counts length rather than in right ascension (this is approximately and a Gaussian spectral index distribution. true since the declination range of the mosaic is limited), we The frequency range of the CBI is insufficient to distin- could write the convolution kernel as guish nonthermal foreground emission from the thermal sw sw x Pk ðvÞPk0 ðvÞ¼PkðvÞPk0 ðvÞ½2 2cosð2v DxÞ ; ð71Þ CMB, and thus this is treated as a constraint matrix (i.e., qres is not solved for as a parameter). Therefore, in the CBI anal- res where the leading factor of 2 dominates (you essentially get ysis we construct the covariance matrix C using the twice the CMB power). Note that a noticeable effect of the matrix elements in equation (61) built assuming unit power differencing is that the window function will have a ripple of 2 1 eff (1 Jy sr ) and the frequency dependence fk ¼ fk from ‘‘ wavelength ’’ Dx1 superimposed on its envelope. For equation (67). The value used for qres is equal to the source m res example, the 8 switching in right ascension that the CBI fluctuation power C calculated as an a priori estimate uses corresponds to Dx ¼ 2 at the celestial equator, and based on knowledge of the residual foreground source pop- thus the ripple has Dx1 ¼ 28:6inu. This corresponds to ulations (see Appendix B). 180 in l but is azimuthally averaged in the (u, v)-plane, and 6.5. Other Signal Components thus the peak-to-peak amplitude is reduced. We are not restricted to CMB, Gaussian foreground, and discrete point sources as the components of our signal or 7. SOLVING THE LIKELIHOOD EQUATION noise in the covariance matrix C in equation (30). This We have expressed the estimators as a real vector d and approach can be generalized to deal with other signals of obtained expressions for the components of its covariance interest. For example, extended sources with a known pro- matrix, and we now turn to the problem of solving for the file, such as the Sunyaev-Zeldovich effect from clusters of maximum likelihood estimators for the band powers using galaxies, could be modeled either analytically or numeri- equation (26). As shown below, we will be carrying out a cally given a power spectrum shape (e.g., Bond & Myers large number of matrix operations using C and its compo- 1996). In the case of a signal with a known distribution on N S nent matrices (C , C B, etc.), and thus these will need facto- the sky, a template can be used. Examples of this include rization. Because C is positive definite, we use optimized dust emission in the millimeter-wave bands or the anoma- Cholesky decomposition routines3 (DCHDC from LIN- lous centimeter-wave emission observed at the Owens PACK, or DPOTRF from LAPACK) to carry out the Valley Radio Observatory (OVRO; Leitch et al. 1997) and required factorizations. by COBE (Kogut et al. 1996). In particular, the latter The large number of visibilities times the number of foreground, which is posited as due to spinning dust grains mosaic pointings makes this computation extremely costly by Draine & Lazarian (1999), correlates with the 100 lm (the matrix inversions and/or solution of systems of equa- dust emission as measured by IRAS and DIRBE, and thus a tions are order N3 processes!), especially for a large number template of emission can be constructed. of bands NB. Clever perturbative or gradient search meth- 6.6. Differencing ods can help to reduce the overhead in finding the maximum in parameter space. One such method is the quadratic relax- Unfortunately, with its low intrinsic fringe rates and ation technique of Bond et al. (1998). To summarize here, if extremely short (<90) spacings, the CBI is susceptible to one Taylor expands the log likelihood around the maximum ground pickup. To remove this, we observe for each field a trailing field displaced 8m in right ascension 8m later and dif- ference the corresponding visibilities. Therefore, we must 3 Available at http://www.netlib.org. 584 MYERS ET AL. Vol. 591

src res likelihood band powers q^ ¼fq^B; B ¼ 1; ...; NBg to second amplitudes qsrc or qres and treating qsrcC or qresC as N order, additions to the noise matrix C , or by solving for the qsrc X or qres and treating them as extra band powers qB with asso- @ ln Lðq^Þ ciated entries in the Fisher matrix. In practice, for the CBI, ln Lðq^ þ qÞ¼ln Lðq^Þþ qB B @qB it is necessary to hold fixed the qres because the contribution X X from the source foreground with a white-noise power spec- 1 @2 ln Lðq^Þ þ qB qB0 ; ð72Þ trum and appropriate frequency spectrum is largely indis- q q 0 2 B0 B00 @ B@ B tinguishable from an overall offset of the CMB power spectrum. In addition, the uncertainties on the individual then we can move toward the maximum using the quadratic known source contributions to an aggregate C src will be approximation substantial, and thus solving for a single amplitude qsrc will  src X 1 not be as useful as it might appear. In this case, the C acts @2 ln LðqÞ @ ln LðqÞ qB ¼ : ð73Þ as a constraint matrix and the qsrc can be set to an arbitrarily 0 0 B0 @qB@qB @qB high value, which will effectively project out the modes cor- responding to the known sources by down-weighting the The first derivative (gradient) is given by relevant combinations of the estimators in the likelihood  (Bond et al. 1998; Bond & Crittenden 2001). @ ln LðqÞ 1 @C ¼ Tr ðÞddt C C 1 C 1 ; ð74Þ @qB0 2 @qB0 and the second derivative (curvature matrix) is given by 7.1. Combination of Independent Data Sets Consider observations taken of separate sets of single @2 ln LðqÞ FBB0 ¼ fields or mosaics f where there is effectively no correlation @qB@qB0 between fields from separate f and the fields within a given @C @C set f are related by the mosaic covariance given in the pre- ¼ Tr ðÞddt C C 1 C1 C1 @q @q 0 vious sections. In this case, we can assemble a giant data B B 2 vector 1 1 @ C 1 C C t D ¼ ðÞd ... d ð78Þ 2 @qB@qB0 1 M 1 1 @C 1 @C from the M individual field vectors d (e.g., eq. [25]), with þ Tr C C : ð75Þ f 2 @q @q 0 the block diagonal covariance matrix B B 0 1 Note that the partial derivatives of the covariance matrix C 1 S B C are just the band signal covariance matrices @C=@qB ¼ C B C ¼ @ ... A ; ð79Þ defined above. C The final approximation is to replace the curvature M matrix with its expectation value, which is the Fisher which in turn can be written as sums of block-diagonal noise N S information matrix and signal covariance matrices C f and CBf , etc., with blocks given by C N and C S , etc. Because they are block 1 1 S 1 S f Bf FBB0 ¼ hiFBB0 ¼ 2 Tr C CBC CB0 : ð76Þ diagonal, we can write the log likelihood in equation (26) as the sum over data sets This yields X XÂÃ ÂÃ ln L ¼lnð2Þ N 1 lnðÞ det C 1 DtC1D 1 1 t 1 S 1 f 2 2 qB ¼ 2 F BB0 Tr ðdd CÞ C C B0 C ð77Þ f B0 X X X 1 1 t 1 ¼lnð2Þ Nf 2 ln det Cf 2 df Cf df : for the iterative correction to the band powers. This f f f amounts to making a quadratic approximation to the shape ð80Þ of the likelihood function around the maximum and itera- tively approaching it. At each step, the total covariance We proceed as before, with the same band powers fqBg matrix C must be updated using the new band powers and with the block-diagonal band covariance matrix q þ q. A convergence criterion based on the magnitude of 0 1 S the corrections qB will allow approach to the true fq^Bg to C B1 S @C B C be controlled. C B ¼ ¼ @ ... A ; ð81Þ 1 @qB The inverse of the Fisher matrix ½F BB0 evaluated at S maximum likelihood is the covariance matrix of the param- CBM 1 eters (Bond et al. 1998). The diagonals ½F BB give an esti- and thus all matrices are block-diagonal and composed of mated Gaussian error bar for the derived band powers the individual single field or mosaic matrices. Therefore, fq^Bg, although the full Fisher matrix must be used to take X 1 1 S 1 S the (usually significant) band-band correlations into FBB0 ¼ 2 Tr C f C Bf Cf C B0f ; f account. As the width of the l bins for the bands B is XÂÃX ÂÃ reduced, anticorrelation between adjacent bands increases 1 1 t 1 S 1 qB ¼ 2 F BB0 Tr df df C f C f C B0f C f ; as a result of the intrinsic l-space resolution of the data. B0 f The presence of known or residual point-source fore- grounds in equation (30) is dealt with either by fixing the ð82Þ No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 585 which is used to iteratively approach the fq^ g using the where ðflatÞ are band powers calculated using flat filters B CBf Bf Bond et al. (1998) scheme as in the single data set case. (Wl ¼ 1). We find that a fine band width DlBf 20 is suffi- cient to adequately sample the window functions and ensure 7.2. The Band Power Window Function normality and orthogonality to within 0.5% with respect to integration over the bands (e.g., eq. [87]). Example window To compare the band powers obtained from the data to functions calculated in this manner for mock deep fields and model power spectra, we need to define a set of filter func- mosaics are shown in the bottom panels of Figures 2 and 3, tions that project models Cl into a set of band powers CB, respectively. X W B C ¼ l C ; ð83Þ 7.3. Component Band Powers B l l l A further complication at the parameter end of the proc- as in Bond et al. (1998). In the ensemble limit, the expecta- ess is that the likelihood of the band powers cannot be tion value hðxxt C N Þi will approach the underlying signal assumed to be a Gaussian. This is especially so in cases in covariance matrix C S. We can then use the expression for which the error in the band powers is sample or cosmic var- the minimum variance estimate of the band power to derive iance limited. Assuming the band powers to be Gaussian B distributed can lead to the well-known problem of cosmic the filter functions Wl (Knox 1999). Since XÂÃ ÂÃ bias where the likelihood of low-power models can be over- 1 1 1 S 1 S estimated and conversely that of high-power models can be hqBi¼2 F BB0 Tr C C B0 C C ð84Þ B0 underestimated. Bond et al. (2000) have shown how one can avoid this problem while still retaining Gaussianity in the 2 and analysis by treating certain functions of the band powers as X X S Gaussian distributed. Very good fits to the non-Gaussian S S @C C C ¼ Cl ; ð85Þ distribution of the band powers can be obtained by use of B @C B l l the offset lognormal and equal variance approximations to the normalized filter functions can be computed using the the likelihood. band-averaged Fisher matrix (e.g., eq. [76]) Both approximations use offsets xB in the band powers that describe the contributions to the error in the band B XÂÃ S powers due to components other than the CMB. For the Wl 1 1 1 S 1 @C ¼ F Tr C C 0 C ð86Þ l 2 BB0 B @C range of scales probed by instruments such as CBI these B0 l components will include the foregrounds such as point sour- shape S ces in addition to the usual noise ‘‘ on the sky ’’ offset with respect to the Cl ¼ 1 that is built into the C . P S xN x , where x is the offset due to the noise contri- Because of the Bl used in the construction of the C B in B l Bl l l equation (43), bution to the error such that the quantity Zl ¼ lnðCl þ xlÞ has a normal distribution (Bond et al. 2000). For accurate X B Wl parameter fits we therefore require estimates of band 0 ¼ 0 ; ð87Þ B l l BB powers for all the components making up the total cova- l riance C. An approximation for these can be obtained by B and thus Wl =l is orthonormal with respect to the bands modifying the minimum variance estimator for the band defined by Bl. powers qB at the maximum likelihood l X ÂÃ Calculating the filter functions at each becomes some- X 1 1 1 S 1 X what prohibitive in both processor time (the problem scales qB ¼ 2 ½FBB0 Tr C C B0 C C ; ð90Þ 3 2 B0 as an extra N þ 2lmaxN operations) and storage since the calculation of equation (86) can only proceed once we have where we have substituted C X in equation (77) for the relaxed to the maximum likelihood solution. For this rea- observed measure for the signal covariance ðddt C N Þ.We X N src son, in practice we sample the full filter functions in bands then set C to the noise C , foreground source qsrcC ,or res at intervals Bf where the Bf are narrower than the bands B, Gaussian residual foreground qresC covariance compo- with nents as desired (or use the maximum likelihood values q^src and q^ if these are included as parameters in the solution W B XÂÃ ÂÃ res Bf 1 1 1 S 1 S rather than being fixed). Examples of these are shown in ¼ F Tr C C 0 C C : ð88Þ l BB0 B Bf Bf 2 B0 Papers II and III for the deep field data and mosaic data, respectively. The offset to the lognormal is then obtained by In principle, this is equivalent to assuming a flat window X N src res summing the qB over the components, xB qB þ qB þ qB over the ‘‘ fine ’’ band Bf, and as long as the curvature of the (e.g., Bond & Crittenden 2001). This formalism is used in exact window function is small over the intervals Bf, this Paper V to approximate the shapes of the likelihood should provide an adequate sampling of the continuous functions in order to derive limits on the cosmological limit. parameters. To obtain model band powers, we can then either interpo- late the samples W B to obtain an approximate form for W B Bf l 8. IMAGING FROM THE GRIDDED ESTIMATORS for use in equation (83) or preaverage the model spectrum over the fine bands Bf as Although not the primary goal of this method, an image ! can be constructed by Fourier transforming back to the sky X W B Bf ðflatÞ plane using equation (1). If the estimators Di are constructed CB ¼ C ; ð89Þ l Bf on a regular lattice in ui with spacing du and (u, v) extent B Bf f Ldu, then the resulting image will have an extent on the sky 586 MYERS ET AL. Vol. 591

Fig. 2.—Results of the gridded method plus quadratic relaxation for 387 mock CBI 08h deep field data sets (top), with each mock observation drawn from an independent realization of the sky given the model power spectrum (dashed curve) and the instrumental noise with the appropriate rms. The points (black circles) are placed at the band centers, at the mean of the reconstructed band powers with the red error bars given by the scatter among the realizations. The blue error bars to the right of the points show the average of the inverse Fisher matrix diagonals. The histograms show the width of each band and the level B expected by integrating the model Cl over the window functions Wl , which are shown in the bottom panel. The mean of the realizations converges to the expected value within the Poisson uncertainty, taking into account the correlations between adjacent bands.

1 given by the inverse of the spacing du and a resolution The situation for the mosaics is somewhat more compli- 1 given by du =L. In the continuum limit (see Appendix A), cated, as the mosaic offsets must be taken into account in we can define an estimator T^ðxÞ for the temperature field constructing equivalent visibilities for point sources TðxÞ, XÈÉ Z PSF PSF PSF Di ðx^Þ¼ QikVk ðx^ÞþQikVk ðx^Þ ; T^ ðxÞ¼ d2u DðuÞe2iu x x ; ð91Þ k PSF 2iuk x ðx^xkÞ Vk ðx^Þ¼AkðÞx^ xk e ð93Þ where DðuÞ is the continuous functional form (e.g., eq. [A2]) for the estimators, with Di ¼ DðuiÞ. In practice, the lattice of obtained by evaluating equation (11) with no noise and 2 estimators Di is embedded in a wider grid padded with zero IðxÞ¼ ðx x^Þ. In this case one would evaluate the PSF at in the unsampled cells, and a fast Fourier transform is various positions x^ in the map. carried out. Because our estimators use the kernel Q as given in equa- ð1Þ ~ For our standard gridding normalization zi ¼ zi given tion (22), which includes the beam transform A, we are effec- in equation (A21), the units of D will be flux density units tively multiplying the image on the sky by the primary beam (Jy), and thus its inverse Fourier transform will produce a squared: once in the kernel, and once due to the instrument map in units of Jy beam1, where the beam area is given by itself (e.g., eq. [12]). Images made directly from the D will the PSF of the image. For a single field, the PSF is just the therefore be strongly attenuated in the (noisy) outskirts. PSF image generated using equation (91) using estimators Di As mentioned in Appendix A, the optimal weighting for computed by introducing unit ‘‘ visibilities ’’ into equation the imaging of the CMB component is to use the Planck fac- (21), tor in equation (6) to correct for the thermal frequency spec- X È É trum (e.g., eq. [A20]), while our standard intensity PSF Di ¼ Qik þ Qik : ð92Þ weighting given in equation (A19) is optimized for a flat k nonthermal power-law spectrum with spectral index ¼ 0. No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 587

Fig. 3.—Top: Results of the gridded method plus quadratic relaxation for 117 mock CBI (7 6 field) mosaic data sets with a CDM-based power spectrum (dashed curve). As in Fig. 2, the points (black circles) are centered in each bin in l with red error bars giving rms scatter of about the mean for the band powers from the processed realizations, with the blue error bar from average inverse Fisher diagonals to the right of the points. The window functions are plotted in the bottom panel.

We can also filter the gridded estimators in such a way as the possibility of detection of the Sunyaev-Zeldovich effect to enhance or down-weight certain signals or noise. We in the data at high l. can do this with optimal or Wiener filtering (e.g., Bond & Crittenden 2001), D ¼ D ; ð94Þ 9. IMPLEMENTING THE METHOD The algorithm described above was coded as a scalar where the choice of the filter depends on the application. X FORTRAN (f77 compatible) program designated CBI- For example, the covariances C calculated from equa- GRIDR, with a parallelized FORTRAN 90 version using tion (32) can be used to construct optimal filters for each OpenMP4 directives also available for use on multiproces- component contributing to the observations. For the signal X sor machines. In addition, the Bond et al. (1998) likelihood component described by the covariances C we can relaxation was coded in a second parallelized FORTRAN construct a Wiener filter to be applied to the gridded (u, v) 90 program called MLIKELY using parallel versions of estimators LAPACK matrix algebra routines (e.g., x 7). Together, DX ¼ CX C1D : ð95Þ these two programs make up the CBI analysis pipeline. This pipeline has undergone numerous tests and development The amplitudes for the signal models such as the band since its inception in 2001 April and has been used to pro- powers qB or the source amplitudes qsrc can be set to their duce the power spectra and to provide the band powers as maximum likelihood values or to fiducial model amplitudes. input to the cosmological parameter analysis given in the The Wiener-filtered image is then recovered by Fourier companion papers. We now give a brief description of our transforming. implementation. Examples of images created using this method are described in the next section and shown in Figure 6 below. Wiener-filtered images are also used in Paper VI to explore 4 Available at http://www.openmp.org. 588 MYERS ET AL. Vol. 591

In order to carry out the numerical integrations, a fine- The storage for R, MN , and M N dominates the memory grain rectangular lattice in (u, v)-space was used. The fine- requirements. For example, the CBI mosaics use around grain grid size Dufine (in units of wavelength) was chosen to nest ¼ 2500 estimators, and thus storage for 53 53 adequately sample the phase turns in the (u, v)-plane as a 2500 7 106 double-precision complex numbers is 2 6 result of the mosaic size and differencing; for a standard needed. A single packed nest 6 10 array is needed to 7 6 CBI mosaic with 200 spacing, the maximum field sepa- hold MN and M N . The CX matrices are calculated in place ration along the grid direction is xmax ¼ 2 , which gives and written out row by row, and thus they need not be oscillations in the (u, v)-plane with wavelength of stored. There are no instances where matrices of dimension 1 2 xmax ¼ 28:65, giving Dufine 14:3 for two samples per cycle. Nvis are stored; the storage for a matrix of this size would be 5 To evaluate the projection operators RiðvÞ (e.g., eq. [23]), we prohibitive as our largest CBI mosaics have Nvis > 2 10 . store a small fine-grain lattice around each ui. The maxi- When all the visibilities have been processed, the esti- mum radius in (u, v)-space needed for the support of R is mators are normalized by zi, split into real and imaginary ru ¼ 2D=min. For CBI D ¼ 90 cm, and min ¼ 0:844 cm parts, and written out to disk. The covariance matrices N S src at 35.5 GHz, we get ru ¼ 213:1. A grid of size 53 53 cells Cij , CBij, and any Cij are constructed by looping over with Dufine ¼ 8:526 will fit both the sampling and radius pairs of rows corresponding to the real and imaginary requirements. parts of each estimator, e.g., rows i and i þ Nest for esti- The estimators are evaluated on a coarse-grain lattice of mator i. For each j i, the stored Ri and Rj are cross- ui, with a spacing of Ducoarse. The fine-grain lattices on which correlated along with the shape function CðjjÞv to form we accumulate the RiðvÞ will be cross-correlated to form the the band power covariance elements MBij and MBij of S covariance elements MB and MB (e.g., eq. [37]); it is desir- equation (37), combined to make CBij using equation able to have the coarse-grid size locked to integer multiples (32), and stored. The relevant columns of these rows of N N N of the fine-grid cells. This coarse grid does not have to sam- Cij are formed from the stored Mij and Mij . At this ple the highest mosaic frequencies, but only the effective point, for each C src desired (there may be more than one; src width of R. Tests were carried out using the mock data (see in the CBI analysis we use three), the relevant Dc are src N below) using different fine-grain cell sizes and coarse-grain combined using equation (55) to form Mij and Mij , src spacings, looking for changes in the derived band powers as which in turn are used to make Cij . After all columns these were varied. We find that for single CBI fields, for these rows of the covariance matrices are computed, Ducoarse ¼ 3Dufine is adequate. For CBI mosaics, a hybrid they are written to disk, and this process is repeated for lattice with Ducoarse ¼ Dufine in the inner part (l < 800) and the pair of rows corresponding to the next estimator i. Ducoarse ¼ 2Dufine in the outer part was found to work well. When all rows are complete, the output file is complete. S As stated in x 2, the choice of sign of the exponential of Note that different binnings of the C B can be run without the Fourier transform in equation (1) is a convention. This regridding using the original R, saving significant time. choice varies throughout the literature on the subject, but in If a residual foreground covariance matrix C res is desired, practice it depends on the way the baseline vectors are the procedure outlined above is repeated in its entirety using defined in the data and how the correlation products are the description in x 6.4. The spectrum and shape appropri- made (e.g., which antenna gets the quadrature phase shift). ate for the source population or foreground emission are We note that in coding our algorithm to conform to the used during the gridding and covariance matrix construc- imaging standards of the AIPS5 and DIFMAP6 (Shepherd tion. Other than these factors, the same gridding as in the 1997) packages using the CBI data, we had to use the oppo- CMB and noise estimators must be used. site sign convention from the one presented in equation (1). The output files from CBIGRIDR are then used as input To process a data set, a spectral weighting f() and shape to MLIKELY. These can be for single fields or mosaics, or shape function Cl are chosen. The visibilities Vk are looped for combinations of independent fields or mosaics (x 7.1). src over, and any source subtraction (x 6.3) is applied. For each At this point, the prefactors qsrc and qres for any C and res estimator i that Vk contributes to either directly or as a con- C covariance matrices are chosen and fixed. Relaxation jugate, its contribution to the fine-grain lattice q for RiðvqÞ is to the likelihood maximum is carried out as described in x 7, accumulated, e.g., and the resulting band powers fqBg and inverse Fisher X 1 ÈÉ matrix elements ½F 0 are written out. If desired, the band BB RiðvqÞ¼ QikPkðvqÞþQikPkðvqÞ ; vq ¼ ui þ Dufine^vq ; power window functions W B ( 7.2) can be computed if Bf x k S CBIGRIDR was run to produce narrow-bin C Bf . The com- N src res ð96Þ ponent band powers qB , qB , and qB (x 7.3) can also be computed at this time. where ^vq is a 53 53 unit (fine-grain) lattice. This means Finally, filtered images using the formalism of x 8 can be that for nest estimators, the storage required for R is only computed from the estimators, the C (at maximum likeli- 2809nest double-precision complex numbers. If the data hood), and the component covariance matrices. Results sw were differenced (as for CBI data), then Pk ðvÞ from equa- from this are shown below and in Paper VI. tion (70) is used. The contributions to the noise covariance The timing for CBIGRIDR depends on the degree of par- N N src elements M and M (eq. [35]) and the Dc (eq. [54]) are allelization, processor speed on a given machine, number of also accumulated at this time. Finally, this visibility’s contri- visibilities gridded, number of foreground sources, and butions are added to estimator Di and normalization zi. number of bins B for the band powers. As an example, the processing of the 14h mosaic field of Paper III (the largest of the data sets) involved gridding 228,819 visibilities from 65 separate nights of data in 41 fields to 2352 complex estima- 5 See http://www.cv.nrao.edu/aips. tors. A total of 916 sources were gridded into three source 6 See ftp://ftp.astro.caltech.edu/pub/difmap. covariance matrices. A total of seven different binnings for No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 589

S CB were run at this time from the same gridding. The execu- use finer binned bands since the correlations are taken into tion time using the parallel version of CBIGRIDR was account in the analyses. h m B 2 40 running on 22 processors on a 32 processor Alpha The band window functions Wl are shown in the bottom GS320 workstation at the Canadian Institute for Theoreti- panel of Figure 2 and were computed using narrow binnings cal Astrophysics (CITA). It then took 3h22m on the same W B (e.g., eq. [88]) with Dl 20. The small-scale structures Bf ¼ computer for MLIKELY to process 4704 double-precision seen in the window functions, particularly visible around S src real estimators in 16 C B bands, with three C matrices, one the peaks, are due to the differencing that introduces oscilla- Cres, and one CN . This included the time needed to calculate tions (see x 6.6). As shown in equation (87), a window func- X B the component band powers C B , but not the window func- tion Wl is normalized to sum to unity within the given band tions. The speed of this fast gridded method has allowed us B and to sum to zero in the other bands, and thus there must to carry out numerous tests on both real and simulated data be compensatory positive and negative ‘‘ sidelobes ’’ of the sets, which would not have been possible carrying out maxi- window function outside the band. mum likelihood (e.g., using even the optimized MLIKELY) Figure 3 shows the power spectrum derived for a simu- on the 200,000 plus visibilities. lated mosaic of 7 6 fields separated by 200 using the actual CBI 20h mosaic fields from Paper III as a template. This mosaic field was chosen as it had incomplete mosaic cover- age and thus would be the most difficult test for the method. 9.1. Method Performance Tests Using Mock CBI Data S The binning for C B shown used Dl ¼ 200, which gave adja- The performance of the method was assessed by applying cent band anticorrelations of 13% to 18% in the FBB0 . it to mock CBI data sets. Simulated CBI data sets were Again the mean of the 117 realizations converges to the obtained by replacing the actual visibilities from the data value expected within the error bars, showing that there is files containing real CBI observations of the various fields no bias introduced by the method, even in the presence of used in Papers II and III with the response expected for a substantial holes in the mosaic (see Paper III for the mosaic realization of the CMB sky drawn from a representative weight map). Furthermore, the rms scatter in the realiza- power spectrum, plus uncorrelated Gaussian instrumental tions converges to the mean of the inverse Fisher error bars, noise with the same variance as given by the scatter in the as in the single-field case. As in the previous figures, the actual CBI visibilities. The differencing of the lead and trail band power window functions are shown in the lower fields used in CBI observations was included (e.g., x 6.6). panels. This mock data set had the same (u, v) distribution as the In Figure 4 are shown three randomly chosen realizations real data and gives an accurate demonstration of expected from the ensemble, plotted along with the input power spec- sensitivity levels and the effect of . The trum. This shows the level of field-to-field variations that we power spectrum chosen for these simulations was for a might expect to see in CBI data. There are noticeable devia- model that fitted the COBE and BOOMERANG data tions from the expected band powers in individual realiza- (Netterfield et al. 2002). tions, particularly at low l where cosmic variance and the Figure 2 shows the power spectrum estimation derived highly correlated bins conspire to increase the scatter. These following the procedure detailed above. The mock data sets differences are within the expected scatter when bin-bin cor- were drawn as realizations for the 08h CBI deep field from relations and limited sample size are taken into account, but S Paper II. The binning of the signal covariance matrix C B care must be exercised in interpreting single-field power was chosen to be uniform in l with bin width Dl ¼ 500. spectra. In particular, the acoustic peak structures are Because a single realization of the sky drawn from the model obscured by the sample variations. However, the average power spectrum will have individual mode powers that devi- band powers for the three runs (shown in Fig. 4 as open ate from the mean given by the power spectrum as a result black circles) are better representations of the underlying of this intrinsic so-called cosmic variance plus the effect of power spectrum. Although this is not a proper ‘‘ joint ’’ the thermal instrumental noise, we analyze 387 realizations, maximum likelihood solution (e.g., x 7.1) as is done for the each taken from a different realization of the sky and a dif- real CBI mosaic fields, the improvement seen using the ferent set of instrumental noise deviates. The mean qB for three-field average leads us to expect that the combination each band B converge to hCBi, which is obtained by inte- of even three mosaic fields damps the single-field variations B grating the model Cl over the window functions Wl (e.g., sufficiently to begin to see the oscillatory features in the eq. [89]), within the sample uncertainty for the realizations. CMB power spectrum. While we do not show the equivalent Furthermore, the standard deviation of the qB from the plots of the deep fields from Figure 2, the same behavior is mean for each band agrees with the value obtained from the seen (with even larger field-to-field fluctuations in the rela- diagonals of the inverse of the Fisher matrix. tively unconstrained first bin, although still consistent with The choice of the l bin size is driven by the trade-off the error bars). between the desired narrow bands for localizing features in The effect of adding point sources to the mock fields and the power spectrum and the correlations between bins intro- then attempting power spectrum extraction is shown in Fig- duced by the transform of the primary beam. There is an ure 5. A set of 200 realizations were made in the same man- 1 anticorrelation between adjacent bands seen in ½F BB0 at ner as in the runs in Figure 2, but the list of point-source the level of 13% to 23% for Dl ¼ 500 with a single field. positions, flux densities and uncertainties, and spectral indi- We have found that correlations up to about 25% give ces from lower frequency used in the analysis in Paper II plots of the qB that are more visually appealing than those (the ‘‘ NVSS ’’ sources) was used to add mock sources to the made with narrower band and higher correlation levels as a data. The flux densities of the sources actually added to the result of the increasing scatter in the band powers about the data were perturbed using the stated uncertainties as 1 mean values. Bins of this size do not achieve the best possi- standard deviations. The errors used were 33% of the flux ble l resolution, and thus our cosmological parameter runs density except for a few of the brighter sources that were put 590 MYERS ET AL. Vol. 591

Fig. 4.—Three randomly selected realizations from the mosaic field simulations shown in Fig. 3 plotted against the model power spectrum (solid black curve). The three lines for the band powers reconstructed from the three realizations correspond to the band powers (central lines) and the 1 excursions using the inverse Fisher error bars. The scatter in band powers between realizations is within the expected range. Also shown is the unweighted scalar average of the band powers for the three realizations, which is an approximation to a true joint likelihood solution. The average is a better fit to the model, as is expected. in with 100% uncertainties. We then used the methodology source flux densities and putting the standard deviations src described in x 6.3 to compute the constraint matrices. The into C with qsrc ¼ 1 (the red triangles in Fig. 5). The filter- first method of correction used was to subtract the (unper- ing down-weights the high spatial frequency noise seen in turbed) flux densities from the visibilities and build the C src the unfiltered image and effectively separates the CMB and from Dsrc built using the uncertainties (shown as the red tri- source components as shown by comparing the bottom pan- angles). In addition, we also did no subtraction but built els of Figure 6 to the total signal in the top right-hand panel. Csrc from Dsrc using the full (unperturbed) flux densities The signal in this realization is dominated by the residuals (shown as the blue squares). This is equivalent to assuming from two bright point sources that had 100% uncertainties a 100% error on the source flux densities and thus canceling put in for their flux densities and thus escaped subtraction. the average source power in those modes. In both cases the The effectiveness of C src in picking out the sources in the factor qsrc ¼ 1 was used. The simulations show that both image plane underlines its utility as a constraint matrix in methods are effective, with no discernible bias in the the power spectrum estimation. reconstructed CMB band powers. Finally, the production of images using the gridded esti- mators described in x 8 is demonstrated in Figure 6. The ser- 10. CONCLUSIONS ies of plots show the effect of Wiener filtering using the noise We have outlined a maximum likelihood approach to and various signal components on an image derived from determining the power spectrum of fluctuations from inter- one of the mock 08h CBI deep field realizations with sources ferometric CMB data. This fast gridded method is able to from the ensemble shown in Figure 5. The Planck factor handle the large amounts of data produced in large mosaics weighting of equation (A20) was used during gridding to such as those observed by the CBI. Software encoding this optimize for the thermal CMB spectrum, although in prac- algorithm was written and tested using mock CBI data tice this makes little difference as a result of the restricted drawn from a realistic power spectrum. The results of the frequency range of the CBI. The estimators for this realiza- code were shown to converge as expected to the input power tion were computed by subtracting the mean values of the spectrum with no discernible bias. For small data sets, this No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 591

Fig. 5.—Results using mock deep field data sets including foreground point sources based on the actual list used in the CBI data (top) along with the window functions (bottom). The input power spectrum and expected band powers are as in Fig. 2. The green stars show the average for 200 realizations where no source subtraction or projection was done, with the powers divided by a factor of 2 to fit on the plot. The expected increase with l2 is seen, along with a falloff in the last bin due to the source frequency spectrum. The points with error bars at the center of each bin (red triangles) were computed from 200 realizations processed with subtraction of the mean flux density from the visibilities and construction of C src using the uncertainties, while the points with error bars to the right of these (blue squares) are from 200 realizations where no source subtraction was done, but we built C src using the full (unperturbed) flux densities that projects out the sources with only a slight increase in noise. Despite the large power from sources at high l, our method successfully removes the foreground power from the spectrum with no sign of bias. code was also tested against independently written software We close by noting that our formalism can be extended to that worked directly on the visibilities. In addition, the pipe- deal with polarization data. In the case of CMB polariza- line was run with gridding turned off as described in x 6, tion, there are as many as six different signal covariance again for small test data sets. No bias or significant loss in matrices of interest in each band, with estimators (or visibil- sensitivity was seen in these comparisons. ities) for parallel-hand and cross-hand polarization prod- This software pipeline was used to analyze the actual CBI ucts, and thus development of a fast method such as this is data, producing the power spectra presented for the deep critical. In 2001 September polarization-capable versions of fields and mosaics in Papers II and III, respectively. The CBIGRIDR and MLIKELY were written and tested. We output of the pipeline also was used as the input for the cos- describe the method, the polarization pipeline, and results mological parameter analysis in Paper V and the investiga- in an upcoming paper (S. T. Myers et al. 2003, in tion of the Sunyaev-Zeldovich effect in Paper VI. preparation). This method is of interest for carrying out power spec- trum estimation for interferometer experiments that pro- S. T. M. was supported during the early years of the CBI duce a large number of visibilities but with a significantly by an Alfred P. Sloan Fellowship from 1996 to 1999 while at smaller number of independent samples of the Fourier plane the University of Pennsylvania. Genesis of this method by (such as close-packed arrays like VSA or DASI). The CBI S. T. M. greatly benefited by a stay in 2000 July at the ITP in pipeline analysis is carried out in two parts, the gridding and Santa Barbara, supported in part by the National Science covariance matrix construction from input uv-FITS files in Foundation under grant PHY 99-07949. The National CBIGRIDR and the maximum likelihood estimation of Radio Astronomy Observatory is a facility of the National band powers using quadratic relaxation in MLIKELY. The Science Foundation operated under cooperative agreement software for the pipeline is available by contacting the by Associated Universities, Inc. The CBI was funded authors. under NSF grants AST 94-13934, AST 98-02989, and 592 MYERS ET AL. Vol. 591

Fig. 6.—Images reconstructed from the gridded estimators using the formalism of x 8. Data are for one of theP mock 08h deep field realizations with sources X src S used in Fig. 5. Top left: ImageP computed without any filtering. Top right: Image derived by setting C ¼ C þ B qBC B the sum of the signal terms. Bottom X S X src left: Image using C ¼ B qBC B for the CMB only. Bottom right: Image for C ¼ C to pick out the point sources. The filter clearly dampens the noise and separates the CMB and source components. The residuals from several bright sources dominate the signal in all but the CMB-filtered image (the brightness scale in that image is enhanced). The white dashed circle shows the 45<2 FWHM of the CBI at 31 GHz. The attenuation of the signal brightness due to the square of the primary beam is clearly seen.

AST 00-98734, with contributions by Maxine and Ronald addition, this project has benefited greatly from the comput- Linde and Cecil and Sally Drinkward and the strong sup- ing facilities available at CITA and from discussions with port of the California Institute of Technology, without other members of the group at CITA not represented as which this project would not have been possible. In authors on this paper.

APPENDIX A

FORM OF THE LINEAR ESTIMATOR Suppose we were to construct a simple linear ‘‘ dirty ’’ mosaic on the sky obtained by a linear combination of the dirty (not deconvolved) images of the individual fields (e.g., Cornwell et al. 1993). In the (u, v)-plane, this reduces to summing (integrating) up the visibilities from each mosaic ‘‘ tile ’’ with some weighting function, e.g., X Di ¼ QikVk ; ðA1Þ k where for the time being we ignore the contribution from the complex conjugates of the visibilities (see below). For illustrative purposes, let us consider only a single frequency channel and write the estimator as a function DðuÞ, where Di ¼ DðuiÞ, which No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 593 in the absence of instrumental noise is given by Z Z DðuÞ¼ d2v d2x Fðx; vÞQðu; x; vÞhVðx; vÞi ; ðA2Þ with kernel Q, sky and aperture plane sampling given by F, and where Z Vðx; vÞ¼ d2v0 ~IðÞv0 A~ðÞv v0 e2iv0 x x ðA3Þ is the visibility at pointing position x and (u, v) locus v from equation (11). In practice, the sampling function is just a series of delta functions X 2 2 Fðx; vÞ¼ !k ðx xkÞ ðv ukÞðA4Þ k over the measured visibilities k ¼ 1; ...; Nvis each with weight !k. As an Ansatz, we let the mosaicking kernel have the form Qðu; x; vÞ¼Qðv uÞe2iu x x ; ðA5Þ where Q is the interpolating kernel. Furthermore, let us assume that the (u, v)-plane coverage is the same for all mosaic pointings, and thus Fðx; vÞ is separable, Fðx; vÞFðvÞGðxÞ ; ðA6Þ where FðvÞ and GðxÞ are the sampling and weighting in the two domains. Combining these and rearranging terms, we get Z Z Z DðuÞ¼ d2v0 ~IðÞv0 d2v FðvÞQðv uÞA~ðÞv v0 d2x GðxÞe2iðuv0Þ x x ðA7Þ Z Z ¼ d2v0 ~IðÞv0 G~ðÞu v0 d2v FðvÞQðv uÞA~ðÞv v0 ; ðA8Þ where in equation (A8) we used the fact that the final right-hand side integral in equation (A7) is the Fourier transform G~ of the mosaic function G. For an infinite continuous mosaic, G~ðv0 uÞ¼2ðv0 uÞ and thus Z DðuÞ¼~IðuÞ d2v FðvÞQðv uÞA~ðv uÞ : ðA9Þ

If we wish to recover DðuÞ¼~IðuÞ in this limit, then 1 Qðv uÞ¼ A~ðv uÞðA10Þ zðuÞ with normalization Z zðuÞ¼ d2v FðvÞA~ 2ðv uÞðA11Þ will fulfill our requirements. We have chosen A~ðv uÞ as the (u, v) kernel as it reproduces the least-squares estimate of the sky brightness in the linear mosaic (Cornwell et al. 1993). Then, equation (A8) becomes Z Z 1 DðuÞ¼ d2v0 ~IðÞv0 G~ðÞu v0 d2v FðvÞA~ðv uÞA~ðÞv v0 ; ðA12Þ zðuÞ which has a width controlled by the narrower of the width of A~ 2 or the width of G~. Thus, by widening the mosaic GðxÞ to a larger area than the beam A, we will fill in the desired information inside the A~ smeared patches in the (u, v)-plane. Thus, a properly sampled M2 mosaic will fill in an M2 subgrid within each (u, v) cell you would have normally had for a single pointing, and thus an M2 mosaic consisting of N2 ‘‘ images ’’ each is equivalent to a (u, v) supergrid of size ðM NÞ2 (e.g., Ekers & Rots 1979). Note that for a noncontinuous mosaic, there will be ‘‘ aliases ’’ in the (u, v)-plane separated by the inverse of the mosaic spac- ing in the sky (Cornwell 1988). Ideally, we would like the separation between aliased copies to be larger than the extent of the beam transform, which is satisfied for Dx =2D, which for D ¼ 90 cm corresponds to 21<6 at 26.5 GHz and only 16<1 at 35.5 GHz, the centers of the extremal CBI bands. The spacing used in the CBI mosaics is a compromise between the aliasing limits over the bands and the desire to have a fewer number of pointings on a convenient grid. We chose to observe with pointing cen- ters separated by 200, which is suboptimal above 27.5 GHz. However, the effect of aliasing is small, with the overlap point a1 D1 occurring at the 0.61% point of A~ at 31 GHz and the 6.5% point for the highest frequency channel at 35.5 GHz. 594 MYERS ET AL. Vol. 591

We obtain the gridding kernel Qik of equation (A1) corresponding to equation (A10) by using the discrete sampling in equation (A4),

!k ~ 2iui x xk Qik ¼ Ak ðuk uiÞe ; ðA13Þ zi with visibility weights !k and normalization factor zi. The discrete form of the normalization derived in equation (A11) is X ~ 2 zi ¼ !kAkðuk uiÞ : ðA14Þ k Then, 1 X D ¼ ! A~ðu u ÞV e2iui x xk ðA15Þ i z k k k i k i k is the weighted sum of visibilities used for the estimators. Note that because VðuÞ¼VðuÞ, there are also visibilities Vk0 for ~ which uk0 lies within the support range around ui, i.e., A 0 ðuk0 uiÞ > 0. Thus, we should add in the complex conjugates k Vk 1 X ÂÃ D ¼ ! A~ðu u ÞV þ A~ðu u ÞV e2iui x xk : ðA16Þ i z k k k i k k k i k i k

To do this, we construct another kernel Q ik,

!k x ~ 2iui xk Qik ¼ Ak ðuk uiÞe ; ðA17Þ zi which will gather the appropriate Vk , giving X È É Di ¼ QikVk þ QikVk ðA18Þ k for the final form of our linear estimator. 2 For estimated visibility variances k, the optimal weighting factor (in the least-squares estimation sense) is given by 1 !k ¼ 2 ðA19Þ k but may also include factors based on position in the mosaic or frequency channel. For example, up until now we have neglected the frequency dependence of the observed visibilities. If we are reconstructing an intensity field with a uniform flux density spectrum, then no changes need be made. If there is a frequency dependence, such as that for a power-law foreground (e.g., eq. [7]) or the thermal spectrum of the CMB (e.g., eq. [6]), then the visibilities should be scaled and weighted by the ~ ~ appropriate factor fk when gridded in order to properly estimate I0ðukÞ or TðÞuk , respectively. For example, for the CMB using equation (6) for the spectrum, we find

1 T fT ðkÞ!k ~ Qik ¼ Ak ðuk uiÞ ; zi 2 fT ðkÞ !k ¼ 2 : ðA20Þ k In practice for the CBI, the frequency range of the data is not great enough for the spectral weighting factor to matter, and we therefore use the default weighting given in equation (A19). This will therefore be slightly nonoptimal in the signal-to-noise sense (it will not be the minimum-variance estimator), but it will not introduce a bias in the band powers. The choice of the normalization zi is somewhat arbitrary, as it only determines the units of the Di and not the correlation properties. However, this can be important if we wish to use the estimators to make images using the formalism of x 8. For ~ 2 instance, the normalization given in equation (A14) has the drawback of diverging in cells where all the Ak are vanishingly small [such as the innermost and outermost supported parts of the (u, v)-plane] and will produce images with heightened noise on short and long spatial wavelengths. It is therefore more convenient to use the alternate normalization X ~ zi ¼ !kAk ðuk uiÞ ; ðA21Þ k which when inserted into equation (A15) will properly normalize the weighted sums of visibilities. This will then produce images with the desired units of Jy beam1 (see x 8). We therefore use equation (A21) for the normalization in our CBI pipeline. No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 595

APPENDIX B

SOURCE COUNTS AND THE RESIDUAL COVARIANCE MATRIX

res We wish to calculate C (see eq. [67]) using equation (66) with k ¼ k0 ¼ .Ifpðj0Þ is independent of flux density, then Z Z 1 2 SmaxðÞ res 2 dNðSÞ C ¼ d pðj0Þ dS S ; ðB1Þ 1 0 0 dS where we have left in the possibility that the upper flux density cutoff will depend on spectral index (see below) and set the lower flux density cutoff to zero (the results for realistic power-law counts with >2 are insensitive to the lower cutoff, but one can easily be included). As an example for the calculation of the fluctuation power due to residual sources in the Gaussian limit, consider power-law integral source counts 1 S dNðSÞ N0 S Nð> SÞ¼N0 ) ¼ ; ðB2Þ S0 dS S0 S0 where Nð> SÞ is the mean number density of sources with flux density greater than S at frequency 0 and a Gaussian spectral index distribution at frequency 0,

1 2 2 ð0Þ =2 pðjS;0Þ¼pðj0Þ¼pffiffiffiffiffiffiffiffiffiffiffi e : ðB3Þ 2 2

First consider the case in which there is a fixed flux density upper cutoff Smax at the frequency where the number counts are defined. The two parts of equation (67) separate easily, where the source count part of the integral is Z Smax þ2 2 dNðSÞ 2 Smax dS S ¼ N0S0 : ðB4Þ 0 dS þ 2 S0 For the distribution in equation (B3), the integral over becomes Z Z 1 2 1 d 2 ð0Þ=2 2 d pðj0Þ ¼ pffiffiffiffiffiffiffiffiffiffiffi e e 0 22 1 1 Z 1 ð22Þ2=22 d ðÞ2=22 ¼ e 0 pffiffiffiffiffiffiffiffiffiffiffi e 2 1 2 ¼ e2eff ; ðB5Þ where ¼ lnð=0Þ and

2 2 0 2 eff ¼ 2 ¼ 0 þ ; 4 2 ¼ 0 þ 2 ; ðB6Þ where is the mean of the extrapolated spectral index distribution, which remains a Gaussian, and the effective spectral index eff for the spectral component is shifted from the mean spectral index of the input distribution 0 by the combination of the scatter in the and the lever arm from the frequency extrapolation. Putting these together, we get S þ2 res 2 max 2eff C ¼ N0S0 e : ðB7Þ þ 2 S0 ^ One can also deal with the case in which there is an upper flux density cutoff Smax imposed at a frequency ^ other than 0 where the N(S) distribution is defined. In this case, the flux density cutoff in equation (B1) is

^ ^ ð0Þ 0 SmaxðÞ¼Smaxe Smax ¼ S^maxe ; ðB8Þ ^ ^ where ¼ lnð=^ 0Þ and Smax is the cutoff Smax extrapolated to 0 using 0. Then, Z S ðÞ þ2 max dN S S ^ 2 ð Þ 2 max ð0Þðþ2Þ dS S ¼ N0S0 e ; ðB9Þ 0 dS þ 2 S0 596 MYERS ET AL. Vol. 591 and thus Z þ2 1 S ^ res 2 max 2 ð0Þðþ2Þ C ¼ N0S0 d pðj0Þe e þ 2 S0 1 S þ2 2 max 2^eff ¼ N0S0 e ; ðB10Þ þ 2 S0 with

2 2 ^eff ¼ 0 þ ; 2 ¼ 0 þ 2; ^ ¼ 1 ð þ 2Þ ; ðB11Þ 2 where gives the modification of the effective spectral index due to the change in the frequency at which the cutoff is done. One often has an upper flux density cutoff at two different frequencies. For example, sources that are extrapolated to be ^ bright at the CMB observing frequency will have been detected and subtracted. If there is a flux density cutoff of Smax imposed 0 0 at a frequency ^ as before, but an additional upper cutoff of S^max at another frequency ^ , then there is a critical spectral index ^0 ^ ln Smax=Smax ¼ ðB12Þ crit lnðÞ^0=^

0 0 above which the effective cutoff Smax of equation (B8) changes from that appropriate to ^ to that at ^ (assuming ^ > ^). Thus, the integral over in equation (B10) will be broken into two pieces, Z þ2 crit S d 2 2 2 max 2^eff ðÞ =2 J1 ¼ N0S e pffiffiffiffiffiffiffiffiffiffiffi e ; 0 2 þ 2 S0 1 2 Z 0 þ2 1 S 0 d 0 2 2 2 max 2^ ð Þ =2 J2 ¼ N0S e eff pffiffiffiffiffiffiffiffiffiffiffi e ; ðB13Þ 2 0 S 2 þ 0 crit 2 res where C ¼ J1 þ J2. The quantities in J1 are as defined in equations (B8)–(B11), and the parameters in J2 are defined in the same way but using the higher frequency ^0. The truncated Gaussian integrals are just the integrated probabilities for the normal distribution Z x 1 2 1 1 x FðxÞ¼pffiffiffiffiffiffi dt et =2 ¼ þ erf pffiffiffi ; ðB14Þ 2 1 2 2 2 with erf ðzÞ the error function. Then, S þ2 2 max 2^eff J1 ¼ N0S0 e FðxcritÞ ; ðB15Þ þ 2 S0 0 þ2 2 Smax 2^0 0 J2 ¼ N0S0 e eff ½1 FxðÞcrit ; ðB16Þ þ 2 S0

0 0 where xcrit ¼ðcrit Þ= and xcrit ¼ðcrit Þ=. 3 1 As an example, consider the source counts presented in Paper II (x 4.3.2), with N0 ¼ 9:2 10 sr above S0 ¼ 10 mJy at 0 ¼ 31 GHz and ¼0:875, which gives N S2 ¼ 0:715 Jy2 sr1 ðB17Þ þ 2 0 0 as the raw source power. In the analysis described there, Mason et al. find that a Gaussian 1.4–31 GHz spectral index distribu- tion with 0 ¼0:45 and ¼ 0:37 fits the observed data. The CBI and OVRO direct measurements have a cutoff of Smax ¼ 25 mJy at 31 GHz (sources brighter than this have been subtracted from the CBI data and have residual uncertainties ^ placed in a source covariance matrix), and sources above Smax ¼ 3:4 mJy at ^ ¼ 1:4 GHz have already been accounted for in a second source matrix. Therefore, the critical spectral index is crit ¼ 0:644 from equation (B12). For crit, the 31 GHz cutoff holds. Since the cutoff and source distribution are at the same frequency as the observations ¼ 0, there is no 0 0 extrapolation factor ¼ 0 and the spectral index distribution is unchanged ( ¼ 0). Then, xcrit ¼ 2:957 and 0 3 1 FðxcritÞ1:56 10 ,so þ2 2 Smax 0 2 1 J2 ¼ N0S0 ½¼1 FxðÞcrit 0:003 Jy sr ðB18Þ þ 2 S0 No. 2, 2003 ESTIMATION OF POWER SPECTRUM OF CMB 597 for the flat-spectrum tail of the spectral index integral. The rest of the integral uses the 1.4 GHz cutoff, which we extrapolate using the mean spectrum to 31 GHz using equation (B8),

^ 0 ^ Smax ¼ S^maxe ¼ 0:843 mJy; ¼3:098 : ðB19Þ Because ¼ 0, we have to modify the quantities in equation (B11) by explicitly expanding the terms in and canceling remaining terms in , giving

^ 2 1 ^2 2 2 ¼ 0 ð þ 2Þ¼0:027; 2^eff ¼ 2 ð þ 2Þ ¼ 0:831 ; ðB20Þ which can then be inserted into equation (B15), giving S þ2 2 max 2^eff 2 1 J1 ¼ N0S0 e FðxcritÞ¼0:100 Jy sr ðB21Þ þ 2 S0 for xcrit ¼ 1:667, FðxcritÞ0:952, and thus we expect res 2 1 C ¼ 0:10 Jy sr ðB22Þ for the amplitude of the residual sources in the CBI fields. In Paper II, it is noted that there is a 25% uncertainty on N0,and more importantly the power-law slope of the source counts could conceivably be as steep as ¼1. Taking the extreme of ¼1, we get

res 2 1 C ¼ 0:15 Jy sr ðB23Þ

res using the above procedure. We thus conservatively estimate a 50% uncertainty on the value of C derived in this manner. res 2 1 Note that in Paper II we actually use the value of C ¼ 0:08 Jy sr derived using a Monte Carlo procedure emulating the integrals in equation (B13) but using the actual observed distribution of source flux densities and spectral indices. The agreement between these two estimates shows the efficacy of this procedure in practice.

APPENDIX C

COMPARISON WITH THE HOBSON & MAISINGER (2002) METHOD Recently, Hobson & Maisinger (2002) have independently proposed a binned (u, v)-plane method that is somewhat similar to ours, although it is more directly related to the ‘‘ optimal maps ’’ of Bond & Crittenden (2001). Hobson & Maisinger (2002) use a gathering mapping H (M in their notation), V ¼ Hs þ e ; ðC1Þ rather than our scattering kernel Q of equation (21). In the Hobson & Maisinger (2002) method, the vector s can be thought of as a set of ideal pixels in the (u, v)-plane. They show that the likelihood depends on binned visibilities v and noise n,

1 v ¼ HyE1H HyE1V ¼ s þ n ; 1 n ¼ HyE1H HyE1e ; ðC2Þ where

1 CN ¼ nny ¼ HyE1H : ðC3Þ

The Hobson & Maisinger (2002) kernel Hjk is chosen to equal 1 if the uj of visibility Vj lies in cell k, although other more com- plicated kernels could be imagined. The Hobson & Maisinger (2002) method will also give a calculational speedup through the reduction in number of independent gridded estimators, and the use of the method is demonstrated using simulated VSA data in their paper.

REFERENCES Bond, J. R., & Crittenden, R. G. 2001, in Structure Formation in the Draine, B. T., & Lazarian, A. 1999, ApJ, 512, 740 Universe, ed. R. G. Crittenden & N. G. Turok (NATO ASIC Proc. 565 Ekers, R. D., & Rots, A. H. 1979, in IAU Colloq. 49, Image Formation Dordrecht: Kluwer), 241 from Coherence Functions in Astronomy, ed. C. van Schooneveld Bond, J. R., & Efstathiou, G. 1987, MNRAS, 226, 655 (Dordrecht: Reidel), 61 Bond, J. R., Jaffe, A. H., & Knox, L. 1998, Phys. Rev. D, 57, 2117 Halverson, N. W., et al. 2002, ApJ, 568, 38 ———. 2000, ApJ, 533, 19 Hananay, S., et al. 2000, ApJ, 545, L5 Bond, J. R., & Myers, S. T. 1996, ApJS, 103, 63 Hobson, M. P., Lasenby, A. N., & Jones, M. 1995, MNRAS, 275, 863 Bond, J. R., et al. 2003, ApJ, submitted (Paper VI) Hobson, M. P., & Maisinger, K. 2002, MNRAS, 334, 569 Bracewell, R. N. 1986, The Fourier Transform and Its Applications (New Knox, L. 1999, Phys. Rev. D, 60, 103516 York: McGraw-Hill) Kogut, A., Banday, A. J., Bennett, C. L., Gorski, K. M., Hinshaw, G., & Cornwell, T. J. 1988, A&A, 202, 316 Reach, W. T. 1996, ApJ, 460, 1 Cornwell, T. J., Holdaway, M. A., & Uson, J. M. 1993, A&A, 271, 697 Lange, A. E., et al. 2001, Phys. Rev. D, 63, 42001 598 MYERS ET AL.

Lee, A. T., et al. 2001, ApJ, 561, L1 Shepherd, M. C. 1997, in ASP Conf. Ser. 125, Astronomical Data Analysis Leitch, E. M., Readhead, A. C. S., Pearson, T. J., & Myers, S. T. 1997, ApJ, Software and Systems VI, ed. G. Hunt & H. E. Payne (San Francisco: 486, L23 ASP), 77 Mason, B. S., et al. 2003, ApJ, 591, 540 (Paper II) Sievers, J. S., et al. 2003, ApJ, 591, 599 (Paper V) Mather, J. C., Fixsen, D. J., Shafer, R. A., Mosier, C., & Wilkinson, D. T. Taylor, G. B., Carilli, C. L., & Perley, R. A., eds. 1999, ASP Conf. Ser. 180, 1999, ApJ, 512, 511 Synthesis Imaging in Radio Astronomy II (San Francisco: ASP) Netterfield, C. B., et al. 2002, ApJ, 571, 604 Thompson, A. R., Moran, J. M., & Swenson, G. W., Jr. 1986, Interferome- Ng, K.-W. 2001, Phys. Rev. D, 63, 123001 try and Synthesis in Radio Astronomy (New York: Wiley) Padin, S., et al. 2001, ApJ, 549, L1 (Paper I) White, M., Carlstrom, J. E., Dragovan, M., & Holzapfel, W. L. 1999a, ———. 2002, PASP, 114, 83 ApJ, 514, 12 Pearson, T. J., et al. 2003, ApJ, 591, 556 (Paper III) ———. 1999b, preprint (astro-ph/9912422) Sault, R. J., Staveley-Smith, L., & Brouw, W. N. 1996, A&AS, 120, 375