PoS(AASKA14)005

Cosmic Dawn and Epoch of Reionization Foreground Removal with the SKA

Emma Chapman*, Anna Bonaldi, Geraint Harker, Vibor Jelić, Filipe B. Abdalla, Gianni Bernardi, Jérôme Bobin, Fred Dulwich, Benjamin Mort, Mario Santos, and Jean-Luc Starck

Department of Physics & Astronomy, University College London; Jodrell Bank Centre for Astrophysics, University of Manchester; Kapteyn Astronomical Institute, University of Groningen; ASTRON - the Netherlands Institute for Radio Astronomy; Rhodes University, South Africa; SKA South Africa; Harvard-Smithsonian Center for Astrophysics; Oxford e-Research Centre, University of Oxford; CEA, IRFU, Service d’Astrophysique, France; Physics Department, University of the Western Cape, South Africa; CENTRA, Instituto Superior Tecnico, Universidade de Lisboa, Portugal

E-mail: [email protected]

* Speaker.

The exceptional sensitivity of the SKA will allow observations of the Cosmic Dawn and Epoch of Reionization (CD/EoR) in unprecedented detail, both spectrally and spatially. This wealth of information is buried under Galactic and extragalactic foregrounds, which must be removed accurately and precisely in order to reveal the cosmological signal. This problem has been addressed already for the previous generation of radio telescopes, but the application to SKA is different in many aspects. We summarise the contributions to the field of foreground removal in the context of high redshift and high sensitivity 21-cm measurements. We use a state-of-the-art simulation of the SKA Phase 1 observations complete with cosmological signal, foregrounds and frequency-dependent instrumental effects to test both parametric and non-parametric foreground removal methods. We compare the recovered cosmological signal using several different statistics and explore one of the most exciting possibilities with the SKA: imaging of the ionized bubbles. We find that with current methods it is possible to remove the foregrounds with great accuracy and to get impressive power spectra and images of the cosmological signal. The frequency-dependent PSF of the instrument complicates this recovery, so we resort to splitting the observation bandwidth into smaller segments, each of a common resolution. If the foregrounds are allowed a random variation from the smoothness along the line of sight, methods exploiting the smoothness of foregrounds or a parametrization of their behaviour are challenged much more than the non-parametric ones. However, we show that correction techniques can be implemented to restore the performances of parametric approaches, as long as the first-order approximation of a power law stands.

Advancing Astrophysics with the Square Kilometre Array, June 8-13, 2014, Giardini Naxos, Italy

Copyright owned by the author(s) under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike Licence. http://pos.sissa.it/

1. Introduction

The statistical detection of the 21-cm reionization signal depends on an accurate and robust method for removing the foregrounds from the total signal. The first attempts to address this problem focused on exploiting the angular fluctuations of the 21-cm signal, e.g. Di Matteo et al. (2002); Oh & Mack (2003); Di Matteo et al. (2004), but the 21-cm signal was found to be swamped by various foregrounds. The focus then moved on to the frequency correlation of the foregrounds, with the cross-correlation of pairs of maps used as a cleaning step (Zaldarriaga et al. 2004; Santos et al. 2005). This naturally evolved into methods which exploited the correlation of the foreground across whole segments of, or the entire bandwidth of, the observation: line of sight (LOS) fitting. This LOS fitting (to be discussed in much greater detail in Sec. 2) has been numerically shown to be the optimal method for power spectrum recovery (Liu & Tegmark 2011).

One of the ‘saving graces’ of foreground contamination is its smoothness over frequency space. While the foregrounds are expected to be highly correlated on the order of MHz, the cosmological signal in comparison is expected to be highly uncorrelated, e.g. Di Matteo et al. (2002); Gnedin & Shaver (2004). LOS methods can be divided into subcategories of parametric and non-parametric methods. Both aim to find the form of the smooth foreground function along frequency for each line of sight and subtract this from the total signal, leaving residuals of noise, fitting errors and the cosmological signal.

The majority of early line of sight methods in the literature can be termed parametric as at some point they assume a specific form for the foregrounds, for example a polynomial (see Sec. 2.1). Despite the successes of the parametric methods, the fact remains that the form of the foregrounds is not definitively known across the frequency range and resolution of interest. Too great an assumption of their spectral form risks introducing a large element of uncertainty into the cosmological signal detection. It is with this argument that, more recently, ‘blind’ methods have been investigated. These allow the data to determine the form of the foregrounds, without assuming any particular shape beforehand (see Sec. 2.2). This has obvious advantages for a cosmological era so far not directly observed, but results are often not as promising as parametric results when applied to simulations. Arguably this is common sense, since in the parametric methods one has modelled the foregrounds based on simulation knowledge. If these methods were applied to foregrounds of a different shape to the accepted form, they would risk suffering a large drop in accuracy.

For this Chapter, we choose to concentrate on comparing the major LOS methods in the field, alongside the much more recent idea of avoiding the foregrounds altogether (e.g., Dillon et al. 2014) by focusing analysis on the area of Fourier space where the foregrounds are sub-dominant to the cosmological signal (see Sec. 2.3).

Though there are now multiple proof-of-principle papers relating to LOS fitting for the EoR signal recovery, there has been little consideration given to which method aids the recovery of CD/EoR information most efficiently, accurately or precisely. Though the comparison of polynomial-fitting methods to any new method introduced is fairly common, comparisons between more complicated parametric, non-parametric methods and indeed foreground avoidance are rare, with the exception of the brief comparison in Gu et al. (2013); Patil et al. (2014); Chapman et al. (2014). In this Chapter we aim to compare non-parametric and parametric methods on a SKA Phase 1 CD/EoR simulation, comparing the recovered signals both in terms of statistics and imaging.

In Sec. 2 we introduce the five methods used in this chapter to mitigate the simulated foreground contamination. In Sec. 3 we describe the state-of-the-art SKA Phase 1 simulation, consisting of cosmological signal, foregrounds and instrumental noise, used in this chapter. We show the results of applying the methods to these simulations in Sec. 4 before making our conclusions in Sec. 5.

2. Comparing Foreground Removal Methods

In this section we describe the methods of LOS foreground removal applied in this chapter. We divide the methods into those which assume a functional form for the foreground signal (parametric) and those which loosen the constraints on this form somewhat (non-parametric).

2.1 Parametric Methods

2.1.1 Polynomial fitting

The simplest method for foreground removal in total intensity is polynomial fitting in frequency or log frequency, e.g. McQuinn et al. (2006); Morales et al. (2006); Gleser et al. (2008); Jelić et al. (2008); Bowman et al. (2006); Liu et al. (2009); Petrovic & Oh (2011).

The usual method of polynomial fitting is to fit the total observed spectrum along the line of sight with a smooth function such as an $n$-th order polynomial:

$$\log T_{b,\mathrm{fg}}(\nu) = a_0 + \sum_{i=1}^{n} a_i \log \nu^{i}.$$

The order of polynomial varies slightly between different papers, for example Wang et al. (2006) sets $n = 2$ while Jelić et al. (2008) sets $n = 3$.

One should be careful in choosing the order of the polynomial to perform the fitting. If the order of the polynomial is too small, the foregrounds will be under-fitted and the EoR signal could be dominated and corrupted by the fitting residuals. If the order of the polynomial is too big, the EoR signal could be fitted out. For this work we will follow Jelić et al. (2008) and perform the fitting in log space using a 3rd order polynomial.

2.1.2 CCA (Correlated Component Analysis)

In this section we describe the main principles of operation of the Fourier-domain Correlated Component Analysis (CCA) method. More details on the method and on its application to the HI signal can be found in Ricciardi et al. (2010) and Bonaldi & Brown (2015), respectively. The CCA is a “model learning” algorithm, which estimates the frequency spectrum of the foreground components from the data exploiting second-order statistics. This method was developed for the Cosmic Microwave Background (CMB); its ability to improve the modelling of the poorly known anomalous microwave emission has been demonstrated in Bonaldi et al. (2007), Bonaldi & Ricciardi (2012) and Planck Collaboration (2013).

We start by modelling the data in the $uv$ plane as a linear mixture of the foreground components. For each point in the $uv$ plane we write

$$\vec{x} = \mathbf{B}\mathbf{A}\vec{s} + \vec{n}. \qquad (2.1)$$

The vectors $\vec{x}$ and $\vec{n}$ contain the data and the noise in Fourier space, respectively; the vector $\vec{s}$ contains the astrophysical foregrounds; the diagonal matrix $\mathbf{B}$ contains the instrumental dirty beams in Fourier space and the matrix $\mathbf{A}$, called the mixing matrix, contains the intensity of the foreground components at all frequencies. The 21-cm signal is modelled as a noise term, contributing to $\vec{n}$ together with the instrumental noise.

The additional assumptions made by the CCA are that the mixing matrix is constant within the considered area of the sky, and that its unknown elements can be reduced by adopting a suitable parametrization $\mathbf{A} = \mathbf{A}(\vec{p})$. For example, in the following, a power law is assumed for the synchrotron component with unknown, spatially constant, spectral index. For the free-free, we adopt a power-law behaviour with fixed spectral index of -2.08. When necessary, we can adopt other parametric models having more degrees of freedom. Though CCA exploits the data to estimate the parameters of the model, it does adopt a parametrization, and therefore it is classified as a parametric method.

Once we have an estimate of the mixing matrix, using a relation between the cross-spectra of the data, we can invert eq. (2.1) and reconstruct the foreground components as

$$\hat{\vec{s}} = \mathbf{W}\vec{x} \qquad (2.2)$$

directly in the Fourier domain.

The cleaning of the HI signal consists of subtracting the estimated foreground components at all frequencies. We perform the subtraction as:

$$\vec{x} - \mathbf{R}\hat{\mathbf{H}}\hat{\vec{s}},$$

where $\mathbf{R}$ is a diagonal matrix whose elements are chosen to improve the subtraction by minimizing the power of the residuals at each frequency. This step compensates for small errors in the parametric model adopted by the CCA, which result in a slight over/underestimation of the amplitude of the foregrounds at a given frequency. The effectiveness of this approach is tested with the simulation R2 (see Sections 3.4 and 4.4). It is important to note that this minimization approach could be applied to all methods, and as such is a way of mitigating the weaknesses of the parametric method as opposed to the inherent ability for the non-parametric method to deal with foregrounds differing from our models.

2.2 Non-Parametric Methods

2.2.1 Wp smoothing

Wp smoothing is a technique, introduced to 21-cm work by Harker et al. (2009), to fit the foregrounds LOS-by-LOS. The aim is to directly exploit the physical expectation that the foregrounds are smooth, so in this sense the foreground separation is not completely blind. It does not, however, assume a specific parametric form for the foregrounds, or anything about their spatial structure. Wp penalises changes in curvature, with roughness measured ‘apart from inflection points (IPs)’, hence the name ‘Wendepunkt’ (Wp), the German word for ‘inflection point’.

In the $i$-th LOS, we have a set $\{(\nu_1, s^i_1), (\nu_2, s^i_2), \ldots, (\nu_n, s^i_n)\}$ of observations in $n$ frequency channels, which we wish to fit with a smooth function $f(\nu)$. Since Wp smoothing always applies to one LOS, from now on we will drop the superscript for clarity. Wp smoothing takes as its measure of roughness the integrated change of curvature. If $1/\kappa$ is the radius of curvature, the standardized change of curvature is $\kappa'/\kappa \approx f'''/f''$, where the approximation, which we adopt here, holds exactly at local extrema ($f' = 0$) and becomes singular at IPs ($f'' = 0$).

We therefore separate out the IPs, writing $f''$ as a smooth function $g(\nu)$ multiplied by a polynomial with the IPs as its roots. With the IPs specified, we find $f$ by performing a penalised fit to the data, where the penalty term is given by a measure of the integrated change in curvature of $f$ multiplied by a smoothing parameter, $\lambda$.

This formulation of Wp smoothing is given by Mächler (1993, 1995), who derived a boundary value problem, the solution of which is the function $f$ we seek. Different algorithms have been proposed to solve this system (Mächler 1989; Gu et al. 2013), but we use that outlined by Harker et al. (2009). Unfortunately, these methods currently take $\sim 1$ s to solve for a single sightline, depending on the value of $\lambda$, making Wp smoothing relatively slow for large data cubes.

In principle, the choice of a value for $\lambda$ should be determined by the level of smoothness we expect in our foregrounds. In the limit $\lambda \to 0$, $f$ becomes the best-fitting function with the given inflection points, while for $\lambda \to \infty$ it becomes the best-fitting polynomial of degree $n_w + 2$. Here, we fix $n_w = 0$ and choose $\lambda$ based on the performance of the method in simulations. The quality of the fit is quite insensitive to the value of $\lambda$ up to at least a factor of 2, however.

[Figure 1: Left: The evolution of neutral hydrogen fraction with redshift. Right: The evolution of $T_s$, $T_{cmb}$, $T_K$ and $T_b$ with redshift. The absolute log is taken of $T_b$. Axes: Redshift / z; log(T / K).]

2.2.2 GMCA

Blind source separation (BSS) uses a mixing model $\vec{x} = \mathbf{A}\vec{s} + \vec{n}$, where $\vec{x}$ is the observed data, $\vec{s}$ is the unmixed data components, $\mathbf{A}$ is the mixing matrix and $\vec{n}$ is the noise. What defines a BSS problem is the need to estimate both $\mathbf{A}$ and $\vec{s}$ with no prior knowledge of either (note the difference to CCA, where a prior form for $\mathbf{A}$ was assumed). Methods differ in their approach to this estimation with, for example, the independent component analysis technique FastICA (Hyvärinen (1999), Hyvärinen et al. (2001) and applied to EoR data by Chapman et al. (2012)), which assumes statistical independence of the components. Here we utilise another BSS technique, Generalized Morphological Component Analysis (GMCA), assuming morphological diversity and sparsity of the foregrounds in order to model them. This approach originated with Zibulevsky & Pearlmutter (2001), who suggested that one could find a basis set in which the components would be sparsely represented, i.e. a basis set where only a few of the coefficients would be non-zero. With the components being unlikely to have the same few non-zero coefficients one could then use this sparsity to more easily separate the mixture. We attempt to recover the cosmological signal as a residual of the process, i.e. it is actually part of $\vec{n}$. We can expand the data $\vec{x}$ in a wavelet basis and seek an unmixing scheme, through the estimation of $\mathbf{A}$, which yields the sparsest components $\vec{s}$ in the wavelet domain.

This component separation method has been applied to the Planck PR1 data to estimate a low-foreground CMB map (Bobin et al. 2014). In this context, sparsity is well adapted to remove naturally non-Gaussian and heterogeneous components such as foregrounds.

For more technical details about GMCA, we refer the interested reader to Bobin et al. (2007, 2008a,b, 2013), where it is shown that sparsity, as used in GMCA, allows for a more precise estimation of the mixing matrix $\mathbf{A}$ and more robustness to noise than ICA-based techniques such as FastICA. For a previous application of GMCA to EoR data see Chapman et al. (2013).

[Figure 2: Variance of the simulated cosmological signal (black line) and of the reconstructed cosmological signals for the 4 foreground removal methods (coloured lines) for the S0 cube (top) and the S1, S2 and S3 cubes (bottom).]

2.3 Foreground Avoidance

While the methods so far introduced have been focused on removing the foreground from the total signal, there has recently been discussion of avoiding the foregrounds instead. The coupling between the expected frequency smoothness of foregrounds and the unavoidable frequency dependent response of an interferometer leaves a characteristic footprint in $k$-space, separating an area over which foreground dominates from a region which is virtually foreground free: a so-called ‘EoR window’. The boundaries of this EoR window have been discussed at length in the literature (Datta et al. 2010; Vedantham et al. 2012; Morales et al. 2012; Trott et al. 2012; Parsons et al. 2012; Liu et al. 2014a; Liu et al. 2014b) as well as seen in early results from low frequency observations
(Bernardi et al. 2013; Dillon et al. 2014; Pober et al. 2013). The largest extent of the EoR window happens to be at low $k_{perp}$ where the frequency dependent response of the instrument is smooth, excluding the low $k_{los}$ modes that are likely to be dominated by foregrounds. When constructing other statistics such as a spherical power spectrum, one can then simply ignore all the foreground-dominated modes and use the $k$ modes inside the EoR window alone. While this will result also in the loss of any cosmological signal outside the EoR window, the remaining region should be free of foreground contamination, if the foregrounds are indeed so well-defined even in the face of instrumental calibration errors.

[Figure 3: Spherically averaged power spectrum of the simulated cosmological signal, reconstructed cosmological signal, and noise. The top left, top right and bottom left panels show, respectively, power spectra computed in an 8 MHz slice in the centre of S1, S2 and S3. The central redshift of each slice is shown in the panel (z = 18.0202, z = 10.3899 and z = 7.1296). Different line styles and colours show the true (input) signal, the noise, and the reconstructed signal for the four different foreground removal methods (Wp, GMCA, CCA, Poly.), as described in the legend. Note that the axes are the same in each of these panels. In the bottom left panel we show two extra GMCA lines according to foreground models with 2 and 6 components. We do not show the highest $k$ points since they are strongly affected by the point spread function. The bottom right panel shows the application of foreground avoidance. Here, the different line styles show the true signal, the noise and the recovered signal, with the different colours used for the data at different redshifts (red for S2, and blue for S3), with cutoffs in $k_{los}$ as described in the text. The result for S1 is poorer, and is not shown in order to avoid compressing the scale of the plot; note that the scale in this panel is different from the other three. In all cases, the power spectra are computed after applying a Hanning taper in the frequency direction to avoid ringing: this is particularly important for foreground avoidance, since it mitigates aliasing of unsubtracted foreground power to high $k_{los}$. Axes: $\log_{10}(k / (h\,\mathrm{Mpc}^{-1}))$; $\log_{10}(\Delta^2 / (\mathrm{mK})^2)$.]

In comparison, while foreground removal allows the possibility of recovering the cosmological signal in all modes, there is also the possibility of foreground fitting
indeed window, so the In well-defined comparison, remaining evenrecovering while region in the foreground the should removal cosmological face allows be signal of the in possibility all of modes, there is also the possibility of foreground fitting PoS(AASKA14)005 ) 1 K − 21 T 2 0 −2 −4 −6 −8 −10 −12 −14 Mpc 63 .

0 log(P(k)) − SIMFAST 0 10 10 Emma Chapman < los k −1 , it is usual to assume and b 7 T . 0 / (Mpc) − 21, for example a minimum −8 −10 −12 −14 2 0 −2 −4 −6 10 perp k <

log(P(k)) los 1. As the aim of this chapter is to . k 0 , 0 SIMFAST 10 cells before producing ionization and −1 93 . = is the CMB temperature. However, this 0 10 3 z −

10 cmb −0.6 −0.7 −0.8 −0.9 −0.5 T

−1

. <

10 10 10 10 10

los

/ (Mpc) / k −1 los perp k k 8 / (Mpc) . The boxes are a constant 1Gpc in side length and 3 −8 −10 −12 −14 2 0 −2 −4 −6 perp k

log(P(k)) 0 10 −1 10

−0.8 −0.4 −0.5 −0.6 −0.7

−1 10 10 10 10 10

los

/ (Mpc) / k −1 is the spin temperature and / (Mpc) s . When calculating the brightness temperature, T

perp k M where −1 10 cmb T

Cylindrical power spectrum of the S0 cube at 75 MHz, 125 MHz and 175 MHz (in reading order). −1 >>

10

los s / (Mpc) / k −1 We simulate the cosmological signal using the semi-numerical reionization code We create initial conditions boxes on a grid of 1024 T assumption breaks down at highand redshift Lyman-alpha due coupling on to the the spin increasing temperature. effect of We calculate the the gas full temperature spin ( temperature for all Figure 4: The foreground contamination can be clearly seen for CD/EoR Foreground Removal assess the effectiveness of foreground removal methods,of as reionization, opposed we to have studying simply a usedhalo particular the model mass default of options 1.0e8 for that are output between redshifts 6 and 28 at separations of d brightness temperature boxes on a grid of 256 3. SKA Phase 1 Simulations 3.1 Cosmological Signal respectively, along with the action of the PSF at high bias being introduced on all modes. (Santos et al. 2010). PoS(AASKA14)005 and K T , cmb T , s T Emma Chapman , and HI and a list of preliminary x 1 = (218, 34.5) degrees. A 5-minute ) δ , α ( 9 . β ν ∝ b T ´ c et al. (2010). The foreground contributions considered in these boxes we create an observation cube, or ‘light cone’, with the LOS axis b T ´ c et al. (2008); Jeli . 1 S0: A cube running from 50–200 MHzsignal consisting convolved of the with clean the foregrounds andsampling PSF cosmological equivalent to at the 50 50 MHz MHz sampling and in the each channel. instrumental noise constructed with a http://www.oerc.ox.ac.uk/ ska/oskar The foreground simulations used in this paper are obtained using the foreground models de- The instrumental effects were modelled using the OSKAR simulator The images of the SKA-low PSF were produced by assuming full correlation between all 866 In order to simulate an observation, one normally constructs a ‘dirty’ cube whereby the cos- Though there have been foreground observations at frequencies relevant to LOFAR using In general, the foreground components are modelled as power laws in 3+1 dimensions (i.e. 
From the real space 1 • in Fig. b scribed in Jeli core stations, where the maximumdegrees baseline latitude. length Baseline is coordinates 5.29 were km.phase generated centre The on for the telescope a sky was 12-hour at apparent located synthesis equatorial at observation, coordinates with 52.7 a mological signal and foregrounds areconstruct convolved the with instrumental the noise. same frequency-dependentchannels However, PSF the to used standard have to common foreground resolution removal inall methods order with require to channels the separated work by optimally. 0.5 In MHz, which in case our we analysis: use five cubes, simulations are Galactic synchrotron emission,grounds. Galactic free-free emission and extragalactic fore- 3.3 Instrumental Effects station positions for the SKA1-LOW. sampling interval was used to give 144generated snapshots in each CASA of 374545 across baselines. athen PSF 5-degree normalized images according field-of-view were to then using a 256 1000Thompson w-projection hour et planes. al. integration (2001). time The using the noise prescription is described in e.g. 3.4 Cubes for Analysis evolving in redshift and a constant 5 degree field of view. 3.2 Foregrounds WSRT (Bernardi et al. 2009;cies Bernardi and resolution et of al. LOFAR remains 2010)relevant poorly the for constrained. foreground this As contamination a paper result, at rely foregroundolution the on models ranges. frequen- using directly constraints These from constraints observationsobservations are at to create different used a frequency to model and normalize relevant res- for the LOFAR-EoR necessary observations. extrapolations madethree from spatial and frequency) such that CD/EoR Foreground Removal redshifts above 10. We plot the evolutionT of the neutral hydrogen fraction, PoS(AASKA14)005 Emma Chapman 60MHz which is due to foreground residuals. The 10 < ν for both the S0 cube and the collated S1, S2 and S3 cubes. 
2 S1: A cube running from 50–99.5 MHzsignal consisting of convolved the clean with foregrounds and the cosmological sampling PSF equivalent to at the 50 50 MHz MHz sampling and in the each channel. instrumentalS2: noise A cube constructed running from with 100–149.5 MHzical a consisting signal of convolved the with clean foregrounds the and PSFa cosmolog- at sampling 100 equivalent MHz to and the 100 the MHz instrumental sampling noise in constructed each with S3: channel. A cube running from 150–200cal MHz signal consisting convolved of with the the clean PSFsampling foregrounds at equivalent and 150 MHz to cosmologi- and the the 150 MHz instrumental sampling noise in constructed each with channel. a R2: We multiply eachfrom channel a Gaussian of distribution with the standard deviation cleanalong of foreground 0.05, the simulating line cube a of 5% by sight. random wiggle a Thisas described foreground random in cube number S2. is drawn then convolved and used to construct a cube All methods show an excess variance at There is not an apparent major disadvantage to any of the methods by splitting the cube into The adjustment of the entire frequency range to a common resolution in S0 results in the loss We construct R2 (R - rough) in order to test the reliance of the methods on the smoothness of We computed the variance of the true input cosmological signal and of the reconstructed cos- • • • • three segments and so we pursue analysis of only the S1, S2 and S3 cubes in the following, in results are good for the othervariance frequencies. in GMCA the and Wp 80–90 MHz smoothing somewhatover frequency underestimate a range. the broad frequency For range. all methods the variance is correctly recovered of a lot of highthe resolution amount information of at high information frequency.foreground lost estimate. 
There and is the We a therefore amount balance testable of to S1, to data be S2 make the made and between good S3 methodsdespite in signal need a order to recoveries third to of of provide assess the an the whether data optimal being higher the available methods resolution to are constrain information the at foregrounds. the foregrounds. higher We frequencies, expect methods with strongCCA constraints and, on to the smoothness some such degree, as Wp polynomial, smoothingprior to be on more the affected than smoothness. GMCAforegrounds which places themselves no This or explicit roughness as a couldleakage simple of be polarized approximation interpreted foregrounds. of as an an instrumental calibration inherent error roughness such of a the 4. Results 4.1 Variance mological signals for each of thefor 4 a foreground pixel removal size methods. of Thesize. 2.3 variance arcmins. has The been results Given computed are the shown low in noise, Fig. the results are stable for changes of the pixel CD/EoR Foreground Removal PoS(AASKA14)005 , ). 93 4 . ) = 0 k − ( 2 10 ∆ . < k los at k ) Emma Chapman k ( bound, below which modes are los k . Following convention, we plot ) than in real space. The quality of power 2 los los k k + 2 perp 11 k q we can define a = los k k for the data at 75, 125 and 175 MHz respectively (see Fig. , 1 − perp k the result of implementing foreground avoidance. By constructing Mpc 3 63 . 0 − -space (residing mainly at low 10 k < los k and , which traces the contribution to the variance of a unit interval in log shows the spherically averaged power spectrum for a slice in frequency in the centre 7 . 2 0 3 π − 2 / 10 ) k In each case, the noise power spectrum has been subtracted from the residual power to recover We also include in Fig. One might wonder at the two peaks in variance below 100 MHz. While the clean signal does The spherically-averaged 21-cm power spectrum is one of the quantities most readily com- Fig. 
< ( P 3 -space might be considered a natural space in which to study the 21-cm signal, since interferom- los k considered contaminated by foregrounds andWe above can which then lies construct a what spherical is powerthis termed spectrum the as bound. described ‘EoR above window’. This but ignoring is all modes the below ‘foreground avoidance’ line.As one We can find see, this this severely bound limits to the be range of at scales which can be recovered; however, there are an estimate of theestimate cosmological of signal the power. cosmological signal,slice Foreground with (in the removal S3). possible mostly It exception yieldsperform is of a not GMCA as immediately reasonable in well clear the in why lowest-redshift chosen this GMCA rather frequency as arbitrarily and opposed segment. ideally to all any Theorder methods to other should fiducial select undergo method the four-component a various would full model input not parameters Bayesiannumber of (such model of as selection GMCA components the in was smoothing in parameter the in GMCApower Wp foreground spectrum smoothing model). for and GMCA In with the two S3 andby panel six changing components we the in have number the also foreground of plotted model. componentsfit the We in to can the see the GMCA that other foreground methods modelis which we needed. is can perhaps achieve It indicative a is that similar alsominimization a worth which more allow remembering robust it that method to the of performThis CCA model much minimization results selection better could have than equally undergone if be an the applied in-built minimization to were residual any not method. carried out. k a cylindrical power spectrum in of S1, S2 and(magnitude S3. squared of the In visibility) a inspherical each shells, given cell i.e. frequency in shells range, Fourier with space, a this and given is then binning computed this simply power in by finding the power not show such clear peaks,induces the this action effect. 
of the realistic SKA PSF on the4.2 clean Power cosmological Spectrum signal puted from theory and isredshift sources a (e.g. rich source Barkana & ofSKA Loeb will information 2005, be on and able many measure subsequent andk it works). on with It the high is nature signal-to-noise expected that of acrosseters the natively a high- measure broad Fourier range modes of into scales. the be more plane Moreover, isolated of in the sky, while smooth foregrounds are likely CD/EoR Foreground Removal order to retain as much spatialrequirements information of as possible the while methods. conforming to the common resolution spectrum recovery has thereforemethods become as a well as popular instruments. metric when comparing foreground removal PoS(AASKA14)005

Figure 5: Top two rows, reading order: the smoothed residual maps of S1 at 75 MHz from the Polynomial, CCA, Wp and GMCA methods. Bottom row, left to right: the smoothed cosmological signal at 75 MHz and the correlation coefficient relating to residuals vs. cosmological signal. The best theoretical recovery possible, whereby foreground fitting errors are zero, is shown by the correlation between noise plus simulated cosmological signal vs. simulated cosmological signal in solid black. (Map colour scale: δT_b / mK. Correlation panel axes: correlation coefficient and frequency / MHz; legend: GMCA, Log Poly, Wp, CCA, R(cs,nocs).)

Figure 6: Top two rows, reading order: the smoothed residual maps of S2 at 125 MHz from the Polynomial, CCA, Wp and GMCA methods. Bottom row, left to right: the smoothed cosmological signal at 125 MHz and the correlation coefficient relating to residuals vs. cosmological signal. (Map colour scale: δT_b / mK. Correlation panel axes: correlation coefficient and frequency / MHz; legend: GMCA, Log Poly, Wp, CCA, R(cs,nocs).)

Figure 7: Top two rows, reading order: the smoothed residual maps of S3 at 175 MHz from the Polynomial, CCA, Wp and GMCA methods. Bottom row, left to right: the smoothed cosmological signal at 175 MHz and the correlation coefficient relating to residuals vs. cosmological signal. (Map colour scale: δT_b / mK. Correlation panel axes: correlation coefficient and frequency / MHz; legend: GMCA, Log Poly, Wp, CCA, R(cs,nocs).)

k for which it performs very well. An optimal power spectrum estimation strategy will likely combine removal and avoidance in some way, with some attempts in this direction having been made by Liu et al. (2014b).

4.3 Images

One of the most exciting scientific outcomes of the SKA is the ability to image the EoR and Cosmic Dawn. We now review how the foreground removal methods affect this capability. We present slices from the residual cubes for three different scenarios. In Fig. 5 we take S1 and show the residual slices at 75 MHz; in Fig. 6 we take S2 and show the residual slices at 125 MHz; and in Fig. 7 we take S3 and show the residual slices at 175 MHz. Note that, for all images shown, the residual cube has been smoothed with a Gaussian kernel of FWHM eight pixels, which is equivalent to 9.36 arcminutes, in order to mitigate the effect of the instrumental noise. In each figure we also show the correlation coefficient between the smoothed residual cube and the smoothed simulated cosmological signal for the different methods. As we will only have statistical knowledge of the noise, we would not be well motivated in correlating the reconstructed and simulated cosmological signal, and instead also plot an 'envelope' in the form of a correlation between the simulated cosmological signal and the simulated cosmological signal combined with the instrumental noise. This provides an upper bound for the best correlation we can expect to see in the data if zero foreground fitting errors were present. We see that an impressive image recovery is apparent for all methods, though the extent of that recovery is highly variable with frequency.

4.4 Relaxation of foreground smoothness

We now demonstrate how the methods fare when analysing a cube containing foregrounds which have a random 5% deviation from the smooth power law along the line of sight. We show the correlation coefficient between the residuals of cube R2 and the cosmological signal in the top-left panel of Fig. 8. It is interesting to see the failure of the polynomial method compared to the ability of CCA to recover from a similar failure using the correction method mentioned in Section 2.1.2. This correction method could be applied to all approaches and relies on the first-order approximation of the input foreground model (i.e. that the wiggle is superimposed on a power law) being accurate enough. The crucial point to take away is that non-parametric methods such as GMCA, which do not have a prior on the foreground smoothness, are able to model the foregrounds accurately with no extra modelling by the user. In comparison, the parametric methods (and indeed non-parametric methods like Wp smoothing, which require, as opposed to assume, smoothness of the foregrounds) need some form of 'extra modelling' such as correction factors or tweaking of the smoothness fitting parameters. It is likely that Wp fitting would see a similar improvement to CCA with such correction factors, as the same first-order approximation applies.

4.5 Other SKA configurations

We now look at how one of our results is affected in the case of an early-SKA implementation where the sensitivity is halved and an SKA2 implementation where the sensitivity is quadrupled. We do this for only one method, GMCA, for clarity and conciseness. We show the correlation coefficient between the recovered maps in the three remaining panels of Fig. 8.
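As a rough illustration of the statistic used in Sections 4.3 to 4.5, the sketch below smooths each frequency slice with a Gaussian kernel of FWHM eight pixels and computes a per-slice Pearson correlation coefficient, including the noise-only 'envelope' for three noise levels. The random toy fields, the function names and the assumption that halving (quadrupling) the sensitivity doubles (quarters) the noise rms are illustrative choices made here, not details of the simulation pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

# A Gaussian of FWHM f has standard deviation f / (2 sqrt(2 ln 2)).
FWHM_TO_SIGMA = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))


def smooth_slices(cube, fwhm_pix=8.0):
    """Smooth every frequency slice (axis 0) with a 2D Gaussian kernel."""
    sigma = fwhm_pix * FWHM_TO_SIGMA
    # sigma = 0 along axis 0 leaves the frequency direction untouched.
    return gaussian_filter(cube, sigma=(0.0, sigma, sigma), mode="wrap")


def slice_correlation(a, b):
    """Pearson correlation coefficient between two cubes, slice by slice."""
    a = a.reshape(a.shape[0], -1)
    b = b.reshape(b.shape[0], -1)
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    norm = np.sqrt((a**2).sum(axis=1) * (b**2).sum(axis=1))
    return (a * b).sum(axis=1) / norm


def noise_envelope(signal, noise, fwhm_pix=8.0):
    """Correlation of smoothed (signal + noise) with smoothed signal."""
    return slice_correlation(smooth_slices(signal + noise, fwhm_pix),
                             smooth_slices(signal, fwhm_pix))


# Toy data (assumed): a spatially correlated "signal" and white "noise".
rng = np.random.default_rng(1)
shape = (4, 64, 64)
signal = gaussian_filter(rng.standard_normal(shape),
                         sigma=(0.0, 6.0, 6.0), mode="wrap")
signal /= signal.std()
noise = 2.0 * rng.standard_normal(shape)

# Assumed scaling: noise rms relative to the fiducial SKA case.
for name, scale in {"SKAearly": 2.0, "SKA": 1.0, "SKA2": 0.25}.items():
    print(name, np.round(noise_envelope(signal, scale * noise), 3))
```

A method's curve would be obtained by passing its residual cube to `smooth_slices` and `slice_correlation` in place of `signal + noise`. For a fixed noise realisation the envelope can only rise as the noise is scaled down, which is why a curve that stays put across noise levels points to foreground fitting errors rather than noise.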

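The spherical binning described in Section 4.2 can be sketched in the same spirit. The function below takes the power in each Fourier cell and averages it in shells of constant k, with an optional cut that ignores modes below a line-of-sight bound to mimic foreground avoidance. It is a minimal sketch under stated assumptions: an FFT of an image cube stands in for gridded visibilities, and the function name, the logarithmic bin edges and the `k_los_min` parameter are hypothetical choices made for this example.

```python
import numpy as np


def spherical_power_spectrum(cube, box_len, n_bins=10, k_los_min=0.0):
    """Spherically averaged power of a 3D cube.

    cube      : real array (nx, ny, nz); the last axis is the line of sight
    box_len   : physical side lengths (Lx, Ly, Lz), e.g. in Mpc
    k_los_min : modes with |k_los| below this value are ignored
    """
    dims = cube.shape
    volume = float(np.prod(box_len))
    cell_volume = volume / cube.size
    # Power per Fourier cell: magnitude squared of the transform.
    ft = np.fft.fftn(cube) * cell_volume
    power = np.abs(ft) ** 2 / volume

    # Angular wavenumber of every cell along each axis.
    k_axes = [2.0 * np.pi * np.fft.fftfreq(n, d=length / n)
              for n, length in zip(dims, box_len)]
    kx, ky, k_los = np.meshgrid(*k_axes, indexing="ij")
    k_mag = np.sqrt(kx**2 + ky**2 + k_los**2)

    # Drop the mean (k = 0) and, optionally, low line-of-sight modes.
    keep = (k_mag > 0) & (np.abs(k_los) >= k_los_min)
    k_keep, p_keep = k_mag[keep], power[keep]

    # Average the power in logarithmically spaced shells of |k|.
    edges = np.logspace(np.log10(k_keep.min()), np.log10(k_keep.max()),
                        n_bins + 1)
    idx = np.clip(np.digitize(k_keep, edges) - 1, 0, n_bins - 1)
    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=p_keep, minlength=n_bins)
    p_k = np.full(n_bins, np.nan)
    filled = counts > 0
    p_k[filled] = sums[filled] / counts[filled]
    k_centres = np.sqrt(edges[:-1] * edges[1:])
    return k_centres, p_k, counts
```

Returning the number of cells per shell makes the cost of avoidance explicit: raising `k_los_min` removes cells from every shell and empties the lowest ones, which is the restriction in recoverable scales discussed in Section 4.2.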
Figure 8: In reading order: the correlation coefficient relating to the R2 cube residuals and the cosmological signal; the correlation coefficient relating to the GMCA residuals for the different SKA scenarios vs. cosmological signal for the S1, S2 and S3 cubes. (Axes: correlation coefficient and frequency / MHz; legends: GMCA, Log Poly, Wp, CCA, R(cs,nocs) for the R2 panel and SKA, SKAearly, SKA2 for the others.)

As the instrumental noise level decreases, the recovery is greatly improved. The general shape of the curves remains similar, suggesting that there is significant contribution by foreground fitting errors which affect the correlation from slice to slice. There are indeed slices in the S1 cube, for example at 85 MHz, which produce the same correlation independent of noise level, suggesting foreground leakage is the dominant cause of error at those frequencies.

5. Conclusions

We have applied a suite of foreground removal methods to a state-of-the-art simulation of SKA Phase 1 observations and analysed the recovered cosmological signal by using different statistics.

• The variance of the 21-cm signal is well recovered over a broad range of frequencies by all methods. While setting the entire bandwidth to a common resolution results in the loss of a lot of spatial information at high frequency, there is no obvious advantage in the accuracy of the recovered variance. We therefore opt for a compromise of setting several segments to common resolution.

• Foreground removal methods generally yield a reasonable estimate of the spherical power spectrum of the cosmological signal. Foreground avoidance severely limits the range of scales for which the power spectrum can be recovered; however, in such a limited range the results are good.

• We obtain an impressive recovery of the images for all methods, the quality of which however varies with frequency.

• The relaxation of the hypothesis of foreground smoothness does not affect the performance of GMCA, which is non-parametric, while it affects polynomial fitting and Wp smoothing. However, as shown by CCA, the quality of the results can be restored with some extra modelling, at least as long as the smooth model adopted is a reasonable approximation of the foreground spectrum.

• As the instrumental noise level decreases, the recovery of the signal is greatly improved. For some frequencies (85 MHz for example) the results are much more similar, which suggests that the foreground subtraction errors in these cases are dominant.

6. Acknowledgments

EC would like to thank Jonathan Pritchard and the SKA-EoR working group for useful discussions. GH acknowledges funding from the People Programme (Marie Curie Actions) of the European Union's Seventh Framework Programme (FP7/2007-2013) under REA Grant Agreement No 327999. AB acknowledges support from the European Research Council under the EC FP7 grant number 280127.

References

Barkana, R., & Loeb, A. 2005, ApJL, 624, L65
Bernardi, G., de Bruyn, A. G., Brentjens, M. A., et al. 2009, A&A, 500, 965
Bernardi, G., de Bruyn, A. G., Harker, G., et al. 2010, A&A, 522, A67
Bernardi, G., Greenhill, L. J., Mitchell, D. A., et al. 2013, ApJ, 771, 105
Bobin, J., Moudden, Y., Starck, J.-L., Fadili, J., & Aghanim, N. 2008a, Statistical Methodology, 5, 307
Bobin, J., Starck, J.-L., Fadili, J., & Moudden, Y. 2007, IEEE Transactions on Image Processing, 16, 2662
Bobin, J., Starck, J.-L., Moudden, Y., & Fadili, M. J. 2008b, in "Advances in Imaging and Electron Physics", 152, 221
Bobin, J., Starck, J.-L., Sureau, F., & Basak, S. 2013, A&A, 550, A73
Bobin, J., Sureau, F., Starck, J.-L., Rassat, A., & Paykari, P. 2014, A&A, 563
Bonaldi, A., & Brown, M. L. 2015, MNRAS, 447, 1973
Bonaldi, A., & Ricciardi, S. 2012, Advances in Astronomy
Bonaldi, A., Ricciardi, S., Leach, S., et al. 2007, MNRAS, 382, 1791
Bowman, J. D., Morales, M. F., & Hewitt, J. N. 2006, ApJ, 638, 20
Chapman, E., Abdalla, F., Harker, G., et al. 2012, MNRAS, 423, 2518
Chapman, E., Abdalla, F. B., Bobin, J., et al. 2013, MNRAS, 429, 165
Chapman, E., Zaroubi, S., & Abdalla, F. 2014, preprint (astro-ph/1408.4695)
Datta, A., Bowman, J. D., & Carilli, C. L. 2010, ApJ, 724, 526
Di Matteo, T., Perna, R., Abel, T., & Rees, M. 2002, ApJ, 564, 576
Di Matteo, T., Ciardi, B., & Miniati, F. 2004, MNRAS, 355, 1053
Dillon, J. S., Liu, A., Williams, C. L., et al. 2014, Phys. Rev. D, 89, 023002
Gleser, L., Nusser, A., & Benson, A. J. 2008, MNRAS, 391, 383
Gnedin, N. Y., & Shaver, P. A. 2004, ApJ, 608, 611
Gu, J., Xu, H., Wang, J., An, T., & Chen, W. 2013, ApJ, 773, 38
Harker, G., Zaroubi, S., Bernardi, G., et al. 2009, MNRAS, 397, 1138
Hyvärinen, A. 1999, IEEE Transactions on Neural Networks, 10, 626
Hyvärinen, A., Karhunen, J., & Oja, E. 2001, Independent Component Analysis (John Wiley and Sons)
Jelić, V., Zaroubi, S., Labropoulos, P., et al. 2008, MNRAS, 389, 1319
Jelić, V., Zaroubi, S., Labropoulos, P., et al. 2010, MNRAS, 409, 1647
Liu, A., Parsons, A. R., & Trott, C. M. 2014a, Phys. Rev. D, 90, 023018
Liu, A., Parsons, A. R., & Trott, C. M. 2014b, Phys. Rev. D, 90, 023019
Liu, A., & Tegmark, M. 2011, Phys. Rev. D, 83, 103006
Liu, A., Tegmark, M., Bowman, J., Hewitt, J., & Zaldarriaga, M. 2009, MNRAS, 398, 401
Mächler, M. 1989, PhD thesis, ETH Zürich
Mächler, M. 1993, Very smooth nonparametric curve estimation by penalizing change of curvature, Research report 71, Seminar für Statistik ETH Zürich
Mächler, M. 1995, Annals of Statistics, 23, 1496
McQuinn, M., Zahn, O., Zaldarriaga, M., Hernquist, L., & Furlanetto, S. R. 2006, ApJ, 653, 815
Morales, M. F., Bowman, J. D., & Hewitt, J. N. 2006, ApJ, 648, 767
Morales, M. F., Hazelton, B., Sullivan, I., & Beardsley, A. 2012, ApJ, 752, 137
Oh, S. P., & Mack, K. J. 2003, MNRAS, 346, 871
Parsons, A. R., Pober, J. C., Aguirre, J. E., et al. 2012, ApJ, 756, 165
Patil, A. H., Zaroubi, S., Chapman, E., et al. 2014, ArXiv e-prints
Petrovic, N., & Oh, S. P. 2011, MNRAS, 413, 2103
Planck Collaboration. 2013, A&A, 557, A53
Pober, J. C., Parsons, A. R., Aguirre, J. E., et al. 2013, ApJL, 768, L36
Ricciardi, S., Bonaldi, A., Natoli, P., et al. 2010, MNRAS, 406, 1644
Santos, M. G., Cooray, A., & Knox, L. 2005, ApJ, 625, 575
Santos, M. G., Ferramacho, L., Silva, M. B., Amblard, A., & Cooray, A. 2010, MNRAS, 406, 2421
Thompson, A. R., Moran, J. M., & Swenson, Jr., G. W. 2001, Interferometry and Synthesis in Radio Astronomy (John Wiley and Sons)
Trott, C. M., Wayth, R. B., & Tingay, S. J. 2012, ApJ, 757, 101
Vedantham, H., Shankar, N. U., & Subrahmanyan, R. 2012, ApJ, 745, 176
Wang, X., Tegmark, M., Santos, M. G., & Knox, L. 2006, ApJ, 650, 529
Zaldarriaga, M., Furlanetto, S. R., & Hernquist, L. 2004, ApJ, 608, 622
Zibulevsky, M., & Pearlmutter, B. A. 2001, Neural Computation, 13, 863