Research Collection

Doctoral Thesis

Tracing early structure in the high redshift Universe

Author(s): Diener, Catrina

Publication Date: 2015

Permanent Link: https://doi.org/10.3929/ethz-a-010639768

Rights / License: In Copyright - Non-Commercial Use Permitted


DISS. ETH NO. 23072

Tracing early structure in the high redshift Universe

A thesis submitted to attain the degree of

DOCTOR OF SCIENCES of ETH ZURICH

Dr. sc. ETH Zurich

presented by

Catrina Diener MSc ETH Physics, ETH Zurich

born on 02.03.1987

citizen of Fischenthal ZH

accepted on the recommendation of

Prof. Dr. Simon Lilly, examiner

Prof. Dr. Simon Morris, co-examiner

2015

To my family: my mother Ursula and my sister Christina

Abstract

Structure and its growth significantly drive the evolution and properties of our Universe. On large enough scales the distribution and extent of structure is mostly seeded by primordial density fluctuations. Its observation therefore constitutes a direct test of our cosmological model and of our understanding of how structures form and evolve within the framework of that model. Furthermore, galaxies evolve within the context of their surrounding dark matter structure: the exact properties of this environment and, connected with that, the proximity to (or absence of) other galaxies have a significant impact on this evolution.

In this thesis we study tracers of structure in the early Universe, at 2 ≲ z ≲ 3, using two different approaches. On the one hand we focus on overdensities of galaxies, both their identification and their properties. On the other hand we study the connection between high redshift galaxies and gas, more precisely HI (believed to closely trace dark matter), by mapping the distribution of Lyα-absorption around the galaxies. With these two approaches we detect and characterise structure over a large range of scales, from ∼ 1 comoving Mpc up to around 100 Mpc.

In the first part we identify high-redshift overdense systems in the zCOSMOS-deep survey, which provides spectroscopic redshifts for ∼ 3500 galaxies at 2 ≲ z ≲ 3 in a 0.92 × 0.91 deg² field. The rather dense sampling of zCOSMOS-deep and the relative accuracy of the spectroscopic redshifts allow us to conduct a systematic search for these overdensities by applying a friends-of-friends group-finder, a technique commonly used at lower redshift. The parameters for this algorithm, carefully adjusted to encompass the present structure whilst avoiding contamination from the background, are r = 500 kpc (physical, transverse linking length) and ∆v = 700 km/s (line-of-sight linking length). Applying the group-finder to the zCOSMOS sample yields a catalogue of 42 systems, mostly triplets but also some with 4 or 5 members.

In order to understand the nature of the systems identified in zCOSMOS-deep, we take advantage of the dark matter and galaxy connection provided by the Millennium simulation. We use mock catalogues derived from the simulation and match them to replicate the zCOSMOS-deep redshift number densities. By applying the same group-finder to these simulated catalogues we construct a mock sample of overdensities which, through the connection to the simulation, contains the dark matter information as well as the corresponding dark matter merger trees.

By studying these mock systems, we find that at the epoch of observation they are, although significantly overdense, mostly associations of central galaxies. They therefore do not conform to the definition of galaxy groups, in which a central and a number of satellites share the same halo. However, by redshift z = 0 most (93%) of the high-redshift systems will have evolved into galaxy groups. We are thus observing proto-groups, i.e. progenitors of those z = 0 groups. Furthermore we find that most of these systems evolve into groups with halo masses of ∼ 10¹³ − 10¹⁴ M⊙/h at z = 0. A zCOSMOS-type survey would also catalogue a representative fraction of > 10¹⁴ M⊙/h haloes if the spectroscopic sampling rate of galaxies were 100%. With the actual sampling of zCOSMOS, 35% of these high-mass haloes are still identified. Using the interpretation of these high-redshift systems as proto-groups, we search for potential effects of environment on their member galaxies, in particular for differences in colour. We find that at fixed stellar mass, the red fraction of proto-group galaxies is indistinguishable from that of field galaxies at the same redshift.

We then present a z = 2.45 proto-cluster as a case study. It was discovered in a re-observation of the proto-groups discussed above and has to date 11 spectroscopically confirmed members with an estimated overdensity of ∼ 10. We again use the Millennium simulation to understand the likely evolution of this structure. Within the mock catalogues, 16 proto-clusters are identified that have been selected to mimic the observations. Again, at z ∼ 2.5 most of the member galaxies are still singletons in their own dark matter halo, although some may already have started the assembly process and are sharing a halo. By following the evolution of the mock proto-clusters down to z = 0 we find that they almost fully assemble by z ∼ 1 and reach a halo mass of ∼ 10¹⁴ − 10¹⁵ M⊙/h by z = 0. We study the progenitor galaxies of the z = 0 mock clusters by tracing all cluster galaxies back to z ∼ 2.5.
We find that these progenitors occupy rather large fields, 3-20 Mpc in diameter, and that the actual proto-cluster galaxies we would identify from spectroscopy are concentrated in a smaller region of this progenitor field. This hints at the presence of a more extended structure in the COSMOS field. Finally we search for differences in the properties of the proto-cluster galaxies with respect to the field. We find no significant evidence for variation in stellar mass, star formation rate or fraction of quiescent galaxies.

In the second part we study the connection between the intergalactic medium traced by the Lyα-forest and the zCOSMOS-deep galaxies. Our sample consists of eight QSO sightlines in the zCOSMOS-deep field, with 2.5 ≲ z_QSO ≲ 3, which were observed with XSHOOTER. To obtain an estimate of the Lyα absorption distribution along these sightlines, we use a pixel-optical-depth approach, which has the advantage of being automated and of providing a quasi-continuous distribution along the line of sight. Two of the QSOs were re-observed with the high-resolution UVES instrument and we use those to analyse the impact of the lower resolution of XSHOOTER on the optical depth estimates. We find that at XSHOOTER resolution, with the typical signal-to-noise of our data, the impact of varying signal-to-noise is more important and the effect of resolution is negligible.

The distribution of the average optical depths around zCOSMOS-deep galaxies is used to study systematics in the assigned galaxy redshifts as well as their reliability. We find that the zCOSMOS redshifts are systematically blueshifted by ∼ 500 km/s with respect to the Lyα absorption. This shift is explainable by systematics in the wavelength calibration of zCOSMOS-deep and by galactic winds affecting the lines used for the redshift assignment. We further confirm that the redshift uncertainty of zCOSMOS-deep is of the order of a few hundred km/s. As for the reliability of the zCOSMOS redshifts, we find that redshifts classed as less reliable in the survey itself indeed have lower average absorption around the corresponding galaxies. We also quantify the fraction of reliable redshifts for different classes of redshifts. This can serve as a guideline when drawing redshift samples from zCOSMOS-deep for other applications.

Lastly, we perform a measurement of the Galaxy-Lyα-forest cross-correlation function at 2 ≲ z ≲ 2.5 by estimating the mean overdensity of Lyα-absorption in terms of the optical depth at a distance r_p (transverse) and π (line-of-sight) from the galaxies. We are able to cover scales up to r_p ∼ 60 Mpc (comoving) and several hundred Mpc along the line of sight, but are limited at scales of r_p < 5 Mpc due to the small number of galaxy-sightline pairs. The correlation function exhibits the signature of redshift space distortions: a compression in the line-of-sight direction with respect to the transverse direction. The estimated correlation function is then compared with a model for the redshift space distortions that incorporates the compression due to large scale infall of matter. We find no evidence for inconsistency between the observed correlation function and the model; the observed compression is thus explainable by large scale infall.

Summary (Zusammenfassung)

Cosmic structure and its growth decisively shape the evolution of our Universe. On sufficiently large scales the distribution and extent of matter concentrations are largely determined by primordial density fluctuations. The observation of such structures therefore constitutes a direct test of the cosmological model and of our understanding of structure formation and its evolution within that model. Furthermore, galaxies evolve within the context of the surrounding dark matter structure. The exact properties of this environment, as well as the proximity (or distance) to other galaxies, have a significant influence on the galaxies concerned.

In this thesis, tracers of cosmic structure in the early Universe (at redshift 2 < z < 3) are investigated. On the one hand we concentrate on spatial overdensities in the galaxy distribution, mainly on their identification and properties. On the other hand we characterise the connection between early galaxies and hydrogen gas. We "map" the gas distribution around galaxies, following the idea that the gas essentially traces the otherwise invisible dark matter. With these two approaches we can identify and characterise cosmic structure over a wide range of scales, from ∼ 1 Mpc up to about 100 Mpc.

In the first part we identify spatially overdense systems at high redshift in the zCOSMOS-deep sample. The zCOSMOS catalogue provides spectroscopic redshifts for ∼ 3500 galaxies in a 0.92 × 0.91 deg² field at redshift 2 < z < 3. The density of observed galaxies, which is relatively high at least for this redshift, and the relative accuracy of the spectroscopic redshifts allow a systematic search for overdensities in the galaxy distribution by means of a "friends-of-friends" group-finder, a technique that is normally applied at low redshift. The parameters of this algorithm are r = 500 kpc (transverse linking length) and ∆v = 700 km/s (along the line of sight); they are carefully adjusted to detect the structures present while avoiding contamination by background objects. Applying this group-finder to the zCOSMOS-deep sample results in a catalogue of 42 systems, consisting mainly of triplets with some additional quartets or quintets.

To understand the nature of the systems identified in zCOSMOS-deep, we draw on the Millennium simulation, which combines the data on the dark matter distribution with the data on the associated galaxies. Mock catalogues were derived from this simulation, which we then adjust so that they replicate the number density of zCOSMOS-deep. By applying the group-finder described above to these mock catalogues we can construct a simulated sample of galaxy systems which, through the connection to the simulation, additionally contains the full information on the dark matter and its evolution.

The analysis of these mock samples shows that, at the epoch of observation, the identified systems are still associations of central galaxies. They therefore do not conform to the common definition of galaxy groups, in which a central galaxy and a number of satellite galaxies occupy the same dark matter halo. By redshift z = 0, however, most (93%) of the identified systems will have evolved into galaxy groups. We conclude that we have in fact observed "proto-groups", that is, the progenitors of the z = 0 groups. Most of these proto-groups evolve into groups with a halo mass of ∼ 10¹³ − 10¹⁴ M⊙/h by z = 0. Furthermore, a sample such as zCOSMOS-deep is able to catalogue a representative fraction of the > 10¹⁴ M⊙/h structures, provided that the galaxies meeting the selection criterion of the sample are actually observed. Assuming the actual sampling rate of zCOSMOS, one still identifies ∼ 35% of these massive systems.

We then analyse possible influences of the proto-groups on their member galaxies, and in particular test for differences in colour, which would indicate a transition from star-forming to quiescent galaxies. It appears, however, that at fixed stellar mass the red fraction of proto-group galaxies is the same as that of field galaxies.

In a case study we present a proto-cluster at a redshift of z = 2.5. This system was discovered in a follow-up observation of the zCOSMOS proto-groups and has 11 spectroscopically confirmed member galaxies and a density contrast of ∼ 10. Again using the Millennium simulation, we identify 16 comparable proto-clusters in the corresponding mock catalogues and use them to better understand the observed object. At z = 2.5 most of the member galaxies are still isolated in their own dark matter halo, although some have already begun the cluster formation process and share a halo with other member galaxies. We follow the evolution of these simulated proto-clusters over cosmic time and find that by redshift z ∼ 1 they have almost fully assembled into groups or clusters, and that at the present day they reach a halo mass of ∼ 10¹⁴ − 10¹⁵ M⊙/h.

Furthermore, we have analysed the progenitor galaxies of the z = 0 clusters and traced them back along the cosmic timeline to z ∼ 2.5. These progenitors occupy a relatively large field (3-20 Mpc). The galaxies that are actually identified in observations, in contrast, correspond to only a smaller part of this field. This hints at a more extended structure in the COSMOS field. Finally, we examine the galaxies of the proto-cluster for differences with respect to field galaxies. However, there appears to be no evidence for differing stellar masses, star formation rates or fractions of quiescent galaxies.

The second part is devoted to the study of the connection between the intergalactic medium, observable through HI, and the zCOSMOS galaxies. Our sample consists of 8 quasars with redshift 2.5 ≲ z_QSO ≲ 3 that lie in the zCOSMOS-deep field. These quasars were observed spectroscopically with the XSHOOTER instrument. The Lyα absorption distribution along the line of sight is computed with the pixel optical depth method. This method has the advantage of having a physical interpretation, in contrast to, for example, the flux fluctuation method.

Two of the quasars were also observed with the high-resolution UVES spectrograph in order to estimate the influence of the lower resolution of XSHOOTER on the optical depth values. It turns out that, at the given resolution of XSHOOTER and with the typical signal-to-noise ratios, the effect of varying signal-to-noise ratio is more important and the influence of resolution is negligible.

We use the distribution of the optical depth around zCOSMOS-deep galaxies to check for systematic effects in the redshift measurements as well as their reliability. The zCOSMOS redshifts are systematically blueshifted by ∼ 500 km/s relative to the Lyα absorption. This shift can be explained by systematic effects in the wavelength calibration of zCOSMOS-deep and by galactic winds affecting the lines used for the redshift measurement. Furthermore, we confirm a redshift accuracy of the order of a few hundred km/s. We determine the fraction of reliable redshifts for zCOSMOS-deep, which can serve as a guideline for constructing redshift samples from zCOSMOS-deep.

Finally, we present the measurement of the galaxy-Lyα correlation function at a redshift of 2 ≲ z ≲ 2.5. To this end we determine the mean gas excess, expressed through the optical depth, at a distance r_p (transverse) and π (along the line of sight) from the galaxies. We are able to cover distances of up to r_p ∼ 60 Mpc and several hundred Mpc along the line of sight. At transverse distances smaller than 5 Mpc the significance is limited by the small number of galaxy-quasar pairs at this separation. To our knowledge, this is the first measurement of the correlation between galaxies and Lyα absorption at this redshift and over this range of distances.

The correlation function shows the signature of redshift space distortions, manifested as a compression along the line of sight. We compare the correlation function with a model for the redshift space distortions. The model describes the compression due to the growth of cosmic structure under the assumption of the concordance cosmology. Our result shows that the measured correlation function can be explained by this model.

Contents

List of Figures

List of Tables

Part I Introduction

1 Introduction and background
  1.1 The cosmological framework
    1.1.1 Basic assumption I: The Cosmological Principle
    1.1.2 Basic assumption II: General relativity
    1.1.3 A concordance cosmology
    1.1.4 Distances and redshifts
  1.2 Structure formation
    1.2.1 Power spectrum and correlation functions
    1.2.2 Bias
    1.2.3 Large scale structure
  1.3 Overdensities traced by galaxies
    1.3.1 Galaxy groups and clusters
    1.3.2 Proto-groups and proto-clusters
  1.4 Gas and galaxies: The intergalactic medium at z ∼ 2
    1.4.1 The Lyα-forest of QSOs
    1.4.2 Using the Lyα-forest as a window to early structure
    1.4.3 Measuring absorption lines
  1.5 Outline

2 Data and simulations
  2.1 The COSMOS survey
  2.2 The zCOSMOS-deep redshift sample
    2.2.1 Sample selection and observations
    2.2.2 Redshift determination and reliability
  2.3 The photometric redshift samples
    2.3.1 The COSMOS photometric redshift sample
    2.3.2 The UVISTA photometry and photometric redshift sample
  2.4 The Millennium simulation and its derived mock catalogues
    2.4.1 The Millennium Simulation
    2.4.2 The Kitzbichler lightcones
    2.4.3 The Henriques lightcones
    2.4.4 Application of the mocks and the simulation

Part II Proto-groups and proto-clusters

3 Proto-groups at 1.8 < z < 3 in zCOSMOS-deep
  3.1 Introduction
  3.2 Data
    3.2.1 The zCOSMOS-deep sample
    3.2.2 Mock catalogues
  3.3 Methods
    3.3.1 Group definition
    3.3.2 The nature of groups in the mocks
    3.3.3 Group finder algorithm
    3.3.4 Application to zCOSMOS sample and comparison with mocks
  3.4 Results
    3.4.1 Are we detecting real groups at z ≳ 2?
    3.4.2 Assembly timescale
    3.4.3 Halo masses
    3.4.4 Overdensities
    3.4.5 Excess of high mass objects and red fractions
  3.5 Summary and Conclusions

4 A proto-cluster at z = 2.45
  4.1 Introduction
  4.2 Data
    4.2.1 The FORS2 data
    4.2.2 Proto-cluster at z = 2.45
    4.2.3 The mock sample
  4.3 Evolution in simulations
    4.3.1 Surface number densities
    4.3.2 Assembly history
    4.3.3 Halo masses
    4.3.4 Progenitor galaxies
  4.4 Observational Characteristics
    4.4.1 Photometric redshift samples
    4.4.2 Overdensity estimation
    4.4.3 Connection to radio galaxies
    4.4.4 Does environment matter?
  4.5 Summary and conclusions


Part III The Lyα-forest-Galaxy cross-correlation at z ∼ 2

5 The Lyα-forest data
  5.1 Description of the programme
  5.2 The XSHOOTER observations
    5.2.1 Data-reduction
  5.3 Continuum fitting algorithm
  5.4 The UVES data
    5.4.1 Observations and set-ups
    5.4.2 Data reduction and continuum fit
    5.4.3 The effects of resolution and S/N: Comparison of UVES and XSHOOTER data
  5.5 Estimates of optical depths
  5.6 The QSO sample in relation to zCOSMOS-deep

6 Analysis of the zCOSMOS-deep redshift reliability

7 Measuring the Lyα-forest-Galaxy cross-correlation
  7.1 Measurement of the cross-correlation function
    7.1.1 Mean vs Median
    7.1.2 Number counts
    7.1.3 Construction of the background map
    7.1.4 The impact of sampling
    7.1.5 Cross-correlation function
  7.2 Redshift space distortion model
    7.2.1 Construction of the model
    7.2.2 Basic assumptions
  7.3 Comparing ξ(r_p, π) to the redshift space distortion model
  7.4 Summary and conclusions

Part IV Summary and Outlook

8 Summary and Conclusions
  8.1 Proto-groups and proto-clusters
  8.2 The Lyα-forest-Galaxy cross-correlation at z ∼ 2

9 Outlook: “Hay más futuro que pasado”

Part V Appendix

10 A new calibration algorithm for KMOS
  10.1 Instrument description
    10.1.1 Science driver
    10.1.2 Design and data organisation
    10.1.3 Calibration and re-commissioning procedures
  10.2 The calibration position finder algorithm
    10.2.1 Basic concept
    10.2.2 Implementation
    10.2.3 Other possible approaches

List of Figures

1.1 Redshift evolution of the cosmic web
1.2 Example of the effect of redshift space distortion
1.3 The 2dFGRS correlation function
1.4 Comparison of Abell1689 to a proto-cluster
1.5 Schematic overview on the Lyα-forest

2.1 Overview on the COSMOS survey
2.2 RA-Dec of zCOSMOS-deep galaxies
2.3 Redshift distribution in zCOSMOS-deep
2.4 Properties of the COSMOS and UVISTA samples
2.5 Stellar mass distribution in the COSMOS and UVISTA samples
2.6 Snapshots of the Millennium simulation
2.7 Pointers within the Millennium simulation

3.1 The average N(z)-distribution in the mock catalogues, compared to the N(z)-distribution of zCOSMOS-deep
3.2 Number of proto-groups at 1.8 < z < 3 as a function of varying velocity linking lengths and projected linking lengths
3.3 The N(z)-distribution of the galaxies in the zCOSMOS-deep candidate groups
3.4 Position of candidate groups in the COSMOS field
3.5 Comparison of the basic properties of the candidate groups in the mock sample with those in zCOSMOS-deep
3.6 The fraction of proto-groups with respect to all candidate groups in the mocks as a function of their velocity dispersion v_rms and size r_rms
3.7 Assembly of the proto-groups in the mocks
3.8 Assembly history of all candidate mock groups with richness three
3.9 Present-day halo masses that are detectable in a zCOSMOS-like survey
3.10 Distribution of the proto-group “overdensities” in zCOSMOS-deep and in the mocks
3.11 zCOSMOS proto-groups as seen from the COSMOS photometric sample
3.12 Red fraction of objects in the photo-z sample at the position of our candidate groups

4.1 Pointings of the FORS2 observations
4.2 Positions of the members of the z = 2.45 proto-cluster
4.3 Assembly history of the proto-cluster
4.4 Evolution in halo masses from z ∼ 2.5 to the assembled z = 0 halo
4.5 Field of the progenitor galaxies that will accrete onto the z = 0 cluster
4.6 Properties of the proto-cluster galaxies as seen by the UVISTA photometric sample

5.1 Redshift distribution of the QSOs and zCOSMOS-deep galaxies
5.2 Overview of the XSHOOTER data reduction workflow
5.3 Spectrum extraction for the XSHOOTER data
5.4 Impact of continuum estimation errors
5.5 Flux distribution of QSO-51
5.6 Example for a continuum fit
5.7 Overview of the Lyα-forest region of the XSHOOTER data
5.8 XSHOOTER versus UVES fluxes
5.9 Comparison of a high signal-to-noise to a low signal-to-noise UVES spectrum
5.10 Positions of the QSOs and zCOSMOS-deep galaxies
5.11 Lyα absorption in the eight sightlines

6.1 Line-of-sight distribution of optical depths around flag 3 & 4 galaxies at varying r_p
6.2 Line-of-sight distribution of optical depths at r_p = 15 cMpc for different zCOSMOS-deep confidence classes

7.1 Mean versus Median estimate
7.2 Map of the number of galaxy-sightline pairs
7.3 Map of the number of galaxy-sightline pairs on large scales
7.4 Random map
7.5 Sampling effects for ξ(r_p, π)
7.6 ξ(r_p, π) for all π
7.7 Measured ξ(r_p, π) for both zCOSMOS galaxy sets
7.8 S/N map for ξ(r_p, π)
7.9 Illustration of the estimator for redshift space distortion
7.10 Influence of variations in the redshift space distortion model parameters
7.11 Redshift space distortion, data vs model comparison

10.1 Overview on the data organisation for KMOS
10.2 The KMOS calibration unit
10.3 Trace for one KMOS IFU
10.4 Output of the KMOS algorithm

List of Tables

2.1 Flags in zCOSMOS-deep

3.1 Candidate groups detected in zCOSMOS-deep

4.1 Proto-group targets for the FORS2 observations
4.2 Summary of the FORS2 observations
4.3 Summary of the FORS2 success-rates
4.4 Comparison of FORS2 and zCOSMOS-deep redshifts
4.5 Spectroscopically confirmed members of the z = 2.45 proto-cluster
4.6 FORS2 redshifts
4.7 Continuation FORS2 redshift

5.1 Overview of the XSHOOTER observations
5.2 Overview of the UVES observations
5.3 Overview of UVES signal-to-noise ratios
5.4 Signal-to-noise correction from the continuum fit
5.5 Overview of the Lyα data
5.6 Metal line systems identified in the XSHOOTER spectra
5.7 Continuation of metal line systems identified in the XSHOOTER spectra

6.1 Summary of the redshift reliability in zCOSMOS-deep


Part I

Introduction

Chapter 1

Introduction and background

1.1 The cosmological framework

1.1.1 Basic assumption I: The Cosmological Principle

Modern cosmology is founded on the “Cosmological Principle” which states that “the Universe is homogeneous and isotropic”, meaning that the Universe is essentially the same at every point in space and looks the same in all directions. This implies that our place in it is not privileged. Obviously, this principle cannot be true on small scales: the mere existence of stars and galaxies defies it. However, it holds on larger “global” scales.

1.1.2 Basic assumption II: General relativity

The second pillar of modern cosmology is the assumption that the dynamics of our Universe is entirely described by the theory of General Relativity, i.e. the Einstein field equations. When requiring homogeneity and isotropy, the space-time metric of the Universe takes the Robertson-Walker form with coordinates $(ct, r, \theta, \phi)$:

\[ ds^2 = -c^2 dt^2 + a^2(t)\left[dr^2 + S_K^2(r)\left(d\theta^2 + \sin^2(\theta)\, d\phi^2\right)\right]. \]

Here $a(t)$ is the scale factor of the Universe, describing its expansion. Its present day value is set to $a(t_0) = 1$. Further, using the curvature radius $R$:

\[ S_K(r) = \begin{cases} R\sin(r/R), & K = +1 \\ r, & K = 0 \\ R\sinh(r/R), & K = -1 \end{cases} \tag{1.1} \]

This metric and the Einstein field equations then lead directly to the Friedmann equations describing the dynamics of the Universe under the assumption that it acts as an ideal fluid. With the matter density $\rho$, pressure $p$, curvature $K$, cosmological constant $\Lambda$ and $H(t) = \dot{a}/a$, the Friedmann equations are:


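\[ \left(\frac{\dot{a}}{a}\right)^2 = \frac{8\pi G}{3}\rho - \frac{Kc^2}{a^2} + \frac{\Lambda}{3}, \]

\[ \dot{\rho} = -3H\left(\rho + \frac{p}{c^2}\right). \]

For a component with equation of state $p = \omega \rho c^2$, the solution of the latter equation gives the evolution of its density:

\[ \rho(t) = \rho(t_0)\, a(t)^{-3(1+\omega)}. \]

In the case of ordinary (non-relativistic) matter we have $\omega_m = 0$, for radiation $\omega_r = 1/3$ and for a cosmological constant $\Lambda$ it is $\omega_\Lambda = -1$.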
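As a small illustration of this scaling, the following Python sketch evaluates the density of each component relative to its present day value at a chosen scale factor. The function name `density_ratio` and the example value of the scale factor are assumptions made for illustration only; they do not appear in the thesis.

```python
def density_ratio(a, w):
    """Return rho(a) / rho(a = 1) for a component with equation-of-state parameter w."""
    return a ** (-3.0 * (1.0 + w))


# Equation-of-state parameters as quoted in the text.
W_MATTER = 0.0
W_RADIATION = 1.0 / 3.0
W_LAMBDA = -1.0

# Illustrative choice (assumption): a scale factor of one third of today's value.
a_example = 1.0 / 3.0

for name, w in [("matter", W_MATTER), ("radiation", W_RADIATION), ("Lambda", W_LAMBDA)]:
    print(f"{name:10s} rho / rho_0 = {density_ratio(a_example, w):.1f}")
```

Matter dilutes with the volume, radiation picks up one additional power of the scale factor, and the density associated with a cosmological constant stays fixed.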

1.1.3 A concordance cosmology

Assuming that the density of the Universe consists entirely of matter, radiation and a “false” vacuum, and defining the critical density $\rho_c = \frac{3H^2}{8\pi G}$, we can now introduce the more commonly used parameter $\Omega = \rho/\rho_c$ for each contribution to the universal density. $\Omega_{tot} = 1$ corresponds to a flat Universe, $\Omega_{tot} > 1$ means positive curvature and $\Omega_{tot} < 1$ negative curvature.

The concordance model of cosmology (ΛCDM-cosmology) relies on only a few, well-constrained parameters to describe the Universe. The most recent parameters (consistent with WMAP9, Bennett et al. 2013) predict a flat Universe. Expressing the present day value of $H(t)$ as $H_0$, we have:

\[ \Omega_M = 0.27, \quad \Omega_\Lambda = 0.73, \quad H_0 = 70\ \mathrm{km\,s^{-1}\,Mpc^{-1}}. \]
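To give a feeling for the numbers involved, the following Python sketch evaluates the critical density for the quoted value of the Hubble constant and checks that the quoted density parameters add up to a flat Universe. The physical constants are standard values, while the function name `critical_density` and the variable names are assumptions made for this example.

```python
import math

G = 6.674e-11          # gravitational constant [m^3 kg^-1 s^-2]
MPC_IN_M = 3.0857e22   # one megaparsec in metres


def critical_density(h0_km_s_mpc):
    """Critical density rho_c = 3 H^2 / (8 pi G) in kg m^-3."""
    h0_si = h0_km_s_mpc * 1.0e3 / MPC_IN_M  # convert km/s/Mpc to 1/s
    return 3.0 * h0_si ** 2 / (8.0 * math.pi * G)


# Concordance parameters as quoted in the text.
OMEGA_M = 0.27
OMEGA_LAMBDA = 0.73
H0 = 70.0  # km/s/Mpc

rho_c = critical_density(H0)
omega_tot = OMEGA_M + OMEGA_LAMBDA

print(f"rho_c      = {rho_c:.3e} kg m^-3")
print(f"rho_matter = {OMEGA_M * rho_c:.3e} kg m^-3")
print(f"Omega_tot  = {omega_tot:.2f}")
```

The resulting critical density is of order 10⁻²⁶ kg m⁻³, i.e. a few hydrogen atoms per cubic metre.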

1.1.4 Distances and redshifts

Already very early on in the history of cosmology it was realised that most objects are moving away from us. Edwin Hubble discovered the Hubble law, relating the recession velocity $v$ of a galaxy to its distance $d$ via $v = H_0 d$, with the proportionality constant $H_0$. The recession of distant objects was the first indication of the expansion of the Universe and, by implication, of the Big Bang.

In modern day cosmology the recession velocity has been replaced by the redshift $z$, defined via the ratio of the observed wavelength $\lambda_{obs}$ of a given spectral feature to its rest-frame wavelength $\lambda_{rf}$: $z = \lambda_{obs}/\lambda_{rf} - 1$. In principle this is straightforward to measure, provided one knows which spectral feature one is looking at. To first order the redshift can be viewed as caused by the expanding Universe; this is called cosmological redshift. As such it is related to the scale factor $a$ of the Universe via

\[ a(t_{em}) = \frac{1}{1+z}, \]

which allows a redshift to be understood as a measure of the scale factor of the Universe at the time $t_{em}$ when the light was emitted. The redshift itself, however, does not contain any information on this point in time.
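The definitions in this section can be written down in a few lines of Python. The sketch below is purely illustrative: the function names are assumptions, and the observed wavelength is a made-up example value chosen so that the Lyα line (rest-frame wavelength 1215.67 Å) lands at a redshift typical for this thesis.

```python
LYA_REST_ANGSTROM = 1215.67  # rest-frame wavelength of the Lyα line


def redshift(lambda_obs, lambda_rest):
    """Redshift from observed and rest-frame wavelength: z = lambda_obs / lambda_rest - 1."""
    return lambda_obs / lambda_rest - 1.0


def scale_factor_at_emission(z):
    """Scale factor of the Universe when the light was emitted: a = 1 / (1 + z)."""
    return 1.0 / (1.0 + z)


def hubble_velocity(distance_mpc, h0=70.0):
    """Recession velocity in km/s from the Hubble law v = H0 * d (d in Mpc)."""
    return h0 * distance_mpc


# Hypothetical example: Lyα observed at 4254.845 Angstrom.
z_example = redshift(4254.845, LYA_REST_ANGSTROM)
a_example = scale_factor_at_emission(z_example)

print(f"z = {z_example:.3f}, a(t_em) = {a_example:.3f}")
print(f"v(100 Mpc) = {hubble_velocity(100.0):.0f} km/s")
```

Note that the Hubble law in this simple form is only meaningful for nearby objects; at the redshifts studied here the scale factor relation is the appropriate description.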


As already hinted at, the measured redshifts of galaxies are, however, not purely cosmological but (in the absence of observational uncertainties) a mixture of cosmological redshift and so-called peculiar velocities. These arise in two ways: firstly from the random motion within haloes and secondly from coherent matter infall. We will discuss this later in the text. A third deviation from the cosmological redshift is not physical but originates from redshift measurement errors. At z ∼ 2 and in large redshift surveys, these can easily reach a few hundred km/s.

Measuring distances

In the far Universe, redshifts are our only distance indicators for the objects we observe. Any distance estimator however does assume cosmological redshifts and is hence imprecise on the level of peculiar velocities. Under this assumption the comoving line of sight distance to an object at redshift z can then be calculated as:

$$D_c = \frac{c}{H_0}\int_0^z \frac{dz'}{\sqrt{\Omega_M(1+z')^3 + \Omega_K(1+z')^2 + \Omega_\Lambda}}.$$

The comoving transverse distance between two objects on the sky separated by ∆RA and ∆Dec, can, in the approximation of small angular scales, be calculated as:

$$r_c = \sqrt{\Delta\mathrm{RA}^2 + \Delta\mathrm{Dec}^2}\cdot D_c,$$

under the assumption of zero curvature.
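Both distance measures can be evaluated numerically. The sketch below integrates the expression for Dc with a simple trapezoidal rule, using the concordance parameters quoted earlier, and then applies the small-angle formula. The step count and function names are our own choices, and the angular offsets are assumed to be true on-sky separations expressed in degrees.

```python
import math

C_KMS = 299792.458  # speed of light in km/s

def comoving_distance(z, h0=70.0, omega_m=0.27, omega_lambda=0.73, n_steps=2000):
    """Line-of-sight comoving distance in Mpc (trapezoidal integration)."""
    omega_k = 1.0 - omega_m - omega_lambda

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1 + zp) ** 3
                               + omega_k * (1 + zp) ** 2 + omega_lambda)

    dz = z / n_steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    for i in range(1, n_steps):
        total += inv_e(i * dz)
    return C_KMS / h0 * total * dz

def transverse_separation(d_ra_deg, d_dec_deg, d_c):
    """Comoving transverse separation in Mpc for small on-sky offsets (flat Universe)."""
    angle = math.hypot(math.radians(d_ra_deg), math.radians(d_dec_deg))
    return angle * d_c

d_c = comoving_distance(2.5)
r_c = transverse_separation(1.0 / 60.0, 0.0, d_c)  # two objects 1 arcmin apart
print(f"D_c = {d_c:.0f} Mpc, r_c = {r_c:.2f} cMpc")
```

For these parameters one arcminute at z = 2.5 corresponds to somewhat less than 2 comoving Mpc.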

1.2 Structure formation

The early Universe was almost completely homogeneous, a statement which no longer holds at the present day. The aim of structure formation theory is to explain how the observed diversity of structure has formed from this early Universe. Precision measurements of the cosmic microwave background have shown that tiny density fluctuations were present (first detected by the COBE satellite), which can be described by

$$\delta(x) = \frac{\rho(x) - \bar\rho}{\bar\rho},$$

where $\bar\rho$ is the mean density of the Universe. These fluctuations are primordial in nature and are thought to constitute the origin of the structure observed at present day. The general theory of how these fluctuations grow is quite complicated, as it has to include the effects of baryonic physics as well as non-linearities that apply if such a fluctuation collapses gravitationally. However, the majority of matter in the Universe is in the form of dark matter, more specifically cold dark matter, which only interacts gravitationally. Therefore, for the large scale formation and evolution of structure the theory will be entirely dominated by the dark matter physics. Furthermore on scales of a few comoving Mpc linear theory applies, simplifying it further.


The linear regime is defined by δ ≪ 1. In the limit of large volumes the Universe can then be viewed as periodic, so that δ can be expressed as a Fourier series:

$$\delta(x) = \sum_k \delta_k \exp(i k \cdot x).$$

These density fluctuations δk then evolve according to the equation

$$\ddot\delta + 2\frac{\dot a}{a}\dot\delta = 4\pi G\rho\,\delta.$$

The growth of structure is finally described by the function f, which together with the Friedmann equations relates to the cosmological parameters:

$$f = \frac{d\ln\delta}{d\ln a} = \Omega_m^{0.6} + \frac{\Omega_\Lambda(2+\Omega_m)}{140} \simeq \Omega_m^{0.6}.$$
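The size of the correction term in f is easy to check numerically. The sketch below evaluates the approximation at z = 0 and at z = 2.5. Evaluating it at higher redshift by inserting the redshift-dependent density parameters of a flat ΛCDM model is our assumption, as are the function names.

```python
def omega_m_at_z(z, omega_m0=0.27, omega_lambda0=0.73):
    """Matter density parameter at redshift z for a flat LCDM model (assumed)."""
    e_squared = omega_m0 * (1.0 + z) ** 3 + omega_lambda0
    return omega_m0 * (1.0 + z) ** 3 / e_squared

def growth_rate(omega_m, omega_lambda):
    """f = Omega_m^0.6 + Omega_Lambda (2 + Omega_m) / 140."""
    return omega_m ** 0.6 + omega_lambda * (2.0 + omega_m) / 140.0

f_today = growth_rate(0.27, 0.73)
om_z = omega_m_at_z(2.5)
f_highz = growth_rate(om_z, 1.0 - om_z)   # flatness: Omega_Lambda(z) = 1 - Omega_m(z)
print(f"f(z=0) = {f_today:.3f}, f(z=2.5) = {f_highz:.3f}")
```

In both cases the second term contributes only about one per cent, which is why f ≃ Ωm^0.6 is usually sufficient.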

1.2.1 Power spectrum and correlation functions

When studying the clustering of matter one usually measures the correlation function. Assuming two tracer populations t and t’ of the underlying matter distribution, the cross-correlation function is defined as

$$\xi(r) = \langle \delta_t(x)\,\delta_{t'}(x+r) \rangle.$$

Here δ denotes the respective overdensities. The correlation function therefore corresponds to the excess probability of finding another tracer at distance r. In the case of only one tracer population this becomes an auto-correlation function, which for galaxies and at scales of r ≲ 10–20 cMpc has approximately the form of a power law, $\xi(r) = (r/r_0)^{-\gamma}$, with a correlation length of r0 ∼ 5 cMpc/h and γ ∼ 1.8. The exact values depend on the properties of the galaxy sample and the redshift. The power spectrum $P(k,t) = |\delta_k|^2$ and the correlation function are related by a simple Fourier transformation:

$$\xi(r) = \frac{1}{2\pi^2}\int dk\,k^2 P(k)\,\frac{\sin(kr)}{kr}.$$
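The Fourier relation can be verified numerically. The sketch below uses a toy Gaussian power spectrum, chosen only because its transform is known analytically and not because it resembles a realistic P(k). It also includes the power-law form with the canonical values quoted above. All names, the integration limit and the step count are illustrative assumptions.

```python
import math

def xi_power_law(r, r0=5.0, gamma=1.8):
    """Power-law auto-correlation function, xi = (r / r0)^(-gamma)."""
    return (r / r0) ** (-gamma)

def xi_from_power_spectrum(r, power, k_max=12.0, n_steps=6000):
    """xi(r) = 1/(2 pi^2) * integral dk k^2 P(k) sin(kr)/(kr), trapezoidal rule."""
    dk = k_max / n_steps
    total = 0.0
    for i in range(1, n_steps + 1):          # the k = 0 term vanishes because of k^2
        k = i * dk
        weight = 0.5 if i == n_steps else 1.0
        total += weight * k * k * power(k) * math.sin(k * r) / (k * r)
    return total * dk / (2.0 * math.pi ** 2)

s = 1.0                                      # width of the toy spectrum (arbitrary units)

def toy_power(k):
    return math.exp(-0.5 * (k * s) ** 2)

xi_numeric = xi_from_power_spectrum(1.0, toy_power)
xi_analytic = (2.0 * math.pi * s ** 2) ** -1.5 * math.exp(-0.5 / s ** 2)
print(f"numeric {xi_numeric:.6f}, analytic {xi_analytic:.6f}")
```

By construction the power law gives ξ = 1 at r = r0, which is the usual definition of the correlation length.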

1.2.2 Bias

The power spectrum derived from linear theory is valid for dark matter (DM) or any species that exactly traces DM. Typically however one attempts to trace the DM distribution by electromagnetically radiating objects. Since the physics of baryons differs from that of DM, this may not be an accurate tracer. On large scales however, one can assume a linear bias b to connect the DM and the tracer population t:

$$P_t(k) = b^2 \cdot P_{DM}(k).$$

By definition the bias of DM is therefore 1, at all redshifts. For correlation functions this translates to

$$\xi_{t,t'}(r) = b_t\, b_{t'}\, \xi_{DM}(r),$$

which in the case of an auto-correlation reduces to the familiar $\xi_t(r) = b_t^2\, \xi_{DM}(r)$.


Fig. 1.1 — Evolution of the dark matter distribution in the Millennium II simulation from z ∼ 6 to z = 0. The field size is 100×100 Mpc/h. The cosmic web evolves drastically over this time as matter accretes onto denser regions. Image adapted from Boylan-Kolchin et al. (2009).

1.2.3 Large scale structure

On Megaparsec scales the matter in the Universe is distributed anisotropically, forming the so-called cosmic web. This structure consists of nodes, the seeds or locations of clusters, filaments that connect those nodes, and sheets whose intersections form the filaments. In between those structures there are large underdense regions called voids. There is evidence that the structure of the cosmic web was already present in the primordial density fluctuations, with filaments and nodes being somewhat overdense and sheets and voids constituting underdense structures. The cosmic web evolves strongly over cosmic time, as indicated in Figure 1.1. The most prominent features are already in place at high redshifts and evolve from there; the weaker underdense regions however disappear almost completely between z ∼ 2 and z = 0.

Redshift space distortion

In cosmology redshifts are commonly used as distance indicators, assuming an underlying cosmological model and Hubble flow (motion solely due to the expansion of the Universe). Since redshifts are essentially a velocity measure, this assumption only corresponds directly to real distances if the redshifts are cosmological in nature (and the assumed cosmological parameters are correct). One therefore speaks of real versus redshift space to account for possible deviations from the “true” distances. In the presence of peculiar velocities, redshift space starts to deviate significantly from real space, an effect known as redshift space distortion. One distinguishes two regimes that are detectable on different scales. The random motions in e.g. cluster haloes cause a “Finger-of-God” effect, making structures appear elongated along the line of sight in redshift space. This effect only acts on halo scales. The second effect (known as the Kaiser effect) is caused by large-scale coherent infall of matter, and manifests itself as a compression along the line of sight. Because the “Finger-of-God” effect dominates on small scales, the Kaiser effect is only detectable on larger scales. In correlation analyses, redshift measurement errors on the other hand enter on all scales and essentially act to dilute the signal along the line of sight. In Figure 1.2 we show an illustration of redshift space distortion and of the two contributing effects.

Fig. 1.2 — [Four panels showing the correlation function in the (rp, π) plane, with both axes running from −20 to 20 cMpc/h; panel title: ΩM = 0.3, b = 1, r0 = 5 cMpc/h, γ = 1.8.] An illustration of the effects of redshift space distortion calculated by a model that we will introduce in Chapter 7. We show an auto-correlation function assuming r0 = 5 cMpc/h, the canonical γ = 1.8, ΩM = 0.3 and a galaxy bias of b = 1. Contours are at [4, 2, 1, 0.5, 0.2, 0.1]. Top left: no redshift space distortion, i.e. no infall and no finger-of-god effect. Top right: only large scale infall. Bottom left: only finger-of-god effect. Bottom right: both Kaiser infall and random motion combined.

Redshift space distortions can be observed in correlation analyses and have been successfully detected in a number of surveys, first in the 2dF survey (see Figure 1.3, Hawkins et al. 2002). They introduce complications for a number of applications, such as any attempt at inferring real space clustering measurements, or the identification of galaxy groups, where the apparent elongation needs to be accounted for in the group-finding parameters. On the other hand they can be a powerful cosmological tool. The redshift space distortion can be quantified by the parameter β. It is related to the growth of structure f (as introduced above), which can be approximated by $f \approx \Omega_M^{0.6}$. For galaxies with a bias b, β can be expressed as $\beta = \Omega_M^{0.6}/b$. In Chapter 7 we will be looking at redshift space distortion effects in a Lyα-forest and galaxy cross-correlation analysis. There we will also discuss the exact modelling and parameter assumptions.
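For the parameters used in Figure 1.2 the distortion parameter can be evaluated directly. The sketch below also computes the enhancement of the angle-averaged correlation function predicted by linear theory, 1 + 2β/3 + β²/5 (Kaiser 1987). This factor is not quoted in the text and is included here only as an assumption, to illustrate the size of the infall effect.

```python
def beta_parameter(omega_m, bias):
    """Redshift space distortion parameter, beta = Omega_M^0.6 / b."""
    return omega_m ** 0.6 / bias

def kaiser_monopole_boost(beta):
    """Linear-theory boost of the angle-averaged clustering amplitude (assumed, Kaiser 1987)."""
    return 1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0

beta = beta_parameter(0.3, 1.0)      # values adopted for Figure 1.2
boost = kaiser_monopole_boost(beta)
print(f"beta = {beta:.3f}, monopole boost = {boost:.3f}")
```

A more strongly biased tracer has a smaller β and therefore shows a weaker infall signature.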


Fig. 1.3 — The galaxy-galaxy correlation function as measured in the 2dF survey. There is a clear detection of the finger-of-god effect (elongation at σ = 0) and Kaiser infall (compression at ∼ 20 cMpc/h).

1.3 Overdensities traced by galaxies

1.3.1 Galaxy groups and clusters

A galaxy group (or cluster) is defined as an assembly of galaxies that occupy the same dark matter halo. They constitute the most massive gravitationally bound systems at a given redshift, and are a valuable laboratory for cosmology. Amongst other applications their number density and correlation length can serve as a probe for the underlying cosmological model; for example their mass-to-light ratio can be used to constrain Ωm (Borgani 2006). Besides the cosmological importance, the group and cluster environment is believed to significantly influence the evolution of their member galaxies. Processes that occur in the group environment include enhanced merger rates, galaxy harassment, ram pressure stripping or strangulation. Group and cluster galaxies are shown to exhibit systematically different properties from field galaxies. For instance, at a given mass the properties of satellite galaxies differ from those of central galaxies (e.g. Weinmann et al. 2009, Pasquali et al. 2010, Peng et al. 2010, Knobel et al. 2013). Whilst great advances have been made over the last decade in the understanding of the influence of the group and cluster environment, the identification of galaxy groups and clusters from galaxy surveys remains challenging. Observationally we only have access to light, which is a biased tracer of the underlying DM distribution. The case gets further complicated by observational uncertainties on, in particular, the redshifts of galaxies. Over the years several methods have been developed to identify groups and clusters with the aim of yielding pure (groups should not contain interlopers) and complete (all groups should be listed) group catalogues. For spectroscopic surveys the most commonly used are friends-of-friends algorithms (Huchra & Geller 1982, Eke et al. 2004, Diaz-Gimenez & Zandivarez 2015) or Voronoi tessellation (Marinoni et al. 2002, Cucciati et al. 2010). When only photometric redshifts (or colours of galaxies) are available, identification via the red sequence has proven to be successful (Murphy et al. 2012, Rykoff et al. 2014).
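The principle of a friends-of-friends search can be sketched compactly. The following is a minimal, brute-force illustration and not the group-finder used in this thesis: it assumes that positions have already been converted to physical transverse coordinates in kpc and to line-of-sight velocities in km/s, it uses the linking lengths quoted in the abstract (500 kpc and 700 km/s) as defaults, and it keeps systems with at least three members. The data layout and all names are hypothetical.

```python
from itertools import combinations

def friends_of_friends(galaxies, r_link_kpc=500.0, dv_link_kms=700.0, min_members=3):
    """Group galaxies given as (x_kpc, y_kpc, v_kms) tuples; returns lists of indices."""
    n = len(galaxies)
    parent = list(range(n))            # union-find forest, one tree per group

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        xi, yi, vi = galaxies[i]
        xj, yj, vj = galaxies[j]
        close_on_sky = (xi - xj) ** 2 + (yi - yj) ** 2 <= r_link_kpc ** 2
        close_in_velocity = abs(vi - vj) <= dv_link_kms
        if close_on_sky and close_in_velocity:
            parent[find(i)] = find(j)  # the two galaxies are "friends": merge groups

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_members)

# Invented test data: a chain of three linked galaxies plus two isolated ones
sample = [(0, 0, 0), (300, 0, 400), (600, 0, 900), (5000, 5000, 0), (0, 0, 5000)]
groups = friends_of_friends(sample)
print(groups)
```

Galaxies 0 and 2 are further apart than the transverse linking length but still end up in the same group because both are linked to galaxy 1, which is the defining property of the method.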

1.3.2 Proto-groups and proto-clusters

Definition

In the picture of hierarchical assembly, groups must have formed over cosmological time by continuously accreting galaxies onto their common halo. This implies the existence of progenitors to today's groups and clusters, which are either only partially assembled or are entirely comprised of galaxies occupying their individual DM haloes. Proto-groups and proto-clusters are defined as those progenitors to z = 0 groups and clusters; member galaxies of proto-groups have not at all or only partially accreted onto the later group halo. From an observational point of view, such progenitors are thought to constitute overdensities at high redshifts. They are believed to have emerged already at much earlier times, z > 2, with some claims of detection of the very earliest overdensities at redshifts as high as z ∼ 8 and z ∼ 5 (Trenti et al. 2012, Capak et al. 2011). These very early overdensities must represent the most extreme end of the mass range. However, as galaxies rapidly build up their stellar mass at z ∼ 2, one can expect significant numbers of high-redshift overdensities to become observationally accessible at this epoch (as shown in Chapter 3 of this thesis).

Identification

The challenges in the identification of those progenitors at z ∼ 2 are much more severe than in group searching at z < 1. Photometric redshifts become punishingly inaccurate, diluting any signal on scales of 10’000 km/s or more. Large spectroscopic redshift surveys are expensive in observing time and do not compare to their lower redshift counterparts in terms of sampling density and redshift accuracy. Further, the high-z structures are often not as far evolved as z < 1 groups and clusters, with features like a red sequence still absent or only just emerging. Whilst the z < 1 groups and clusters constitute gravitationally bound structures, this is not necessarily true for the high redshift counterparts. As shown in Chapter 3 those are often quite significantly overdense regions, but far from being gravitationally bound or, in other words, from having their member galaxies occupy the same DM halo. This also results in somewhat “looser”, less concentrated structures (see Chapter 4), which makes them more difficult to detect against the background (this is essentially a contrast issue); see Figure 1.4 for an illustration.


Fig. 1.4 — Comparison of Abell 1689 to a proto-cluster at z = 2.5 (presented in Chapter 4), with the members indicated by the red circles. This illustrates the immense difference between a fully evolved cluster as a compact structure and the looser and much less pronounced proto-cluster. Image credit for Abell 1689: NASA, ESA, J. Blakeslee (NRC Herzberg Astrophysics Program, Dominion Astrophysical Observatory), and K. Alamo-Martinez (National Autonomous University of Mexico)

Consequently the methods for the identification of high redshift overdensities have taken a mostly different road than the group searches at z < 1. Frequent techniques include the search for overdensities in photometric redshift samples (Spitler et al. 2012, Capak et al. 2011, Castignani et al. 2014), at times coupled with a spectroscopic follow-up, narrow-band imaging (in particular Hα) around known tracers, e.g. radio galaxies (Miley et al. 2006, Venemans et al. 2007, Hatch et al. 2011), or serendipitous detections. These methods are complemented by a first application of a “low redshift technique”, the friends-of-friends algorithm used in this thesis. The structures detected at high redshift are, as mentioned before, mostly gravitationally unbound, though overdense. As shown in Chapters 3 and 4, if carefully selected, most of these overdensities will however evolve into low-z groups or clusters.

Properties

In the discussion of possible environmental influences one should first of all note that proto-clusters are quite different from low redshift clusters, in the sense that most proto-cluster galaxies are actually centrals within their own dark matter halo, as opposed to the cluster galaxy population, which overwhelmingly consists of satellite galaxies. It has been established that almost all of the environmental effects in low redshift clusters occur in the satellite population (e.g. Peng et al. 2010 and many others). As this population is virtually non-existent in proto-clusters, one may not expect the same kind of processes to shape the proto-cluster galaxies. On the other hand, proto-clusters do represent overdense regions of the Universe, and it is plausible that environmental effects enter in different ways than at low redshift.


The proto-clusters reported in the literature exhibit quite different properties from their low redshift counterparts. First of all, there is no consensus on the actual influence (if any) of the proto-cluster environment on the member galaxies. Hatch et al. (2011) and Lemaux et al. (2014), for example, show proto-cluster galaxies to be more massive and to have a lower specific star-formation rate than co-eval field galaxies. Shimakawa et al. (2014) report increased star-formation rates in their proto-clusters, whilst Cucciati et al. (2014) find no evidence for environmental differentiation. Why those properties are so different from each other is as yet unclear. Explanations could be as simple as different selections chosen by the authors. It could also be that proto-clusters which seem to look very similar in terms of their galaxy overdensity are nonetheless at different stages of their evolution, possibly with different underlying halo masses. Finally, proto-clusters could simply be intrinsically different from each other, in which case the key question would be why they then evolve into (mostly) very similar-looking z = 0 clusters.

1.4 Gas and galaxies: The intergalactic medium at z ∼ 2

1.4.1 The Lyα-forest of QSOs

The Lyα-forest in the spectra of distant QSOs was first reported by Roger Lynds in 1971, when observing the QSO 4C 05.34 (Lynds 1971). It manifests itself as a grouping of absorption lines bluewards of the Lyα emission of the QSO, giving rise to the name “forest”. For quasars at higher redshift the number of lines in the forest is higher, until at a redshift of about 6 there is so much neutral hydrogen in the intergalactic medium (IGM) that the forest turns into a Gunn-Peterson trough with a complete suppression of any flux bluewards of the QSO Lyα-emission. This marks the end of the reionization of the Universe. The discovery of the Lyα-forest lines sparked a debate on their origin, as to whether they were caused by the QSO itself or originated from intervening systems. The interpretation prevailing now is that the Lyα forest stems from neutral hydrogen located along the line of sight between the QSO and the observer (Bergeron 1986, Petitjean et al. 1994). As the photons from the QSO travel through this gas, the ones possessing the energy of the transition from the ground state to the first excited state of the hydrogen atom get absorbed. We show the schematics in Figure 1.5, also comparing low and high-redshift QSOs. That we observe the Lyα-forest as a suite of lines, and not as a complete suppression of flux bluewards of the QSO Lyα-line, also shows that the IGM is mostly ionised with only traces of HI. The multitude of lines at various wavelengths is caused by the intervening gas lying at different distances from the observer and therefore causing Lyα lines at different redshifts. The “pure” Lyα-forest is only observable at wavelengths shorter than the QSO Lyα emission, but longer than the QSO Lyβ line. At wavelengths bluewards of the Lyβ emission, higher-order Lyman transitions start to enter and can be mistaken for Lyα lines. Within this window the Lyα forest is mostly comprised of Lyα transition lines, but is also contaminated at a low level by metal lines from intervening metal-line systems.
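The wavelength window of the “pure” forest and the redshift of an individual absorber follow directly from the rest wavelengths of the two lines. The sketch below assumes the standard values of 1215.67 Å (Lyα) and 1025.72 Å (Lyβ), which are not quoted in the text; the QSO redshift is an invented example.

```python
LYA_REST = 1215.67  # Angstrom (standard value)
LYB_REST = 1025.72  # Angstrom (standard value)

def forest_window(z_qso):
    """Observed wavelength range between the QSO Lyβ and Lyα emission lines."""
    return LYB_REST * (1.0 + z_qso), LYA_REST * (1.0 + z_qso)

def absorber_redshift(lambda_obs):
    """Redshift of the gas producing Lyα absorption at the observed wavelength."""
    return lambda_obs / LYA_REST - 1.0

lam_min, lam_max = forest_window(2.5)
z_min, z_max = absorber_redshift(lam_min), absorber_redshift(lam_max)
print(f"window {lam_min:.0f}-{lam_max:.0f} A, absorbers at {z_min:.2f} < z < {z_max:.2f}")
```

A single sightline towards a z = 2.5 QSO therefore probes intervening gas over a redshift interval of roughly Δz ≈ 0.5.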


Fig. 1.5 — Top: The Lyα-forest in the spectra of QSOs is explained by their light passing through intervening material on the way to Earth. The absorption lines arise from redshifted Lyα absorption. Middle and bottom: A quasi-local QSO in comparison to a z > 3 QSO, showing the strong redshift evolution of the Lyα-forest. Image credits: QSO spectra from Bill Keel's webpage: http://www.astr.ua.edu/keel/agn/forest.html; XSHOOTER image from http://www.eso.org/sci/publications/announcements/sciann14025.html; QSO from http://www.spacetelescope.org/videos/archive/category/blackholes/

Whilst for quite some time the Lyα-forest was understood as a consequence of discrete “spherical” absorbing clouds, this picture has now given way to instead discussing a continuous density field where the IGM traces the cosmic web. The Lyα forest of a QSO sightline can therefore be interpreted as tracing the one-dimensional gas overdensity. We distinguish three different regimes in the Lyα lines observable in QSO spectra, characterised by their HI column density $N_{HI}$:

1. $10^{12} < N_{HI} < 10^{16}$ cm$^{-2}$, the actual Lyα-forest: this optically thin, mostly ionised gas is thought to constitute the IGM, tracing relatively closely the cosmic web.

2. $10^{17} < N_{HI} < 10^{20}$ cm$^{-2}$, Lyman limit systems: this regime lies in between the Lyα-forest lines and damped Lyα-systems and is defined through being optically thick, however still significantly ionised. It is believed to be associated with low-mass galaxies that only have little ongoing star-formation.

3. $N_{HI} > 10^{20}$ cm$^{-2}$, damped Lyα systems (DLAs): At these high column densities the gas is self-shielding and, as opposed to the IGM, mostly neutral. DLAs are mostly interpreted as galaxy disks with a QSO-sightline passing through.
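The three regimes translate into a simple lookup. The sketch below uses exactly the boundaries listed above; column densities that fall outside the listed ranges (for instance between 10^16 and 10^17 cm^-2) are returned as unclassified rather than assigned to a neighbouring class, which is our choice. The function name is hypothetical.

```python
def classify_absorber(n_hi):
    """Classify an HI absorber by its column density in cm^-2."""
    if 1e12 < n_hi < 1e16:
        return "Lyα forest"
    if 1e17 < n_hi < 1e20:
        return "Lyman limit system"
    if n_hi > 1e20:
        return "damped Lyα system"
    return "unclassified"   # outside the ranges listed in the text

for n_hi in (1e14, 1e18, 1e21, 5e16):
    print(f"N_HI = {n_hi:.0e} cm^-2: {classify_absorber(n_hi)}")
```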

1.4.2 Using the Lyα-forest as a window to early structure

Lyα-forest observations have given rise to a number of applications, some of which we briefly summarise here.

Lyα-forest and circumgalactic medium

When combining high resolution QSO spectra with galaxy spectroscopy around the QSO sightline, it is possible to construct detailed maps of HI around those galaxies, as has for example been done by Rakic et al. (2012) or Rudie et al. (2012). Due to the focus on spectroscopically accessible regions around the QSOs, these studies are limited to rather small (≲ 5 cMpc) scales, but give a very accurate view of the gas reservoir around galaxies. Such an HI map can also be used to estimate the DM halo mass of the respective galaxies, independently of clustering analysis (Rakic et al. 2013).

Lyα-forest as a cosmological tool

As described before the Lyα-forest can be interpreted as fluctuations in the IGM, originating from primordial density fluctuations. It is thought to be a largely bias-free tracer of the underlying DM distribution. This can then be used to infer the DM power spectrum from the Lyα-forest of large samples of QSOs. Such studies usually need several tens of thousands of QSOs to achieve a reasonable S/N and are therefore restricted to large spectroscopic surveys. The Lyα-forest and QSO cross-correlation function in BOSS has for instance been used to measure the Baryonic Acoustic Oscillation scale and position (Font-Ribera et al. 2014, Slosar et al. 2013). Further applications of Lyα-forest data include the measurement of the IGM temperature through Lyα line profiles or estimating the mean baryonic density through measuring τ(z) and comparing it to simulations.

1.4.3 Measuring absorption lines

To quantify the strength of absorption lines detected in astronomical spectra is non-trivial due to the non-linear relation between the observed flux and the gas density. Furthermore such observations are affected by noise, instrument resolution and observing conditions. To address some of these issues a variety of tools to measure and interpret absorption lines have been developed, which we will briefly describe here. In studying absorption lines, we are not interested in the continuum of the spectrum of the object they are detected in, but only in the flux deficit arising from them. Consequently, for any applications we consider the spectrum to be normalised to unity within the continuum. Any absorption (in the absence of noise) will then have a flux value between zero and one.


Assuming that we can identify a distinct feature, we can now define the following useful quantities:

• Apparent optical depth: τ = −ln(F), where F is the normalised flux. The optical depth is a measure of the amount of light absorbed. τ << 1 means that a medium is optically thin and photons can pass through. In the optically thick case, τ >> 1, almost all of the light gets absorbed.

• Column density N: This refers to the density along a line of sight and therefore has units of number of absorbers per area (cm$^{-2}$). In the case of a well resolved line it can be calculated from the optical depth via $N = \frac{m_e c}{\pi e^2 f \lambda_0} \int \tau_\nu\, d\nu$, with $m_e$ being the electron mass, f the oscillator strength and λ0 the rest-frame wavelength of the transition.

• Equivalent width: $W = \int [1 - F(\lambda)]\, d\lambda$. W represents the width in wavelength of a hypothetical line which drops to F(λ) = 0 across the profile and has the same amount of absorption as the actual line. Equivalent widths are a useful tool as they do not depend on line-broadening effects (like thermal motion) or detector resolution.
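Two of the quantities defined above, the apparent optical depth and the equivalent width, can be computed directly from a continuum-normalised spectrum. The sketch below does so for a synthetic, noise-free Gaussian absorption line, for which the equivalent width is known analytically (depth × σ × √(2π)). The line parameters and function names are invented for illustration.

```python
import math

def optical_depth(flux):
    """Apparent optical depth per pixel, tau = -ln(F); requires F > 0."""
    return [-math.log(f) for f in flux]

def equivalent_width(wavelength, flux):
    """W = integral of (1 - F) d(lambda), evaluated with the trapezoidal rule."""
    w = 0.0
    for i in range(len(wavelength) - 1):
        d_lambda = wavelength[i + 1] - wavelength[i]
        w += 0.5 * ((1.0 - flux[i]) + (1.0 - flux[i + 1])) * d_lambda
    return w

# Synthetic line: Gaussian profile with central depth 0.5 and sigma = 1 Angstrom
centre, sigma, depth = 4000.0, 1.0, 0.5
wavelength = [centre - 10.0 + 0.01 * i for i in range(2001)]
flux = [1.0 - depth * math.exp(-0.5 * ((lam - centre) / sigma) ** 2) for lam in wavelength]

tau = optical_depth(flux)
w = equivalent_width(wavelength, flux)
print(f"max tau = {max(tau):.3f}, W = {w:.3f} A")
```

In real data, pixels with F ≤ 0 (from noise or saturation) have to be treated separately before taking the logarithm.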

The number of atoms of the absorbing species and the equivalent width are related by the so-called curve of growth, for which there exist three regimes: (i) the linear regime with τ << 1: W ∝ N. Measuring an equivalent width allows a direct measurement of the column density; (ii) the saturated regime: as the line gets saturated, an increase in the column density does not change the equivalent width much. Instead the equivalent width starts to depend on the line profile and we have $W \propto \sqrt{\ln(N)}$; (iii) for very strong lines the line starts to develop damping wings. As those dominate, $W \propto \sqrt{N}$ holds. The challenge in practice is to know where on the curve of growth a line is located, which in particular for lower resolution spectrographs is virtually impossible, as an otherwise saturated line may be smeared out to appear as a line in the linear regime. In the case of medium resolution spectra with a lot of lines, like the ones we are using in this thesis, it may not be possible to distinguish individual features because of line-blending; techniques like the equivalent width then fail, and instead one will mostly rely on apparent optical depth methods.
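For the linear regime (i) the proportionality can be made explicit. The sketch below uses the standard optically thin relation W = (π e² / m_e c²) f λ0² N in cgs units, which is not written out in the text, together with the Lyα oscillator strength f = 0.4164. Both are assumptions of this example, and the result is only meaningful for τ << 1.

```python
PI_E2_OVER_ME_C2 = 8.8528e-13   # pi e^2 / (m_e c^2) in cm (cgs, standard value)
LYA_REST_CM = 1215.67e-8        # Lyα rest wavelength in cm
LYA_F = 0.4164                  # Lyα oscillator strength (standard value)

def linear_equivalent_width(n_hi):
    """Rest-frame Lyα equivalent width in Angstrom for N_HI in cm^-2 (tau << 1 only)."""
    w_cm = PI_E2_OVER_ME_C2 * LYA_F * LYA_REST_CM ** 2 * n_hi
    return w_cm * 1.0e8         # convert cm to Angstrom

w_13 = linear_equivalent_width(1.0e13)
print(f"N_HI = 1e13 cm^-2 gives W = {w_13:.4f} A")
```

Applied to a saturated line this relation underestimates the column density, which is the practical difficulty described above.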

1.5 Outline

This thesis is organised as follows:

In Chapter 2 we will present the COSMOS and zCOSMOS surveys that are used throughout this work as well as their properties where relevant for later application. Furthermore we discuss the Millennium simulation and its derived data products, describing how to use it to understand and interpret observations of high-redshift overdensities.

Chapter 3 then describes a systematic approach to the identification and analysis of high-redshift proto-groups within zCOSMOS-deep, presenting a first catalogue of such overdensities. We develop and apply the techniques to use observations and simulations likewise to obtain a more complete picture of the nature of those structures.


In Chapter 4 we detail follow-up observations of a number of those proto-groups, which led to the discovery of a z = 2.45 proto-cluster whose properties we describe. In particular we study its evolution to a massive z = 0 cluster and search for possible signatures of environmental differentiation.

We then move on to the study of the IGM at z ∼ 2, describing our Lyα-forest data set in Chapter 5. This sample consists of eight QSO sightlines within the zCOSMOS-deep field, observed with XSHOOTER and partially UVES. We detail the steps in data reduction, continuum fitting and estimates of optical depth. Furthermore we compare with higher resolution UVES data to estimate possible effects of the lower XSHOOTER resolution on our estimates of the optical depths.

Chapter 6 discusses a first application of this data set to infer the reliability of redshifts assigned to a given zCOSMOS-deep redshift confidence class. We use the information of the optical depth distribution in the vicinity of galaxies with redshifts in the respective classes to estimate the fraction of correctly assigned redshifts.

In Chapter 7 we perform a cross-correlation analysis of the Lyα-forest and zCOSMOS-deep galaxies, characterising the cross-correlation function on scales of ≳ 1 cMpc and ≲ 60 cMpc. We detect the signature of redshift space distortions, which we compare to the expected model describing the large scale infall of matter.

We finally present a summary and conclusions in Chapter 8 and develop some ideas for future research in Chapter 9. This thesis is concluded by Chapter 10, where we describe a new algorithm developed and implemented for the VLT/KMOS instrument. It provides an automated procedure to find the optimal calibration position for the arms of the instrument, which had to be done manually up to then.

Chapter 2

Data and simulations

In this chapter I will describe data sets and simulations that I have been using extensively throughout this thesis, but did not produce myself. These are primarily the zCOSMOS-deep sample, as well as various photometry catalogues from the COSMOS survey. Furthermore they include the Millennium simulation and a variety of data products derived from it (mock lightcones, galaxy catalogues, etc.). The chapter will be concluded with a description of the principles of how to use this simulation to infer information on early structure in the Universe, a technique which I have developed and applied, as described in Chapters 3 and 4.

2.1 The COSMOS survey

The Cosmological Evolution Survey (COSMOS) is a large campaign designed to better understand galaxy formation and evolution over a large range of redshifts. An overview and comparison to other surveys is shown in Figure 2.1. Its primary scope was to obtain imaging in ∼30 passbands over the 2 deg² COSMOS field, centered at RA=10:00:28.600 and Dec=+02:12:21.00 (Scoville et al. 2007). Observations of the ∼2 million COSMOS objects have been enhanced by additional surveys such as the Spitzer S-COSMOS or the spectroscopic zCOSMOS survey. Thanks to the wealth of pre-existing data, it is also the target of ongoing observational efforts. In this work we use a variety of data products from within COSMOS, described in further detail in the following sections.

2.2 The zCOSMOS-deep redshift sample

zCOSMOS observed a total of ∼30’000 galaxies in the COSMOS field, adding spectroscopic information to the COSMOS survey to permit in particular the study of large scale structure, groups and clusters. It is divided into two parts: 1) the zCOSMOS-bright sample on the full 2 deg², targeting purely i-band selected objects at mostly z < 1 and observing 20’000 galaxies in total, and 2) the zCOSMOS-deep sample, used in this work, which added the more time-consuming and challenging high-redshift dimension and will be described in the next paragraphs.

Fig. 2.1 — Left panel: Grey scale ACS image of the COSMOS field. Right panel: Comparison of COSMOS to other extra-galactic surveys. The yellow boxes show the CANDELS fields. Whilst the surveys on the left part of the image mostly aim at depth, COSMOS belongs to the type of shallower surveys that nevertheless sample representative volumes. This makes it particularly suited to the present purpose of studying large scale structures. Image credit: http://irsa.ipac.caltech.edu/data/COSMOS/images/acs mosaic 2.0/ (ACS image), http://ned.ipac.caltech.edu/level5/March14/Madau/Figures/figure6.jpg (overview)

2.2.1 Sample selection and observations

The zCOSMOS-deep redshift survey (Lilly et al. 2007, Lilly et al. in prep.) has observed around 10’000 galaxies in the central ∼1 deg² of the COSMOS field. The selection criteria for the zCOSMOS-deep targets were quite complicated. All objects were colour-colour selected to preferentially lie at high redshifts, mostly through a BzK colour selection (cf. Daddi et al. 2004) with a nominal KAB cut at 23.5, supplemented by the purely ultraviolet ugr selection (cf. Steidel et al. 2004). An additional blue magnitude selection was adopted, which for most objects was BAB < 25.25. These selection criteria yield a set of mainly star-forming galaxies predominantly lying in the redshift range 1.4 < z < 3 (Lilly et al. 2007). These targets were then observed with the VIMOS spectrograph at the VLT using the low resolution LR-Blue grism, giving a spectral resolution of R = 180 over a spectral range of 3700 − 6700 Å. Each mask was observed for a total of 4.5 hr under seeing conditions of < 1.2″. The zCOSMOS-deep data have been reduced with the VIPGI software (Scodeggio et al. 2005). The spatial sampling of zCOSMOS-deep is such that a central region of 0.6° × 0.62° was covered at approximately 67% sampling, whilst the outer region, extending to 0.92° × 0.91°, was sampled at a lower completeness. Both regions are centered on 10:00:43 (RA) and +02:10:23 (Dec). The spatial positions of all zCOSMOS-deep objects are shown in Figure 2.2, which also indicates the extent of the highly sampled central area (red square).

Fig. 2.2 — [Main panel: Dec versus RA, with RA running from 150.6 to 149.8 and Dec from 1.8 to 2.6; side panels: histograms of the number of galaxies (all galaxies and successes) along RA and Dec.] We show the spatial distribution of zCOSMOS-deep objects. Galaxies with an assigned redshift are plotted in blue, cases where no conclusive redshift determination was possible in grey. The central, highly sampled area of the survey is marked by the red square. The absence of objects in the top-left corner of the central area is due to a masked star. The smaller panels show histograms of the RA and Dec distributions: there is a clear drop outside of the central area. As expected, the probability of successfully assigning a redshift was uniform in RA and Dec.

2.2.2 Redshift determination and reliability

The zCOSMOS-deep redshifts were measured using a combination of computer-based cross-correlation with template spectra and visual inspection. In particular each redshift was determined through at least two independent reduction channels and then “reconciled”. Together with a redshift, each object is assigned a confidence class, reflecting the reliability of the corresponding redshift. These classes are as follows (as described in Lilly et al. 2007):

• Class 0: No redshift assigned.

• Class 1: Uncertain redshift based on a pure guess.

• Class 2: Moderately uncertain redshift. This redshift is more likely than the other possibilities, but there is potential for error.


• Class 3: Secure redshift with little possibility for error, but the spectrum may not be perfect.

• Class 4: Very secure redshift with a “textbook” spectrum.

• Class 9: These redshifts have been determined on the basis of a single narrow emission line, e.g. OII (3727 Å), Lyα and Hα.

• Class +10: Added to the above flags in the case of a broad line AGN. Class 9 will transform into “18” instead of “19” in this case to reflect that the possible lines are different than for the narrow lines.

• Class +20 or +200: This is also added to the previous classes to indicate that the object was a secondary target in the slit.

These confidence classes were further refined by decimal point modifiers, to indicate agreement with the corresponding photometric redshifts:

• Modifier 0.5: Consistent photometric and spectroscopic redshift, i.e. |z_phot − z_spec| < 0.1(1 + z).

• Modifier 0.4: No photometric redshift available.

• Modifier 0.1: Spectroscopic and photometric redshifts are not consistent.

In this thesis we used two versions of the resulting zCOSMOS-deep catalogue, version 2.5 and version 2.6. Both are close to the final data products, differing only in the redshift and/or flag of a few objects. The reason for the use of these two catalogue versions is simply that the most recent version was not available when conducting some of the studies presented here. Given the negligible differences between the two catalogue versions, we do not expect our results to change.

Table 2.1 summarises the distribution of flags in zCOSMOS-deep for the two catalogue versions. For classes 3 and 4 the fraction of x.1 flags is naturally lower than for class 2 and, in particular, class 1: the high number of 1.1 flags reflects the fact that class 1 redshifts are low-confidence guesses, resulting in more inconsistency with the photometric redshifts. In total 9523 (9522) galaxies are included in the final catalogues of v2.5 (v2.6). Repeat observations, including some with the higher resolution FORS2 spectrograph, indicate a typical velocity error of around 300 km/s in the redshifts.

Depending on the application one may consider different samples as “reliable enough” redshifts. Mostly a sample consisting of flags 3, 4, 9.5, 1.5 and 2.5 or a sample consisting of 3, 4, 9.5 and 2.5 will be usable. These translate to success rates of 60% and 40% respectively, with the success rate being defined as the number of usable redshifts divided by the number of observed objects. The zCOSMOS-deep redshift distribution is shown in Fig. 2.3. At z ≳ 2 the success rate increases, as the Lyα line enters the spectral range of VIMOS. This is also the redshift range we are most interested in. In Chapter 6 we provide an independent assessment of the redshift reliability of the zCOSMOS galaxies, using the strength of Lyα absorption around galaxies of the different classes as an indication of the percentage of correctly assigned redshifts.
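The flag scheme described above can be unpacked mechanically. The following is a minimal sketch, not the zCOSMOS pipeline: the names `decode_flag` and `is_usable` are hypothetical, and treating the +10 and +20/+200 variants of a flag as equally usable as the plain flag is an assumption made here for illustration.

```python
from typing import NamedTuple


class DecodedFlag(NamedTuple):
    conf_class: int       # confidence class: 0, 1, 2, 3, 4 or 9
    broad_line_agn: bool  # True if +10 was added to the class
    secondary: bool       # True if +20 or +200 was added to the class
    photoz_modifier: int  # decimal digit (5, 4 or 1); 0 if there is none


def decode_flag(flag: float) -> DecodedFlag:
    """Split a flag such as 213.5 into class, offsets and decimal modifier."""
    tenths = round(flag * 10)  # integer arithmetic avoids float artefacts
    modifier = tenths % 10
    base = tenths // 10

    secondary = False
    if base >= 200:            # secondary target, +200 variant
        secondary, base = True, base - 200
    elif base >= 20:           # secondary target, +20 variant
        secondary, base = True, base - 20

    broad_line = base >= 10
    if broad_line:
        base -= 10
    # A broad-line single-line redshift is written as 18 rather than 19.
    conf_class = 9 if (broad_line and base == 8) else base
    return DecodedFlag(conf_class, broad_line, secondary, modifier)


def is_usable(flag: float, include_class1: bool = True) -> bool:
    """Flags 3, 4, 9.5, 2.5 and (optionally) 1.5, ignoring the +10/+20 offsets."""
    decoded = decode_flag(flag)
    if decoded.conf_class in (3, 4):
        return True
    needs_photoz = {2, 9} | ({1} if include_class1 else set())
    return decoded.conf_class in needs_photoz and decoded.photoz_modifier == 5
```

Calling `is_usable(flag)` on every catalogue entry and dividing the number of `True` results by the number of observed objects gives the success rate as defined in the text; setting `include_class1=False` corresponds to the stricter sample without the 1.5 flags.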


Tab. 2.1: Flags in zCOSMOS-deep for the two catalogue versions, v2.5 and v2.6.

Flag     No. of objects v2.5   Percentage v2.5   No. of objects v2.6   Percentage v2.6
0        1750                  18.4              1753                  18.4
1.1      1406                  14.8              1334                  14.0
1.4      0                     0                 14                    0.1
1.5      1757                  18.5              1815                  19.1
2.1      560                   5.9               505                   5.3
2.4      1                     0                 8                     0.1
2.5      1433                  15                1481                  15.6
3.1      136                   1.4               98                    1.0
3.4      3                     0                 17                    0.2
3.5      1182                  12.4              1206                  12.7
4.1      59                    0.6               40                    0.4
4.4      11                    0.1               22                    0.2
4.5      535                   5.6               543                   5.7
9.1      46                    0.5               37                    0.4
9.3      0                     0                 6                     0.1
9.4      0                     0                 1                     0
9.5      96                    1                 98                    1.0
10       3                     0                 0                     0
11.1     3                     0                 3                     0
11.5     3                     0                 3                     0
12.1     6                     0.1               5                     0.1
12.5     15                    0.2               16                    0.2
13.1     8                     0.1               3                     0
13.4     1                     0                 1                     0
13.5     64                    0.7               69                    0.7
14.1     13                    0.1               12                    0.1
14.4     3                     0                 6                     0.1
14.5     87                    0.9               85                    0.9
18.1     3                     0                 4                     0
18.5     0                     0                 1                     0
19.1     2                     0                 0                     0
20       133                   1.4               133                   1.4
21.1     57                    0.6               53                    0.6
21.5     47                    0.5               51                    0.5
22.1     19                    0.2               18                    0.2
22.5     20                    0.2               21                    0.2
23.1     2                     0                 0                     0
23.5     20                    0.2               22                    0.2
24.1     3                     0                 2                     0
24.4     1                     0                 2                     0
24.5     17                    0.2               17                    0.2
29.1     3                     0                 3                     0
29.5     10                    0.1               10                    0.1
211.1    1                     0                 1                     0
213.5    2                     0                 2                     0
214.5    1                     0                 1                     0
total    9523                  -                 9522                  -

Fig. 2.3: Redshift distribution (N_gal against z_spec) of the zCOSMOS-deep galaxies of catalogue v2.5. Differences between this and v2.6 are negligible. The grey shaded area displays all objects with a redshift assigned. The solid line shows all “reliable” redshifts corresponding to flags 1.5, 2.5, 3 and 4 and the dashed line excludes 1.5. The overall shape of the distribution is preserved across the different selections, although the number of objects decreases.

2.3 The photometric redshift samples

As mentioned above, the selection of galaxies in zCOSMOS-deep involved a cut in the B magnitude. The spectroscopic sample is therefore highly incomplete in mass and will be biased against red objects that are quiescent or which have high dust reddening. We will be interested in statements on the influence of environment on galaxy properties like mass or SFR, studying for instance whether there is any reason to believe that galaxies in overdense high redshift environments are experiencing environmental effects. Whilst the zCOSMOS sample is, thanks to the rather precise spectroscopic redshifts, ideal to identify any overdensities, a statement about the population of galaxies within those overdensities must be based on photo-z sample(s), despite the high redshift uncertainties therein. To this end, depending on the application, we used two photo-z samples, described in the following two sections.

2.3.1 The COSMOS photometric redshift sample

The COSMOS photometric catalogue is a purely i-band selected sample (Capak et al. 2007), providing 30 passband photometry ranging from the UV to the mid-IR. It includes 2’000’000 objects. The corresponding photometric redshifts have been derived in Ilbert et al. (2009), using the “Le Phare” code, which is based on χ² minimization through template fitting. The templates were chosen to reproduce the observed range of colours and galaxy types. The accuracy of the photo-z estimates was assessed via comparison to zCOSMOS spectroscopy, and was found to have an uncertainty of σ_∆z/(1+z) = 0.03, including a 20% catastrophic failure rate. Out of the 2’000’000 objects, 375’000 lie within the zCOSMOS-deep field and fulfil 1.5 < z_phot < 3.

Fig. 2.4: Top panel: Spatial distribution of COSMOS (blue) and UVISTA (black) galaxies in the zCOSMOS-deep field. The central area of zCOSMOS-deep is marked by the red rectangle. The empty regions are due to masked stars. Bottom panel: The photometric redshift distribution of both COSMOS (blue) and UVISTA (black), including only galaxies lying in the zCOSMOS-deep field. For presentation reasons, the distributions are normalised by the respective total numbers.


Fig. 2.5: Stellar mass distribution (normalised counts N_norm against log stellar mass in M⊙) in the COSMOS (blue) and UVISTA (black) samples of all galaxies within the zCOSMOS-deep field with 1.5 < z_phot < 3. Due to the i-band cut in COSMOS, the high-mass end of the distribution is not included there. For presentation reasons, the distributions are normalised.

2.3.2 The UVISTA photometry and photometric redshift sample

The UVISTA sample comprises NIR imaging of the COSMOS field undertaken with the VISTA telescope at Paranal, with VIRCAM YJHK and narrowband imaging. The sample contains in total 340’000 K-selected objects and is 95% complete down to K_AB = 23.8. This corresponds to an approximate mass completeness limit of ∼ 10^10 M⊙ (McCracken et al. 2012). In particular we use the photometric redshifts generated for the UVISTA sample as described in Ilbert et al. (2013). Again, they used the “Le Phare” code, with 33 templates spanning elliptical and spiral galaxies. The UVISTA photometry was augmented by the COSMOS passbands, which at z > 2 mostly provided mere upper limits. The precision of the resulting photo-z is of order 3% for star-forming galaxies and σ_∆z/(1+z) = 0.056 for quiescent galaxies. This corresponds to a velocity error of about 10’000-15’000 km/s. In addition to the photo-z estimates, Ilbert et al. (2013) also provide stellar masses and SFRs, again obtained by fitting synthetic spectra to the photometry. Of the 340’000 objects in the UVISTA sample, 39’000 lie within the zCOSMOS-deep field and at 1.5 < z_phot < 3. Figures 2.4 and 2.5 compare the COSMOS and the UVISTA photo-z samples. Whilst the COSMOS sample is much larger than the UVISTA one, it has a clear deficiency of higher mass galaxies. This arises from the purely optical selection criterion in Capak et al. (2007).
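As a rough check of the velocity error quoted above, a fractional redshift uncertainty can be converted to a line-of-sight velocity by multiplying with the speed of light. The sketch below assumes that the "of order 3%" precision for star-forming galaxies is to be read as σ_∆z/(1+z) ≈ 0.03 and uses the rounded value c ≈ 3 × 10^5 km/s; the results are of the same order as the range given in the text.

```latex
\Delta v \simeq c\,\sigma_{\Delta z/(1+z)}, \qquad
0.03 \times 3\times10^{5}\ \mathrm{km\,s^{-1}} \approx 0.9\times10^{4}\ \mathrm{km\,s^{-1}}, \qquad
0.056 \times 3\times10^{5}\ \mathrm{km\,s^{-1}} \approx 1.7\times10^{4}\ \mathrm{km\,s^{-1}}
```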


2.4 The Millennium simulation and its derived mock catalogues

It can be very illustrative to use simulations for a better understanding of the observed Universe. A large dark matter simulation like the Millennium simulation enables us to study the DM distribution whilst semi-analytic models describe how galaxies link to it. This formulation allows us to follow both haloes and individual galaxies through cosmic time and therefore to determine the likely evolution of a structure. In the context of studying associations of galaxies as putative “groups” or “proto-groups” those simulations are particularly powerful. By analysing analogous associations to those observed in the sky we can directly access the underlying DM distribution and infer whether a putative group is indeed a group or proto-group. Obviously statements on galaxy properties from simulations can only be as accurate as the semi-analytic models (SAM) describing them. In our study we are mostly interested in replicating the correct number densities of galaxies, and we adjust the observational parameters (e.g. magnitudes) to match the number densities in the observations. On the other hand the exact values of the cosmological parameters (in particular σ8) impact statements on the DM structure, lower values for σ8 leading to less evolved structure at a given redshift. Indeed the preferred value for σ8 now, in comparison to the one used in the Millennium simulation, is lower; we perform a test for the impact of that in Chapter 3, finding consistent results. Whilst we will therefore not make any statements on galaxy properties based on the simulation, our analysis should be able to provide information on the likely nature of the candidate groups.

2.4.1 The Millennium Simulation

The Millennium Simulation is a large dark matter N-body simulation carried out in a cubic box of 500 h⁻¹ Mpc side length. Based on a ΛCDM cosmology, assuming WMAP1 parameters, it starts from a glass-like distribution of particles that is perturbed by a Gaussian random field and follows the evolution of dark matter particles from z = 127 to z = 0. The results are stored in 64 snapshots, spaced logarithmically in redshift, with close coverage starting from z = 20 (see Figure 2.6). From these dark matter particles merger-trees are built up through a friends-of-friends identification of gravitationally bound haloes, which in post-processing are populated with galaxies (Springel et al. 2005, Lemson & Springel 2006). Several semi-analytic models (SAM) for the galaxy formation process have been implemented on top of the dark matter structure of the Millennium simulation. This simulation is particularly well suited to our purposes: whilst we do not require the smallest haloes to be resolved, its large box size permits the detection of rare massive objects.

In this thesis we make use of two different sets of lightcones that each draw their mock galaxies from a different SAM. The Kitzbichler & White (2007) lightcones rely on the prescription from De Lucia & Blaizot (2007); we use these lightcones in Chapter 3. The more recent Henriques et al. (2012) lightcones, in which the galaxies are drawn from the Guo et al. (2011) SAM, serve as a basis for Chapter 4. The two SAMs mostly differ in their treatment of satellite galaxy evolution and mergers. It is important to note again that in the work we present here, the exact details of the, at times inaccurate, SAMs are not crucial, as we are mostly interested in the correct number counts of galaxies (and then the underlying DM) rather than the details of the galaxy properties.

Fig. 2.6: Illustration of the 64 snapshots of the Millennium simulation (plotted against snapshot number), with the corresponding redshifts on the left-hand axis and the look-back time on the right-hand axis.

2.4.2 The Kitzbichler lightcones

The Kitzbichler & White (2007) mocks are based on the galaxy formation semi-analytic model (SAM) as described in De Lucia & Blaizot (2007). The six independent mock lightcones provide “observations” of a 1.4° × 1.4° field in which the identities of the galaxies are linked to the Millennium Simulation haloes. These lightcones are constructed with an observer at redshift z = 0 using a periodic replication of the simulation box to cover high redshifts (Blaizot et al. 2005). This will inevitably lead to the eventual duplication of objects. However, for the field size of 1.4° × 1.4° the first duplicate will appear around z ∼ 5, which is beyond the redshift range we are interested in. Each lightcone is based on a different observer and pointing vector and can therefore be regarded as an independent survey in terms of large scale structure at high redshifts. The mocks give the positions of galaxies in RA and Dec, as well as the observed redshifts, including the effect of peculiar velocities.


2.4.3 The Henriques lightcones

Henriques et al. (2012) construct their lightcones from the Millennium simulation volume with the implementation of the SAM described in Guo et al. (2011). They follow the description by Kitzbichler & White (2007) for the periodic replication of the simulation box needed to achieve cones that cover a wide redshift range, and assign galaxy redshifts according to the comoving distance of galaxies to a virtual observer at z = 0. The resulting 24 lightcones cover an area of 1.4 × 1.4 deg² each.

2.4.4 Application of the mocks and the simulation

The mock lightcone catalogues as well as the galaxy or halo catalogues at each snapshot are obtainable from the Millennium simulation database. As a user one then has to connect these and develop the technique to link them to the observations and, in our case, analyse the evolution across cosmic time. We describe these techniques and the required links in the following.

The mock catalogues provide an “observer’s perspective” of simulated galaxies, by listing quantities like redshift, RA, Dec and observed-frame magnitudes, some of which are derived from the inherent properties of a galaxy. However, these inherent properties are not stored in the mock catalogue itself: to access properties like SFRs and stellar masses, and further the DM information, some connection to the full simulation data is needed. This link is provided by the “galaxyId”, which connects a galaxy in the mock catalogues to its entry in the galaxy catalogues of the SAM database. Any galaxy or association of galaxies identified in the mock catalogue can therefore be selected in the simulation. The galaxyId itself is unique throughout the simulation and only identifies a galaxy at one given snapshot. Galaxies can nevertheless be followed through cosmic time via the “descendantId”: the galaxyId of the same galaxy in the next snapshot. In case of a merger the descendantId becomes identical for the galaxies involved.

With respect to groups and clusters of galaxies, it is also possible, at any stage, to determine whether a candidate group of galaxies actually constitutes a group. By definition, the members of a group must share a common dark matter halo. Within the simulation this common dark matter halo is identified via the “fofId”: the ID of the subhalo at the centre of a FOF-group. This ID is shared by all galaxies within that group. One can think of this as a “background” halo containing all subhaloes of galaxies that fell into the group in question, and therefore constituting the common group halo. The relation of these identities is sketched out in Figure 2.7.

Fig. 2.7: Representation of how the different IDs within the Millennium simulation and the derived mock catalogues relate to each other, together with the properties accessible through a linked ID (see text for details). For example, knowing the fofId of a group allows one to determine the mass of each constituent subhalo in addition to the group halo itself.

The mock catalogues provide a continuous redshift coverage via an interpolation in between the discrete snapshots of the simulation itself. When identifying a candidate group in the mock catalogue and linking it to the simulation we therefore access it on the closest snapshot (in total 6 snapshots cover the range 2 < z < 3). From that snapshot we then identify the candidate group in each subsequent snapshot until reaching the one at z = 0.

In this thesis we focus on the galaxies themselves: whenever we follow the evolution of a given structure, we do so via the galaxies it contains. Obviously one can also perform the same exercise using the DM haloes. This set-up then permits us to access the following key information:

1. The identity of a galaxy or group of galaxies at high redshift, and whether its members share a common halo, as defined by a common fofId.

2. The tracing of these galaxies through cosmic time, evaluating at each snapshot which (potentially common) halo they occupy, by using the descendantId to identify them at the next snapshot.

3. The identification of any galaxies that newly accrete onto a given halo and have not been identified previously, by searching for additional galaxies with the same fofId within the simulation.

4. The identification of cluster/group galaxies at z = 0 and their progenitors back to z ∼ 2, by searching for haloes of a given mass at z = 0 and tracing them backwards in time.

These techniques will be applied extensively in Chapters 3 and 4, where we study proto-groups and proto-clusters.
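The ID bookkeeping described in this section can be illustrated with a small in-memory sketch. It is not a query against the Millennium database: the `Galaxy` record, the dictionary `catalogue` and the function names are hypothetical, and all members are assumed to start at the same snapshot, with each descendantId pointing to the next snapshot.

```python
from typing import Dict, List, NamedTuple, Optional, Sequence


class Galaxy(NamedTuple):
    snapnum: int                  # snapshot at which this galaxyId is valid
    descendant_id: Optional[int]  # galaxyId at the next snapshot (None at z = 0)
    fof_id: int                   # ID shared by all galaxies in one FOF halo


def trace(catalogue: Dict[int, Galaxy], galaxy_id: int) -> List[int]:
    """Chain of galaxyIds from the given galaxy down to the final snapshot."""
    chain = [galaxy_id]
    while catalogue[chain[-1]].descendant_id is not None:
        chain.append(catalogue[chain[-1]].descendant_id)
    return chain


def first_common_halo(catalogue: Dict[int, Galaxy],
                      member_ids: Sequence[int]) -> Optional[int]:
    """Snapshot at which all members first share one fofId, or None if never."""
    chains = [trace(catalogue, g) for g in member_ids]
    for step in range(min(len(c) for c in chains)):
        ids = [c[step] for c in chains]
        if len({catalogue[g].fof_id for g in ids}) == 1:
            return catalogue[ids[0]].snapnum
    return None


# Toy catalogue: three galaxies over three snapshots; galaxies 2 and 3 merge.
catalogue = {
    1: Galaxy(0, 11, 100), 2: Galaxy(0, 12, 200), 3: Galaxy(0, 13, 200),
    11: Galaxy(1, 21, 200), 12: Galaxy(1, 22, 200), 13: Galaxy(1, 22, 200),
    21: Galaxy(2, None, 200), 22: Galaxy(2, None, 200),
}
```

In the toy catalogue, two galaxies that end on the same galaxyId (here 2 and 3, both ending on 22) have merged, which mirrors the statement that the descendantId becomes identical for the galaxies involved in a merger.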

Part II

Proto-groups and proto-clusters

Chapter 3

Proto-groups at 1.8 < z < 3 in zCOSMOS-deep

This chapter is based on work published in Diener et al. 2013, ApJ, 765, 109D. I did a preliminary study on “Galaxy groups at redshift z ∼ 2 − 3” for my MSc thesis, mostly in preparation for a VLT/FORS2 run in March 2011. The work presented here has been much extended during my PhD. There is, however, a small overlap in three of the plots, which already appeared in a preliminary form in my MSc thesis (this is indicated where applicable), although they have changed considerably since then.

3.1 Introduction

As already detailed earlier on, identifying and characterising galaxy groups and clusters is important for studying the growth of structure as well as galaxy evolution, owing to the influence the group environment is believed to have on the member galaxies. Identifying groups using discrete galaxies as a tracer sample is, however, a non-trivial task, even at z < 1 where extensive spectroscopic surveys are available. Previous work at low and intermediate redshift therefore extensively discusses the performance of different group finders in terms of the underlying dark matter haloes. Common automated (spectroscopic) group finding methods are the friends-of-friends method (Huchra & Geller 1982, Eke et al. 2004, Berlind et al. 2006), the Voronoi-Delaunay method (Marinoni et al. 2002, Gerke et al. 2005, Cucciati et al. 2010) or a combination of both (Knobel et al. 2009, 2012).

Little is known about groups at z > 1, mostly because few redshift surveys have penetrated beyond this depth with a high enough sampling density to have any hope of finding any except the most massive. The redshift interval around z ∼ 2 is of interest for several reasons. This is, as will become clear in this chapter, when the first groups consisting of multiple massive (around M*) galaxies should appear in the Universe in significant numbers. It is also close to the peak of star-formation (Hopkins & Beacom 2006, Reddy et al. 2008) and AGN activity (Wolf et al. 2003) in the Universe, and where we might expect the first effects of the environment in controlling galaxy evolution to become apparent.


Above a redshift of z ∼ 2 there exist only rare examples of single clusters or groups in the literature. The search for them relies on overdensities around radio galaxies (Miley et al. 2006, Venemans et al. 2007, Hatch et al. 2011), the search for X-ray emission (Gobat et al. 2011) as well as overdensities identified with photometric redshifts (Spitler et al. 2012, Capak et al. 2011, Trenti et al. 2012). Some of these high redshift clusters have been confirmed spectroscopically (Papovich et al. 2010, Steidel et al. 2005, Tanaka et al. 2010 and Gobat et al. 2011).

However, so far there has been no systematic analysis of high redshift groups in spectroscopic redshift surveys. As described in Chapter 2, the zCOSMOS-deep survey provides a large sample of galaxies at z > 1, including 3502 galaxies with usable redshifts in the redshift interval 1.8 < z < 3 in a single, fairly densely sampled region of sky (Lilly et al. 2007, Lilly et al. in prep.), allowing the application of the same sort of algorithm as has been used to identify groups at z < 1. The aim of this study is to apply such a low redshift technique to the zCOSMOS sample and perform a careful study of the nature of the identified structures, followed by an analysis of their observational properties. This should then also provide an idea of the feasibility of automated group-finding in spectroscopic samples at z > 2.

More specifically, we identify possible groups at 1.8 < z < 3, based on a simple linking length algorithm, providing a catalogue of 42 such associations. In order to understand the physical nature of these detected structures, we have carried out extensive comparisons with mock catalogues that have been generated by Kitzbichler & White (2007) and then passed through the same “group-finding” algorithms. The primary aim is to assess whether the galaxies in these structures are indeed already occupying the same dark matter halo. We can, however, also use the mocks to follow the future fate of each galaxy and thus to see when, if ever, the candidate member galaxies will be in the same halo, whether they will merge with other galaxies and so on, and what the structures identified at high redshift are likely to become by the present epoch.

This chapter is organized as follows: We first give a brief recapitulation of the zCOSMOS-deep sample and the mock catalogues (both described in detail in Chapter 2) used to calibrate and analyse our group catalogue. In Section 3.3 we develop our group-finder algorithm on the basis of comparisons with the mocks, and produce the catalogue of 42 associations. In Section 3.4 we first carry out an extensive analysis of the mocks to see what they indicate for (a) the nature of the structures that we detect at z ≳ 2, (b) how they develop over time, down to z ∼ 0, and (c) how representative they are of the population of progenitors of massive haloes today. We then examine a complementary photo-z sample and identify a significant excess of massive galaxies in the regions of the groups, but do not find evidence for any colour differentiation of the population relative to the field, although we argue we should probably not have expected to see such differentiation. We then conclude the chapter and summarize our findings.

Where needed we adopt the following cosmological parameters (consistent with the Millennium simulation, Springel et al. 2005): Ωm = 0.25, ΩΛ = 0.75 and H0 = 73 km s⁻¹ Mpc⁻¹. All magnitudes are quoted in the AB system.


3.2 Data

3.2.1 The zCOSMOS-deep sample

The zCOSMOS-deep survey has been described in detail in Chapter 2. It observed around 3500 galaxies at z ∼ 2, with a total sampling rate of about 50%. The rather precise spectroscopic redshifts, with errors of order 300 km/s (as opposed to photometric redshift errors of ∼ 10’000 km/s), together with the sampling, which is dense for this redshift, and the large field, make it an ideal survey for the search for overdensities.

To minimise the effect of chance alignments due to insecure redshifts, we only use galaxies with flags 3, 4, 1.5, 2.5 and 9.5 in this chapter, meaning that the corresponding redshifts are either secure on their own or confirmed by the respective photometric redshifts. Furthermore, we restrict our analysis to the redshift range 1.8 < z < 3, where the success rate in measuring secure redshifts is highest because of the entrance of strong ultraviolet absorption features into the spectral range. The final sample used consists of 3502 objects from the catalogue in Lilly et al. (in prep.). In the central 0.36 deg² region the overall sampling rate of this sample relative to the target catalogue is about 55%. The comoving number density is 6.1 × 10⁻⁴ Mpc⁻³.

3.2.2 Mock catalogues

The Millennium Simulation

As detailed in Chapter 2, the Millennium simulation follows the DM evolution throughout cosmic time, with the galaxy information being added in post-processing. The mock light-cones which supplement the simulation provide an observer’s perspective. If these mocks are carefully adjusted to replicate the observational situation, the combination of them and the simulation allows us to study the nature of the identified candidate groups. Further, the structure of the Millennium simulation allows us to follow both haloes and individual galaxies through time and therefore to determine the subsequent evolution of group-like structures that are identified at a particular redshift (Lemson et al. 2006). If the simulations are reasonable approximations to the real Universe, they are therefore ideal for the present purposes of trying to understand the physical nature of corresponding objects in the sky.

Sample selection

In this chapter we make use of the six independent light-cones from Kitzbichler & White (2007). In order for the mock catalogues to be usable for comparison with the zCOSMOS-deep sample, we carefully matched them to the observations. We first added a straightforward observational velocity error to each galaxy redshift by adding a velocity selected randomly from a Gaussian distribution with σ_v = 300 km/s. The main concern is to match the number densities of galaxies in the actual zCOSMOS sample and in the mocks. Starting with the set of all galaxies in the mocks, we applied limiting magnitudes in B and K. Small adjustments to the nominal B_AB < 25.25 and K_AB < 23.5 limits from zCOSMOS were then made above and below z ∼ 2 so as to match as well as possible the shape of the N(z) number counts of objects in the actual data, i.e., so that $s = \sum_{\rm mocks} \sum_z \left(N_{\rm data}(z) - N_{\rm mocks}(z)\right)^2$ was minimized.

Given the overall sampling (spatial sampling times spectroscopic success rate) of zCOSMOS-deep in this redshift range, we constructed, through these small magnitude adjustments, a mock sample that had exactly twice the surface number density of the final spectroscopic sample in the highly sampled central region. This meant that a final division of the mock sample into two via random sampling could be used to simulate the ∼50% sampling of the spectroscopic data and yield a second, complementary, mock sample from the same light cone. This is useful for seeing the effects of the sampling as well as for doubling the number of mock samples.

It should be emphasized that the goal of this exercise was to produce a mock sample that had the correct N(z) and was similarly dominated by star-forming galaxies (by making similar nominal cuts in B and K as in the zCOSMOS selection), rather than to simulate exactly the selection of the objects. Such an exact simulation would have depended on the details of the galaxy formation prescription used in the SAM, and on the uncertain vagaries of the zCOSMOS-deep spectroscopic success rate etc. Figure 3.1 shows the resulting N(z) averaged over all twelve mock samples, compared with that of the zCOSMOS sample.

Fig. 3.1: The average N(z)-distribution (N(z) per deg² against z, for 1.8 < z < 3) of the objects in the final mock catalogues (red) after adjustment, as compared to the N(z)-distribution of the actual zCOSMOS-deep sample (blue). The shaded area shows the spread of the mocks (in terms of their standard deviation). An adjustable magnitude cut in B and K was applied to the mocks in order to match the number density of galaxies to the data (see text). A preliminary version of this plot was also shown in my MSc thesis.
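Three of the steps in the matching procedure above (the Gaussian velocity error, the mismatch statistic s, and the random division into two complementary halves) can be sketched as follows. This is an illustration only: the function names, the bin layout and the use of Python's standard `random` module are assumptions, and the magnitude-limit adjustment itself is not shown.

```python
import random

C_KMS = 299_792.458  # speed of light in km/s


def add_velocity_error(z, sigma_v=300.0, rng=random):
    """Perturb a redshift by a Gaussian velocity error of width sigma_v (km/s)."""
    return z + (1.0 + z) * rng.gauss(0.0, sigma_v) / C_KMS


def histogram(redshifts, z_min=1.8, z_max=3.0, n_bins=12):
    """N(z): count redshifts in equal-width bins over [z_min, z_max)."""
    counts = [0] * n_bins
    width = (z_max - z_min) / n_bins
    for z in redshifts:
        if z_min <= z < z_max:
            counts[min(int((z - z_min) / width), n_bins - 1)] += 1
    return counts


def mismatch(n_data, mocks_n):
    """s = sum over mocks and redshift bins of (N_data(z) - N_mock(z))^2."""
    return sum((d - m) ** 2 for n_mock in mocks_n for d, m in zip(n_data, n_mock))


def split_in_two(sample, rng=random):
    """Randomly divide a sample into two complementary halves (~50% each)."""
    shuffled = list(sample)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]
```

In such a scheme one would recompute `mismatch` for each trial pair of B and K limits and keep the pair giving the smallest s; how the trial limits were actually scanned is not specified in the text.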


3.3 Methods

3.3.1 Group definition

Whilst at z < 1 group-finders will actually identify galaxy groups, this may not necessarily be the case at z > 2; the systems identified there may be quite different in nature. We therefore introduce the following terminology:

1. “(real) group”: a set of three or more galaxies which are all in the same dark matter halo at the epoch in question;

2. “partial group”: a set of three or more galaxies at least two of which are in the same dark matter halo at the epoch in question;

3. “candidate group”: a set of three or more galaxies that are identified by the group-finder as defined in the next section;

4. “proto-group”: a candidate group in which all the members will be found in a real group at some later epoch;

5. “partial proto-group”: a candidate group which will become a partial group at a later epoch, i.e., in which some apparent members at the epoch in question will never appear in the same halo down to z = 0;

6. “spurious group”: a candidate group in which none of the apparent members will ever belong to the same halo down to z = 0, i.e., the galaxies are simply projected on the sky.

3.3.2 The nature of groups in the mocks

As discussed in chapter 2 the Kitzbichler light cones provide the galaxies together with a link to the actual object within the Millennium simulation. Dark matter haloes are only identified within the Millennium simulation itself using a friends-of-friends (FOF) algorithm applied to the dark matter particles. In our analysis, we have not considered the effect of changing the dark matter linking length in the Millennium simulation. For a discussion see Jenkins et al. (2001). Galaxies belonging to the same DM halo have the same halo identification number (FOF-ID) at the epoch in question (Lemson et al. 2006). By examining, at all later times, the halo FOF-IDs of the galaxies which we have placed in candidate groups at z ∼ 2, we can see when, if ever, these galaxies belong to the same halo. This makes it straightforward to determine the group nature (as defined above) of a particular set of galaxies that has been detected by application of the group-finder algorithm to a mock catalogue simulating an observational light cone. The galaxies in a proto-group will not share the same FOF-ID until the galaxies have entered the common halo. Likewise, the descendant tree of galaxies that is provided by the Millennium simulation can be used to follow the evolution of single galaxies from z ∼ 2 to z = 0 and thereby to identify mergers between galaxies. When two galaxies have the same descendant at the next snapshot, they must have merged in the intervening time.

35 CHAPTER 3. Proto-groups at 1.8 < z < 3 in zCOSMOS-deep

Using the mocks and the descendant trees of galaxies we were therefore able to identify, in the mocks, which candidate groups are already real or partial groups, which are not yet real/partial but will become so at some point in the future, and which are totally spurious in that the galaxies will never reside in the same halo. We can also see which galaxies merge together, which by definition requires them to be in the same halo.
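The bookkeeping described above can be illustrated with a minimal sketch. It assumes that, for each candidate group, the FOF-IDs of its members are available as one list per snapshot, ordered from the epoch of observation down to z = 0. The function names, the input layout and the label "not assembled" are hypothetical and are not taken from the Millennium database interface; galaxy mergers are not treated separately here.

```python
from collections import Counter


def max_sharing(fof_ids):
    """Largest number of members sharing a single FOF-ID in one snapshot."""
    return max(Counter(fof_ids).values())


def classify_candidate(fof_history):
    """Return (state at observation, eventual fate) of a candidate group.

    fof_history: list of snapshots, first entry = epoch of observation,
    last entry = z = 0; each snapshot lists one FOF-ID per member.
    (Hypothetical input layout, chosen for illustration only.)
    """
    n_members = len(fof_history[0])
    now = max_sharing(fof_history[0])
    ever = max(max_sharing(snapshot) for snapshot in fof_history)

    if now == n_members:
        state = "real group"
    elif now >= 2:
        state = "partial group"
    else:
        state = "not assembled"

    if ever == n_members:
        fate = "proto-group"          # all members share a halo at some epoch
    elif ever >= 2:
        fate = "partial proto-group"  # only some members ever share a halo
    else:
        fate = "spurious group"       # no two members ever share a halo
    return state, fate
```

For example, a triplet whose members sit in three different haloes when observed, two of which share a halo one snapshot later and all three at z = 0, is returned as ("not assembled", "proto-group").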

3.3.3 Group finder algorithm

At lower redshifts, where the emphasis is on real groups in the same halo, the group finder should ideally only pick out real groups, minimizing the number of interlopers. A major concern is the over-merging or fragmentation of groups and a great deal of effort goes into controlling these issues (see Knobel et al. 2012 for an extensive discussion). At higher redshift such a group-finder may not only pick up groups but potentially also a range of proto-groups; the challenge then is to identify both those systems and real groups, whilst avoiding chance associations. In this chapter we used a FOF-approach to link galaxies into candidate groups. In choosing the linking lengths ∆r (in physical space) and ∆v one has to take into consideration the following, sometimes conflicting, requirements:

• The linking length has to be large enough to ideally encompass all (proto-)groups that are present, but small enough not to overmerge groups, i.e., misdetect two distinct groups as one.

• Interlopers (i.e., misidentified group galaxies) should be avoided.

• The linking lengths must take into account the measurement errors as well as peculiar velocities.

The choice of values for the linking lengths is therefore a compromise. We explored the performance of the group-finder with varying linking lengths with the mock catalogues, determining for each resulting group catalogue the total number of candidate groups, the total number of real groups plus proto-groups, and the fraction of real (proto-)groups, i.e., the fraction of the detected structures at z ∼ 2 which either constitute a group already then, or will do so by z = 0. This is shown as a function of linking length in Figure 3.2. It turns out that the number of real (proto-)groups stays largely constant with increasing velocity linking length beyond ∼ 700 km/s, but increases with linking length ∆r. The total number of candidate groups however increases steadily with both ∆v and ∆r, meaning that the fraction of real (proto-)groups decreases with ∆v and with ∆r. We set a fraction of real (proto-)groups of 50% as a minimum requirement. The remaining 50% of the sample will contain a significant number of partial (proto-)groups, which will increase the success rate (see section 3.4.2). Because of the initial upturn in the number of real (proto-)groups we also want to have ∆v ≳ 700 km/s. It then turns out that the maximal linking length ∆r (physical space) that fulfills these two requirements is 500 kpc. The values ∆r = 500 kpc and ∆v = 700 km/s are slightly higher than, for instance, those in Knobel et al. (2009), who use 300–400 kpc and ∼ 400 km/s. This is, however, justified by the larger measurement errors at our higher redshifts and the lower density of tracer galaxies.
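To make the linking procedure concrete, the following is a minimal sketch of a friends-of-friends finder with a physical transverse linking length and a line-of-sight velocity linking length. It is an illustration under stated assumptions, not the implementation used for the catalogue: the flat ΛCDM parameters (H0 = 70 km/s/Mpc, Ωm = 0.3), the small-angle sky separation, the pairwise velocity difference c ∆z/(1 + z) and all function names are assumptions made here.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def comoving_distance_mpc(z, h0=70.0, omega_m=0.3, steps=1000):
    """Line-of-sight comoving distance in Mpc, flat LambdaCDM, midpoint rule.
    The cosmological parameters are assumed values for illustration."""
    dz = z / steps
    total = 0.0
    for i in range(steps):
        zi = (i + 0.5) * dz
        total += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + 1.0 - omega_m)
    return C_KMS / h0 * total


def angular_separation_rad(ra1, dec1, ra2, dec2):
    """Small-angle separation on the sky; inputs in degrees."""
    dra = math.radians(ra1 - ra2) * math.cos(math.radians(0.5 * (dec1 + dec2)))
    ddec = math.radians(dec1 - dec2)
    return math.hypot(dra, ddec)


def are_friends(g1, g2, dr_kpc=500.0, dv_kms=700.0):
    """True if two galaxies (ra, dec, z) are linked in both dimensions."""
    ra1, dec1, z1 = g1
    ra2, dec2, z2 = g2
    z_mean = 0.5 * (z1 + z2)
    dv = C_KMS * abs(z1 - z2) / (1.0 + z_mean)
    # Angular diameter distance converts the angle to a physical separation.
    d_a_kpc = 1000.0 * comoving_distance_mpc(z_mean) / (1.0 + z_mean)
    dr = d_a_kpc * angular_separation_rad(ra1, dec1, ra2, dec2)
    return dr <= dr_kpc and dv <= dv_kms


def find_candidate_groups(galaxies, min_members=3, **link):
    """Friends-of-friends via union-find; returns lists of galaxy indices."""
    parent = list(range(len(galaxies)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(galaxies)):
        for j in range(i + 1, len(galaxies)):
            if are_friends(galaxies[i], galaxies[j], **link):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(galaxies)):
        groups.setdefault(find(i), []).append(i)
    return [members for members in groups.values() if len(members) >= min_members]
```

The pairwise loop scales as the square of the number of galaxies, which is acceptable for a few thousand objects; passing `dr_kpc` and `dv_kms` as keyword arguments allows the linking lengths to be varied as in Figure 3.2.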

36 3.3. Methods

Fig. 3.2 — Number of proto-groups at 1.8 < z < 3, which includes any real groups at this redshift, in the mock catalogues (upper panel), the total number of candidate groups (middle) and the fraction of (proto-)groups (the fraction of the detected structures which either already constitute a group or will do so by z = 0, lower panel) as a function of the velocity linking length ∆v for various projected linking lengths ∆r. The numbers show the average number per mock catalogue, and the shaded areas the spread in the mocks in terms of their standard deviation. The number of (proto-)groups stays largely constant after the first rise up to ∆v ∼ 700 km/s, whereas the total number of candidate groups keeps rising with increasing ∆r and ∆v, producing a declining fraction of (proto-)groups. Requiring the velocity linking length to fulfill ∆v ≳ 700 km/s, and choosing 500 kpc for the projected linking length (shown in green), keeps the fraction of proto-groups above 50% (see text for details). The middle panel also shows the actual number of candidate groups found in zCOSMOS-deep with this parameter choice (black cross). This is in good agreement with the number of candidate groups defined in the same way in the mock catalogues.


Fig. 3.3 — The N(z)-distribution of the galaxies in the actual zCOSMOS-deep candidate groups (blue) compared to the distribution of the whole sample (grey), normalized to the same number of galaxies.

The width of the shaded area in the two upper panels of Figure 3.2 indicates the standard deviation in the number of proto-groups and candidate groups in the 12 mocks. This shows that cosmic variance is small compared to Poisson noise.

3.3.4 Application to zCOSMOS sample and comparison with mocks

Having determined the parameters of the FOF algorithm in the previous section, we apply the group-finder to the actual zCOSMOS data and the 12 mock samples. In the data this results in 42 candidate groups with memberships of three or more, i.e., we do not consider “pairs”. Of these 42, one has five members and six have four, so the vast majority are triplets. The 42 candidate groups are listed in Table 3.1 at the end of this chapter; their redshift distribution as compared to the parent sample is shown in Figure 3.3. Almost all of the detected candidate groups are in the central, more highly sampled region of the field, as shown in Figure 3.4. For each zCOSMOS candidate group, and for the corresponding candidate groups in the mocks, we also compute a nominal r.m.s. size and apparent velocity dispersion by

$$r_{\rm rms} = \sqrt{\sum_i r_i^2/(N-1)} \quad \text{and} \quad v_{\rm rms} = \sqrt{\sum_i v_i^2/(N-1)},$$

where $r_i$ and $v_i$ denote the distance or the velocity of a galaxy to the center of the candidate group and N is the number of members. The center of the candidate group is defined by the average RA, Dec and z. The overall number of candidate groups found in the central area of zCOSMOS-deep (36 groups) agrees quite well with the average number found in the mocks, which is 44 per 0.36 deg², i.e., the actual data has 18% fewer candidate groups. As shown in Figure 3.5, there is also broad agreement in the distributions in redshift, richness, and in the nominal size rrms and apparent velocity dispersion vrms distributions.
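The two r.m.s. quantities defined above can be evaluated as in the short sketch below. It assumes that the projected distances of the members to the group center are already available in kpc, and it converts redshift offsets to velocities with v = c (z − ⟨z⟩)/(1 + ⟨z⟩); that conversion and the function names are assumptions made for illustration.

```python
import math

C_KMS = 299792.458  # speed of light in km/s


def rms(values):
    """sqrt(sum(x_i^2) / (N - 1)) for offsets measured from the group center."""
    n = len(values)
    return math.sqrt(sum(v * v for v in values) / (n - 1))


def velocity_offsets(redshifts):
    """Velocity of each member relative to the mean redshift of the group.
    The conversion c*dz/(1+z) is an assumed convention."""
    z_mean = sum(redshifts) / len(redshifts)
    return [C_KMS * (z - z_mean) / (1.0 + z_mean) for z in redshifts]


# Usage with made-up numbers for a triplet:
r_rms = rms([120.0, 250.0, 310.0])                 # projected offsets in kpc
v_rms = rms(velocity_offsets([2.301, 2.304, 2.298]))  # km/s
```

With N − 1 in the denominator, a triplet always has at least two degrees of freedom, so both quantities are defined for every system in the catalogue.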

[Figure 3.4. Horizontal axis: RA [deg]; vertical axis: DEC [deg].]

Fig. 3.4 — The location of candidate groups in the COSMOS field. The candidate members are shown in red. The underlying zCOSMOS-deep sample in the same redshift range is shown in blue. The red square shows the extent of the central, highly sampled, area. Not surprisingly, the detection of structure is sensitive to the projected density of the available tracers.

3.4 Results

Having identified the zCOSMOS-deep candidate groups, we first attempt to establish their nature, i.e., whether they are real groups or rather proto-groups; we will address this in the next two sections. It will further be interesting to determine the future fate of these structures: how do they assemble and accrete new members, and what will they evolve into by z = 0? The catalogue, which is rather large for this redshift, allows us to answer these questions in a more statistical way; we do however have to rely on the corresponding mock groups within the Millennium simulation to perform this kind of analysis. After having understood the nature of the identified candidate groups, we will turn towards observational quantities, by estimating their overdensities and comparing them to the simulation predictions. Finally we search for an early onset of environmental differentiation.

3.4.1 Are we detecting real groups at z ≳ 2?

Whereas, as established in the following section, a significant number of the candidate groups will have assembled by z = 0, we find that only 5 out of a total of 2791, i.e., less than 0.2%, of the candidate groups in the mocks are real groups in the sense that all of the members are already in the same dark matter halo at the time of observation (at z ∼ 2). However, 8% of the observed structures are partially assembled with two galaxies in the same halo, meaning that we are observing groups with interlopers.

These statements depend on the structure build-up at a given redshift. The Millennium simulation used WMAP1 cosmological parameters (with σ8 = 0.9), whereas the most recent cosmological data establish a lower value for σ8, implying a lower build-up of structure. We also checked our statements with the mock catalogues described in Wang et al. (2008), where σ8 = 0.81 using the WMAP3 parameters (which are close to the most recent estimates). As expected these also yield essentially no real groups amongst the candidate groups at 1.8 < z < 3.

[Figure 3.5. Four panels showing the number of groups per deg² as a function of richness, group redshift, rrms [kpc] and vrms [km/s].]

Fig. 3.5 — Comparison of the basic properties of the candidate groups in the mock sample (red) with those in zCOSMOS-deep (blue). The shaded areas show the spread in the mock samples in terms of their standard deviation. Top left: Richness (number of candidate member galaxies). Top right: Redshift of the candidate group. Bottom left: Root-mean-square radius of the candidate group (rrms), defined as the r.m.s. distance of the members to their mean RA and Dec. Bottom right: R.m.s. of the velocity (vrms) relative to the center of the candidate group defined by the mean redshift of the members. In general there is a good agreement between mocks and data, in particular when taking into consideration the low number of candidate groups in the data. (A preliminary version of this plot is shown in my MSc thesis, however using different parameters for the linking lengths.)

3.4.2 Assembly timescale

We established above, based on comparisons with the mocks, that most of the detected structures at 1.8 < z < 3 have not yet assembled when we observe them. For 8% of the mock candidate groups, two of the galaxies are already in the same halo, but essentially no candidate group has assembled all three members. It is therefore interesting to ask whether and when these actually become groups, i.e., whether they are what we call “proto-groups” at z ∼ 2. Using the Millennium simulation to follow the evolution of the structures we detect at z ∼ 2 down to z = 0, we find that at the present time the galaxies of only 7% of the detected mock candidate groups are still completely outside of a common halo. The remaining 93% of the mock candidate groups will either fully (50 ± 1%) or partially (43 ± 1%) assemble by the present epoch.

The main criterion that distinguishes proto-groups from partial or spurious ones is the apparent velocity dispersion vrms. This is shown in Figure 3.6. In the regime vrms ≲ 300 km/s (which is comparable to the velocity error in the data) the fraction of mock proto-groups is above 50%, whereas it drops below 50% for velocity dispersions larger than 300 km/s. The fraction of proto-groups does not depend on the projected radial size of the candidate group. The trend with velocity dispersion is, however, weak enough that it is not attractive to reject all candidate groups with vrms ≥ 300 km/s.

[Figure 3.6. Horizontal axis: vrms [km/s]; vertical axis: rrms [kpc]; colour scale: fraction of proto-groups.]

Fig. 3.6 — The fraction of proto-groups with respect to all candidate groups in the mocks as a function of their velocity dispersion vrms and size rrms. This fraction strongly depends on vrms whereas it is largely independent of rrms. For vrms ≲ 300 km/s the fraction of proto-groups is above 50%. The observed vrms is a crude indicator for the chance of a candidate group to become a real group in which all the galaxies share the same halo. The black circles show the location of the zCOSMOS-deep candidate groups. (A preliminary version of this plot is shown in my MSc thesis, however using different parameters.)

As stated above, 93% of the mock candidate groups will become real or partial groups by the present epoch. Already by z ∼ 1.5, 50% of the candidate groups are partial groups (up from 8% at the epoch of observation) and by the current epoch, 50% of the candidate groups at z ∼ 2 are real groups with all detected members within the same halo. The majority of the proto-groups start to assemble within ∆a < 0.1 (see Figure 3.7, “a” being the cosmic scale factor), which means that on a rather short timescale two or more members will share the same FOF-halo. The full assembly then requires a substantially longer timescale (∆a ∼ 0.5 or even more). This continuous assembly process is further illustrated in Figure 3.8, which emphasizes that assembly is taking place even within the observational “window”. Although only 8% of the mock candidate groups are partially assembled by the time we observe them, by the end of the observing window at z = 1.8 around 25% of the proto-groups already have two members in the same dark matter halo. These are therefore groups of richness 2 “contaminated” by an interloper (most of which will obviously accrete onto the group later on). According to the mocks, we are therefore able to actually observe the earliest phases of the assembly process of these groups.

[Figure 3.7. Two panels (N = 3 and N > 3). Horizontal axis: ∆a; vertical axis: number of groups.]

Fig. 3.7 — The subsequent assembly of the proto-groups in the mocks. The diagrams show the change in a, the cosmic scale factor, before the proto-groups have accreted two (blue), and then all (black), of their identified members into the same halo (left panel for richness 3, right panel for richness ≥ 4). Most of the proto-groups observed at 1.8 < z < 3.0 start to assemble within ∆a < 0.1.

[Figure 3.8. Horizontal axis: redshift, from 3 to 0; vertical axis: fraction, from 0 to 1.]

Fig. 3.8 — The assembly history of all the candidate mock groups with richness 3 (which constitute over ∼85% of the sample) over redshift. Partially assembled systems are shown in yellow (two members in the same dark matter halo) and fully assembled systems in blue (all members in the same halo). The light areas denote member galaxies that have subsequently merged (by definition within the same halo). The grey zone represents candidate groups in which the members are not, at least yet, in the same halo. The white zone arises because we only follow the evolution of a candidate group after it has been detected in the light cone; the diagonal grey-white border therefore reflects the redshift distribution of the detected candidate groups. At z = 1.8 already ∼ 25% of the candidate groups (detected at slightly higher redshifts) have assembled at least two of their members into the same DM halo, up from 8% at the epoch of observation of the individual groups.

Figure 3.8 also illustrates the likelihood that group members seen as distinct galaxies at z ∼ 2 will have merged together by the current epoch. In about 40% of the proto-groups, two or more of the members that we identify at z ∼ 2 will have merged together by the current epoch, and in about 10% all three members will have merged into a single massive galaxy.
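Since the assembly timescales are quoted as intervals ∆a of the cosmic scale factor, it can help to translate them into redshifts using a = 1/(1 + z). The helper below is a small sketch for that conversion only; the function name is hypothetical.

```python
def redshift_after(z_obs, delta_a):
    """Redshift reached once the scale factor has grown by delta_a
    from its value a = 1/(1 + z_obs) at the epoch of observation."""
    a_later = 1.0 / (1.0 + z_obs) + delta_a
    if a_later > 1.0:
        raise ValueError("a > 1 lies beyond the present epoch (z = 0)")
    return 1.0 / a_later - 1.0


z_start = redshift_after(2.0, 0.1)  # about 1.31
z_full = redshift_after(2.0, 0.5)   # about 0.2
```

For a system observed at z = 2, ∆a = 0.1 thus corresponds to z of about 1.3, while ∆a = 0.5 corresponds to z of about 0.2.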

3.4.3 Halo masses

In the preceding discussion we followed the evolution of the structures that were detected by our group-finder at z ∼ 2 down to the present epoch. In this section we look at haloes at the present epoch and ask which of their progenitors could have been detected at z > 1.8 in a zCOSMOS-like survey. To do this, we examine the set of all present-day haloes in the simulation whose progenitors lie within the 1.8 < z < 3 volume of any of the six light cones. We first identify at the earlier epoch all of the haloes that will eventually assemble into a given present-day halo, and then identify all the “progenitor galaxies” within these progenitor haloes and ask if they satisfy the zCOSMOS brightness selection criteria, without the 50% spatial sampling, referring to these as “zCOSMOS-selectable” galaxies. We then additionally ask whether this set of “progenitor galaxies” would have satisfied our group-finding requirements in terms of their spatial and velocity displacements, adding in also the incomplete spatial sampling of the zCOSMOS survey.

The result is shown in Figure 3.9. Many haloes today, especially at $M < 10^{13}\,M_\odot/h$, do not have any zCOSMOS-selectable progenitor galaxies at 1.8 < z < 3. These are represented as the light grey region of the upper panel. Some have only one or two zCOSMOS-selectable progenitor galaxies and they are shown in dark grey, since they will by definition not be recognized as a “proto-group”. The pink region represents haloes today whose progenitor haloes did contain three or more zCOSMOS-selectable galaxies but which were, at 1.8 < z < 3, too dispersed to satisfy our group-finding linking lengths. Finally, the blue region represents haloes with three or more progenitor galaxies that are close enough to be recognized as a candidate “group”. Applying the 50% sampling of the zCOSMOS survey, about a half of these are actually recognized (light blue); the remainder are missed simply because of the incomplete spatial sampling of the survey.

At high present-day halo masses (above $\sim 10^{14}\,M_\odot/h$) the majority of the haloes are represented in our candidate group catalogue in the sense of detecting three or more progenitor galaxies and recognizing them as members of a candidate group structure. In other words, around 65% of today's $10^{14} - 10^{15}\,M_\odot/h$ haloes should in principle have been recognized as a candidate group with the galaxy selection criteria of zCOSMOS, although a half of these will not have been detected in practice because of the random 50% sampling of our survey. Of the remaining ∼35% of present-day haloes above $\sim 10^{14}\,M_\odot/h$ that we would not have expected to be able to detect, more than a half have three or more detectable progenitor galaxies, but these are too dispersed in space or velocity to satisfy our criteria. Increasing the linking lengths to catch these dispersed systems would, however, as shown above, also severely increase the number of interlopers.

[Figure 3.9. Two panels. Horizontal axis: present-day halo mass [M⊙/h], from 10^11 to 10^15; vertical axes: fraction of haloes (top) and logarithm of the number of haloes (bottom).]

Fig. 3.9 — Top panel: The average (over the 12 mock samples) fraction of present-day haloes that are detectable in a zCOSMOS-like survey at 1.8 < z < 3, as a function of the present-day dark matter mass of the halo. The light blue region shows haloes which today contain three or more galaxies that, at high redshift, would satisfy the zCOSMOS-deep photometric selection criteria and would have been recognized as a candidate group with the zCOSMOS-deep overall sampling and success rates. The dark blue region represents candidate groups that were not recognized simply because of the incomplete sampling/success rate; the lack of these in our candidate group catalogue was therefore simply a matter of chance. The pink region represents haloes in which the constituent galaxies would have been observed in zCOSMOS-deep, but which were too dispersed, in projected distance or velocity, to satisfy our group-finding algorithm. The darker grey region represents present-day haloes which only had one or two progenitor galaxies satisfying the zCOSMOS-deep photometric criteria, while the light grey region represents haloes in which none of the progenitor galaxies could have been in zCOSMOS-deep. Around 65% of all present-day $10^{14} - 10^{15}\,M_\odot/h$ groups would have a progenitor structure at z ∼ 2 which we would in principle be able to identify in zCOSMOS-deep with full sampling. Bottom panel: As in the upper panel, but now the total number of haloes is plotted instead of the fraction.


The lower panel of Figure 3.9 shows the distribution of the present-day halo masses of the systems in our candidate group catalogue. While, as noted in the previous paragraphs, we are detecting a high fraction of the progenitors of the most massive haloes today, we are evidently detecting a broad range of present-day halo masses with most systems in the $10^{13} - 10^{14}\,M_\odot/h$ range.

3.4.4 Overdensities

Determination of the overdensity

In order to give a rough estimate for the overdensities

$$\delta = \frac{\rho_{\rm gr} - \bar{\rho}}{\bar{\rho}}$$

associated with the candidate groups we calculated the mean (comoving) density $\bar{\rho}$ of the overall sample in bins of ∆z = 0.2 using the following equation:

$$\bar{\rho} = \frac{N_{\Delta z}}{V}, \qquad V = \frac{1}{3} \cdot \mathrm{area} \cdot \left(l_{\rm max}^3 - l_{\rm min}^3\right),$$

where l denotes the comoving distance along the line-of-sight and “area” is the field of view of the mocks (1.4° × 1.4°).

The density of the groups $\rho_{\rm gr}$ was determined by assuming a cylinder with radius rrms and a length l corresponding to twice the vrms (in comoving units):

$$\rho_{\rm gr} = 0.27 \cdot \frac{N}{\pi r_{\rm rms}^2\, l},$$

where N is the number of members, and the factor 0.27 is included to account for the fact that in a 3D Gaussian distribution only this fraction of the points would actually lie within the 1σ region (which we assumed here, by setting the size of the cylinder to the rrms and the vrms). The overdensity computed here is at best a rough order of magnitude estimate. First, there is an effect from the 50% sampling rate. Adding in the missing galaxies does not add significant numbers of new members to the detected associations (since they were the lucky ones with above average sampling), whereas the mean density of the field increases by a factor of two, leading to a factor of up to two over-estimate in the overdensity. On the other hand, due to the effect of measurement errors in redshift (of order 300 km/s) as well as peculiar velocities, the “size” along the line of sight may have been substantially over-estimated, leading to an underestimate of the actual overdensity, e.g., by almost an order of magnitude since the observed vrms corresponds to about 8 Mpc (comoving) against the typical rrms of ∼ 1 Mpc (comoving). The estimated overdensities should therefore be treated with considerable caution.
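The three expressions above combine into a short calculation, sketched here under stated assumptions: the field area is taken as a solid angle in steradians so that the volume of the redshift shell comes out in comoving Mpc³, and all lengths are comoving Mpc. The function names are hypothetical.

```python
import math


def mean_density(n_in_bin, area_sr, l_min, l_max):
    """Mean comoving number density in a redshift bin.
    area_sr: field of view as a solid angle in steradians (assumed unit);
    l_min, l_max: comoving distances to the bin edges."""
    volume = area_sr * (l_max ** 3 - l_min ** 3) / 3.0
    return n_in_bin / volume


def group_density(n_members, r_rms, length):
    """Density inside a cylinder of radius r_rms and length `length`,
    scaled by 0.27 for the fraction of a 3D Gaussian inside that cylinder."""
    return 0.27 * n_members / (math.pi * r_rms ** 2 * length)


def overdensity(rho_group, rho_mean):
    """delta = (rho_gr - rho_mean) / rho_mean."""
    return (rho_group - rho_mean) / rho_mean
```

The value 0.27 is consistent with the product of the fraction of a 2D Gaussian inside a circle of radius 1σ (about 0.39) and the fraction of a 1D Gaussian within ±1σ (about 0.68).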

Results

With these caveats in mind, the distribution of δ for the 42 candidate groups and for the corresponding mock samples is shown in Figure 3.10. Even with the uncertainties outlined above, it is evident that the candidate groups represent highly overdense regions and that most of them have probably already turned around (i.e., decoupled from the background). This would be expected if they are to merge into a single halo within an interval of expansion factor of ∆a ∼ a as discussed above.

[Figure 3.10. Horizontal axis: δ, from 0 to 400; vertical axis: number of groups.]

Fig. 3.10 — The distribution of the proto-group “overdensities” in zCOSMOS-deep (red) and in the mocks (blue). These overdensities are quite large and indicate that the structures are in an advanced stage of collapse, consistent with the idea that the galaxies will assemble into the same haloes in the future. However, readers should see the text for discussion and important caveats in the interpretation of this quantity.

3.4.5 Excess of high mass objects and red fractions

So far we have established that the associations that we have found are in the main not yet fully formed groups, but are quite likely to become so by z = 0. Furthermore, the candidate groups are already quite overdense. For this reason it is of interest to look for surrounding overdensities and to look for any colour-differentiation of the galaxy population in and around the candidate groups relative to the field population. Unfortunately, zCOSMOS-deep itself is limited to star-forming galaxies by the colour selection, and so it is necessary to use photo-z objects from the larger and deeper COSMOS photometric sample (Capak et al. 2007). Typical photo-z errors are of order ∆z ∼ 0.03(1 + z) or 10’000 km/s. We focus on relatively massive galaxies, above a stellar mass of $10^{10}\,M_\odot$, so that the photo-z errors are not excessive. Most of these objects have 25 < I_AB < 28. We first search for any excess of galaxies around the locations of the candidate groups. We consider cylinders with radii that are a varying multiple of the group rrms and which have a fixed length dv of twice 10’000 km/s. We lay down 42 cylinders, one over each group, and compare the total number of massive ($> 10^{10}\,M_\odot$) galaxies in these cylinders to the totals found when the 42 cylinders are laid down at positions that have the same (z, rrms, dv) but random (RA, Dec) positions, repeating these random samples 1000 times and using the variation in the random samples to give an estimate of the noise to be expected in the group sample.
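The random-cylinder comparison can be sketched as follows. This is a simplified illustration rather than the procedure applied to the COSMOS catalogue: positions are treated as flat-sky coordinates (x, y) in the same units as the cylinder radius, the line-of-sight coordinate is a velocity in km/s, random centers are drawn uniformly inside a rectangular field, and all names are hypothetical.

```python
import random
import statistics


def count_in_cylinders(galaxies, cylinders):
    """Total number of galaxies inside the cylinders.

    galaxies: list of (x, y, v); cylinders: list of (x0, y0, v0, radius, half_dv).
    A galaxy lying in two overlapping cylinders is counted twice.
    """
    total = 0
    for x0, y0, v0, radius, half_dv in cylinders:
        for x, y, v in galaxies:
            inside_circle = (x - x0) ** 2 + (y - y0) ** 2 <= radius ** 2
            if inside_circle and abs(v - v0) <= half_dv:
                total += 1
    return total


def random_cylinder_counts(galaxies, cylinders, bounds, n_random=1000, seed=0):
    """Mean and standard deviation of the total counts when every cylinder
    keeps its (v0, radius, half_dv) but is moved to a random (x, y)."""
    rng = random.Random(seed)
    xmin, xmax, ymin, ymax = bounds
    totals = []
    for _ in range(n_random):
        moved = [
            (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax), v0, radius, half_dv)
            for _, _, v0, radius, half_dv in cylinders
        ]
        totals.append(count_in_cylinders(galaxies, moved))
    return statistics.mean(totals), statistics.stdev(totals)
```

The excess is then the count at the group positions divided by the mean random count, with the standard deviation of the random totals serving as the noise estimate.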

[Figure 3.11. Horizontal axis: distance to group center [rrms], from 1 to 11; vertical axis: number of galaxies relative to the field, from 1 to 2.]

Fig. 3.11 — The excess of high mass ($> 10^{10}\,M_\odot$) galaxies from the COSMOS photo-z sample around our spectroscopic candidate groups, relative to the field, as a function of the projected distance from the group in units of the rrms of the groups as seen in cylinders of depth ∆v = ±10’000 km/s to accommodate photo-z errors (see text for details). At the position of the candidate groups we find a projected excess of up to ∼40% in the number of massive galaxies (blue filled circles). This fraction reduces to ∼25% if we subtract out the already known spectroscopic members (red open circles) and also reduces to insignificance at large radii. This concentrated mean overdensity suggests that our candidate groups indeed trace significant overdensities in the Universe.

Especially at small multiples of rrms a significant excess is seen around the candidate groups as shown in Figure 3.11.

At the position of the candidate groups, within a radius of 1 − 2 rrms, we find ∼ 40% more massive objects than in the general field, whereas this excess drops for larger radii and the ratio is consistent with unity at ∼ 10 rrms, which corresponds to ∼ 3 Mpc (physical). This excess is only slightly reduced when the spectroscopically observed objects are excluded (red circles in Figure 3.11), and the excess seen in this independent dataset provides further evidence that the candidate groups catalogued in this study are real physical associations and not just chance projections.

Next we look at the distribution of colours in the photo-z sample around the candidate groups with respect to the field. For this we consider cylinders with a fixed radius of 2 rrms and the same length of twice 10’000 km/s as above. We define red galaxies to be galaxies with M_U − M_B > 0.7 and consider a red fraction which is the number of red galaxies at a given stellar mass divided by the total number of galaxies at that mass. Figure 3.12 shows the red fractions as a function of stellar mass. It is clear that the fraction of red objects in the candidate groups and in the field, at fixed stellar mass, is essentially the same and we do not see evidence of colour segregation with environment. Of course, given the large cylinder length in redshift (of order ±0.1 around the group location), our “group sample” will have been heavily contaminated by unrelated foreground and background field galaxies: our overdensity of 40% suggests that about 70% of the photo-z “group sample” galaxies are projected from the field. These projected galaxies will of course heavily dilute any intrinsic colour difference and we could in principle subtract these projected galaxies statistically. However, because the red fractions are so indistinguishable, we have not attempted to do this.

[Figure 3.12. Horizontal axis: stellar mass [log, M⊙], from 8.5 to 11.5; vertical axis: red fraction, from 0 to 1.]

Fig. 3.12 — The red fraction of objects in the photo-z sample at the position of our candidate groups (in red) as compared to the field (in blue) as a function of stellar mass. Red galaxies are defined to have rest-frame M_U − M_B > 0.7, using the spectral energy distributions used to estimate their photometric redshifts. We find that there is no difference in the colours for the field and the candidate groups. This is, however, not surprising if the candidate groups are only starting to assemble and if environmental differentiation is confined to satellites, as indicated at lower redshifts.

It is not clear that any such environmental segregation, at fixed stellar mass, should have been expected. We have argued above that the galaxies in the candidate groups are in general unlikely, at the epoch at which we observe them, to be sharing the same dark matter halo. A correspondingly small fraction of the galaxies will be satellites, even in the larger photo-z sample. Various papers (van den Bosch et al. 2008, Font et al. 2008, Weinmann et al. 2009, Prescott et al. 2011, Peng et al. 2012, and many others) have presented clear evidence that all of the environmental differentiation of the galaxy population at low redshift is associated with the quenching of star-formation in satellite galaxies, and there is now also good evidence that this remains true at redshifts approaching unity (Kovac et al. 2014).
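The contamination quoted above follows from simple counting, assuming that the field contribution inside the group cylinders equals the mean count in the random cylinders. If the group cylinders contain 1.4 times the field count, then the fraction of galaxies in them that belongs to the projected field is

```latex
f_{\rm field} = \frac{N_{\rm field}}{N_{\rm field} + N_{\rm excess}}
              = \frac{1}{1 + 0.4} \approx 0.71 ,
```

so that only about 30% of the photo-z objects in the cylinders are associated with the overdensity itself.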


3.5 Summary and Conclusions

We have applied a group-finder with linking lengths ∆r = 500 kpc (physical) and ∆v = 700 km/s to the zCOSMOS-deep sample of 3502 galaxies at 1.8 < z < 3.0, yielding 42 systems with three or more members. To try to understand what these associations likely are, and what they will probably become, we have constructed an analogous sample from 12 zCOSMOS-deep mock samples which were extracted from the Millennium simulation mock catalogues of Kitzbichler et al. (2007), supplemented by a single light cone from the Wang et al. (2008) simulation which has a more realistic value of σ8.

We refer to the detected systems as “candidate groups”. We have introduced a terminology in which a system in which all three detected members are in the same halo is called a “real group” and one in which only two are, a “partial group”. Candidate groups that will become real or partial groups by z = 0 are called “proto-groups” and “partial proto-groups” respectively.

The number of candidate groups in the simulations agrees quite well with the number in the sky. Analysis of the simulated candidate groups suggests that only a very small fraction, less than 0.2% in the Kitzbichler et al. (2007) sample and none in the Wang et al. (2008) sample, already have all the detected galaxies occupying the same halo at the time of observation, i.e., are already “real groups”. About 8% of the candidate groups will however already have two members within the same halo in the Kitzbichler et al. (2007) sample. Furthermore, 50% of the mock candidate groups will have assembled all three galaxies into the same halo by z = 0 (i.e., are “proto-groups” at the epoch of observation) and almost all (93%) will have at least two galaxies in the same halo. Only 7% are truly random associations whose members will never occupy the same halo.

The mocks suggest that the important parameter that distinguishes the fate of the candidate group is the apparent velocity dispersion vrms. For vrms ≲ 300 km/s the fraction of systems that will fully assemble all three members is above 50% and for larger dispersions it is lower. The fraction does not depend much on the projected angular size of the candidate groups.

The observed candidate groups are being seen as they begin the assembly process. Already by z ∼ 1.8 (which is the lower limit of our observational window) around 25% of the candidate groups (observed at 1.8 < z < 3.0) will be partial groups, the bulk of them becoming so within ∆a < 0.1 from their epoch of detection, and within ∆a ≲ 0.5 most proto-groups will have evolved into real or partial groups.

If we look at today’s groups and ask which of their progenitors will have been seen in our spectroscopic sample at z > 1.8, then we find that we should have detected ∼ 35% of the progenitors of today’s massive clusters (of order $10^{14} - 10^{15}\,M_\odot/h$) already at z ∼ 2 and this would rise to ∼ 65% if we had 100% completeness in the zCOSMOS-deep spectroscopic sample.

We can roughly estimate the overdensities of the spectroscopically detected structures and find that these are substantial, consistent with the idea that these systems will soon come together into assembled systems. We also detect a significant overdensity in the regions of these candidate groups using the independent COSMOS photometric sample, which shows a 40% excess in the numbers of galaxies above $10^{10}\,M_\odot$ at the location of our spectroscopic candidate groups as compared to the field, despite the very large sampling cylinders (∆z = ±0.1) required from the use of photo-z. We do not however detect any significant differentiation in the colours of the galaxies compared to the field.


Tab. 3.1 — Candidate groups detected in zCOSMOS-deep, ordered by their velocity dispersion vrms

ID   ID of one member   <RA>      <Dec>   <z>     rrms [kpc]   vrms [km/s]   Richness
30   431260             150.151   2.369   2.463   325          30            3
9    426916             150.278   2.011   2.308   322          87            3
23   430182             150.312   2.277   2.578   193          94            3
20   429414             150.147   2.219   2.090   362          101           3
21   409768             150.43    2.246   2.157   366          104           3
25   410733             150.172   2.302   2.099   117          112           3
16   490781             150.297   2.158   2.099   185          130           3
6    426643             150.206   1.985   2.232   229          140           3
7    426726             150.397   2.000   2.707   287          143           3
19   429340             149.993   2.206   2.554   279          147           3
39   429401             150.036   2.205   2.096   324          206           3
42   434564             149.870   2.343   2.678   319          222           3
5    426418             150.214   1.964   2.117   269          227           3
17   429152             149.933   2.199   2.279   261          239           3
26   411468             150.249   2.333   2.469   297          239           3
13   407675             150.194   2.118   2.178   385          251           4
32   434605             150.452   2.396   2.286   110          254           3
36   413529             150.102   2.456   2.476   294          264           3
28   411517             150.338   2.344   1.805   224          281           3
40   429794             150.098   2.232   2.099   302          284           3
35   413241             150.186   2.436   2.051   260          296           3
41   434071             150.332   1.892   2.957   257          304           4
34   431678             150.461   2.427   2.322   169          316           3
2    402591             150.329   1.841   2.096   351          322           3
11   427339             150.272   2.050   2.306   214          328           3
12   406198             150.588   2.055   2.029   369          340           3
10   490746             149.921   2.028   2.050   459          365           4
29   431233             150.452   2.356   2.278   282          381           3
1    424327             150.327   1.766   2.538   229          386           3
14   428112             150.359   2.118   2.232   126          405           3
3    425554             149.900   1.883   2.215   190          415           3
4    425598             150.218   1.892   2.684   217          435           3
27   430794             150.008   2.325   2.258   275          474           4
37   413838             150.028   2.479   2.452   146          476           3
33   413105             150.060   2.423   2.469   335          488           3
38   433521             150.153   2.603   2.282   281          496           3
8    426762             150.449   2.010   2.013   293          505           4
15   428229             150.517   2.121   2.153   102          507           3
18   420527             150.354   2.206   1.808   188          513           3
22   430097             150.000   2.256   2.440   412          526           5
31   431338             149.928   2.384   2.143   113          534           4
24   410797             150.056   2.305   1.974   237          545           3


Chapter 4

A proto-cluster at z=2.45

This chapter is based on work published in Diener et al. 2015, ApJ, 802, 31.

4.1 Introduction

In the previous chapter we extensively discussed z ∼ 2 proto-groups, overdensities identified in the zCOSMOS-deep sample that mostly evolve into group-type haloes by z = 0. No signature of environmental differentiation was detected in these group progenitors. In this chapter we focus on a much larger structure at z ∼ 2.5 which, as will be shown, will evolve into a Virgo- or Coma-like cluster. This constitutes a somewhat different environment from the proto-groups discussed before, and we may expect early environmental processes to be detectable. As discussed in the introduction, at z < 1 a number of environmental processes take place and have been observed in assembled groups and clusters. It is, however, still unclear at which stage of the evolution of a proto-cluster into a cluster the onset of environmental differentiation happens. It is known that, at a given stellar mass, the properties of satellites in the local Universe are systematically different from those of typical centrals (see e.g. van den Bosch et al. 2008, Weinmann et al. 2009, Pasquali et al. 2010, Peng et al. 2012). This environmental central/satellite differentiation has been established out to z ∼ 1 (Gerke et al. 2007, Kovac et al. 2014, Knobel et al. 2013). If environmental effects in the galaxy population are dominated by satellites (see Knobel et al. 2014 for a qualification of this), then it is possible that at z ∼ 2 the members of a proto-group or proto-cluster would not be environmentally differentiated from the general population, since these galaxies will (by definition) still be centrals and not satellites. Whether this is true is, however, not yet clear, and a z < 1 relation does not necessarily hold at z > 2.
Differences in the halo mass function in different large-scale environments may lead to significant environmental differentiation amongst the population of centrals, quite independent of those astrophysical effects on the group/cluster members which appear to dominate at lower redshifts.

The literature to date shows at times contradictory examples of environmental influences in proto-clusters at z > 2. Kodama et al. (2007) detect a well populated emerging red sequence in three z > 2 proto-clusters, suggesting the appearance of massive elliptical galaxies, whilst Hatch et al. (2011) only see a poorly populated red sequence in their sample of six proto-clusters at z ∼ 2.4. Furthermore, Hatch et al. (2011) find evidence that proto-cluster members are both about twice as massive as, and have lower specific star formation rates than, the field galaxies at the same redshift. A similar, tentative result was found by Lemaux et al. (2014) in a z = 3.3 proto-cluster. Shimakawa et al. (2014), on the other hand, report increased star formation in two z > 2 proto-clusters. Additionally, Cucciati et al. (2014) found no evidence for environmental differentiation at a redshift of z = 2.9. Whilst these different results may have their roots in a variety of causes (e.g. different halo masses), it is also possible that the cause is the proto-cluster selection (e.g. Hα-emitters vs optical selection, or membership assignment) made by these authors.

The terminology of the membership of forming structures at high redshift should be carefully defined. When we refer to an association of galaxies as a cluster (or group) we mean that its member galaxies occupy the same dark matter (DM) halo at the time we observe it. This effectively means that the galaxies lie within the r200 perimeter of a single DM halo. Of course, this perimeter cannot be observed directly in the sky, and so we must rely on comparison with mock catalogues of galaxies. In contrast, the member galaxies of a proto-cluster (or proto-group) occupy different dark matter haloes at the epoch at which they are observed, but will accrete into a common halo by z = 0. The galaxy members of a proto-group are therefore mostly still the dominant galaxies in their individual dark matter haloes (i.e. are “centrals”) but will later become “satellites” in the larger structure.
In a similar manner to group/cluster identification via mock catalogues, proto-clusters can also be identified in simulations (as shown in the previous chapter and in this one). Furthermore, simulations can be used to follow the evolution of a proto-cluster and predict its “product” at any later cosmic time. In turn, this approach also provides information about the progenitors of today’s clusters. In this chapter we present a z = 2.45 proto-cluster that we have identified in a follow-up of a number of proto-group structures originally identified in the zCOSMOS-deep survey.

The layout of this chapter is as follows: we first describe in Section 4.2 the follow-up spectroscopic observations that led to the confirmation of the z = 2.45 proto-cluster. We compare the distribution of the members of this structure with simulations in Section 4.3, in order to establish at which stage of the process of cluster assembly it is and to predict its evolution to z = 0. In Section 4.4, we then examine its galaxy population in search of any differences from the field population at the same redshift. We summarize our results and draw conclusions in Section 4.5.

All magnitudes are quoted in the AB system and we use a ΛCDM cosmology with Ωm = 0.25, ΩΛ = 0.75 and H0 = 73 km s⁻¹ Mpc⁻¹, in line with the parameters used for the Millennium simulation. We refer to physical (comoving) distances with a leading “p” (“c”), i.e. pMpc corresponds to physical Mpc.

4.2 Data

In an early (now outdated) version of the proto-group catalogue presented in the previous chapter, we identified candidate groups using linking lengths of dr = 500 kpc and dv = 1000 km/s, resulting in 55 associations in total. As before, these proto-groups had 3 − 5 members and are not likely to be already assembled at the epoch of observation; however, the majority of them will assemble by z = 0. Seven of these spectroscopically identified proto-groups were selected for follow-up spectroscopy with FORS2 to confirm the previous member galaxies and to identify additional members; the details of the target selection and observations are discussed in the following.

Tab. 4.1 — Summary of the proto-groups selected as targets for the FORS2 follow-up observations. We note the ID in the zCOSMOS-deep group catalogue, the apparent velocity dispersion vrms and the number of previously confirmed members. The candidate groups marked by an asterisk only appear in the (now) outdated version of the zCOSMOS proto-groups catalogue, which used higher linking lengths in the FOF algorithm.

ID    redshift   vrms [km/s]   richness
23*   1.927      612           3
19    2.554      147           3
20    2.219      101           3
31*   2.285      689           3
22    2.440      526           5
39    2.096      206           3
40    2.099      284           3

4.2.1 The FORS2 data

The VLT/FORS2 instrument is a multi-purpose “workhorse” for a variety of observing modes in both imaging and spectroscopy. It has two CCDs, one blue and one red sensitive, allowing observations in the wavelength range 3600 − 11000 Å. For our purpose of obtaining multi-object spectroscopy of z ∼ 2 galaxies (aiming at as many objects as possible), the instrument in its MXU mode (with user-designed masks), in combination with the blue sensitive E2V CCD and the 300V grism, is ideal. This set-up spans the wavelength range 3300 − 6600 Å at a resolution of R = 440.

Target selection and observations

We targeted 7 previously identified candidate groups, which were however selected with a slightly higher velocity linking length than our preferred value from Chapter 3. The following considerations entered the selection of the candidate groups for re-observation. First of all, we aimed at covering a mostly contiguous field with the MXU masks, in order to observe the same candidate groups with more than one mask and thereby maximise the possible slit configurations. We also set the candidate with 5 galaxy members (compared to mostly 3 members otherwise) as a mandatory target. Additionally, we tried to span a variety of redshifts. Last, even with the relaxed linking length criterion, the relation between apparent velocity dispersion and likelihood to become a real group by z = 0 still holds. We therefore tried to prefer candidate groups with lower apparent velocity dispersion, even though this was not always possible. In Table 4.1 we summarise the properties of the selected candidate groups.

Tab. 4.2 — Summary of the FORS2 observations, noting the mask number, the IDs of the groups that are covered by a given mask, average seeing, average airmass and exposure time

Mask     groups           seeing   airmass   exp-time
Mask-4   26, 31, 32       0.8”     1.35      6h
Mask-5   31, 32, 51       1.0”     1.29      5.5h
Mask-6   26, 31, 32, 51   0.8”     1.44      5.5h
Mask-1   23, 28, 52       1.0”     1.50      4.5h
Mask-2   23, 28, 52       1.1”     1.42      4.3h

For each selected proto-group we aimed at observing as many additional candidate members as possible as well as (if possible) some of the already spectroscopically confirmed members from zCOSMOS-deep, in order to confirm their redshifts and obtain more accurate relative velocities with the higher resolution of FORS2 in comparison to VIMOS. The targets for observation were selected from the COSMOS photo-z sample (Capak et al. 2007, Ilbert et al. 2009) as follows:

1. They had to lie in the surroundings (within 2 Mpc physical) of the already spectroscopically confirmed proto-groups.

2. Their photo-z had to be consistent with the respective proto-group redshift (with a ∆v < 150000 km/s).

3. The targets had to fulfill BAB < 25.5 or IRAC 3.6µm < 22 (or both).

In each mask we also inserted a number of filler objects with a relaxed selection criterion (∆v < 300000 km/s) to avoid gaps in the slit configuration.

In the mask design priority was given to objects fulfilling both the BAB and the 3.6µm magnitude conditions, followed by zCOSMOS-deep objects. The remaining space was filled up to maximise the number of targets. Generally we used a 1” slit-width, centering the object in the middle of the slit. However, in some cases a shift along the slit was applied to accommodate a neighbouring object. Furthermore, any extra space where no slit could be placed was used to increase the slit-width (up to 1.4”) for better sky-background determination. Overall, 5 masks were observed, containing 70 objects fulfilling BAB < 25.5, 17 with IRAC 3.6µm < 22, 16 obeying both the B and 3.6µm criteria, 18 zCOSMOS-deep objects and 11 filler objects. The FORS2 observations took place in March 2011 during 4 nights of visitor mode. The conditions were mostly good, with a median seeing of < 1.2” across the 4 nights. The last night, however, was affected by high wind. The observations and the target statistics are summarised in Table 4.2; Figure 4.1 shows the 5 FORS2 pointings.


Fig. 4.1 — The 5 pointings of the FORS2 observations, showing each mask and the corresponding targets. The blue open circles indicate the targeted proto-groups.

Data reduction and redshift assignment

The data were reduced in the standard way with the IRAF apextract package. Due to the faintness of our targets we stacked the individual 30 min exposures before performing the extraction. In some cases even the stack did not exhibit any visible sign of a trace; these spectra were abandoned. Out of 132 objects targeted we obtained a 1d spectrum for 114. The redshifts were determined through a visual inspection of the individual spectra, mostly via identification of Lyα, CIV or SiII. Of the 114 targets, we were able to assign spectroscopic redshifts to 67 objects (or 59%). The success rate in assigning redshifts was dependent on observing conditions and integration time; the success rate for masks 4, 5 and 6 (which will be of most interest for the remainder of this chapter) was as high as 71%. Additionally, a flag reflecting the quality of the redshift was assigned in the same spirit as for zCOSMOS-deep. We summarise the data and the success rates in Table 4.3.

The FORS2 observations provide an opportunity to estimate the combined velocity error of the zCOSMOS-deep and FORS2 redshifts. Table 4.4 lists the redshifts which were measured in both observations, together with the corresponding flags. Excluding the two redshifts with flag 1 in FORS2 (those are quite unreliable), the redshift uncertainty between the two samples is σ∆z/(1+z) = 0.0037. This is higher than the estimated velocity uncertainty of zCOSMOS-deep of ∼ 300 km/s; however, it is a convolution of both FORS2 and zCOSMOS-deep uncertainties and relies on only 9 overlapping redshifts. Finally, all measured redshifts are summarised in Table 4.6 at the end of this chapter.

Tab. 4.3 — Summary of the FORS2 success rates. They are comparable with zCOSMOS-deep. The columns are mask identification, number of slits, number of extracted spectra, number of assigned redshifts, success rate (number of measured redshifts divided by number of extracted spectra), and number of redshifts with a given flag (1, 2, 3 or 4). The last row summarises all masks, now also stating the percentage of a flag with respect to the number of extracted spectra.

Mask     #slits   #spectra   #redshifts   success   flag 1     flag 2     flag 3     flag 4
Mask-4   27       24         19           79%       7          1          10         1
Mask-5   27       22         15           68%       6          3          6          0
Mask-6   25       24         17           71%       5          5          7          0
Mask-1   25       20         5            25%       4          1          0          0
Mask-2   28       24         11           46%       7          2          2          0
total    132      114        67           59%       29 (25%)   12 (11%)   25 (22%)   1 (1%)

Tab. 4.4 — Comparison of redshifts measured with FORS2 and zCOSMOS-deep. In total 18 objects have been reobserved; 11 of them could be assigned a redshift. The spectra that have no FORS2 redshift are exclusively of confidence class 1 and 2 in zCOSMOS. The columns are zCOSMOS identifier, FORS2 redshift, flag in FORS2, zCOSMOS redshift, flag in zCOSMOS.

ID       zFORS2   flagFORS2   zzCOSMOS   flagzCOSMOS
409614   2.4393   3           2.446      4.5
429868   2.4428   3           2.4414     3.5
429950   2.4415   3           2.4441     2.5
410000   2.4421   2           2.4363     2.5
429340   2.5560   3           2.5535     2.5
409127   2.5533   4           2.5563     4.5
429401   2.0987   3           2.0965     2.5
420557   1.9985   1           2.098      2.5
420406   2.0945   3           2.0938     3.5
409745   2.2807   2           2.2786     4.5
409666   2.3224   1           2.2933     1.5
409666   -1       0           2.2933     1.5
420557   -1       0           2.098      2.5
409346   -1       0           2.0908     1.5
408893   -1       0           1.9283     1.5
420756   -1       0           2.1006     2.5
409261   -1       0           2.0903     1.5
420718   -1       0           2.1016     1.5


Tab. 4.5 — The 11 spectroscopically confirmed members of the proto-cluster presented in this work. We list their identifier (ID), RA, Dec, redshift, as well as radial distance (ri) and line-of-sight velocity (vi) with respect to the proto-cluster center defined by the mean RA (150.00048°), Dec (2.24132°) and z.

ID        RA           Dec        zspec   ri [kpc]   vi [km/s]
429950    149.996613   2.256573   2.442   463        -369
429868    150.007828   2.249362   2.443   321        -256
410000    150.008743   2.264080   2.442   713        -322
409614    149.995026   2.239803   2.439   167        -565
1029209   149.975500   2.227124   2.440   846        -530
1034036   149.99157    2.194295   2.451   1409       414
1031108   149.97563    2.21506    2.446   1064       -17
1023628   150.01885    2.265366   2.446   891        31
1023927   150.01939    2.261413   2.450   812        361
1032336   149.98813    2.206609   2.453   1085       655
1022028   150.02802    2.274885   2.453   1278       598

4.2.2 Proto-cluster at z = 2.45

The main aim of the FORS2 observations was to confirm, and ideally add members to, the pre-identified zCOSMOS-deep proto-groups described in the previous chapter. With 7 targeted proto-groups and 114 extracted spectra, we observed ∼ 16 candidate members per proto-group. Having to rely on uncertain photometric redshifts to select potential candidate members meant that the chance of actually detecting new members was rather small: the membership criterion of ±700 km/s contrasts with photo-z uncertainties of ∼ ±100000 km/s.

Somewhat luckily, one of the proto-groups was confirmed with a total of eleven spectroscopic members at a mean redshift of z = 2.45, RA = 150.00 and Dec = 2.24. A list of the members is given in Table 4.5 and their spatial distribution is shown in Fig. 4.2. The eleven members of this structure all lie within a 1.4 Mpc radius (physical) on the sky and within a velocity range ∆v of ±700 km/s.
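To make the mismatch between a velocity window and a redshift error concrete, the following Python sketch converts between the two. It assumes the small-offset relation ∆v = c ∆z/(1 + z); the function names are illustrative and are not taken from the thesis pipeline.

```python
C_KMS = 299792.458  # speed of light in km/s


def dv_to_dz(dv_kms, z):
    """Redshift interval corresponding to a velocity offset at redshift z."""
    return dv_kms / C_KMS * (1.0 + z)


def dz_to_dv(dz, z):
    """Velocity offset (km/s) corresponding to a redshift interval at redshift z."""
    return C_KMS * dz / (1.0 + z)


z_cluster = 2.45
# Redshift half-width of the +-700 km/s membership window at z = 2.45
dz_member = dv_to_dz(700.0, z_cluster)
```

Under this assumption the ±700 km/s window corresponds to a redshift half-width of roughly 0.008 at z = 2.45, which illustrates how narrow the spectroscopic criterion is.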

We calculated the root-mean-square (r.m.s.) radial size rrms and velocity spread vrms to be $r_{\rm rms} = \sqrt{\sum_i r_i^2/(N-1)} = 902$ kpc and $v_{\rm rms} = \sqrt{\sum_i v_i^2/(N-1)} = 426$ km/s, where ri and vi indicate the distance and the velocity of a galaxy relative to the mean RA, Dec and z, and N is the number of galaxies. As we will argue in Sections 4.3 and 4.4, this structure is probably not yet gravitationally bound and so these values should not be used to infer a virial mass of the structure.
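As a cross-check of the r.m.s. size and velocity spread, the following Python sketch evaluates the sums from the rounded ri and vi listed in Table 4.5. The helper `rms` and its `ddof` argument are illustrative names, not part of the original analysis. With these rounded inputs, dividing by N gives values that round to 902 kpc and 426 km/s, while dividing by N − 1 gives about 946 kpc and 447 km/s, so the normalisation behind the quoted numbers is worth checking against the unrounded catalogue.

```python
import math

# (r_i [kpc], v_i [km/s]) for the 11 members, copied from Tab. 4.5
MEMBERS = [
    (463, -369), (321, -256), (713, -322), (167, -565), (846, -530),
    (1409, 414), (1064, -17), (891, 31), (812, 361), (1085, 655), (1278, 598),
]


def rms(values, ddof):
    """Root-mean-square of offsets that are already measured from the mean.

    ddof=1 divides the sum of squares by N - 1, ddof=0 divides by N.
    """
    n = len(values)
    return math.sqrt(sum(x * x for x in values) / (n - ddof))


radii = [r for r, _ in MEMBERS]
velocities = [v for _, v in MEMBERS]

r_rms_n1 = rms(radii, ddof=1)       # normalised by N - 1
v_rms_n1 = rms(velocities, ddof=1)
r_rms_n = rms(radii, ddof=0)        # normalised by N
v_rms_n = rms(velocities, ddof=0)
```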

VLT/ISAAC observations

After the discovery of the proto-cluster we applied in the δ-call for ISAAC, which was issued as a “last opportunity” to use the instrument before it gave way to the (delayed) recommissioned VISIR. We proposed to obtain Hα narrowband imaging of the proto-cluster in a bid to acquire more candidate members as well as to study properties such as the star formation rates. In this application we took a slight risk, with only 8 proto-cluster members having redshifts that would actually place them within the ISAAC NB2.25 filter (even if close to the cut-off wavelength). However, the ISAAC NB2.25 filter was, to our knowledge, the only one that would have allowed Hα studies at the proto-cluster redshift. The observations were completed; however, we could only recover two of the already existing members. The reason for this could be that, due to redshift uncertainties, some of the members came to lie outside the filter curve, or that the decline in throughput meant that we lost the sensitivity to the levels of star formation in our galaxies. With this low confirmation rate we did not pursue the analysis of the data.

Fig. 4.2 — The 11 members of the proto-cluster (red circles) in a Subaru B band image. They lie within a radius of 1.4 Mpc (physical, corresponding to 2.9’).

4.2.3 The mock sample

In order to learn about the likely nature of the underlying dark matter structure of this proto-cluster we further develop the techniques described in the previous chapter. For this purpose we make use of the Millennium simulation (Springel et al. 2005) and light cones from Henriques et al. (2012). There are 24 of these light cones in total, each covering an area of 1.4 × 1.4 deg². As already shown in Chapter 3, by identifying similar structures in these mock samples, where full dark matter and evolutionary information is available, we can obtain indications about the nature and evolution of the observed structure.


Construction of mock samples

Again the first step is to match the mock samples to the actual observations. The targets for the FORS2 observations were selected from a photo-z sample, but chosen to be at the position of known overdensities from the initial spectroscopic zCOSMOS sample. In attempting to mimic this situation as accurately as possible we chose a two-stage approach in constructing the mock sample. First we created mock samples that were intended to replicate the zCOSMOS-deep sample from which we drew our original candidate group. In this we followed the previous chapter, using magnitude cuts on the mock galaxies to achieve number densities in the mocks that match those in the spectroscopic sample. The roughly 50% sampling of zCOSMOS-deep allows us to construct two mock catalogues from each light cone, resulting in 48 zCOSMOS-deep mock catalogues. Since the proto-cluster in question was originally identified with five zCOSMOS-deep galaxies, we next constructed a mock group catalogue from these zCOSMOS mock samples by applying the same group-finding criteria as for the original candidate group, i.e. we applied a FOF algorithm with linking lengths dr = 500 kpc and dv = 700 km/s, and restricted ourselves to proto-groups with five members.

In a second stage we aimed at reproducing the subsequent FORS2 observations by first creating a mock target sample from the light cones that resembles the underlying COSMOS photo-z sample from which the targets were selected. As mentioned above, the objects in our target catalogue had to fulfill BAB < 25.5 and/or IRAC 3.6µm < 22, as well as having a photo-z consistent with the respective previously identified group. We applied a photo-z error of 10000 km/s (this corresponds to the typical observed photo-z error at z ∼ 2.5) to the mock redshifts and cut the mock sample in B and IRAC 3.6µm. These cuts were adjusted such that the number density of the mock sample matched that of our target catalogue from COSMOS. From this mock target catalogue we then randomly drew 16% of all objects to mimic the product of the fraction of targets actually observed (22%) and the success rate in assigning redshifts (71%). We populated the already existing group catalogue with this “observed” sample. In the final sample we searched for proto-clusters that had 11 or more members lying within 1.4 Mpc and 700 km/s (the same as the FORS2 proto-cluster). This resulted in 16 candidate proto-clusters in the redshift range 2.3 < z < 2.6, distributed over the 48 mock samples of 2 deg² each.
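The friends-of-friends step used in the first stage can be sketched as follows. This is a minimal illustration in Python, not the group-finder used for the thesis: it assumes that each galaxy is already described by transverse physical coordinates in kpc and a line-of-sight velocity in km/s (in practice these would be derived from RA, Dec and z with the adopted cosmology), it links pairs with a simple brute-force loop, and the function name and the toy catalogue are hypothetical.

```python
import math


def friends_of_friends(galaxies, dr_kpc=500.0, dv_kms=700.0, min_members=3):
    """Group galaxies by friends-of-friends linking.

    galaxies: list of (x_kpc, y_kpc, v_kms) tuples (assumed input format).
    Two galaxies are linked if their transverse separation is <= dr_kpc and
    their velocity difference is <= dv_kms; groups are the connected sets.
    Returns a sorted list of index lists with at least min_members entries.
    """
    n = len(galaxies)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        xi, yi, vi = galaxies[i]
        for j in range(i + 1, n):
            xj, yj, vj = galaxies[j]
            close_on_sky = math.hypot(xi - xj, yi - yj) <= dr_kpc
            close_in_velocity = abs(vi - vj) <= dv_kms
            if close_on_sky and close_in_velocity:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_members)


# Toy catalogue: a chain of three linked galaxies plus two unrelated objects
toy = [(0, 0, 0), (400, 0, 300), (800, 0, 600), (5000, 5000, 0), (800, 300, 5000)]
groups = friends_of_friends(toy)
```

In the toy catalogue the first and third galaxies are 800 kpc apart and therefore not directly linked, but both are friends of the second, so the three form one group; this chaining is the defining property of the method.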

4.3 Evolution in simulations

In this section we aim at understanding the relation of the proto-cluster haloes at the time of observations as well as the evolution to z = 0, in particular studying the process of becoming a cluster. First the assembly history of the mock proto-clusters is analysed, followed by determining their z = 0 descendants. Finally we trace the progenitor galaxies of these descendants back in time to z ∼ 2.5 and study their relation to the already identified proto-cluster galaxies.


4.3.1 Surface number densities

As mentioned above, with our selection technique we detect 16 candidate proto-clusters in the 96 deg² of the 48 mock samples, or 0.17 proto-clusters per deg². In other words, we expect one such system in the z ∼ 2.5 redshift range in a 6 deg² field. Based on this, finding one in the region of zCOSMOS-deep (1 deg² in total and 0.36 deg² in the area of maximum coverage) appears lucky, but not exceptionally so.
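The surface density arithmetic, and a rough sense of how lucky the detection is, can be sketched in Python as below. The Poisson estimate is an added assumption for illustration only: it ignores clustering and cosmic variance, and the mock samples drawn from one simulation are not fully independent.

```python
import math

n_found = 16                # candidate proto-clusters in the mocks
n_mocks = 48                # number of mock samples
area_per_mock_deg2 = 2.0    # area of each mock sample

# Surface density of candidate proto-clusters per square degree
density = n_found / (n_mocks * area_per_mock_deg2)


def prob_at_least_one(area_deg2, density_per_deg2):
    """Poisson probability of finding one or more systems in a given area."""
    return 1.0 - math.exp(-density_per_deg2 * area_deg2)


p_full = prob_at_least_one(1.0, density)    # full zCOSMOS-deep area
p_core = prob_at_least_one(0.36, density)   # area of maximum coverage
```

Under these assumptions the chance of finding at least one such system is about 15% in 1 deg² and about 6% in 0.36 deg².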

4.3.2 Assembly history

Whilst at low redshift galaxy clusters will usually have mostly assembled (i.e. have their member galaxies occupying the same DM halo) and will in many cases be virialised, this is not the case at z > 2. The growth of structure is so rapid at these masses at high redshift that even quite substantial overdensities will most likely be at a “pre-assembly” stage, meaning that their member galaxies will accrete onto a common dark matter halo by z = 0 but are still occupying different haloes when they are being observed. We can use the properties of the structures in the mock catalogues to infer the likely state of the system we see in the sky. In the case of the 16 proto-clusters in the mock sample, the majority (10, or 62.5%) have already started assembly at z ∼ 2.5, in the sense that the largest halo already contains between two and four galaxies that meet our selection criteria (note that there may also be fainter galaxies residing in the same haloes). About a third of the z ∼ 2.5 proto-clusters however still consist entirely of singletons. The assembly process continues to z = 0 when 13 (81%) have fully assembled (i.e. with all the detected members within a common halo) or mostly assembled (i.e. more than 50% of its members in a common halo). Only for three (19%) of the mock clusters the contamination by interlopers is high enough that less than 50% of the identified members end up occupying the same halo at z = 0. We illustrate the assembly of such a proto-cluster in Fig 4.3, by following the haloes of all galaxies from z ∼ 2.5 that will eventually become members of the same z = 0 cluster. We highlight the proto-cluster galaxies that we identified in our mock-catalogue in red, but obviously many more galaxies are part of this massive z = 0 cluster and at z ∼ 2.5 they are distributed over rather large scales (see section 4.3.4 for further discussion). 
Also evident from this figure is that the originally identified proto-cluster members largely complete their accretion process before z = 1, consistent with the idea that the structure has made its turn-around (see section 4.4.2). Overall, on average, in the mock catalogues, 78% of the identified proto-cluster members will end up being true cluster members by the present epoch whilst only 16% are already in the same halo at z ∼ 2.5. These numbers suggest that the presented structure is a true proto-cluster in the sense that the vast majority of the galaxies will end up in a massive (see next section) cluster today, but only a small minority are already sharing the same dark matter halo at the high redshift that we observe them.


[Figure 4.3: redshift of snapshot (0 to 2.5) versus distance to z = 0 halo [cMpc/h] (−15 to 15); symbol sizes denote haloes hosting 1−10, 11−100, 101−1000 and >1000 galaxies; proto-cluster haloes are marked separately.]

Fig. 4.3 — We show the assembly history of a z ∼ 2.5 proto-cluster by following all haloes that will eventually become part of the same z = 0 DM halo, i.e. form a cluster. The size of the circles corresponds to the number of galaxies that inhabit a given halo. Whilst at z ∼ 2.5 galaxies are mostly centrals themselves, they continuously accrete onto other haloes to eventually become satellites in the z = 0 cluster. The proto-cluster member haloes we identify at z ∼ 2.5 are highlighted in red.

4.3.3 Halo masses

As established in the previous section, the member galaxies of the proto-cluster are not likely to be occupying the same dark matter halo at z ∼ 2.5. However, it is illustrative to compare the typical halo at z ∼ 2.5 to the fully evolved cluster halo at z = 0 by following the evolution of these haloes in the simulation. At z ∼ 2.5 the proto-cluster galaxies reside in somewhat unremarkable haloes with masses of ∼ 10^12 M⊙/h, simply because they are mostly singleton galaxies. This changes dramatically by z = 0, when the former proto-cluster members mostly inhabit haloes with Mhalo = 10^14 − 10^15 M⊙/h, i.e. they become members of the most massive clusters seen today. This is illustrated in Fig. 4.4, where we show both the distribution of halo masses of the proto-clusters at z ∼ 2.5 and z = 0 (top panel) and the halo mass function at both redshifts for comparison (bottom panel). This again underlines the use of the term “proto-cluster”. At z ∼ 2.5 this structure is an assembly of galaxies residing as centrals in their DM haloes. As it evolves these haloes merge to eventually form a massive cluster halo that is occupied by the previously identified centrals as well as galaxies that accreted later on or were below the detection limit at z ∼ 2.5.

[Figure 4.4: top panel: number of proto-clusters N versus Mhalo [M⊙/h] (10^11 to 10^15) for halo mass at z = 0, halo mass at z = 2.45, and the sample without failures; bottom panel: Log N of haloes versus Mhalo [M⊙/h] at z = 0 and z = 2.45.]

Fig. 4.4 — Top panel: We show the halo masses of the most massive halo of the mock proto-clusters at z = 2.45 (turquoise) and at z = 0 (blue). Whilst evolving from a rather unremarkable halo (∼ 10^12 M⊙/h) they will become some of the most massive clusters by z = 0, with a halo mass of ∼ 5 × 10^14 M⊙/h. The dashed line indicates the halo masses without the 3 clusters that do not assemble, i.e. that end with < 50% of the members in the same halo. Bottom panel: We show the halo mass functions at z = 2.45 (dotted) and z = 0 (solid) for comparison.

4.3.4 Progenitor galaxies

We have established in the previous section that the mock proto-clusters evolve into very massive z = 0 clusters. This means that other progenitor galaxies of these clusters exist, in addition to the ∼ 11 identified members. All of these progenitors will become part of the same DM halo by z = 0. They could have failed to be identified as members for a variety of reasons. First of all, spectroscopy was restricted to relatively bright (BAB < 25.5) targets. The objects that met the selection criterion were sampled incompletely, due both to limited spatial sampling¹ and to a < 100% success rate in assigning redshifts.

We searched for the additional z ∼ 2.5 progenitors in the light cones; the result is shown in Figure 4.5. The progenitors are colour- and size-coded according to their B band magnitude, showing the very faint objects in green and the brightest in dark blue. There is a significant number of such progenitors present in each of the proto-cluster fields (median of 2215; the z = 0 cluster will have fewer members than that, as some progenitor galaxies merge), however most of them are too faint to have met our selection criterion. The vast majority (95%) of these objects, however, meet the ∆v < 700 km/s condition that would associate them with the proto-cluster if observed.

The diameter of the area occupied by progenitors ranges from 3 pMpc to 20 pMpc. This range of areas is also reflected in the range of halo masses (Figure 4.4), which span almost 2 orders of magnitude. Only as the cluster assembles does it turn into the more compact structure that is observed at lower redshifts. The optical selection of such a proto-cluster can hence result in a diversity of objects. This analysis also hints at a more extended structure at z = 2.45 in the COSMOS field. As 3 pMpc − 20 pMpc correspond to an angular scale of 6’ − 41’, comparable to or bigger than the FORS2 FOV (6.8’ × 6.8’), we would not have detected this extended structure with our observations.
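The conversion between physical size and angle on the sky can be checked with the following Python sketch. It assumes the flat ΛCDM parameters adopted in this chapter (Ωm = 0.25, ΩΛ = 0.75, H0 = 73 km s⁻¹ Mpc⁻¹), neglects radiation, and integrates the comoving distance with Simpson's rule; the function names are illustrative.

```python
import math

H0 = 73.0            # Hubble constant in km/s/Mpc
OMEGA_M = 0.25       # matter density parameter
OMEGA_L = 0.75       # cosmological constant density parameter
C_KMS = 299792.458   # speed of light in km/s


def angular_diameter_distance(z, steps=1000):
    """Angular diameter distance in physical Mpc for flat LCDM.

    The comoving distance integral of 1/E(z) is evaluated with Simpson's
    rule; `steps` must be even.
    """
    def inv_e(zz):
        return 1.0 / math.sqrt(OMEGA_M * (1.0 + zz) ** 3 + OMEGA_L)

    h = z / steps
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inv_e(k * h)
    comoving_mpc = C_KMS / H0 * total * h / 3.0
    return comoving_mpc / (1.0 + z)


def arcmin_for_physical_mpc(size_pmpc, z):
    """Angle in arcminutes subtended by a physical size at redshift z."""
    theta_rad = size_pmpc / angular_diameter_distance(z)
    return math.degrees(theta_rad) * 60.0


theta_radius = arcmin_for_physical_mpc(1.4, 2.45)   # proto-cluster radius
theta_small = arcmin_for_physical_mpc(3.0, 2.45)    # smallest progenitor region
theta_large = arcmin_for_physical_mpc(20.0, 2.45)   # largest progenitor region
```

With these assumptions 1 arcminute corresponds to roughly 0.49 pMpc at z = 2.45, so the largest progenitor regions are several times wider than a single FORS2 field.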

4.4 Observational Characteristics

According to simulations the z = 2.45 proto-cluster is likely to evolve into a massive cluster by z = 0, but is only just starting its assembly. Whilst this implies that the effects which shape the group population at z < 1 cannot yet take place, the overdense environment of a proto-cluster could still influence the member galaxies and hence make them distinguishable from field galaxies at the same redshift. We investigate this scenario in the following sections.

4.4.1 Photometric redshift samples

The selection of galaxies in zCOSMOS-deep and also for the FORS2 observation involved a cut in the B magnitude. The spectroscopic sample is therefore highly incomplete in mass and will be biased against red objects that are quiescent or have high reddening. Any interesting statements about the population of galaxies in the proto-cluster region compared with those in the surrounding “field” must therefore be based on photo-z sample(s), despite the high redshift uncertainties therein. To that end we use two photo-z samples, both described in detail in Chapter 2. We applied the same selection criteria as for the FORS2 observations to the i-band selected photo-z catalogue from Capak et al. (2007) and Ilbert et al. (2009). As this is then exactly the parent sample for the observations, it replicates our selection function. We base an estimate of the overdensity on this sample.

¹ The FORS2 observations only allowed ∼ 25 objects per mask.


[Figure 4.5: grid of panels, one per mock proto-cluster field, showing progenitor positions as RA (deg) against DEC (deg) offsets between −0.2 and 0.2. Each panel is labelled with its z = 0 halo mass, ranging from 10^13.7 to 10^15.1 M_sun/h. The colour and size coding follows the B_AB magnitude, with the faintest class at B_AB > 26.]

Fig. 4.5 — We show all z ∼ 2.5 progenitor galaxies (green and blue) that will by z = 0 become members of the cluster that we identified by our proto-cluster selection. The actual proto-cluster members that identify the structure are highlighted in red. In each proto-cluster field there exist several hundred to several thousand progenitors, most of them being too faint for observations. We also note the z = 0 halo mass, which reflects the number of progenitors. Two of the z = 0 clusters are identical. The reason is that their progenitor hosts so many galaxies that it was detected in both of the depleted mock catalogues (we randomly split the original catalogue into two parts to mimic the spectroscopic sampling rate of zCOSMOS-deep).

To better address the issue of incompleteness we employ the K-selected UVISTA catalogue from McCracken et al. (2012), which is complete (to 95%) down to $K_{\rm AB} = 23.8$, corresponding to an approximate mass completeness limit of $\sim 10^{10}\,M_\odot$. We include this sample to look for differences in the galaxy population at the proto-cluster position with respect to the field. As we are only interested in differential effects, it is acceptable if our sample is not complete towards lower masses, as long as it includes the objects we are interested in. It should however be noted that the UVISTA sample does not necessarily include the known spectroscopic members (in fact it only contains 6 of the 11 members).


4.4.2 Overdensity estimation

We can roughly estimate the overdensity of the proto-cluster by using the parent photometric sample from which we selected the targets for observation. To that end we calculate the field density in the 0.6° × 0.62° zCOSMOS-deep (densely sampled) area within the redshift range $z_{\rm cl} \pm 0.12$, which corresponds to ±10’000 km/s, to encompass the photo-z uncertainty. Then:

$$\rho_{\rm field} = \frac{N_{\rm field}}{\frac{1}{3} \times {\rm area} \times \left(l_{\rm max}^3 - l_{\rm min}^3\right)},$$

where $l$ denotes the comoving distance along the line of sight, and $l_{\rm min}$ and $l_{\rm max}$ correspond to the distance at $z_{\rm cl} - 0.12$ and $z_{\rm cl} + 0.12$. The “area” is the area of zCOSMOS-deep ($1.13 \times 10^{-4}$ sr).

When computing the density of the proto-cluster, we correct for the effect of the redshift uncertainty. The ∆v = ±700 km/s of the spectroscopic members presumably overestimates the extent of the proto-cluster along the line of sight. We therefore assume that in reality the excess of objects concentrates within the $r_{\rm phys} = 1.4$ Mpc radius, both along the line of sight and radially. Hence the density of the proto-cluster is as follows:

$$\rho_{\rm cl} = \frac{11}{\pi\, r_{\rm com}^2\, l_{\rm com}},$$

with $r_{\rm com} = r_{\rm phys} \times (1 + z_{\rm cl})$ and $l_{\rm com} = 2 \times r_{\rm com}$. Then the overdensity is given by $\delta = (\rho_{\rm cl} - \rho_{\rm field})/\rho_{\rm field} = 10$.

We double-checked our assumption of the spectroscopic members being concentrated within a radius of 1.4 pMpc = 4.8 cMpc. To that end we determined the spread in the cosmological redshifts of the 16 mock proto-clusters (being a measure for the “true” distribution of the proto-cluster member galaxies). The average root-mean-square of these redshifts is 0.006, translating to 7.3 cMpc, which is consistent with the 4.8 cMpc radius from above, suggesting that our assumption was valid, but that we may slightly overestimate the overdensity. An overdensity of 10 implies, in line with the simulations, that whilst the proto-cluster is not likely to be gravitationally bound yet, it has made its turn-around.
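The estimate above can be written out as a short sketch. It is illustrative only: the cosmological parameters (flat ΛCDM, H0 = 70 km/s/Mpc, Ωm = 0.3) and all function names are assumptions made here, and the field count `n_field_example` is a hypothetical placeholder, since the value of N_field is not quoted in the text.

```python
import math

# Assumed cosmology (illustrative values, not taken from the text).
H0, OMEGA_M, C_KMS = 70.0, 0.3, 299792.458


def comoving_distance(z, steps=2000):
    """Line-of-sight comoving distance in Mpc (trapezoidal integration)."""
    dz = z / steps
    total = 0.0
    for i in range(steps + 1):
        zi = i * dz
        e_z = math.sqrt(OMEGA_M * (1.0 + zi) ** 3 + 1.0 - OMEGA_M)
        total += (0.5 if i in (0, steps) else 1.0) / e_z
    return C_KMS / H0 * total * dz


def velocity_half_width(z_cl, dz):
    """Velocity (km/s) corresponding to a redshift offset dz at z_cl."""
    return C_KMS * dz / (1.0 + z_cl)


def field_density(n_field, z_cl, dz, area_sr):
    """Comoving number density in the shell z_cl +/- dz over solid angle area_sr."""
    l_min = comoving_distance(z_cl - dz)
    l_max = comoving_distance(z_cl + dz)
    volume = area_sr / 3.0 * (l_max ** 3 - l_min ** 3)
    return n_field / volume


def cluster_density(n_members, r_phys, z_cl):
    """Comoving density in a cylinder of radius r_com and length 2 r_com."""
    r_com = r_phys * (1.0 + z_cl)
    l_com = 2.0 * r_com
    return n_members / (math.pi * r_com ** 2 * l_com)


def overdensity(rho_cl, rho_field):
    return (rho_cl - rho_field) / rho_field


n_field_example = 1500   # hypothetical placeholder, NOT the value used in the thesis
rho_field = field_density(n_field_example, 2.45, 0.12, 1.13e-4)
rho_cl = cluster_density(11, 1.4, 2.45)
print(f"r_com = {1.4 * 3.45:.2f} cMpc, dv = {velocity_half_width(2.45, 0.12):.0f} km/s")
print(f"delta = {overdensity(rho_cl, rho_field):.1f} for the placeholder N_field")
```

The sketch makes explicit that δ scales inversely with the assumed line-of-sight extent of the proto-cluster, which is why a larger true extent (for instance the 7.3 cMpc spread found in the mocks) would lower the inferred overdensity.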

4.4.3 Connection to radio galaxies

Whilst this proto-cluster has been selected purely through a FOF approach on a spectroscopic sample, it is well established that radio galaxies are beacons for high-z overdensities (see for example Miley et al. 2006, Hatch et al. 2011 and others). We searched the publicly available FIRST catalogue (White et al. 1997) for sources at the proto-cluster position and found a radio source at (RA = 150.0025, Dec = 2.2586) with a flux of 4.21 mJy. This position coincides with the proto-cluster, at an offset of 0.5 pMpc from the center. Castignani et al. (2014) also report a structure at z = 2.39 at our proto-cluster position, identified with a Poisson probability method that uses photometric redshifts to look for overdensities around radio galaxies. They associate their structure with the same radio galaxy and quote a photometric redshift of $z_{\rm phot} = 2.2^{+0.32}_{-0.44}$ for it. Given the uncertainty in photometric redshifts, it is possible that our proto-cluster and the structure from Castignani et al. are the same overdensity and are associated with the FIRST radio galaxy. Without spectroscopy we can however not make a decisive statement.


4.4.4 Does environment matter?

As discussed in the introduction to this chapter, previous work has at times found contradictory results regarding environmental differentiation in proto-clusters. The proto-cluster presented here was originally selected from a sample of blue star-forming galaxies, as opposed to the predominantly Hα-selected samples of the aforementioned studies. This allows us to search for environmental signatures, whether identical to or different from those reported previously. To this end we search for any differences in the masses, star-formation rates and the quiescent fraction in the proto-cluster, relying on the UVISTA catalogue. We determine the fraction of massive ($M > 10^{10.5}\,M_\odot$) galaxies, as well as the fraction of highly star-forming (SFR > 50 $M_\odot$/yr) galaxies within the proto-cluster, motivated by the proposed scenarios of either an overabundance of massive galaxies (Hatch et al. 2011) or elevated star formation (Shimakawa et al. 2014). At the same time we also search for a difference in the quiescent fraction in comparison to the field, akin to low redshift results.

We make use of the masses and SFRs that are given in the UVISTA catalogue and which are determined by the mass (SFR) of the best fitting template defined by the median of the likelihood distribution from the photo-z fitting procedure. The selection of quiescent galaxies is also taken from UVISTA, where a criterion based on NUV-R/R-J colours is employed. In total 73 galaxies with $z_{\rm phot}$ consistent with the proto-cluster redshift are flagged as quiescent. We calculate the respective fractions of massive, star-forming and quiescent galaxies in the proto-cluster in a cylinder of r = 1.4 Mpc radius (physical, the proto-cluster radius) and a length of ±10’000 km/s (to encompass the photo-z uncertainty). To compute the field values we put down cylinders of the same volume at one hundred random positions in the zCOSMOS-deep field.

Figure 4.6 shows these fractions in comparison with the field: the fraction of massive galaxies (left), the fraction of star-forming galaxies (middle) and the quiescent fraction (right). Whilst we see a trend towards slightly more massive and quiescent galaxies within the proto-cluster, this is not statistically significant within our sample. Despite its likely evolution into a very massive z = 0 cluster, we do not see evidence for environmental differentiation at this stage, although it is possible that a weak effect was not detected due to the large errors caused by the use of photo-z.
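The field comparison described above (identical cylinders dropped at random positions) can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for Figure 4.6: the catalogue is purely synthetic and unclustered, the angular radius of 0.048° is an assumed conversion of 1.4 pMpc at z = 2.45, the redshift half-length of 0.12 is taken from Section 4.4.2, and all names are hypothetical.

```python
import math
import random
import statistics

Z_CL = 2.45          # proto-cluster redshift (from the text)
DZ = 0.12            # half-length of the cylinder in redshift (Section 4.4.2)
RADIUS_DEG = 0.048   # assumed angular radius of ~1.4 pMpc at z = 2.45


def fraction_in_cylinder(catalogue, ra0, dec0, in_class):
    """Fraction of the galaxies inside one cylinder that satisfy `in_class`.

    Returns None if the cylinder contains no galaxies.
    """
    cos_dec = math.cos(math.radians(dec0))
    inside = [
        gal for gal in catalogue
        if abs(gal["z_phot"] - Z_CL) <= DZ
        and math.hypot((gal["ra"] - ra0) * cos_dec, gal["dec"] - dec0) <= RADIUS_DEG
    ]
    if not inside:
        return None
    return sum(1 for gal in inside if in_class(gal)) / len(inside)


def field_fractions(catalogue, bounds, in_class, n_positions=100, seed=1):
    """Fractions measured in cylinders placed at random positions in the field."""
    ra_min, ra_max, dec_min, dec_max = bounds
    rng = random.Random(seed)
    fractions = []
    for _ in range(n_positions):
        # Keep the whole cylinder inside the field to avoid edge effects.
        ra0 = rng.uniform(ra_min + RADIUS_DEG, ra_max - RADIUS_DEG)
        dec0 = rng.uniform(dec_min + RADIUS_DEG, dec_max - RADIUS_DEG)
        frac = fraction_in_cylinder(catalogue, ra0, dec0, in_class)
        if frac is not None:
            fractions.append(frac)
    return fractions


def make_mock_catalogue(n_gal=5000, seed=0):
    """Purely synthetic catalogue, used only to exercise the functions."""
    rng = random.Random(seed)
    return [
        {"ra": rng.uniform(149.8, 150.4), "dec": rng.uniform(1.9, 2.52),
         "z_phot": rng.uniform(2.2, 2.7), "log_mass": rng.gauss(10.0, 0.5)}
        for _ in range(n_gal)
    ]


def is_massive(gal):
    """Massive class from the text: M > 10^10.5 solar masses."""
    return gal["log_mass"] > 10.5


catalogue = make_mock_catalogue()
field = field_fractions(catalogue, (149.8, 150.4, 1.9, 2.52), is_massive)
print(f"field massive fraction: {statistics.mean(field):.3f} "
      f"+/- {statistics.stdev(field):.3f}")
```

The scatter of the field values gives the reference against which the single proto-cluster measurement is judged; with only a few tens of galaxies per cylinder that scatter is large, which is the reason a weak environmental trend is hard to establish.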

4.5 Summary and conclusions

We presented a z = 2.45 proto-cluster with 11 spectroscopically confirmed members. It was first identified in zCOSMOS-deep and then followed up with FORS2 spectroscopy. Its member galaxies lie within a radius of 1.4 Mpc (physical) on the sky and within ∆v = ±700 km/s. We estimated an overdensity of 10, in line with the structure having made its turn-around, but not yet having accreted its member galaxies onto a common dark matter halo.

This picture is confirmed by comparison of the proto-cluster to similar structures in simulations. To that end we carefully constructed mock catalogues that resemble the observational situation and identified analogous proto-clusters therein. We followed the evolution of these mock proto-clusters from z ∼ 2.5 to z = 0. We find that indeed most of the member galaxies are still centrals in their own DM halo at z ∼ 2.5. By z = 0 most of them share the same halo and hence form a cluster. Furthermore the z = 0 halo has a mass of $M \gtrsim 10^{14} - 10^{15}\,M_\odot/h$, equivalent to a Virgo- or Coma-like cluster.

We identified all z ∼ 2.5 mock progenitor galaxies that will by z = 0 share the DM halo with the originally identified mock proto-cluster galaxies. These galaxies would mostly be too faint for observations, but still lie within the ∆v = ±700 km/s needed to be associated with the proto-cluster. For each of the mock proto-clusters there exist several hundred to several thousand of these progenitors, spread over an area with a diameter between 3 and 20 pMpc, and hence occupying a much wider field than suggested by the originally identified members. This optical selection of proto-clusters therefore results mostly in loose structures and a rich diversity of objects. In order to fully characterise the progenitor population of today’s massive clusters these wide fields need to be observed. The numbers from above furthermore hint towards an extended structure in the zCOSMOS field.

In the last section we studied the galaxy population in the area of the proto-cluster in search of early signatures of environmental differentiation. We compared the fraction of massive, highly star-forming or quiescent galaxies in the proto-cluster to the field. Whilst we see a weak trend for more massive and quiescent galaxies in the proto-cluster, this is not statistically significant.

[Figure 4.6: bar chart of the fraction of galaxies in each class (vertical axis, 0 to 0.5) for field galaxies and the proto-cluster. The classes are mass > 10^10.5 M_sun, SFR > 50 M_sun/yr and quiescent.]

Fig. 4.6 — We show the fraction of massive $M > 10^{10.5}\,M_\odot$ (left), highly star-forming SFR > 50 $M_\odot$/yr (middle) and quiescent (right) galaxies in the proto-cluster (red) and the field (blue). There is a weak trend for more massive and quiescent galaxies within the proto-cluster; this is however not significant.


Tab. 4.6 — Summary of the FORS2 redshifts. Columns are COSMOS or zCOSMOS identifier, right ascension, declination, FORS2 redshift, FORS2 flag, COSMOS photometric redshift and B-band magnitude.

ID       RA          Dec       z_FORS2  flag  z_phot  B_AB
1003122  150.1211    2.232462  1.7054   1     2.2062  25.031
1004337  150.125     2.224069  2.3068   1     1.9549  24.566
1006611  150.14355   2.210877  2.1922   1     2.0483  24.791
1004784  150.17492   2.224882  2.1608   1     1.9801  24.357
1002523  150.10464   2.235805  2.6485   2     2.2192  25.242
1004309  150.12292   2.22458   2.4928   3     2.1574  25.234
1007719  150.12986   2.20293   2.2225   2     1.9943  25.028
1003217  150.133     2.231528  2.2648   1     1.9845  24.283
1004429  150.13768   2.225001  1.9350   1     2.058   24.317
1004964  150.14147   2.221075  2.2324   3     1.8237  24.233
1003836  150.16526   2.226338  1.9675   1     2.1302  25.194
1001513  150.17455   2.240116  1.8369   1     1.956   24.289
998193   150.07527   2.262835  1.6784   1     2.0677  25.275
1002983  150.08434   2.233617  2.0824   1     1.9846  25.453
1002194  150.10235   2.238195  2.0369   1     2.0248  24.878
1006341  150.10851   2.212045  2.0323   2     1.7425  25.207
1029209  149.9755    2.227124  2.4397   1     2.4596  25.558
409614   149.995026  2.239803  2.4393   3     2.3575  24.293
1026462  149.991     2.24488   2.3403   1     2.3475  25.084
429868   150.007828  2.249362  2.4428   3     2.4311  24.708
429950   149.996613  2.256573  2.4415   3     2.3934  24.79
1024107  150.00248   2.261098  2.7418   1     2.5467  25.776
410000   150.008743  2.26408   2.4421   2     2.3597  25.158
1021417  149.98349   2.279298  2.5102   1     2.2977  24.33
1020660  149.99544   2.285119  2.6752   1     2.5485  25.526
1036599  149.98326   2.179332  2.2885   3     2.5303  25.944
1035510  149.94599   2.184009  2.4906   3     2.5269  24.641
1035143  149.9935    2.188273  2.5547   3     2.5153  25.226
1034036  149.99157   2.194295  2.45054  3     2.4173  24.812
1033054  149.98872   2.202892  1.8987   1     2.6491  24.973
1032336  149.98813   2.206609  2.5028   1     2.4967  25.369
1032336  149.98813   2.206609  2.4533   3     2.4967  25.369
429340   149.98674   2.20937   2.5560   3     2.3614  25.072
409127   149.994888  2.212349  2.5533   4     2.5177  24.542
1031108  149.97563   2.21506   2.4456   3     2.5333  25.155
429401   150.038605  2.212807  2.0987   3     1.9977  24.823
420557   150.031876  2.209333  1.9985   1     2.0633  24.347
1032431  150.04682   2.206569  2.3204   3     1.951   25.453
1033331  150.023     2.199258  2.2118   1     2.0783  24.554
420406   150.037491  2.193803  2.0945   3     1.9988  23.992
1010554  150.06685   2.183105  1.7580   2     1.9608  24.117
1037707  150.04143   2.171935  2.2847   1     1.9645  25.263
1022028  150.02802   2.274885  2.4527   2     2.261   25.235


Tab. 4.7 — Continuation of Table 4.6

ID       RA          Dec       z_FORS2  flag  z_phot  B_AB
1023628  150.01885   2.265366  2.4461   1     2.3715  25.259
1023927  150.01939   2.261413  2.4499   3     2.2152  25.108
1025551  150.0467    2.251012  2.2769   3     2.3017  23.915
1024880  150.00588   2.254438  2.8977   3     2.3762  24.239
409745   150.027603  2.248289  2.2807   2     2.3535  24.238
409666   150.030853  2.243493  2.3224   1     2.3329  24.303
1027483  150.02239   2.23893   2.5318   1     2.4729  25.149
1029259  150.02835   2.226368  2.2038   1     1.9451  25.38
1027684  150.03355   2.235487  2.5589   3     2.327   24.463
1024878  150.0077    2.24666   2.5241   3     2.2065  24.463
1024888  150.00149   2.251221  2.0930   2     2.2837  24.258
1024880  150.00588   2.254438  2.1172   2     2.3762  24.239
1024880  150.00588   2.254438  0.2710   3     2.3762  24.239
1024345  150.0182    2.259444  2.6705   13    2.3577  23.137
1023116  150.02473   2.264624  2.3684   1     2.0659  25.218
1021467  150.03028   2.278306  2.6172   1     1.9227  24.99
1020488  150.03646   2.286129  2.4138   2     2.1629  24.856
1036718  149.98992   2.174087  2.0041   3     2.0189  23.664
1036059  150.02961   2.179876  2.5574   3     2.1887  25.236
1035427  149.99592   2.185     2.6283   1     2.0041  24.946
1033584  150.0288    2.197288  2.1850   2     2.071   24.368
1034299  150.00941   2.192789  2.2822   2     1.9587  25.192
1033253  149.9889    2.200515  2.4658   3     2.1871  24.883
1032522  150.03367   2.204548  1.9252   1     1.9785  25.064


Part III

The Lyα-forest-Galaxy cross-correlation at z ∼ 2

Chapter 5

The Lyα-forest data

This chapter is devoted to the description of the reduction and first analysis of the Lyα-forest data-set that we will be using for a later cross-correlation analysis. It will explain the techniques applied to extract the absorption information as well as define the sample properties. Starting with a motivation and summary of the observing strategy, we will then detail the data reduction process and the continuum-fitting algorithm. We study the impact of S/N and resolution on the continuum placement and describe how we measure the amount of HI absorption. Finally we connect the Lyα-forest data-set with the zCOSMOS-deep sample.

5.1 Description of the programme

With the advent of high-resolution spectrographs on 8-10 m class telescopes the extensive study of the connection between the Lyα-forest and galaxies at z ≳ 2 has become feasible. Since these spectrographs require rather bright targets in order to achieve a useful S/N ratio, most studies so far have focussed on characterising the immediate vicinity of galaxies by targeting very high luminosity QSOs. These studies typically consist of disconnected fields around those QSOs. “Stacking” a number of such fields allows one to construct rather detailed maps of the HI distribution around high-redshift galaxies, but is restricted to a few comoving Megaparsecs distance from the galaxies (see in particular Rakic et al. 2012, Rudie et al. 2012). With the galaxy sample of zCOSMOS-deep at hand, spanning a larger, contiguous field, acquiring high resolution spectra of a number of QSO sightlines therein would allow us to study the HI-galaxy cross-correlation at larger, cosmologically relevant, scales than previously possible. This is particularly interesting as the intergalactic HI is thought to trace the large scale DM structure at z ∼ 2 essentially bias-free.

There are two main issues in focussing on a single contiguous field. First, it is more susceptible to cosmic variance than the joint data from multiple (small) fields that may be located in very different regions of the sky. Second, finding a suitable number of QSOs in a relatively small part of the sky means observing fainter objects, at lower S/N and lower resolution, than in past multiple-field studies. The aim of the Lyα-forest dataset used in this thesis was to observe a sample of ten 2.4 < z < 2.8 QSOs in the zCOSMOS-field. Due to the (relative) target faintness, the strategy is to observe the complete sample of 10 QSOs with VLT/XSHOOTER at medium resolution and re-observe the brightest targets with VLT/UVES at high resolution to understand the impact of the lower resolution on our estimates of optical depths.

[Figure 5.1: histogram of N_gal (0 to 200) against z_spec (1.9 to 2.9) for the zCOSMOS-deep galaxies and the QSOs.]

Fig. 5.1 — Redshift distribution of the zCOSMOS-deep galaxies (blue) and QSOs (red) in our study. We included all galaxies with flags 2, 3 and 4.

The QSOs were selected purely to have a suitable redshift, i.e. $2.5 \lesssim z_{\rm QSO} \lesssim 3$ for studying the Lyα-forest at z ∼ 2 − 2.5, and a magnitude that allows them to be observed with XSHOOTER. The QSO redshift distribution in relation to the zCOSMOS-deep redshifts is shown in Figure 5.1.

5.2 The XSHOOTER observations

VLT/XSHOOTER is a single-object, multi-arm echelle spectrograph currently mounted at the Cassegrain focus of UT2. It spans the wavelength range of 3000−24800 Å in three arms: UVB (3000−5595 Å), VIS (5595−10240 Å) and NIR (10240−24800 Å). Each of the arms is an independent spectrograph with its own slit. The large wavelength range of XSHOOTER allows efficient observation of the Lyα-forest (and higher order Lyman-transition lines) as well as metal-line systems. The latter pose an interesting field of study on their own, but also allow us to remove metal-line contaminants from the Lyα-forest.

All 10 QSOs in our sample were observed with XSHOOTER in service mode utilising a set-up with a 1″ slit for the UVB arm to yield a resolution (R = λ/∆λ) of R = 5100, in combination with a 0.9″ slit for the VIS arm (R = 8800) and a 0.9″ slit for the NIR arm (R = 5600). Further a 2×1 binning was applied, i.e. a binning of 2 pixels in the spectral direction and 1 pixel in the spatial direction. With the exception of QSO-525, each QSO was observed for 2700s (in 3 exposures of 900s) at a seeing of 0.8″ or better. The three exposures were set within the same observing block and taken consecutively. QSO-525 was unfortunately only observed for 1800s, with half of its integration time in deteriorating seeing conditions and 1.5 months lying between the two exposures. Table 5.1 summarises our QSO survey.

Tab. 5.1 — Summary of our XSHOOTER data-set and the corresponding observations. The QSOs marked by a (*) were also observed with UVES. We note the right ascension (RA) and declination (Dec), redshift (z_qso), B magnitude (B_AB) as measured from the COSMOS photometry survey (Capak et al. 2007), total exposure time (exp-time) and observation date.

QSO       RA           Dec          z_qso  B_AB  exp-time  Date
QSO-199*  09:58:58.68  +02:01:39.2  2.454  18.9  2700s     31.12.2010
QSO-51*   10:00:14.14  +02:00:54.6  2.497  19.9  2700s     03.01.2011
QSO-5483  10:02:23.9   +0:23:53.6   2.825  21.0  2700s     03.04.2011
QSO-133   10:01:05.31  +02:13:48.3  2.615  20.7  2700s     06.01.2011
QSO-93    10:01:10.19  +02:32:42.4  2.653  21.4  2700s     30.01.2011
QSO-512   09:59:23.55  +02:22:27.4  2.730  21.0  2700s     30.12.2010
QSO-146*  09:59:38.28  +02:04:50.1  2.804  21.1  2700s     03.01.2011
QSO-161   10:01:08.55  +02:00:52.5  2.681  21.2  2700s     06.01.2011
QSO-498   10:01:15.95  +02:14:48.4  2.490  22.0  2700s     19.03.2012
QSO-525   09:59:41.41  +01:58:45.3  2.494  22.3  1800s     14.11 & 29.12.2010
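To give a feeling for what the quoted resolving powers mean in the Lyα-forest, the following sketch converts R = λ/∆λ into a velocity and a wavelength resolution element. It is illustrative only: the Lyα rest wavelength of 1215.67 Å is a standard value supplied here, the UVES value R = 41400 is taken from Section 5.4.1, and the function names are hypothetical.

```python
C_KMS = 299792.458   # speed of light in km/s
LYA_REST = 1215.67   # Lya rest wavelength in Angstrom (standard value, assumed here)


def velocity_resolution(resolving_power):
    """Resolution element in km/s for R = lambda / delta_lambda."""
    return C_KMS / resolving_power


def wavelength_resolution(resolving_power, wavelength):
    """Resolution element in Angstrom at the given wavelength."""
    return wavelength / resolving_power


def lya_observed(z):
    """Observed wavelength (Angstrom) of Lya absorption at redshift z."""
    return LYA_REST * (1.0 + z)


for name, r in (("XSHOOTER UVB", 5100), ("UVES blue", 41400)):
    print(f"{name:13s} R = {r:5d}: {velocity_resolution(r):5.1f} km/s, "
          f"{wavelength_resolution(r, lya_observed(2.3)):.3f} A at z_Lya = 2.3")
```

The Lyα-forest at z ∼ 2 − 2.5 falls at roughly 3650 − 4250 Å, i.e. entirely within the UVB arm, which is why the R = 5100 of that arm sets the resolution relevant for the optical depth measurements.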

5.2.1 Data-reduction

XSHOOTER is an extremely complex instrument, with its 3 arms and highly curved orders. For the processing of each science exposure, of order ∼100 calibration frames are required, making the data reduction quite challenging. This is aggravated by the faintness of our targets, which requires careful processing at each step. Furthermore the official XSHOOTER pipeline, whilst being a useful tool for some of the reduction steps, struggles to extract spectra in a meaningful way, particularly when dealing with low S/N data (but also, to a lesser degree, for high S/N data). For an optimal reduction, we therefore used a combination of the ESO XSHOOTER pipeline recipes and our own scripts. We only relied on the pipeline to create 2D rectified spectra of the individual exposures, which we subsequently co-added and on which we then performed a tailor-made extraction. An overview of the XSHOOTER data and the reduction process is shown in Figure 5.2; we describe the details in the next two sections.

Basic steps: Creating the 2D spectrum

As a first step, we created scripts to generate input files for the ESO esorex recipes (esorex being the command line version of the pipeline), always selecting the optimal calibration files, i.e. the ones with the required binning, mode, etc. and closest in time to the actual science observations. We then built a data reduction workflow by connecting the input and output of various ESO recipes, checking for each step that the ESO default parameters give optimal results, and adjusting them where necessary.


Fig. 5.2 — Overview of the XSHOOTER data reduction workflow, described in detail in section 5.2.1.


In detail the reduction steps were as follows. The science frame was bias-subtracted and flat-fielded, and the inter-order background light as well as the sky background were subtracted. The different spectral orders were carefully modeled (taking into account the instrument flexure) before the XSHOOTER pipeline performed one single resampling and rectifying step to create the 2D wavelength- and flux-calibrated order spectra. Finally the orders were merged to create a single final calibrated 2D spectrum. In most of the reduction steps, the ESO default parameters, optimised for stable reduction, yielded the best results. We however adjusted the parameters controlling the cosmic ray removal, sky background subtraction and order localisation.

This procedure resulted in 2D calibrated and merged spectra for each individual exposure. Despite the optimised values for the cosmic ray (CR) rejection, we still detected a significant number of CRs in our 2D spectra. We therefore utilised the DCR code (Pych 2004) to produce a CR map for the co-adding of the individual frames. Because the 2D spectra are mostly at rather low S/N, the individual exposures were co-added before object extraction. With the exception of QSO-525, all 3 exposures for a given QSO were taken consecutively, meaning that co-addition was possible without prior wavelength correction for heliocentric motion. We verified that the object was aligned in all 3 exposures by visually inspecting the profiles collapsed along the dispersion direction: all were centered within offsets smaller than the size of a pixel. We also created a mastermap for each exposure that combined the CR map and the bad pixel map from the pipeline. The 2D spectra were average-co-added, excluding any pixel flagged in the mastermap. In the case of QSO-525, with only two exposures taken 1.5 months apart from each other, we shifted the exposures into a common frame of reference before co-adding them.
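The masked average co-addition described above can be sketched in a few lines. This is a minimal illustration and not the script used for the reduction: frames are represented as nested lists indexed [row][column], the function names are hypothetical, and pixels flagged in every exposure are simply set to NaN, which is an assumption made here.

```python
def master_map(cr_map, bad_pixel_map):
    """Combine a cosmic-ray map and a bad-pixel map (True = flagged)."""
    return [[cr or bad for cr, bad in zip(cr_row, bad_row)]
            for cr_row, bad_row in zip(cr_map, bad_pixel_map)]


def coadd(exposures, masks):
    """Average aligned 2D frames pixel by pixel, skipping flagged pixels.

    exposures[k][i][j] is the flux of exposure k at pixel (i, j);
    masks[k][i][j] is True where that pixel must be excluded.
    Pixels flagged in every exposure are returned as NaN.
    """
    n_rows, n_cols = len(exposures[0]), len(exposures[0][0])
    result = [[float("nan")] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            good = [frame[i][j] for frame, mask in zip(exposures, masks)
                    if not mask[i][j]]
            if good:
                result[i][j] = sum(good) / len(good)
    return result


# Three tiny aligned "exposures"; the first one has a cosmic-ray hit in its last pixel.
frames = [[[1.0, 2.0, 100.0]], [[3.0, 2.0, 4.0]], [[2.0, 2.0, 6.0]]]
flags = [[[False, False, True]], [[False] * 3], [[False] * 3]]
print(coadd(frames, flags))
```

Averaging only over unflagged pixels keeps a cosmic-ray hit in one exposure from contaminating the co-added frame, at the cost of a slightly higher noise in that pixel.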

Extraction of the spectrum

The extraction itself was done by a tailor-made code performing an optimal extraction on the 2D spectrum in order to achieve the maximum S/N possible. The basic steps are to first ensure a good sky-subtraction and then determine a profile for the object trace, which is subsequently used to perform the extraction; an overview is shown in Figure 5.3. More specifically, we divided each 2D spectrum into 20 wavelength regions for the UVB arm and 30 for the VIS. As the sky-subtraction in the ESO pipeline was at times sub-optimal, a second sky-subtraction was performed on the co-added spectrum to eliminate any potential residual sky. As a next step a high S/N profile within each region was determined by collapsing the spectrum in the dispersion direction. The edges of the aperture occupied by the QSO were then defined by truncating the profile as soon as the flux drops below 1σ in excess of the sky-level. These endpoints were interpolated along the dispersion direction. Likewise we interpolated the profile of the object, so that at each step in λ we would have a profile with well-defined endpoints, and zero weights outside of the aperture (this is to enforce positivity in the flux). We then used the optimal extraction method described in Horne (1986), where the 1D spectrum is calculated by weighting the flux values by the object profile as determined above.
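The weighting step of the optimal extraction can be sketched as follows. This is a simplified illustration of the inverse-variance, profile-weighted sum of Horne (1986), not the tailor-made code itself: it assumes that the sky model, the per-pixel variance and the interpolated profile are already available, it omits the iterative rejection of outliers, and all names are hypothetical.

```python
def optimal_extract(data, sky, variance, profile):
    """Profile-weighted extraction of a 1D spectrum from a 2D spectrum.

    All inputs are nested lists indexed [wavelength][spatial pixel].
    `profile` must be zero outside the object aperture and have positive
    weight inside it; negative values are clipped to zero.
    Returns (flux, flux_variance), one value per wavelength step.
    """
    flux, flux_var = [], []
    for d_row, s_row, v_row, p_row in zip(data, sky, variance, profile):
        p_clip = [max(p, 0.0) for p in p_row]
        p_sum = sum(p_clip)
        p_norm = [p / p_sum for p in p_clip]   # profile normalised to unit sum
        numerator = sum(p * (d - s) / v
                        for p, d, s, v in zip(p_norm, d_row, s_row, v_row))
        denominator = sum(p * p / v for p, v in zip(p_norm, v_row))
        flux.append(numerator / denominator)
        flux_var.append(1.0 / denominator)
    return flux, flux_var


# One wavelength step: an object of total flux 10 on a sky level of 3.
shape = [0.0, 1.0, 2.0, 1.0, 0.0]
row = [3.0 + 10.0 * p / sum(shape) for p in shape]
flux, var = optimal_extract([row], [[3.0] * 5], [[1.0] * 5], [shape])
print(flux, var)
```

For noise-free input the estimator returns the true flux, and its variance is smaller than that of a plain sum over the same aperture, which is the gain in S/N that motivates the method for faint targets.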


Fig. 5.3 — Overview of the extraction algorithm applied to the XSHOOTER data. Each spectrum is divided into wavelength regions, which are used to derive a high S/N profile. Interpolated across the spectrum, this profile serves as the weighting profile in the optimal extraction.

The wavelengths of the resulting 1D spectra were corrected for heliocentric motion, by assuming the respective corrections at the midpoint of all 3 exposures. Further we converted the wavelengths into vacuum. It turned out that in the case of QSO-525 the two exposures that were taken were not enough to achieve a signal-to-noise ratio sufficient for our purpose. The same is true for QSO-498, which was (next to QSO-525) the faintest target. We therefore exclude these two QSOs from further analysis.
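The two wavelength corrections mentioned above can be illustrated with a short sketch. It is not the code used for the reduction and rests on assumptions made here: the heliocentric shift is applied to first order in v/c with the sign convention stated in the docstring, and the air-to-vacuum conversion inverts a widely used standard-air dispersion formula by iteration; which formula was actually applied to the data is not stated in the text.

```python
C_KMS = 299792.458   # speed of light in km/s


def heliocentric(wavelength, v_helio):
    """Shift an observed wavelength to the heliocentric frame.

    v_helio (km/s) is the correction velocity, taken here to be positive
    when the observatory moves towards the source (assumed convention).
    """
    return wavelength * (1.0 + v_helio / C_KMS)


def refractive_index(wave_vac):
    """Refractive index of standard air for a vacuum wavelength in Angstrom.

    Coefficients of a commonly used dispersion formula (an assumption here),
    intended for wavelengths above about 2000 Angstrom.
    """
    s2 = (1.0e4 / wave_vac) ** 2
    return (1.0 + 0.0000834254
            + 0.02406147 / (130.0 - s2)
            + 0.00015998 / (38.9 - s2))


def air_to_vacuum(wave_air, iterations=5):
    """Convert an air wavelength (Angstrom) to vacuum by fixed-point iteration."""
    wave_vac = wave_air
    for _ in range(iterations):
        wave_vac = wave_air * refractive_index(wave_vac)
    return wave_vac


print(f"{air_to_vacuum(4000.0):.3f}")        # vacuum wavelength for 4000 A in air
print(f"{heliocentric(4000.0, 30.0):.3f}")   # 30 km/s heliocentric correction
```

Both corrections are of order 1 Å or less in the forest region, but that is comparable to the XSHOOTER resolution element, so they matter when absorption is compared with galaxy redshifts.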

5.3 Continuum fitting algorithm

Accurate continuum estimation in the Lyα-forest is important for the subsequent analysis: errors in the continuum placement will affect the estimated optical depths, and if the error is systematic it will bias the result. Errors in the continuum placement lead to a constant percentage error in the normalised flux, but when inferring optical depths (these will be introduced in section 5.5) from the normalised flux this is no longer true, due to the non-linear relation between the flux and the optical depth. The percentage error on the estimated apparent optical depths is shown in Figure 5.4; high apparent optical depths are changed only a little by changes in the continuum placement, whilst for small optical depths the errors become excessive. As we will show in Chapter 7, most of our signal comes from the high optical depth tail, therefore these errors should not influence our analysis too much.

[Figure 5.4: percentage error ∆τ (vertical axis, 0 to 100%) against τ (0 to 2.5) for continuum errors of 1%, 5% and 10%.]

Fig. 5.4 — The impact of errors in the continuum estimation on the inferred optical depths, for 1%, 5%, and 10% errors. For optical depths with τ ≳ 0.5 continuum errors are tolerable, whilst for small optical depths (i.e. normalised fluxes close to unity) the errors become excessive. As we will argue later, this should not influence our result greatly, as the signal is dominated by higher optical depths.

Accurately identifying and fitting the continuum is non-trivial. In the past, various methods have been employed for continuum fitting, mostly dependent on the size of the survey in question. For large surveys with rather low resolution spectra, the primary methods are a principal component analysis (Suzuki et al. 2005, Paris et al. 2010) or the extrapolation of power-laws. Both approaches use the shape of the continuum redwards of the Lyα emission of the QSO to infer the continuum in the Lyα-forest region. With only a small number of high S/N spectra, most authors resort to visual inspection and identification of parts of the spectrum that can be inferred to be absorption-free, and then use polynomial or spline fits. This involves a fair amount of user-bias and is subject to what one believes to be noise as opposed to low-level absorption.

To avoid this human intervention, and to make the continuum estimate as unbiased as possible, we devised a semi-automated algorithm to identify the parts of the spectra unaffected by absorption that are to be included in the fit. We use the expected shape of the flux distribution to constrain the noise properties of the spectrum (an already normalised example is shown in Figure 5.5): assuming the spectrum itself is relatively flat, the flux distribution will roughly take the shape of a Gaussian, but with a tail towards lower fluxes (leftwards) due to the Lyα-forest absorption. Rightwards of the peak the distribution will be dominated by the noise, and its width will be a measure of the noise.
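The behaviour shown in Figure 5.4 follows directly from the logarithmic relation between flux and optical depth, as the following sketch illustrates. It assumes the standard definition of the apparent optical depth, τ = −ln(F_norm), which is only introduced in section 5.5, and a continuum that is placed too high by a fixed fraction; the function names are hypothetical.

```python
import math


def tau_from_flux(f_norm):
    """Apparent optical depth for a normalised flux (assumed: tau = -ln F)."""
    return -math.log(f_norm)


def tau_percentage_error(tau, continuum_error):
    """Percentage error on tau if the continuum is too high by `continuum_error`.

    Dividing by a continuum that is (1 + e) times too high lowers the
    normalised flux by that factor, which adds ln(1 + e) to tau.
    """
    tau_biased = tau + math.log(1.0 + continuum_error)
    return 100.0 * (tau_biased - tau) / tau


for err in (0.01, 0.05, 0.10):
    row = ", ".join(f"tau={t}: {tau_percentage_error(t, err):5.1f}%"
                    for t in (0.1, 0.5, 2.0))
    print(f"{int(round(err * 100)):2d}% continuum error -> {row}")
```

Because the absolute shift ln(1 + e) is the same for every pixel, the relative error scales as 1/τ: a 10% continuum error changes τ = 2 by only about 5%, but τ = 0.1 by almost 100%.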
In our algorithm we determine the flux distribution in a box that moves along the wavelength direction. The box size is typically a few hundred pixels across (∼ 50 Å), dependent on the thickness of the Lyα-forest and the overall S/N of the spectrum, with more absorption generally requiring bigger box sizes. The noise within each box is estimated from the full-width-half-maximum (FWHM) rightwards of the peak: we flag any pixel with a flux $f_i$ that obeys $f_{\rm peak} - {\rm FWHM}/1.5 < f_i < f_{\rm peak} + {\rm FWHM}/1.5$ as a continuum pixel. This only uses the noise-dominated part of the flux distribution. We checked that within FWHM/1.5 the leftwards side of the peak also shows no excess, which would be a sign of absorption features being included in the continuum pixels. Obviously the procedure only works if the spectrum is reasonably flat, i.e. does not have any steep slopes (large scale gradients on the other hand pose no problem). This condition is not met near the Lyα and Lyβ emission of the QSO itself. There the spectrum is manually “pre-fitted” before applying the algorithm. The continuum is then placed by fitting a cubic spline to the pixels inferred to be part of the continuum. Figure 5.6 shows an example of such a fit as well as the resulting normalised spectrum, and in Figure 5.7 all 8 sightlines are displayed.

[Figure 5.5: histogram of N (0 to 500) against the normalised flux f_norm(λ) (−0.5 to 1.5).]

Fig. 5.5 — Normalised flux distribution of QSO-51. It mostly resembles a Gaussian centred at around 1, with the leftwards wing arising from the HI absorption. We use this fact in our continuum-fitting algorithm (see text for details).
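The flagging of continuum pixels can be sketched as follows. This is a simplified illustration rather than the algorithm as implemented: it uses non-overlapping boxes instead of a moving box, estimates the FWHM from a coarse histogram (which tends to overestimate it slightly), leaves out the manual pre-fitting and the cubic-spline fit, and all names and default values are hypothetical.

```python
import random


def flag_box(flux, n_bins=30):
    """Flag the continuum pixels within one box (True = continuum)."""
    lo, hi = min(flux), max(flux)
    if hi == lo:                     # degenerate box: nothing to distinguish
        return [True] * len(flux)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for f in flux:
        counts[min(int((f - lo) / width), n_bins - 1)] += 1
    peak = counts.index(max(counts))
    f_peak = lo + (peak + 0.5) * width
    # Walk rightwards from the peak until the histogram drops below half maximum;
    # only this noise-dominated side is used to measure the width.
    half_max = counts[peak] / 2.0
    k = peak
    while k < n_bins - 1 and counts[k] >= half_max:
        k += 1
    fwhm = 2.0 * (k - peak) * width
    return [abs(f - f_peak) < fwhm / 1.5 for f in flux]


def flag_spectrum(flux, box=300):
    """Apply flag_box to consecutive, non-overlapping boxes along the spectrum."""
    flags = []
    for start in range(0, len(flux), box):
        flags.extend(flag_box(flux[start:start + box]))
    return flags


# Synthetic, already flat spectrum: Gaussian noise around 1 plus absorbed pixels.
rng = random.Random(0)
spectrum = [rng.uniform(0.0, 0.6) if rng.random() < 0.25 else rng.gauss(1.0, 0.05)
            for _ in range(600)]
continuum = flag_spectrum(spectrum)
print(f"{sum(continuum)} of {len(spectrum)} pixels flagged as continuum")
```

The flagged pixels would then serve as the input for the spline fit; in a real spectrum the box size has to be tuned so that each box still contains enough unabsorbed pixels to define the peak.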

5.4 The UVES data

As mentioned before, the XSHOOTER data are at medium resolution, which may impact our determination of the continuum. The higher resolution UVES observations of some of the QSOs in our sample serve as a direct comparison between optical depth estimates obtained at medium resolution and at high resolution. In this section we first describe the UVES data reduction process, followed by this comparison. We also analyse the impact of S/N with an independent, high S/N QSO spectrum and derive corrections for the measured XSHOOTER optical depths.

[Figure 5.6: two panels sharing the wavelength axis λ (3600 to 4200); top panel f(λ) in units of 10^−16, bottom panel f_norm(λ)]

Fig. 5.6 — Top panel: Continuum fit for QSO-51 as performed by our algorithm. Bottom panel: The normalised spectrum of QSO-51.

5.4.1 Observations and set-ups

UVES is a single-object, high-resolution echelle spectrograph, installed at one of the Nasmyth foci of UT2. It covers the wavelength range 3000-11000 Å in two arms, red and blue, which can be operated either individually or combined. UVES offers a range of standard set-ups regarding the combination of dichroic, cross-disperser and filter, resulting in different wavelength coverages. The user can choose the set-up that best covers the wavelength range of interest. Three of the QSOs were additionally observed with UVES, using both arms of the instrument. For each arm a 1” slit was used, which translates to a resolution of R=41400 in the blue arm and R=38700 in the red. To guarantee full wavelength coverage in the blue and the red arm, a combination of two different instrument set-ups per QSO was used, as summarised in Table 5.2. The observations took place between December 2009 and April 2010 at < 1.0” seeing conditions.

[Figure 5.7: stacked panels of normalised flux f_norm(λ) against observed wavelength λ_obs [Å], one panel each for QSO-199, QSO-51, QSO-512, QSO-5483, QSO-93, QSO-146, QSO-161 and QSO-133; the top panel carries additional axes for the line-of-sight distance π (−335 to 302) and the Lyα redshift (2.13 to 2.62)]

Fig. 5.7 — Lyα-forest region of the eight QSOs in our final sample. We show the continuum normalised fluxes (blue). The red curve is the error spectrum; the grey lines are at 0 and 1 to guide the eye. Also shown are the regions which will be excluded from the analysis (in yellow) because of proximity to the QSO, metal-line contamination or the presence of a DLA (this will be described in section 5.5). The top panel shows an example of how wavelength translates to Lyα redshift (blue) and line-of-sight distance (red).


Tab. 5.2 — Summary of the UVES data-set and the corresponding observations. For each QSO we note the mode, cross-dispersion element, below-slit filter and resulting central wavelength (λcentr), arm and exposure time. For each set-up both arms share the dichroic, but differ in cross-disperser and filter.

QSO      mode     cross disp.  filter  λcentr  arm   exp-time
QSO-199  DICHR#1  CD#1         HER 5   3460 Å  blue  3000s
                  CD#3         SHP700  5800 Å  red
QSO-199  DICHR#2  CD#2         HER 5   4370 Å  blue  3000s
                  CD#4         OG590   8600 Å  red
QSO-51   DICHR#1  CD#1         HER 5   3460 Å  blue  12’000s
                  CD#3         SHP700  5800 Å  red
QSO-51   DICHR#2  CD#2         HER 5   4370 Å  blue  12’000s
                  CD#4         OG590   8600 Å  red
QSO-146  DICHR#1  CD#1         HER 5   3460 Å  blue  18’000s
                  CD#2         SHP700  5800 Å  red
QSO-146  DICHR#2  CD#2         HER 5   4370 Å  blue  18’000s
                  CD#4         OG590   8600 Å  red

Tab. 5.3 — A summary of the median signal-to-noise ratios per resolution element in the Lyα-forest, number of pixels in the forest and effective exposure time per set-up for the two usable UVES QSOs.

QSO      S/N     #pixels  exp-time
QSO-199  10.836  26348    3000s
QSO-51   8.2674  26676    12’000s

5.4.2 Data reduction and continuum fit

The standard ESO pipeline for UVES produces excellent and stable reductions. We therefore used it to reduce our data, having verified that the default parameters yield an optimal result. The pipeline provides fully extracted, calibrated 1D spectra for each arm, set-up and exposure, which then need to be co-added and merged for continuous wavelength coverage. Since the observations took place over a range of months, we first applied the heliocentric correction to the wavelengths of the 1D spectra before co-adding the exposures by averaging. For the merging of the set-ups and arms (there are two different set-ups for each arm, yielding 4 configurations in total) we chose not to co-add the overlap regions, to avoid any abrupt changes in the S/N. Instead we scaled the spectra to matching flux levels and then merged them in a continuum region, by cutting the spectrum with the lower wavelength coverage at some λ and continuing with the next spectrum, which covers a higher wavelength range. Finally we converted the wavelengths to vacuum.

Due to the faintness of the target, the data for QSO-146 turned out to be too noisy to extract a meaningful spectrum. We therefore excluded it from further analysis. We fit the Lyα-forest continuum of the two remaining QSOs with the same algorithm as described in section 5.3. The resulting data are summarised in Table 5.3.
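The merging step described above (scale to matching flux levels, then cut and continue rather than co-add the overlap) can be sketched as follows. This is a minimal illustration, not the reduction code used here: the function name is hypothetical, and matching the flux levels via the ratio of medians in the overlap region is an assumption made for this example.

```python
from statistics import median

def splice_spectra(wave_a, flux_a, wave_b, flux_b, cut):
    """Join a bluer spectrum (a) and a redder spectrum (b) at `cut`.

    Spectrum b is rescaled to the flux level of a using the ratio of
    medians in the overlap (an assumption of this sketch). The overlap
    is not co-added: a is used below `cut`, b at and above it.
    Wavelength lists are assumed sorted and overlapping.
    """
    lo, hi = wave_b[0], wave_a[-1]  # overlap region
    in_a = [f for w, f in zip(wave_a, flux_a) if lo <= w <= hi]
    in_b = [f for w, f in zip(wave_b, flux_b) if lo <= w <= hi]
    scale = median(in_a) / median(in_b)
    wave = [w for w in wave_a if w < cut] + [w for w in wave_b if w >= cut]
    flux = [f for w, f in zip(wave_a, flux_a) if w < cut]
    flux += [f * scale for w, f in zip(wave_b, flux_b) if w >= cut]
    return wave, flux
```

Choosing `cut` inside a continuum region, as done in the text, keeps the join away from absorption features, and avoiding a co-add means the S/N changes only once, at the cut.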


5.4.3 The effects of resolution and S/N: Comparison of UVES and XSHOOTER data

As discussed above, the XSHOOTER spectrograph has medium resolution, so there exists the possibility that some of the weak HI absorption is mis-identified as noise, leading to an underestimate of the QSO continuum. Setting the continuum too low causes higher normalised flux levels, leading to an eventual underestimation of optical depths. To study this resolution effect, we use the two QSOs that were observed with both XSHOOTER and UVES. The UVES data have been processed by the same continuum fitting algorithm as the XSHOOTER data. We also degrade the data by a convolution with the XSHOOTER resolution and rebin them to XSHOOTER sampling. This allows us to directly assess the effect of resolution on the continuum fit.

At the same time one may expect a similar effect from high levels of noise: weak absorption will be attributed to noise, again leading to a possible underestimation of the continuum. To test for this scenario, we resorted to a high S/N z ∼ 2 QSO observed with UVES (Cédric Ledoux, private communication). This QSO is at slightly lower redshift than those in our sample, resulting in a thinner Lyα-forest, but it is still instructive as a double-check. To mimic a direct comparison with lower S/N observations, we added Gaussian noise to this spectrum and performed a continuum fit on both the high and the low S/N spectrum.

Our test data now consist of two UVES spectra which have been continuum fitted and then re-binned and convolved to XSHOOTER resolution, together with the corresponding XSHOOTER spectra, to test the effect of resolution, as well as one high S/N UVES spectrum and the same spectrum with added noise to test the effect of S/N. In Figure 5.8 we plot the flux difference between UVES and XSHOOTER as a function of the average flux, weighted by the S/N of the respective observations.
This weighting procedure ensures that the “error-ellipse” of the flux difference is aligned with the y-axis; taking into account only one (noisy) component would cause the flux difference to scatter diagonally. Only in the hypothetical situation that one of the two spectra had infinite S/N would that spectrum alone serve as a benchmark. Surprisingly at face value, the normalised flux as measured in the UVES data is actually higher than in the XSHOOTER data, meaning that either the XSHOOTER continuum fit overestimates the continuum (rather than underestimating it, as might be expected) or that the issue lies in the UVES continuum fit. The latter scenario may arise because the UVES spectra are actually much noisier than those from XSHOOTER, so confusion of low-level absorption and noise may occur more for the UVES data than for the XSHOOTER data.

[Figure 5.8: difference between the XSHOOTER and UVES normalised fluxes (−0.3 to 0.3) against the S/N-weighted flux f(λ)_w (−0.2 to 1.4); legend entries: overall, STD, SEM]

Fig. 5.8 — The normalised flux of XSHOOTER - UVES as a function of their S/N weighted flux. The blue bars indicate the mean and spread in each bin (of size 0.1); black bars show the error on the mean. Overall the flux measured in the UVES data is higher than the corresponding flux in the XSHOOTER data. This figure summarises both QSOs that have been observed with XSHOOTER and UVES.

We test the effect of S/N with the high S/N QSO spectrum described previously by comparing it to the exact same spectrum but with simulated noise. As shown in Figure 5.9 the continuum of the noisier spectrum is indeed underestimated. The effect for this QSO is rather small, most likely because there is (due to the lower redshift of the QSO used) less absorption. We therefore conclude that the S/N of a spectrum impacts estimates of the continuum to a larger extent than the resolution, at least in the case of medium to high resolution spectra and the typical S/N of our spectra. Consequently we used our highest S/N QSO (199) to determine a correction for those with lower S/N. We achieve this by degrading its spectrum to the respective S/N ratios and performing a continuum fit on the degraded spectra. From this fit we calculated the average difference ⟨f_norm,degraded − f_norm,QSO199⟩ for each QSO, as summarised in Table 5.4, and used those values to correct our estimates of the optical depths as presented in the next section.
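The degradation step can be illustrated with a short Python sketch. It assumes, for simplicity, that the spectrum is already continuum-normalised so that the per-pixel noise at the continuum is roughly 1/(S/N), and that extra noise adds in quadrature; both function names are hypothetical. The continuum re-fit of the degraded spectrum, which is what actually produces the corrections in Table 5.4, is not reproduced here.

```python
import math
import random

def degrade_to_snr(flux, sigma, target_snr, rng):
    """Add Gaussian noise so that a normalised spectrum with per-pixel
    noise `sigma` reaches the lower signal-to-noise `target_snr`.

    Assumes sigma_target = 1 / target_snr at the continuum and that
    the extra noise adds in quadrature (assumptions of this sketch).
    """
    sigma_target = 1.0 / target_snr
    if sigma_target <= sigma:
        raise ValueError("target S/N must be lower than the current S/N")
    sigma_add = math.sqrt(sigma_target ** 2 - sigma ** 2)
    noisy = [f + rng.gauss(0.0, sigma_add) for f in flux]
    return noisy, sigma_target

def mean_flux_offset(f_degraded, f_reference):
    """Average difference <f_degraded - f_reference> over all pixels."""
    diffs = [d - r for d, r in zip(f_degraded, f_reference)]
    return sum(diffs) / len(diffs)
```

In the procedure described above, `mean_flux_offset` would be evaluated on the two spectra after each has been normalised by its own continuum fit; pure noise without a re-fit averages to an offset close to zero.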

Tab. 5.4 — We note for each QSO the correction on the normalised flux, obtained by comparing a degraded spectrum to the high S/N spectrum of QSO-199. Consequently the correction for QSO-199 is 0.

QSO         199  51      512     5483    93      146     161     133
Correction  0    0.0011  0.0040  0.0031  0.0060  0.0060  0.0047  0.0031

5.5 Estimates of optical depths

Estimating the absorption in a QSO line-of-sight is non-trivial due to the presence of noise and saturated lines. There exist a variety of methods to extract the information, depending on the desired application. For many cosmological applications the fluctuation around the mean flux level is used (e.g. Font-Ribera et al. 2015, Slosar et al. 2011), whilst e.g. Rudie et al. (2012) use manual or semi-automated Voigt-profile fitting of the individual features.

[Figure 5.9: difference between the normalised fluxes of the noise-added and the original UVES spectrum (−0.3 to 0.3) against the weighted flux f(λ)_w (−0.2 to 1.2)]

Fig. 5.9 — Direct comparison of the normalised flux estimates for the same high S/N UVES spectrum, one as the original spectrum and the other one with added noise. The presence of noise leads to an underestimation of the continuum.

Our main interest lies in studying the IGM as traced by the HI, which is believed to be a continuous matter distribution, rather than being constituted of individual, pressure-confined clouds (as described in the introduction). From a “philosophical” viewpoint it therefore makes sense to reflect this in the estimated HI distribution. Further, at XSHOOTER resolution, reliable Voigt-profile fitting becomes very challenging. We therefore chose to employ the pixel optical depth (POD) method (introduced by Cowie & Songaila 1998), which also has the advantage of being automated. Unlike the flux fluctuation method, optical depths are proportional to the column density of the absorbing material (at least in the linear regime) and hence somewhat closer to the physical interpretation. Generally the optical depth is defined as

τ(λ) = −ln(F_norm(λ)),

where F_norm is the normalised flux at wavelength λ. In the absence of noise the normalised flux always lies between 0 and 1, but in the presence of noise it can be above unity or negative. There are therefore two obvious break-downs of this method: saturation of a line (i.e. F_norm = 0 or, in the presence of noise, even F_norm < 0) and negative optical depths (for F_norm > 1). For saturated lines, previous studies (Rakic et al. 2012, Cowie & Songaila 1998, Aguirre et al. 2002) resort to higher-order Lyman transitions. This is however not suitable in our case: firstly, the part of the spectrum available for this method (due to the ∼ 3200 Å cut-off in the instrument throughput) usually only covers a part of the corresponding Lyα-forest transitions, and secondly, spectra towards this blue cut-off become increasingly noisy, making it almost impossible to obtain reliable continuum fits and optical depth estimates. For our purpose we are however not interested in this saturated regime but more in the low column density IGM. Therefore we chose not to include any pixels with F_norm(λ) < e(λ) in our analysis, where e(λ) denotes the flux error.

Tab. 5.5 — A summary of the Lyα-forest data-set, listing for each QSO the redshift, the lower limit zmin of the Lyα-forest data, the upper limit zmax, the number of pixels used in that sightline and the median S/N per resolution element in the forest.

QSO       zqso   zmin   zmax   #pixels  S/N
QSO-199   2.454  1.963  2.396  2634     48.783
QSO-51    2.497  2.000  2.439  2667     34.617
QSO-512   2.73   2.200  2.668  2845     15.412
QSO-5483  2.825  2.281  2.761  2918     20.363
QSO-93    2.653  2.134  2.592  2437     12.665
QSO-146   2.804  2.263  2.741  2211     11.546
QSO-161   2.681  2.158  2.620  2807     14.299
QSO-133   2.615  2.101  2.555  2757     20.48

Negative optical depths, i.e. pixels with F_norm(λ) > 1, are kept: we will later only be interested in the mean absorption at a given distance from galaxies, so these noise-induced values average out. The pixel-optical-depth method adopted here hence produces the following values: τ(λ) = −ln(F_norm(λ)) for all pixels except for saturated ones, which are masked. The estimate is performed on the corrected flux values (as described in the previous section).

Furthermore, we apply a standard cut of ∆v = 5000 km/s from the QSO Lyα emission line (e.g. Font-Ribera et al. 2012, Rakic et al. 2012) to avoid effects like QSO winds, density biases due to the large-scale QSO environment or the proximity effect. Likewise we cut at ∆v = 5000 km/s from the QSO Lyβ line to avoid the OVI line, which may otherwise be confused with Lyα-forest absorption.

The Lyα-forest can be contaminated by metal lines from absorbing systems with redshifts z < zqso. We identified these contaminants redwards of the QSO Lyα line by first focusing on doublets like MgII [2796.4 Å, 2803.5 Å], SiIV [1393.8 Å, 1402.8 Å], CIV [1548.2 Å, 1550.8 Å], NV [1238.8 Å, 1242.804 Å] and then by identifying remaining strong lines like SiII, AlII, FeII. We excluded the metal lines lying in the Lyα-forest from our analysis by masking those regions. It should be noted that this may also mask Lyα lines which coincide with a metal line. A list of all identified metal-line systems is given at the end of this chapter in Table 5.6.

In the sightlines of QSO-93 and QSO-146 we found one damped Lyα absorber (DLA) each. In particular the one in QSO-146 is a rather extreme system with a column density of log N ∼ 21.6 cm⁻². These systems most likely probe the immediate neighbourhood of a galaxy and contain high amounts of self-shielding neutral gas. As in our study we are interested in the (mostly ionized) IGM, we exclude the DLA regions from further analysis.
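The pixel selection described in this section can be sketched in Python as follows. The function names are hypothetical, and the velocity cuts are applied with a first-order (non-relativistic) Doppler shift, which is an assumption of this sketch rather than something stated in the text. With this assumption, z_qso = 2.454 gives a window of approximately (1.963, 2.396), consistent with the QSO-199 row of Table 5.5.

```python
import math

C_KMS = 299792.458             # speed of light in km/s
LYA, LYB = 1215.67, 1025.72    # rest wavelengths in Angstrom

def forest_limits(z_qso, dv=5000.0):
    """Lyα redshift window between the cut redwards of the QSO Lyβ
    line and the cut bluewards of the QSO Lyα line (first-order
    Doppler shift, an assumption of this sketch)."""
    z_max = (1 + z_qso) * (1 - dv / C_KMS) - 1
    z_min = (1 + z_qso) * (LYB / LYA) * (1 + dv / C_KMS) - 1
    return z_min, z_max

def pixel_optical_depths(wave, f_norm, err, z_qso):
    """Return (z_lya, tau) for every usable forest pixel.

    Saturated pixels (F_norm < e) are skipped; pixels with
    F_norm > 1 keep their negative tau so that noise averages out.
    """
    z_min, z_max = forest_limits(z_qso)
    pods = []
    for w, f, e in zip(wave, f_norm, err):
        z = w / LYA - 1
        if z_min <= z <= z_max and f >= e and f > 0:
            pods.append((z, -math.log(f)))
    return pods
```

Metal-line and DLA masks are not included; they would simply remove further pixels before the loop.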


We summarise the properties of the Lyα data-set in Table 5.5.
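To illustrate how a metal-line system identified redwards of the QSO Lyα line translates into contaminated forest pixels, the following sketch computes which transitions of an absorber fall into the Lyα-forest window of a sightline. The function name is hypothetical, the rest wavelengths in the usage note are the rounded values from Table 5.6, and the width of the masked region around each line is not specified in the text and therefore not modelled.

```python
LYA = 1215.67  # Lyα rest wavelength in Angstrom

def metal_lines_in_forest(z_abs, rest_waves, z_min, z_max):
    """Return (rest wavelength, observed wavelength) for each
    transition of an absorber at z_abs that lands inside the
    Lyα-forest window [z_min, z_max] of a sightline."""
    hits = []
    for lam0 in rest_waves:
        lam_obs = lam0 * (1 + z_abs)   # observed wavelength
        z_lya = lam_obs / LYA - 1      # redshift if read as Lyα
        if z_min <= z_lya <= z_max:
            hits.append((lam0, lam_obs))
    return hits
```

For the z = 2.2049 system in the QSO-199 sightline, calling `metal_lines_in_forest(2.2049, [1260.0, 1548.0], 1.963, 2.396)` returns only the SiII (1260) line, observed near 4038 Å, whereas CIV (1548) lies redwards of the forest.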

5.6 The QSO sample in relation to zCOSMOS-deep

The power of the Lyα data-set lies in its combination with the zCOSMOS-deep galaxy survey covering a field of ∼ 60 × 60 cMpc² at z ∼ 2. These data permit the study of the transverse correlation of HI and galaxies up to ∼ 60 cMpc, as well as the line-of-sight correlation up to ∼ 500 cMpc. We mostly probe the redshift range 2.0 ≲ z ≲ 2.5, where zCOSMOS-deep also has the highest number of measured redshifts. In this section we aim to show how the properties of the Lyα data-set relate to the zCOSMOS-deep sample and to give an intuition for the connection between those galaxies and the HI absorption. In Figure 5.10 we show the distribution of the zCOSMOS galaxies and the QSOs in our data-set. To guide the eye, we plot a 10 cMpc radius around each QSO. Half of the QSOs lie in or near the fully sampled area of zCOSMOS-deep, whilst the other half are located near the edge of the field. Whilst this does not impact our ability to perform the large scale analysis we aim at, it limits the data available to study the closer vicinity of galaxies.

[Figure 5.10: sky map of RA (150.6 to 149.8) against DEC (1.8 to 2.6), with labelled positions of QSO-93, QSO-5483, QSO-512, QSO-133, QSO-146, QSO-161, QSO-51 and QSO-199]

Fig. 5.10 — Positions of the 8 quasars in our sample (red). The black circles denote a 10 cMpc distance from the respective QSO. We also show the zCOSMOS-deep galaxies we use in our analysis. The blue open circles mark the location of the proto-groups we identified in Chapter 3.

To get a first view of how the galaxies and the Lyα absorption align, we show all 8 sightlines together with galaxies at < 15 cMpc in Figure 5.11, as well as the mean absorption from all redshifts and the zCOSMOS-deep redshift distribution. The mean redshift of the zCOSMOS galaxies in the range of the Lyα-forest observations is z = 2.3071 (for flag 2.5, 3 and 4) or z = 2.3312 (flag 3 and 4), and the mean redshift of the Lyα data itself is z = 2.3708. One can already observe that generally the stronger absorption features coincide with galaxies, even if this is, due to the limited sampling of zCOSMOS-deep, not always the case. At z ∼ 2.5 we detect two DLAs, as well as (even in the absence of those) a slightly elevated level of absorption. At this redshift we also reported a proto-cluster (Chapter 4) and Lee et al. (2015) detect this structure in their Lyα tomography of the COSMOS field.

[Figure 5.11: stacked panels against redshift z (1.9 to 2.8): galaxy counts N_gal, mean normalised flux with galaxies at b < 15 cMpc, and one f_norm panel per sightline; recovered panel labels: QSO 51, z=2.497; QSO 133, z=2.615; QSO 93, z=2.653; QSO 161, z=2.681; QSO 512, z=2.73; QSO 146, z=2.804; QSO 5483, z=2.825]

Fig. 5.11 — We show all 8 sightlines and the respective Lyα absorption, both binned (blue) and unbinned (light grey). The zCOSMOS galaxies are displayed out to 15 cMpc and colour-coded according to their distance to the respective sightline. The top two panels show the zCOSMOS-deep redshift distribution (top) and the average flux over all 8 sightlines, including the DLAs (second).


Tab. 5.6 — Summary of the metal-line systems identified in the 8 QSO sightlines. The lines within the Lyα-forest are shown in italic. The columns are QSO sightline, redshift of the system and list of the identified transitions, with the wavelength of each transition in brackets and in units of Ångströms.

Sightline  Redshift  Transition
QSO-199    0.6941    MgII (2796, 2804), FeII (2587, 2600), MgI (2853), FeII (2344, 2374, 2383)
QSO-199    1.2303    MgII (2796, 2804), FeII (2344, 2374, 2383, 2587, 2600), FeI (2167), AlII (1671), AlIII (1855, 1863)
QSO-199    2.2049    CIV (1548, 1551), CII (1335), MgII (2796, 2804), MgI (2853), FeII (2600), SiIV (1394, 1403), SiI (1631), AlII (1671), SiIII (1207), SiII (1190, 1193, 1260)
QSO-199    2.2070    MgII (2796, 2804), CI (1657), SiII (1260)
QSO-199    2.4608    CIV (1548, 1551), SiIV (1394, 1403), NV (1239, 1243), OVI (1032, 1038)
QSO-161    1.5126    MgII (2796, 2804), FeII (2344, 2383), SiI (1846), AlIII (1855), SiII (1527), CIV (1548, 1551)
QSO-161    1.7085    AlII (1671), MgI (2853), FeII (2344, 2374, 2383, 2587, 2600), MgII (2796, 2804), SiII (1527), CIV (1548, 1551), FeII (1608)
QSO-161    1.9505    CIV (1548, 1551), FeI (2463), FeII (2344, 2374, 2383, 2600), MgII (2796, 2804), AlII (1671), SiII (1527), OI (1302), SiII (1304), CII (1335), SiIV (1394, 1403)
QSO-161    2.0112    CIV (1548, 1551)
QSO-161    2.3054    CIV (1548, 1551), SiIV (1394, 1403), CII (1335), SiIII (1207), NV (1239, 1243)
QSO-133    0.8929    MgII (2796, 2804), FeII (2344, 2374, 2600), FeI (2463)
QSO-133    0.9157    MgII (2796, 2804)
QSO-133    0.9173    MgII (2796, 2804), FeII (2344, 2383, 2587, 2600)
QSO-133    1.9266    CIV (1548, 1551)
QSO-133    2.0066    CIV (1548, 1551), CII (1335)
QSO-133    2.1535    MgII (2796, 2804), SiIV (1394, 1403), FeII (2383, 2600), SiII (1527), AlII (1671), SiII (1190, 1193, 1260, 1304), SiIII (1207), CII (1335), OI (1302)
QSO-133    2.1702    MgII (2796, 2804), SiIV (1394, 1403), CIV (1548, 1551), FeII (2587, 1608), NI (1199.5, 1200), NV (1239, 1243), SiIII (1207), SiII (1260, 1304)
QSO-5483   2.0274    CIV (1548, 1551), SiI (1562), OI (1302), SiII (1527)
QSO-5483   2.3687    CIV (1548, 1551), SiIV (1394, 1403), OI (1302), SiII (1304)
QSO-5483   2.8169    CIV (1548, 1551), NV (1239, 1243), OVI (1032, 1038), SiII (1190, 1193)
QSO-5483   2.8204    CIV (1548, 1551), NV (1239, 1243), OVI (1032, 1038), SiII (1190, 1193)
QSO-512    2.4421    CIV (1548, 1551), CII (1335)
QSO-512    2.4566    CIV (1548, 1551)


Tab. 5.7 — Continuation from Table 5.6.

Sightline  Redshift  Transition
QSO-93     1.1689    MgII (2796, 2804), FeII (2344, 2374, 2383)
QSO-93     1.8753    CIV (1548, 1551), SiII (1527)
QSO-93     2.0261    CIV (1548, 1551), CI (1657)
QSO-93     2.4187    CIV (1548, 1551), MgII (2796, 2804), CII (1335), OI (1302), SiII (1304, 1527), SiII (1190, 1193, 1207, 1260)
QSO-93     2.4710    CIV (1548, 1551)
QSO-93     2.5981    SiIV (1394, 1403), CIV (1548, 1551), SiII (1260, 1304, 1527), FeII (2344, 2374, 2383, 1261), CII (1335), OI (1302), NII (1084), FeII (1145), SiIII (1207), SiII (1190, 1193)
QSO-146    2.0863    CIV (1548, 1551), MgII (2796, 2804)
QSO-146    2.2139    CIV (1548, 1551), MgII (2796, 2804), SiII (1527), AlII (1671), FeII (1608, 2344, 2587, 2600), FeI (2463), SiII (1260), CII (1335), SiIV (1394, 1403)
QSO-146    2.4175    MgII (2796, 2804), CIV (1548, 1551), SiII (1527), AlII (1671), FeII (1608, 2344, 2374, 2383, 2587, 2600), OI (1302), SiII (1190, 1193, 1260, 1304)
QSO-146    2.7867    CIV (1548, 1551), NV (1239, 1243)
QSO-51     1.8394    CIV (1548, 1551), FeII (1608), FeI (2502)
QSO-51     1.9441    CIV (1548, 1551), MgII (2796, 2804), SiI (1693), SiII (1304), NV (1239, 1243)
QSO-51     1.9470    CIV (1548, 1551)
QSO-51     1.9722    CIV (1548, 1551), SiII (1260)
QSO-51     1.9814    MgII (2796, 2804), CIV (1548, 1551), FeII (2600, 2383), AlII (1671), SiII (1527), SiIV (1394, 1403), NV (1239, 1243), CII (1335), SiIII (1207), SiII (1260)
QSO-51     2.0103    CIV (1548, 1551)
QSO-51     2.1305    CIV (1548, 1551), CII (1335), SiIII (1207), SiII (1190, 1193)
QSO-51     2.3435    CIV (1548, 1551)

Chapter 6

Analysis of the zCOSMOS-deep redshift reliability

The combination of the XSHOOTER Lyα-forest data-set and the zCOSMOS-deep sample can be used to analyse the zCOSMOS-deep redshift accuracy and reliability and any dependence on the respective confidence flags. This is important when defining samples to be used for scientific analysis, where different requirements may allow for greater or lesser degrees of redshift contamination or accuracy. As described in chapter 2, each zCOSMOS-deep redshift has been assigned a flag, reflecting the “quality” of the redshift. Confidence classes 3 and 4 represent very secure redshifts, class 2 are less secure but still probable redshifts, whilst class 1 represents insecure redshifts. Additionally each flag is qualified by a decimal place, either “.1” or “.5”, indicating agreement with independently determined photometric redshifts. “.5” signifies consistency between the photometric and spectroscopic redshifts, “.1” indicates inconsistency. Given the large uncertainties in photometric redshifts and the rather large fraction of catastrophic failures, a “.1” does not necessarily mean the corresponding spectroscopic redshift is incorrect (in fact, there are a number of 3.1s and 4.1s). Nevertheless it may somewhat reduce the confidence in the spectroscopic redshift, in particular for classes 1 and 2.

The XSHOOTER data allow an independent assessment of the redshift quality of each class as follows: assuming that close to 100% of class 3 and 4 redshifts are correct, we can use the comparative strength of the Lyα absorption around galaxies of other classes to determine how many galaxies therein are likely to have incorrect redshifts assigned to them. This is under the assumption that the gas distribution around galaxies is independent of their confidence class. In practice, the distribution of mean optical depths along the line-of-sight (LOS) is determined, including only QSO sightlines up to a certain transverse distance r_p to the galaxies.
The exact value of r_p is a trade-off between including enough galaxies to achieve meaningful statistics and avoiding values so large that there is no longer any excess of Lyα absorption. Unlike the following chapter, our interest in this chapter also lies with the gas directly associated with the zCOSMOS-deep galaxies, meaning that the high column density gas and DLAs should be included in our analysis. When dealing with “saturated lines”, which, in the previous chapter, have been defined as lines with flux levels below the flux error, we assign a lower boundary to the “true” optical depth by setting τ_sat(λ) = e(λ), with e(λ) being the flux error. The resulting HI profile is to be understood as a lower limit to the real distribution. Since however this affects the galaxies of all confidence classes in the same way, it should still be valid to perform our analysis. We also performed the analysis by replacing saturated pixels with an artificially high number (10^3) and using a median when calculating the POD distribution. Those results are consistent with the findings presented here.

[Figure 6.1: six panels of mean optical depth τ (0.2 to 0.6) against line-of-sight distance π [cMpc] (−40 to 40), for r_p = 5, 10, 15, 20, 25 and 30 cMpc, with peak S/N = 1.5, 2.3, 3.1, 2.6, 2.8 and 2.8 respectively]

Fig. 6.1 — The line-of-sight distribution of the mean pixel optical depths around flag 3 & 4 galaxies. There is a clear peak of elevated absorption. We use the highest S/N peak as an expected value for a (close to) 100% rate of correct redshift assignment. The red line is the expected mean absorption calculated from the τ − z relation of our data and is used to background-subtract the peak. As we probe further from the galaxies, the amplitude of the peak decreases as expected.

Tab. 6.1 — Summary of the analysis of the zCOSMOS-deep redshift reliability, comparing the mean POD within r_p < 15 cMpc for galaxies of different confidence classes. Columns are confidence class (class), percentage of correct redshifts (success; for flags 3 & 4, which are defined as 100% correct here, the value given is instead the mean POD at r_p < 15 cMpc), error from the bootstrap analysis and mean number of galaxies contributing (<N>).

Class  success  error  <N>
3 & 4  0.0755   -      404
*2.5   82%      30%    238
*2.1   59%      49%    156
*1.5   0%       19%    475
*1.1   24%      27%    394

Figure 6.1 shows the LOS distribution for confidence classes 3 and 4 including sightlines up to different r_p. First of all one notes that the peak of the absorption is not symmetric around zero (as would be expected) but shifted by ∼ 5−10 cMpc (the “c” here denotes “comoving”), so that it appears as if the absorption were located behind the galaxies (as seen from the observer). For a symmetric appearance the galaxy redshifts would have to be higher than reported. This shift corresponds to ∼ 400−700 km/s at z = 2.3 and most likely arises from the redshift determination of zCOSMOS-deep. There are various effects that may contribute to this systematic shift. First of all, there may be systematic effects arising from the wavelength calibration, as well as from the vacuum-air conversion of the spectra. The XSHOOTER spectra have been vacuum-corrected. The difference between vacuum and air wavelengths causes features used for redshift assignment to appear at a slightly lower wavelength, leading to an underestimate of the redshift. For the Lyα transition at z ∼ 2 this effect is of order ∼ 100 km/s. Furthermore, some of the lines used for the redshift determination may not actually be at the systemic redshift of the galaxy but may be affected by winds (which can easily account for blueshifts of several 100 km/s). As there is a range of features that have been used for the zCOSMOS-deep redshift determination, we cannot quantify this effect any further. The width of the elevated absorption, of order 15−20 cMpc (or ∼ 1000 km/s), is in line with the expected zCOSMOS-deep redshift uncertainty of ±300 km/s (1σ).
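The conversion between a comoving line-of-sight offset and a velocity quoted above can be checked with a few lines of Python. The sketch assumes a flat ΛCDM cosmology with H0 = 70 km/s/Mpc and Ωm = 0.3; these parameter values and the function name are illustrative assumptions and are not taken from the text.

```python
import math

def los_velocity_per_cmpc(z, h0=70.0, omega_m=0.3):
    """Velocity interval (km/s) spanned by one comoving Mpc along the
    line of sight at redshift z, dv = H(z) / (1 + z) * d(chi), for a
    flat LCDM cosmology (parameter values are assumptions)."""
    hz = h0 * math.sqrt(omega_m * (1 + z) ** 3 + (1 - omega_m))
    return hz / (1 + z)

# At z = 2.3 one comoving Mpc corresponds to roughly 70 km/s.
dv = los_velocity_per_cmpc(2.3)
shift_range = (5 * dv, 10 * dv)    # offset of the absorption peak
width_range = (15 * dv, 20 * dv)   # width of the absorption peak
```

With these assumed parameters a 5 to 10 cMpc offset maps to a few hundred km/s, in line with the ∼ 400−700 km/s quoted above, and a width of 15 to 20 cMpc to roughly 1000 km/s or slightly more.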

To compare to other flags, we were interested in the distance r_p which gives the clearest signal of absorption. We compared the signal-to-noise of the peak for flag 3 and 4 galaxies at distances between 5 cMpc and 30 cMpc and found it to be highest at r_p = 15 cMpc. In the remainder of our analysis we will therefore focus on r_p = 15 cMpc. Figure 6.2 shows the LOS distribution of PODs within this r_p = 15 cMpc for galaxies of different confidence classes; the grey shaded area is the peak of absorption as defined from our benchmark of flag 3 & 4 galaxies. This peak at −2 ≲ π ≲ 20 cMpc is quite pronounced; the signal for the lower grade redshifts is naturally less clear. We use this window to calculate the mean POD for each redshift class, after subtraction of the background (red line in Figure 6.1).

[Figure 6.2: five panels of mean optical depth τ (0.3 to 0.5) against line-of-sight distance π [cMpc] (−40 to 40), for flag 3 & 4, flag 2.5, flag 2.1, flag 1.5 and flag 1.1]

Fig. 6.2 — Line-of-sight distribution of PODs around galaxies with different class redshifts. The grey shaded area marks the peak in the secure redshifts; the mean POD values within this peak will be used as a measure for the redshift reliability. The red line shows the background, which will be subtracted before the analysis. We also show the mean within the peak for each flag, with the error (on the mean) computed from a bootstrap analysis.

The background was calculated from the expected mean absorption of the galaxy-QSO pairs contributing to each bin. We assumed that each Lyα pixel corresponds to the mean optical depth expected at this redshift, as computed from a fit to optical depths as a function of Lyα redshift. The average POD of the flag 3 & 4 galaxies is set as a benchmark for the amount of absorption expected if 100% of those redshifts were correct. For each of the confidence classes we then calculate the mean POD in the same manner. The fraction of correct redshifts is defined as the fractional level of this mean POD with respect to the benchmark. We assume that a random redshift would not contribute to the absorption in the defined window and that the correct redshifts on average always contribute the same amount of absorption. We further estimate the errors by bootstrapping the galaxy sample one hundred times and calculating the fraction of correct redshifts from the bootstrapped samples. The results are summarised in Table 6.1. We only considered classes with a sufficient number of galaxies; we make no statements on 9.1, 9.5 or *.4 etc.

Moreover one should note that *2.5 also includes 12.5 and 22.5 (and likewise for the other classes). As expected the *2.5 confidence class turns out to include the highest rate of correct redshifts, with an estimated success rate of about 82%. This number drops to 59% for class *2.1 redshifts, although the error on this measurement is very large due to fewer galaxies in this class. In the case of class *1.5 the mean absorption after background subtraction becomes slightly negative; it is however consistent with zero within the uncertainty. One can therefore assume a success rate of 0 for this class. Class *1.1 finally has an estimated success rate of 24%. In summary, for most applications one can probably regard class *2.5 redshifts as usable, potentially even class *2.1, although there is a considerable uncertainty on that result. Classes *1.1 and *1.5 should be disregarded as the fraction of correct redshifts becomes rather small. We would like to point out that the error bars in this analysis are considerable and that one should treat the quoted numbers more as guidelines. Also, even for the flag with the highest estimated correct fraction (flag 2.5), the POD profile itself is noisy in comparison to the highest confidence profile. This is another reason for caution regarding the statements on reliability.
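The success fraction and its bootstrap error can be sketched as follows. The input here is a hypothetical list with one background-subtracted mean POD value per galaxy inside the absorption window; the actual analysis works on galaxy-pixel pairs, so this layout and the function names are simplifying assumptions made for illustration. Only the benchmark value (0.0755) and the number of resamplings (one hundred) are taken from the text.

```python
import random

def success_fraction(excess_pod, benchmark):
    """Fraction of correct redshifts: the mean background-subtracted
    POD of a class relative to the flag 3 & 4 benchmark."""
    return (sum(excess_pod) / len(excess_pod)) / benchmark

def bootstrap_error(excess_pod, benchmark, n_boot=100, seed=0):
    """Standard deviation of the success fraction over bootstrap
    resamplings (with replacement) of the galaxy sample."""
    rng = random.Random(seed)
    n = len(excess_pod)
    fracs = []
    for _ in range(n_boot):
        sample = [excess_pod[rng.randrange(n)] for _ in range(n)]
        fracs.append(success_fraction(sample, benchmark))
    mean = sum(fracs) / n_boot
    var = sum((f - mean) ** 2 for f in fracs) / (n_boot - 1)
    return var ** 0.5
```

Under the assumptions stated in the text (random redshifts add no excess absorption, correct ones add the same amount on average), a class whose mean excess POD is 0.0619 would have a success fraction of about 0.0619 / 0.0755 ≈ 0.82.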

Chapter 7

Measuring the Lyα-forest-Galaxy cross-correlation

The aim of this chapter is to measure the cross-correlation function of the Lyα-forest and the zCOSMOS galaxies. In particular the focus lies on large scales (≳ 5−10 cMpc) to which the linear regime of structure growth applies. As discussed already in the introduction, previous studies at z ∼ 2 have constructed detailed HI maps in the vicinity of galaxies, and other studies have used the Lyα-forest-galaxy cross-correlation to undertake BAO measurements on ≳ 40 cMpc/h scales. The analysis described here will fill in the intermediate scales, where one may expect to observe the growth of structure through the Kaiser infall. We will focus solely on the cross-correlation analysis, excluding any Lyα-forest or galaxy auto-correlation. The former is only feasible along the line-of-sight as the separation of the QSO pairs limits the transverse scales which are accessible. The latter is extremely challenging due to the complicated selection function of zCOSMOS-deep and beyond the scientific scope of this thesis.

Throughout this chapter we will perform the whole analysis on two sets of galaxies from different zCOSMOS-deep redshift classes: the first sample comprises the highly reliable flag 3 or 4 redshifts (1569 galaxies) and the second sample also includes flags 2.5 in addition to flags 3 or 4 (2839 galaxies). The latter therefore almost doubles the number of galaxies used for the analysis. As we have seen in the previous chapter, among the lower confidence classes the flag 2.5 galaxies are those with the highest fraction of correct redshifts. However we already noted a degradation of signal quality for these flags, and the large error bars on the quoted redshift reliabilities.

This chapter is organised as follows: we first discuss the estimation of the Lyα-forest-galaxy cross-correlation function, describing the optimal statistics, construction of background maps and eventually the correlation function itself.
We then introduce the model for redshift space distortions, which we subsequently compare to the data. We finally conclude the chapter by summarising our findings.

Where applicable and unless stated otherwise, we make use of ΛCDM cosmology with parameters ΩΛ = 0.73, ΩM = 0.27, σ8 = 0.81, H0 = 100h km s⁻¹ Mpc⁻¹ and h = 0.7, consistent with WMAP9 parameters (Bennett et al. 2013). We will be using comoving units of distance throughout this chapter, which will be indicated by a leading “c”, i.e. “Mpc” becomes “cMpc”.

7.1 Measurement of the cross-correlation function

7.1.1 Mean vs Median

In calculating the pixel optical depth (POD) map around the zCOSMOS galaxies it is worth understanding which statistical measure is most appropriate, in particular whether a mean or a median is the better choice. Figure 7.1 shows the overall distribution of pixel optical depths τ, as well as an example for a smaller 4×4 cMpc² bin in the inset panel. As becomes clear from that figure, most of the POD values lie in the Gaussian (“noise”) region of the distribution, so that most of the information is actually contained in the tail. A median is largely insensitive to this tail, which makes the mean the more suitable measure.
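The argument for the mean can be illustrated with a toy distribution. The sketch below is not the thesis data: it assumes a Gaussian noise core of width 0.1 plus an exponential tail of absorbers (both invented), and compares how the mean and the median respond when the fraction of pixels in the tail changes.

```python
import random
import statistics


def simulate_pods(n=10000, absorber_fraction=0.1, seed=0):
    """Toy POD sample: Gaussian noise core plus an exponential tail.
    All distribution parameters are illustrative assumptions."""
    rng = random.Random(seed)
    pods = []
    for _ in range(n):
        noise = rng.gauss(0.0, 0.1)
        # Only a minority of pixels carry real absorption (the tail).
        signal = rng.expovariate(1.0) if rng.random() < absorber_fraction else 0.0
        pods.append(noise + signal)
    return pods


def mean_and_median(pods):
    return statistics.fmean(pods), statistics.median(pods)


low = mean_and_median(simulate_pods(absorber_fraction=0.05, seed=1))
high = mean_and_median(simulate_pods(absorber_fraction=0.15, seed=2))
print("change in mean:  ", high[0] - low[0])
print("change in median:", high[1] - low[1])
```

In this toy setup the mean tracks the change in the tail closely, whereas the median stays near the noise core and moves far less.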

[Figure 7.1: histogram of the number of pixels Nτ against τ; the inset panel, labelled “Example bin”, shows Nτ against τ for a single bin with its mean and median marked.]

Fig. 7.1 — Distribution of the overall POD estimates contributing to the final map. The small inset panel shows an example distribution for just one (random) bin. The information we are interested in is contained in the tail, making the mean the best measure in calculating our maps.

7.1.2 Number counts

Figure 7.2 shows a map of the number of galaxy-sightline pairs contributing to each pixel in our cross-correlation analysis. This map has been constructed using bin-sizes of 1×1 cMpc² (rp × π), with rp denoting the transverse distance and π the line-of-sight distance. It essentially counts the total number of sightlines that pass each zCOSMOS galaxy at a separation (rp, π), which we call “galaxy-sightline pairs”. In the left panel only galaxies with the highest confidence classes are included, i.e. zCOSMOS-flags 3 and 4, whilst for the right hand panel flag 2.5 galaxies were added. The number of galaxy-sightline pairs is different from the actual number of galaxy-pixel pairs for each bin; due to the extension of a bin in π-direction each galaxy-sightline pair contributes several galaxy-pixel pairs.

[Figure 7.2: two panels (flags 3 & 4; flags 2.5, 3 & 4) showing the number of pairs Ngal as a function of rp [cMpc] and π [cMpc].]

Fig. 7.2 — A map of the number counts of galaxy-sightline pairs contributing to each 1×1 cMpc² bin, with only flag 3 and 4 redshifts in the left panel and 2.5, 3 and 4 in the right panel. There is only very limited information at close transverse distances due to the low number of pairs present. The drop in numbers beyond ∼60 cMpc is due to the size of the densely sampled area in zCOSMOS-deep.

Generally speaking the number of pairs at close transverse distance rp is rather small, even below 10 at < 2 cMpc, a consequence of the regions around the QSOs only having normal zCOSMOS-deep sampling. This means that an effect like the finger-of-god, expected on halo-sized scales (i.e. ∼ 0.5 cMpc), will be difficult to detect. As expected, the number of pairs increases with increasing transverse distance rp, starts to drop again beyond ∼ 60 cMpc and reaches almost 0 at 100 cMpc. This is explained by the zCOSMOS-field size, which is 0.6×0.62 deg² for the highly sampled inner region. At the survey redshift of z ∼ 2.3 this corresponds to ∼ 60 cMpc. The full size of the zCOSMOS-deep survey is 0.92×0.91 deg², corresponding to ∼ 90 cMpc, beyond which the number of pairs approaches zero. The few pairs still occupying this space are due to galaxies at higher redshifts than z = 2.3, where the field size corresponds to larger comoving distances.
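The conversion from field size to comoving extent quoted above can be checked with a short calculation. This is a sketch under the flat ΛCDM parameters adopted in this chapter (ΩM = 0.27, ΩΛ = 0.73, h = 0.7); the function names and the choice of a Simpson integration are ours.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m=0.27, omega_l=0.73, h=0.7, steps=1000):
    """Line-of-sight comoving distance in Mpc for flat LCDM,
    integrated with Simpson's rule (steps must be even)."""
    hubble_distance = C_KM_S / (100.0 * h)

    def inv_e(x):
        return 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)

    dz = z / steps
    total = inv_e(0.0) + inv_e(z)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * inv_e(i * dz)
    return hubble_distance * total * dz / 3.0


def transverse_size(angle_deg, z):
    """Comoving transverse extent in cMpc of an angle on the sky
    (in a flat universe the transverse comoving distance equals
    the line-of-sight comoving distance)."""
    return math.radians(angle_deg) * comoving_distance(z)


for angle in (0.6, 0.92):
    print(angle, "deg ->", round(transverse_size(angle, 2.3)), "cMpc")
```

The results should land close to the ∼ 60 cMpc and ∼ 90 cMpc quoted in the text for the inner region and the full field.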


The vertical structure in the number count map arises from two causes: the small-scale variations across rp arise from different sightline-galaxy pairs contributing at a given transverse distance (essentially Poisson noise). Changes or gradients along the line-of-sight (LOS) come mostly from sightlines which do not span the whole range of LOS-distances, due to the combination of galaxy redshift and QSO redshift. A smaller effect also comes from masking the DLAs and metal-lines, which causes gaps in the individual sightlines. Generally however the variations across π are small, as we are basically looking at QSO-sightlines passing through at a given distance rp. In the later analysis we will mostly use bigger bin-sizes, so the respective number counts have to be scaled accordingly.

The information presented in Figure 7.3 is similar but we show scales up to 1000 cMpc in π and with larger bins of 5×5 cMpc². This gives a better idea of the range of scales which are covered along the line-of-sight. The number of pairs decreases beyond ∼ 500 cMpc in each direction. This is expected, as a typical QSO-sightline spans ∼ 900−1000 cMpc.

[Figure 7.3: map of the number of pairs Ngal as a function of rp [cMpc] (0 to 100) and π [cMpc] (−1000 to 1000).]

Fig. 7.3 — A large scale map of the number counts of galaxy-sightline pairs contributing to each 5×5 cMpc² bin, with only flag 3 and 4 redshifts. For presentation purposes the two axes are not on the same scale. Since a QSO-sightline typically spans ∼ 900−1000 cMpc, the number of pairs decreases beyond 500 cMpc.


7.1.3 Construction of the background map

When assessing the POD distribution around zCOSMOS galaxies, we will be interested in the actual excess of gas above the expected mean, or more precisely the overdensity as seen in the estimated optical depths. We construct a random map from our actual sample, similarly to the random samples used in correlation function estimates. In doing so the POD estimates are randomly redistributed across all 8 sightlines, whilst keeping all other parameters like RA, Dec, galaxy redshifts, etc., fixed.

Each redistributed pixel corresponds to a different Lyα-redshift. The mean Lyα-absorption is also redshift dependent, meaning that reassigning the pixels would change the ⟨τ⟩−z relation. This is clearly an undesired effect as it means that we would calculate an inaccurate correction, underestimating the background where the mean redshift within a bin is higher than the survey redshift and overestimating it otherwise. This is avoided by first fitting a mean optical depth vs redshift relation and then using the factor f = ⟨τ⟩(z_orig) / ⟨τ⟩(z_reassigned) as a correction for the reassigned POD. Here ⟨τ⟩(z_orig) and ⟨τ⟩(z_reassigned) are the mean optical depths at the original redshift and at the redshift of the reassigned pixel.

Typical Lyα-absorption lines are wider than just one pixel. Another concern is therefore that reassigning correlated pixels one by one would produce an artificially featureless background map. If we assume a typical Doppler width of ∼ 30 km/s for Lyα-forest lines and convolve it with the XSHOOTER resolution, this corresponds to 6 pixels. In our reassignment procedure we therefore always shift 6 adjacent pixels together. For the background map we then calculate the mean optical depth in a bin at a distance (rp, π) from the galaxy using these reassigned PODs, averaging over 100 realisations of the POD randomisation.
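The reassignment step described above can be sketched as follows. This is an illustration rather than the code used for the thesis: the power-law form and coefficients of `mean_tau` are placeholder assumptions standing in for the fitted ⟨τ⟩−z relation, and we interpret the correction factor as the ratio of the mean optical depth at the redshift of the position being filled to that at the redshift the POD was taken from.

```python
import random


def mean_tau(z, amp=0.0018, slope=3.9):
    """Placeholder for the fitted mean optical depth vs redshift relation.
    The power-law form and both coefficients are assumptions."""
    return amp * (1.0 + z) ** slope


def reassign_pods(tau, z, block=6, seed=None):
    """Shuffle PODs in blocks of adjacent pixels and rescale each value
    so that the mean tau-z relation is preserved."""
    rng = random.Random(seed)
    n_blocks = len(tau) // block
    order = list(range(n_blocks))
    rng.shuffle(order)
    out = list(tau)  # pixels beyond the last full block stay in place
    for dst_block, src_block in enumerate(order):
        for k in range(block):
            dst = dst_block * block + k
            src = src_block * block + k
            # Correction factor: <tau>(z of filled position) / <tau>(z of source pixel)
            out[dst] = tau[src] * mean_tau(z[dst]) / mean_tau(z[src])
    return out


# Toy sightline: 60 pixels between z = 2.0 and z = 2.6, one realisation.
z_pix = [2.0 + 0.01 * i for i in range(60)]
tau_pix = [mean_tau(zi) for zi in z_pix]
randomised = reassign_pods(tau_pix, z_pix, seed=1)
```

A background map would then be built by repeating this for many seeds (100 realisations in the text) and averaging the binned PODs.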
An example of a background map is shown in the left panel of Figure 7.4, calculated in bins of 2 cMpc side-length in both rp and π and for galaxies with flag 3 and 4. The right hand panel shows the corresponding standard deviation. With a sufficient number of realisations, each bin essentially represents the mean absorption of the pixels/galaxies contributing to that bin. Clearly at small transverse distances and again at > 60 cMpc the map is more structured (and has higher standard deviations). This results from a smaller number of galaxies contributing, or equivalently from a comparatively bigger impact of single galaxy-sightline pairs. For example the higher background absorption at −60 cMpc to −100 cMpc is a consequence of a few galaxies at z ∼ 2.6 with the corresponding QSO sightlines ending at π ∼ −60 cMpc and therefore not contributing further at higher LOS distances. As already discussed in the previous section on number counts, the features in the background map are entirely explained by the varying contributions of different galaxy-sightline pairs.

7.1.4 The impact of sampling

To get an indication of the uncertainties associated with our measurements, we perform two resampling analyses. First we compute the standard deviation of the correlation function (as described in more detail in the next section) from a jackknife resampling of the QSO sightlines to test for the impact of individual sightlines. Secondly we estimate the standard deviation from 100 realisations of a bootstrap on the galaxy sample.
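Both resampling schemes can be sketched for a single bin. In this illustration each group holds the PODs contributed by one sightline (jackknife) or one galaxy (bootstrap); the function names are ours, and the (n − 1)/n normalisation is the standard jackknife convention, which we assume here since the text does not specify it.

```python
import random
import statistics


def jackknife_std(groups):
    """Leave-one-group-out jackknife error of the pooled mean."""
    n = len(groups)
    estimates = []
    for skip in range(n):
        pooled = [v for i, g in enumerate(groups) if i != skip for v in g]
        estimates.append(statistics.fmean(pooled))
    centre = statistics.fmean(estimates)
    return ((n - 1) / n * sum((e - centre) ** 2 for e in estimates)) ** 0.5


def bootstrap_std(groups, n_real=100, seed=0):
    """Standard deviation of the pooled mean over bootstrap realisations,
    resampling whole groups with replacement."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_real):
        sample = rng.choices(groups, k=len(groups))
        means.append(statistics.fmean(v for g in sample for v in g))
    return statistics.pstdev(means)


# Toy bin with PODs grouped by contributing object (invented values).
groups = [[0.1, 0.2], [0.9], [0.15, 0.05, 0.3]]
print(jackknife_std(groups), bootstrap_std(groups))
```

For the maps in this section the same calculation would be repeated independently in every (rp, π) bin.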

[Figure 7.4: two panels showing the mean random map ⟨τ⟩rand (left) and its standard deviation std(τ) (right) as a function of rp [cMpc] and π [cMpc].]

Fig. 7.4 — Left panel: Mean random map calculated from 100 realisations of randomly reassigning the PODs across all sightlines and then evaluating the mean POD in 2×2 cMpc² bins. Right panel: Same as left, but the standard deviation. Generally any structure in the map can be explained by variations of galaxy-sightline pairs contributing, for more details see text. Here only flag 3 and 4 galaxies are included.

Both measurements are done in 2 × 2 cMpc² bins and are shown in Figure 7.5 for flag 3 and 4 galaxies. It is immediately clear that the effect of galaxy sampling is generally more important than the inclusion or lack of a QSO-sightline.

The transverse differences in both maps are explained by the increase of galaxy-sightline pairs with increasing distance. The more structured vertical variations can be understood by a combination of two factors. First, since the distances are kept fixed, the omission of a sightline or a galaxy will act on the same bin as it would when we calculate the actual correlation function. Second, as already discussed, the exact values for the mean optical depth in a bin are dominated by the tail of the distribution of optical depths. Therefore, the omission of a galaxy-sightline pair from a bin that is associated with such a high value will lead to a considerable change in the mean optical depth of that bin, in particular at small rp where the number of contributing galaxy-sightline pairs is small. The uncertainties shown here are considerable; this will however be alleviated later on when combining bins at corresponding positive and negative π, as well as by using larger bin-sizes or smoothing.

[Figure 7.5: two panels (Jackknife QSOs; Bootstrap Galaxies) showing the standard deviation as a function of rp [cMpc] and π [cMpc].]

Fig. 7.5 — Left panel: We show the standard deviations computed from a jackknife resampling of the QSO sightlines. Right panel: Result for the standard deviation of 100 realisations of a bootstrap-resampling of the galaxies. Both panels only include galaxies with flag 3 and 4 and have been computed in 2 × 2 cMpc² bins. The transverse structure is owed to the change in number of pairs contributing at a given distance. The line-of-sight structure, more subtle, arises when only few QSO-galaxy pairs contribute to a bin; the exclusion of either a QSO or a galaxy can lead to rather substantial changes in the estimate for that bin. For more details see text.

7.1.5 Cross-correlation function

The cross-correlation function ξ(rp, π) is estimated as follows:

$$\xi(r_p, \pi) = \frac{\tau(r_p, \pi) - \langle \tau(r_p, \pi) \rangle}{\langle \tau(r_p, \pi) \rangle},$$

where τ(rp, π) is the measured pixel optical depth at a distance rp and π from the galaxy, and ⟨τ(rp, π)⟩ is the expectation from a random field at this distance.
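The estimator above can be sketched on binned data as follows. This is a minimal illustration, not the thesis pipeline: the input format (tuples of rp, π and POD per galaxy-pixel pair), the function names and the square binning by integer index are assumptions made for the example.

```python
import math
from collections import defaultdict


def binned_mean(samples, bin_size=2.0):
    """Mean POD per (rp, pi) bin; samples are (rp, pi, tau) tuples in cMpc."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for rp, pi, tau in samples:
        key = (math.floor(rp / bin_size), math.floor(pi / bin_size))
        sums[key] += tau
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


def xi_map(data_samples, random_samples, bin_size=2.0):
    """xi = (tau - <tau>) / <tau> per bin, with <tau> from randomised PODs."""
    tau = binned_mean(data_samples, bin_size)
    background = binned_mean(random_samples, bin_size)
    return {
        key: (tau[key] - background[key]) / background[key]
        for key in tau
        if key in background and background[key] != 0.0
    }


# Invented toy input: two measured and two randomised pixels in one bin.
data = [(1.0, 1.0, 0.3), (1.5, 0.5, 0.5)]
randoms = [(1.0, 1.0, 0.2), (1.2, 0.1, 0.2)]
print(xi_map(data, randoms))
```

Bins without any randomised pixels are skipped rather than assigned a value.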

In practice we calculate τ(rp, π) by averaging τ from all pixels of all galaxy-sightline pairs in a bin centered on (rp, π). The ⟨τ(rp, π)⟩ is calculated from the same procedure but with randomised PODs, as described in Section 7.1.3.

[Figure 7.6: two panels (flag 3 and 4; flag 2.5, 3 and 4; smoothing radius 8 cMpc) showing ξ(rp, π) as a function of rp [cMpc] and π [cMpc].]

Fig. 7.6 — ξ(rp, π) for galaxies with flags 3 and 4 in the left panel and galaxies with flags 2.5, 3 and 4 in the right panel. As expected we see a peak at π ∼ 0, however shifted with respect to the zeropoint by ∼ 6 cMpc, as indicated by the dashed line. This shift is likely due to systematics in the zCOSMOS-deep calibration. The apparent deficit at π ∼ +80 cMpc is discussed in the text.

In Figure 7.6 we show the cross-correlation function for flag 3 and 4 galaxies using a bin-size of 2×2 cMpc² and then smoothing with a circular 8 cMpc radius kernel. There is a clear peak in ξ(rp, π) at π ∼ 0. Given an isotropic Universe, ξ(rp, π) should be symmetrical with respect to π, which, as can be seen immediately from that peak, it is not. As already discussed extensively in Chapter 6, this offset is likely to come from calibration systematics of the zCOSMOS-deep survey. The offset (indicated by the dashed line) is of order 6 cMpc and independent of the radius of the smoothing kernel we used.

We observe a deficit in Lyα-absorption at π ∼ +80 cMpc and (prior to smoothing) rp ≲ 4 cMpc. The fact that it is restricted to small transverse distances excludes errors in the continuum-fit as an explanation. We have already required the usable part of the Lyα-forest to lie at 5000 km/s from the Lyα-emission, which should avoid any region that may be affected by the QSO-radiation field, which itself may lead to a deficit of Lyα-absorption. We however also tested that stipulating an even larger separation of 10 000 km/s does not account for the detected feature. The confinement in transverse distance may suggest a physical origin for the observed deficit. A possible emission line (or suite of lines) may account for a deficit in absorption, just as the presence of strong metal (absorption) lines in the Lyα-forest can cause a second peak of high optical depths somewhere along the line-of-sight. The 80 cMpc distance from the Lyα-absorption peak translates to ∆λ ∼ 70 Å from the Lyα-line, with no strong emission line existing there. Most likely this fluctuation is just “statistical”; our further analysis will however remain unaffected by it as we will mostly be interested in smaller line-of-sight distances.

To increase the S/N of our correlation function, we will now assume that it actually should be symmetric under sign exchange for π and re-calculate ξ(rp, π) for absolute values of π. We applied a shift of 6 cMpc to all values of π prior to computation to achieve this symmetry. This essentially yields ξ(rp, π) for only positive values of rp and π. For presentation reasons we then replicate this quadrant to cover the entire (rp, π)-plane and all figures shown from here onward will have this replication included.

Figure 7.7 shows the resulting ξ(rp, π), for flag 3 and 4 galaxies in the top panel and flag 2.5, 3 and 4 galaxies in the bottom panel. The correlation function has been computed in bin-sizes of 2 × 2 cMpc² and subsequently been smoothed with a kernel of 6 cMpc radius. First one notes that the amplitude when including flag 2.5 is lower than for only the highest confidence classes. This may be a consequence of different effects. First, a higher percentage of flag 2.5 redshifts is likely to be incorrect than is the case for flag 3 and 4 redshifts. Secondly, the spectra for flag 2.5 redshifts most likely are systematically noisier and with less clear features than spectra with flag 3 or 4. This may result in less accurately measured redshifts and hence a further decrease in amplitude. Both of these effects may also result in the scattering of optical depth values into different bins, which then also changes the measured correlation function beyond a pure amplitude effect. Furthermore, it is possible that flag 2.5 galaxies are on average fainter and maybe less massive than flag 3 or 4 galaxies. In this case they would be less biased, which would also decrease the amplitude of the correlation function.

The most prominent feature in the cross-correlation maps from Figure 7.7 is the compression of the absorption peak in line-of-sight direction in comparison to the transverse direction. This extends out to about 25 cMpc and could be a signature of Kaiser infall, as has already been observed in many galaxy-galaxy correlation studies. We do not, however, observe a finger-of-god effect, which is unsurprising given the low number of galaxies at close transverse distance. Whether the observed effect is indeed explainable by Kaiser infall will be discussed in the following section.

[Figure 7.7: two panels (flags 3 and 4; flags 2.5, 3 and 4) showing ξ(rp, π) as a function of rp [cMpc] and π [cMpc].]

Fig. 7.7 — ξ(rp, π) for both flag 3 and 4 galaxies (top panel) and flag 2.5, 3 and 4 galaxies (bottom panel). The maps are quite similar to each other, but including the lower confidence 2.5 redshifts results in a lower amplitude and some differences in the shape of ξ(rp, π).

Finally we calculate an approximation of an S/N map for the ξ(rp, π) map. To estimate the noise, we perform a bootstrap analysis by randomly selecting galaxies to be included in the sample and processing this in exactly the same way as has been done for the data. In total we compute one hundred realisations of the bootstrap. The S/N approximation is then calculated by dividing the ξ(rp, π)-map by the standard deviation of this bootstrap. Figure 7.8 shows the resulting S/N map for the flag 3 and 4 sample in the left panel and including flag 2.5 galaxies in the right panel, both computed for 2 × 2 cMpc² bins. The S/N will be higher when we bin more heavily, as will be done in the subsequent analysis.

[Figure 7.8: two panels (flag 3 and 4; flag 2.5, 3 and 4) showing S/N as a function of rp [cMpc] and π [cMpc].]

Fig. 7.8 — Approximation for a S/N map for only flag 3 and 4 galaxies in the left panel and also including flags 2.5 in the right panel. The S/N in each 2×2 cMpc² bin has been computed by dividing ξ(rp, π) by the standard deviation from a hundred bootstrap realisations on the galaxy sample.

7.2 Redshift space distortion model

7.2.1 Construction of the model

As already detailed in the introduction, the effects of peculiar velocity (due to large scale infall or random motions) cause the distance measurements in redshift space to be distorted with respect to real space. We wish to quantify this effect in order to compare it to our cross-correlation function; the goal is to establish whether the distortion we observe is explainable by large scale infall of matter under ΛCDM cosmology. This is in a sense “forward modelling”: we assume a cosmology and reasonable infall parameters and compare to the data, as opposed to fitting a model and inferring the relevant parameters from it. The quality of our data however makes it very challenging if not impossible to perform a well-constrained fit.

In the limit of large scales, linear theory is applicable and can be used to predict the exact form of the redshift space distortion of any tracer population. In our case, we are dealing with two distinct tracers, the galaxies and the Lyα-forest. In Fourier space the amplitude of the power spectrum will be changed by a factor of b(1 + βµk²) for each tracer, where b is the respective bias, β the redshift space distortion parameter and µk the cosine of the angle between line of sight and Fourier mode (Kaiser 1987). The power spectrum therefore reads as follows:

$$P(k, z) = b_{\rm gal}\,(1 + \beta_{\rm gal}\,\mu_k^2) \cdot b_{{\rm Ly}\alpha}\,(1 + \beta_{{\rm Ly}\alpha}\,\mu_k^2) \cdot P_{\rm lin}(k, z),$$

where Plin is the linear (dark) matter power spectrum, bgal and bLyα the biases of the galaxy and Lyα-forest samples, and βgal and βLyα the respective distortion parameters. Hamilton (1992) derived the cross-correlation function for this power spectrum to be:

$$\xi(r_p, \pi) = \xi_0(s)\,P_0(\mu) + \xi_2(s)\,P_2(\mu) + \xi_4(s)\,P_4(\mu),$$

with µ = π/s and s = √(rp² + π²) being the radial distance. µ is therefore the cosine of the angle between s and π. The Pl denote the Legendre polynomials of order l, i.e.

$$P_0 = 1, \quad P_2 = \tfrac{1}{2}\,(3\mu^2 - 1), \quad P_4 = \tfrac{1}{8}\,(35\mu^4 - 30\mu^2 + 3).$$

The functions ξl are, under the assumption that the DM correlation function takes a power-law form ζ(r) = (r/r0)^−γ (Hamilton 1992, Hawkins et al. 2002, Font-Ribera et al. 2012):

$$\xi_0(s) = b_{\rm gal}\, b_{{\rm Ly}\alpha} \left[ 1 + \tfrac{1}{3}(\beta_{\rm gal} + \beta_{{\rm Ly}\alpha}) + \tfrac{1}{5}\,\beta_{\rm gal}\,\beta_{{\rm Ly}\alpha} \right] \zeta(s),$$

$$\xi_2(s) = b_{\rm gal}\, b_{{\rm Ly}\alpha} \left[ \tfrac{2}{3}(\beta_{\rm gal} + \beta_{{\rm Ly}\alpha}) + \tfrac{4}{7}\,\beta_{\rm gal}\,\beta_{{\rm Ly}\alpha} \right] \left[ \frac{\gamma}{\gamma - 3} \right] \zeta(s),$$

$$\xi_4(s) = \tfrac{8}{35}\, b_{\rm gal}\, b_{{\rm Ly}\alpha}\, \beta_{\rm gal}\, \beta_{{\rm Ly}\alpha} \left[ \frac{\gamma\,(2 + \gamma)}{(3 - \gamma)(5 - \gamma)} \right] \zeta(s).$$

This form accounts for the large scale Kaiser infall, but does not take into account the effect of random motion. Convolving the model with the distribution function

$$f(v) = \frac{1}{a\sqrt{2}} \exp\left( -\frac{\sqrt{2}\,|v|}{a} \right),$$

with a as the random pairwise velocity, results in the final model:

$$\xi(r_p, \pi) = \int_{-\infty}^{\infty} \xi'(r_p, \pi - v/H_0)\, f(v)\, dv,$$

where ξ′ denotes the infall-only model given above.
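The infall-only part of the model can be evaluated directly from the multipole expressions above. The following sketch uses a power law ζ(s) = (s/r0)^−γ and omits the convolution with f(v), i.e. it corresponds to a = 0; the function names are ours, and the default parameter values follow those adopted in this chapter, with rp, π and r0 in the same units (cMpc/h).

```python
import math


def legendre_024(mu):
    """Legendre polynomials P0, P2 and P4 evaluated at mu."""
    p0 = 1.0
    p2 = 0.5 * (3.0 * mu ** 2 - 1.0)
    p4 = 0.125 * (35.0 * mu ** 4 - 30.0 * mu ** 2 + 3.0)
    return p0, p2, p4


def xi_kaiser(rp, pi, b_gal=2.0, b_lya=1.0, beta_gal=0.49, beta_lya=1.0,
              r0=1.4, gamma=1.4):
    """Linear-theory cross-correlation with Kaiser infall only.
    Not defined at s = 0."""
    s = math.hypot(rp, pi)
    mu = pi / s
    zeta = (s / r0) ** (-gamma)
    bb = b_gal * b_lya
    beta_sum = beta_gal + beta_lya
    beta_prod = beta_gal * beta_lya
    xi0 = bb * (1.0 + beta_sum / 3.0 + beta_prod / 5.0) * zeta
    xi2 = bb * (2.0 * beta_sum / 3.0 + 4.0 * beta_prod / 7.0) \
        * (gamma / (gamma - 3.0)) * zeta
    xi4 = (8.0 / 35.0) * bb * beta_prod \
        * (gamma * (2.0 + gamma) / ((3.0 - gamma) * (5.0 - gamma))) * zeta
    p0, p2, p4 = legendre_024(mu)
    return xi0 * p0 + xi2 * p2 + xi4 * p4


# Same separation, purely transverse versus purely along the line of sight.
print(xi_kaiser(10.0, 0.0), xi_kaiser(0.0, 10.0))
```

At fixed s the model is lower along the line of sight than in the transverse direction, which is the compression discussed in the text; setting both β to zero makes ξ independent of µ.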

This model, paired with the assumptions discussed in the next paragraph, will be used as a comparison point to our data. To assess the agreement we will be using a model-data comparison at fixed radius. Figure 7.9 illustrates this by showing both the 2D model (as already in the introduction) and the related curve at fixed radius s. We define this curve as ξ(µ). Plotted on the x-axis is µ, which ranges from −1 to +1. These poles represent the pure LOS direction, whilst µ = 0 is purely transverse. In the absence of redshift space distortions ξ(µ) is constant, whilst β > 0 results in a parabola-shaped curve. Random motion finally causes an upturn near the poles. We arrange the x-axis in Figure 7.11 such that we show µ decreasing from 0 to −1 in the middle and then starting again at +1 and decreasing to 0. The central value “1” should therefore be read as ±1. Since the effect of redshift space distortion is apparent along the line-of-sight whilst the transverse ξ(rp, π) remains unaffected, this arrangement focuses the plots where the effect is observed. For both the model and later the data the curve will be symmetric, by construction for the model, and due to the replication of the quadrants for the data.

[Figure 7.9: the two top panels show the model ξ(rp, π) as a function of rp [cMpc/h] and π [cMpc/h]; the bottom panel shows ξ against µ.]

Fig. 7.9 — We show the redshift space distortion model in the two top panels, assuming a correlation length r0 = 5 cMpc/h and γ = 1.8, with β = 0 on the left hand side and β = 0.5 and a = 300 km/s on the right hand side. The red circles show the fixed radius s used for the curves in the bottom panel, where ξ is plotted at this radius and as a function of µ = π/s. The two curves in the bottom panel show the model for no redshift space distortion (black) and with distortion (blue).

7.2.2 Basic assumptions

A variety of parameters and assumptions enter the model as presented in the previous section; we list them here and provide a brief discussion of each:

• First and foremost, we assume that linear theory applies, which is only true on large scales and fails increasingly when entering the vicinity of haloes or the halo itself. This is justified by the fact that the effect we observe is indeed at ≳ 5−10 cMpc.

• The equations presented above assume that the DM correlation function takes the form of a power law. This may not be entirely true, but serves as a good approximation.

• We take the values for the correlation length, r0 = 1.4 cMpc/h, and the exponent, γ = 1.4, from Moustakas & Somerville (2002) at a redshift of z ∼ 2.3. The DM correlation function therein has been measured from the GIF/VIRGO DM simulation using WMAP1 parameters (Jenkins et al. 1998). In particular σ8 was estimated too high, with a value of 0.9 as compared to the current value of 0.81. This mostly influences r0, which can be expected to be slightly lower than quoted.

• We assume the bias for zCOSMOS-deep type galaxies to be of order 2. This is consistent with BzK selected galaxies at this redshift, as well as with Lyman-break-galaxies or Lyα-emitting galaxies at similar redshifts (see Guaita et al. (2010) for a compilation of such data). Lyman-break-galaxies or Lyα-emitting galaxies could be a good proxy for zCOSMOS-type galaxies, as the majority of the zCOSMOS objects most likely had their redshifts assigned under consideration of the Lyα-transition.

• The redshift space distortion parameter of the galaxy population is calculated by assuming ΩM(z = 0) = 0.27. The relation between ΩM at different redshifts is given as: ΩM(z) = ΩM(z=0)(1+z)³ / [1 − ΩM(z=0) + ΩM(z=0)(1+z)³]. At a survey redshift of z ∼ 2.3, this translates to ΩM = 0.93. We then use the relation βgal = ΩM^0.6 / bgal = 0.49.

• For the Lyα-forest a bias of unity is assumed, following the idea that low column density gas constitutes an essentially bias-free tracer of DM.

• The redshift space distortion parameter of the Lyα-forest has been shown not to follow the simple relationship to ΩM and the bias, but to be additionally complicated by an extra “bias” factor. This factor should compensate for the non-linear transformation between the observable (flux) and the actual gas density. The values quoted for βLyα lie at 0.5−1.5 (Slosar et al. 2011, McDonald 2003). For our purposes we will assume βLyα = 1.

• Last we assume the effects of random motion (described by a) to be negligible. First, we are only looking at relatively low column density gas due to the cut on the optical depths. The gas associated with haloes however typically possesses higher column densities and would hence be excluded from our study. Secondly, the finger-of-god originates in virialised motion within haloes. Whilst galaxies or gravitationally bound gas clouds can be considered as point-like sources in that respect, this is not true for the more ubiquitous lower density HI gas.

[Figure 7.10: three rows of panels (Distortion parameters; Bias factors; DM correlation function parameters) showing ξ against µ for the basic model (βLyα = 1, a = 0, bLyα = 1, bgal = 2, r0 = 1.4 Mpc/h, γ = 1.4) and for variations of βLyα (±50%), a (300 and 600 km/s), bLyα and bgal (±50%), and r0 and γ (±10%).]

Fig. 7.10 — We show the influence of deviations from the assumed parameters on the redshift space distortion model and therefore ξ(µ). By definition the effect of variation in the bias or the DM correlation function is a change in amplitude, whilst the effects of β and random motions are more complex. Top left: β for the Lyα-forest. Top right: random pairwise velocities. Middle left: the bias for the Lyα-forest. Middle right: bias for the zCOSMOS-deep galaxies. Bottom left: the correlation length of the DM correlation function. Bottom right: the exponent in the DM correlation function.

Figure 7.10 shows ξ(µ) for the basic model with the parameters as described above (solid line), as well as for variation of those parameters (in dashed and dotted lines), deviating either 10% or 50% from the assumed value, depending on how uncertain this value is.
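The galaxy distortion parameter listed among the assumptions follows from two short expressions, which the sketch below evaluates for a flat universe with ΩM(z = 0) = 0.27 and bgal = 2. The function names are ours; the exponent 0.6 is the one used in the text.

```python
def omega_m_at(z, omega_m0=0.27):
    """Matter density parameter at redshift z for a flat LCDM universe."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (1.0 - omega_m0 + matter)


def beta_from_bias(z, bias, omega_m0=0.27, exponent=0.6):
    """Linear redshift space distortion parameter, Omega_M(z)^0.6 / b."""
    return omega_m_at(z, omega_m0) ** exponent / bias


print(omega_m_at(2.3))           # close to the 0.93 quoted in the text
print(beta_from_bias(2.3, 2.0))  # just under 0.5, in line with the quoted value
```

Because ΩM is already close to unity at z ∼ 2.3, βgal is governed almost entirely by the assumed bias.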

7.3 Comparing ξ(rp, π) to the redshift space distortion model

Having calculated the expected redshift space distortion effects with the model from the previous section, we will now compare it to the measured cross-correlation function to infer whether it can account for the observed compression in ξ(rp, π).

Before performing the comparison, we revisit the measurement of the correlation function: when measuring the ξ−µ relation from the data, we are using a circle of fixed radius in rp−π space. Our correlation function has however been calculated in square bins in rp and π. In the limit of very small bins, this does not impact the shape of ξ(µ) too much. However, with rather noisy data at hand we have to bin heavily (of order 5-10 cMpc) to increase the S/N and detect any effect. Such large bin-sizes deviate so much from circular geometry that they may introduce artificial effects in ξ(µ). For the purpose of comparison we will therefore calculate the correlation function in polar coordinates (s, θ) instead, with $s = \sqrt{r_p^2 + \pi^2}$ and θ = π/s. The data curve we aim for is then just the correlation function at fixed radius s but varying angle θ. We use bins of 6 cMpc in radius and 15 degrees in θ, but checked that varying the bin-size does not change our result significantly.

[Figure 7.11: two panels (top: flags 3 and 4; bottom: flags 2.5, 3 and 4), each showing ξ as a function of µ at r = 9, 15, 21 and 27 cMpc.]

Fig. 7.11 — We show the measured correlation function at fixed radii (solid line) and the corresponding model (dashed line). The radii are at 9 cMpc (black), 15 cMpc (blue), 21 cMpc (green) and 27 cMpc (red). The top panel shows only flag 3 and 4, whilst the bottom panel also includes flags 2.5.

The comparison between the measured correlation function and the model is shown in Figure 7.11, at radii of 9, 15, 21 and 27 cMpc, with the data shown as solid curves and the model as dashed lines. The errorbars have been calculated from bootstrapping (as described for the S/N map from the previous section). The model seems to agree fairly well with our data, both for the analysis for flag 3 and 4 galaxies, as well as when including flag 2.5 objects. We performed a χ²-test on the combined set of the four radii: for flag 3 and 4, we calculate a reduced χ² of $\chi^2_{\rm red} = 0.491$ (with a p-value of p = 0.98) and when including flag 2.5 we have $\chi^2_{\rm red} = 0.93$ (p = 0.57). We therefore find no evidence of inconsistency of the observed effect with our model; the observed compression can be accounted for by large-scale infall. If we assumed the absence of Kaiser infall, the respective $\chi^2_{\rm red}$ values become $\chi^2_{\rm red} = 1.01$ (p = 0.45) using flag 3 and 4 galaxies, and $\chi^2_{\rm red} = 1.30$ (p = 0.14) when also including flag 2.5. The assumption of no large scale infall thus presents a poorer description of our data than our model describing large scale infall.
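
The polar binning and the reduced χ² used in this comparison can be sketched as follows. This is a minimal illustration, not the analysis code. It assumes that θ is the angle to the line of sight, so that µ = cos θ = π/s, that the sign of π is kept (θ between 0 and 180 degrees), and that the model has no fitted parameters, so the number of degrees of freedom equals the number of data points. All function names are hypothetical.

```python
import math


def to_polar(rp, pi_los):
    """Convert transverse (rp) and line-of-sight (pi_los) separations
    into the radius s and mu = pi_los / s."""
    s = math.hypot(rp, pi_los)
    mu = pi_los / s if s > 0.0 else 0.0
    return s, mu


def polar_bin(rp, pi_los, ds=6.0, dtheta_deg=15.0):
    """Return the (radial, angular) bin indices for bins of ds in s and
    dtheta_deg in theta, with theta assumed to be measured from the line of sight."""
    s, mu = to_polar(rp, pi_los)
    theta = math.degrees(math.acos(max(-1.0, min(1.0, mu))))
    n_theta = int(round(180.0 / dtheta_deg))
    i_s = int(s // ds)
    # theta = 180 degrees falls on the upper edge and is put into the last bin.
    i_theta = min(int(theta // dtheta_deg), n_theta - 1)
    return i_s, i_theta


def reduced_chi2(data, model, sigma, n_fitted=0):
    """Reduced chi-squared of data against a model with errors sigma."""
    chi2 = sum(((d - m) / e) ** 2 for d, m, e in zip(data, model, sigma))
    return chi2 / (len(data) - n_fitted)
```

A p-value would follow from the survival function of the χ² distribution with the same number of degrees of freedom, for instance via `scipy.stats.chi2.sf`, which is left out here to keep the sketch free of external dependencies.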

7.4 Summary and conclusions

We have analysed the Lyα-forest of 8 QSOs observed with the XSHOOTER and UVES instruments. The targets have been selected to lie within the zCOSMOS-deep field and at 2.5 ≲ zQSO ≲ 2.8. This redshift range permits Lyα-forest studies at 2.0 ≲ z ≲ 2.8. The zCOSMOS-deep survey, on the other hand, observed most of its galaxies in this same redshift range, and the combination of the two data-sets therefore allows us to study the connection between HI as probed by the Lyα-forest and the zCOSMOS-deep galaxies.

We devised tailor-made algorithms to extract the QSO spectra from the XSHOOTER data as well as to fit the QSO continuum. We employed the pixel-optical-depth method to quantify the Lyα-absorption. For our analysis we excluded any saturated lines corresponding to high-column density systems as well as DLA systems.

This data-set was used to estimate the Lyα-forest-Galaxy cross-correlation function by computing the mean overdensity in the pixel-optical-depths at a distance (rp, π) from the zCOSMOS-galaxies. We detect a compression along the line-of-sight, as has been predicted for the large scale infall of matter. We constructed a model for the expected redshift space distortion by assuming the dark matter correlation function to be of the form $\zeta(r) = (r/r_0)^{\gamma}$, with r0 = 1.4 Mpc/h and γ = 1.4 at z ∼ 2.3. Further we assumed suitable values for the bias of the galaxies and the Lyα-forest as well as the respective infall parameters. The model seems to agree fairly well with our estimate of ξ(rp, π). The inferred χ² indicates no evidence of inconsistency with our data; the observed compression in ξ(rp, π) can be explained by large scale infall.

Part IV

Summary and Outlook

Chapter 8

Summary and Conclusions

8.1 Proto-groups and proto-clusters

We have conducted a study on spectroscopically identified overdensities in the zCOSMOS field, with the aim of moving from isolated examples of proto-clusters to a more statistically significant sample and consequently better understanding the nature of those structures. To that end we applied a friends-of-friends group finder to the zCOSMOS-deep redshift sample of 3502 galaxies at 1.8 < z < 3. The parameters of this group-finder have been carefully chosen to achieve a balance between the number of candidate groups detected and their contamination by interlopers. The linking lengths ∆r = 500 kpc (physical) and ∆v = 700 km/s turned out to satisfy our requirements, resulting in the identification of 42 candidate groups with three or more members.

We introduced the following terminology to more accurately describe the detected systems: an overdensity for which all of the detected members occupy the same dark matter halo is called a “real group”, whilst if it is contaminated by interlopers it is called a “partial group”. In the same spirit we define a “proto-group” as a system whose members will form a real group by z = 0, but are still occupying different DM haloes at the time of detection. A “partial proto-group” is then a system which will evolve into a partial group by z = 0, so again it is contaminated by interlopers.

Observationally we have limited information about the underlying DM distribution. We have therefore constructed an analogous sample of candidate groups from mock catalogues. These mock observations have been derived by Kitzbichler et al. 2007 from the Millennium simulation and we carefully matched them to have zCOSMOS-deep sample properties. The resulting mock candidate group sample has consistent number densities, and we used it to assess the nature of our observed sample.
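
The friends-of-friends linking summarised at the start of this section can be illustrated with a minimal sketch. It is not the group-finder used in this thesis. It assumes that the galaxy positions have already been converted to transverse physical coordinates in kpc and to line-of-sight velocities in km/s, and it links two galaxies when both separations are below the linking lengths; the function name and the input layout are hypothetical.

```python
import math


def friends_of_friends(galaxies, dr_kpc=500.0, dv_kms=700.0, min_members=3):
    """Group galaxies given as (x_kpc, y_kpc, v_kms) tuples.

    Two galaxies are friends if their transverse separation is at most dr_kpc
    and their velocity difference is at most dv_kms. Groups are the connected
    sets of friends; only groups with at least min_members are returned,
    each as a list of indices into the input."""
    n = len(galaxies)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        xi, yi, vi = galaxies[i]
        for j in range(i + 1, n):
            xj, yj, vj = galaxies[j]
            if math.hypot(xi - xj, yi - yj) <= dr_kpc and abs(vi - vj) <= dv_kms:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(g for g in groups.values() if len(g) >= min_members)


# Hypothetical example: galaxies 0 and 2 are not direct friends but are linked via galaxy 1.
sample = [(0.0, 0.0, 0.0), (400.0, 0.0, 300.0), (800.0, 0.0, 600.0),
          (5000.0, 5000.0, 0.0), (0.0, 100.0, 5000.0)]
triplets = friends_of_friends(sample)
```

The pairwise loop scales quadratically with the number of galaxies, which is unproblematic for a sample of a few thousand objects.
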
We find that at the epoch of detection, z ∼ 2, less than 0.2% of the simulated candidate groups are already real groups, with all their member galaxies occupying the same DM halo. About 8% of them however constitute partial groups, i.e. are contaminated by an interloper or a galaxy that will only accrete onto the common halo at a later stage. Furthermore we studied the evolution of those candidate groups over cosmic time and find that by z = 0, 50% of them will have turned into a real group and another 43% will become a partial group. Only 7% are therefore purely random associations, meaning that at z ∼ 2 the majority of systems we detect are actually proto-groups. The mocks furthermore indicate that the decisive parameter in distinguishing between proto-groups and partial proto-groups or chance associations is the velocity spread vrms. For vrms ≲ 300 km/s more than 50% of the detected systems will be pure proto-groups, which will accrete all members onto the same z = 0 halo. This fraction is mostly independent of the radial size of the candidate group.

As discussed before, most of the candidate groups are not real groups at the epoch of detection. However, when analysing their assembly history, it turns out that they are actually observed during the assembly process. By z = 1.8 (which corresponds to the lower limit of our observations) already a quarter of the candidate groups observed at higher redshift will be partial groups (as opposed to 8% at the time of detection). Most of the candidate groups will have started assembly within ∆a ≲ 0.1. A rough estimate of the overdensities of our candidate groups is also consistent with the idea that they have started the assembly process and will soon accrete their members onto a common halo.

We also addressed the question of what becomes of our candidate groups, i.e. what type of group or cluster they will evolve into. In doing so, we looked at the halo masses of the z = 0 descendants and find that whilst most of our proto-groups evolve into ∼ $10^{13}\,M_\odot/h$ haloes, we also detect about a third of today's most massive clusters with $10^{14}-10^{15}\,M_\odot/h$ haloes. This number is highly impacted by the spectroscopic completeness of zCOSMOS-deep: with a sampling-rate of 100% we would be able to detect even ∼ 65% of today's massive clusters at z ∼ 2.
Turning back to observations, we searched for any possible early onset of environmental differentiation by using the larger and more complete COSMOS photometric sample. This sample, despite the large uncertainties in the photometric redshifts, shows a 40% excess in galaxies with stellar masses > $10^{10}\,M_\odot$ at the position of the spectroscopically identified candidate groups, confirming again our spectroscopic selection. However, we cannot detect any significant differences in galaxy colours as compared to the field.

After a preliminary identification of zCOSMOS-deep candidate groups, we observed seven of them with VLT/FORS2 in an attempt to confirm membership and acquire more members. The targets for observations were selected either because they meet our “success” criterion of vrms ≲ 300 km/s or because they had a high number of already identified members. Most spectacularly, these observations then led to the discovery of a proto-cluster at z = 2.45 with (to date) 11 confirmed members. These galaxies lie within a radius of 1.4 Mpc (physical) on the sky and within ∆v = ±700 km/s and have an estimated overdensity of 10.

We compared this proto-cluster to carefully selected analogous structures in the Millennium simulation, and followed their evolution from z ∼ 2.5 to z = 0. We find that most member galaxies are still occupying their own DM halo at the time of observation. Most of them accrete onto a common halo by z ∼ 1 and form a cluster of $M \gtrsim 10^{14}-10^{15}\,M_\odot/h$ by z = 0, comparable to a Virgo- or Coma-like cluster.

Furthermore we analysed the population of all progenitor galaxies at z ∼ 2.5 that will later occupy these massive z = 0 haloes, going beyond the 11 originally identified members. Whilst most of these galaxies would be too faint to be detectable in a FORS2-like observation, most of them still fulfill the ∆v = ±700 km/s criterion of the previously identified members: if observed, they would have been associated with the proto-cluster. The number of these progenitors amounts to several hundred to thousands of galaxies spread over areas with diameters between 3 and 20 pMpc and therefore far beyond the originally identified members. It seems that an optical selection of proto-clusters like our own results in mostly loose structures and a diverse collection of objects. In order to fully characterise the progenitors of today's massive clusters, one needs to observe wider fields than available in most current observations. This impression was confirmed in a recent simulation paper, where the authors find this to be generally the case for the progenitors of $M \gtrsim 10^{14}-10^{15}\,M_\odot/h$ clusters (Muldrew et al. 2015). Furthermore these numbers imply the existence of an extended structure at z ∼ 2.5 in the zCOSMOS-field, possibly subtending a significant part of the survey area.

8.2 The Lyα-forest-Galaxy cross-correlation at z ∼ 2

We have studied the correlation of the Lyα-forest at z ∼ 2 and the zCOSMOS-deep galaxy sample. A set of 8 QSO-sightlines in the zCOSMOS-deep field has been observed with XSHOOTER. These QSOs have been selected to have redshifts at 2.5 ≲ zQSO ≲ 3, in order to permit the study of the Lyα-forest-Galaxy connection at z ≳ 2. The requirement that they be located within a single field means these QSOs are less bright than QSOs typically used in Lyα-forest studies, and they have been observed at the medium resolution of XSHOOTER rather than at higher resolution. However, the brightest two QSOs were additionally targeted with the high-resolution UVES instrument to study any possible effects of instrument resolution on the inferred Lyα-forest properties.

In the reduction of the XSHOOTER data-set we found that the standard XSHOOTER pipeline does not provide satisfactory results for the spectrum extraction. We therefore developed an algorithm that traces the object on the XSHOOTER 2D spectrum and performs an optimal extraction. We identified intervening metal-line systems redwards of the Lyα-emission of the QSO and removed the metal-line contaminants in the forest itself. A catalogue of these systems has been provided.

In order to perform the continuum estimation for the QSO spectra with minimal “human intervention”, we developed a semi-automated algorithm. It is based on the concept that the flux distribution within the Lyα-forest part of the QSO's spectrum approaches a Gaussian around the continuum level (Gaussian due to the noise in the spectrum) and has a tail towards lower fluxes caused by the Lyα absorption. By estimating the width and peak of that distribution in moving boxes along the spectral direction, we are able to identify pixels belonging to the continuum and in a next step fit them by a cubic spline.

Finally the pixel-optical-depth (POD) method has been invoked to extract the Lyα-absorption strength from the sightlines.
This method has advantages as a quasi-continuous approach (to the level of individual pixels) and therefore reflects the idea that the IGM is a continuous, fluctuating density field.

To assess any potential impact of spectral resolution and signal-to-noise on our optical depth estimate, we first compared measurements from the UVES spectra with the actual XSHOOTER data for the same QSOs to test for possible changes with resolution. Secondly we added noise to an initially very high S/N UVES spectrum and compared the measurements for the high S/N spectrum to the lower S/N. We found that at least in the case of medium versus high spectral resolution and at the intermediate S/N of our data, the effect of S/N on the precision of the POD estimate is more severe than the effect of resolution.

In a first application we used the Lyα-forest data-set to constrain the redshift reliability within the zCOSMOS-deep confidence classes, where each zCOSMOS-redshift had also been assigned a flag reflecting how reliable that redshift was. By assuming that the average optical depth distribution around the zCOSMOS-deep galaxies is independent of confidence class, we measured this average optical depth for each of the zCOSMOS-confidence classes. The galaxies with, according to their flag, the most secure redshifts have been used as a benchmark for 100% reliability. We provided the percentages of reliability for each class, which can serve as a guideline for constructing future redshift samples from zCOSMOS-deep.

The combination of the QSO-sample and the zCOSMOS-deep survey allows the study of the Lyα-forest-Galaxy connection on scales up to ∼ 50−60 cMpc transverse. We used this to infer the Lyα-forest-Galaxy correlation function ξ(rp, π) by measuring the mean overdensity of optical depths at a distance (rp, π) from the galaxies. We presented this correlation function on scales of rp < 60 cMpc and |π| < 100 cMpc, and therefore on scales which, to our knowledge, have not been probed observationally before.
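
To make the estimator concrete, the following minimal sketch computes pixel optical depths as τ = −ln(F/C) and then averages the optical-depth overdensity δ = τ/⟨τ⟩ − 1 in bins of (rp, π) around the galaxies. It is an illustration under simplifying assumptions rather than the pipeline used here: pixels and galaxies are assumed to be given in comoving Cartesian coordinates (two transverse, one along the line of sight), saturated pixels are flagged with a hypothetical criterion F ≤ 3σ, and all names are hypothetical.

```python
import math
from collections import defaultdict


def pixel_optical_depths(flux, continuum, noise, n_sigma=3.0):
    """Return tau = -ln(F/C) per pixel, or None for pixels treated as
    saturated (flux not above n_sigma times the noise; assumed criterion)."""
    taus = []
    for f, c, e in zip(flux, continuum, noise):
        if f <= n_sigma * e:
            taus.append(None)
        else:
            taus.append(-math.log(f / c))
    return taus


def cross_correlation(pixels, galaxies, bin_size=5.0):
    """Mean optical-depth overdensity in square (rp, pi) bins.

    pixels:   list of (x, y, z, tau) with z along the line of sight
    galaxies: list of (x, y, z), same comoving units as the pixels
    Returns a dict mapping (rp bin index, pi bin index) to the mean overdensity."""
    valid = [p for p in pixels if p[3] is not None]
    mean_tau = sum(p[3] for p in valid) / len(valid)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for gx, gy, gz in galaxies:
        for x, y, z, tau in valid:
            rp = math.hypot(x - gx, y - gy)
            pi_los = z - gz
            key = (int(rp // bin_size), math.floor(pi_los / bin_size))
            sums[key] += tau / mean_tau - 1.0
            counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}
```

In this form every galaxy-pixel pair contributes with equal weight; a weighting by pixel S/N would be a natural extension.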

In our map of ξ(rp, π) we detect a compression along the line-of-sight at scales up to ∼ 20−25 cMpc. Such a signature is expected from the so-called Kaiser effect, the coherent infall of matter leading to distortions in redshift space. The magnitude of the effect depends on the bias of the tracer population and the growth of structure, which can be approximated by $\Omega_M^{0.6}$. Typically one would perform a fit of a redshift space distortion model to the measured ξ(rp, π) and infer those parameters from the fit. In our case this is however not feasible due to the low S/N of our data. We therefore constructed a preferred model by assuming parameters expected from a ΛCDM universe, and compared it to the effect we detect in the data. The agreement seems fairly good, and by calculating the corresponding χ², we find no evidence for inconsistency of the data with the model.
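
As a small worked example of the dependence on bias and growth, the sketch below evaluates the simple relation β = ΩM(z)^0.6 / b for a flat ΛCDM universe. The present-day matter density of 0.3 and the bias value in the usage line are hypothetical inputs chosen for illustration, not the values adopted in this thesis, and the relation applies to galaxies only (for the Lyα-forest it is modified as discussed in Chapter 7).

```python
def omega_m_at_z(z, omega_m0=0.3):
    """Matter density parameter at redshift z in a flat LambdaCDM universe
    (radiation neglected)."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)


def kaiser_beta(z, bias, omega_m0=0.3):
    """Infall parameter beta = Omega_M(z)**0.6 / bias."""
    return omega_m_at_z(z, omega_m0) ** 0.6 / bias


# Hypothetical usage: a tracer with bias 2 at z = 2.3.
beta_example = kaiser_beta(2.3, bias=2.0)
```

At z ∼ 2.3 the matter density parameter is already close to unity, so β is mainly set by the bias of the tracer.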

Chapter 9

Outlook: “Hay más futuro que pasado”

As ever, answering some questions will open a whole Universe of others. I will sketch out a few possible further explorations using some of the techniques and data developed in this thesis.

1) What shapes the properties of proto-clusters? The known examples of proto-clusters seem to exhibit quite different properties in terms of their galaxy population. At the same time different authors use differing parameters to assign membership to proto-clusters, which could be part of the cause for the observed diversity. In simulations on the other hand it is straightforward to identify proto-clusters by using the dark matter merger tree. Consequently simulations can be used to infer the best parameters for the identification of proto-clusters. On the other hand, by imposing a proto-cluster selection criterion one can also characterise the nature of the systems one would identify in observational data (as done in chapters 3 and 4). One can therefore explore the variety of criteria used in the literature, which may help to interpret the nature of the proto-clusters found by those criteria. Furthermore one can study, in the simulation, the galaxies in the progenitors of today's clusters and determine whether this proto-cluster population has an intrinsically wide variety of properties. This may further help to understand the cause of the observed diversity. All of this is obviously only correct so long as the simulations are not grossly inconsistent with the observed Universe.

2) How can we identify the field, the intermediate density and the high density environment at z ∼ 2 and do galaxies evolve differently in each of those? As we have established in our proto-group study, a structure that is identified as overdense at z ∼ 2 may take rather different evolutionary paths, from becoming one of the most massive clusters today to ending up in a group-type halo. It is not clear at the point of observation what that path may be.
However, we do know that radio-galaxies are endemic to the high-density peak and often likely to be the centers of proto-clusters which will eventually turn into today's clusters. On the other hand (whilst observationally disputed), simulations predict high luminosity QSOs to be inhabiting intermediate mass haloes. We therefore started an observational programme to perform narrow-band Hα imaging around such high-luminosity QSOs to first of all try to confirm the simulation predictions. If QSOs are indeed representative of an intermediate environment, we then plan to perform a comparative analysis of field galaxies (provided by HiZELS, Best et al. 2013), galaxies in the environment of radio galaxies (provided by an already observed programme with Hα-imaging) and the QSO environment.

3) Having only just started to exploit the Lyα-forest data-set, there are still a number of interesting studies to be performed, for example analysing the information on the high-redshift metal line systems or studying the correlation of zCOSMOS-deep galaxy properties with absorption-strength. Furthermore, we observe a few zCOSMOS-deep proto-groups in the vicinity of QSO-sightlines, so we can use this to constrain the gas-content of such high-redshift overdensities. Last but not least, any such data-set offers the possibility of chance discoveries. Indeed we detect a DLA with an exceptionally high column density of log(N/cm$^{-2}$) ∼ 21.6, which will be interesting for a case study.

4) The Lyα-forest study presented in this thesis has been limited by the sampling and selection of zCOSMOS-deep galaxies, and in particular we have been severely limited at very close transverse distance where the intergalactic medium (IGM) transitions into the circum-galactic medium (CGM). With the new MUSE instrument at hand, a logical continuation of our analysis would be to extend it to ≲ 1 Mpc scales and to study both the CGM as well as the CGM-IGM interface. Combining deep MUSE observations around bright QSOs with already existing high-resolution spectra of the QSOs will allow a very complete galaxy census around the QSOs, including much fainter galaxies than accessible so far. A sufficient number of such sightlines would allow precise maps of HI as well as metal lines around those galaxies.
Due to the MUSE wavelength range this study would have to be conducted at higher redshifts (z ≳ 3) than presented in this thesis, when those objects are still very much in the formation process, and would shed light on the so far elusive details of the gas transfer around those galaxies.

Part V

Appendix

Chapter 10

A new calibration algorithm for KMOS

During my studentship at the European Southern Observatory ESO, I had the chance to complete an instrumental project on KMOS. Its details are described in this chapter.

10.1 Instrument description

10.1.1 Science driver

KMOS (K-Band Multi-Object Spectrograph) is a second generation instrument for the VLT, currently mounted on one of the Nasmyth foci of UT1 and designed to perform multiple object IFU observations in the NIR. The main science driver for KMOS was to study galaxy formation and evolution at high redshift, in particular to get an insight into the physical processes that shape those early galaxies. Crucially the aim was to get spatially resolved information on e.g. star formation for a statistically significant number of objects (hence the multi-plex capability) in a variety of environments (Sharples et al. 2006).

10.1.2 Design and data organisation

KMOS has 24 deployable arms that act as independent IFUs with a FOV of 2.8″ × 2.8″ each, which in their combination can patrol a field of 7.2′ diameter. An arm positions a pick-off mirror in the Nasmyth focal plane from which the light of each subfield is fed into an image slicer creating 14 slices. Each of these 14 “pseudo-slits” contains in turn 14 spatial pixels. Therefore in total, each of the KMOS arms gives rise to 14×14 spaxels. The spectra are then recorded on 3 identical Hawaii 2RG detectors, with each detector collecting the data of eight arms (KMOS User Manual). The data is organised such that the y-direction constitutes the spectral direction, whilst the x-axis is the spatial direction. A sketch of the data path is shown in Figure 10.1.


Fig. 10.1 — Overview on how the data of the KMOS instrument is organised. The top sketch illustrates the image slicing and defines the x- and y-direction of the pick-off mirrors. The bottom panel displays a lamp flat taken with the internal calibration unit. Each of the three KMOS detectors collects the light from eight arms. The individual “slit” spectra originate from the dispersion of one of the 14 pseudo-slits and contain 14 spaxels themselves.

10.1.3 Calibration and re-commissioning procedures

KMOS has an internal calibration unit which uniformly distributes light over a sphere with 24 output apertures (see Figure 10.2). This unit is used for the necessary day-time calibrations that are foreseen in the calibration plan of the instrument (e.g. flat-fields and arc spectra). The procedure requires the pick-off mirrors to be well centered on their respective calibration positions in front of the apertures. Any misplacement will lead to the field not being illuminated uniformly, which can be observed as large scale gradients in the flat-fields. However, after re-commissioning of the instrument following technical work (including warming up the instrument), the arms cannot be expected to be found in the exact same configuration as before, meaning that one will have to readjust the calibration position.

When not in use, the KMOS arms are put in a so-called “parking position”; any internal displacement of the arms is calculated with respect to that position. These movements are represented by two coordinates, a radial one (r) and a rotation (θ). An intervention may now result in a change of the calibration position relative to the parking position. The goal of my project was to find an automated procedure to determine this calibration position, something that had to be done manually up to then.


Fig. 10.2 — This is a photo of the KMOS internal calibration unit. In calibration position, each of the KMOS arms is moved in front of one of the 24 sub-apertures. Image Credit: KMOS User Manual, ESO.

10.2 The calibration position finder algorithm

10.2.1 Basic concept

The “ideal” calibration position is reached when the pick-off mirror (“arm”) is uniformly illuminated by the flat-field lamp. A shift from this ideal position will result in a brightness gradient across the mirror, as part of it is no longer fully illuminated. In the raw flat frame this will manifest itself in two possible gradients, as illustrated in Figure 10.3:
1) A gradient in the x-axis of the pick-off mirror will, due to the slicing, be visible as a gradient within each of the pseudo-slits.
2) A gradient in the y-axis of the pick-off mirror will be visible as an overall gradient across the 14 pseudo-slits.

The calibration procedure plans for flat-field frames to be taken in 60 degree rotation steps of each mirror, i.e. for each IFU flat-fields are taken at 0, 60, ..., 300 degrees. As this is however just a rotation at the actual calibration position, we only have to find the optimal position for the 0 deg case. As part of the KMOS procedures there exists an observing block “find calpos”. It initially moves the arms to the previously defined calibration position and then steps around it on either a 3×3 or a 5×5 grid, taking flat-fields at each position on the grid.

The problem of finding the optimal calibration position is now to minimise simultaneously the overall gradient and the gradient within the pseudo-slits across the flat-fields from the 3×3 or 5×5 grid, at 0 deg rotation.

10.2.2 Implementation

As explained in the previous section, we want to minimise both gradients in x- and y-direction of each IFU. For that we established the following procedure:
1) Determine the 14 “peaks” on the detector for each IFU, i.e. the regions of the detector where the individual slices are dispersed.
2) Determine the gradient in each pseudo-slit, i.e. the gradient in x-direction.
3) Find the gradient in y-direction.
4) Minimise both gradients simultaneously to find the optimal position.
5) Produce a file that records the calibration position for each IFU and which can be used as an input for the internal KMOS parameter files.

[Figure 10.3: two panels showing counts as a function of pixel position along the spatial direction for a trace through one IFU.]

Fig. 10.3 — Both panels show an example for a trace through one IFU, calculated by averaging 100 pixels in the spectral direction. The x-axis is marking the spatial direction. Each peak corresponds to a pseudo-slit, consisting of 14 spaxels. The top panel is an example for an imperfect illumination in the x-direction of the pick-off mirror, exhibiting a gradient within each of the peaks. The bottom panel illustrates the overall gradient due to imperfect illumination in the y-direction.

Location of the slices

We created a cut through the two dimensional flat frames by collapsing a range of pixels in dispersion direction. Both the exact parameters for this range as well as whether the collapsing is done by a median or a mean can be adjusted by the user of the algorithm. We then get a first guess of the peaks by assigning all pixels above a minimum value (also controlled by the user) to the peak. The actual peak is defined as the n highest values from the first guess. The number n is again user-defined, but can obviously never exceed 14.
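
The peak location step can be sketched as follows. This is a simplified illustration rather than the code delivered for KMOS: the frame is assumed to be a plain list of detector rows, with each row running along the spatial direction, and a peak is taken to be a contiguous run of pixels above the threshold. The function and parameter names are hypothetical.

```python
from statistics import mean, median


def collapse_rows(frame, row_start, row_stop, use_median=False):
    """Collapse the rows row_start..row_stop-1 (dispersion direction) into a
    one-dimensional trace along the spatial direction."""
    combine = median if use_median else mean
    return [combine(column) for column in zip(*frame[row_start:row_stop])]


def find_peaks(trace, min_value, n=14):
    """Return one list of pixel indices per peak.

    First guess: contiguous runs of pixels above min_value.
    Final peak: the n highest pixels of each run, in spatial order."""
    runs, current = [], []
    for index, value in enumerate(trace):
        if value > min_value:
            current.append(index)
        elif current:
            runs.append(current)
            current = []
    if current:
        runs.append(current)

    peaks = []
    for run in runs:
        highest = sorted(run, key=lambda i: trace[i], reverse=True)[:n]
        peaks.append(sorted(highest))
    return peaks
```

For a well-chosen threshold this yields 14 index lists per IFU, one for each pseudo-slit.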

Calculating the gradients

An imperfect illumination in the x-direction of a pick-off mirror will appear as gradients within the individual pseudo-slits (top panel of Figure 10.3). We fit a slope to each of the 14 peaks of a given IFU. Optionally, the algorithm rejects the steepest and the shallowest slope. In the final quality assessment the mean of all slopes within an IFU is used. An imperfect illumination in y-direction will manifest itself as a gradient across all 14 peaks of the IFU (bottom panel of Figure 10.3). We therefore fit a slope to the mean counts of each peak.
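
A minimal sketch of the two gradient measurements is given below. It assumes an ordinary least-squares line fit, ranks slopes by their absolute value when rejecting the steepest and the shallowest one, and fits the overall gradient against the mean pixel position of each peak; these details, like the function names, are assumptions made for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def ifu_gradients(trace, peaks, reject_extremes=True):
    """Return (slope_x, slope_y) for one IFU.

    trace: counts along the spatial direction
    peaks: list of pixel-index lists, one per pseudo-slit
    slope_x: mean slope within the pseudo-slits
    slope_y: slope of the mean counts across the pseudo-slits"""
    slopes = [fit_slope(p, [trace[i] for i in p]) for p in peaks]
    if reject_extremes and len(slopes) > 2:
        # Drop the shallowest and the steepest slope (by absolute value).
        slopes = sorted(slopes, key=abs)[1:-1]
    slope_x = sum(slopes) / len(slopes)

    centres = [sum(p) / len(p) for p in peaks]
    mean_counts = [sum(trace[i] for i in p) / len(p) for p in peaks]
    slope_y = fit_slope(centres, mean_counts)
    return slope_x, slope_y
```

Both slopes are in counts per pixel, so their absolute values can be compared between the positions of the grid.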

Finding the optimal position

The above procedure results in a pair of slope-values for each of the possible calibration positions on the 3×3 or 5×5 grid. The absolute values of either of the slopes are then assigned a rank ranging from 0-8 (for the 3×3 grid) or 0-24 (for the 5×5 grid), with the shallowest slope corresponding to the lowest rank. We then calculate an estimator for the optimal position as $\mathrm{rank}_x^2 + \mathrm{rank}_y^2$. The optimal position is defined as the minimum of this number. In the case that there are two optimal positions, the one with the smaller $\mathrm{rank}_x + \mathrm{rank}_y$ is defined as optimum. The square in $\mathrm{rank}_x^2 + \mathrm{rank}_y^2$ ensures that an optimal position is preferred where both the x- and y-direction have low ranks, as opposed to a situation where only one direction is optimised and the other still has a relatively large gradient (e.g. twice a rank of 5 is preferred over a 0 and a 10).
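
The ranking can be sketched in a few lines. The sketch assumes that the estimator is the sum of the squared ranks and that ties are broken by the plain sum of the ranks, as read from the text; the function names are hypothetical.

```python
def rank_abs(slopes):
    """Rank the slopes by absolute value; the shallowest slope gets rank 0."""
    order = sorted(range(len(slopes)), key=lambda i: abs(slopes[i]))
    ranks = [0] * len(slopes)
    for rank, index in enumerate(order):
        ranks[index] = rank
    return ranks


def best_position(ranks_x, ranks_y):
    """Index of the grid position that minimises rank_x**2 + rank_y**2,
    with rank_x + rank_y as tie-breaker (remaining ties go to the first position)."""
    scores = [(rx**2 + ry**2, rx + ry, index)
              for index, (rx, ry) in enumerate(zip(ranks_x, ranks_y))]
    return min(scores)[2]


# Hypothetical slopes for a three-position example.
slopes_x = [0.1, 0.5, 0.3]
slopes_y = [0.9, 0.2, 0.1]
optimum = best_position(rank_abs(slopes_x), rank_abs(slopes_y))
```

The example from the text is reproduced by `best_position([5, 0], [5, 10])`, which prefers the first position because 50 is smaller than 100.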

Output

The algorithm writes the optimal position in r and θ, together with the corresponding IFU number to a file. Optionally the user can plot the fits for the optimal position as well as detailed information (rank, slopes, mean peak counts) for all positions. An example for such a plot is shown in Figure 10.4. This algorithm has now been established as a part of the KMOS procedures and has successfully been used during the last recommissioning in March 2015.

10.2.3 Other possible approaches

Instead of fitting gradients on the raw data, one could also perform an image reconstruction and analyse the two-dimensional flat field directly. This may be somewhat more intuitive but has to rely on the image reconstruction, which may introduce some effects itself. Another option is to use sky-flats as a “benchmark”: those would in principle be gradient free, therefore minimising the difference between the lamp-flats and the sky-flats would also solve the problem of the optimal calibration position. When testing this, the two approaches gave consistent results. However, whilst this procedure can use the raw data directly, it is not ideal since the determination of the calibration position needs to be done after a KMOS intervention and during the designated engineering time. It would therefore require relatively clear skies at this exact point in time, which may not always be the case. The advantage of the approach taken is that it does not rely on any image reconstruction or additional data except for the lamp-flats themselves.


Fig. 10.4 — This is an example for the output of the calibration position algorithm. The top panel shows (line by line) the slopes for both y- and x-direction, the corresponding rank, and the values for the position (r/θ) together with (as an additional check) the average count over all peaks, which should be higher for more evenly illuminated pick-off mirrors. The position highlighted in red is the optimal position (it coincides with the lowest rank in x-direction); the position with lowest rank in y is shown in cyan. The middle and the lower panels show the fits in y- and x-direction respectively for the optimal position. The yellow slopes in the bottom panel have been discarded. For details, see text.

Bibliography

Abadi, M. G., Moore, B., & Bower, R. G. 1999, MNRAS, 308, 947

Aguirre, A., Schaye, J., & Theuns, T. 2002, ApJ, 576, 1

Balogh, M., Eke, V., Miller, C., et al. 2004, MNRAS, 348, 1355

Behroozi, P. S., Wechsler, R. H., & Conroy, C. 2013, ApJL, 762, L31

Bennett, C. L., Larson, D., Weiland, J. L., et al. 2013, ApJS, 208, 20

Berlind, A. A., Frieman, J., Weinberg, D. H., et al. 2006, ApJS, 167, 1

Bergeron, J. 1986, A&A, 155, L8

Best, P., Smail, I., Sobral, D., et al. 2013, Astrophysics and Space Science Proceedings, 37, 235

Birrer, S., Lilly, S., Amara, A., Paranjape, A., & Refregier, A. 2014, ApJ, 793, 12

Blaizot, J., Wadadekar, Y., Guiderdoni, B., et al. 2005, MNRAS, 360, 159

Borgani, S. 2006, arXiv:astro-ph/0605575

Boylan-Kolchin, M., Springel, V., White, S. D. M., Jenkins, A., & Lemson, G. 2009, MNRAS, 398, 1150

Capak, P. L., Riechers, D., Scoville, N. Z., et al. 2011, Nature, 470, 233

Capak, P., Aussel, H., Ajiki, M., et al. 2007, ApJS, 172, 99

Castignani, G., Chiaberge, M., Celotti, A., Norman, C., & De Zotti, G. 2014, ApJ, 792, 114

Cooke, E. A., Hatch, N. A., Muldrew, S. I., Rigby, E. E., & Kurk, J. D. 2014, MNRAS, 440, 3262

Cowie, L. L., & Songaila, A. 1998, Nature, 394, 44

Cucciati, O., Zamorani, G., Lemaux, B. C., et al. 2014, A&A, 570, A16

Cucciati, O., Marinoni, C., Iovino, A., et al. 2010, A&A, 520, A42

Daddi, E., Cimatti, A., Renzini, A., et al. 2004, ApJ, 617, 746

De Lucia, G., & Blaizot, J. 2007, MNRAS, 375, 2

Díaz-Giménez, E., & Zandivarez, A. 2015, A&A, 578, A61

Diener, C., Lilly, S. J., Ledoux, C., et al. 2015, ApJ, 802, 31

Diener, C., Lilly, S. J., Knobel, C., et al. 2013, ApJ, 765, 109

Dressler, A. 1980, ApJ, 236, 351

Eke, V. R., Baugh, C. M., Cole, S., et al. 2004, MNRAS, 348, 866

Elvis, M., Civano, F., Vignali, C., et al. 2010, VizieR Online Data Catalog, 218, 40158

Font-Ribera, A., Miralda-Escudé, J., Arnau, E., et al. 2012, JCAP, 11, 059

Font-Ribera, A., Arnau, E., Miralda-Escudé, J., et al. 2013, JCAP, 5, 018

Font-Ribera, A., Kirkby, D., Busca, N., et al. 2014, JCAP, 5, 027

Font, A. S., Bower, R. G., McCarthy, I. G., et al. 2008, MNRAS, 389, 1619

Gerke, B. F., Newman, J. A., Davis, M., et al. 2005, ApJ, 625, 6

Gobat, R., Daddi, E., Onodera, M., et al. 2011, A&A, 526, A133

Guaita, L., Gawiser, E., Padilla, N., et al. 2010, ApJ, 714, 255

Gunn, J. E., & Gott, J. R. 1972, ApJ, 176, 1

Guo, Q., White, S., Boylan-Kolchin, M., et al. 2011, MNRAS, 413, 101

Hatch, N. A., De Breuck, C., Galametz, A., et al. 2011, MNRAS, 410, 1537

Henriques, B. M. B., White, S. D. M., Lemson, G., et al. 2012, MNRAS, 421, 2904

Hilton, M., Lloyd-Davies, E., Stanford, S. A., et al. 2010, ApJ, 718, 133

Hopkins, A. M., & Beacom, J. F. 2006, ApJ, 651, 142

Horne, K. 1986, PASP, 98, 609

Huchra, J. P., & Geller, M. J. 1982, ApJ, 257, 423

Ilbert, O., McCracken, H. J., Le Fèvre, O., et al. 2013, A&A, 556, A55

Ilbert, O., Capak, P., Salvato, M., et al. 2009, ApJ, 690, 1236

Jenkins, A., Frenk, C. S., White, S. D. M., et al. 2001, MNRAS, 321, 372

Jenkins, A., Frenk, C. S., Pearce, F. R., et al. 1998, ApJ, 499, 20

Kawata, D., & Mulchaey, J. S. 2008, ApJ, 672, L103

Kitzbichler, M. G., & White, S. D. M. 2007, MNRAS, 376, 2

KMOS User Manual P93, https://www.eso.org/sci/facilities/paranal/instruments/kmos/doc/VLT-MAN-KMO-146603-001 P93.pdf

Knobel, C., Lilly, S. J., Woo, J., & Kovač, K. 2015, ApJ, 800, 24

Knobel, C., Lilly, S. J., Kovač, K., et al. 2013, ApJ, 769, 24

Knobel, C., Lilly, S. J., Iovino, A., et al. 2012, ApJ, 753, 121

Knobel, C., Lilly, S. J., Iovino, A., et al. 2009, ApJ, 697, 1842

Kodama, T., Tanaka, I., Kajisawa, M., et al. 2007, MNRAS, 377, 1717

Komatsu, E., Smith, K. M., Dunkley, J., et al. 2011, ApJS, 192, 18

Kovač, K., Lilly, S. J., Knobel, C., et al. 2014, MNRAS, 438, 717

Koyama, Y., Kodama, T., Tadaki, K.-i., et al. 2014, ApJ, 789, 1

Larson, R. B., Tinsley, B. M., & Caldwell, C. N. 1980, ApJ, 237, 692

Lee, K. G., Hennawi, J. F., White, M., et al. 2015, arXiv:1509.02833

Lemaux, B. C., Cucciati, O., Tasca, L. A. M., et al. 2014, A&A, 572, A41

Lemson, G., & the Virgo Consortium 2006, arXiv:astro-ph/0608019

Lemson, G., & Springel, V. 2006, Astronomical Data Analysis Software and Systems XV, 351, 212

Lilly, S. J., et al., in preparation

Lilly, S. J., Peng, Y., Carollo, M., & Renzini, A. 2013, IAU Symposium, 295, 141

Lilly, S. J., Le Brun, V., Maier, C., et al. 2009, ApJS, 184, 218

Lilly, S. J., Le Fèvre, O., Renzini, A., et al. 2007, ApJS, 172, 70

Lilly, S. J., Le Fevre, O., Hammer, F., & Crampton, D. 1996, ApJL, 460, L1

Lynds, R. 1971, ApJ, 164, L73

Mainieri, V., Hasinger, G., Cappelluti, N., et al. 2008, VizieR Online Data Catalog, 217, 20368

Marinoni, C., Davis, M., Newman, J. A., & Coil, A. L. 2002, ApJ, 580, 122

McDonald, P. 2003, ApJ, 585, 34

McCracken, H. J., Milvang-Jensen, B., Dunlop, J., et al. 2012, A&A, 544, A156

Miley, G. K., Overzier, R. A., Zirm, A. W., et al. 2006, ApJL, 650, L29

Moore, B., Katz, N., Lake, G., Dressler, A., & Oemler, A. 1996, Nature, 379, 613

Moustakas, L. A., & Somerville, R. S. 2002, ApJ, 577, 1

Muldrew, S. I., Hatch, N. A., & Cooke, E. A. 2015, arXiv:1506.08835

Murphy, D. N. A., Geach, J. E., & Bower, R. G. 2012, MNRAS, 420, 1861

Oemler, A. Jr. 1974, ApJ, 194, 1

Papovich, C., Momcheva, I., Willmer, C. N. A., et al. 2010, ApJ, 716, 1503

Pâris, I., Petitjean, P., Rollinde, E., et al. 2011, A&A, 530, A50

Pasquali, A., Gallazzi, A., Fontanot, F., et al. 2010, MNRAS, 407, 937

Peng, Y.-j., Lilly, S. J., Renzini, A., & Carollo, M. 2012, ApJ, 757, 4

Peng, Y.-j., Lilly, S. J., Kovač, K., et al. 2010, ApJ, 721, 193

Petitjean, P., Rauch, M., & Carswell, R. F. 1994, A&A, 291, 29

Prescott, M., Baldry, I. K., James, P. A., et al. 2011, MNRAS, 417, 1374

Pych, W. 2004, PASP, 116, 148

Rakic, O., Schaye, J., Steidel, C. C., & Rudie, G. C. 2012, ApJ, 751, 94

Rakic, O., Schaye, J., Steidel, C. C., et al. 2013, MNRAS, 433, 3103

Reddy, N. A., Steidel, C. C., Pettini, M., et al. 2008, ApJS, 175, 48

Rykoff, E. S., Rozo, E., Busha, M. T., et al. 2014, ApJ, 785, 104

Rudie, G. C., Steidel, C. C., Trainor, R. F., et al. 2012, ApJ, 750, 67

Schinnerer, E., Sargent, M. T., Bondi, M., et al. 2010, ApJS, 188, 384

Scodeggio, M., Franzetti, P., Garilli, B., et al. 2005, PASP, 117, 1284

Scoville, N., Aussel, H., Benson, A., et al. 2007, ApJS, 172, 150

Sharples, R., Bender, R., Bennett, R., et al. 2006, Proc. SPIE, 6269, 62691C

Shimakawa, R., Kodama, T., Tadaki, K.-i., et al. 2014, MNRAS, 441, L1

Skibba, R. A. 2009, MNRAS, 392, 1467

Slosar, A., Font-Ribera, A., Pieri, M. M., et al. 2011, JCAP, 9, 001

Slosar, A., Iršič, V., Kirkby, D., et al. 2013, JCAP, 4, 026

Spitler, L. R., Labbé, I., Glazebrook, K., et al. 2012, ApJL, 748, L21

Spitzer, L. Jr., & Baade, W. 1951, ApJ, 113, 413

Springel, V., White, S. D. M., Jenkins, A., et al. 2005, Nature, 435, 629

Steidel, C. C., Adelberger, K. L., Shapley, A. E., et al. 2005, ApJ, 626, 44

Steidel, C. C., Shapley, A. E., Pettini, M., et al. 2004, ApJ, 604, 534

Suzuki, N., Tytler, D., Kirkman, D., O’Meara, J. M., & Lubin, D. 2005, ApJ, 618, 592

Tanaka, M., Finoguenov, A., & Ueda, Y. 2010, ApJL, 716, L152

Tran, K. V. H., Papovich, C., Saintonge, A., et al. 2010, ApJL, 719, L126

Trenti, M., Bradley, L. D., Stiavelli, M., et al. 2012, ApJ, 746, 55

van den Bosch, F. C., Aquino, D., Yang, X., et al. 2008, MNRAS, 387, 79

Venemans, B. P., Röttgering, H. J. A., Miley, G. K., et al. 2007, A&A, 461, 823

Wang, J., De Lucia, G., Kitzbichler, M. G., & White, S. D. M. 2008, MNRAS, 384, 1301

Weinmann, S. M., Kauffmann, G., van den Bosch, F. C., et al. 2009, MNRAS, 394, 1213

Wetzel, A. R., Tinker, J. L., Conroy, C., & van den Bosch, F. C. 2013, MNRAS, 432, 336

White, R. L., Becker, R. H., Helfand, D. J., & Gregg, M. D. 1997, ApJ, 475, 479

Wolf, C. 2003, A&A, 408, 499

Wuyts, S., Förster Schreiber, N. M., van der Wel, A., et al. 2011, ApJ, 742, 96

XSHOOTER Pipeline Manual v12.5, ftp://ftp.eso.org/pub/dfs/pipelines/xshooter/xshoo-pipeline-manual-12.5.pdf

List of publications

A proto-cluster at z = 2.45
Diener, C., Lilly, S.J., Ledoux, C., et al. 2015, ApJ, 802, 31D

Proto-groups at 1.8 < z < 3 in zCOSMOS-deep
Diener, C., Lilly, S.J., Knobel, C., et al. 2013, ApJ, 765, 109D

Environmental Effects in the Interaction and Merging of Galaxies in zCOSMOS
Kampczyk, P., Lilly, S.J., de Ravel, L., ... Diener, C., et al. 2013, ApJ, 762, 43K

Spot the difference. Impact of different selection criteria on early-type galaxies observed properties in zCOSMOS 20-k sample
Moresco, M., Pozzetti, L., Cimatti, A., ... Diener, C., et al. 2013, A&A, 558, 61M

Improved constraints on the expansion rate of the Universe up to z ∼ 1.1 from the spectroscopic evolution of cosmic chronometers
Moresco, M., Cimatti, A., Jimenez, R., ... Diener, C., et al. 2012, JCAP, 08, 006M

Quest for COSMOS Submillimeter Galaxy Counterparts using CARMA and VLA: Identifying Three High-redshift Starburst Galaxies
Smolcic, V., Navarrete, F., Aravena, M., ... Diener, C., et al. 2012, ApJS, 200, 10S

The Radial and Azimuthal Profiles of Mg II Absorption around 0.5 < z < 0.9 zCOSMOS Galaxies of Different Colors, Masses, and Environments
Bordoloi, R., Lilly, S.J., Knobel, C., ... Diener, C., et al. 2011, ApJ, 743, 10B

CURRICULUM VITAE - CATRINA DIENER

Institute for Astronomy, ETH Zurich, Switzerland
Phone: +41 76 464 23 58
Date of birth: 2. 3. 1987
Nationality: Swiss

Scientific working experience

10/2011 - 10/2015  PhD studies on “The role of environment in galaxy evolution at z>2”, Institute for Astronomy, ETH Zurich, Switzerland. Supervisor: Prof. Simon Lilly
10/2013 - 12/2014  ESO Studentship at ESO Vitacura, Chile. Continuation of PhD studies. Supervisor: Dr. Cédric Ledoux
03/2011 - 09/2011  Research Assistant, Institute for Astronomy, ETH Zurich
10/2010 - 02/2011  Master Thesis, Institute for Astronomy, ETH Zurich. Supervisor: Prof. Simon Lilly. Topic: Groups at z~2. Grade: 5.75 (1=lowest and 6=highest grade)
03/2010 - 09/2010  Junior Research Assistant, Institute for Astronomy, ETH Zurich
03/2009 - 02/2010  Semester Thesis, Institute for Astronomy, ETH Zurich. Supervisor: Prof. M. Carollo. Topic: Testing and improving SED models for photometric redshift determination

Education

10/2011 - 10/2015  PhD studies, Observational Cosmology Group, Institute for Astronomy, ETH Zurich and ESO Vitacura, Chile
09/2009 - 09/2011  Master of Science ETH in Physics, ETH Zurich
10/2005 - 07/2009  Bachelor of Science ETH in Physics, ETH Zurich

Grants and scholarships awarded

11/2014  Grant SCFA (Swiss Commission for Astronomy), 1000 CHF for living costs in Santiago to extend the ESO studentship
03/2014  Travel Grant SSAA (Swiss Society for Astrophysics and Astronomy), 1000 CHF for attending the conference “Future Directions in Galaxy Cluster Surveys”, Paris, France
10/2013  ESO studentship for ESO Vitacura, Chile. Placement at ESO for one year whilst continuing my PhD
03/2013  Travel Grant SSAA, 1500 CHF for travel to Santiago, Chile, visit at ESO Vitacura and collaboration with researchers at Universidad Catolica
03/2012  Travel Grant SSAA, 600 CHF for attending the conference “Growing-up at high redshift”, ESA Madrid

Teaching Experience

02/2015 - 05/2015  Exercise classes in Physics II for ETH students
01/2012 & 01/2013  Supervising ETH students during observational week on 8cm telescopes
02/2012 - 12/2012  Exercise classes in Physics I&II for ETH students
02/2012 - 09/2012  Supervision of a Masters Thesis. Student: Valentina Tamburello. Topic: Galaxy merging in zCOSMOS: comparing simulations and data
03/2009 - 07/2010  Teaching Assistant at the Department of Mathematics, ETH Zurich

Outreach and other activities

11/2014  Organisation of a “tool” workshop for ESO students: Introduction to various data reduction and data analysis techniques
08/2014  Co-organising a proposal writing workshop for ESO students and fellows
11/2013  Outreach project on Easter Island: Teaching primary school students the basics of astronomy
05/2012  Scientific Consultant for an Astronomy exhibition in the Swiss Traffic Museum
11/2011  SOC talk: Introducing ETH undergraduates to possible thesis topics
07/2011  “Highschool student week”: Giving visiting highschool students a first insight into astronomical data and research

Further skills

Languages:
  German: Mother tongue
  English: Fluent (Level C1 in the European Reference Frame)
  Spanish: Very good (Level B2)
  French: Good (Level B1)
  Italian: Good (Level B1)
  Swedish: Basic (Level A1.2)

Programming: Matlab, C++, Python, SQL
Data reduction: IRAF, ESO pipelines and associated tools (esorex, gasgano and Reflex)
Instruments: XSHOOTER, FORS2, ISAAC, HAWKI, UVES, KMOS (all VLT), SPARTAN (SOAR), FourStar (Magellan), IMACS (Magellan), DECam (Blanco 4m/CTIO)
Techniques: Echelle and multi-slit spectroscopy, broad and narrow band imaging, IFU, developing tools for data reduction and analysis for XSHOOTER, ISAAC and KMOS
Other software: LaTeX, P2PP, FIMS, DS9, topcat, Aladin, SExtractor

Acknowledgments

First and foremost I would like to thank my supervisor Simon Lilly. This thesis owes a great deal to your inspiration and insight. Undoubtedly our paths will cross again, but I want to thank you with my whole heart for having taken the time and patience to guide my first steps towards becoming a researcher. It has been a true privilege and honour to learn from you.

During the 3rd year of my PhD I spent an unforgettable year at ESO Chile working with Cédric Ledoux. It has been a wonderful experience working with you, both as a scientist and as a person, and your continuous support throughout the time at ESO and beyond has been invaluable. Your dedication, knowledge and enthusiasm are the best example to me. Thank you, Cédric!

I would like to thank my friends and colleagues for the countless coffee breaks, beers and chats, in particular Andreas Faisst, Martin Bernet and Christian Knobel from ETH. A special thank you goes to my friends in Chile: Matt Shultz for the fun and interesting discussions, and Juan Carlos Beamin and Mirjam de Boere for the morning coffees; I still miss them! Stephane Brillant definitely deserves an extra mention: thank you for being such a wonderful friend and also a mentor (on “boring stars” and “stuff”...)! A very big thank you also goes to Rongmon Bordoloi: it is always a pleasure to chat with you, share coffees, beers and mojitos, and discuss life, the Universe and evil. Thank you for your help and support and for loads of last-minute comments and corrections to this thesis. In the last few weeks Will Hartley has become “the” one person to bug with even the most ignorant questions; thank you so much for your patience, time and help!

All throughout my time at ETH I have been a member of the cathedral choir St. Gallen. The Friday evening rehearsal is a pillar in my life and has often been the one thing I was looking forward to. Thank you, Hans and Kim, for all the joy I could experience and that has kept me motivated for my work, and thanks to Helen, Ruth, Milo and many more for all the wonderful hours spent together.

David Murphy deserves a huge thank you for the big job of reading through large parts of this thesis and providing corrections and suggestions. Thank you for all of this but, even more importantly, for your patience and support, for enduring me when I was most stressed, and of course for sharing the last 3 years of this journey!

Last, I would like to thank my family, my mother and my sister. You have seen me all the way through, always been there, always supported me. I can’t even start to put a name on all the things you have done for me and what it has meant to me. I am most lucky to have you in my life. Thank you!
