BIODIVERSITY, DISTRIBUTION AND

EVOLUTION OF ENDOLITHIC

MICROORGANISMS IN CORAL SKELETONS

Vanessa Rossetto Marcelino

ORCID 0000-0003-1755-0597

Doctor of Philosophy

School of Biosciences

The University of Melbourne

Thesis submitted in total fulfillment of the requirements of the degree of Doctor of Philosophy

December 2016

ABSTRACT

Prokaryotic and eukaryotic microbes regulate key processes in reef ecosystems but very little is known about the biodiversity of microorganisms living inside coral skeletons (i.e. endolithic). Endolithic microalgae, for example, are among the main contributors to reef bioerosion and can facilitate coral survival during bleaching events, but their phylogenetic diversity, distribution and evolutionary origins are largely unknown. We developed a high-throughput sequencing procedure to assess the biodiversity of prokaryotic and eukaryotic microbes in coral skeletons. A surprisingly high biodiversity was found, including entirely new lineages that are distantly related to known genera. This technique was then applied to study the relative effects of niche specialisation and neutral processes on the spatial distribution of endolithic communities. The results indicated that stochastic processes and dispersal limitation create a high rate of bacterial turnover within colonies, while niche specialisation explains most of the distribution of endolithic microbes at larger spatial scales. Finally, we studied whether signatures of an endolithic lifestyle could be observed in the genome of a common endolithic alga. The results suggested that chloroplast genome streamlining and slow rates of molecular evolution are associated with the low light inherent to endolithic lifestyles.

DECLARATION

This is to certify that:

i. The thesis comprises only my original work towards the PhD except where indicated in the preface.

ii. Due acknowledgment has been made in the text to all other material used.

iii. The thesis is fewer than 100,000 words in length, exclusive of tables, bibliographies and appendices.

______

Vanessa R. Marcelino

PREFACE

The four data chapters of this thesis have been written for submission to refereed journals and involve collaborators.

Chapter 2 has been published as: Marcelino VR, Verbruggen H (2016) Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Scientific Reports 6, 31508.

Both authors conceptualized the study and collected the samples. VRM performed DNA extractions and library preparation. VRM and HV developed the analysis pipeline. VRM wrote the manuscript, HV edited. VRM contributed ~85% of the work.

Chapter 3 is being prepared for publication in collaboration with Kathleen M. Morrow, Madeleine van Oppen, David G. Bourne and Heroen Verbruggen.

KMM and DGB were responsible for sample collection. VRM performed DNA extractions, library preparation and analyses. VRM wrote the manuscript and received comments from all other authors. VRM contributed ~70% of the work.

Chapter 4 is being prepared for publication in collaboration with Eric Treml, Madeleine van Oppen and Heroen Verbruggen. VRM and HV conceptualized the study and collected the samples. VRM performed DNA extractions, library preparation and analyses. ET, MO and HV provided feedback and supervision for analyses. HV produced scripts for species accumulation curves. VRM wrote the manuscript and received feedback from HV. VRM contributed ~90% of the work.

Chapter 5 has been published as: Marcelino VR, Cremen MC, Jackson CJ, Larkum AA, Verbruggen H (2016) Evolutionary dynamics of chloroplast genomes in low light: a case study of the endolithic green alga Ostreobium quekettii. Genome Biol Evol 8, 2939-2951.

VRM and HV conceptualized the study. CJ and CMC performed DNA extractions. VRM and CC performed genome assemblies and annotations. VRM performed the analyses. VRM wrote the manuscript and received comments from all other authors. VRM contributed ~85% of the work.

ACKNOWLEDGMENTS

Back in 2012 Heroen Verbruggen suggested that I do a PhD project on the diversity of limestone-boring algae using high-throughput sequencing. I had never heard of boring algae before and doubted his unfounded expectation of finding several algal species inside coral skeletons. The idea was intriguing though, or perhaps I just wanted to prove him wrong, so I accepted the challenge. Well, he was right. I am immensely thankful to him for giving me the opportunity to pursue this project, for being supportive about the directions and subprojects I have tried to pursue, and for being patient and still supportive when they failed. Heroen’s optimism, determination and passion for science are contagious and this project would not have been possible without his excellent supervision.

I am very thankful to Tom Schils (University of Guam) for sending me the very first coral skeleton samples I analysed and for supporting the boring algae idea from the start (still in 2012). I am also thankful to Tom’s student Adrian Kense for his interest and for sharing some good and bad library prep moments.

In 2013 I started my PhD, and the first thing a PhD student in Australia needs is an advisory committee. I am very thankful to Jane Elith, Ed Newbigin and Mick Keough; their well-grounded opinions were extremely valuable and our discussions during the progress meetings were very educational for me. I am also very grateful to my co-supervisor Madeleine van Oppen for her insightful input during those discussions, but I am writing these acknowledgments in chronological order so I will get to Madeleine later.

The second thing a PhD student needs is samples and mine were underwater. I would like to thank Mel Tate and the Melbourne University Underwater Club for teaching me the first Open Water dive courses, and the Diveline crew for making me a dive master. The field work associated with this thesis was fantastic thanks to all the amazing people involved in it. My first field trip was in Western Australia and I would like to thank Seraphya Berrin, Heroen Verbruggen, Joana Ferreira Costa, Lambros Stravias and the personnel from Murdoch University Coral Bay Research Station. Then my first samples from the renowned Great Barrier Reef were collected on Keppel Islands thanks to Guillermo Diaz-Pulido (Griffith University), his student Carlos Del Monaco and the skipper Peter Williams. This was during my mid-PhD “why am I doing this?” crisis moment. Guillermo is one of the few scientists on Earth who has interest and experience with boring algae; his genuine excitement was a very important motivation to carry on with this project and I am very thankful for that. Maybe that was the best outcome of this field trip as I lost all the collected samples due to a bad storage buffer… Same fate for all deep sea samples kindly provided by Heather Spalding (University of Hawaii), thanks anyway Heather. The final large field excursion was carried out on Heron Island, and I am thankful to Chiela Cremen, Pilar Diaz-Tapia and Heroen Verbruggen, in addition to the helpful staff at Heron Island. On the way back we had a chance to pass by the Keppels to re-collect those lost samples, but the windy weather only allowed us to dive once. Oh well, I got some samples and the trip was fun anyway. A substantial part of my field work was supported by philanthropic funding, and I am very grateful to the Botany Foundation and the Holsworth Wildlife Research Endowment.

Then comes the molecular work, and dozens of people have helped me enormously. Before the Verbruggen lab physically existed I was hosted by the Systematics and the Malaria labs. I would like to thank Mike Bayly and Geoffrey McFadden for letting me use their facilities, and all their staff and students for welcoming and helping me around, including Erin Batty, Anton Cozijnsen and Vanessa Mollard. Special thanks to Todd McLay for helping with the library preparation issues, and there were a lot of issues. The staff from the sequencing facilities have also been very helpful, and I would like to thank Kym Pham (Centre for Translational Pathology), Matthew Tinning (Australian Genome Research Facility) and Stephen Wilcox (Walter and Eliza Hall Institute). I would like to thank Cassie Watts, Ouda Khammy and Kirsty for the countless favors. Last but not least, I would like to thank all past and present members of the Verbruggen lab, especially Joana Costa, Chiela Cremen, Lupita Bribiesca and Chris Jackson for their assistance with all sorts of stuff, for all the SWASIs we went through (and survived), and for making the everyday work more fun.

Not only algae live inside coral skeletons and I thought it would be interesting to venture into the world of prokaryotes, but they were so scarily unfamiliar to me. Luckily Madeleine van Oppen moved to Melbourne Uni in 2015 and agreed to be my co-supervisor, which gave me a boost of confidence to work on bacteria. I am very thankful to her for assisting with her extensive expertise, nice ideas and even moral support. I would also like to thank Raquel Peixoto (Universidade Federal do Rio de Janeiro) and the members of her lab for hosting me during an internship, which gave me a taste of what it is like to work in a microbiology lab. I am also grateful to Kathy Morrow and David Bourne (AIMS) for providing samples and microbial ecology expertise.

I am also thankful to the always-happy Joan and Asmira (Melbourne Uni) for making the university feel more like home. To Fran and Witold for making Australia feel more like home too.

Finally, I would like to thank family and friends back home in Brazil. It was always comforting to hear that if my thesis would totally fail and I had to come back home without a PhD, at least someone would be happy. Well mom, it did not happen.

TABLE OF CONTENTS

Abstract

Declaration

Preface

Acknowledgements

Table of Contents

List of Figures

Chapter 1: Introduction

1.1 The state of the art in sequencing techniques

1.2 Diversity and function of endolithic organisms

1.3 Distribution and structure of endolithic communities

1.4 Evolutionary considerations

1.5 Aims and scope

1.6 References

Chapter 2: Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae

2.1 Introduction

2.2 Results

2.3 Discussion

2.4 Materials and Methods

2.5 References

2.6 Supplementary Materials

Chapter 3: Diversity and stability of coral endolithic communities at a natural high pCO2 reef

3.1 Introduction

3.2 Materials and Methods

3.3 Results

3.4 Discussion

3.5 References

3.6 Supplementary Materials

Chapter 4: Distribution of endolithic microbial communities

4.1 Background

4.2 Spatial structure of the endolithic microbiome within and between coral colonies

4.3 Niche specificity and neutral processes underlying the distribution of endolithic microbes

4.4 References

4.5 Supplementary Materials

Chapter 5: Evolutionary dynamics of chloroplast genomes in low light: a case study of the endolithic green alga Ostreobium quekettii

5.1 Introduction

5.2 Materials and Methods

5.3 Results

5.4 Discussion

5.5 References

5.6 Supplementary Materials

Chapter 6: Discussion and Perspectives

6.1 Biodiversity

6.2 Distribution

6.3 Evolution

6.4 The next steps

6.5 References

LIST OF FIGURES

1.1 Coral fragments revealing a green band of endolithic algae

2.1 The multi-marker metabarcoding approach

2.2 Relative abundances of taxa in our biodiversity assessment

2.3 Maximum Likelihood phylogeny of green algae

3.1 Microorganisms in coral skeletons of Porites sp.

3.2 Principal Coordinate Analysis contrasting high pCO2 and control sites

3.3 Microorganisms in coral skeletons of P. damicornis, S. hystrix and Porites sp.

3.4 Principal Coordinate Analysis contrasting coral hosts

4.1 Massive Porites lutea colonies in Western Australia and sampling design

4.2 Intracolony distance-decay relationships

4.3 Distance-decay relationships for the bacterial community across different scales

4.4 Distance-decay relationships for the algal community across different scales

5.1 Gene map of the Ostreobium quekettii chloroplast genome

5.2 Mauve alignment of green algae chloroplast genomes

5.3 Proportion of genes, introns and intergenic spacers in the chloroplast genomes

6.1 A general framework to study holobionts and hologenomes

6.2 Network of co-occurring bacterial and algal genera

CHAPTER 1 INTRODUCTION

Most creatures are associated with a diverse and abundant microbiota. The host organism and all its associated microorganisms, named the holobiont, respond together to diseases, environmental stress and selective forces (Rosenberg et al. 2007; Bourne et al. 2009; Egan & Gardiner 2016). Reef-building corals exemplify the functioning of a holobiont: besides their well-known endosymbiotic dinoflagellates performing photosynthesis (Symbiodinium), corals harbour other microorganisms capable of nitrogen fixation, antimicrobial production and other functions that potentially play a role in the fitness, health and resilience of the holobiont (Reshef et al. 2006; Rosenberg et al. 2007; Santos et al. 2014; Radecker et al. 2015; Bourne et al. 2016). This field of research is still in its infancy as the development of the sequencing technologies that allow detecting the majority of these microbes is relatively recent, and most microorganisms populating corals are still poorly characterised. Most studies to date have focused on the bacterial communities of coral living tissues, while microbes living in other microhabitats within corals, like their skeletons, have been largely overlooked. This study uses high-throughput sequencing to investigate the diversity and distribution of prokaryotic and eukaryotic microorganisms living inside coral skeletons and explores the evolution of the green algal members of this uncharted microbiome.

1.1 The state of the art in sequencing techniques

The study of microorganisms has been revolutionized in recent years by the development of high-throughput sequencing technologies (Shokralla et al. 2012). The per-base cost of sequencing has dropped drastically and billions of DNA sequences can now be generated in a single run. These technologies allow in-depth biodiversity assessments of microbial communities through the amplification of genetic markers (for example, the 16S rDNA) directly from environmental DNA, a technique known as metabarcoding (Taberlet et al. 2012). The vast majority of microorganisms are hard to culture and have never been isolated; this culture-independent approach has therefore led to the discovery of an enormous number of species with unknown ecological roles – the microbial dark matter (Rappé & Giovannoni 2003; Marcy et al. 2007).

Microbial biodiversity assessments, however, still miss a substantial fraction of the species diversity because they are commonly based on the 16S rDNA marker. Microbial eukaryotes are only occasionally included in these surveys by using a different marker (e.g. 18S rDNA). Universal primers amplify only a few eukaryotic algal species, and the plastid 16S and the nuclear 18S rDNA have poor resolution to distinguish closely related species. The development of a cost-effective technique to simultaneously sequence eukaryotic and prokaryotic microbes would be an asset for studying complex microbial communities where members of both domains play important ecological roles, as in coral skeletons.

1.2 Diversity and function of endolithic organisms

Organisms living inside hard substrates (generally limestone) are termed endoliths. They may actively penetrate the substrate (euendoliths), dwell in pores and cavities produced by other creatures (cryptoendoliths), or inhabit cracks in the rock (chasmoendoliths) (Golubic et al. 1981; Tribollet 2008a). For practical reasons, in this thesis they will be referred to simply as endolithic or boring organisms. In marine habitats these organisms are found in rocky substrates, shells, crustose coralline algae and inside the skeletons of live and dead reef-building corals (order Scleractinia) (Golubic 1969; Le Campion-Alsumard et al. 1995; Tribollet 2008a). The endolithic population in coral skeletons is composed of bacteria, archaea, fungi, sponges and algae, the latter often forming a conspicuous green band in the skeletons (Figure 1.1).

Figure 1.1. Coral fragments revealing a green band of endolithic algae.

Endolithic green algae are remarkably abundant in coral skeletons (Odum & Odum 1955; Golubic 1969) and play a key role in the calcium carbonate budget of reef ecosystems (Tribollet 2008b; Tribollet et al. 2011; Grange et al. 2015). These algae actively dissolve limestone substrates to produce the boreholes in which they live (Golubic et al. 1981; Tribollet et al. 2011), a process that can lead to the dissolution of up to 1 kg of reef carbonate per m² per year (Tribollet 2008b; Grange et al. 2015). Endolithic algae also attract grazers that additionally contribute to reef bioerosion (Chazottes et al. 1995; Clements et al. 2016).

Corals may benefit from harbouring endolithic algae during bleaching events. Coral bleaching is the whitening of corals due to the loss of their nutrient-providing symbionts (Symbiodinium) and can lead to coral starvation. Without Symbiodinium, more light reaches the skeleton and the endolithic algal biomass increases (Fine et al. 2006). The algae transfer part of their photosynthates to corals (Schlichter et al. 1995) and potentially extend the time corals can survive without Symbiodinium (Fine & Loya 2002). Excessive growth of endolithic algae, however, might have adverse effects such as weakening the skeleton and damaging the coral tissue (see Peters 1984; Fine et al. 2006).

The most common algae in coral skeletons are members of the siphonous genus Ostreobium. Ostreobium is the only genus in the Ostreobiaceae family and contains three described species, although there are some inconsistencies in the literature and evidence for cryptic diversity. A study on the rbcL gene diversity found seven Ostreobium genotypes, indicating that the species diversity of this genus is higher than its morphology suggests (Gutner-Hoch & Fine 2011). Cryptic diversity is not unexpected for these algae given the limited number of morphological characters that can be used to distinguish species, their microscopic size, inconspicuous lifestyle and broad distribution. Because different species may have different physiological traits, characterising this cryptic diversity is the first step towards understanding their ecological roles in reef ecosystems.

Fungi are also notable residents in coral skeletons. At least 12 genera of endolithic fungi have been identified (Golubic et al. 2005). They primarily feed on Ostreobium and coral polyps (Golubic et al. 2005) and are thought to play a role in nitrogen cycling (Wegley et al. 2007). A diverse population of bacteria (including cyanobacteria) is also found in coral skeletons, including taxa that have only been found in the skeleton and species that also occur in the coral tissue (Meron et al. 2012; Ainsworth et al. 2015). The relationships between endolithic algae, fungi and bacteria are thought to be in dynamic equilibrium in healthy holobionts, but when corals are under stress some members of this microbiome may bloom while others vanish, potentially harming the corals in ways that are still poorly understood (Golubic et al. 2005; Ainsworth et al. 2008).

1.3 Distribution and structure of endolithic communities

While little is known about the biodiversity of endolithic organisms, virtually nothing is known about their distribution. Microbes tend to have very broad distributions when compared to large organisms (Fenchel & Finlay 2004). This could be the case for the common endolithic alga Ostreobium: although it is mostly known from tropical reefs, this alga has also been reported in high-latitude areas such as Helgoland and Iceland (Kornmann & Sahling 1980; Gunnarsson & Nielsen 2016). It also occurs in shallow and deep (>200 m) waters (Odum & Odum 1955; Littler et al. 1985; Aponte & Ballantine 2001). Since a high cryptic diversity within the Ostreobium genus is expected, it is likely that different habitats are occupied by different Ostreobium lineages, as observed by Gutner-Hoch & Fine (2011) along a depth gradient. Likewise, it is possible that other endolithic microorganisms are not homogeneously distributed but are instead correlated with particular habitats, environmental conditions or geographic distances.

Characterising distribution patterns can provide insights into the processes underlying the generation and maintenance of biodiversity. Niche specialisation is a common deterministic process causing non-homogeneous species distributions (e.g. Wang et al. 2013). Neutral processes such as random colonization and extinction events and stochastic demographic changes in a population (i.e. ecological drift) also play a role in shaping species distributions (Hubbell 2001; Chase & Myers 2011). Adding to the complexity, the relative importance of deterministic and stochastic processes can vary with spatial scale (Martiny et al. 2011). One way to investigate distribution patterns and the underlying processes is to analyse how community similarity correlates with distance and environmental factors across different spatial scales. To achieve that, a sampling strategy that encompasses very small distances to global spatial scales is necessary.
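The idea of relating community similarity to distance can be illustrated with a minimal sketch. It is not the analysis pipeline used in this thesis: the function names, the choice of Bray-Curtis similarity, the simple least-squares slope and the toy abundance data are all assumptions made purely for illustration.

```python
import math
from itertools import combinations


def bray_curtis_similarity(a, b):
    """Return 1 minus the Bray-Curtis dissimilarity of two abundance vectors."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2.0 * shared / total if total else 0.0


def distance_decay_slope(coords, abundances):
    """Least-squares slope of pairwise similarity against geographic distance."""
    xs, ys = [], []
    for i, j in combinations(range(len(coords)), 2):
        xs.append(math.dist(coords[i], coords[j]))
        ys.append(bray_curtis_similarity(abundances[i], abundances[j]))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy data: three samples along a transect (positions in metres),
# each with counts for three taxa.
coords = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0)]
abundances = [
    [10, 5, 0],
    [8, 6, 1],
    [1, 2, 12],
]

slope = distance_decay_slope(coords, abundances)
print(f"distance-decay slope: {slope:.4f}")
```

A negative slope means that communities become less similar as the distance between samples grows. In practice the significance of such a relationship would be assessed with a permutation-based test, because pairwise comparisons are not independent observations.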

1.4 Evolutionary considerations

The hologenome theory of evolution postulates that the combined genome of the host and its microbiota – the hologenome – can act as a single unit of selection (Rosenberg et al. 2007). This theory was initially supported by evidence that corals are capable of quickly adapting to changing environmental conditions by altering their associated microbiota (Reshef et al. 2006). Support for this theory expanded from studies on corals to other organisms, including Drosophila and aphids (Gilbert et al. 2010; Rosenberg et al. 2010; Sharon et al. 2010). However, many argue that this theory is not solid and lacks scientific evidence (Leggat et al. 2007; Moran & Sloan 2015). In fact, too little is known about the biodiversity and evolution of microorganisms to understand their role in evolution.

Testing the hologenome theory of evolution is beyond the scope of this project, but studying the evolution of individual members of a holobiont helps to pave the way to investigate this theory in the future. This thesis highlights the evolution of limestone-boring algae, which are particularly interesting from an evolutionary perspective because they occupy a habitat considered extreme for eukaryotic algae.

Endolithic microorganisms are often found in extreme environments where they need special ecophysiological adaptations to survive, such as performing photosynthesis under extremely low light or enduring wide temperature ranges (e.g. Shashar & Stambler 1992; Behrendt et al. 2011; Sun 2013; Robinson et al. 2015; Omelon 2016). Endolithic microbes therefore constitute good model systems to study ecophysiological traits and their molecular basis (e.g. Fork & Larkum 1989; Behrendt et al. 2011; Qiu et al. 2013). Their unusual ability to survive in extreme environments attracts the attention of astrobiologists, as endoliths might provide clues about the origins of life on Earth and the likely habitats to find life on other planets (McLoughlin et al. 2007; Walker & Pace 2007; Omelon 2016). Boring is an ancient lifestyle: the fossil record suggests the existence of endolithic microorganisms as early as 3.4 billion years ago (Furnes et al. 2004; Wacey et al. 2006; Cockell & Herrera 2008). Several selective pressures have been proposed to explain why some microorganisms bore, including protection from UV and predators, nutrient acquisition, and avoidance of burial (entombment) by surface mineralization (Cockell & Herrera 2008).

The fossil record suggests that Ostreobium has had an endolithic lifestyle for about 500 million years (Vogel 1993; Vogel & Brett 2009). Its boring lifestyle is associated with a remarkable ability to perform photosynthesis under extremely low light conditions. Only a fraction of the photosynthetically active radiation reaches Ostreobium in its endolithic habitat: the first millimetre of limestone attenuates about 99% of the light (the same amount attenuated by the water column throughout the entire photic zone), depending on the composition and density of the limestone (Nienow et al. 1988; Matthes et al. 2001). Other organisms living on the surface of the limestone can attenuate light even further: the corals’ living tissue and their zooxanthellae, for example, absorb 95–99.9% of the incident light (Halldal 1968; Schlichter et al. 1997). Besides living under extremely low light in shallow reefs worldwide, Ostreobium also thrives in cave-dwelling corals and probably grows deeper than any other green alga (Littler et al. 1985; Dullo et al. 1995; Aponte & Ballantine 2001; Hoeksema 2012). Known physiological specialisations include a special chlorophyll antenna that allows Ostreobium to harvest far-red light and an uphill energy transfer from the chlorophylls to Photosystem II (Fork & Larkum 1989; Wilhelm & Jakob 2006; Magnusson et al. 2007). From an evolutionary perspective, it would be interesting to investigate the genomic basis of these specialisations and how many times green algal lineages have transitioned to this particular niche.
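To give a feel for how little light is left once these attenuation figures are combined, the short sketch below multiplies the transmitted fractions of successive layers. It assumes that the coral tissue and the first millimetre of limestone act in series and attenuate light independently, which is a simplification not stated in the cited studies; the function and constant names are hypothetical.

```python
def transmitted_fraction(*absorbed_fractions):
    """Fraction of incident light remaining after passing layers in series."""
    remaining = 1.0
    for absorbed in absorbed_fractions:
        remaining *= 1.0 - absorbed
    return remaining


# Figures quoted in the text above (treated here as independent layers).
LIMESTONE_FIRST_MM = 0.99               # about 99% attenuated by the first millimetre
TISSUE_LOW, TISSUE_HIGH = 0.95, 0.999   # 95-99.9% absorbed by tissue and zooxanthellae

brightest = transmitted_fraction(TISSUE_LOW, LIMESTONE_FIRST_MM)
dimmest = transmitted_fraction(TISSUE_HIGH, LIMESTONE_FIRST_MM)
print(f"Light reaching 1 mm depth: {dimmest:.6%} to {brightest:.4%} of incident light")
```

Under these assumptions only a few hundredths of a percent of the incident light, or less, would remain one millimetre below the tissue, which is consistent with the description of this habitat as an extreme low-light environment.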

1.5 Aims and scope

The multiple knowledge gaps about coral holobionts and their endolithic communities call for a multidisciplinary approach to understand their biodiversity, distribution and evolution. The first goal of this thesis is to set the baseline of the biodiversity of microorganisms living inside coral skeletons. To achieve that we first developed a cost-effective metabarcoding method and an analysis pipeline to assess both prokaryotic and eukaryotic microbes in the endolithic community. Then we report the endolithic diversity found across a variety of coral species and habitats. This is the second chapter of this thesis.

The second goal is to characterise aspects of the distribution and structure of endolithic communities in coral skeletons. Note that the focus is on the communities rather than on the distribution of individual endolithic lineages because a community can reveal emergent patterns that cannot be detected when analysing species in isolation (Konopka 2009; Tan 2016). To characterise distribution patterns and infer their underlying processes, we studied changes in community composition across different habitats and across different geographic scales. More specifically, Chapter 3 addresses the question of whether endolithic community structure relates to pH and/or coral host species, and also explores the diversity and potential function of the various bacteria found in coral skeletons. Chapter 4 tests whether endolithic communities are homogeneously distributed within coral colonies and assesses the relative importance of deterministic and stochastic processes in generating community heterogeneity from millimetre to global spatial scales.

The third goal of this thesis is to better understand evolutionary processes in endolithic algae. This thesis does not address co-evolution or the hologenome theory of evolution. The scope of this thesis is to investigate how many times endolithic lineages appeared in the phylogenetic history of green algae (within Chapter 2) and investigate the footprints of low light adaptation in the chloroplast genome of Ostreobium (Chapter 5).

While this thesis presents several novelties, it is only the first step towards answering the many questions that can be tested in the future. A summary of the advances, open questions and perspectives for future work is presented in Chapter 6.


1.6 References

Ainsworth T, Krause L, Bridge T, Torda G, Raina J-B, et al. (2015) The coral core microbiome identifies rare bacterial taxa as ubiquitous endosymbionts. ISME Journal 9, 2261-2274. Ainsworth TD, Fine M, Roff G, Hoegh-Guldberg O (2008) Bacteria are not the primary cause of bleaching in the Mediterranean coral Oculina patagonica. ISME Journal 2, 67-73. Aponte NE, Ballantine DL (2001) Depth distribution of algal species on the deep insular fore reef at Lee Stocking Island, Bahamas. Deep Sea Research Part I: Oceanographic Research Papers 48, 2185-2194. Behrendt L, Larkum AWD, Norman A, Qvortrup K, Chen M, et al. (2011) Endolithic chlorophyll d-containing phototrophs. ISME Journal 5, 1072-1076. Bourne DG, Garren M, Work TM, Rosenberg E, Smith GW, Harvell CD (2009) Microbial disease and the coral holobiont. Trends in Microbiology 17, 554-562. Bourne DG, Morrow KM, Webster NS (2016) Insights into the Coral Microbiome: Underpinning the Health and Resilience of Reef Ecosystems. Annual Review of Microbiology 70, 317-340. Chase JM, Myers JA (2011) Disentangling the importance of ecological niches from stochastic processes across scales. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 366, 2351-2363. Chazottes V, Le Campion-Alsumard T, Peyrot-Clausade M (1995) Bioerosion rates on coral reefs: interactions between macroborers, microborers and grazers (Moorea, French Polynesia). Clements KD, German DP, Piché J, Tribollet A, Choat JH (2016) Integrating ecological roles and trophic diversification on coral reefs: multiple lines of evidence identify parrotfishes as microphages. Biological Journal of the Linnean Society. Cockell CS, Herrera A (2008) Why are some microorganisms boring? Trends in Microbiology 16, 101-106. Dullo W-C, Gektidis M, Golubic S, Heiss GA, Kampmann H, et al. (1995) Factors controlling holocene reef growth: An interdisciplinary approach. Facies 32, 145-188. 
Egan S, Gardiner M (2016) Microbial dysbiosis: rethinking disease in marine ecosystems. Frontiers in Microbiology 7, 991. Fenchel TOM, Finlay BJ (2004) The ubiquity of small species: patterns of local and global diversity. Bioscience 54, 777. Fine M, Loya Y (2002) Endolithic algae: an alternative source of photoassimilates during coral bleaching. Proceedings of the Royal Society B 269, 1205-1210. Fine M, Roff G, Ainsworth TD, Hoegh-Guldberg O (2006) Phototrophic microendoliths bloom during coral “white syndrome”. Coral Reefs 25, 577-581. Fork DC, Larkum AWD (1989) Light harvesting in the green alga Ostreobium sp., a coral symbiont adapted to extreme shade. Marine Biology 103, 381-385.


Furnes H, Banerjee NR, Muehlenbachs K, Staudigel H, de Wit M (2004) Early life recorded in archean pillow lavas. Science 304, 578-581. Gilbert SF, McDonald E, Boyle N, Buttino N, Gyi L, et al. (2010) Symbiosis as a source of selectable epigenetic variation: taking the heat for the big guy. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 365, 671-678. Golubic S (1969) Distribution, taxonomy, and boring patterns of marine endolithic algae. American Zoologist 9, 747-751. Golubic S, Friedmann I, Schneider J (1981) The lithobiontic ecological niche, with special reference to microorganisms. Journal of Sedimentary Petrology 51, 475-478. Golubic S, Radtke G, Le Campion-Alsumard T (2005) Endolithic fungi in marine ecosystems. Trends in Microbiology 13, 229-235. Grange JS, Rybarczyk H, Tribollet A (2015) The three steps of the carbonate biogenic dissolution process by microborers in coral reefs (New Caledonia). Environmental Science and Pollution Research 22, 13625-13637. Gunnarsson K, Nielsen R (2016) Culture and field studies of Ulvellaceae and other microfilamentous green seaweeds in subarctic and arctic waters around Iceland. Nova Hedwigia 103, 17-46. Gutner-Hoch E, Fine M (2011) Genotypic diversity and distribution of Ostreobium quekettii within scleractinian corals. Coral Reefs 30, 643-650. Halldal P (1968) Photosynthetic capacities and photosynthetic action spectra of endozoic algae of the massive coral Favia. The Biological Bulletin 134, 411-424. Hoeksema BW (2012) Forever in the dark: the cave-dwelling azooxanthellate reef coral Leptoseris troglodyta sp. n. (Scleractinia, Agariciidae). Zookeys, 21-37. Hubbell S (2001) The unified neutral theory of species abundance and diversity. Princeton University Press, Princeton, NJ. Hubbell SP (2004) Quarterly Review of Biology 79, 96-97. Konopka A (2009) What is microbial community ecology? ISME Journal 3, 1223-1230.
Kornmann P, Sahling P-H (1980) Ostreobium quekettii (Codiales, Chlorophyta). Helgoländer Meeresuntersuchungen 34, 115-122. Le Campion-Alsumard T, Golubic S, Hutchings P (1995) Microbial endoliths in skeletons of live and dead corals: Porites lobata (Moorea, French Polynesia). Marine Ecology Progress Series 117, 149-157. Leggat W, Ainsworth T, Bythell J (2007) The hologenome theory disregards the coral holobiont. Nature Reviews Microbiology 59, 2007. Littler MM, Littler DS, Blair SM, Norris JN (1985) Deepest known life discovered on an uncharted seamount. Science 227, 57-59. Magnusson S, Fine M, Kühl M (2007) Light microclimate of endolithic phototrophs in the scleractinian corals Montipora monasteriata and Porites cylindrica. Marine Ecology Progress Series 332, 119-128.


Marcy Y, Ouverney C, Bik EM, Losekann T, Ivanova N, et al. (2007) Dissecting biological "dark matter" with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proceedings of the National Academy of Sciences, USA 104, 11889-11894. Martiny JB, Eisen JA, Penn K, Allison SD, Horner-Devine MC (2011) Drivers of bacterial beta-diversity depend on spatial scale. Proceedings of the National Academy of Sciences, USA 108, 7850-7854. Matthes U, Turner SJ, Larson DW (2001) Light attenuation by limestone rock and its constraint on the depth distribution of endolithic algae and cyanobacteria. International Journal of Plant Sciences 162, 263-270. McLoughlin N, Brasier MD, Wacey D, Green OR, Perry RS (2007) On biogenicity criteria for endolithic microborings on early Earth and beyond. Astrobiology 7, 10-26. Meron D, Rodolfo-Metalpa R, Cunning R, Baker AC, Fine M, Banin E (2012) Changes in coral microbial communities in response to a natural pH gradient. ISME Journal 6, 1775-1785. Moran NA, Sloan DB (2015) The Hologenome Concept: Helpful or Hollow? PLoS Biology 13, e1002311. Nienow JA, McKay CP, Friedmann EI (1988) The cryptoendolithic microbial environment in the Ross Desert of Antarctica: light in the photosynthetically active region. Microbial Ecology 16, 271-289. Odum HT, Odum EP (1955) Trophic structure and productivity of a windward coral reef community on Eniwetok Atoll. Ecological Monographs 25, 291-320. Omelon CR (2016) Endolithic microorganisms and their habitats. 1, 171-201. Peters EC (1984) A survey of cellular reactions to environmental stress and disease in Caribbean scleractinian corals. Helgoländer Meeresuntersuchungen 37, 113-137. Qiu H, Price DC, Weber AP, Reeb V, Yang EC, et al. (2013) Adaptation through horizontal gene transfer in the cryptoendolithic red alga Galdieria phlegrea. Current Biology 23, R865-866.
Radecker N, Pogoreutz C, Voolstra CR, Wiedenmann J, Wild C (2015) Nitrogen cycling in corals: the key to understanding holobiont functioning? Trends in Microbiology 23, 490-497. Rappé MS, Giovannoni SJ (2003) The uncultured microbial majority. Annual Review of Microbiology 57, 369-394. Reshef L, Koren O, Loya Y, Zilber-Rosenberg I, Rosenberg E (2006) The coral probiotic hypothesis. Environmental Microbiology 8, 2068-2073. Robinson CK, Wierzchos J, Black C, Crits-Christoph A, Ma B, et al. (2015) Microbial diversity and the presence of algae in halite endolithic communities are correlated to atmospheric moisture in the hyper-arid zone of the Atacama Desert. Environmental Microbiology 17, 299-315.


Rosenberg E, Koren O, Reshef L, Efrony R, Zilber-Rosenberg I (2007) The role of microorganisms in coral health, disease and evolution. Nature Reviews: Microbiology 5, 355-362. Rosenberg E, Sharon G, Atad I, Zilber-Rosenberg I (2010) The evolution of animals and plants via symbiosis with microorganisms. In: Environmental Microbiology Reports, pp. 500-506. Santos HF, Carmo FL, Duarte G, Dini-Andreote F, Castro CB, et al. (2014) Climate change affects key nitrogen-fixing bacterial populations on coral reefs. ISME Journal 8, 2272-2279. Schlichter D, Kampmann H, Conrady S (1997) Trophic potential and photoecology of endolithic algae living within coral skeletons. Marine Ecology 18, 299-317. Schlichter D, Zscharnack B, Krisch H (1995) Transfer of photoassimilates from endolithic algae to coral tissue. Naturwissenschaften 82, 561-564. Sharon G, Segal D, Ringo JM, Hefetz A, Zilber-Rosenberg I, Rosenberg E (2010) Commensal bacteria play a role in mating preference of Drosophila melanogaster. Proceedings of the National Academy of Sciences, USA 107, 20051-20056. Shashar N, Stambler N (1992) Endolithic algae within corals - life in an extreme environment. Journal of Experimental Marine Biology and Ecology 163, 277-286. Shokralla S, Spall JL, Gibson JF, Hajibabaei M (2012) Next-generation sequencing technologies for environmental DNA research. Molecular Ecology 21, 1794-1805. Sun HJ (2013) Endolithic microbial life in extreme cold climate: snow is required, but perhaps less is more. Biology (Basel) 2, 693-701. Taberlet P, Coissac E, Pompanon F, Brochmann C, Willerslev E (2012) Towards next-generation biodiversity assessment using DNA metabarcoding. Molecular Ecology 21, 2045-2050. Tan (2016) All Together Now: Experimental Multispecies Biofilm Model Systems. Environmental Microbiology doi: 10.1111/1462-2920.13594. [Epub ahead of print] Tribollet A (2008a) The boring microflora in modern coral reef ecosystems: a review of its roles. In: Current developments in bioerosion, pp. 67-94. Springer. Tribollet A (2008b) Dissolution of dead corals by euendolithic microorganisms across the northern Great Barrier Reef (Australia). Microbial Ecology 55, 569-580. Tribollet A, Radtke G, Golubic S (2011) Bioerosion. In: Encyclopedia of Geobiology (eds. Reitner J, Thiel V), pp. 117-134. Springer Netherlands. Vogel K (1993) Bioeroders in fossil reefs. Facies 28, 109-113. Vogel K, Brett CE (2009) Record of microendoliths in different facies of the Upper Ordovician in the Cincinnati Arch region USA: The early history of light-related microendolithic zonation. Palaeogeography, Palaeoclimatology, Palaeoecology 281, 1-24. Wacey D, McLoughlin N, Green OR, Parnell J, Stoakes CA, Brasier MD (2006) The ~3.4 billion-year-old Strelley Pool Sandstone: a new window into early life on Earth. International Journal of Astrobiology 5, 333.


Walker JJ, Pace NR (2007) Endolithic microbial ecosystems. Annual Review of Microbiology 61, 331-347. Wang J, Shen J, Wu Y, Tu C, Soininen J, et al. (2013) Phylogenetic beta diversity in bacterial assemblages across ecosystems: deterministic versus stochastic processes. ISME Journal 7, 1310-1321. Wegley L, Edwards R, Rodriguez-Brito B, Liu H, Rohwer F (2007) Metagenomic analysis of the microbial community associated with the coral Porites astreoides. Environmental Microbiology 9, 2707-2719. Wilhelm C, Jakob T (2006) Uphill energy transfer from long-wavelength absorbing chlorophylls to PS II in Ostreobium sp. is functional in carbon assimilation. Photosynthesis Research 87, 323-329.


CHAPTER 2 MULTI-MARKER METABARCODING OF CORAL SKELETONS REVEALS A RICH MICROBIOME AND DIVERSE EVOLUTIONARY ORIGINS OF ENDOLITHIC ALGAE


www.nature.com/scientificreports

Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae

Vanessa Rossetto Marcelino & Heroen Verbruggen

Received: 30 March 2016. Accepted: 21 July 2016. Published: 22 August 2016.

Bacteria, fungi and green algae are common inhabitants of coral skeletons. Their diversity is poorly characterized because they are difficult to identify with microscopy or environmental sequencing, as common metabarcoding markers have low phylogenetic resolution and miss a large portion of the biodiversity. We used a cost-effective protocol and a combination of markers (tufA, 16S rDNA, 18S rDNA and 23S rDNA) to characterize the microbiome of 132 coral skeleton samples. We identified a wide range of prokaryotic and eukaryotic organisms, many never reported in corals before. We additionally investigated the phylogenetic diversity of the green algae, the most abundant eukaryotic members of this community, for which previous literature recognizes only a handful of endolithic species. We found more than 120 taxonomic units (near species level), including six family-level lineages mostly new to science. The results suggest that the existence of lineages with an endolithic lifestyle predates the existence of modern scleractinian corals by ca. 250 million years, and that this particular niche was independently invaded by over 20 lineages in green algal evolution. These results highlight the potential of the multi-marker approach to assist in species discovery and, when combined with a phylogenetic framework, clarify the evolutionary origins of host-microbiota associations.

Corals harbour a diverse microbial community that is vital for their health and resilience [1,2]. The skeletons of stony corals are populated by endolithic (limestone-boring) bacteria, fungi and a conspicuous layer of green algae [3,4]. These organisms are protected from the external environment but endure very low levels of light and extreme daily fluctuations of pH and oxygen levels [5]. The endolithic habitat therefore contains a specialized microbial community, and very little is known about when or how many times the association between corals and these organisms has evolved.

Endolithic algae constitute a major component of the endolithic microbiome in terms of abundance and ecological roles. They are the principal microbial agent of reef erosion [6,7] and increase coral decalcification under elevated acidity and temperature [8]. These algae also protect corals from high-light stress [9] and provide them with an alternative source of energy during bleaching events [10]. The balance of benefits and drawbacks that these algae convey to corals is unclear and likely depends on the interplay between different algal lineages and other microorganisms in the coral holobiont. However, these organisms are mostly uncharacterized.

Ostreobium (Ulvophyceae, Chlorophyta) is considered to be the most abundant endolithic algal genus in marine habitats and has three described species (although there are some inconsistencies in the literature). It is a siphonous alga, meaning that its whole body consists of a single, branched, multinucleate cell [11]. This simple architecture puts strong limits on the number of morphological characters available to distinguish different species [12]. A pilot study on rbcL gene diversity in Ostreobium found seven genotypes, indicating that the number of species is higher than the taxonomic literature suggests [13]. Several other micro-eukaryotic groups in coral skeletons (e.g. fungi) possibly also have uncharted cryptic diversity, demanding approaches to study their biodiversity that do not rely on morphological identification.

School of Biosciences, University of Melbourne, VIC 3010, Australia. Correspondence and requests for materials should be addressed to V.R.M.

Scientific Reports | 6:31508 | DOI: 10.1038/srep31508 www.nature.com/scientificreports/ Chapter 2 16

Metabarcoding allows for in-depth microbial composition assessments directly from environmental samples [14]. This approach has led to the discovery of an enormous number of microorganisms never isolated or cultured before. Those organisms, coined microbial dark matter, account for the majority of microbial diversity [15,16]. There is an even darker category of organismal matter, though: organisms that are undetectable with commonly used metabarcoding methods. The fraction of the biodiversity captured by metabarcoding surveys depends on the markers and primers used, so organisms that are not amplified with the standard methods go undetected even if they are common and play important roles in the ecosystem.

Endolithic algae illustrate how common and important organisms can be virtually ignored in metabarcoding surveys that use a single standard marker. Although the coral microbiome is relatively well studied, researchers using the 16S rDNA are generally interested in the bacterial community and often discard chloroplast reads (e.g. [17]). Eukaryotic surveys based on 18S rDNA possibly underestimate algal diversity because they can be biased towards heterotrophs [18]. The use of group-specific markers with higher phylogenetic resolution improves the recognition of closely related organisms (e.g. cryptic species) and allows phylogeny-based evolutionary inferences. Comprehensively surveying all microbial diversity would require sequencing both universal and high-resolution markers. Such approaches would capture an eclectic range of co-occurring microorganisms while also providing a deeper understanding of particular taxa of interest.

Multi-marker strategies have not been extensively used for two main reasons. First, library preparation becomes expensive for multiple markers and there are no automated protocols available to study the less commonly used markers. Second, non-standard markers have relatively poor reference datasets compared to what is available for 16S and 18S rDNAs, hence the classification of the retrieved sequences by conventional methods (e.g. RDP classifier [19] or BLAST [20]) is problematic [21]. In such cases, operational taxonomic units (OTUs) are better classified using a phylogenetic framework, which has the added advantage of providing a historical evolutionary perspective.

We investigated the diversity of the prokaryotic and eukaryotic microbiome in coral skeletons using a cost-effective multi-marker metabarcoding protocol and evaluated the benefits of different markers. Because of their abundance and importance, we focused on the biodiversity and evolution of the green algae in the endolithic community. Using phylogenetic methods, we inferred when and how many times the association with a coral-endolithic habitat emerged in the evolutionary history of green algae.

2.2 Results

Cost-effective biodiversity assessment. We sequenced 132 coral skeleton samples collected in Australia and Papua New Guinea from a wide variety of habitats and coral genera (Supplementary Materials). In order to obtain information from eukaryotic and prokaryotic members of the microbiome, we used four metabarcoding markers: the 16S rDNA [22], the 18S rDNA [23], a fragment of the 23S rDNA that targets algal chloroplasts [24] and a fragment of the elongation factor Tu (tufA) gene, a DNA barcode recommended and commonly used for green algae due to its ability to distinguish between closely related species [25]. We used a cost-effective multi-marker metabarcoding approach with a two-step PCR protocol to amplify the markers and prepare the Illumina library. Replacing a commonly used kit (Nextera Index kit) with custom-made oligos reduced the indexing costs more than 60-fold (AUD $3.28 vs. $0.05 per sample, Supplementary Materials). When compared to the Earth Microbiome Project protocol [26], this approach has the advantage of requiring only 20 indexing oligos (Fig. 1A), plus first PCR primers, instead of 384 (plus 4 reverse primers) for 96 samples and 4 markers. Using home-made magnetic beads as described in Rohland & Reich [27] drastically reduced the cost of cleaning PCR products. We obtained, on average, 161,364 sequences per sample (comprising 4 amplicons each). Following removal of low quality sequences, 14,131,986 sequences were retained, of which 2,603,384 were in the 16S rDNA dataset, 4,102,867 in the 18S rDNA, 4,607,505 in the 23S rDNA and 2,818,230 in the tufA dataset. These sequences were deposited in NCBI’s Sequence Read Archive (SRA) under the accession ID SRP073961.
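The cost and read-count figures reported in this section can be cross-checked with a short Python sketch. All numbers are taken from the text; two points are assumptions made only for illustration and are flagged in the comments: the split of the 20 indexing oligos into 8 forward and 12 reverse indices (one combination that labels 96 samples), and the reading of the 161,364 mean as a per-sample count before quality filtering.

```python
# Figures quoted in the text; variable names are illustrative only.
KIT_COST_AUD = 3.28      # Nextera Index kit, per sample
CUSTOM_COST_AUD = 0.05   # custom-made oligos, per sample
fold_saving = KIT_COST_AUD / CUSTOM_COST_AUD

# Combinatorial dual indexing: every sample receives one forward/reverse pair.
# ASSUMPTION: the 20 oligos are split into 8 forward and 12 reverse indices.
n_forward, n_reverse = 8, 12
oligos_needed = n_forward + n_reverse
samples_labelled = n_forward * n_reverse

# One barcoded primer per sample and per marker (96 samples, 4 markers).
barcoded_primers = 96 * 4

retained_reads = {
    "16S rDNA": 2_603_384,
    "18S rDNA": 4_102_867,
    "23S rDNA": 4_607_505,
    "tufA": 2_818_230,
}
total_retained = sum(retained_reads.values())

# ASSUMPTION: the reported mean refers to reads per sample before filtering.
raw_reads_estimate = 132 * 161_364
retained_fraction = total_retained / raw_reads_estimate

print(f"indexing cost ratio: {fold_saving:.1f}x")
print(f"{oligos_needed} oligos label {samples_labelled} samples "
      f"(versus {barcoded_primers} barcoded primers)")
print(f"retained reads: {total_retained:,} ({retained_fraction:.1%} of estimate)")
```

The per-marker counts sum to the reported total of 14,131,986 retained sequences, and under the stated assumption roughly two thirds of the raw reads survive quality filtering.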

Multi-marker microbiome characterization. To analyse the overall microbiome diversity, we assessed all OTUs classified with the RDP classifier as implemented in QIIME [19,28]. Our results showed that the widely used 16S and 18S rDNA markers drastically underestimate green algal diversity. In total, 3,680 OTUs were found in the 16S rDNA survey, 406 in the 18S rDNA, 659 in the 23S rDNA and 2,274 in the tufA dataset. Green algal reads were the most abundant in all but the 18S rDNA dataset (Fig. 2). Note that the living coral tissue was removed prior to DNA isolation; the high relative abundance of algal reads compared with bacterial reads therefore reflects the densely algae-populated skeletons. Besides Ostreobium, several other photosynthetic lineages were found, including other green algae, red algae, brown algae and cyanobacteria, most of which are organisms or lineages not previously known to occur in coral skeletons.

Although green algae accounted for 55.7% of the 16S sequences, only 30 out of 3,680 OTUs were assigned to this group, and only 3 OTUs were classified beyond the rank of class (Ulvophyceae): 1 OTU with the closest match to Bryopsis hypnoides (confidence score = 0.6) and 2 OTUs classified as Chlorodesmis fastigiata (confidence score > 0.7). Red algae and cyanobacteria composed 0.8% and 0.3% of 16S rDNA reads, respectively. The alphaproteobacterial order Rhizobiales, thought to occur exclusively within the coral tissue [29], was found in considerable abundance (2.3%) among our reads (see also [30]), although the possibility of contamination from coral tissue cannot be entirely ruled out. Green sulphur bacteria (phylum Chlorobi) composed 0.1% of the reads in our 16S rDNA dataset.

The 18S rDNA metabarcode captured mostly endolithic sponges (42.3%), but also occasional nematodes, arthropods, annelids and fungi (Fig. 2). Sixteen OTUs (1.9% relative abundance) were assigned to Labyrinthula (confidence score > 0.7), a heterokont genus known to infect algae [31]. Green algae composed 8.2% of the sequence reads, comprising 25 OTUs. Seven OTUs were classified to genus rank (confidence score > 0.7): 4 Cladophora, 1 Pseudulvella and 2 Phaeophila. None of the reads corresponded to Ostreobium despite it clearly being abundant in the samples. Brown algae composed 0.5% of the reads, with 11 OTUs, mostly Ectocarpales.

In the 23S rDNA dataset, the vast majority of reads were assigned to green algae (91%; 84 OTUs), red algae (4%) and bacteria (3.5%). Cyanobacteria were present in low relative abundance but were diverse. Of the 92 cyanobacterial OTUs, 5 matched Acaryochloris marina with high confidence scores (0.87–1.00).

Figure 1. The multi-marker metabarcoding approach. The library preparation (A) consists of a 2-step PCR amplification: the first PCR amplifies the target markers and the second PCR adds the indices and the Illumina adapters (T = tail, P = amplicon-specific primer, P5 = Illumina adapter, FI = forward index, A = amplicon). The amplicons are then purified with magnetic beads, quantified and pooled together to be sequenced on an Illumina MiSeq platform. The sequence reads of the 4 markers are teased apart in the analysis pipeline (B) based on primer sequences, and go through a series of quality control steps (including pipelines available in QIIME [28]), OTU clustering (using UPARSE [46]), alignment and classification.

Figure 2. Pie charts indicating the relative abundance of sequence reads matching the main taxa assigned with the RDP classifier. Note that the relative abundances do not always reflect diversity, as indicated by the total and green algal number of Operational Taxonomic Units retrieved with each marker.
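The step in the Figure 1 pipeline where pooled reads of the four markers are teased apart by their primer sequences could be sketched in Python as follows. This is a minimal illustration and not the pipeline used in the study: the primer strings are made-up placeholders, the function name `split_by_primer` is hypothetical, and matching is by exact prefix, whereas real primers contain degenerate bases and a practical tool would tolerate mismatches.

```python
from collections import defaultdict

# Hypothetical placeholder primers, NOT the primers used in the study.
FORWARD_PRIMERS = {
    "16S": "ACGTAC",
    "18S": "TTGGCA",
    "23S": "GGATCC",
    "tufA": "CATGAA",
}


def split_by_primer(reads, primers=FORWARD_PRIMERS):
    """Group reads by the marker whose forward primer starts the read.

    The matched primer is trimmed off; reads that match no primer
    are collected under the key "unassigned".
    """
    bins = defaultdict(list)
    for read in reads:
        for marker, primer in primers.items():
            if read.startswith(primer):
                bins[marker].append(read[len(primer):])
                break
        else:  # no primer matched this read
            bins["unassigned"].append(read)
    return dict(bins)


reads = ["ACGTACGGGTTT", "CATGAAAAACCC", "NNNNNNNNNNNN", "GGATCCTTTAAA"]
bins = split_by_primer(reads)
for marker, sequences in bins.items():
    print(marker, sequences)
```

Sorting reads before quality control and OTU clustering keeps each marker's dataset separate, so marker-specific reference alignments can be applied downstream.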


The tufA reads were composed of green algae (51.3%), bacteria (48.4%) and a small fraction of red algae and heterokonts (Fig. 2). This marker retrieved the highest number of green algal OTUs (128), of which 53 were classified as Ostreobium. Three other algal genera were found with high confidence scores (0.85–1.00): 2 OTUs in Halimeda, 2 Phaeophila and 1 . Other OTUs were only classified at higher taxonomic ranks or with lower confidence scores.
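The contrast between read abundance and OTU richness reported for the four markers can be made explicit with a small Python sketch. The read percentages and OTU counts are the RDP-based values given in the text; the dictionary layout and the helper `otu_share` are introduced here only for illustration.

```python
# marker: (green algal share of reads in %, green algal OTUs, total OTUs)
reported = {
    "16S rDNA": (55.7, 30, 3680),
    "18S rDNA": (8.2, 25, 406),
    "23S rDNA": (91.0, 84, 659),
    "tufA": (51.3, 128, 2274),
}


def otu_share(green_otus, total_otus):
    """Percentage of a marker's OTUs that were assigned to green algae."""
    return 100.0 * green_otus / total_otus


summary = {
    marker: (read_pct, round(otu_share(green, total), 1))
    for marker, (read_pct, green, total) in reported.items()
}

for marker, (read_pct, otu_pct) in summary.items():
    print(f"{marker}: {read_pct}% of reads, {otu_pct}% of OTUs")
```

For the 16S rDNA, for example, green algae make up more than half of the reads but under 1% of the OTUs, which is the pattern summarised in the Figure 2 caption: relative abundance does not track diversity.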

Phylogenetic diversity and evolution of endolithic green algae. We studied the diversity of green algae in more detail by building phylogenetic trees from the retrieved eukaryotic OTUs (as classified by RDP) and available reference sequences (Fig. 3 and Supplementary Figs S1–S4). The phylogenetic framework identified more green algal OTUs than the RDP classifier did. In the 16S data, 36 OTUs were green algae (versus 30 classified with RDP), 21 of which were in the Ostreobium clade (Supplementary Fig. S2). The phylogeny of the 18S OTUs confirmed the absence of Ostreobium reads in this dataset, and the presence of 5 OTUs in the genus Cladophora (Supplementary Fig. S3). The 23S rDNA dataset revealed 79 OTUs within core Chlorophyta, of which 61 were in the Ostreobium clade (Supplementary Fig. S4).

The tufA gene has a better phylogenetic resolution than other markers [25] and allowed us to perform a detailed analysis of phylogenetic diversity and evolution of the green algal OTUs (Fig. 3, Supplementary Fig. S1). We excluded 7 green algal tufA OTUs that did not fall within the core Chlorophyta clade. Of the 121 remaining OTUs, one belonged to the and 120 were Ulvophyceae. Endolithic OTUs were found in 11 families in Ulvophyceae (plus Cladophoraceae in the 18S rDNA dataset); some OTUs were distantly related to known algae while others were very similar or identical to known seaweeds never reported in coral skeletons (Fig. 3, bootstrap values in Supplementary Fig. S1). Together with the two previously published Ostreobium tufA sequences, 82 OTUs formed a well-supported, early-branching clade that was further split into 4 well-supported subclades. We discovered another two endolithic clades: one including Pseudochlorodesmis (4 OTUs) and a second one sister to the family Rhipiliaceae (12 OTUs). The ancestral state reconstruction of the coral-endolithic nature in green algae indicated that this trait evolved more than 20 times independently (Fig. 3). The time-calibrated phylogeny also allowed comparing the age of families across different parts of the tree and extrapolating this age to the newly discovered lineages. It suggested that the endolithic algae found in this study represent at least six family-level lineages (four subclades of the Ostreobium clade and endolithic clades #1 and #2).

2.3 Discussion

Multi-marker view of coral skeleton microbiome. This study highlights how multi-marker approaches can enrich biodiversity surveys. Our results show that the commonly used 16S and 18S rDNA markers severely underestimate algal diversity, and that no metabarcode, in isolation, is sufficient to characterize complex microbiomes. The multi-marker approach increases the range of microbial taxa recovered from the samples and yields substantial savings when compared to traditional methods (Fig. 1).

The multi-marker method allows combining the qualities of each marker for more comprehensive biodiversity surveys (e.g. [32]). The 16S rDNA, for example, retrieved the highest number of OTUs and is convenient for cross-comparability with the vast number of studies focused on bacterial communities, but it underestimates algal diversity. Commonly used universal primers for highly conserved rRNA genes (16S and 18S) capture a wide range of microbial taxa at the expense of losing power to detect closely related species. The tufA marker has a higher rate of evolution, especially at third codon positions, and yields many more green algal OTUs as well as a better-supported phylogeny (Supplementary Figs S1–S4). Further, some organisms are difficult or impossible to amplify with standard primer pairs, presumably due to substitutions at primer binding sites, but can be detected using the multi-marker approach. For instance, the 18S rDNA marker did not retrieve any OTU of the siphonous green algae (Bryopsidales), while 112 were obtained with tufA. Nevertheless, the 18S rDNA was the only marker yielding OTUs of another order of green algae (Cladophorales; Supplementary Fig. S3). Like dinoflagellates, this green algal order possesses an unusual plastid configuration that prevents amplification with standard plastid markers [33]. We found 5 OTUs in the Cladophorales and this is the first record of their occurrence in coral skeletons.

Corals harbour a particular microbiome in their skeletons [29,30]. Alphaproteobacteria and Gammaproteobacteria were the predominant prokaryotic members, in agreement with some metabarcoding studies of coral skeletons [30,34]. A recent study indicates that green sulphur bacteria are prevalent in skeletons of Isopora palifera [35]. We found that these bacteria compose only a small fraction of the prokaryotic community in the corals analysed here (in agreement with one other study targeting endolithic bacteria [30]). We also found a diverse community of cyanobacteria, which was best characterized with the 23S rDNA marker. To our knowledge, this is the first record of Acaryochloris marina in skeletons of living corals. This cyanobacterium produces chlorophyll d and is known to occupy niches depleted of visible light [36]. Many other cyanobacterial OTUs could not be classified at lower taxonomic ranks with the RDP classifier, but might reveal interesting groups specialized in the endolithic niche when analysed in a phylogenetic context.

Highly diverse endolithic green algae. We found that the genus Ostreobium, previously thought to be composed of only three species, is a 500-million-year-old complex comprising more than 80 taxonomic units at the near-species level (Fig. 3). The lineage is divided into four well-supported subclades with divergence times comparable to the family level in the seaweed lineages of the siphonous green algae (Fig. 3). A recent study in which reef rubble and coralline algae were sequenced also revealed a large Ostreobium diversity37. Our results also revealed a large number of green algae outside the Ostreobidineae clade that were not known to occur in coral skeletons. One of the lineages (endolithic clade #2 in Fig. 3) is exclusively composed of endolithic algae never described before and constitutes a new family, nested among larger-bodied seaweed lineages. Endolithic clade #1 is related to Pseudochlorodesmis, which forms small turfs growing out of hard substrata, and it is known for having a problematic classification and high cryptic diversity38. Our results suggest that this lineage consists of primarily endolithic algae that only occasionally grow out of their rock but retain most of their biomass inside of it. Likewise, the two OTUs matching the macroalgal species Halimeda discoidea and H. micronesica, both of which were present in the areas where we collected, suggest that Halimeda species have an endolithic life stage. Although contamination cannot be ruled out entirely (e.g. from gametes in the seawater), the high abundance of reads and their presence in several samples, even after rigorous quality control, indicate that this is unlikely. Endolithic “Conchocelis” stages have been described for red algal seaweeds39 but never for Halimeda. The life cycle of Halimeda has never been completed in culture40, perhaps because of unknown life stages such as these. Halimeda and many of the green algal endolithic lineages found here have also been sequenced from limestone substrates in a recently published study37.

Figure 3. Maximum Likelihood tree of green algae (4833 bp alignment) including the OTUs retrieved with the tufA metabarcode. The green algal families where OTUs were found are indicated in square brackets. The OTUs composing the endolithic clades were given an identifier for reference in future biodiversity screenings using the tufA marker. The ancestral state reconstruction of the endolithic nature is plotted along the tree and indicates the probability of the ancestral lineage being coral-endolithic (black) or not (grey).

We also retrieved OTUs related to three algal species that are known to bore into limestone: two OTUs are closely related to Phaeophila dendroides, two to Ochlochaete hystrix and two are related to Ulvella spp. Phaeophila is known to bore into coral skeletons, although to a lesser extent than Ostreobium41. Ulvella species have been reported as endophytic in other algae and as a pathogenic species in the skeleton of gorgonian corals42,43. Ochlochaete hystrix was found growing in shells39. This is the first time that Ulvella and Ochlochaete are reported in live stony coral skeletons.

When observing unexpected results such as the massive biodiversity of Ostreobium species and the presence of macroalgal species in an endolithic environment, it is desirable to address potential sequencing artifacts and contamination issues. Potential sources of contamination include the living coral tissue and the surrounding water. We have taken precautions to limit these sources of contamination in the field and during data processing. A good indication that these OTUs do not result from spurious contamination is that they are relatively abundant in multiple samples. Methodological artifacts include tag jumping and chimera formation44,45, and we have taken several precautions to avoid an overestimation of biodiversity due to these potential problems: 1) we apply a conservative similarity threshold to cluster OTUs; 2) we use a conservative de novo OTU clustering method (UPARSE) that is efficient in filtering chimeras without a reference database46; 3) our pipeline only keeps OTUs if they exceed 5 reads across the entire dataset; 4) our pipeline only keeps OTUs in individual samples if they exceed 2 reads in that sample; 5) we use a variety of controls including 10 mock extractions and 6 PCR negative controls. Due to these precautions, we may underestimate the endolithic microbial diversity, but we would rather err on the side of caution.
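
The read-count filters in points 3 and 4, together with the removal of OTUs detected in control libraries, can be illustrated with a small sketch. It is not the authors' pipeline, which relies on UPARSE and QIIME scripts. The function name, the nested-dictionary table layout and the order in which the filters are applied are assumptions made for illustration; the thresholds follow the wording above (more than 2 reads per sample, more than 5 reads overall).

```python
from typing import Dict, Set

# OTU table as nested dicts: OTU id -> sample id -> read count (assumed layout).
OtuTable = Dict[str, Dict[str, int]]


def filter_otu_table(
    table: OtuTable,
    control_samples: Set[str],
    min_sample_reads: int = 2,  # a per-sample count is kept only if it exceeds this
    min_total_reads: int = 5,   # an OTU is kept only if its remaining reads exceed this
) -> OtuTable:
    """Apply control, per-sample and dataset-wide read-count filters."""
    filtered: OtuTable = {}
    for otu, counts in table.items():
        # Discard OTUs that appear in any mock extraction or PCR negative control.
        if any(counts.get(sample, 0) > 0 for sample in control_samples):
            continue
        # Drop low-count observations within individual samples.
        kept = {
            sample: reads
            for sample, reads in counts.items()
            if sample not in control_samples and reads > min_sample_reads
        }
        # Drop OTUs that remain rare across the whole dataset.
        if sum(kept.values()) > min_total_reads:
            filtered[otu] = kept
    return filtered


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    example = {
        "otu1": {"s1": 10, "s2": 1, "neg": 0},
        "otu2": {"s1": 3, "s2": 2},
        "otu3": {"s1": 50, "neg": 4},
    }
    print(filter_otu_table(example, {"neg"}))
```

Filtering on both axes is deliberately conservative: a tag-jumped read typically shows up as one or two stray counts in a sample, while a chimera that escaped clustering tends to be rare across the whole run.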

Ecology and Evolution. Our phylogenetic analyses show that the Ostreobium clade originated in the Ordovician, around 500 million years ago (Fig. 3 and ref. 47), which is in agreement with the oldest trace fossils attributed to this alga48. Although this pre-dates the existence of modern scleractinian corals, traces of ancestral Ostreobium lineages have been found in limestone rocks formed by extinct rugose corals, shells and stromatoporoids48,49, attesting to the ancient origin of the endolithic lifestyle. Our results show that multiple Ostreobium lineages survived the Permian mass extinction and diversified after the Triassic, together with the rise of scleractinian reefs50. The appearance of endolithic clades #1 and #2 falls in the late Paleozoic; clade #1 diversified in the Mesozoic, while clade #2 continued to diversify during the Cenozoic.

The ability to bore into coral skeletons evolved independently over 20 times in 12 Ulvophycean families (Fig. 3 and Supplementary Fig. S3 for Cladophora). This is surprising because Ostreobium and Phaeophila were thought to be the only green algae able to live within skeletons of live corals. The skeleton is an extreme environment for algae due to low light conditions and exposure to daily fluctuations of pH and oxygen levels caused by the holobiont’s photosynthesis and respiration5. The endolithic niche also varies depending on the coral species and external environmental conditions. It is therefore reasonable to expect that the endolithic lineages discovered here are not homogeneously distributed among these different niches. Indeed, the study of Gutner-Hoch and Fine13 suggests niche differentiation across depth gradients in the distribution of Ostreobium genotypes and some species-specific associations with corals.

The effects of the diversity and distribution of endolithic algae on the coral holobiont are still to be investigated. For example, tolerance to thermal stress in some coral species is partly dependent on the relative abundance of certain Symbiodinium types, which can change in response to environmental perturbations51. Endolithic algal biomass within corals increases under elevated temperature and pCO2 (ref. 8), but it is unknown whether relative abundances of different lineages change, as is the case for Symbiodinium, and whether these different lineages have different ecological roles in the holobiont. By uncovering the diversity of endolithic algae, we set the stage and present methods to investigate these ecological interactions in detail. Scleractinian corals have been associated with endolithic algae since early in their evolution (Fig. 3), hence one would expect to find a variety of symbiotic associations ranging from mutualism to amensalism and perhaps parasitism, but these are yet to be discovered.

Conclusion and perspectives. This study shows that metabarcoding surveys of coral-associated microbiomes based only on 16S rDNA or 18S rDNA underestimate the diversity of entire families of organisms, some of which have critical roles in the holobiont. We put forward the use of a cost-effective multi-marker approach for more comprehensive biodiversity surveys. Our results reveal that both prokaryotic (e.g. cyanobacteria) and eukaryotic members of the microbiome within coral skeletons are more diverse than previously thought, offering interesting perspectives for future research on the interactions among these microorganisms. By using a high-resolution marker and a phylogenetic framework we found six endolithic algal clades with divergence times close to the family level. Our results show that the oldest endolithic lineages originated ca. 500 million years ago, and the transition to a coral-endolithic lifestyle happened over 20 times in green algal evolution. With this baseline of their biodiversity and evolution at hand, it becomes possible to design ecophysiological experiments to investigate the adaptation of the different lineages to the endolithic niche. We are also applying the multi-marker environmental sequencing method in comparative ecology settings, for example, to study whether different coral species are associated with particular algal lineages and how coral-algal associations change as a function of ecological conditions. Besides helping to understand how these different lineages affect the holobiome resilience under environmental stress, this approach is likely to reveal a large number of species in other eukaryotic groups and help to shed light on the darkest matter of microbial diversity.


2.4 Materials and Methods

Sampling and DNA isolation. The sampling was designed to set a solid baseline of the biodiversity and evolution of the endolithic community. We collected 132 coral skeleton samples from a wide variety of habitats and coral genera in Australia and Papua New Guinea (Supplementary Table S1). We chose not to focus our collections on systematic samples for comparative studies (i.e. multiple replicates for beta diversity analysis) at this stage. Instead, we targeted a broad diversity of coral species and ecological conditions (depth, microhabitat) to increase the chances of detecting different endolithic species. After collection, the coral living tissue was removed and samples were stored in RNAlater or 100% ethanol. The environmental DNA was extracted using the Wizard Genomic DNA Purification Kit (Promega) or a phenol-chloroform protocol (Supplementary Materials). Although different DNA isolation protocols may be biased towards extracting certain groups of organisms more than others, we chose to analyse these samples together because we do not perform any comparative (e.g. beta-diversity) analysis that could be negatively affected by it. Ten mock DNA extractions were performed alongside the sample DNA isolations to detect possible laboratory contaminants.

Library preparation. We used a two-step PCR procedure to prepare Illumina sequencing libraries for multiple samples and markers. In order to add complexity to the library we added 0–3 random base pairs at the 5′ end of the primers52, followed by an overhang tail of 33 bp (Supplementary Table S2). We replaced a commonly used commercial kit (Nextera Index kit) with custom-made oligos containing dual indices (8 bp) and Illumina adapters (Supplementary Table S2), which are ligated to the amplicons in the second PCR. The details of the primers and both PCR amplification steps used here are given in the Supplementary Materials. Negative controls for all PCRs were sequenced and OTUs from those libraries were excluded in step 11 of the data processing workflow described below. We purified the samples using home-made magnetic beads as described in Rohland and Reich27, then quantified, normalized and pooled the samples (see Supplementary Materials for details). The libraries were sequenced on the Illumina MiSeq platform (2 × 300 bp paired-end reads).

Data processing pipeline. The steps used to process the multi-gene dataset, perform the quality control and OTU clustering were (see also Fig. 1B and Supplementary Materials for details):

1. Remove the reverse complement of adapters from short amplicons.
2. Separate genes into different files. With our library preparation design, the MiSeq run yields one file containing all four amplicons per sample, which can be teased apart based on primer sequences.
3. Trim 3′ ends of reads to improve consensus quality.
4. Merge forward and reverse reads using FLASH53.
5. Filter merged reads based on a quality threshold (average of 35 per merged sequence) using PRINSEQ54.
6. Trim primers from merged reads. Sequences that do not meet a minimum length threshold and/or do not have the primer sequence at the 3′ and 5′ ends are excluded in order to ensure global trimming.
7. Format the sequence headers to include sample name, run name and read number, then generate one file per gene containing all samples.
8. Cluster OTUs with the UPARSE pipeline46. Based on the divergence of the tufA gene among Bryopsidales we used a similarity threshold of 98% for OTU clustering in this marker, which is a conservative threshold for species level. For the other markers we used the default threshold of 97% (see Supplementary Materials).
9. Align the sequences using PyNAST55 for 16S and 18S rDNA and MAFFT56 for 23S rDNA and tufA.
10. Assign taxonomy using the Naïve Bayesian Classifier (RDP) implemented in QIIME19,28. We used RDP taxonomic assignments to: i) infer the abundance of reads assigned to the main microbial groups; and ii) pre-filter OTUs to build a green algae phylogeny: OTUs that were not classified as “Eukaryotic” were excluded from the tufA phylogenetic analysis. Likewise, only OTUs classified as “Chloroplast” in the 16S rDNA dataset were included in the 16S phylogeny.
11. Filter OTUs found in negative controls.
12. Filter OTU table by minimum count (2) of reads per OTU per sample.
13. Filter rare OTUs and produce final filtered OTU fasta file.
14. Produce final OTU table and statistics.
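
Steps 5 and 6 can be sketched in a few lines. The pipeline itself uses PRINSEQ for quality filtering, so the code below is only an illustration. It assumes Phred+33 quality encoding and assumes that the minimum length is checked on the trimmed sequence; the function names and the toy primer sequences in the usage example are hypothetical.

```python
import re
from typing import Optional

# Regular-expression equivalents of the IUPAC nucleotide codes, so that
# degenerate primer positions match any of the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    if not quality:
        return 0.0
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def passes_quality(quality: str, min_mean: float = 35.0) -> bool:
    """Step 5: keep a merged read only if its mean quality reaches the threshold."""
    return mean_phred(quality) >= min_mean


def trim_primers(
    read: str, forward_primer: str, reverse_primer_rc: str, min_length: int
) -> Optional[str]:
    """Step 6: return the sequence between the primers, or None if the read
    lacks a primer at either end or the trimmed sequence is too short."""
    fwd = "".join(IUPAC[base] for base in forward_primer.upper())
    rev = "".join(IUPAC[base] for base in reverse_primer_rc.upper())
    match = re.fullmatch(f"{fwd}(.*){rev}", read.upper())
    if match is None:
        return None
    insert = match.group(1)
    return insert if len(insert) >= min_length else None


if __name__ == "__main__":
    # Toy primers, not the primers used in the study.
    read = "ACGTC" + "GGGGCCCCAA" + "TTGAA"
    print(passes_quality("IIII"))
    print(trim_primers(read, "ACGTM", "TTGRA", min_length=5))
```

Requiring the primer at both ends means every retained sequence starts and ends at the same alignment positions, which is what the global trimming in step 6 is meant to guarantee before clustering.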

Phylogenetic analysis. In order to place the OTUs in a green algal phylogeny, we created reference alignments containing tufA, rbcL, 18S, 16S and 23S rDNA sequences of green algae (Supplementary Materials). We subsequently added the sequence data for the OTUs to the multi-locus alignment, producing one alignment per amplicon (4833–9489 bp). We aligned the OTUs with the reference sequences using Geneious57 and MAFFT56. We used Partitioned Model Tester v.1.03 (https://github.com/hverbruggen/PMT) to identify the best-fit model of molecular evolution and partitioning strategy, then reconstructed the phylogeny using RAxML58, with Prasinococcus capsulatus as outgroup. OTUs that did not fall within the core Chlorophyta were excluded from the analysis.

We calibrated the phylogeny containing the tufA OTUs in geological time with the PhyloBayes program59 using node ages estimated in a previous study47. To infer the origins of the coral skeleton-boring nature, we classified taxa into coral-endolithic or non-coral-endolithic. Besides the OTUs retrieved here, the following species were classified as coral-endolithic: Phaeophila dendroides, Ulvella endozoica and Ostreobium spp., which have been reported from coral skeletons4,41,42 and Halimeda discoidea and H. micronesica, which have identical tufA sequences to two retrieved OTUs. We estimated the ancestral states with 1000 simulations of stochastic mapping using the R package phytools60, and plotted the average log-likelihood of the ancestral states along the tree with TreeGradients v.1.03 (available at www.phycoweb.net).
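
One way to condense repeated stochastic mappings into a single value per node can be sketched as follows. This is not phytools or TreeGradients code. It assumes that each simulation has already been reduced to a mapping from node label to sampled state, and it reports the fraction of simulations in which each node was coral-endolithic. The node labels, state names and function name are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def state_frequency(
    simulations: Iterable[Dict[str, str]], state: str = "endolithic"
) -> Dict[str, float]:
    """Fraction of simulations in which each node was assigned `state`."""
    hits: Counter = Counter()
    totals: Counter = Counter()
    for simulation in simulations:
        for node, sampled_state in simulation.items():
            totals[node] += 1
            if sampled_state == state:
                hits[node] += 1
    return {node: hits[node] / totals[node] for node in totals}


if __name__ == "__main__":
    # Four made-up simulations over two internal nodes.
    sims = [
        {"n1": "endolithic", "n2": "free"},
        {"n1": "endolithic", "n2": "endolithic"},
        {"n1": "free", "n2": "free"},
        {"n1": "endolithic", "n2": "free"},
    ]
    print(state_frequency(sims))
```

With many simulations these per-node fractions approximate the probability that an ancestral lineage was coral-endolithic, which is the quantity shaded along the branches in Fig. 3.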


2.5 References

1. Rosenberg, E., Koren, O., Reshef, L., Efrony, R. & Zilber-Rosenberg, I. The role of microorganisms in coral health, disease and evolution. Nat. Rev. Microbiol. 5, 355–362 (2007).
2. Blackall, L. L., Wilson, B. & van Oppen, M. J. H. Coral-the world’s most diverse symbiotic ecosystem. Mol. Ecol. 24, 5330–5347 (2015).
3. Verbruggen, H. & Tribollet, A. Boring algae. Curr. Biol. 21, R876–R877 (2011).
4. Tribollet, A. The boring microflora in modern coral reef ecosystems: a review of its roles In Curr. Dev. Bioerosion (eds Wisshak, M. & Tapanila, L.) 67–94 (Springer Berlin Heidelberg, 2008).
5. Shashar, N. & Stambler, N. Endolithic algae within corals - life in an extreme environment. J. Exp. Mar. Bio. Ecol. 163, 277–286 (1992).
6. Tribollet, A. Dissolution of dead corals by euendolithic microorganisms across the northern Great Barrier Reef (Australia). Microb. Ecol. 55, 569–80 (2008).
7. Grange, J. S., Rybarczyk, H. & Tribollet, A. The three steps of the carbonate biogenic dissolution process by microborers in coral reefs (New Caledonia). Environ. Sci. Pollut. Res. 22, 13625–13637 (2015).
8. Reyes-Nivia, C., Diaz-Pulido, G., Kline, D., Hoegh-Guldberg, O. & Dove, S. Ocean acidification and warming scenarios increase microbioerosion of coral skeletons. Glob. Chang. Biol. 19, 1919–1929 (2013).
9. Yamazaki, S. S., Nakamura, T. & Yamasaki, H. Photoprotective role of endolithic algae colonized in coral skeleton for the host photosynthesis In Photosynth. Energy from Sun (eds Allen, J., Gantt, E., Golbeck, J. H. & Osmond, B.) 1391–1395 (Springer Netherlands, 2008).
10. Fine, M. & Loya, Y. Endolithic algae: an alternative source of photoassimilates during coral bleaching. Proc. R. Soc. B-Biological Sci. 269, 1205–10 (2002).
11. Vroom, P. & Smith, C. The challenge of siphonous green algae. Am. Sci. 89, 524 (2001).
12. Verbruggen, H. Morphological complexity, plasticity, and species diagnosability in the application of old species names in DNA-based taxonomies. J. Phycol. 50, 26–31 (2014).
13. Gutner-Hoch, E. & Fine, M. Genotypic diversity and distribution of Ostreobium quekettii within scleractinian corals. Coral Reefs 30, 643–650 (2011).
14. Taberlet, P., Coissac, E., Pompanon, F., Brochmann, C. & Willerslev, E. Towards next-generation biodiversity assessment using DNA metabarcoding. Mol. Ecol. 21, 2045–50 (2012).
15. Rappé, M. S. & Giovannoni, S. J. The uncultured microbial majority. Annu. Rev. Microbiol. 57, 369–394 (2003).
16. Marcy, Y. et al. Dissecting biological ‘dark matter’ with single-cell genetic analysis of rare and uncultivated TM7 microbes from the human mouth. Proc. Natl. Acad. Sci. 104, 11889–11894 (2007).
17. Lema, K. A., Willis, B. L. & Bourne, D. G. Amplicon pyrosequencing reveals spatial and temporal consistency in diazotroph assemblages of the Acropora millepora microbiome. Environ. Microbiol. 16, 3345–3359 (2014).
18. Kirkham, A. R. et al. Basin-scale distribution patterns of photosynthetic picoeukaryotes along an Atlantic Meridional Transect. Environ. Microbiol. 13, 975–990 (2011).
19. Wang, Q., Garrity, G. M., Tiedje, J. M. & Cole, J. R. Naive Bayesian Classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Appl. Environ. Microbiol. 73, 5261–5267 (2007).
20. Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. Basic local alignment search tool. J. Mol. Biol. 215, 403–410 (1990).
21. Yilmaz, P. & Glöckner, F. O. Metagenomes: 23S Sequences In Encycl. Metagenomics: Genes, Genomes and Metagenomes: Basics, Methods, Databases and Tools (ed. Nelson, K. E.) 396–402 (Springer US, 2015).
22. Klindworth, A. et al. Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Res. 41, e1 (2013).
23. Porazinska, D. L. et al. Evaluating high-throughput sequencing as a method for metagenomic analysis of nematode diversity. Mol. Ecol. Resour. 9, 1439–1450 (2009).
24. Presting, G. G. Identification of conserved regions in the plastid genome: implications for DNA barcoding and biological function. Can. J. Bot. 84, 1434–1443 (2006).
25. Saunders, G. & Kucera, H. An evaluation of rbcL, tufA, UPA, LSU and ITS as DNA barcode markers for the marine green macroalgae. Cryptogam. Algol. 31, 487–528 (2010).
26. Gilbert, J. A., Jansson, J. K. & Knight, R. The Earth Microbiome project: successes and aspirations. BMC Biol. 12, 69 (2014).
27. Rohland, N. & Reich, D. Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Res. 22, 939–46 (2012).
28. Caporaso, J. G. et al. QIIME allows analysis of high-throughput community sequencing data. Nat. Methods 7, 335–336 (2010).
29. Ainsworth, T. et al. The coral core microbiome identifies rare bacterial taxa as ubiquitous endosymbionts. ISME J. 9, 2261–2274 (2015).
30. Li, J. et al. Bacterial dynamics within the mucus, tissue and skeleton of the coral Porites lutea during different seasons. Sci. Rep. 4, 7320 (2014).
31. Andrews, J. H. The pathology of marine algae. Biol. Rev. 51, 211–252 (1976).
32. Kittelmann, S. et al. Simultaneous amplicon sequencing to explore co-occurrence patterns of bacterial, archaeal and eukaryotic microorganisms in rumen microbial communities. Plos One 8, e47879 (2013).
33. La Claire, J. W. & Wang, J. S. Structural characterization of the terminal domains of linear plasmid-like DNA from the green alga Ernodesmis (Chlorophyta). J. Phycol. 40, 1089–1097 (2004).
34. Fernando, S. C. et al. Microbiota of the major south atlantic reef building coral Mussismilia. Microb. Ecol. 69, 267–280 (2014).
35. Yang, S.-H. et al. Prevalence of potential nitrogen-fixing, green sulfur bacteria in the skeleton of reef-building coral Isopora palifera. Limnol. Oceanogr. 61, 1078–1086 (2016).
36. Behrendt, L. et al. Endolithic chlorophyll d-containing phototrophs. ISME J. 5, 1072–1076 (2011).
37. Sauvage, T., Schmidt, W. E., Suda, S. & Fredericq, S. A metabarcoding framework for facilitated survey of endolithic phototrophs with tufA. BMC Ecol. 16, 8 (2016).
38. Verbruggen, H. et al. Phylogenetic analysis of Pseudochlorodesmis strains reveals cryptic diversity above the family level in the siphonous green algae (Bryopsidales, Chlorophyta). J. Phycol. 45, 726–731 (2009).
39. Nielsen, R. Marine algae within calcareous shells from New Zealand. New Zeal. J. Bot. 25, 425–438 (1987).
40. Meinesz, A. Sur le cycle de l’Halimeda tuna (Ellis et Solander) Lamouroux (Udoteacee, Caulerpale). Compte Rendu Hebd. des Séances l’Académie des Sci. Paris 275, 1363–1365 (1972).
41. Titlyanov, E. A., Kiyashko, S. I., Titlyanova, T. V., Kalita, T. L. & Raven, J. A. δ13C and δ15N values in reef corals Porites lutea and P. cylindrica and in their epilithic and endolithic algae. Mar. Biol. 155, 353–361 (2008).
42. Goldberg, W. M., Makemson, J. C. & Colley, S. B. Entocladia endozoica sp. nov., A pathogenic chlorophyte: structure, life history, physiology, and effect on its coral host. Biol. Bull. 166, 368 (1984).
43. Correa, J. A. & McLachlan, J. L. Endophytic algae of Chondrus crispus (Rhodophyta). V. Fine structure of the infection by operculata (Chlorophyta). Eur. J. Phycol. 29, 33–47 (1994).
44. Acinas, S. G., Sarma-Rupavtarm, R., Klepac-Ceraj, V. & Polz, M. F. PCR-Induced sequence artifacts and bias: insights from comparison of two 16s rRNA clone libraries constructed from the same sample. Appl. Environ. Microbiol. 71, 8966–8969 (2005).
45. Schnell, I. B., Bohmann, K. & Gilbert, M. T. P. Tag jumps illuminated - reducing sequence-to-sample misidentifications in metabarcoding studies. Mol. Ecol. Resour. 15, 1289–1303 (2015).
46. Edgar, R. C. UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nat. Methods 647, 1–5 (2013).
47. Verbruggen, H. et al. A multi-locus time-calibrated phylogeny of the siphonous green algae. Mol. Phylogenet. Evol. 50, 642–653 (2009).
48. Vogel, K. & Brett, C. E. Record of microendoliths in different facies of the Upper Ordovician in the Cincinnati Arch region USA: The early history of light-related microendolithic zonation. Palaeogeogr. Palaeoclimatol. Palaeoecol. 281, 1–24 (2009).
49. Vogel, K. Bioeroders in fossil reefs. Facies 28, 109–113 (1993).
50. Stanley, G. D. The evolution of modern corals and their early history. Earth-Science Rev. 60, 195–225 (2003).
51. Berkelmans, R. & van Oppen, M. J. H. The role of zooxanthellae in the thermal tolerance of corals: a ‘nugget of hope’ for coral reefs in an era of climate change. Proc. R. Soc. B Biol. Sci. 273, 2305–2312 (2006).
52. Tremblay, J. et al. Primer and platform effects on 16S rRNA tag sequencing. Front. Microbiol. 6, 1–15 (2015).
53. Magoč, T. & Salzberg, S. L. FLASH: Fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957–2963 (2011).
54. Schmieder, R. & Edwards, R. Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863–864 (2011).
55. Caporaso, J. G. et al. PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26, 266–267 (2010).
56. Katoh, K., Misawa, K., Kuma, K. & Miyata, T. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30, 3059–3066 (2002).
57. Kearse, M. et al. Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647–1649 (2012).
58. Stamatakis, A. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688–2690 (2006).
59. Lartillot, N., Lepage, T. & Blanquart, S. PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25, 2286–2288 (2009).
60. Revell, L. J. phytools: an R package for phylogenetic comparative biology (and other things). Methods Ecol. Evol. 3, 217–223 (2012).

Acknowledgements

This work was supported by the Australian Biological Resources Study (RFL213-08), the Australian Research Council (FT110100585), the Botany Foundation (The University of Melbourne) and the Holsworth Wildlife Research Endowment. VRM receives a University of Melbourne scholarship. We thank Chiela Cremen and Chris Jackson for providing reference sequences, Todd McLay and Kym Pham for helping to troubleshoot the library prep and Adam Robbins-Pianka for updating the filter_observation_by_samples.py script. We are very grateful to Francesca Benzoni for providing coral identifications during the Papua New Guinea sampling campaigns. We thank Claude Payri, the staff of the Alis research vessel and the La Planète Revisitée program for facilitating fieldwork in Papua New Guinea, and everyone who helped with field work in Western Australia. We also thank Madeleine van Oppen, Margaret Brookes, Chris Jackson, Analy Leite, Raquel Peixoto and an anonymous reviewer for their valuable comments on the manuscript.

Author Contributions

Both authors conceptualized the study and collected the samples. V.R.M. performed DNA extractions and library preparation. V.R.M. and H.V. developed the analysis pipeline and wrote the manuscript.

Additional Information

Supplementary information accompanies this paper at http://www.nature.com/srep

Competing financial interests: The authors declare no competing financial interests.

How to cite this article: Rossetto Marcelino, V. and Verbruggen, H. Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Sci. Rep. 6, 31508; doi: 10.1038/srep31508 (2016).

This work is licensed under a Creative Commons Attribution 4.0 International License. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

© The Author(s) 2016


2.6 Supplementary Materials

Sampling and DNA isolation

Supplementary Table S1: Coral skeleton samples and locality.

Sample       Country           Locality           Coral host
HV04438.R3   Papua New Guinea  Kavieng            Unidentified stony coral
HV04837.R3   Papua New Guinea  Kavieng            Favites sp.
HV04841.R3   Papua New Guinea  Kavieng            Porites sp.
HV04853.R3   Papua New Guinea  Kavieng            Millepora sp.
HV04854.R3   Papua New Guinea  Kavieng            Goniastrea pectinata
HV04857.R3   Papua New Guinea  Kavieng            Porites sp.
HV05003.R3   Papua New Guinea  Kavieng            Goniastrea sp.
HV05004b.R3  Papua New Guinea  Kavieng            Platygyra sp.
HV05004c.R3  Papua New Guinea  Kavieng            Platygyra sp.
HV05005.R3   Papua New Guinea  Kavieng            Porites sp.
HV05006a.R3  Papua New Guinea  Kavieng            Goniastrea edwardsi
HV05006b.R3  Papua New Guinea  Kavieng            Goniastrea edwardsi
HV05007.R3   Papua New Guinea  Kavieng            Diploastrea heliopora
HV05008.R3   Papua New Guinea  Kavieng            Leptoseris sp.
HV05009.R3   Papua New Guinea  Kavieng            Favites russelli
HV05010.R3   Papua New Guinea  Kavieng            Madracis sp.
HV05011.R3   Papua New Guinea  Kavieng            Oxypora lacera
HV05012.R3   Papua New Guinea  Kavieng            Leptoseris striata
HV05013.R3   Papua New Guinea  Kavieng            Pectinia sp.
HV05014.R3   Papua New Guinea  Kavieng            Leptoseris striata
HV05015.R3   Papua New Guinea  Kavieng            Porites profundus
HV05016.R3   Papua New Guinea  Kavieng            Goniastrea edwardsii
HV05017.R3   Papua New Guinea  Kavieng            Goniastrea edwardsii
HV05019.R3   Papua New Guinea  Kavieng            Favites halicora
HV05020.R3   Papua New Guinea  Kavieng            Pachyseris sp.
HV05021.R3   Papua New Guinea  Kavieng            Goniastrea edwardsii
HV05022.R3   Papua New Guinea  Kavieng            Platygyra lamellina
HV05023.R3   Papua New Guinea  Kavieng            Porites australiensis
HV05024.R3   Papua New Guinea  Kavieng            Lobophyllia sp.
HV05025.R3   Papua New Guinea  Kavieng            Symphyllia valenciennesi
HV05026.R3   Papua New Guinea  Kavieng            Goniastrea edwardsii
HV05028.R3   Papua New Guinea  Kavieng            Stylophora pistillata
HV05029.R3   Papua New Guinea  Kavieng            Porites australiensis
HV05031.R3   Papua New Guinea  Kavieng            Pachyseris sp.
HV05032.R3   Papua New Guinea  Kavieng            Pectinia sp.
HV05034.R3   Papua New Guinea  Kavieng            Porites sp.
HV05035.R3   Papua New Guinea  Kavieng            Echinopora hirsutissima
HV05037.R3   Papua New Guinea  Kavieng            Merulina ampliata
HV05038.R3   Papua New Guinea  Kavieng            Pachyseris sp.
HV05039.R3   Papua New Guinea  Kavieng            Merulina ampliata
HV05040c.R3  Papua New Guinea  Kavieng            Porites sp.
HV05041a.R3  Papua New Guinea  Kavieng            Galaxea astreata
HV05041b.R3  Papua New Guinea  Kavieng            Galaxea astreata
HV05042.R3   Papua New Guinea  Kavieng            Diploastrea heliopora
HV05043.R3   Papua New Guinea  Kavieng            Echinopora hirsutissima
HV05044.R3   Papua New Guinea  Kavieng            Lobophyllia sp.
HV05045.R3   Papua New Guinea  Kavieng            Oxypora lacera
HV05046.R3   Papua New Guinea  Kavieng            Leptoseris troglodyta
HV05047.R3   Papua New Guinea  Kavieng            Leptoseris mycetoseroides
PHV207.R1    Papua New Guinea  PNG                Unidentified stony coral
PHV237.R1    Papua New Guinea  PNG                Unidentified stony coral
PHV570.R1    Papua New Guinea  PNG                Unidentified stony coral
PHV882.R1    Papua New Guinea  PNG                Unidentified stony coral
VRM0028.R1   Australia         Western Australia  Montipora sp.
VRM0032.R1   Australia         Western Australia  Montipora sp.
VRM0036.R1   Australia         Western Australia  Unidentified stony coral
VRM0039.R2   Australia         Western Australia  Cyphastrea sp.
VRM0040.R2   Australia         Western Australia  Goniastrea sp.
VRM0042.R2   Australia         Western Australia  Goniastrea sp.
VRM0043.R2   Australia         Western Australia  Leptoria sp.
VRM0044.R2   Australia         Western Australia  Unidentified stony coral
VRM0045.R2   Australia         Western Australia  Montastrea sp.
VRM0046.R2   Australia         Western Australia  Unidentified stony coral
VRM0048.R2   Australia         Western Australia  Goniastrea sp.
VRM0051.R2   Australia         Western Australia  Porites sp.
VRM0052.R2   Australia         Western Australia  Porites sp.
VRM0053.R2   Australia         Western Australia  Porites sp.
VRM0054.R2   Australia         Western Australia  Porites sp.
VRM0055.R2   Australia         Western Australia  Porites sp.
VRM0056.R2   Australia         Western Australia  Porites sp.
VRM0057.R2   Australia         Western Australia  Porites sp.
VRM0058.R2   Australia         Western Australia  Porites sp.
VRM0059.R2   Australia         Western Australia  Porites sp.
VRM0060.R1   Australia         Western Australia  Porites sp.
VRM0061.R2   Australia         Western Australia  Porites sp.
VRM0062.R2   Australia         Western Australia  Porites sp.
VRM0063.R2   Australia         Western Australia  Porites sp.
VRM0064.R2   Australia         Western Australia  Porites sp.
VRM0065.R2   Australia         Western Australia  Porites sp.
VRM0066.R1   Australia         Western Australia  Porites sp.
VRM0067.R1   Australia         Western Australia  Porites sp.
VRM0068.R2   Australia         Western Australia  Porites sp.
VRM0069.R2   Australia         Western Australia  Porites sp.
VRM0070.R2   Australia         Western Australia  Porites sp.
VRM0071.R2   Australia         Western Australia  Porites sp.
VRM0072.R2   Australia         Western Australia  Porites sp.
VRM0073.R2   Australia         Western Australia  Porites sp.
VRM0074.R2   Australia         Western Australia  Porites sp.
VRM0076.R2   Australia         Western Australia  Porites sp.
VRM0077.R2   Australia         Western Australia  Porites sp.
VRM0078.R2   Australia         Western Australia  Porites sp.
VRM0079.R2   Australia         Western Australia  Porites sp.
VRM0080.R2   Australia         Western Australia  Porites sp.
VRM0081.R1   Australia         Western Australia  Porites sp.
VRM0082.R2   Australia         Western Australia  Porites sp.
VRM0083.R2   Australia         Western Australia  Porites sp.
VRM0084.R2   Australia         Western Australia  Porites sp.
VRM0085.R2   Australia         Western Australia  Porites sp.
VRM0086.R1   Australia         Western Australia  Porites sp.
VRM0087.R1   Australia         Western Australia  Porites sp.
VRM0090.R1   Australia         Western Australia  Porites sp.
VRM0091.R1   Australia         Western Australia  Montipora sp.
VRM0096.R1   Australia         Western Australia  Porites sp.
VRM0097.R2   Australia         Western Australia  Porites sp.
VRM0098.R1   Australia         Western Australia  Porites sp.
VRM0099.R1   Australia         Western Australia  Porites sp.
VRM0100.R1   Australia         Western Australia  Porites sp.
VRM0101.R2   Australia         Western Australia  Porites sp.
VRM0102.R2   Australia         Western Australia  Porites sp.
VRM0103.R2   Australia         Western Australia  Porites sp.
VRM0104.R2   Australia         Western Australia  Porites sp.
VRM0105.R2   Australia         Western Australia  Porites sp.
VRM0106.R1   Australia         Western Australia  Porites sp.
VRM0107.R2   Australia         Western Australia  Porites sp.
VRM0108.R2   Australia         Western Australia  Porites sp.
VRM0109.R2   Australia         Western Australia  Porites sp.
VRM0110.R2   Australia         Western Australia  Porites sp.
VRM0111.R2   Australia         Western Australia  Porites sp.
VRM0112.R2   Australia         Western Australia  Porites sp.
VRM0113.R2   Australia         Western Australia  Porites sp.
VRM0114.R2   Australia         Western Australia  Porites sp.
VRM0115.R1   Australia         Western Australia  Porites sp.
VRM0116.R2   Australia         Western Australia  Porites sp.
VRM0117.R2   Australia         Western Australia  Porites sp.
VRM0118.R2   Australia         Western Australia  Porites sp.
VRM0119.R2   Australia         Western Australia  Porites sp.
VRM0120.R1   Australia         Western Australia  Porites sp.
VRM0121.R1   Australia         Western Australia  Porites sp.
VRM0122.R1   Australia         Western Australia  Porites sp.
VRM0123.R1   Australia         Western Australia  Porites sp.
VRM0124.R1   Australia         Western Australia  Porites sp.
VRM0190.R1   Australia         Queensland         Pocillopora sp.

After collection, the coral tissue was removed with pliers and razor blades and samples were stored in RNAlater or 100% ethanol. The DNA was extracted using either (1) a modified phenol-chloroform protocol (see Cremen et al., 2016), in which our only modification was to use phenol/chloroform/isoamyl alcohol [25:24:1] (instead of chloroform/isoamyl alcohol only) in the first extraction step, or (2) the Wizard Genomic DNA Purification Kit (Promega). Higher amplification success and DNA yield were obtained using the kit with a small modification: incubating a piece of coral skeleton (ca. 80 mm³) in the lysis buffer for 3 hours without grinding the sample, then proceeding with the manufacturer’s instructions for plant tissue.

Library preparation:

The amplicons and respective primers used here were:

16S rDNA: We used either the 515f/806r (Caporaso et al., 2012; Gilbert et al., 2014) or the S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21 (Klindworth et al., 2013) primer pairs to PCR-amplify this marker. The amplicons generated by these two primer pairs overlap in the V3-V4 region, so we used only this overlapping region; the sequence length after trimming primer sequences was, on average, 225 base pairs.

18S rDNA: We used the NF1/18Sr2b primer combination (Porazinska et al., 2009).

23S rDNA: We used the algal-specific primer pair p23SrV_r1/p23SrV_f1, which PCR-amplifies the Universal Plastid Amplicon (Presting 2006).

tufA: We used primers tufAR (Fama et al., 2002) and a forward primer designed here for Ostreobium (Oq-tuf: ACN GGN CGN GGN ACN GT), which has several ambiguous bases in order to amplify a larger range of green algae species.
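To make the effect of the ambiguous bases concrete, the sketch below counts how many distinct oligonucleotides a degenerate primer encodes, using the standard IUPAC nucleotide codes. The function names are illustrative and are not part of the original workflow; the primer strings are copied from this section.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Return the number of distinct sequences encoded by a degenerate primer."""
    count = 1
    for base in primer.replace(" ", "").upper():
        count *= len(IUPAC[base])
    return count


def expand(primer):
    """Yield every concrete sequence encoded by a degenerate primer."""
    pools = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    for combo in product(*pools):
        yield "".join(combo)


OQ_TUF = "ACN GGN CGN GGN ACN GT"      # forward tufA primer as written above
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"    # 16S forward primer (Supplementary Table S2)

print("Oq-tuf variants:", degeneracy(OQ_TUF))
print("515f variants:", degeneracy(PRIMER_515F))
```

With five N positions, Oq-tuf represents 4^5 = 1024 sequence variants, compared with two for 515f, which is what lets it anneal to a wider range of tufA templates.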

Supplementary Table S2: 1st PCR primer design†

         Tail 1                              Ns (0-3 bp)  Primer (e.g. 16S 515/806)
Forward  TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG   NNN          [GTGCCAGCMGCCGCGGTAA]
Reverse  GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG  NNN          [GGACTACHVGGGTWTCTAAT]

† Oligonucleotide sequences © 2007-2012 Illumina, Inc. All rights reserved - Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.
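The layout in Supplementary Table S2 (tail, then 0 to 3 Ns, then the locus-specific primer) can be sketched as a simple string assembly. This is only an illustration: the helper name is hypothetical, and the table does not state whether the four spacer-length variants were ordered separately or pooled.

```python
# Sequences copied from Supplementary Table S2.
TAIL_F = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
TAIL_R = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"
PRIMER_515F = "GTGCCAGCMGCCGCGGTAA"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def first_pcr_oligos(tail, primer, max_spacer=3):
    """Return the tail + N-spacer + primer oligos for spacer lengths 0..max_spacer."""
    return [tail + "N" * n + primer for n in range(max_spacer + 1)]


forward_oligos = first_pcr_oligos(TAIL_F, PRIMER_515F)
reverse_oligos = first_pcr_oligos(TAIL_R, PRIMER_806R)

for oligo in forward_oligos:
    print(len(oligo), oligo)
```

Swapping in another locus-specific primer (for example the tufA or 23S rDNA primers) only changes the last argument; the tails stay the same.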

Supplementary Table S3: 2nd PCR oligonucleotides†.

Forward
Illumina adapter F              Indices   Tail_2
A*ATGATACGGCGACCACCGAGATCTACAC  TAGATCGC  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  CTCTCTAT  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  TATCCTCT  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  AGAGTAGA  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  GTAAGGAG  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  ACTGCATA  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  AAGGAGTA  TCGTCGGCAGCGTC
A*ATGATACGGCGACCACCGAGATCTACAC  CTAAGCCT  TCGTCGGCAGCGTC

Reverse
Illumina adapter R         Indices   Tail_2
C*AAGCAGAAGACGGCATACGAGAT  TAAGGCGA  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  CGTACTAG  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  AGGCAGAA  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  TCCTGAGC  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  GGACTCCT  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  TAGGCATG  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  CTCTCTAC  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  CAGAGAGG  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  GCTACGCT  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  CGAGGCTG  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  AAGAGGCA  GTCTCGTGGGCTCGG
C*AAGCAGAAGACGGCATACGAGAT  GTAGAGGA  GTCTCGTGGGCTCGG

*Indicates a phosphorothioate modification.
† Oligonucleotide sequences © 2007-2012 Illumina, Inc. All rights reserved - Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.
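As an illustration of how the oligonucleotides in Supplementary Table S3 fit together, the sketch below assembles each 2nd PCR oligo (adapter + index + Tail_2), checks that Tail_2 matches the start of the corresponding Tail 1 from Supplementary Table S2 (which is what allows the second PCR to prime on first-round products), and enumerates the dual-index combinations. The variable names are ours, and the phosphorothioate bond marked with an asterisk in the table is omitted from the strings.

```python
from itertools import product

# Adapter sequences from Supplementary Table S3, without the "*" modification mark.
ADAPTER_F = "AATGATACGGCGACCACCGAGATCTACAC"
ADAPTER_R = "CAAGCAGAAGACGGCATACGAGAT"
TAIL2_F = "TCGTCGGCAGCGTC"
TAIL2_R = "GTCTCGTGGGCTCGG"

# Tail 1 sequences from Supplementary Table S2.
TAIL1_F = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
TAIL1_R = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

INDICES_F = ["TAGATCGC", "CTCTCTAT", "TATCCTCT", "AGAGTAGA",
             "GTAAGGAG", "ACTGCATA", "AAGGAGTA", "CTAAGCCT"]
INDICES_R = ["TAAGGCGA", "CGTACTAG", "AGGCAGAA", "TCCTGAGC",
             "GGACTCCT", "TAGGCATG", "CTCTCTAC", "CAGAGAGG",
             "GCTACGCT", "CGAGGCTG", "AAGAGGCA", "GTAGAGGA"]

# Full 2nd PCR oligos: adapter, then index, then Tail_2.
oligos_f = [ADAPTER_F + idx + TAIL2_F for idx in INDICES_F]
oligos_r = [ADAPTER_R + idx + TAIL2_R for idx in INDICES_R]

# Tail_2 should be a prefix of Tail 1 so that it anneals to 1st PCR products.
tails_match = TAIL1_F.startswith(TAIL2_F) and TAIL1_R.startswith(TAIL2_R)

# Each sample gets one forward and one reverse index.
index_pairs = list(product(INDICES_F, INDICES_R))

print("Tail_2 is a prefix of Tail 1:", tails_match)
print("Dual-index combinations:", len(index_pairs))
```

Eight forward and twelve reverse indices give 96 unique pairs, which is consistent with the plate-sized runs described in this section, although the text does not spell out that correspondence.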

We amplified the four markers in separate reactions containing 0.2 mM dNTP mix, 0.5 µM forward and reverse primers, 2 mM MgCl2, 0.4 µg/µl Bovine Serum Albumin, 1× PCR buffer and 0.25 U of Platinum Taq DNA Polymerase (Invitrogen). For the ribosomal DNA markers, the first PCR round consisted of an initial denaturation step at 94°C for 5 min, followed by 25 cycles of denaturation (94°C for 30 s), annealing (45 s) and extension (72°C for 30 s), and a final extension step at 72°C for 5 min. Annealing temperature was set at 50°C for primer pair 515f/806r, 55°C for p23SrV_r1/p23SrV_f1 and S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21, and 60°C for NF1/18Sr2b. Because tufA is a coding gene, it has higher mutation rates (especially at 3rd codon positions) than ribosomal DNA; a touchdown step and a lower annealing temperature are therefore required (55–48°C for 14 cycles followed by 24 cycles at 48°C). Non-specific amplification does occur, but such products are excluded in the analysis pipeline (e.g. steps 5, 6 and 9 of the pipeline).
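The tufA touchdown programme can be written out cycle by cycle. The sketch below assumes a linear decrease of equal steps per cycle, starting at 55°C, so that the 14 touchdown cycles run from 55.0°C down to 48.5°C before the 24 cycles at 48°C; the text gives only the temperature range and cycle counts, so the exact per-cycle decrement is an assumption.

```python
def touchdown_schedule(start_temp, end_temp, touchdown_cycles, plateau_cycles):
    """Return the annealing temperature (in °C) for each PCR cycle.

    Assumes equal decrements per cycle during the touchdown phase, with the
    first cycle at start_temp and the plateau held at end_temp.
    """
    step = (start_temp - end_temp) / touchdown_cycles
    temps = [start_temp - step * cycle for cycle in range(touchdown_cycles)]
    temps += [end_temp] * plateau_cycles
    return temps


# tufA programme described above: 55-48°C over 14 cycles, then 24 cycles at 48°C.
tufa_annealing = touchdown_schedule(55.0, 48.0, 14, 24)

print("Total cycles:", len(tufa_annealing))
print("Touchdown phase:", tufa_annealing[:14])
```

Under this assumption the decrement is 0.5°C per cycle and the tufA reaction runs for 38 cycles in total, compared with 25 for the ribosomal markers.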

For the second PCR we used the following conditions: initial denaturation step at 94°C for 5 min, followed by 8 cycles of denaturation (94°C for 30 s), annealing (55°C for 30 s) and extension (72°C for 30 s) and a final extension step at 72°C for 5 min. We purified the samples using home-made magnetic beads as described in Rohland and Reich (2012) and quantified the libraries using the Qubit fluorometer (Invitrogen). We produced libraries for three runs: the first run contained 48 samples (of which 5 were PCR controls or mock extractions), and the second and third runs each contained 96 samples (including 7 and 8 controls, respectively). The libraries were sequenced with the Illumina MiSeq platform (V3 kit, 2×300 bp PE reads) at the Centre for Translational Pathology, University of Melbourne. The runs generated sequences for all samples that had a successful PCR amplification (ca. 1 µM or more). Not all samples successfully sequenced in these three runs are included in this study: here we included the 132 samples listed above; the remaining samples are part of a different study and will be published separately.

Costs calculation:

20 indexed oligos (Supplementary Table S3), with phosphorothioate modification, produced at 200 nmole scale by Bioneer Pacific = 425 AUD (including GST). Each 200 nmole oligonucleotide is sufficient for 800 reactions. Prices are for plate orders.

Nextera kit with 96 Indices, 384 samples (FC-121-1012) = 1132 AUD.

Prices are from 2014.

1 Australian Dollar (AUD) ≈ 0.72 US Dollar (October 2015).
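The figures above can be turned into rough per-sample costs. The sketch below is only a back-of-the-envelope calculation under two assumptions that the text does not state explicitly: that 425 AUD is the total for all 20 oligos, and that each reaction consumes one forward and one reverse indexed oligo, so the smaller (forward) pool limits the number of reactions.

```python
AUD_TO_USD = 0.72            # exchange rate quoted above (October 2015)

OLIGO_SET_AUD = 425          # assumed to be the total for all 20 indexed oligos
REACTIONS_PER_OLIGO = 800    # stated above for each 200 nmole oligo
N_FORWARD, N_REVERSE = 8, 12 # oligo counts in Supplementary Table S3

NEXTERA_KIT_AUD = 1132
NEXTERA_SAMPLES = 384


def to_usd(aud):
    """Convert Australian dollars to US dollars at the quoted rate."""
    return aud * AUD_TO_USD


# Each reaction uses one forward and one reverse oligo, so the smaller pool
# determines how many reactions the whole set supports.
max_reactions = min(N_FORWARD, N_REVERSE) * REACTIONS_PER_OLIGO
oligo_cost_per_reaction = OLIGO_SET_AUD / max_reactions
nextera_cost_per_sample = NEXTERA_KIT_AUD / NEXTERA_SAMPLES

print(f"Custom oligos: {oligo_cost_per_reaction:.3f} AUD per reaction")
print(f"Nextera kit:   {nextera_cost_per_sample:.2f} AUD per sample")
print(f"Oligo set in USD:   {to_usd(OLIGO_SET_AUD):.2f}")
print(f"Nextera kit in USD: {to_usd(NEXTERA_KIT_AUD):.2f}")
```

Polymerase, beads and sequencing costs are not included, so these numbers only compare the indexing reagents.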

Data processing pipeline:

1. Remove the reverse complement of adapters from short amplicons. When the reads are longer than the amplicon, the reverse complement of the adapter is sequenced at the 3’ end of the read, which can interfere with the merging of the paired-end sequences.

2. Separate genes into different files. With our library preparation design, the MiSeq run yields one file per sample, each containing all amplicons. In this step the different amplicons are separated based on their primer sequences.

3. Trim 3’ ends of reads (5 bases in forward reads and 20 bases in reverse reads) to improve consensus quality.

4. Merge forward and reverse reads using FLASH (Magoč and Salzberg, 2011).

5. Quality control: filter merged reads based on a quality threshold (average of 35 per merged read) using PRINSEQ (Schmieder and Edwards, 2011).

6. Trim primers from merged reads. Sequences that do not meet a minimum length threshold and/or do not have the exact primer sequence at the 3’ and 5’ ends are excluded from analysis in order to ensure quality (i.e., the sequence belongs to the target gene) and global trimming (i.e., all sequences start and end at the same position).

7. Format the read identifiers and generate one file per gene containing all samples.

8. Run the UPARSE pipeline (Edgar, 2013): dereplication, sorting by size, OTU clustering and production of the OTU map. We chose UPARSE because other available software (e.g. QIIME and mothur) seems to significantly overestimate the number of OTUs (Edgar, 2013). Based on the divergence of tufA among Bryopsidales we used a similarity threshold of 98% for OTU clustering in this marker, which is a conservative threshold for species level (i.e., most Bryopsidales species are more similar than that, so at 98% the OTUs will be somewhere between species and genus level). We chose the 97% threshold for the other markers for two main reasons: 1) there is little information in the literature about the similarity of the rDNA markers among Bryopsidales species; if anything, these markers are known to lack the phylogenetic signal needed to distinguish them; 2) our aim was to compare how the normally used markers (with their commonly used thresholds) perform in distinguishing algal species.

9. Alignment: we used PyNAST (Caporaso et al., 2010a) to align the 16S and 18S rDNA sequences. This aligner requires a reference database of aligned sequences with many gaps in the alignment. Due to the lack of such reference databases for 23S rDNA and tufA, we chose MAFFT (Katoh et al., 2002) to align 23S rDNA and tufA. The OTUs that failed to align were excluded from downstream analysis.

10. Assign taxonomy using the Naïve Bayesian Classifier (RDP) implemented in QIIME (Wang et al., 2007; Caporaso et al., 2010b). We used the Greengenes and SILVA databases for the 16S and 18S rDNA sequences, respectively. In order to produce an RDP-friendly database for the 23S rDNA and the tufA, we downloaded reference sequences from GenBank, used a phylogenetic similarity threshold (based on a UPGMA tree) to equalize the dataset (i.e. exclude repetitive species, which would bias the RDP classifier) and produced the reference dataset (one file with the sequences and another with taxonomic ranks). We used RDP taxonomic assignments to: i) infer the abundance of reads assigned to the main microbial groups (Figure 2); and ii) pre-filter OTUs to build a green algae phylogenetic tree: OTUs that were not classified as “Eukaryotic” (or “Chloroplast” in the 16S) were excluded from the phylogenetic analysis.

11. Filter OTUs found in negative controls (mock extractions and negative control PCRs). Although virtually no DNA was detected (with Qubit) in these controls, we added those samples to our library in order to detect any possible contaminant. However, apart from cross-contamination, sequencing errors can also yield false positives, so some OTUs found in the controls could be, for example, the most abundant Ostreobium sequences, which should not be excluded. We therefore removed the OTUs present in the controls only if they represented less than 1% of the total number of reads. For the 18S dataset, OTUs matching Cnidaria and Dinophyceae were considered contaminants from the coral tissue and were also removed from the analysis.

12. Filter the OTU table by minimum count (2) of reads per OTU per sample (filter_observations_by_sample.py - https://gist.github.com/adamrp/7591573). This is another quality control step, which removes OTUs present at low abundance in the samples.

13. Filter rare OTUs (less than 5 reads) and produce the final filtered OTU fasta file. This is the input for the phylogenetic analysis.

14. Produce the final OTU table and statistics. These statistics are another sort of quality check, and the OTU table is necessary for beta-diversity/comparative analysis (which we do not do in this study). From here one can check the sequencing depth and proceed to QIIME’s core_diversity_analyses.py, for example.
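The three OTU-table filters (steps 11 to 13) can be sketched in a few lines of Python. This is a minimal illustration on a plain dictionary rather than the scripts actually used; the function name and toy data are hypothetical, and we assume that the "total number of reads" in step 11 means all reads in the table for that marker, which the text does not specify.

```python
def filter_otu_table(table, controls, max_control_fraction=0.01,
                     min_per_sample=2, min_per_otu=5):
    """Apply the control, per-sample and rare-OTU filters to an OTU table.

    table: {otu_id: {sample_id: read_count}}
    controls: set of sample ids that are mock extractions or PCR blanks
    """
    # Assumption: the 1% cut-off is relative to all reads in the table.
    total_reads = sum(sum(counts.values()) for counts in table.values())
    filtered = {}
    for otu, counts in table.items():
        otu_reads = sum(counts.values())
        in_controls = any(counts.get(c, 0) > 0 for c in controls)
        # Step 11: drop OTUs seen in controls unless they are abundant overall.
        if in_controls and otu_reads < max_control_fraction * total_reads:
            continue
        # Step 12: keep only per-sample counts at or above the minimum,
        # and leave the control samples out of the final table.
        kept = {s: n for s, n in counts.items()
                if s not in controls and n >= min_per_sample}
        # Step 13: drop OTUs with too few reads overall.
        if sum(kept.values()) >= min_per_otu:
            filtered[otu] = kept
    return filtered


# Toy data: one abundant OTU that also shows up in a blank, one contaminant,
# one rare OTU and one ordinary OTU.
toy_table = {
    "OTU_abundant": {"s1": 600, "s2": 350, "blank": 3},
    "OTU_contaminant": {"s1": 2, "blank": 4},
    "OTU_rare": {"s1": 1, "s2": 3},
    "OTU_ordinary": {"s1": 20, "s2": 1},
}
result = filter_otu_table(toy_table, controls={"blank"})
print(result)
```

In the toy example the abundant OTU survives despite appearing in the blank, which mirrors the rationale given in step 11 for not discarding the most abundant Ostreobium sequences.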

Phylogenetic analysis:

The short reads generated by high-throughput sequencing technologies remain an issue for phylogenetic analysis. To overcome this problem, we used longer fragments of the sequenced genes (available on GenBank, generated by Sanger sequencing) and additional genes to reconstruct the backbone of the green algal phylogeny. That way, even though the relationships within OTU-only clades may not be well resolved due to the short fragments, the position of these clades within the green algal phylogeny can be inferred with strong support. We concatenated genes of different species of the same genus when sequences from the same species were not available, thereby filling the gaps in the alignment as much as possible. We used only species for which there was a reference sequence (for species or genus) for the marker analyzed; for example, there is no Ostreobium 18S rDNA sequence available (Supplementary Table S4), so this taxon was not included in the phylogenetic analysis of the 18S OTUs. The phylogenetic trees (Figure 3 and Supplementary Figures S1-S4) were built with the following markers:

• tufA-OTUs phylogeny: tufA (including OTUs) + rbcL + 18S rDNA = 4833 bp.
• 16S rDNA-OTUs phylogeny: tufA + rbcL + 18S rDNA + 16S rDNA (including OTUs) = 7251 bp.
• 18S rDNA-OTUs phylogeny: tufA + rbcL + 18S rDNA (including OTUs) = 5297 bp.
• 23S rDNA-OTUs phylogeny: tufA + rbcL + 18S rDNA + 23S rDNA (including OTUs) = 9489 bp.
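The concatenated datasets listed above can be pictured as a supermatrix in which each taxon contributes whatever markers are available and the rest is missing data. The sketch below is a generic illustration of that idea, not the script used here: the function name and toy sequences are hypothetical, and coding missing markers as gap characters is our assumption.

```python
def concatenate_alignments(alignments, marker_order):
    """Concatenate per-marker alignments into one supermatrix.

    alignments: {marker: {taxon: aligned_sequence}}
    Taxa lacking a marker are padded with gaps for that marker (an assumption).
    """
    lengths = {}
    for marker in marker_order:
        seq_lengths = {len(seq) for seq in alignments[marker].values()}
        if len(seq_lengths) != 1:
            raise ValueError(f"sequences in {marker} are not aligned to one length")
        lengths[marker] = seq_lengths.pop()

    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    return {
        taxon: "".join(
            alignments[marker].get(taxon, "-" * lengths[marker])
            for marker in marker_order
        )
        for taxon in taxa
    }


# Toy example: a short OTU with tufA only, and a reference taxon with both markers.
toy_alignments = {
    "tufA": {"OTU_1": "ATG-", "Bryopsis": "ATGC"},
    "rbcL": {"Bryopsis": "GG"},
}
supermatrix = concatenate_alignments(toy_alignments, ["tufA", "rbcL"])
for taxon, sequence in supermatrix.items():
    print(taxon, sequence)
```

Short OTU sequences then occupy only their own marker block, while the longer reference sequences and additional genes provide the signal for the backbone of the tree.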


Supplementary Table S4: Reference sequences (voucher and Genbank accession number, when available) used to reconstruct the green algae phylogeny.

Species tufA rbcL 18S rDNA 16S rDNA 23S rDNA DI1_HG518471 GB_HG518454.1 GB_AY165774 acetabulum HEC12349_XX GB_AY177739.1 GB_Z33468 Acetabularia dentata MBLPoly1_AY454408 GB_AY303591 Acrochaete leptochaete GWS00746_HQ610 SAG127.80_FN563074.1 Acrosiphonia sp. UTEX.393_DQ396875 UTEX393_DQ396875 Acutodesmus obliquus XX_KC843975 GB_KC843975 Auxenochlorella protothecoides HV00599_FJ432651 GB_FJ432635.1 GB_FJ535833 Avrainvillea lacerata HV02664_XX HV02664_XX GB_FJ535834 Avrainvillea nigricans LL0095_XX Batophora noID1 KMP01309a_KJ41191 GB_FJ715716.1 MA31b1_AY303599 Bolbocoleon piliferum H.0758_XX H.0758_XX Boodleopsis sp. SAG.807.1_KM462884 SAG.807.1_KM462884 Botryococcus braunii HV01238_XX Bryopsis duplex Bryopsis hypnoides GB_GQ892829 GB_GQ892829 GB_FJ715685 XX_GQ892829 GB_NC013359 Bryopsis plumosa HV00880_XX GB_AB038480 HV00880_FJ432630 West4718_LN810504 West4718_LN810504 F004_XX Bryopsis vestita HV03983_XX HV03983_XX H.0890_XX Callipsygma wilsonis GB_CAD10730 GB_FR848349.1 GB_JF932262 filiformis L.09.10.052_FR848335 GB_KF649941.1 Caulerpa obscura GB_JN817682 GB_AB038485 GB_AF479702 Caulerpa racemosa PGSTPM002_KF724396 Caulerpa sertularioides L.09.10.048_FR848333 GB_FR848346.1 Caulerpa simpliciuscula TS0072_KM186530 GB_FR668300.1 GB_JF932252 Caulerpa verticillata TS24_FJ432655 GB_FJ432638.1 GB_FJ535836 Caulerpella ambigua GB_AJM90125.1 GB_KM464711.1 GB_DQ399583 parasiticus Chlamydomonas reinhardtii GB_DAA00908 GB_BK000554.2 GB_JN903984 GB_BK000554 GB_BK000554 SAG.38.88_KM462865 SAG.38.88_KM462865 Chlorella mirabilis ArM0029B_KF554427 ArM0029B_KF554427 Chlorella sp. 
NC64A_KJ718922 Chlorella variabilis C-27_BAA57886 GB_CHLC27 GB_X13688 Chlorella vulgaris PH660_XX GB_AY177750.1 GB_Z33466 Chlorocladus australasicus GB_KM052785.1 GB_U41176 Chlorococcum oleofaciens H.0880_XX H.0880_XX Chlorodesmis baculifera GB_FJ535837 Chlorodesmis fastigiata UTEX1176_KM462875 UTEX1176_KM462875 Chlorosarcina brevispinosa UTEX1186_HQ246369 BCPJT1VF80_HQ246349 BCPJT1VF8_HQ246315 Chlorosarcinopsis eremi Choricystis sp. CAUP.H.1984_XX GB_KM438409.1 GB_AY762605 SAG.17.98_KM462878 SAG.17.98_KM462878 GB_FM205051 Cladophora pygmaea GB_FM205053 Cladophora rhodolithicola GB_AB971263 Cladophora socialis ARS00769_KM676565 ARS00515_KM677025 Cloniophora spicata GB_AM260447.1 SAG1697_KM020110 Coccobotrys verrucariae C-169_HQ693844 GB_HQ693844.1 C.169_HQ693844 C-169_HQ693844 Coccomyxa sp. HV01432_XX GB_EF107968.1 Codium arabicum.2 HEC15968_KP685825 GB_JQ706329.1 Codium arenicola G.113_XX G.113_XX Codium bursa.2 LT0248_JX463043 GB_EF107984.1 KZN2K4.1_FJ535848 Codium duthieae GB_EF426671.1 Codium edule GB_U08345.1 GWS002780_HQ603326.1 Codium fragile GB_EF426672.1 Codium sp. WA3_AY198125 Collinsiella tuberculata XX_XX XX_XX HV00469_XX Cymopolia barbata HV02661_XX HV02661_XX Dasycladus vermicularis HADL602_XX GWS008836_HQ603328.1 Derbesia sp. 
UTEX2073_HE610155 Desmochloris halophila GB_AJ431571 Desmococcus endolithicus SAG.41.98_KM462885 SAG.41.98_KM462885 Dicloster acuatus SAG.2150_KM462860 SAG.2150_KM462860 Dictyochloropsis reticulata Dunaliella salina CCAP19-18_GQ250046 GB_GQ250046.1 GB_EF473745 CCAP.19.18_GQ250046 CAUP.H7103_KM462887 CAUP.H7103_KM462887 Elliptochloris bilobata XX_KC66149 Enteromorpha ovata SAG.228.1_XX GB_KM438414.1 GB_AF387154 Eremosphaera viridis UTEX975_KM462869 UTEX975_KM462869 Ettlia pseudoalveolaris CCMP1673_AY198123 Eugomontia sacculata HV01202_XX GB_FJ432640.1 GB_AF416389 GB_FJ535847 Flabellia petiolata Floydiella terrestris UTEX1709_ACZ58461 UTEX1709_NC014346 GB_D86498 UTEX1709_GU196268 UTEX1709_GU196268 SAG.28.85_KM462882 SAG.28.85_KM462882 Fusochloris perforata 6969_JF680967 Gayralia sp. SAG.22.88_KM462883 SAG.22.88_KM462883 Geminella minor SAG.20.91_KM462881 SAG.20.91_KM462881 Geminella terricola Gloeotilopsis sterilis UTEX1704_KM462877 SAG888_KM020063 UTEX1704_KM462877 UTEX1704_KM462877 MA24b1_AY278216 Gomontia polyrhiza K3.F3.4_AP012494 K3-F3-4_AP012494 Gonium pectorale HV00565_FJ535858 GB_AY177745.1 GB_AY165786 Halicoryne wrightii HV03191_XX HV01207_XX Halicystis sp. HV00183b_AM049955 GB_FJ624514.1 PH534_AF525556 Halimeda borneensis

H.0330_EF667065 GB_FJ624508.1 H.0330_AF525612 Halimeda copiosa HV00483_EF667057 GB_FJ624496.1 H.0237_AF407244 Halimeda cryptica GB_AJF21965.1 GB_AJF21946.1 GB_AF525549.1 Halimeda cylindracea LPT0057_XX GB_AB038488.1 SOC299_AF407254 Halimeda discoidea.ip H.0125_EF667058 GB_FJ624498.1 H.0125_AF407245 Halimeda fragilis.1 WLS184.0_EF667059 GB_FJ624499.1 H.0025_AF525578 Halimeda micronesica WA1.14B_AY454417 Wa14_AY198122 Halochlorococcum moorei GB_DQ398104 GB_DQ398104 Helicosporidium sp. UTEXEE124_KM464719.1 UTEXEE52_HQ317296.1 GB_FJ648517 Hemichloris antarctica UTEX2012_FN563076 Ignatius tetrasporus A_KM609 GB_JN102134.1 GB_JN102133 Jaoa prasina CAUPH8102_HM563744 Jenufa minuta Koliella corcontica SAG.24.84_XX SAG.24.84_XX GB_AJ306536 SAG.24.84_KM462874 SAG.24.84_KM462874 UTEX339_KM462868 UTEX339_KM462868 Koliella longiseta GWS004830_HQ6105 Kornmannia leptoderma Leptosira terrestris UTEX333_ABO69293 GB_EF506945.1 GB_Z28973 UTEX333_EF506945 UTEX333_EF506945 Lobosphaera incisa CAUP.H.4301_XX CAUP.H.4301_XX GB_AY762602 SAG.2007_KM462871 SAG.2007_KM462871 NIES1824_KM462870 NIES1824_KM462870 Marsupiomonas sp. SAG.12.88_KM462888 SAG.12.88_KM462888 Marvania geminata RCC299_FJ858267 RCC299_FJ858267 Micromonas sp. Microthamnion kuetzingianum CAUP.J.1201_XX GB_KM438427.1 GB_AB488588 UTEX318_KM462876 UTEX318_KM462876 Monomastix sp. OKE-1_ACK36861 GB_217314511_gb_FJ493497.1_ GB_FJ493496 OKE.1_FJ493497 OKE-1_FJ493497 GWS003626_HQ610262 GWS003626_HQ603497 LYGJM_HQ850570 Monostroma sp. 
SAG21114_HQ902932 SAG21114_HQ902940 Muriella zofingiensis Myrmecia israelensis UTEX1181_723456786 UTEX1181_723456786 UTEX1181_KM462861 UTEX1181_KM462861 CAUP.D802_KM462873 CAUP.D802_KM462873 Neocystis brevis NIES.252_KJ746600 NIES.252_KJ746600 Nephroselmis astigmatica Nephroselmis olivacea NIES484_AF137379 NIES484_AF137379 GB_FN562436 NIES.484_AF137379 GB_AF137379 MA1.8d1_AY454406 Ochlochaete hystrix Oedogonium cardiacum SAG575-1b_ACC97263 GB_EU677193.1 GB_U83133 SAG.575.1b_EU677193 GB_EU677193 Oltmannsiellopsis viridis NIES360_ABB81968 GB_DQ291132.1 GB_FN562431 NIES.360_DQ291132 GB_DQ29113 Oocystis solitaria SAG83.80_ACQ90812 GB_FJ968739.1 GB_AF228686 SAG.83.80_FJ968739 SAG.83.80_FJ968739 GB_GU119643.1.1399 Ostreobiaceae sp. GB_GU119844.1.1398 Ostreobiaceae sp.2 GB_GU119848.1.1403 Ostreobiaceae sp.3 GB_GU119621.1.1408 Ostreobiaceae sp.4 GB_GU119562.1.1419 Ostreobiaceae sp.5 GB_FJ203420.1.1418 Ostreobiaceae sp.6 GB_FJ203501.1.1406 Ostreobiaceae sp.7 SAG699_XX GB_AY004765.1 SAG699_XX Ostreobium quekettii H.0754_XX GB_FJ535853.1 Ostreobium sp. Ostreococcus tauri OTTH0595_CAL36350 GB_KC990831.1 GB_Y15841 OTTH0595_CR954199 OTTH0595_CR954199 SAG.7.90_KM462866 SAG.7.90_KM462866 Pabia signiensis Parachlorella kessleri SAG211-11g_ACQ90978 GB_FJ968741.1 GB_X56105 SAG.211.11g_FJ968741 SAG.211.11g_FJ968741 SAG.18.84_KM462879 SAG.18.84_KM462879 Paradoxia multiseta HV01770_XX GB_AY177741.1 GB_Z33471 Parvocaulis parvulus Pedinomonas minor UTEXLB1350_ACQ90891 GB_FJ968740.1 GB_JN592588 UTEX.LB1350_FJ968740 UTEX-LB1350_FJ968740 SAG.42.84_KM462867 Pedinomonas tuberculata YPF-701_AB561078 Pedinophyceae sp. 
GB_FJ535841.1.1486 Pedobesia simplex HV03128_XX GB_FJ432641.1 H.0349_AF416404 Penicillus capitatus UTEX143_AY454403 GB_AF387105.1 UTEX1423_AY303589 percursa KMP01309b_KJ4119 Phaeophila dendroides Picocystis salinarum CCMP1897_AB561082 GB_AB491633.1 GB_FR865648 CCMP1897_KJ746599 CCMP1897_KJ746599 Planctonema lauterborni SCCAP.K0187_XX GB_KM438432.1 GB_AF387148 SAG.68.94_KM462880 SAG.68.94_KM462880 NIES.1363_JX977846 NIES-1363_JX977846 Pleodorina starri Prasinococcus sp. MBIC11011_AB561084 GB_AB491660.1 GB_AB058384 CCMP1194_KJ746597 CCMP1194_KJ746597 CCMP1220_KJ746598 CCMP1220_KJ746598 Prasinoderma coloniale CCMP1205_AB561083 GB_AB491624.1 CCMP1205_U40921 CCMP1205_KJ746601 sp. P65_KF993457 GALW015724_JQ669724 GB_EF200532 Prasiola crispa Prasiolopsis sp. XX_XX GB_KM464713.1 GB_AY762601 SAG.84.81_KM462862 SAG.84.81_KM462862 GWS00668_HQ61075 GB_HQ603507.1 18S_DQ821517 Protomonostroma undulatum SAG.263.11_KJ001761 GB_KJ001761 Prototheca wickerhami Pseudendoclonium akinetum UTEX1912_AAV80665 UTEX1912_AAV80617 GB_DQ011230 UTEX.1912_AY835431 UTEX1912_AY835431 CAUP.H.1990_XX GB_KM438438.1 GB_X63520 Pseudochlorella pringsheimii SAG.1.80_KM462886 SAG.1.80_KM462886 Pseudochloris wilhelmii TS64_FJ432660 GB_FJ432649.1 Pseudochlorodesmis abbreviata HV01204_FJ432656 GB_FJ432643.1 HV01204_XX Pseudochlorodesmis sp.1 CAUP.H.102_XX GB_KM438443.1 GB_FN298926 Pseudococcomyxa simplex JH1_AM909695 GB_AM909690.1 Pseudocodium devriesii DML58944_FJ607676 GB_AM909692.1 NSF.I23_FJ432631 Pseudocodium floridanum KZNb2250_FJ607679 GB_AM909693.1 KZNb2242_FJ432632 Pseudocodium natalense HV03339_LK0451 Pseudoderbesia sp. 
UTEX58_HQ292755 UTEX58_HM770959 UTEX57_HM852441 Pseudomuriella engadinensis UTEX1445_AY4544 Pseudoneochloris marina NIES626_AB561080 GB_U30281.1PCU30281 GB_AJ010407 Pterosperma cristatum Pycnococcus provasolii CCMP1203_ACK36845 GB_FJ493498.1 GB_X91264 CCMP.1203_FJ493498 GB_FJ493498 Pyramimonas parkeae CCMP726_ACJ71140 GB_FJ493499.1 GB_FN562443 CCMP.726_FJ493499 GB_FJ493499 GB_FJ535843 Rhipidosiphon javensis PITI044_JQ082492 GB_JQ082482.1 G.453_XX Rhipilia coppejansii GB_FJ535844 Rhipilia crassa

HV00788_FJ432658 GB_FJ432646.1 HV00788_FJ432633 Rhipilia nigrescens HEC10439_XX G.466_XX HEC10439_XX Rhipiliopsis gracilis ASA001_XX ASA001_XX Rhipiliopsis howensis TZ0333_XX TZ0333_XX Rhipiliopsis madagascariensis H.0891_XX H.0891.p2_XX H.0891_XX Rhipiliopsis peltata DML52138_XX DML51973_FJ432647 DML51973_XX GB_FJ535845 Rhipiliopsis profunda DML67755_XX DML68726_XX GB_AF416386 Rhipiliopsis reticulata DML68740_XX DML68740_XX Rhipiliopsis stri GB_FJ535846 Rhipocephalus phoenix RNtenuis7_JQ30995 Ruthnielsenia tenuis UTEX393_ABD48278 UTEX393_ABD48257 GB_AJ249515 Scenedesmus obliquus Schizomeris leibleinii UTEX-LB1288_HQ700713 UTEX-LB1288_HQ700713 GB_AF182820 UTEX.LB1228_HQ700713 UTEX.LB1228_HQ700713 GWS003854_HQ61078 Spongomorpha aeruginosa Stichococcus sp. CAUP.J.1302_XX GB_KM438447.1 GB_DQ275461 UTEX176_KM462864 UTEX176_KM462864 Stigeoclonium helveticum UTEX441_ABF60202 GB_DQ630521.1 GB_U83131 UTEX.441_DQ630521 UTEX441_DQ630521 GB_EF113476.1 GB_U41175 Tetracystis aeria NIES2432_AB561081 GB_U05039 Tetraselmis sp. SAG219-1d_EU123976 GB_EU123967.1 GB_EU123942 Trebouxia aggregata MX-AZ01_JX402620 GB_JX402620.1 MX.AZ01_JX402620 Trebouxiophyceae sp. GB_KM464717 GB_KM464712.1 SAG20.94_ KM020077 Trentepohlia annulata GB_EF113480.1 SAG10380_KM020157 Trochiscia hystrix Tydemania expeditionis FL1151_LN810505 FL1151_LN810505 HV00873_FJ432634 FL1151_LN810505 FL1151_LN810505 HV02674_XX HV02674_XX H.0415_AF407270 Udotea flabellum UTEX745_AY45444 GB_AF499683.1 112011_JX491158 Ulothrix zonata GB_AB561079 GB_AB097621.1 GB_AJ000040 Ulva arasakii GWS023543_JN029306 GB_AY422565 GB_DQ286547 Ulva fasciata SAS.06035_KJ617036 Ulva shanxiensis UNA00071828_KP720616 UNA00071828_KP720616 Ulva sp. GWS01793_KM55060 Ulvaria sp. UTEXB351_JQ30993 Ulvella endozoica B_KM608 HB1306_KM226205 HB1306_KM226210 Ulvella sp. RN18981_JQ3099 Ulvella waernii GWS018246_JN029346 C14_AB426255 Umbraulva japonica GWS03894_JN09359 GB_AB097612.1 Umbraulva sp. 
GWS006374_HQ610441 GWS006374_HQ603676 Park1_AY476819 Urospora wormskioldii UTEX2908_ACY06012 GB_EU755275.1 GB_AB542923 Volvox carteri Watanabea reniformis CAUP.H.1932_XX GB_KM464714.1 GB_X73991 SAG.211.9b_KM462863 SAG.211.9b_KM462863 Xylochloris irregularis CAUP.H.7801_XX GB_KM438451.1 GB_EU105209 CAUP.H7801_KM462872 CAUP.H7801_KM462872

Supplementary Figures:

The supplementary figures below can be found in the supplementary materials online: http://www.nature.com/articles/srep31508#supplementary-information

Or at the following Dropbox folder: https://www.dropbox.com/sh/g9thuyr5ohlqzc6/AACG1ZKtQw45l85BNwkv-ohka?dl=0

Supplementary Figure S1 – Maximum Likelihood phylogeny including tufA OTUs with bootstrap values.

Supplementary Figure S2 – Maximum Likelihood phylogeny including 16S rDNA OTUs with bootstrap values.

Supplementary Figure S3 – Maximum Likelihood phylogeny including 18S rDNA OTUs with bootstrap values.

Supplementary Figure S4 – Maximum Likelihood phylogeny including 23S rDNA OTUs with bootstrap values.


References:
Caporaso JG, Bittinger K, Bushman FD, Desantis TZ, Andersen GL, Knight R. (2010a). PyNAST: A flexible tool for aligning sequences to a template alignment. Bioinformatics 26: 266–267.
Caporaso J, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. (2010b). QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7: 335–336.
Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, et al. (2012). Ultra-high-throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME Journal 6: 1621–1624.
Cremen C, Huisman JM, Marcelino VR, Verbruggen H. (2016). Taxonomic revision of Halimeda in southwestern Australia. Australian Systematic Botany 29: 41–54.
Edgar RC. (2013). UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10: 996–998.
Fama P, Wysor B, Kooistra WHCF, Zuccarello GC. (2002). Molecular phylogeny of the genus Caulerpa (Caulerpales, Chlorophyta) inferred from chloroplast tufA gene. Journal of Phycology 38: 1040–1050.
Gilbert JA, Jansson JK, Knight R. (2014). The Earth Microbiome project: successes and aspirations. BMC Biology 12: 69.
Katoh K, Misawa K, Kuma K, Miyata T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059–3066.
Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, et al. (2013). Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Research 41: e1.
Magoč T, Salzberg SL. (2011). FLASH: Fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27: 2957–2963.
Porazinska DL, Giblin-Davis RM, Faller L, Farmerie W, Kanzaki N, Morris K, et al. (2009). Evaluating high-throughput sequencing as a method for metagenomic analysis of nematode diversity. Molecular Ecology Resources 9: 1439–1450.
Rohland N, Reich D. (2012). Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Research 22: 939–946.
Schmieder R, Edwards R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27: 863–864.
Sherwood AR, Presting GG. (2007). Universal primers amplify a 23S rDNA plastid marker in eukaryotic algae and cyanobacteria. Journal of Phycology 43: 605–608.
Steven B, McCann S, Ward NL. (2012). Pyrosequencing of plastid 23S rRNA genes reveals diverse and dynamic cyanobacterial and algal populations in two eutrophic lakes. FEMS Microbiology Ecology 82: 607–615.
Wang Q, Garrity GM, Tiedje JM, Cole JR. (2007). Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology 73: 5261–5267.


CHAPTER 3 DIVERSITY AND STABILITY OF CORAL ENDOLITHIC MICROBIAL COMMUNITIES AT A NATURALLY HIGH pCO2 REEF

The health and functioning of reef-building corals depend on a balanced association with prokaryotic and eukaryotic microbes. The coral skeleton harbours numerous endolithic microbes, but their diversity, ecological roles and responses to environmental stress, including ocean acidification, are not well characterized. This study tests whether pH affects the diversity and structure of prokaryotic and eukaryotic algal communities associated with skeletons of Porites sp. using targeted amplicon (16S rRNA gene, UPA and tufA) sequencing. We found that the composition of endolithic communities in the massive coral Porites sp. inhabiting a naturally high pCO2 reef (avg. pCO2 811 µatm) is not significantly different from that of corals inhabiting control sites (avg. pCO2 357 µatm), suggesting that these microbiomes are less disturbed by ocean acidification than previously thought. Possible explanations may be that the endolithic microhabitat is highly homeostatic, or that the endolithic microorganisms are well adapted to a wide pH range. Some of the microbial taxa identified include nitrogen-fixing bacteria (Rhizobiales and cyanobacteria), algicidal bacteria in the phylum Bacteroidetes, symbiotic bacteria in the family Endozoicomoniaceae, and endolithic green algae, considered the major microbial agent of reef bioerosion. Additionally, we test whether host species has an effect on the endolithic community structure. We show that the endolithic community of massive Porites sp. is substantially different from, and more diverse than, those found in skeletons of the branching species Seriatopora hystrix and Pocillopora damicornis. Our results reveal a highly diverse and structured microbial community in coral skeletons that is likely to be resilient to ocean acidification.

3.1 Introduction

Ocean acidification (OA) is predicted to threaten the persistence of coral reefs by affecting the balance between constructive forces (calcification and growth of reef builders) and destructive forces (bioerosion and carbonate dissolution) (Tribollet 2008; Andersson & Gledhill 2013). Acidification lowers the saturation state of calcium carbonate (CaCO3), making it more difficult for calcifying organisms, such as stony corals, to build their skeletons (Orr et al. 2005; Hofmann et al. 2010). OA has been shown to slow down calcification and cause structural deformities in juvenile corals (Crook et al. 2013; Foster et al. 2016). Some studies, however, indicate that corals are able to regulate the pH at the tissue-skeleton interface, where calcification takes place, mitigating the potential consequences of OA on the calcification process (McCulloch et al. 2012; Venn et al. 2013; Georgiou et al. 2015). Rates of biological dissolution of CaCO3 (bioerosion) tend to increase under low pH conditions, mostly due to an increase in biomass of the boring organisms living inside coral skeletons (Manzello et al. 2008; Tribollet et al. 2009; Crook et al. 2013; Fang et al. 2013; Reyes-Nivia et al. 2013; Enochs et al. 2016), potentially resulting in a shift from a net reef accretion condition to one of net erosion (Andersson & Gledhill 2013).

The skeletons of live and dead corals harbour bacteria, fungi, sponges and an abundant population of limestone-boring algae, all having important roles in the reef’s CaCO3 budget (Le Campion-Alsumard et al. 1995; Tribollet 2008; Verbruggen & Tribollet 2011). For example, the green alga Ostreobium can be responsible for 70–90% of carbonate dissolution within dead corals, eroding as much as 1 kg of reef carbonate per m² per year (Tribollet 2008b; Grange et al. 2015). Green algal biomass in live coral skeletons exceeds Symbiodinium biomass in coral tissues by about 16 times (Odum & Odum 1955), making the limestone attractive to grazers and further increasing bioerosion (Chazottes et al. 1995). However, endolithic algae also protect corals from high light stress (Yamazaki et al. 2008) and provide vital nutrients to corals, potentially extending the time they can survive without Symbiodinium during bleaching events (Schlichter et al. 1995; Fine & Loya 2002). Endolithic algae have exceptionally high levels of cryptic diversity (Del Campo et al. 2016; Marcelino & Verbruggen 2016; Sauvage et al. 2016), and although it is known that their biomass increases substantially upon acidification and warming (Tribollet et al. 2009; Reyes-Nivia et al. 2013), it is not known whether different cryptic species increase in relative abundance disproportionately.

The endolithic community, along with the coral host and its other symbionts, constitutes the coral holobiont (Rohwer et al. 2002). The responses of the coral microbiome, including both prokaryotic and eukaryotic members, to acidification have gained attention as we continue to uncover vital roles played by the microbiome in holobiont health and resilience (Bourne et al. 2009; Sharp & Ritchie 2012; Krediet et al. 2013; Blackall et al. 2015; Bourne et al. 2016). Because the ocean pH naturally changes throughout seasons, along depth gradients, with productivity and other biological factors, marine microbial communities are expected to be sufficiently plastic to endure the predicted levels of ocean acidification (Joint et al. 2011; O'Brien et al. 2016). This notion is supported by some studies that demonstrated no shift in the coral prokaryotic community when subjected to high CO2 partial pressure (pCO2) and therefore reduced seawater pH conditions (Meron et al. 2012; Webster et al. 2016). However, other studies have demonstrated that a reduced seawater pH can lead to the loss of Symbiodinium (coral bleaching) and trigger shifts from a healthy microbiome composition to a microbial community typically associated with diseased corals (Anthony et al. 2008; Vega Thurber et al. 2009; Meron et al. 2011; Webster et al. 2013; Morrow et al. 2015). These different responses of a coral’s microbiome to reduced seawater pH may reflect differences in resilience to acidification across coral holobionts, different evolutionary strategies to cope with OA that have evolved in distinct coral taxa, or different experimental setups used in the various studies.

Reefs in the Milne Bay Province of Papua New Guinea (PNG) are in close proximity to volcanic seeps (expelling ~99% pure CO2) and constitute a good model system to study the impacts of acidification in situ on the microbial community associated with corals. Both coral species composition and the prokaryotic microbial community associated with coral tissue and mucus differ between high pCO2 seep sites and nearby control sites with ambient pCO2 (Fabricius et al. 2011; Morrow et al. 2015). However, little is known about the coral endolithic communities and how they may change under various seawater pH conditions. Previous studies have screened 16S rRNA gene clone libraries and demonstrated contrasting results, with significant effects of OA on community composition within the skeleton in an experimental system (Meron et al. 2011) but no significant changes in corals transplanted to a natural CO2 seep site (Meron et al. 2012). One limitation of the 16S rRNA gene marker is that it underestimates the diversity of eukaryotic algae (Marcelino & Verbruggen 2016), and as a consequence, the major microbial agents of bioerosion have been overlooked in these studies.

Here we use high-throughput amplicon sequencing to investigate the effects of ocean acidification on the diversity and structure of endolithic microbial communities of corals inhabiting a high pCO2 site in PNG. Our goals are to: 1) test whether the community composition of prokaryotes and photosynthetic eukaryotes (assessed with the 16S rRNA gene, UPA and tufA markers) within the skeletons of massive colonies of Porites sp. differs between high pCO2 sites and nearby control sites where pCO2 is not affected by the volcanic seeps; 2) compare the endolithic communities of Porites sp. and two branching coral species (Seriatopora hystrix and Pocillopora damicornis) to investigate whether the microbiome in coral skeletons varies between host species; and 3) describe the endolithic community diversity found in corals of Papua New Guinea and discuss the potential functional roles of this microbiome under ocean acidification.

3.2 Materials and Methods

Field Site Sampling

Samples of Porites sp. (n = 6 per site and month) were collected in April and November 2014 at two high pCO2 (seep) and control site pairs within the D’Entrecasteaux Islands, Milne Bay Province, Papua New Guinea. Two sites were surveyed and two field excursions were made to increase the number of replicates of this study. High pCO2 samples were collected at Illi Illi (Upa-U) Seep (09.82425S, 150.81789E) and Dobu Seep (09.73646S, 150.86894E), and control samples at nearby ambient pCO2 sites not exposed to elevated pCO2 conditions (Illi Illi Control, 09.82806S, 150.82028E and Dobu Control, 09.75211S, 150.85410E) (Fabricius et al. 2011; Uthicke et al. 2013). High pCO2 and control sites were ~500 m apart from one another. Samples of the branching corals Seriatopora hystrix (n = 3 at each site) and Pocillopora damicornis (n = 3 at each site) were only collected in April 2014 at Illi Illi seep and control sites (same as above). Seawater carbonate chemistry varies in response to bubble activity and water motion at the seep sites; thus, at Illi Illi seep, corals experience a pH range (defined here as the 5th and 95th percentiles) of 7.28–8.01 (avg. pCO2 624 µatm) and at Dobu seep a pH range of 7.08–7.99 (avg. pCO2 998 µatm). At Illi Illi control site the pH ranges from 7.91 to 8.09 (avg. pCO2 346 µatm) and at Dobu control the pH ranges from 7.91 to 8.10 (avg. pCO2 368 µatm) (Table 3.1, Fabricius et al. 2014), which is within the range of future predictions for the year 2100 ( et al. 2010).

Table 3.1. Seawater chemistry at the study sites. Medians, with 5th and 95th percentiles in parentheses, of pCO2, pH, total alkalinity (TA), dissolved inorganic carbon (DIC), salinity and aragonite saturation state (Ω arag) are given. Source: Fabricius et al. 2014.

Locality              pCO2 (µatm)       pH                 TA (µmoles kg-1)   DIC (µmoles kg-1)  Salinity           Ω arag
Dobu control          368 (279,558)     8.01 (7.91,8.10)   2235 (2221,2293)   1942 (1924,1924)   34.8 (33.8,36.0)   4.16 (3.07,4.35)
Dobu high pCO2        998 (423,3541)    7.72 (7.08,7.99)   2295 (2260,2326)   2147 (1962,1962)   34.8 (34.0,36.0)   2.1 (0.65,3.75)
Illi Illi control     346 (302,653)     7.98 (7.91,8.09)   2261 (2225,2326)   1982 (1947,1947)   34.8 (34.5,36.0)   4.08 (2.82,4.57)
Illi Illi high pCO2   624 (410,1564)    7.81 (7.28,8.01)   2308 (2240,2378)   2049 (1964,1964)   34.8 (33.8,35.9)   2.89 (1.39,3.59)
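The chapter summary quotes average pCO2 values of 811 µatm (high pCO2) and 357 µatm (control). A minimal sketch, assuming those figures are simple means of the two per-site central values in Table 3.1, reproduces them. The dictionary layout and the function name `condition_average` are illustrative and not part of the original analysis.

```python
from statistics import mean

# Central pCO2 values (µatm) per site, copied from Table 3.1.
site_pco2 = {
    ("Dobu", "control"): 368,
    ("Dobu", "high"): 998,
    ("Illi Illi", "control"): 346,
    ("Illi Illi", "high"): 624,
}


def condition_average(values, condition):
    """Average the per-site values for one condition ('control' or 'high')."""
    return mean(v for (_site, cond), v in values.items() if cond == condition)


high_avg = condition_average(site_pco2, "high")        # (998 + 624) / 2
control_avg = condition_average(site_pco2, "control")  # (368 + 346) / 2
print(high_avg, control_avg)
```

Under this assumption the two conditions differ by a factor of roughly 2.3 in average pCO2.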

Coral fragments were collected using bone cutters or a hammer and chisel and placed into individual sections within a plastic tackle box, which allowed for water flow whilst underwater. After returning to the boat, samples were immediately placed into flowing seawater sourced directly from the collection site. Large pieces of Porites sp. were chipped into smaller fragments, rinsed thoroughly with sterile 0.02 µm-filtered seawater and placed in 50 mL Falcon tubes with RNAlater (Ambion). Samples were kept in a cooler with ice until returned to the laboratories at the Australian Institute of Marine Science (AIMS) where they were processed.

Fragments were removed from RNAlater and soaked in 0.2 µm-filtered calcium- and magnesium-free seawater (CMFSW; 0.45 M NaCl, 10 mM KCl, 7 mM Na2SO4, 0.5 mM NaHCO3 and Milli-Q water) for ~10 minutes at room temperature to remove loosely attached cells (Esteves et al. 2016). Tissues were removed into the CMFSW using an air gun fitted with a sterile tip. Skeletons with tissues removed were placed back into the original RNAlater collection buffer and stored at -80°C until shipment to the University of Melbourne, where DNA was isolated from the endolithic community.

DNA isolation, library preparation and sequencing

Total DNA was isolated from coral skeletons using the Wizard Genomic DNA Purification Kit (Promega) following the manufacturer’s instructions for plant DNA with the exception of an initial 3 hr incubation in the first extraction buffer. Amplified DNA products for library preparation were obtained with a two-step process described by Marcelino and Verbruggen (2016). During the first PCR step, three metabarcoding markers were amplified: the 16S rRNA gene (Klindworth et al. 2013); the universal plastid amplicon (UPA), which is a fragment of the 23S rRNA gene (Presting 2006; Sherwood & Presting 2007); and the elongation factor Tu (tufA), which targets green algae (Ulvophyceae) (Fama et al. 2002; Marcelino & Verbruggen 2016). During the second PCR step, barcodes and Illumina adapters were attached to both 3’ and 5’ ends of the amplicons. One negative control was performed with each amplification (6 in total, one per marker and per amplification step) and sequenced with the library, even though no DNA was detected in any negative control during quantification. Two mock ‘blank’ extractions were also performed alongside the DNA isolation of the coral samples, carried through the amplification process and sequenced to further control for possible contamination. All DNA isolation and PCR preparation were carried out inside a dedicated dead-air box (PCR workstation) sterilized with UV light for 15 min prior to each use. Libraries were quantified using the Quant-It PicoGreen reagent (Invitrogen) and pooled with samples from another project. The library was sequenced using the Illumina MiSeq platform (2×300 bp paired end reads) at the Centre for Translational Pathology, University of Melbourne. Further details about the primers and library preparation are provided in Supplementary materials.

Data processing

The MiSeq run yielded one file containing all amplicons per sample, which were separated into different files based on the primer sequences. The 3’ ends of reads were trimmed to improve consensus quality; forward and reverse reads were merged using FLASH (Magoc & Salzberg 2011) and sequences having average quality scores lower than 35 or lengths shorter than a threshold (350 bp for the 16S rRNA gene, 320 bp for UPA and 400 bp for tufA) were filtered out using PRINSEQ (Schmieder & Edwards 2011). Sequences were clustered into Operational Taxonomic Units (OTUs) using UPARSE (Edgar 2013). A similarity threshold of 98% was set for the tufA marker, a threshold near species level for this marker (Sauvage et al. 2016). For the other markers the default threshold of 97% was used. The 16S rRNA gene OTUs were aligned with PyNAST (Caporaso et al. 2010a) while the UPA and tufA datasets were aligned with MAFFT (Katoh et al. 2002). A taxonomic affinity was assigned to the OTUs using the Naïve Bayesian Classifier (RDP) implemented in QIIME v.1.9.1 (Wang et al. 2007; Caporaso et al. 2010b). The Greengenes v.13.8 dataset (DeSantis et al. 2006) was used to classify the 16S rRNA gene sequences, and custom-made reference datasets were used for tufA and UPA. The resulting OTU table went through a filtering process to remove negative controls and rare OTUs (i.e. OTUs with fewer than 5 reads across all samples, and OTUs from samples where they are present with 2 or fewer reads). OTUs were also filtered based on their taxonomic classification to focus on the taxonomic groups that each marker best characterises: chloroplast sequences were excluded from the 16S rRNA gene dataset and bacterial sequences were excluded from the tufA dataset. Further details about the data processing pipeline are provided in the Supplementary materials.
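The rare-OTU filter described above can be sketched in a few lines. This is a minimal illustration and not the pipeline used in the study: the table layout (a dictionary of dictionaries), the function name `filter_otu_table`, the order in which the two filters are applied, and the reading of "remove negative controls" as dropping the control columns are all assumptions.

```python
def filter_otu_table(table, controls, min_total=5, min_per_sample=3):
    """Filter an OTU table given as {otu_id: {sample_id: read_count}}.

    Assumed order of operations (not stated in the text):
    1. drop the negative-control columns,
    2. zero out an OTU in samples where it has 2 or fewer reads,
    3. discard OTUs left with fewer than 5 reads across all samples.
    """
    filtered = {}
    for otu, counts in table.items():
        kept = {s: n for s, n in counts.items() if s not in controls}
        kept = {s: (n if n >= min_per_sample else 0) for s, n in kept.items()}
        if sum(kept.values()) >= min_total:
            filtered[otu] = kept
    return filtered


# Toy counts, invented for illustration only.
toy_table = {
    "OTU_1": {"P1": 40, "P2": 2, "neg": 0},
    "OTU_2": {"P1": 2, "P2": 2, "neg": 1},
    "OTU_3": {"P1": 3, "P2": 1, "neg": 0},
}
filtered = filter_otu_table(toy_table, controls={"neg"})
print(filtered)
```

In this toy table only OTU_1 survives: OTU_2 never exceeds 2 reads in any sample, and OTU_3 retains only 3 reads in total.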

Statistical analysis

There were no significant differences related to the time of collection (see Supplementary materials), therefore all Porites sp. samples (n = 24) were used to investigate the effects of pCO2 on endolithic communities associated with this coral genus. Rarefaction curves of the number of observed OTUs per number of reads were constructed by randomly subsampling the reads in QIIME, allowing us to set a threshold for each marker where the curve reaches saturation (i.e. a plateau in the rarefaction curve), which was 2,200 reads in the 16S rRNA gene, 1,400 in the tufA and 7,000 in the UPA dataset (Supplementary Figure 3.1). Samples containing fewer reads than the rarefaction threshold were excluded, resulting in 20 samples in the 16S rRNA gene, and 22 samples in the UPA and tufA datasets (Supplementary Table S1). Alpha diversity indices (Chao1 and observed OTUs) were calculated using QIIME (Caporaso et al. 2010b). The relative abundances of individual OTUs and taxonomic groups were tested for significant differences between samples with a Kruskal–Wallis test (for OTUs) and ANOVA (for taxon groups) using QIIME (Caporaso et al. 2010b). Principal coordinate analyses (PCoA) on UniFrac distance matrices were also performed using QIIME (Caporaso et al. 2010b; Lozupone et al. 2011) and the results visualized using the ggplot2 package in R (Wickham 2009). To further investigate the distribution of green algal lineages, a maximum likelihood phylogeny was built with the green algal tufA OTUs together with reference sequences (from GenBank) using a GTR+gamma model of sequence evolution in RAxML v.8.2.6 (Stamatakis 2006). OTUs present in fewer than 3 samples were excluded, and the relative abundances of the remaining OTUs were normalized with cumulative sum scaling (Paulson et al. 2013) and visualized alongside a phylogenetic tree using the R package phytools (Revell 2012).
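Rarefying to a fixed depth amounts to drawing that many reads at random, without replacement, from each sample and discarding samples that are too shallow. The sketch below illustrates the idea in plain Python; it is not the QIIME implementation, and the function name `rarefy`, the fixed seed and the toy counts are assumptions made for the example.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample `depth` reads without replacement from one sample.

    counts: {otu_id: read_count}. Returns a new count dictionary, or None
    when the sample holds fewer reads than `depth` (such samples were
    excluded from the analysis).
    """
    if sum(counts.values()) < depth:
        return None
    # Expand the counts into one entry per read, then draw `depth` of them.
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {}
    for otu in picked:
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


# Toy sample, rarefied to the 16S rRNA gene threshold of 2,200 reads.
toy_sample = {"OTU_a": 1500, "OTU_b": 900, "OTU_c": 100}
rarefied = rarefy(toy_sample, depth=2200)
too_shallow = rarefy({"OTU_a": 10}, depth=2200)
```

Repeating the draw at increasing depths and counting the OTUs observed each time gives the rarefaction curve from which the plateau was read.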

A discriminant analysis (DA) was performed with the R package candisc (Friendly & Fox 2009) to determine which OTUs contribute most to the observed differences between samples at high pCO2 and control sites. A description of the rationale behind this method and examples applied in microbial ecology can be found in Paliy & Shankar (2016). To avoid the errors imposed by having many zero OTU abundances in the dataset, we excluded from the analysis OTUs present in fewer than 15 (16S rRNA gene), 4 (tufA) and 6 (UPA) samples. These thresholds were based on the data required to compute eigenvalues with R’s function eigen, and aimed at minimizing the number of zeros (and therefore deviations from normality) while maximizing the number of OTUs analysed. Cross-validation of the discriminant analysis was performed by leaving samples out of the analysis one by one and then assigning them to a high pCO2 or control category based only on their community composition. This was performed using the lda function from the R package MASS (Venables & Ripley 2002) and a confusion matrix based on the DA results.
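The leave-one-out procedure can be illustrated with a short sketch. To keep it self-contained, a nearest-centroid classifier stands in for the linear discriminant function fitted with lda in R, so this shows the cross-validation logic only and not the discriminant analysis itself. All function names and the toy abundance values are invented for illustration.

```python
import math


def centroid(rows):
    """Column-wise mean of a list of equal-length abundance vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid(train_x, train_y, x):
    """Assign x to the class whose training centroid is closest (Euclidean)."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        c = centroid([r for r, y in zip(train_x, train_y) if y == label])
        d = math.dist(c, x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def leave_one_out(xs, ys):
    """Hold out each sample in turn; return {(true, predicted): count}."""
    confusion = {}
    for i in range(len(xs)):
        pred = nearest_centroid(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i])
        confusion[(ys[i], pred)] = confusion.get((ys[i], pred), 0) + 1
    return confusion


# Toy relative abundances of two OTUs in six samples (not thesis data).
xs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
ys = ["control"] * 3 + ["high"] * 3
confusion = leave_one_out(xs, ys)
accuracy = sum(n for (t, p), n in confusion.items() if t == p) / len(xs)
```

The toy groups are deliberately well separated, so every held-out sample is assigned correctly. A confusion matrix whose off-diagonal counts are about as large as the diagonal ones indicates near-random predictive power.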

The number of samples of the branching species (S. hystrix and P. damicornis) did not allow statistical analyses to test for differences between high pCO2 and control sites, although it did permit a comparison of the endolithic communities associated with the different coral hosts (Supplementary Table 3.2). To investigate the community structure related to host species, a rarefaction threshold of 707 reads for the 16S rRNA gene, 713 for the tufA and 3257 for the UPA marker was used, allowing the inclusion of a larger number of samples in the analysis. Samples with lower sequencing depth were excluded, resulting in 35 samples in the 16S rRNA gene and UPA datasets and 27 in the tufA datasets (Supplementary Table S1). Alpha diversity, Kruskal–Wallis test (for OTUs), ANOVA (for taxon groups) and PCoA were performed on this dataset as previously described, but here, samples from different pCO2 conditions from conspecific host species were combined in order to investigate the community structure purely associated with coral host species.

3.3 Results

pCO2 conditions have little influence on the Porites sp. endolithic microbiomes

A total of 6,584,274 sequence reads were recovered for the samples analysed here, 4,405,336 belonging to Porites sp. samples. After stringent filtering, a total of 119,367 (16S rRNA gene), 109,948 (tufA) and 393,816 (UPA) reads were analysed. The number of OTUs in each dataset is reported in Table 3.2. Alpha diversity statistics, including Chao1 (Chao 1984) and observed OTUs, indicated that species richness of the endolithic communities associated with Porites sp. was not significantly different between high pCO2 and control sites (Table 3.2).

Table 3.2: Diversity indices (mean ± standard deviation) based on the microbiome of Porites sp. skeletons from control and high pCO2 (seep) sites. N = number of samples after rarefaction. Seqs = rarefaction threshold. OTUs = number of OTUs retrieved in each dataset, after quality filtering.

                            Chao1                                       Obs. OTUs
Marker  N    Seqs   OTUs    Control        high pCO2      p-value       Control        high pCO2      p-value
16S     20   2200   890     141.2 ± 52.4   148.5 ± 45.8   0.79          133.6 ± 49.9   140.7 ± 44.8   0.77
tufA    22   1400   59      7.8 ± 3.3      7.4 ± 2.8      0.78          7.7 ± 3.2      7.3 ± 2.8      0.74
UPA     22   7000   164     21.6 ± 7.1     24.4 ± 5.4     0.33          20.6 ± 6.7     23.2 ± 5.1     0.33
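For readers unfamiliar with the Chao1 index (Chao 1984), the sketch below computes it from the read counts of one sample. It uses the classic form, S_obs + F1²/(2·F2), where F1 and F2 are the numbers of OTUs seen exactly once and twice, and falls back to the bias-corrected form when no doubletons are present. Which variant QIIME applied in this analysis is not stated here, so the choice is an assumption, as are the function name and the toy counts.

```python
def chao1(counts):
    """Chao1 richness estimate from a list of per-OTU read counts."""
    observed = sum(1 for n in counts if n > 0)
    singletons = sum(1 for n in counts if n == 1)
    doubletons = sum(1 for n in counts if n == 2)
    if doubletons > 0:
        # Classic estimator: S_obs + F1^2 / (2 * F2)
        return observed + singletons ** 2 / (2 * doubletons)
    # Bias-corrected form with F2 = 0: S_obs + F1 * (F1 - 1) / 2
    return observed + singletons * (singletons - 1) / 2


# Toy sample: 7 observed OTUs, 3 singletons, 2 doubletons.
estimate = chao1([10, 5, 1, 1, 1, 2, 2, 0])
no_doubletons = chao1([4, 1, 1])
```

The estimate exceeds the observed OTU count whenever rare OTUs are present, which is why the Chao1 columns in Table 3.2 are slightly higher than the observed OTU columns.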

Although the relative abundance of some microbial taxa differed between high pCO2 and control sites (Figure 3.1), the differences were not statistically significant, either at the OTU level (Supplementary table S2) or at higher taxonomic levels (Supplementary table S3). Accordingly, principal coordinate analysis did not reveal any pattern between sites with any of the three markers (Figure 3.2). We further investigated whether any of the different phylogenetic lineages in the endolithic algal communities differed in abundance at high pCO2 and control sites. A phylogenetic heatmap of relative abundances (Supplementary Figure 3.2) indicated that phylogenetic relatedness among green algae is not correlated with different abundances in high pCO2 or control sites.

Discriminant analysis (DA) indicated which OTUs maximized the distinction between high pCO2 and control sites (despite there not being any statistically significant differences overall), and where they were most abundant (Supplementary table S4, Supplementary Figure 3.3). In the 16S rRNA gene dataset, members of the order Rhizobiales accounted for most of the variability between sites. In the tufA gene dataset, OTUs of the endolithic alga Ostreobium within clades #3 and #4 were the most informative in distinguishing between high pCO2 and control sites. Finally, within the UPA gene dataset, OTUs related to Cyanobacteria, Alphaproteobacteria and green algae (e.g. Ulvophyceae) accounted for most of the between-site variability. Despite these differences, the predictive power of the DA to assign a sample to the correct bin (high pCO2 or control) based on its composition was poor, nearly random, for all markers (Supplementary table S5). This suggests that even though the relative abundance of some OTUs varied (as suggested by DA), there was no discernible difference in microbial community profiles between high pCO2 and control sites.

Figure 3.1: Relative abundances of the most common microorganisms in coral skeletons of Porites sp. from high pCO2 and control sites. A) biodiversity survey targeting prokaryotes based on the 16S rRNA gene; B) survey of the eukaryotic green algal members of the microbiome based on the tufA marker; C) biodiversity survey using the Universal Plastid Amplicon.

Taxonomic profiling of the Porites sp. endolithic community

The microbial community observed in the skeletons of Porites sp. is highly diverse and variable between samples within pCO2 conditions. Prokaryotic members of the microbiome (observed in the 16S rRNA gene dataset) accounted for most of the species diversity (Table 3.2). The most abundant phylum recovered was Proteobacteria, followed by Bacteroidetes and Archaea (Figure 3.1A). The relative abundance of the nitrogen-fixing order, Rhizobiales (Alphaproteobacteria), was 9.1% ± 4%, and the phylum of green sulphur bacteria, Chlorobi, was 0.4% ± 1%. Members of the Bacteroidetes were twice as abundant at high pCO2 sites (12.1% ± 10% in control versus 23.9% ± 17% in high pCO2 sites), mostly due to a higher abundance within the classes Cytophagia (4.8% ± 7% versus 10.4% ± 9%), Flavobacteria (2.2% ± 2% versus 3.7% ± 3%) and Saprospirae (2.4% ± 5% versus 9.4% ± 15%). We also observed a lower abundance of the Archaeal class Parvarchaea in the high pCO2 site (1.6% ± 0.9% versus 11.6% ± 12% in high pCO2 and control sites, respectively). These differences are, however, not statistically significant according to ANOVA and Kruskal-Wallis tests (Supplementary tables S2 and S3).

While prokaryotes account for most of the diversity, green algae might account for most of the biomass in the skeleton of Porites sp. (see Odum & Odum 1955). The tufA dataset (Figure 3.1B) suggested that the algal community was dominated (64.7% ± 36%) by lineages of the Ostreobiaceae (Chlorophyta, Bryopsidales). Ostreobium clade #1 showed the highest relative abundance (33% ± 40%), followed by Ostreobium clades #4, #3 and #2. While the relative abundance of clade #1 was similar between sites, the relative abundance of the other Ostreobium clades varied substantially – but not significantly – between control and high pCO2 sites. A high abundance of an unclassified group of OTUs belonging to the green algal order Bryopsidales was also observed, particularly in the control site (Figure 3.1B). The tufA primers used here were designed to amplify Bryopsidales (Marcelino and Verbruggen 2016), therefore it is possible that other green algal orders are also abundant but were not detected here.

The prevalence of green algal lineages in the skeletons of Porites sp. was also suggested by the UPA dataset (Figure 3.1C), which shows that 86.8% ± 15% of the reads belong to the green algal order Bryopsidales. There are no UPA reference sequences for Ostreobium clades #1 and #2, therefore possible sequences of these clades might have been classified as clades #3, #4 or “unclassified Bryopsidales” by the RDP classifier, which may explain the differences in the abundances of Ostreobium clades between the tufA and UPA datasets. Differences in the relative abundances between control and high pCO2 sites were minimal (Figure 3.1C).

Figure 3.2: Principal Coordinate Analysis of microbial communities present in limestone skeletons of Porites sp. from high pCO2 and control sites. The analyses were based on weighted UniFrac distance matrices for each metabarcoding marker: A) prokaryotic 16S rRNA gene marker; B) eukaryotic green algae tufA marker; C) Universal Plastid Amplicon marker.

Endolithic communities across different host corals

The prokaryotic endolithic communities of Seriatopora hystrix and Pocillopora damicornis were significantly less diverse than those found in Porites sp., as indicated by Chao1 and observed OTUs indices (Table 3.3). The Kruskal–Wallis test indicated a significant difference (p-values < 0.05) in the relative abundances of certain OTUs belonging to the Endozoicimonaceae family between coral species, with the highest abundance in P. damicornis (Supplementary table S6). The Porites sp. samples had a higher relative abundance of an OTU related to the order Rhizobiales (genus Afifella), and P. damicornis showed a significantly higher abundance of an OTU related to the phylum Bacteroidetes (order Cytophagales; Supplementary table S6). In terms of higher taxonomic levels, the relative abundances of the phyla Planctomycetes, Bacteroidetes, and the bacterial phylum OD1 were significantly different among coral hosts (Figure 3.3A, Supplementary table S7). When analysed with principal coordinate analysis (PCoA), the species-specific differences in the prokaryotic microbiome associated with these hosts were evident: Porites sp. samples clustered together, clearly separated from the two branching species (Figure 3.4A).

The alpha diversity of green algae (i.e. Chao1 and observed OTUs within the tufA dataset) was significantly different between S. hystrix and Porites sp., but no significant differences were observed within the taxonomically broader spectrum of eukaryotic algae amplified with the UPA marker (Table 3.3). The Kruskal–Wallis test suggested no significant differences in the relative abundances of particular OTUs between host species (neither within the tufA nor within the UPA dataset), at least when corrected (Bonferroni) p-values were taken into consideration (Supplementary table S6). After rarefying the sequences, the tufA dataset was reduced to a single P. damicornis sample (Supplementary table S1); therefore we could not test for differences in alpha diversity or relative abundances in the P. damicornis community compared to other host corals. Using ANOVA, we observed a significantly different relative abundance of Ostreobium spp. (order Bryopsidales) among coral hosts within the UPA dataset, but not in the tufA dataset (Figures 3.3B and 3.3C, Supplementary table S7). Seriatopora hystrix had a high and variable (83% ± 57%) relative abundance of endolithic lineages related to the macroalga Halimeda spp. within the tufA dataset, while this group constituted a minimal fraction (0.04% ± 0.2%) of the endolithic community of Porites sp. (Figure 3.3B). No pattern was observed within the PCoA plot of the community composition using the tufA marker (Figure 3.4B). However, the UPA marker, which has more samples of the branching species, shows that Porites sp. samples cluster together and away from S. hystrix and P. damicornis, although two outliers belonging to branching samples cluster with Porites sp. (Figure 3.4C).

Table 3.3: Comparison of endolithic community diversity indices between Porites sp., Seriatopora hystrix and Pocillopora damicornis corals. Significant results are marked with an asterisk.

Group1           Group2           Group1 mean   Group1 std   Group2 mean   Group2 std   t stat    p-value

16S rRNA gene, Chao1
S. hystrix       P. damicornis    29.087        29.494       21.514        15.705       0.507     1.000
S. hystrix       Porites sp.      29.087        29.494       125.296       49.889       -4.363    0.003*
P. damicornis    Porites sp.      21.514        15.705       125.296       49.889       -4.854    0.003*

16S rRNA gene, observed OTUs
S. hystrix       P. damicornis    21.767        19.758       19.083        12.331       0.258     1.000
S. hystrix       Porites sp.      21.767        19.758       101.274       37.282       -4.865    0.003*
P. damicornis    Porites sp.      19.083        12.331       101.274       37.282       -5.138    0.003*

tufA, Chao1
S. hystrix       P. damicornis    2.033         0.858        3.450         0.000        1.168     1.000
S. hystrix       Porites sp.      2.033         0.858        9.606         3.849        -3.264    0.006*
P. damicornis¹   Porites sp.      3.450         0.000        9.606         3.849        -1.531    0.309¹

tufA, observed OTUs
S. hystrix       P. damicornis    1.933         0.736        3.000         0.000        1.024     1.000
S. hystrix       Porites sp.      1.933         0.736        9.330         3.623        -3.388    0.003*
P. damicornis¹   Porites sp.      3.000         0.000        9.330         3.623        -1.673    0.192¹

UPA, Chao1
S. hystrix       P. damicornis    19.353        14.158       13.386        16.038       0.624     1.000
S. hystrix       Porites sp.      19.353        14.158       21.732        6.187        -0.591    1.000
P. damicornis    Porites sp.      13.386        16.038       21.732        6.187        -1.921    0.162

UPA, observed OTUs
S. hystrix       P. damicornis    18.100        12.882       12.567        14.816       0.630     1.000
S. hystrix       Porites sp.      18.100        12.882       19.539        5.572        -0.394    1.000
P. damicornis    Porites sp.      12.567        14.816       19.539        5.572        -1.754    0.327

¹ The tufA dataset has only 1 P. damicornis sample (after rarefaction), therefore the significance cannot be reliably calculated. See Supplementary table S1 for number of samples.
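The Bonferroni adjustment mentioned above multiplies each raw p-value by the number of tests performed and caps the result at 1. The sketch below shows the calculation; the function name and the raw p-values are invented for illustration and are not taken from the supplementary tables.

```python
def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * number of tests, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Three hypothetical raw p-values from three pairwise comparisons.
raw = [0.001, 0.02, 0.5]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```

With three comparisons, a raw p-value of 0.02 is no longer significant at the 0.05 level after adjustment, and any raw value above one third is reported as 1. The p-values of 1.000 in Table 3.3 would be consistent with such capping, although how those values were computed is not stated in this section.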

Figure 3.3: Relative abundances of the most common microorganisms in coral skeletons of Pocillopora damicornis, Seriatopora hystrix and Porites sp. A) Biodiversity survey targeting prokaryotes based on the 16S rRNA gene; B) survey of the eukaryotic green algal members of the microbiome based on the tufA marker; C) biodiversity survey using the Universal Plastid Amplicon.

3.4 Discussion

Our results show that the prokaryotic and eukaryotic microbiomes in the skeletons of Porites sp. are highly diverse but indistinguishable between corals inhabiting naturally high pCO2 reefs and ambient control conditions. Ocean acidification is predicted to affect the coral reef CaCO3 budget and its biological associations (Meron et al. 2011; Andersson & Gledhill 2013; Morrow et al. 2015), and depending on the experiment, endolithic communities were shown to either exacerbate or buffer the effects of these environmental changes (Fine & Loya 2002; Tribollet et al. 2009; Reyes-Nivia et al. 2013). Our results suggest that the composition of endolithic communities, at least in Porites sp., is virtually unaffected by the surrounding high pCO2 water from a natural volcanic seep, and is therefore less likely to be disturbed by OA than previously thought. Although homogeneous between high pCO2 and control sites, the endolithic community is highly diverse and structured among coral host species.

Figure 3.4. Principal Coordinate Analysis of microbial communities present in limestone skeletons of three coral host species collected in high pCO2 and control sites. The analyses were based on weighted UniFrac distance matrices for each metabarcoding marker: A) prokaryotic 16S rRNA gene marker; B) eukaryotic green algae tufA marker; C) Universal Plastid Amplicon marker.

A stable microbiome

The mechanisms influencing the structure of the endolithic microbiome (regardless of variable pCO2 conditions) are currently unknown. We raise here two hypotheses that may explain our results. The "stable habitat" hypothesis assumes that the endolithic environment is highly homeostatic so that pH is maintained inside the skeletons regardless of external changes in the surrounding water. The "tolerant endolith" hypothesis is based on the notion that endolithic microorganisms have a wide pH tolerance range, wider than that of the microbes associated with the tissues and mucus.

The first hypothesis is supported by the ability of some corals to up-regulate the pH at the tissue-skeleton interface, which allows them to calcify and grow even under high pCO2 (McCulloch et al. 2012; Venn et al. 2013; Georgiou et al. 2015). The pH within coral cells remains relatively constant throughout the day (7.05-7.46 units), likely due to membrane transporters that extrude the excess of by-products of photosynthesis and respiration to maintain a stable intracellular pH (Laurent et al. 2013). This process may indirectly create a stable microhabitat within the coral skeleton that is protected from shifting pH in the surrounding seawater (see also Shashar et al. 1997). The observation that seawater impregnated with radioactivity impacted corals' living tissue but did not reach their endolithic zone (Odum & Odum 1955) supports this notion.

One problem with the stable habitat hypothesis is that the pH in the skeletons of Porites compressa can vary daily from 7.7 to 8.5 pH units, mostly due to the by-products of respiration and photosynthesis of the coral and Symbiodinium that are exported to the skeleton (Shashar & Stambler 1992). This daily variation is well above the projections of OA for the near future, which predict a pH drop of 0.4 units by 2100, and up to 0.7 units by 2300 (Raven et al. 2005; Hoegh-Guldberg et al. 2007). Furthermore, the boring mechanism involves a sophisticated control of intracellular pH (associated with calcium pumps and proton counter-transport) in endolithic cyanobacteria (Garcia-Pichel 2006; Garcia-Pichel et al. 2010). Therefore our second hypothesis, that organisms exposed to daily pH fluctuations within the skeleton are adapted to cope with a wide range of pCO2 conditions, may be more accurate. Experimental work and genomic data of endolithic organisms will help to test the tolerant endolith hypothesis. For example, specialization to the low light experienced in the endolithic habitat has been observed in the plastid genome architecture of Ostreobium quekettii (Marcelino et al. 2016) and the presumed pH tolerance may also be reflected in the genomes of endolithic organisms.
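The comparison between the daily pH range inside the skeleton and the projected OA-driven decline can be made concrete on the hydrogen ion scale. Assuming the standard definition pH = -log10[H+], a drop of d pH units corresponds to a 10^d-fold rise in hydrogen ion concentration. The sketch below applies this to the values quoted above; the function name is illustrative.

```python
def fold_change_in_h(delta_ph):
    """Fold increase in [H+] for a pH drop of `delta_ph` units."""
    return 10 ** delta_ph


daily_range = 8.5 - 7.7                      # daily pH swing in the skeleton
daily_fold = fold_change_in_h(daily_range)   # [H+] change over one day
fold_2100 = fold_change_in_h(0.4)            # projected drop by 2100
fold_2300 = fold_change_in_h(0.7)            # projected drop by 2300
print(round(daily_fold, 1), round(fold_2100, 1), round(fold_2300, 1))
```

On this scale the daily swing of about 0.8 units is a roughly sixfold change in hydrogen ion concentration, compared with roughly 2.5-fold and fivefold changes for the projected declines.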

The lack of discernible differences in the profiles of the endolithic community between high pCO2 and control corals observed in the current study is in agreement with an experimental study targeting the effects of acidification on the endolithic microbiome of the Mediterranean corals Balanophyllia europaea and Cladocora caespitosa, transplanted to a naturally high pCO2 area (Meron et al. 2012). In an aquarium-based experiment conducted over a shorter time period, the bacterial community composition in the tissue, skeleton and mucus of Acropora eurystoma was found to be affected by high pCO2, but further analysis using clone libraries suggested that only the prokaryotic communities of the mucus and tissue, not the skeleton, were affected by low pH (Meron et al. 2011). These different observations might be associated with the different time spans and experimental setups of the two studies, and it is likely that the microbial communities associated with different coral taxa respond differently to acidification. The resilience of endolithic algae to acidification has also been observed: the net photosynthesis and respiration of algae growing at the surface of dead coral blocks were severely impacted upon exposure to high pCO2 treatments, while the endolithic flora was unaffected (Tribollet et al. 2006). Studies have demonstrated that endolithic algae actually benefit from low pH and tend to increase in biomass under high pCO2 conditions (Tribollet et al. 2009; Reyes-Nivia et al. 2013; Enochs et al. 2016). Future studies targeting the whole (prokaryotic and eukaryotic) microbiome of live coral species will be necessary to clarify the validity and generality (across coral species) of the tolerant endolith hypothesis.

The observation that high pCO2 did not impact the endolithic community of Porites sp. does not necessarily imply that coral holobionts are immune to ocean acidification. It is possible that the methods used here are not sufficiently powerful to detect the effects of high pCO2 on endolithic microbial communities. However, the fact that we detected significant differences among coral hosts, even though the sample size for the branching corals was smaller, indicates that our methods and sampling design are adequate, and it is unlikely that differences between high pCO2 and control sites were present but went undetected. It is possible, though, that high pCO2 impacts the endolithic communities of other coral species that were not examined. It is interesting to note that massive Porites spp. outcompete branching corals and dominate the reef near volcanic seeps (Fabricius et al. 2011). Our analyses are restricted to the volcanic seeps of Milne Bay, which have relatively small areas under high pCO2 and are surrounded by ambient seawater, and corals living in this area have likely had a longer time to adapt to those conditions than OA will permit. Further studies at additional sites impacted by high pCO2, across a wider range of coral species and under experimental conditions, are required to evaluate the results of our study in the broader ecological context of the effects of OA on coral microbiomes.

Diversity and potential functional roles of the endolithic microbiome

Bacteria related to Endozoicomonas spp. (class Gammaproteobacteria) are thought to have a key role in the coral holobiont. These bacteria have been shown to be endosymbionts, forming aggregations within coral tissues (Neave et al. 2016), potentially contributing to nutrient cycling and structuring of the microbiome through the production of quorum-sensing signalling metabolites and antimicrobial compounds (Meyer et al. 2014; Morrow et al. 2015 and references therein). The relative abundance of Endozoicimonaceae within coral tissues appears to be sensitive to high pCO2 (Morrow et al. 2015; Webster et al. 2016), but in the skeletons of Porites sp. analysed here, it did not differ significantly between samples from different pCO2 conditions (Supplementary Table S2). We observed a significantly higher relative abundance of two Endozoicimonaceae OTUs in the skeletons of P. damicornis when compared to the other two coral species, possibly reflecting stable associations of Endozoicimonaceae species with this coral host (see Neave et al. 2016). Although some of the sequences retrieved here may derive from other parts of the coral, we have detected members of Endozoicimonaceae in the endolithic community in previous experiments (Marcelino & Verbruggen 2016), and at least one other study detected them outside the tissue, likely in the coral skeleton or mucus (Ainsworth et al. 2015).

Bacteria in the phylum Bacteroidetes are often associated with coral disease and have been shown to increase in relative abundance under reduced pH (Vega Thurber et al. 2009). The average relative abundance of this group doubled in endolithic communities from control to high pCO2 sites, but this difference was not significant, likely due to the high level of variation in community composition among colonies within a site. This increase was mostly due to a higher abundance of the classes Saprospirae, Flavobacteria and Cytophagia, which contain most of the marine algicidal bacteria (Furusawa et al. 2003; Mayali & Azam 2004; Zozaya-Valdes et al. 2015). It is plausible that a higher relative abundance of these microorganisms is associated with an increase in endolithic algal biomass under high pCO2 (see Reyes-Nivia et al. 2013), and therefore, rather than compromising coral health, they might be a key group controlling excessive algal growth in the skeletons.

Microorganisms involved in nitrogen cycling may be fundamental to coral resilience to ocean acidification and climate change (Rädecker et al. 2014; Santos et al. 2014; Radecker et al. 2015). We found a diverse community of nitrogen fixing (diazotrophic) microorganisms inhabiting coral skeletons. The majority (in terms of relative abundance) belonged to the order Rhizobiales, a group that appears to form stable symbiotic associations with corals (Lema et al. 2014). Green sulphur (also diazotrophic) bacteria in the phylum Chlorobi, previously documented as prevalent members of the endolithic community in the coral Isopora (Yang et al. 2016), were found at low relative abundances in the samples analysed here and in a previous study (Marcelino & Verbruggen 2016). Cyanobacterial OTUs captured with the UPA marker, while not abundant, were very diverse and mostly unclassified at lower taxonomic levels. Cyanobacteria have been shown to fix nitrogen in coral tissues (Lesser et al. 2004; Radecker et al. 2015) and can be responsible for a large fraction of the nitrogen fixation observed in their skeletons (Crossland & Barnes 1976; Davey et al. 2007).

Endolithic algal biomass has been shown to increase under high pCO2, as the algae benefit from the increased availability of carbon dioxide for photosynthesis (Tribollet et al. 2009; Reyes-Nivia et al. 2013). Indeed, we observed a higher relative abundance of all Ostreobium clades in Porites sp. samples from high pCO2 sites, but the variability among replicates (i.e. Porites sp. samples within sites) was also high, making it difficult to draw conclusions about whether Ostreobium spp. are competitively superior to other endolithic algal lineages under OA. Whether the increase in algal biomass is a threat to corals under OA depends on whether the associated bioerosion levels will exceed reef accretion (calcification). Besides increasing bioerosion, excessive endolithic algal growth can penetrate the living coral tissue, possibly increasing the coral's susceptibility to infections (Peters 1984; Fine et al. 2006). An increase in endolithic algae may also be beneficial to the coral by providing it with vital nutrients, which is especially important during coral bleaching events (Schlichter et al. 1995; Fine & Loya 2002).

The possibility that the endolithic microbiome contributes to the resilience of corals under future OA conditions deserves further attention. Massive Porites spp. may be considered more competitive under OA than branching species based on their prevalence at naturally high pCO2 sites (Fabricius et al. 2011). We also observed visually that our massive Porites sp. samples were more heavily colonised by endolithic algae than the branching species. The photosynthetic activity of Symbiodinium plays an important role in maintaining pH homeostasis within corals (Gibbin et al. 2014); it is therefore possible that endolithic algae provide a similar service within the skeleton. The biomass of endolithic algae may exceed that of Symbiodinium by 16-fold (Odum & Odum 1955) and can contribute significantly to the buffering capacity of the holobiont (see Yamazaki et al. 2008; Reyes-Nivia et al. 2014). It is noteworthy that several functionally important microorganisms (e.g. Endozoicimonaceae and Bacteroidetes) often found in coral tissues and mucus also occur in coral skeletons (Sweet et al. 2010; Ainsworth et al. 2015; Marcelino & Verbruggen 2016; this study). Additionally, the majority of OTUs commonly found in healthy corals have also been found in bare coral skeleton, but not in seawater or in diseased coral tissue (Fernando et al. 2015). It is possible that these bacteria were actually associated with deep-polyp interstitial tissues that remained after removing the tissues from the skeleton with pressurized air. It is also possible that the coral skeleton serves as a reservoir for the microbiome and provides a source of beneficial bacteria to coral tissues, analogous to the human appendix, which functions as a safe house for symbiotic microbes that repopulate the intestine following acute illness (Randal Bollinger et al. 2007). Acute environmental stress can disrupt symbiotic relationships between hosts and symbionts (see Hawkins et al. 2013), and a stable endolithic community may assist in the recovery of the coral microbiome after environmental (and/or physiological) conditions stabilize.

Different host species harbour distinct endolithic communities

The endolithic communities of the branching corals Seriatopora hystrix and Pocillopora damicornis and of the massive Porites sp. contain significantly different relative abundances of functionally important members of the microbiome (including species of Endozoicimonaceae and Bacteroidetes) and appear to separate based on morphology or taxonomy (as both branching species belong to the family Pocilloporidae). The two branching species harbour a reduced diversity of bacteria and algae. The low relative abundance of Ostreobium spp. in the endolithic communities of branching species is surprising, considering the generally ubiquitous nature of this alga in coral skeletons (Lukas 1973; Tribollet 2008; Gutner-Hoch & Fine 2011). Instead of Ostreobium spp., the coral S. hystrix has a high relative abundance of OTUs related to a macroalga (Halimeda spp.), which has only recently been reported in coral skeletons. It is possible that Halimeda spp. occur in the coral skeleton in the form of rhizoids that have penetrated the limestone or, more likely, as an unknown microscopic and endolithic life stage of two Halimeda species (H. discoidea and H. micronesica) that are ubiquitously present in metabarcoding studies of endolithic communities (Marcelino & Verbruggen 2016; Sauvage et al. 2016; this study). The possible endolithic life stage of Halimeda spp. may have been incorrectly classified as Ostreobium in the past.

The observed differences in endolithic community composition among coral hosts may be a result of specialization to particular host traits or reflect co-evolution between coral hosts and endolithic species. The living tissue of Porites (lobata) is about five times thicker and contains a higher density of Symbiodinium than the living tissue of P. damicornis and S. hystrix (Yost et al. 2013). Tissue thickness would influence the amount of light that penetrates and reflects within the inner parts of the skeleton and may influence the composition of the endolithic community. Branching coral species also tend to grow faster than massive corals (Gates & Ainsworth 2011), and the branch tips collected in this study may have a younger population of endoliths in comparison to more mature sections of the colony base (a pattern reported in Pica et al. 2016). Future studies would benefit from examining the microbiome associated with different areas of the colony and possible specialization to skeletal features.

Alternatively (and not mutually exclusively), endolithic lineages might form stable community assemblies that have co-evolved with the coral host, or the host species have some control over the composition of the endolithic community by selecting beneficial taxa. Mutualistic relationships between corals and their endolithic associates have been suggested in several studies (Odum & Odum 1955; Schlichter et al. 1995; Schlichter et al. 1997; Fine & Loya 2002; Försterra & Häussermann 2008; Titlyanov et al. 2009), and future research would benefit from characterising possible co-evolutionary processes among coral species and endolithic microorganisms.

Conclusions

This study reports a diverse microbiome within the skeletons of Porites sp., but reveals few discernible differences in this microbiome between ambient and naturally high pCO2 environments. We show that the endolithic community shares several functionally important microbes with the coral tissue layer. Environmental stress can induce corals to lose their symbiotic microorganisms; however, a diverse endolithic microbial community might serve as a reservoir to recolonise the microbiome in the coral tissue after the re-establishment of their physiological equilibrium. We also found functionally important members of the endolithic community, including members of the Endozoicimonaceae and Bacteroidetes, forming consistent associations with the host coral families, an observation consistent with the endolithic reservoir proposition. The diversity and community structure observed in this study provide a baseline for investigating the roles of endolithic microorganisms in conferring on corals the flexibility needed to endure climate change.

3.5 References

Ainsworth T, Krause L, Bridge T, Torda G, Raina J-B, et al. (2015) The coral core microbiome identifies rare bacterial taxa as ubiquitous endosymbionts. ISME Journal 9, 2261-2274. Andersson AJ, Gledhill D (2013) Ocean acidification and coral reefs: effects on breakdown, dissolution, and net ecosystem calcification. Annual Review of Marine Science 5, 321-348. Anthony KRN, Kline DI, Diaz-Pulido G, Dove S, Hoegh-Guldberg O (2008) Ocean acidification causes bleaching and productivity loss in coral reef builders. Proceedings of the National Academy of Sciences, USA 105, 17442-17446. Blackall LL, Wilson B, van Oppen MJ (2015) Coral-the world's most diverse symbiotic ecosystem. Molecular Ecology 24, 5330-5347.

Bourne DG, Garren M, Work TM, Rosenberg E, Smith GW, Harvell CD (2009) Microbial disease and the coral holobiont. Trends in Microbiology 17, 554-562. Bourne DG, Morrow KM, Webster NS (2016) Insights into the coral microbiome: underpinning the health and resilience of reef ecosystems. Annual Review of Microbiology 70, 317-340. Caporaso JG, Bittinger K, Bushman FD, DeSantis TZ, Andersen GL, Knight R (2010a) PyNAST: a flexible tool for aligning sequences to a template alignment. Bioinformatics 26, 266-267. Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, et al. (2010b) QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7, 335-336. Chao A. (1984). Non-parametric estimation of the number of classes in a population. Scandinavian Journal of Statistics 11, 265–270. Chazottes V, Le Campion-Alsumard T, Peyrot-Clausade M (1995) Bioerosion rates on coral reefs: interactions between macroborers, microborers and grazers (Moorea, French Polynesia). Palaeogeography, Palaeoclimatology, Palaeoecology 113: 189-198. Crook ED, Cohen AL, Rebolledo-Vieyra M, Hernandez L, Paytan A (2013) Reduced calcification and lack of acclimatization by coral colonies growing in areas of persistent natural acidification. Proceedings of the National Academy of Sciences, USA 110, 11044-11049. Crossland CJ, Barnes DJ (1976) Acetylene reduction by coral skeletons. Limnology and Oceanography 21, 153-156. Davey M, Holmes G, Johnstone R (2007) High rates of nitrogen fixation (acetylene reduction) on coral skeletons following bleaching mortality. Coral Reefs 27, 227-236. Del Campo J, Pombert JF, Slapeta J, Larkum A, Keeling PJ (2016) The 'other' coral symbiont: Ostreobium diversity and distribution. ISME Journal. Advance online publication. DeSantis TZ, Hugenholtz P, Larsen N, Rojas M, Brodie EL, et al. (2006) Greengenes, a chimera-checked 16S rRNA gene database and workbench compatible with ARB. Applied and Environmental Microbiology 72, 5069-5072. 
Edgar RC (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10, 996-998. Enochs IC, Manzello DP, Tribollet A, Valentino L, Kolodziej G, et al. (2016) Elevated colonization of microborers at a volcanically acidified coral reef. PLoS One 11, e0159818. Esteves AI, Amer N, Nguyen M, Thomas T (2016). Sample processing impacts the viability and cultivability of the sponge microbiome. Frontiers in Microbiology, 7, 7836. Fabricius KE, De'ath G, Noonan S, Uthicke S (2014) Ecological effects of ocean acidification and habitat complexity on reef-associated macroinvertebrate communities. Proceedings of the Royal Society B 281, 20132479.

Fabricius KE, Langdon C, Uthicke S, Humphrey C, Noonan S, et al. (2011) Losers and winners in coral reefs acclimatized to elevated carbon dioxide concentrations. Nature Climate Change 1, 165-169. Fama P, Wysor B, Kooistra WHCF, Zuccarello GC (2002) Molecular phylogeny of the genus Caulerpa (Caulerpales, Chlorophyta) inferred from chloroplast tufA gene. Journal of Phycology 38, 1040-1050. Fang JK, Mello-Athayde MA, Schonberg CH, Kline DI, Hoegh-Guldberg O, Dove S (2013) Sponge biomass and bioerosion rates increase under ocean warming and acidification. Global Change Biology 19, 3581-3591. Fernando SC, Wang J, Sparling K, Garcia GD, Francini-Filho RB, et al. (2015) Microbiota of the major South Atlantic reef building coral Mussismilia. Microbial Ecology 69, 267- 280. Fine M, Loya Y (2002) Endolithic algae: an alternative source of photoassimilates during coral bleaching. Proceedings of the Royal Society B 269, 1205-1210. Fine M, Roff G, Ainsworth TD, Hoegh-Guldberg O (2006) Phototrophic microendoliths bloom during coral “white syndrome”. Coral Reefs 25, 577-581. Försterra G, Häussermann V (2008) Unusual symbiotic relationships between microendolithic phototrophic organisms and azooxanthellate cold-water corals from Chilean fjords. Marine Ecology Progress Series 370, 121-125. Foster T, Falter JL, McCulloch MT, Clode PL (2016) Ocean acidification causes structural deformities in juvenile coral skeletons. Science Advances 2, e1501130-e1501130. Friendly M, Fox J (2009) candisc: Generalized Canonical Discriminant Analysis. R package version 0.5-16. Furusawa G, Yoshikawa T, Yasuda A, Sakata T (2003) Algicidal activity and gliding motility of Saprospira sp. SS98-5. Canadian Journal of Microbiology/Revue Canadienne de Microbiologie 49, 92-100. Garcia-Pichel F (2006) Plausible mechanisms for the boring on carbonates by microbial phototrophs. Sedimentary Geology 185, 205-213. 
Garcia-Pichel F, Ramirez-Reinat E, Gao Q (2010) Microbial excavation of solid carbonates powered by P-type ATPase-mediated transcellular Ca2+ transport. Proceedings of the National Academy of Sciences, USA 107, 21749-21754. Gates RD, Ainsworth TD (2011) The nature and taxonomic composition of coral symbiomes as drivers of performance limits in scleractinian corals. Journal of Experimental Marine Biology and Ecology 408, 94-101. Georgiou L, Falter J, Trotter J, Kline DI, Holcomb M, et al. (2015) pH homeostasis during coral calcification in a free ocean CO2 enrichment (FOCE) experiment, Heron Island reef flat, Great Barrier Reef. Proceedings of the National Academy of Sciences, USA 112, 13219-13224. Gibbin EM, Putnam HM, Davy SK, Gates RD (2014) Intracellular pH and its response to CO2-driven seawater acidification in symbiotic versus non-symbiotic coral cells. Journal of Experimental Biology 217, 1963-1969.

Grange JS, Rybarczyk H, Tribollet A (2015) The three steps of the carbonate biogenic dissolution process by microborers in coral reefs (New Caledonia). Environmental Science and Pollution Research 22, 13625-13637. Gutner-Hoch E, Fine M (2011) Genotypic diversity and distribution of Ostreobium quekettii within scleractinian corals. Coral Reefs 30, 643-650. Hawkins TD, Bradley BJ, Davy SK (2013) Nitric oxide mediates coral bleaching through an apoptotic-like cell death pathway: evidence from a model sea anemone-dinoflagellate symbiosis. FASEB Journal 27, 4790-4798. Hoegh-Guldberg O, Mumby PJ, Hooten AJ, Steneck RS, Greenfield P, et al. (2007) Coral reefs under rapid climate change and ocean acidification. Science 318, 1737-1742. Hofmann GE, Barry JP, Edmunds PJ, Gates RD, Hutchins DA, Klinger T, Sewell MA (2010) The effect of ocean acidification on calcifying organisms in marine ecosystems: an organism to-ecosystem perspective. Annual Review of Ecology, Evolution and Systematics 41, 127-147. Joint I, Doney SC, Karl DM (2011) Will ocean acidification affect marine microbes? ISME Journal 5, 1-7. Katoh K, Misawa K, Kuma K-i, Miyata T (2002) MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30, 3059-3066. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, Glockner FO (2013) Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next- generation sequencing-based diversity studies. Nucleic Acids Research 41, e1. Krediet CJ, Ritchie KB, Paul VJ, Teplitski M (2013) Coral-associated micro-organisms and their roles in promoting coral health and thwarting diseases. Proceedings of the Royal Society B 280, 20122328. Laurent J, Tambutte S, Tambutte E, Allemand D, Venn A (2013) The influence of photosynthesis on host intracellular pH in scleractinian corals. Journal of Experimental Biology 216, 1398-1404. 
Le Campion-Alsumard T, Golubic S, Hutchings P (1995) Microbial endoliths in skeletons of live and dead corals: Porites lobata (Moorea, French Polynesia). Marine Ecology Progress Series 117, 149-157. Lema KA, Willis BL, Bourne DG (2014) Amplicon pyrosequencing reveals spatial and temporal consistency in diazotroph assemblages of the Acropora millepora microbiome. Environmental Microbiology 16, 3345-3359. Lesser MP, Mazel CH, Gorbunov MY, Falkowski PG (2004) Discovery of symbiotic nitrogen-fixing cyanobacteria in corals. Science 305, 997-1000. Lozupone C, Lladser ME, Knights D, Stombaugh J, Knight R (2011) UniFrac: an effective distance metric for microbial community comparison. ISME Journal 5, 169-172. Lukas KJ (1973) Taxonomy and ecology of the endolithic microflora of reef corals with a review of the literature on endolithic microphytes.

Magoc T, Salzberg SL (2011) FLASH: fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27, 2957-2963. Manzello DP, Kleypas JA, Budd DA, Eakin CM, Glynn PW, Langdon C (2008) Poorly cemented coral reefs of the eastern tropical Pacific: possible insights into reef development in a high-CO2 world. Proceedings of the National Academy of Sciences, USA 105, 10450-10455. Marcelino VR, Cremen MC, Jackson CJ, Larkum AA, Verbruggen H (2016) Evolutionary dynamics of chloroplast genomes in low light: a case study of the endolithic green alga Ostreobium quekettii. Genome Biology and Evolution 8, 2939-2951. Marcelino VR, Verbruggen H (2016) Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Scientific Reports 6, 31508. Mayali X, Azam F (2004) Algicidal bacteria in the sea and their impact on algal blooms. The Journal of Eukaryotic Microbiology 51, 139-144. McCulloch M, Falter J, Trotter J, Montagna P (2012) Coral resilience to ocean acidification and global warming through pH up-regulation. Nature Climate Change 2, 623-627. Meron D, Atias E, Iasur Kruh L, Elifantz H, Minz D, Fine M, Banin E (2011) The impact of reduced pH on the microbial community of the coral Acropora eurystoma. ISME Journal 5, 51-60. Meron D, Rodolfo-Metalpa R, Cunning R, Baker AC, Fine M, Banin E (2012) Changes in coral microbial communities in response to a natural pH gradient. ISME Journal 6, 1775-1785. Meyer JL, Paul VJ, Teplitski M (2014) Community shifts in the surface microbiomes of the coral Porites astreoides with unusual lesions. PLoS One 9, e100316. Morrow KM, Bourne DG, Humphrey C, Botte ES, Laffy P, et al. (2015) Natural volcanic CO2 seeps reveal future trajectories for host-microbial associations in corals and sponges. ISME Journal 9, 894-908. Moss RH, Edmonds JA, Hibbard KA, Manning MR, Rose SK, et al. (2010) The next generation of scenarios for climate change research and assessment. Nature 463, 747-756. Neave MJ, Rachmawati R, Xun L, Michell CT, Bourne DG, Apprill A, Voolstra CR (2016) Differential specificity between closely related corals and abundant Endozoicomonas endosymbionts across global scales. ISME Journal. Advance online publication. O'Brien PA, Morrow KM, Willis BL, Bourne DG (2016) Implications of ocean acidification for marine microorganisms from the free-living to the host-associated. Frontiers in Marine Science 3, 47. Odum HT, Odum EP (1955) Trophic structure and productivity of a windward coral reef community on Eniwetok Atoll. Ecological Monographs 25, 291-320. Orr JC, Fabry VJ, Aumont O, Bopp L, Doney SC, et al. (2005) Anthropogenic ocean acidification over the twenty-first century and its impact on calcifying organisms. Nature 437, 681-686.

Paliy O, Shankar V (2016) Application of multivariate statistical techniques in microbial ecology. Molecular Ecology 25, 1032-1057. Paulson JN, Stine OC, Bravo HC, Pop M (2013) Robust methods for differential abundance analysis in marker gene surveys. Nature Methods 10, 1200-1202. Peters EC (1984) A survey of cellular reactions to environmental stress and disease in Caribbean scleractinian corals. Helgoländer Meeresuntersuchungen 37, 113-137. Pica D, Tribollet A, Golubic S, Bo M, Di Camillo CG, Bavestrello G, Puce S (2016) Microboring organisms in living stylasterid corals (Cnidaria, Hydrozoa). Marine Biology Research 12, 573-582. Presting GG (2006) Identification of conserved regions in the plastid genome: implications for DNA barcoding and biological function. Canadian Journal of Botany-Revue Canadienne De Botanique 84, 1434-1443. Rädecker N, Meyer FW, Bednarz VN, Cardini U, Wild C (2014) Ocean acidification rapidly reduces dinitrogen fixation associated with the hermatypic coral Seriatopora hystrix. Marine Ecology Progress Series 511, 297-302. Radecker N, Pogoreutz C, Voolstra CR, Wiedenmann J, Wild C (2015) Nitrogen cycling in corals: the key to understanding holobiont functioning? Trends in Microbiology 23, 490-497. Randal Bollinger R, Barbas AS, Bush EL, Lin SS, Parker W (2007) Biofilms in the large bowel suggest an apparent function of the human vermiform appendix. Journal of Theoretical Biology 249, 826-831. Raven J, Caldeira K, Elderfield H, Hoegh-Guldberg O, Liss P, Riebesell, U., et al. (2005) Ocean acidification due to increasing atmospheric carbon dioxide The Royal Society. Revell LJ (2012) phytools: an R package for phylogenetic comparative biology (and other things). Methods in Ecology and Evolution 3, 217-223. Reyes-Nivia C, Diaz-Pulido G, Dove S (2014) Relative roles of endolithic algae and carbonate chemistry variability in the skeletal dissolution of crustose coralline algae. Biogeosciences Discussions 11, 2993-3021. 
Reyes-Nivia C, Diaz-Pulido G, Kline D, Guldberg OH, Dove S (2013) Ocean acidification and warming scenarios increase microbioerosion of coral skeletons. Global Change Biology 19, 1919-1929. Rohwer F, Seguritan V, Azam F, Knowlton N (2002) Diversity and distribution of coral- associated bacteria. Marine Ecology Progress Series 243, 1-10. Santos HF, Carmo FL, Duarte G, Dini-Andreote F, Castro CB, et al. (2014) Climate change affects key nitrogen-fixing bacterial populations on coral reefs. ISME Journal 8, 2272-2279. Sauvage T, Schmidt WE, Suda S, Fredericq S (2016) A metabarcoding framework for facilitated survey of endolithic phototrophs with tufA. BMC Ecology 16, 8. Schlichter D, Kampmann H, Conrady S (1997) Trophic Potential and Photoecology of endolithic algae living within coral skeletons. Marine Ecology 18, 299-317.

Schlichter D, Zscharnack B, Krisch H (1995) Transfer of photoassimilates from endolithic algae to coral tissue. Naturwissenschaften 82, 561-564. Schmieder R, Edwards R (2011) Quality control and preprocessing of metagenomic datasets. Bioinformatics 27, 863-864. Sharp KH, Ritchie KIMB (2012) Multi-partner interactions in corals in the face of climate change. Marine Biological Laboratory, 66-77. Shashar N, Banaszak AT, Lesser MP, Amrami D, Gan R (1997) Coral endolithic algae : life in a protected environment. Pacific Science 51, 167-173. Shashar N, Stambler N (1992) Endolithic algae within corals - life in an extreme environment. Journal of Experimental Marine Biology and Ecology 163, 277-286. Sherwood AR, Presting GG (2007) Universal primers amplify a 23S rDNA plastid marker in eukaryotic algae and cyanobacteria. Journal of Phycology 43, 605-608. Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688-2690. Sweet MJ, Croquer A, Bythell JC (2010) Bacterial assemblages differ between compartments within the coral holobiont. Coral Reefs 30, 39-52. Titlyanov EA, Kiyashko SI, Titlyanova TV, Yakovleva IM (2009) δ13C and δ15N in tissues of reef building corals and the endolithic alga Ostreobium quekettii under their symbiotic and separate existence. Journal of Coral Reef Studies, 169-175. Tribollet A (2008) The boring microflora in modern coral reef ecosystems: a review of its roles. In: Current Developments in Bioerosion eds. Wisshak M, Tapanila L. pp. 67- 94. Springer Berlin Heidelberg, Berlin, Heidelberg. Tribollet A (2008b) Dissolution of dead corals by euendolithic microorganisms across the northern Great Barrier Reef (Australia). Microbial Ecology 55, 569-580.

Tribollet A, Godinot C, Atkinson M, Langdon C (2009) Effects of elevated pCO2 on dissolution of coral carbonates by microbial euendoliths. Global Biogeochemical Cycles 23, 3. Tribollet T, Atkinson MJ, Langdon C (2006) Effects of elevated pCO2 on epilithic and endolithic metabolism of reef carbonates. Global Change Biology 12, 2200-2208. Uthicke S, Momigliano P, Fabricius KE (2013) High risk of extinction of benthic foraminifera in this century due to ocean acidification. Scientific Reports 3, 1769. Vega Thurber R, Willner-Hall D, Rodriguez-Mueller B, Desnues C, Edwards RA, et al. (2009) Metagenomic analysis of stressed coral holobionts. Environmental Microbiology 11, 2148-2163. Venables WN, Ripley BD (2002) Modern Applied Statistics with S, Fourth edn. Springer. Venn AA, Tambutte E, Holcomb M, Laurent J, Allemand D, Tambutte S (2013) Impact of seawater acidification on pH at the tissue-skeleton interface and calcification in reef corals. Proceedings of the National Academy of Sciences, USA 110, 1634-1639. Verbruggen H, Tribollet A (2011) Boring algae. Current Biology 21, R876-877.

Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naïve Bayesian Classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology 73, 5261-5267. Webster NS, Negri AP, Botte ES, Laffy PW, Flores F, et al. (2016) Host-associated coral reef microbes respond to the cumulative pressures of ocean warming and ocean acidification. Scientific Reports 6, 19324. Webster NS, Negri AP, Flores F, Humphrey C, Soo R, et al. (2013) Near-future ocean acidification causes differences in microbial associations within diverse coral reef taxa. Environmental Microbiology Reports 5, 243-251. Wickham H (2009) ggplot2: Elegant Graphics for Data Analysis Springer-Verlag New York. Yamazaki SS, Nakamura T, Yamasaki H (2008) Photoprotective role of endolithic algae colonized in coral skeleton for the host photosynthesis. In: Photosynthesis. Energy from the Sun eds. Allen JF, Gantt E, Golbeck JH, Osmond B), pp. 1391-1395. Springer Netherlands, Dordrecht. Yang S-H, Lee STM, Huang C-R, Tseng C-H, Chiang P-W, et al. (2016) Prevalence of potential nitrogen-fixing, green sulfur bacteria in the skeleton of reef-building coral Isopora palifera. Limnology and Oceanography 61, 1078-1086. Yost DM, Wang LH, Fan TY, Chen CS, Lee RW, Sogin E, Gates RD (2013) Diversity in skeletal architecture influences biological heterogeneity and Symbiodinium habitat in corals. Zoology (Jena) 116, 262-269. Zozaya-Valdes E, Egan S, Thomas T (2015) A comprehensive analysis of the microbial communities of healthy and diseased marine macroalgae and the detection of known and potential bacterial pathogens. Frontiers in Microbiology 6, 146.


3.6 Supplementary Materials

The materials and methods used in this Chapter are largely similar to those described in Chapter 2, but are reported here in full because this study was prepared for publication as a stand-alone manuscript. The main differences from the previous protocol were:

We used only one 16S rDNA primer pair – the one yielding a longer fragment.

We did not analyse the 18S rDNA dataset.

We used an extra PCR amplification cycle to obtain higher PCR product concentrations.

We used more stringent quality control. In Chapter 2 our goal was to describe the diversity of endolithic communities, so we aimed to retain most of the diversity obtained in the sequences. In this Chapter, however, we wanted to avoid tag jumping (incorrect assignment of OTUs to samples), as this may interfere with the analyses.

Library preparation:

16S rRNA gene: We used the S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21 (Klindworth et al., 2013) primer pair to PCR-amplify this marker.

UPA (23S rRNA gene): We used the primer pair p23SrV_r1/p23SrV_f1, which PCR-amplifies the Universal Plastid Amplicon (Presting 2006).

tufA: We used the primer tufAR and a forward primer designed for green algae (Oq-tuf: ACN GGN CGN GGN ACN GT) (Fama et al., 2002; Marcelino & Verbruggen 2016).

We amplified the three markers in separate reactions containing 0.2 mM dNTP mix, 0.5 µM forward and reverse primers, 2 mM MgCl2, 0.4 µg/µl Bovine Serum Albumin, 1× PCR buffer and 0.25U of Platinum Taq DNA Polymerase (Invitrogen). For the ribosomal DNA markers, the first PCR round consisted of an initial denaturation step at 94°C for 5 min, followed by 26 cycles of denaturation (94°C for 30 s), annealing (45 s) and extension (72°C for 30 s), and a final extension step at 72°C for 5 min. Annealing temperature was set at 55°C for p23SrV_r1/p23SrV_f1 and S-D-Bact-0341-b-S-17/S-D-Bact-0785-a-A-21. Because tufA is a coding gene, it has higher mutation rates (especially at 3rd codon positions) than ribosomal DNA; a touchdown step and a lower annealing temperature are therefore required (55–48°C for 14 cycles followed by 24 cycles at 48°C). Unspecific amplification does occur, but those products are excluded in the analysis pipeline (e.g. steps 5, 6 and 9 of the pipeline).

For the second PCR we used the following conditions: initial denaturation step at 94°C for 5 min, followed by 8 cycles of denaturation (94°C for 30 s), annealing (55°C for 30 s) and extension (72°C for 30 s) and a final extension step at 72°C for 5 min. We purified the samples using home-made magnetic beads as described in Rohland and Reich (2012) and quantified the libraries using the Qubit fluorometer (Invitrogen). The libraries were sequenced with the Illumina MiSeq platform (V3 kit - 2×300 bp PE reads) at the Centre for Translational Pathology, University of Melbourne.

Data processing pipeline:

1. Remove the reverse complement of adapters from short amplicons. When the reads are longer than the amplicon, the reverse complement of the adapter is sequenced at the 3’ end of the read, which can interfere with the merging of the paired-end sequences.

2. Separate genes into different files. With our library preparation design, the MiSeq run yields one file per sample, each containing all amplicons. The different amplicons are teased apart based on primer sequences in this step.

3. Trim 3’ ends of reads (5 bases in forward reads and 20 bases in reverse reads) to improve consensus quality.

4. Merge forward and reverse reads using FLASH (Magoč and Salzberg, 2011).

5. Quality control: filter merged reads based on a quality threshold (average of 35 per merged read) using PRINSEQ (Schmieder and Edwards, 2011).

6. Trim primers from merged reads. Sequences that do not meet a minimum length threshold and/or do not have the exact primer sequence at the 3’ and 5’ ends are excluded from analysis in order to ensure quality (i.e., the sequence belongs to the target gene) and global trimming (i.e., all sequences start and end at the same position).

7. Format reads’ identification and generate one file per gene containing all samples.

8. Run UPARSE pipeline (Edgar, 2013): dereplication, sort by size, cluster OTUs and produce OTU map. We chose UPARSE because other available software (e.g. Qiime and Mothur) seems to significantly overestimate the number of OTUs (Edgar, 2013). Based on the divergence of tufA among Bryopsidales we used a similarity threshold of 98% for OTU clustering in this marker, which is a conservative threshold for species level (i.e., most Bryopsidales species are more similar than that, so at 98% the OTUs will be somewhere between species and genus level). We chose the 97% threshold for the other markers for two main reasons: 1) there is not enough information in the literature about the similarity of the rDNA markers among Bryopsidales species; indeed, these markers are known to lack the phylogenetic signal needed to distinguish them; 2) our aim was to compare how the normally used markers (with their commonly used thresholds) perform in distinguishing algal species.

9. Alignment: we used PyNAST (Caporaso et al., 2010a) to align the 16S rDNA sequences. This aligner requires a reference database with aligned sequences and lots of gaps in the alignment. Due to the lack of such reference databases for UPA and tufA, we chose MAFFT (Katoh et al., 2002) to align UPA and tufA. The OTUs that failed to align were excluded from downstream analysis.

10. Assign taxonomy using the Naïve Bayesian Classifier (RDP) implemented in Qiime (Wang et al., 2007; Caporaso et al., 2010b). We used the Greengenes databases for the 16S rDNA sequences. In order to produce an RDP-friendly database for the UPA and the tufA, we downloaded reference sequences from Genbank, used a phylogenetic similarity threshold (based on a UPGMA tree) to equalize the dataset (i.e. exclude repetitive species, which will bias the RDP classifier) and produced the reference dataset (one file with the sequences and another with taxonomic ranks).

11. Filter OTUs found in negative controls (mock extractions and negative control PCRs). Although virtually no DNA was detected (with Qubit) in these controls, we added those samples to our library in order to detect any possible contaminant. Apart from cross-contamination, however, sequencing errors can also yield false positives, so some OTUs found in the controls could be, for example, the most abundant Ostreobium sequences, which should not be excluded. We therefore filtered out the OTUs present in the controls only if they represented less than 1% of the total number of reads.

12. Filter OTU table by minimum count (2) of reads per OTU per sample (filter_observations_by_sample.py - https://gist.github.com/adamrp/7591573). Another quality control step to remove OTUs present with low abundance in the samples.

13. Filter rare OTUs (less than 5 reads) and produce final filtered OTU fasta file.

14. Produce final OTU table and statistics. Rarefaction and several of the analyses reported in the manuscript were performed with the core_diversity_analyses.py pipeline in QIIME.
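Step 6 of the pipeline can be illustrated with a minimal Python sketch. It is not the script used in this study: the function names, the reverse primer and the minimum length are hypothetical, and we assume that an "exact" primer match means that every read base is one of the bases allowed by the corresponding IUPAC code of the degenerate primer. Only the Oq-tuf forward primer is taken from the text above.

```python
# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse complement of a sequence that may contain IUPAC codes."""
    return seq.translate(COMPLEMENT)[::-1]


def matches(primer, fragment):
    """True if every base of the fragment is allowed by the primer code."""
    return len(primer) == len(fragment) and all(
        base in IUPAC[code] for code, base in zip(primer, fragment)
    )


def trim_primers(read, fwd_primer, rev_primer, min_length):
    """Return the read without primers, or None if the read is rejected."""
    rev_site = reverse_complement(rev_primer)
    if len(read) < len(fwd_primer) + len(rev_site):
        return None
    if not matches(fwd_primer, read[:len(fwd_primer)]):
        return None
    if not matches(rev_site, read[-len(rev_site):]):
        return None
    insert = read[len(fwd_primer):len(read) - len(rev_site)]
    return insert if len(insert) >= min_length else None


FWD = "ACNGGNCGNGGNACNGT"  # Oq-tuf forward primer, as given in the text
REV = "GGNTCRAAYTG"        # hypothetical reverse primer, illustration only
```

Because accepted reads begin immediately after the forward primer and end immediately before the reverse primer site, all retained sequences start and end at the same positions, which is the "global trimming" property required for OTU clustering.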


Collection date effect on Porites sp. endolithic community:

The Porites sp. samples were collected in April and November 2014, and seasonality might affect the composition of microbial communities. To test whether this was the case, we compared the alpha and beta diversity of samples collected in the different months, following the same methodology and rarefaction thresholds described in the text to compare Porites sp. samples from high pCO2 and control sites. The results show no significant difference between samples collected in different months:

Alpha diversity:

                   Chao1                             Obs. OTUs
Marker  N   Seqs   April       November    p-value   April       November    p-value
16S     20  2200   148.6 ± 64  139.0 ± 32  0.70      138.9 ± 60  133.4 ± 31  0.82
tufA    22  1400   7.0 ± 3     8.2 ± 3     0.40      7.0 ± 3     8.1 ± 3     0.42
UPA     22  7000   23.1 ± 8    21.7 ± 5    0.62      22.1 ± 7    21.1 ± 5    0.72
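For readers unfamiliar with the Chao1 richness estimator reported in the table, the short Python sketch below shows how it can be computed from the read counts of a single sample. The text does not state which variant of the estimator was used; the bias-corrected form, S_obs + F1(F1 - 1) / (2(F2 + 1)), where F1 and F2 are the numbers of OTUs observed exactly once and twice, is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def chao1(otu_counts):
    """Bias-corrected Chao1 richness estimate for one sample.

    otu_counts: iterable with the number of reads of each OTU.
    """
    observed = [c for c in otu_counts if c > 0]
    frequency = Counter(observed)
    singletons = frequency[1]   # OTUs seen exactly once (F1)
    doubletons = frequency[2]   # OTUs seen exactly twice (F2)
    return len(observed) + singletons * (singletons - 1) / (2 * (doubletons + 1))
```

When a sample contains no singletons the estimate equals the observed number of OTUs.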

Beta diversity:

No OTU or taxon group was significantly different (i.e. Bonferroni’s p < 0.05) between collection months according to a Kruskal–Wallis test (for OTUs) and ANOVA (for taxon groups).

We therefore used all Porites samples to compare the endolithic communities in samples from high pCO2 and control sites.

Supplementary Tables:

The following supplementary tables can be found online at:

https://www.dropbox.com/sh/g9thuyr5ohlqzc6/AACG1ZKtQw45l85BNwkv-ohka?dl=0

Supplementary Table S1. Number of samples processed and included in the analysis

Supplementary Table S2: Kruskal-Wallis test of significant differences between the relative abundances of OTUs present in high pCO2 and control sites. The OTUs were retrieved from the skeletons of Porites sp.

Supplementary Table S3: ANOVA test of significant differences between the relative abundances of taxonomic groups present in high pCO2 and control sites. The OTUs were retrieved from the skeletons of Porites sp.

Supplementary table S4. Structure coefficients from discriminant analysis of common* OTUs retrieved from Porites sp. skeletons in high pCO2 and control sites.

Supplementary table S5. Confusion matrices summarizing the probability of a sample being assigned to control or seep sites in a linear discriminant analysis.

Supplementary Table S6: Kruskal-Wallis test of significant differences between the relative abundances of OTUs present in the skeletons of Porites sp., Seriatopora hystrix and Pocillopora damicornis

Supplementary Table S7: ANOVA test of differences between the relative abundances of taxon groups present in the skeletons of Porites sp., Seriatopora hystrix and Pocillopora damicornis

Supplementary Figures:

Supplementary Figure 3.1: Alpha rarefaction curves showing the number of OTUs per number of sequences for the different skeleton samples (colored lines).


Supplementary figure 3.2. A) Maximum Likelihood phylogeny with green algal OTUs retrieved in this study with the tufA marker and reference sequences (including OTUs identified in a previous study). Bootstrap values are shown in the branch nodes. B) Heatmap indicating relative abundances of the green algal OTUs in control and high pCO2 (seep) sites. The three OTUs present only in seep or control sites are indicated with an asterisk.


Supplementary Figure 3.3. Discriminant analysis based on abundances of the most common OTUs. See supplementary table S4 for OTUs taxonomic classification.

Supplementary References:

Caporaso J, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, Costello EK, et al. (2010). QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7: 335–336. Caporaso JG, Bittinger K, Bushman FD, Desantis TZ, Andersen GL, Knight R. (2010). PyNAST: A flexible tool for aligning sequences to a template alignment. Bioinformatics 26: 266–267. Caporaso JG, Lauber CL, Walters WA, Berg-Lyons D, Huntley J, Fierer N, et al. (2012). Ultra-high- throughput microbial community analysis on the Illumina HiSeq and MiSeq platforms. ISME Journal 6: 1621–1624.

Cremen C, Huisman JM, Marcelino VR, Verbruggen H. (2016). Taxonomic revision of Halimeda in southwestern Australia. Australian systematic botany 29, 41-54. Edgar RC. (2013). UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 647: 1–5. Fama P, Wysor B, Kooistra WHCF, Zuccarello GC. (2002). Molecular phylogeny of the genus Caulerpa (Caulerpales, Chlorophyta) inferred from chloroplast tufA gene. Journal of Phycology 38: 1040–1050.

Gilbert JA, Jansson JK, Knight R. (2014). The Earth Microbiome project: successes and aspirations. BMC Biology 12: 69. Katoh K, Misawa K, Kuma K, Miyata T. (2002). MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Research 30: 3059–3066. Klindworth A, Pruesse E, Schweer T, Peplies J, Quast C, Horn M, et al. (2013). Evaluation of general 16S ribosomal RNA gene PCR primers for classical and next-generation sequencing-based diversity studies. Nucleic Acids Research 41: e1. Magoč T, Salzberg SL. (2011). FLASH: Fast length adjustment of short reads to improve genome assemblies. Bioinformatics 27: 2957–2963. Marcelino V, Verbruggen H. (2016) Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Scientific Reports 6, 31508. Porazinska DL, Giblin-Davis RM, Faller L, Farmerie W, Kanzaki N, Morris K, et al. (2009). Evaluating high-throughput sequencing as a method for metagenomic analysis of nematode diversity. Molecular Ecology Resources 9: 1439–1450. Rohland N, Reich D. (2012). Cost-effective, high-throughput DNA sequencing libraries for multiplexed target capture. Genome Research 22: 939–46. Schmieder R, Edwards R. (2011). Quality control and preprocessing of metagenomic datasets. Bioinformatics 27: 863–864. Sherwood AR, Presting GG. (2007). Universal primers amplify a 23S rDNA plastid marker in eukaryotic algae and cyanobacteria. Journal of Phycology 43: 605–608. Steven B, McCann S, Ward NL. (2012). Pyrosequencing of plastid 23S rRNA genes reveals diverse and dynamic cyanobacterial and algal populations in two eutrophic lakes. FEMS Microbiology Ecology 82: 607–615. Wang Q, Garrity GM, Tiedje JM, Cole JR. (2007). Naïve Bayesian classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied Environmental Microbiology 73: 5261–5267.


CHAPTER 4 DISTRIBUTION OF ENDOLITHIC MICROBIAL COMMUNITIES

Abstract

For a long time microbial species were assumed to be distributed worldwide, but an increasing number of studies indicate that microorganisms can have non-random spatial distributions and well-defined ecological niches. Characterising the variation in the composition of microbial communities across space and habitats (β-diversity) is central to understanding their ecological roles in the ecosystem. Several studies suggest that different coral species and habitats are associated with different microbial communities, but to our knowledge, no study to date has quantified the turnover of microbial species associated with corals across space, nor the possible effects of niche specialization and random processes (e.g. ecological drift) on their distribution. Here we quantify the rate of microbial species turnover from millimetre (i.e. within individual colonies) to global spatial scales and assess the relative importance of niche specialization and neutral processes in shaping the structure of the endolithic community. Our results suggest that neutral processes and dispersal limitation create an unexpectedly high rate of bacterial species turnover within colonies, while niche specialization explains most of the distribution of endolithic microbes at larger spatial scales.

CHAPTER 4 74

4.1 Background

Many studies support the longstanding notion that small organisms have very broad or cosmopolitan distributions as a consequence of large population sizes and high dispersal rates (Fenchel & Finlay 2004, reviewed in Green & Bohannan 2006 and Martiny et al. 2006). However, there is growing evidence for non-random distributions and niche specialization in host-associated and free-living microorganisms (reviewed in Green & Bohannan 2006; Martiny et al. 2006). This ambiguity is partially due to the challenge of distinguishing between microbial species (Green & Bohannan 2006). The common endolithic alga Ostreobium sp., for example, has long been thought to have a cosmopolitan distribution: it is ubiquitously present in tropical coral reefs but has also been recorded in high latitude areas such as Iceland and Helgoland (Kornmann & Sahling 1980; Gunnarsson & Nielsen 2016), and it occurs in shallow and deep waters as well as in cave-dwelling corals (Odum & Odum 1955; Aponte & Ballantine 2001; Hoeksema 2012). It was recently shown, however, that the genus Ostreobium is composed of dozens of cryptic species (Del Campo et al. 2016; Marcelino & Verbruggen 2016; Sauvage et al. 2016), that different Ostreobium genotypes are found at different depths (Gutner-Hoch & Fine 2011) and that the relative abundance of at least one Ostreobium OTU differs significantly among coral species (Chapter 3). The prokaryotic members of the coral microbiome seem to follow a similar pattern: the vast majority of bacterial species are only found in a few coral species and only a couple of bacterial OTUs were found to be ubiquitously present in corals sampled across the world (Ainsworth et al. 2015). If endolithic microbes are not homogeneously distributed, then how can one describe and quantify the distribution of endolithic diversity across space and habitats?
The study of how community similarity changes with geographical distance (coined distance-decay relationship) and along environmental gradients can provide quantitative information about the distribution of biodiversity. This chapter is divided into two sections where we study different spatial scales. In Section 4.2 we perform a high-resolution assessment of the distribution of endolithic microbes within individual coral colonies and among adjacent corals. This section will be submitted for publication as a short communication and has been formatted accordingly (therefore methods are at the end). In section 4.3 we study the distribution of endolithic microbes from centimetres to global scales and evaluate the relative influence of ecological niches and neutral processes in structuring endolithic communities.


4.2 Spatial structure of the endolithic microbiome within and between coral colonies

Thousands of bacteria and over a hundred eukaryotic algae species live inside the skeletons of stony corals (Marcelino & Verbruggen 2016). Members of the skeletal microbiome (termed endolithic) are involved in coral nutrition and photoprotection (Schlichter et al. 1995; Yamazaki et al. 2008), bioerosion (Tribollet 2008), nitrogen cycling (Crossland & Barnes 1976) and coral disease (Page & Willis 2007). The endolithic community composition varies across large geographical scales and coral hosts (Del Campo et al. 2016, Chapter 3), but the small scale heterogeneity of these communities is poorly characterised. Conspecific neighbouring coral colonies are often considered biological replicates for a given habitat or experimental condition (e.g. Chapter 3, Meron et al. 2012), but the spatial heterogeneity of the endolithic community among nearby corals has never been quantified. Microbial community structure might also exist within individual colonies. Endolithic algae, for example, were found to be more abundant at the base than at the apex of stylasterid corals, suggesting some degree of dispersal limitation at intracolony scales (Pica et al. 2016). Besides dispersal limitation, mutualistic and competitive associations between microbes can create a patchy species distribution (Horner-Devine et al. 2007). If the coral microbiome is not homogeneously distributed, then a high-resolution quantification of its heterogeneity is fundamental to define sampling designs, investigate symbiotic relationships, and better understand the ecological roles of endolithic communities in coral health and resilience.

One of the most widely used metrics of habitat heterogeneity is β-diversity, which, among other definitions, describes the variation in community composition from one sampling unit to another along a spatial or environmental gradient (Anderson et al. 2011). The rate of decay in community similarity with geographical distance – coined distance-decay relationship (DDR) – is a well-established measure of β-diversity (Nekola & White 1999; Green et al. 2004; Anderson et al. 2011; Martiny et al. 2011). Constrained dispersal, random colonization and extinction events, environmental heterogeneity and biotic associations give rise to β-diversity. Most DDR studies have targeted macroscopic organisms because microorganisms are expected to have broad or cosmopolitan distributions, so the rate of decay in community similarity with distance is often very small or not significantly different from zero for microbes (Soininen 2012; Barreto et al. 2014; Zinger et al. 2014). Nevertheless, a relatively high β-diversity has been observed in microbial communities living in heterogeneous environments like coastal sediments (which have shorter environmental gradients when compared with open water), possibly due to habitat specialization (Zinger et al. 2014). Decay in community similarity with distance within small spatial scales, where environmental conditions are expected to be similar (e.g. in biological replicates), suggests dispersal constraints and/or non-random biotic associations. The dispersal abilities and biotic associations among endolithic microbes are currently unknown, and despite its wide applicability and relevance, no study to date has investigated whether a DDR exists for coral reef microbiomes.

Here we test the existence and strength of distance-decay relationships for the prokaryotic and eukaryotic endolithic microbiome within and between massive Porites coral colonies. We surveyed eight large (~1 meter across or more) Porites lutea and P. lobata colonies from three coral reefs in Australia. We collected six to twelve healthy skeleton samples from each colony following a geometrical progression design (Webster & Boag 1992) with distances between samples ranging from 0.4 to 199.2 cm to investigate intracolony DDR (Figure 4.1). We minimized the effects of environmental variables, such as light, by collecting all intracolony samples along longitudinal transects (i.e. all samples are at the same depth) following the edge of the colony. All samples had a similar size and were collected 0.5 cm below the upper surface of the coral’s tissue. We used the distances between samples from adjacent conspecific colonies (92 – 411 cm apart) to quantify β-diversity across colonies. Using a multi-marker amplification protocol (Marcelino & Verbruggen 2016) and high-throughput sequencing (Illumina MiSeq), we characterised the β-diversity of both bacteria and photosynthetic microbes. Specifically, we used the 16S rDNA marker to characterise prokaryotes, the Universal Plastid Amplicon (UPA) to characterise photosynthetic eukaryotes and prokaryotes (including cyanobacteria), and tufA for green algae specifically. For each assemblage (defined by the three DNA markers), we quantified β-diversity as the slope of the linear regression between (log10-transformed) pairwise community similarities and (log10-transformed) spatial distances, within and between colonies. Besides the Sørensen index, which is the standard similarity index used in DDR studies, we also used the UniFrac method, which takes into consideration relative abundances and phylogenetic distances (Lozupone et al. 2011). We tested whether the slope was significantly different from zero with Mantel tests. We also generated species accumulation curves as a function of number of samples and distance to better characterise the community heterogeneity and guide future studies.
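The DDR slope described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the code used in this study: the function names are hypothetical, each sample is represented as the set of OTUs detected in it, and pairs with zero similarity are skipped because their logarithm is undefined (the text does not say how such pairs were handled).

```python
import math


def sorensen(otus_a, otus_b):
    """Sørensen similarity between the OTU sets of two samples."""
    a, b = set(otus_a), set(otus_b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def ddr_slope(distances, similarities):
    """Least-squares slope of log10(similarity) against log10(distance).

    Pairs with a distance or similarity of zero are skipped (assumption).
    """
    pairs = [(math.log10(d), math.log10(s))
             for d, s in zip(distances, similarities) if d > 0 and s > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx
```

A slope of zero means that community similarity does not change with distance; the more negative the slope, the faster similarity decays and the higher the β-diversity.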

[Figure 4.1 labels: samples 1 to 12 along the transect, separated by successive distances of 0.4, 1.2, 4.0, 11, 33, 100, 33, 11, 4.0, 1.2 and 0.4 cm]

Figure 4.1: Massive Porites lutea colonies in Western Australia and sampling design. The red line illustrates a transect along which samples were collected; the varying distances between samples are shown.

Endolithic bacteria within colonies showed a high decay in community similarity with distance (Figure 4.2A). The observed intracolony bacterial DDR-slope (-0.15, P=0.004, Supplementary table S1) is among the highest slopes reported for microorganisms (Supplementary table S2). Free living marine bacteria, for example, show average DDR-slopes smaller than -0.8 (Zinger et al. 2014) and salt marsh bacteria (defined at 97% sequence similarity) have a DDR-slope of -0.04 (Horner-Devine et al. 2004). The rate of turnover of bacterial species within the skeleton is comparable to the species turnover of fungi in desert soils, but still smaller than the DDR slopes commonly observed for macroorganisms (Supplementary table S2; Green et al. 2004). It should be noted that, to our knowledge, no study to date has assessed DDR at spatial scales as small as those assessed here, so the rate of species turnover of other assemblages across comparable scales is unknown. Endolithic algae showed a shallower DDR slope that is not significantly different from zero, suggesting that they are more homogeneously distributed within the skeleton (Figure 4.2A, Supplementary table S1). The patterns were similar irrespective of sampling site (Supplementary Figure S1). Analyses based on similarity values that take into consideration abundance and phylogenetic distances yielded less steep DDR-slopes for all groups of organisms (Supplementary table S1).

The patchy distribution of endolithic communities was also observed in the species accumulation curves, which did not reach an asymptote in most cases, except for the eukaryotic algae communities (tufA) in some colonies (Supplementary figures S2-S4). These results confirm that bacteria are more heterogeneously distributed within colonies than endolithic algae, as a higher number of samples and distances within colonies were required to reach 100% of the observed colony’s OTUs in the 16S rDNA dataset when compared with the eukaryotic algae (tufA dataset). The marker depicting both cyanobacteria and eukaryotic algae (UPA) yielded intermediary results. Only a fraction of the colony’s microbial diversity could be observed in one single sample: the percentage of OTUs retrieved in one environmental sample (~ 0.25 cm3) in relation to the total number of OTUs recovered from the colony were 26.36% (±10.77 SD) for 16S rDNA, 37.76% (±15.63 SD) for UPA and 51.78% (±16.67 SD) for tufA. Even though we analysed up to 12 samples from each colony, the results indicate that further sampling would likely recover a higher diversity, especially of bacterial OTUs (Supplementary figures S2-S4).
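The two quantities discussed in this paragraph, the species accumulation curve and the percentage of a colony's OTUs recovered in a single sample, can be illustrated with the Python sketch below. It is not the code used to produce the supplementary figures: the function names are hypothetical, each sample is assumed to be a set of OTU identifiers, and the curve is averaged over random sample orders, which is one common way of building such curves.

```python
import random


def accumulation_curve(samples, permutations=100, seed=0):
    """Mean cumulative number of OTUs after 1..n samples, over random orders."""
    rng = random.Random(seed)
    totals = [0.0] * len(samples)
    for _ in range(permutations):
        order = list(samples)
        rng.shuffle(order)
        seen = set()
        for i, otus in enumerate(order):
            seen |= set(otus)
            totals[i] += len(seen)
    return [total / permutations for total in totals]


def percent_in_single_sample(samples):
    """Percentage of the colony's OTUs that each single sample recovers."""
    colony_total = len(set().union(*samples))
    return [100 * len(set(otus)) / colony_total for otus in samples]
```

A curve that is still rising at the last sample, as observed for most colonies in the 16S rDNA dataset, indicates that additional samples would recover OTUs not yet observed.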

Figure 4.2: Distance-decay relationships for the endolithic communities in coral skeletons. The blue line indicates the linear regression between (log10-transformed) geographical distance and (log10-transformed) Sørensen community similarity. Shaded area represents the 95% confidence interval. Mantel r statistic (r) and significance values (P) are provided.
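For reference, the Mantel statistic and its permutation-based significance reported in Figure 4.2 can be sketched as follows. This is a generic illustration, not the implementation used in this study: the function names, the number of permutations, the use of the Pearson correlation and the two-sided test are all assumptions. Both inputs are square symmetric matrices (lists of lists) with samples in the same order.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mantel(matrix_a, matrix_b, permutations=999, seed=0):
    """Return the Mantel r and a two-sided permutation P value."""
    n = len(matrix_a)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    flat_a = [matrix_a[i][j] for i, j in pairs]

    def flat_b(order):
        # Upper triangle of matrix_b after relabelling its samples.
        return [matrix_b[order[i]][order[j]] for i, j in pairs]

    observed = pearson(flat_a, flat_b(list(range(n))))
    rng = random.Random(seed)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        if abs(pearson(flat_a, flat_b(order))) >= abs(observed) - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

Permuting sample labels rather than individual matrix cells preserves the dependence among pairwise values that share a sample, which is why a Mantel test is preferred over an ordinary regression test for DDR data.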

The microbial community inhabiting adjacent colonies did not show a significant decay in community similarity with distance (Figure 4.2B). The slopes of the DDR were flat even for bacteria, even though the distances assessed here were substantially higher (92 – 411 cm) than the distances within colonies (0.4 – 199 cm). The significant DDR for bacteria at the intracolony scale therefore should be associated with features of the skeleton habitat. Although no significant DDR was observed, our results show that the community similarity among samples from adjacent colonies was low for bacteria – smaller than 30% on average (Figure 4.2B).

The high rate of bacterial species turnover – and therefore high β-diversity – within individual corals can be a consequence of limited dispersal within colonies. Marine bacteria are easily dispersed through the water column, but in dense substrates like limestone skeletons their mobility is likely to be reduced. Local accumulation of bacterial colonies with limited dispersal can create a patchy landscape and a high β-diversity as observed here. The lack of a significant bacterial DDR between nearby colonies further supports the idea that bacteria may be dispersal limited within the skeletons, but not in the water column. Some species of endolithic algae and cyanobacteria, on the other hand, are known to actively bore their way through limestone substrates (Tribollet et al. 2011) and their small (if any) dispersal limitation is reflected in the small β-diversity observed for these organisms (Figure 4.2A).

High β-diversity can also be a consequence of environmental heterogeneity and niche specialization (Nekola & White 1999; Horner-Devine et al. 2004; Martiny et al. 2011; Zinger et al. 2014). Our sampling design aimed at minimizing the effects of environmental variables, but the presence and abundance of co-occurring microorganisms could create micro-niches and habitat heterogeneity. Competitive and mutualistic microbial interactions can result in non-random distributions that will affect β-diversity (Horner-Devine et al. 2007). It has been shown that the distribution of photosynthetic organisms has a stronger correlation with environmental variables, while the distribution of organisms in higher trophic levels is more related to biotic interactions (Soininen et al. 2007a; Soininen et al. 2011). Therefore the null intracolony DDR for algae may be a consequence of homogeneous environmental conditions among samples, while the distribution of heterotrophic bacterial species may depend on co-occurring microbes, generating the high bacterial β-diversity observed here. Teasing apart the relative importance of dispersal limitation, environmental factors and microbial interactions is the next step towards understanding the distribution and functioning of the coral microbiome.

This study provides the first quantitative evidence for a high turnover of bacterial species in coral skeletons and shows that only a fraction of the endolithic microbiome is represented in any part of the skeleton. Whether this high β-diversity also occurs in corals’ tissue and mucus still needs to be investigated, but the few studies that analysed multiple samples from the same colony indicate that intracolony heterogeneity also exists in other parts of the coral (Rohwer et al. 2002; Hansson et al. 2009; Daniels et al. 2011). These findings imply that high-resolution biodiversity assessments are essential to understand the composition of the coral microbiome and its ecological roles. Differences in community composition among healthy and diseased corals (or parts of a coral) can easily go undetected without this high-resolution sampling. Likewise, differences in community composition among corals exposed to experimental conditions (e.g. temperature or pH) may be masked by intracolony (and within treatment) variation. The endolithic microbiome is gaining increasing attention due to its newly discovered high biodiversity (Del Campo et al. 2016; Marcelino & Verbruggen 2016; Sauvage et al. 2016) and ecological functions (e.g. Weinstein et al. 2016), including its potential role as a stable reservoir for the coral microbiome (Chapter 3). This study suggests yet another unexpected property of this community – a high bacterial β-diversity likely caused by dispersal limitation and/or microbial associations. The next step is to investigate how the processes underlying species turnover within individual colonies scale up to regional and global distribution patterns of the endolithic microbiome.

Methods

Sampling design and library preparation

Coral skeletons were collected at two sites in Western Australia and one site in Queensland (Supplementary table S3). Samples of ~ 0.25 cm3 were collected along an intracolony transect, following the border of the coral, according to a geometric progression design (Webster & Boag 1992): the distances between successive samples were 0.4 cm, 1.2 cm, 4 cm, 11 cm, 33 cm, 100 cm, 33 cm, 11 cm, 4 cm, 1.2 cm and 0.4 cm (Figure 4.1). Twelve samples were collected from each colony, except for one smaller colony where only the first 6 samples were collected (90 samples in total). The 0.4 cm distances were recorded for 5 out of the 8 colonies (colonies P1 – P5, Supplementary table S3), and the between-colony distances were recorded for 3 colonies (colonies P1 – P3). All intracolony distances larger than 0.4 cm were recorded for all 90 samples. Skeleton samples were collected in the field using a hammer and chisel. Pliers and a Dremel tool were used in the field laboratory to separate samples across smaller distances. Samples were stored in RNAlater (when collected in Western Australia) or 100% ethanol (when collected in Queensland).
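The pairwise distances that result from this design can be checked with a short Python sketch. It assumes that distances are measured along the transect, so that the distance between two samples is the sum of the gaps separating them; the variable and function names are hypothetical.

```python
from itertools import combinations

# Gaps between successive samples along the transect, in cm (from the text).
GAPS_CM = [0.4, 1.2, 4, 11, 33, 100, 33, 11, 4, 1.2, 0.4]


def sample_positions(gaps):
    """Position of each sample along the transect, starting at 0 cm."""
    positions = [0.0]
    for gap in gaps:
        positions.append(positions[-1] + gap)
    return positions


def pairwise_distances(positions):
    """Distance between every pair of samples, keyed by 1-based sample numbers."""
    return {
        (i + 1, j + 1): abs(positions[j] - positions[i])
        for i, j in combinations(range(len(positions)), 2)
    }


positions = sample_positions(GAPS_CM)
distances = pairwise_distances(positions)
```

With twelve samples the design yields 66 pairwise comparisons per colony, from 0.4 cm between neighbouring end samples to 199.2 cm between the first and last sample, spread roughly evenly on a logarithmic scale.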

The DNA isolation and amplification followed previously described protocols (Marcelino & Verbruggen 2016), with the addition of the amplification and sequencing of the internal transcribed spacer (ITS) region for corals (White et al. 1990). The ITS amplicons were amplified with Kapa Taq (Kapa Biosystems) following the manufacturer's instructions for the PCR reaction mixture. The PCR conditions consisted of an initial denaturation step at 94°C for 2 min, followed by 26 cycles of denaturation (94°C for 30 s), annealing (45 s at 51°C for the first 6 cycles and 55°C for the remaining 20 cycles) and extension (72°C for 60 s), and a final extension step at 72°C for 7 min. Libraries were sequenced using the Illumina MiSeq platform (2×300 bp paired-end reads).

In-depth community characterisation

The initial parsing and quality filtering of the reads was carried out as described in Marcelino & Verbruggen (2016). Sequences were clustered into Operational Taxonomic Units (OTUs) using UPARSE (Edgar 2013). A similarity threshold of 98% was set for the tufA and ITS markers, and 97% for the 16S rDNA and UPA markers. A taxonomy was assigned to the OTUs using the Naïve Bayesian Classifier (RDP) implemented in QIIME v.1.9.1 (Wang et al. 2007; Caporaso et al. 2010). To reduce the risk of false positives due to tag-jumping, OTUs with fewer than 50 reads across all samples were removed from the analysis, and an OTU was removed from any sample in which it was represented by 50 or fewer reads. Chloroplast sequences were excluded from the 16S rDNA dataset and bacterial sequences were excluded from the tufA dataset. To investigate whether the sequencing effort was deep enough to represent the community, rarefaction curves of the number of observed OTUs per number of reads were constructed by randomly subsampling the reads in QIIME (Supplementary figure S6). To correct for different sequencing depths among samples, a rarefaction threshold was set for each marker at the point where the curve reaches an asymptote: 2500 reads for 16S rDNA, UPA and ITS, and 1000 reads for the tufA marker. Samples containing fewer reads than this threshold were excluded from the analyses. Sørensen similarity and UniFrac distances (Lozupone et al. 2011) were calculated for each pair of samples in QIIME.
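The read-count filters and the rarefaction step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the QIIME workflow used here: the OTU table is represented as nested dictionaries, the function names are hypothetical, and the toy counts are invented purely to exercise the thresholds.

```python
import random


def filter_otus(table, min_total=50, min_per_sample=50):
    """Apply the two read-count filters to an OTU table.

    table: dict mapping OTU id -> dict mapping sample id -> read count.
    An OTU is dropped entirely if it has fewer than `min_total` reads overall,
    and dropped from any sample where it has `min_per_sample` or fewer reads.
    """
    filtered = {}
    for otu, counts in table.items():
        if sum(counts.values()) < min_total:
            continue
        kept = {s: n for s, n in counts.items() if n > min_per_sample}
        if kept:
            filtered[otu] = kept
    return filtered


def rarefy(sample_counts, depth, rng):
    """Randomly subsample one sample to `depth` reads without replacement.

    sample_counts: dict mapping OTU id -> read count for a single sample.
    Returns None when the sample holds fewer reads than `depth`, mirroring
    the exclusion of shallow samples.
    """
    pool = [otu for otu, n in sample_counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


# Invented toy counts (not data from this study).
toy_table = {
    "otu1": {"A": 400, "B": 30},
    "otu2": {"A": 20, "B": 10},
    "otu3": {"A": 51, "B": 50},
}
print(filter_otus(toy_table))
```

In the toy table, otu2 is removed outright (30 reads in total), while otu1 and otu3 are kept only in sample A, where they exceed 50 reads.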

Coral identification

Coral identifications were based on the combination of internal transcribed spacer (ITS) sequences and corallite morphology. The ITS OTUs with relative abundance greater than 75% were aligned with reference ITS sequences of previously identified Porites (Forsman et al. 2009; Hellberg et al. 2016) using MAFFT v.7.222 (Katoh & Standley 2013). A maximum likelihood tree was constructed with RAxML v.8.2.6 (Stamatakis 2006) (Supplementary figure S6).

Distance decay relationship and species accumulation curve

The rate of decay in community similarity with distance (the distance-decay relationship, DDR) was calculated for each marker separately as the slope of the linear least squares regression on the relationship between pairwise geographic distance and similarity (Nekola & White 1999; Martiny et al. 2011). In cases where pairwise similarity was 0 (i.e. no OTUs in common), it was replaced by the lowest nonzero community similarity observed in the similarity matrix (Martiny et al. 2011). The community similarity and geographical distances were log10-transformed prior to DDR analyses. The significance of the slope was tested with Mantel tests (9,990 permutations) using the vegan R package (Oksanen et al. 2007).
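The slope calculation described above can be illustrated with a small, self-contained sketch. It is not the R code used in this study; the function names are hypothetical and the input is assumed to be two parallel lists holding, for every pair of samples, the geographical distance and the community similarity.

```python
import math


def replace_zeros(similarities):
    """Replace zero similarities by the lowest nonzero similarity observed."""
    lowest = min(s for s in similarities if s > 0)
    return [s if s > 0 else lowest for s in similarities]


def ddr_slope(distances, similarities):
    """Least squares slope of log10(similarity) against log10(distance)."""
    x = [math.log10(d) for d in distances]
    y = [math.log10(s) for s in replace_zeros(similarities)]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Invented example: similarity halves with every tenfold increase in distance.
print(ddr_slope([1, 10, 100, 1000], [0.8, 0.4, 0.2, 0.1]))
```

In the invented example, similarity halves for each tenfold increase in distance, so the slope equals log10(0.5), roughly -0.30. A negative slope indicates distance decay and a slope near zero indicates no turnover with distance.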

To investigate the degree of patchiness within colonies, species accumulation curves as a function of the number of samples and of geographical distance were calculated for each colony, using 100 permutations for the sample-based curves.
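A sample-based accumulation curve of the kind described above can be sketched as follows. This is an illustrative implementation under simple assumptions (each sample is a set of OTU identifiers and sample order is shuffled uniformly at random), not the routine used in this study; the function name is hypothetical.

```python
import random


def accumulation_curve(samples, permutations=100, seed=0):
    """Mean cumulative OTU richness after adding 1, 2, ..., n samples.

    samples: list of sets, each holding the OTUs detected in one sample.
    The order in which samples are added is randomised `permutations` times.
    """
    rng = random.Random(seed)
    totals = [0] * len(samples)
    for _ in range(permutations):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for k, otus in enumerate(order):
            seen |= otus
            totals[k] += len(seen)
    return [t / permutations for t in totals]


# Invented toy colony with three samples.
curve = accumulation_curve([{"a", "b"}, {"b", "c"}, {"c", "d"}])
print(curve)
```

A curve that keeps rising steeply as samples are added points to a patchy community, whereas an early plateau suggests that a few samples already capture most of the OTUs in the colony.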


4.3 Niche specificity and neutral processes underlying the distribution of endolithic microbes

Introduction

Niche and neutral processes shape the distribution of biodiversity (Chase & Myers 2011). Niche processes are deterministic processes such as specialization to particular habitats, environmental filtering and biotic interactions. They are a consequence of natural selection happening over evolutionary timescales and result in a patchy distribution of biodiversity. Neutral processes are stochastic processes such as random births and deaths, colonization and extinction events, and ecological drift (i.e. random demographic changes; Hubbell 2001). Neutral processes also lead to a patchy distribution of biodiversity. The relative importance of niche versus neutral processes for species distributions is a matter of debate in the literature, and requires consideration of how the relative importance of these processes changes across scales (Chase & Myers 2011). For example, species may be adapted to specific microhabitats at local scales and have a patchy distribution due to dispersal limitation and stochastic colonization at large geographical scales. The study of how community similarity changes with distance (distance-decay relationships) can help to disentangle niche and neutral processes underlying the distribution of organisms. If the relative importance of niche processes is high, then communities from similar ecological niches are expected to have a similar species composition irrespective of the distance between them. Neutral processes, on the other hand, are directly related to dispersal and geographical distance: random colonization events, for example, are more likely to occur at locations nearer the current species distribution. If the relative importance of neutral processes is high, then a negative correlation between community similarity and geographical distance should be observed even after controlling for niche factors.

In the previous section we observed that, within individual corals, endolithic bacteria have a high rate of species turnover (i.e. β-diversity) while endolithic algae showed a non-significant rate of species turnover. Here we study the global distribution of endolithic microbes and test whether rates of species turnover vary across spatial scales (intracolony to global). We then investigate the relative influence of niche and neutral processes underlying the β-diversity of endolithic microbes.


Materials and Methods

Sampling and estimation of geographical distances

A total of 163 skeleton samples of Porites sp., including previously published (Chapter 3 and section 4.2) and newly sequenced samples, were collected across distances varying from 0.4 cm (intracolony) to 17,140 km (Supplementary table S4). The Euclidean distances between samples, corrected for Earth’s curvature, were calculated with the R package geosphere (Hijmans et al. 2012) and are provided in Supplementary table S5. Five geographical scales were analysed, chosen to maximize the distribution of sampled distances (in log space) among the scale categories: i) intracolony scale; ii) local scale, representing samples within contiguous reefs (90 cm – 1 km, excluding intracolony distances); iii) regional scale (1 km – 50 km); iv) large scale (50 km – 17,140 km) and v) a global analysis incorporating all distances (0.4 cm – 17,140 km). Since the global analysis incorporates a higher number of samples and distances, we expect that it will give a more accurate description of the global distribution of endolithic microbes.
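The distance calculation and the assignment of sample pairs to scale categories can be sketched as below. This is not the geosphere code used in this study: the haversine formula on a sphere of radius 6371 km is our assumption (the text does not state which geosphere function was used), the function names are hypothetical, and the handling of values falling exactly on a category boundary is an arbitrary choice.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance (km) between two points given in decimal degrees.

    Assumption: a spherical Earth with a mean radius of 6371 km.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def scale_category(distance_km, same_colony):
    """Assign a pair of samples to one of the scale categories in the text."""
    if same_colony:
        return "intracolony"
    if distance_km <= 1:
        return "local"      # different colonies, up to 1 km apart
    if distance_km <= 50:
        return "regional"   # 1 km to 50 km
    return "large"          # more than 50 km


# Generic check on the equator: a quarter of the Earth's circumference.
d = haversine_km(0.0, 0.0, 0.0, 90.0)
print(round(d, 1), scale_category(d, same_colony=False))
```

Every pair also enters the global analysis, so the categories returned here only determine which scale-specific analysis a pair contributes to.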

Community similarity distances

Sample processing, DNA isolation and taxonomic profiling followed previously described protocols in section 4.1. This section focused on the prokaryotic (16S rDNA) and eukaryotic green algal (tufA) communities. Community similarity was based on the Sørensen index, which is one of the most commonly used β-diversity indices in distance-decay studies and thus facilitates cross-study comparisons (e.g. Horner-Devine et al. 2004; Soininen et al. 2007b; Martiny et al. 2011; Barreto et al. 2014).
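For reference, the Sørensen index between two samples is twice the number of shared OTUs divided by the sum of the OTU richness of both samples. A minimal sketch follows; the function name is hypothetical and returning 0 for two empty samples is an arbitrary convention of this sketch.

```python
def sorensen_similarity(otus_a, otus_b):
    """Sørensen similarity between two samples given as sets of OTU ids.

    Only presence/absence is used; read abundances are ignored.
    """
    total = len(otus_a) + len(otus_b)
    if total == 0:
        return 0.0  # convention chosen here for two empty samples
    shared = len(otus_a & otus_b)
    return 2 * shared / total


# Two invented samples sharing two of their three OTUs.
print(sorensen_similarity({"a", "b", "c"}, {"b", "c", "d"}))
```

The index ranges from 0 (no OTUs in common) to 1 (identical OTU composition).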

Niche distances

Both biotic and abiotic features of the ecological niche were assessed. Depth was recorded for each sample during field work. Mean sea surface temperature (SST), phosphate concentration, chlorophyll A and diffuse attenuation data for each collection site were extracted from the Bio-ORACLE database (Tyberghein et al. 2012). The abiotic variables extracted from Bio-ORACLE have a ~9 km resolution and are more appropriate for regional, large and global scale analyses, therefore their influence at smaller spatial scales was not assessed. The biotic niche features included coral host sequence similarity and relative abundance of algae and bacteria. The most common Internal Transcribed Spacer (ITS) OTU was used as a representative of the coral genotype. The host ITS sequence similarity was calculated with the MEGA software (Tamura et al. 2013) using the Jukes-Cantor model and a gamma-distributed rate variation among sites. The relative abundance of algae was estimated as the percentage of Chlorophyta sequences in the tufA dataset and was used as a niche feature for the endolithic bacteria. The relative abundance of bacteria was estimated as the percentage of all sequences that did not match with Archaea, “undetermined” or Cyanobacteria (which includes green algae in the Greengenes reference dataset used to classify 16S OTUs). The relative abundance of bacterial OTUs was considered part of the biotic niche of endolithic algae and vice versa. To avoid auto-correlation we did not compare relative abundances and community similarity within the same dataset (e.g. relative abundances in the 16S dataset were not considered a niche feature for bacteria). A Euclidean similarity matrix was produced for each niche feature using R. Additionally, the overall (biotic and abiotic) niche similarity was estimated as the average similarity between all niche features. In cases where abiotic niche features were equal for all samples (e.g. temperature for intracolony and local scales), only the biotic niche features were used to calculate the overall niche similarity.
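The construction of per-feature and overall niche matrices can be sketched as follows. This is an illustration only, not the R code used in this study. Two choices are assumptions of the sketch: each feature matrix is rescaled by its maximum before averaging, so that features measured in different units contribute equally, and features that are identical across all samples are skipped. The function names and the example values are invented.

```python
def feature_distance(values):
    """Pairwise Euclidean distance matrix for a single niche feature."""
    n = len(values)
    return [[abs(values[i] - values[j]) for j in range(n)] for i in range(n)]


def overall_niche_distance(features):
    """Average of the rescaled distance matrices of all informative features.

    features: dict mapping feature name -> list of values, one per sample.
    Assumption: each matrix is divided by its maximum before averaging.
    Features with the same value in every sample are ignored.
    """
    n = len(next(iter(features.values())))
    matrices = []
    for values in features.values():
        matrix = feature_distance(values)
        top = max(max(row) for row in matrix)
        if top == 0:
            continue  # feature is constant across samples
        matrices.append([[v / top for v in row] for row in matrix])
    if not matrices:
        raise ValueError("no niche feature varies among the samples")
    return [[sum(m[i][j] for m in matrices) / len(matrices) for j in range(n)]
            for i in range(n)]


# Invented values for three samples (not data from this study).
example = {
    "depth_m": [2, 4, 10],
    "sst_c": [25, 25, 25],      # constant, therefore skipped
    "algae_pct": [10, 30, 20],
}
niche = overall_niche_distance(example)
print(niche[0][1], niche[0][2], niche[1][2])
```

Skipping constant features mirrors the rule stated above: where abiotic features are equal for all samples, only the remaining (biotic) features enter the overall matrix.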

Statistical analyses

The decay in community similarity with distance was estimated for all scales as described in section 4.2. Mantel tests (999 permutations) were used to assess the significance of the DDR slope. Partial Mantel tests (999 permutations) were used to tease apart the effects of stochastic and niche processes. We tested the correlation of each niche feature with community similarity while controlling for the effects of geographical distance. Likewise, we tested the effects of geographical distance alone on community similarity, while controlling for the effects of the overall niche similarity. Given the coarse resolution (~9 km) of the majority of the environmental variables, we did not assess their relative importance at small scales (intracolony and local); these variables are likely to vary little at such scales and therefore to play a minor role in shaping β-diversity there.
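The logic of the Mantel and partial Mantel tests can be sketched in plain Python. This is not the vegan implementation used in this study: it uses Pearson correlation on the upper triangles of square symmetric matrices, permutes the rows and columns of the first matrix jointly, and reports a one-sided p-value computed as (hits + 1) / (permutations + 1). That p-value convention and all function names are assumptions of this sketch.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if a vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle values after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def partial_correlation(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def mantel(a, b, control=None, permutations=999, seed=0):
    """(Partial) Mantel statistic and one-sided permutation p-value."""
    rng = random.Random(seed)
    identity = list(range(len(a)))
    y = upper_triangle(b, identity)
    z = upper_triangle(control, identity) if control is not None else None

    def statistic(order):
        x = upper_triangle(a, order)
        return pearson(x, y) if z is None else partial_correlation(x, y, z)

    observed = statistic(identity)
    hits = 0
    for _ in range(permutations):
        order = identity[:]
        rng.shuffle(order)
        if statistic(order) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)
```

With a one-sided test of this kind, a strongly negative statistic yields a p-value close to 1. A matrix with no variation makes the correlation undefined, which is the kind of failure that prevents a test from being computed when a niche feature barely varies among samples.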

Results

Endolithic β-diversity across scales

The overall rate of species turnover was substantially different across scales for both bacterial and algal members of the endolithic community. The bacterial members of the microbiome showed a decrease in similarity with distance when all scales were analysed together (global scale), but the slope of the DDR was not significantly different from zero (Figure 4.3A). The DDR slope was only significantly different from zero at the intracolony scale (Figure 4.3, Table 4.1), which was also observed in the previous section (Figure 4.2).

The algal community showed a significant decay in community similarity at global scales, but the pattern is less clear at the individual scales analysed separately (Figure 4.4, Table 4.1). The results showed a counter-intuitive significant increase in community similarity with distance at regional scales, although the variation in community similarity for similar distances within this scale was high (Figure 4.4D, Table 4.1).

Figure 4.3: Distance-decay relationships for the bacterial community in coral skeletons across different scales. Blue lines indicate the linear regression between (log10 transformed) geographical distance and (log10 transformed) Sørensen community similarity. Shaded areas represent the 95% confidence interval. The red asterisk indicates a correlation significantly different from zero according to Mantel tests.

Figure 4.4: Distance-decay relationships for the green algae community in coral skeletons across different scales. The blue lines indicate the linear regression between (log10 transformed) geographical distance and (log10 transformed) Sørensen community similarity. Shaded areas represent the 95% confidence interval. Red asterisks indicate correlations significantly different from zero according to Mantel tests.

Influence of neutral and niche processes on β-diversity

All niche factors analysed here were significantly correlated with bacterial and algal community similarity at large and/or global scales (Table 4.1), indicating that sites with an overall similar niche had a similar community composition. At regional scales, depth and biotic factors (coral similarity and relative abundance of algae) were significantly correlated with bacterial community similarity. The algal community composition was also significantly correlated with coral host (at large spatial scales), relative abundance of bacteria (at large and global scales) and environmental factors (mostly at large spatial scales). When the effect of the overall niche similarity was removed, geographical distance showed no influence on the decay in community similarity except for endolithic bacteria at the intracolony scale (Table 4.1), suggesting a higher relative importance of niche processes in shaping the distribution of endolithic microbes.

Table 4.1. Influence of geographical distance and niche factors on endolithic bacterial and algal communities. Partial Mantel tests were used to infer the influence of niche factors while controlling for the effects of geographical distance and vice versa.

Bacterial community (16S rDNA)

                              Global      Colony      Local       Region      Large
                              all scales  0.4–199 cm  90 cm–1 km  1–50 km     50 km+
Distance (including niche)
  DDR slope                   -0.058      -0.148      -0.223      0.686       -0.049
  Mantel r                    0.041       0.503       0.072       0.241       -0.055
  Mantel P                    0.175       0.002*      0.361       0.061       0.955
Distance (controlling for niche)
  Mantel r                    -0.378      0.500       -0.209      0.247       -0.192
  Mantel P                    1.000       0.008*      0.793       0.103       0.991
Depth
  Mantel r                    0.353       na          na          0.582       0.353
  Mantel P                    0.001*      na          na          0.007*      0.001*
SST mean
  Mantel r                    0.135       na          na          0.118       0.169
  Mantel P                    0.068       na          na          0.189       0.041*
Phosphate
  Mantel r                    0.158       na          na          0.176       0.089
  Mantel P                    0.002*      na          na          0.109       0.087
Chlorophyll A mean
  Mantel r                    0.196       na          na          0.066       0.224
  Mantel P                    0.020*      na          na          0.297       0.008*
Diffuse Attenuation mean
  Mantel r                    0.187       na          na          -0.091      0.213
  Mantel P                    0.021*      na          na          0.766       0.016*
Coral sequence similarity
  Mantel r                    0.194       na          0.572       0.300       0.165
  Mantel P                    0.017*      na          0.002*      0.033*      0.044*
Algae rel. abundance
  Mantel r                    0.311       0.153       0.539       0.276       0.337
  Mantel P                    0.023*      0.209       0.052       0.042*      0.008*
Bacteria rel. abundance
  Mantel r                    na          na          na          na          na
  Mantel P                    na          na          na          na          na
Overall niche similarity
  Mantel r                    0.414       na          0.684       0.383       0.232
  Mantel P                    0.001*      na          0.061       0.115       0.045*

Green algae community (tufA)

                              Global      Colony      Local       Region      Large
                              all scales  0.4–199 cm  90 cm–1 km  1–50 km     50 km+
Distance (including niche)
  DDR slope                   -0.049      -0.061      -0.195      -0.018      0.147
  Mantel r                    0.385       0.210       -0.327      0.331       -0.094
  Mantel P                    0.001*      0.118       0.893       0.012*      1.000
Distance (controlling for niche)
  Mantel r                    -0.105      0.197       -0.402      0.278       -0.350
  Mantel P                    0.952       0.155       nan         nan         1.000
Depth
  Mantel r                    0.388       na          na          0.014       0.321
  Mantel P                    0.001*      na          na          0.472       0.001*
SST mean
  Mantel r                    0.186       na          na          -0.027      0.245
  Mantel P                    0.005*      na          na          0.580       0.001*
Phosphate
  Mantel r                    -0.035      na          na          -0.027      0.166
  Mantel P                    0.847       na          na          0.551       0.004*
Chlorophyll A mean
  Mantel r                    0.066       na          na          -0.028      0.116
  Mantel P                    0.120       na          na          0.579       0.016*
Diffuse Attenuation mean
  Mantel r                    0.115       na          na          -0.033      0.153
  Mantel P                    0.020*      na          na          0.583       0.002*
Coral sequence similarity
  Mantel r                    0.067       na          0.014       0.113       0.130
  Mantel P                    0.116       na          0.403       nan         0.014*
Algae rel. abundance
  Mantel r                    na          na          na          na          na
  Mantel P                    na          na          na          na          na
Bacteria rel. abundance
  Mantel r                    0.342       0.199       0.323       0.112       0.303
  Mantel P                    0.001*      0.200       0.136       0.236       0.001*
Overall niche similarity
  Mantel r                    0.247       na          0.422       -0.093      0.539
  Mantel P                    0.001*      na          nan         nan         0.001*


Discussion

This study shows that the β-diversity of endolithic microorganisms in coral skeletons can vary substantially across geographical scales and between organisms. In the previous section (4.2) we found a significant decay in community similarity with intracolony distances for bacteria, but not for the algae. At global scales this pattern is reversed: only the algae showed a significant decay in community similarity with global distances (Figures 4.3 and 4.4). Endolithic microbes are not cosmopolitan: the average community similarity between samples from adjacent colonies versus samples that are 17,000 km apart was ≈ 30% vs. 10% respectively for bacteria and ≈ 60% vs. 17% for algae (Figures 4.2B, 4.3 and 4.4). Interestingly, the overall community similarity of both algal and bacterial communities among samples separated by 300 metres to 17,000 kilometres does not decrease with distance (Figures 4.3A and 4.4A), suggesting that sampling across large (global) geographical scales will not contribute to the species diversity more than sampling across smaller scales. In other words, the differences in community composition between samples that are only 300 m apart can be equivalent to differences between samples from opposite sides of the Earth.

We found no evidence that neutral processes influence the β-diversity of endolithic microorganisms except for bacteria at intracolony scales. As discussed in the previous section, bacteria might be more dispersal-limited inside corals than algal species that actively bore into limestone. Neutral processes such as ecological drift and dispersal limitation are the most likely causes for the high rate of bacterial species turnover at the intracolony scale. At larger scales, however, niche processes better explain the decay in community similarity of algae and bacteria (Table 4.1). Temperature and depth have been associated with the distribution of marine microbes before (e.g. Johnson et al. 2006; Sunagawa et al. 2015), and here we show that other niche features such as phosphate and chlorophyll A (a proxy for nutrient concentration) and diffuse attenuation (water turbidity) also play a role in shaping the composition of endolithic microbes. The biotic niche, rarely incorporated in microbial biogeography studies, was found to be as important as abiotic niche factors in shaping microbial β-diversity. Even though our analyses were restricted to one coral genus, the host sequence similarity (a measure of phylogenetic relatedness) significantly correlated with the similarity of endolithic algae (at large spatial scales) and bacteria (at all scales analysed). The relative abundance of endolithic algae was also correlated with the composition of the bacterial communities and vice versa at large geographical scales (Table 4.1). These results suggest that non-random interactions (e.g. mutualistic or competitive) among algae, bacteria and the coral host exist in endolithic communities.

Niche factors were significantly correlated with community similarity at large and global scales but rarely at smaller geographical scales. One possible explanation for this pattern is that the number of different habitats sampled increases substantially with geographical distance and so do the chances of observing a significant correlation between community similarity and niche features. Additionally, any variability of macroecological niche features (e.g. temperature) occurring at scales smaller than 9 km (the resolution of the environmental variables extracted from Bio-ORACLE) would not be detected in our analyses. The unexpected increase in community similarity with distance observed at regional scales (Figures 4.3D and 4.4D) may also be a consequence of the number of habitats sampled at scales larger than 300 m. The habitat features were mostly homogeneous at scales smaller than 300 m. At regional geographical scales, more samples from distant localities with similar ecological niches and endolithic communities were incorporated, resulting in an increase in community similarity as a function of distance.

Environmental sequencing and distance-decay studies have proven to be powerful tools to investigate microbial β-diversity, but they also have inherent caveats. We minimized the potential effects of sequencing bias and tag jumping (incorrect assignment of OTUs to samples) with a stringent quality control (e.g. excluding all OTUs in samples where they occur with fewer than 50 reads). Although this excludes rare OTUs from our analyses, rare species were found to have a minor influence on distance-decay relationships (Morlon et al. 2008). We also adopted a presence-absence similarity index (Sørensen), which is less sensitive to PCR biases and is one of the most widely used metrics in distance-decay studies (e.g. Horner-Devine et al. 2004; Soininen et al. 2007b; Martiny et al. 2011; Barreto et al. 2014). Future studies, however, would benefit from incorporating abundances if possible, as they may be a more realistic representation of species' environmental affinities.
The sources of uncertainty when assessing the relative importance of niche factors include spatial autocorrelation between environmental variables and the effects of unmeasured variables. Many environmental variables are typically autocorrelated, therefore a significant correlation between a given environmental variable and community similarity may reflect the effect of other variables. Analyses that better handle autocorrelation (Legendre & Fortin 2010; Wang et al. 2013) will be applied to accurately estimate the proportion of variance in community similarity that is explained by distance and niche factors when we prepare this manuscript for submission to a scientific journal. The correlation between some niche factors and community similarity could not be statistically investigated at local and regional scales for algae due to a high similarity between variables (i.e. during the partial Mantel permutations, zero standard deviations would occur and produce an error). These observations highlight the importance of considering a sufficiently large sampling (Green & Bohannan 2006) and a geographical scale that is relevant to the environmental variable being tested. For example, more extensive sampling along temperature gradients might reveal a statistically significant correlation between temperature and community similarity at smaller spatial scales.

Finally, it should be noted that the framework proposed here to disentangle the effects of neutral and niche features is based on the assumption that dispersal (which is considered to be proportional to distance, e.g. Martiny et al. 2011) affects neutral but not niche processes. Restricted dispersal plays an important role in species diversification and possibly affects niche processes (e.g. diversification of habitat affinities) at evolutionary timescales. Because diversification rates of marine microbes are much slower than dispersal rates, niche specialization is unlikely to be proportional to distance, and therefore this is unlikely to affect our analyses (see Finlay 2002; Fenchel & Finlay 2004; Chase & Myers 2011).

In summary, this study shows that the endolithic algae and bacteria are not uniformly distributed across space. While neutral processes drive the patchy distribution of endolithic bacteria within colonies, biotic and abiotic niche features better explain the distribution of both algae and bacteria at larger spatial scales. These results imply that different (often cryptic) endolithic species have different ecophysiological traits, and therefore are likely to have distinct ecological roles in reef ecosystems, or to respond differently to environmental perturbations such as climate change. The results also suggest non-random associations among species of endolithic algae, bacteria and corals, which could be an indication of symbiosis, but more studies are required to investigate the network of associations within the diverse coral microbiome. Finally, coral skeletons are extreme habitats for algae where only a specialist population was expected to be found, and it is remarkable that endolithic algae also have well-defined macroecological niches and a high global β-diversity. Investigating the evolutionary adaptation of green algae to the peculiarities of the endolithic niche(s) is an exciting avenue for future research.

4.4 References Ainsworth T, Krause L, Bridge T, Torda G, Raina J-B, et al. (2015) The coral core microbiome identifies rare bacterial taxa as ubiquitous endosymbionts. ISME Journal 9, 2261-2274. Anderson MJ, Crist TO, Chase JM, Vellend M, Inouye BD, et al. (2011) Navigating the multiple meanings of beta diversity: a roadmap for the practicing ecologist. Ecology Letters 14, 19-28. Aponte NE, Ballantine DL (2001) Depth distribution of algal species on the deep insular fore reef at Lee Stocking Island, Bahamas. Deep Sea Research Part I: Oceanographic Research Papers 48, 2185-2194. Barreto DP, Conrad R, Klose M, Claus P, Enrich-Prast A (2014) Distance-decay and taxa- area relationships for bacteria, archaea and methanogenic archaea in a tropical lake sediment. PLoS One 9, e110128. CHAPTER 4 92

Caporaso JG, Kuczynski J, Stombaugh J, Bittinger K, Bushman FD, et al. (2010) QIIME allows analysis of high-throughput community sequencing data. Nature Methods 7, 335-336. Chase JM, Myers JA (2011) Disentangling the importance of ecological niches from stochastic processes across scales. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences 366, 2351-2363. Crossland CJ, Barnes DJ (1976) Acetylene reduction by coral skeletons. Limnology and Oceanography 21, 153-156. Daniels CA, Zeifman A, Heym K, Ritchie KB, Watson CA, Berzins I, Breitbart M (2011) Spatial heterogeneity of bacterial communities in the mucus of Montastraea annularis. Marine Ecology Progress Series 426, 29-40. Del Campo J, Pombert JF, Slapeta J, Larkum A, Keeling PJ (2016) The 'other' coral symbiont: Ostreobium diversity and distribution. ISME Journal. Edgar RC (2013) UPARSE: highly accurate OTU sequences from microbial amplicon reads. Nature Methods 10, 996-998. Fenchel TOM, Finlay BJ (2004) The ubiquity of small species: patterns of local and global diversity. Bioscience 54, 777. Finlay BJ (2002) Global dispersal of free-living microbial species. Science 296, 1061-1063. Forsman ZH, Barshis DJ, Hunter CL, Toonen RJ (2009) Shape-shifting corals: molecular markers show morphology is evolutionarily plastic in Porites. BMC Evolutionary Biology 9, 45. Green J, Bohannan BJ (2006) Spatial scaling of microbial biodiversity. Trends in Ecology & Evolution 21, 501-507. Green JL, Holmes AJ, Westoby M, Oliver I, Briscoe D, et al. (2004) Spatial scaling of microbial eukaryote diversity. Nature 432, 747-750. Gunnarsson K, Nielsen R (2016) Culture and field studies of Ulvellaceae and other microfilamentous green seaweeds in subarctic and arctic waters around Iceland. Nova Hedwigia 103, 17-46. Gutner-Hoch E, Fine M (2011) Genotypic diversity and distribution of Ostreobium quekettii within scleractinian corals. Coral Reefs 30, 643-650. 
Hansson L, Agis M, Maier C, Weinbauer MG (2009) Community composition of bacteria associated with cold-water coral Madrepora oculata: within and between colony variability. Marine Ecology Progress Series 397, 89-102. Hellberg ME, Prada C, Tan MH, Forsman ZH, Baums IB (2016) Getting a grip at the edge: recolonization and introgression in eastern Pacific Porites corals. Journal of Biogeography. Hijmans RJ, Williams E, Vennes C (2012) geosphere: Spherical Trigonometry. R package version 1.2–28. CRAN. R-project. org/package= geosphere. Hoeksema BW (2012) Forever in the dark: the cave-dwelling azooxanthellate reef coral Leptoseris troglodyta sp. n. (Scleractinia, Agariciidae). ZooKeys 37, 21-37. CHAPTER 4 93

Horner-Devine MC, Lage M, Hughes JB, Bohannan BJ (2004) A taxa-area relationship for bacteria. Nature 432, 750-753. Horner-Devine MC, Silver JM, Leibold MA, Bohannan BJ, Colwell RK, et al. (2007) A comparison of taxon co-occurrence patterns for macro- and microorganisms. Ecology 88, 1345-1353. Hubbell S (2001) The unified neutral theory of species abundance and diversity. Princeton University Press, Princeton, NJ. Hubbell, SP (2004) Quarterly Review of Biology 79, 96-97. Johnson ZI, Zinser ER, Coe A, McNulty NP, Woodward EM, Chisholm SW (2006) Niche partitioning among Prochlorococcus ecotypes along ocean-scale environmental gradients. Science 311, 1737-1740. Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30, 772-780. Kornmann P, Sahling P-H (1980) Ostreobium quekettii (Codiales, Chlorophyta). Helgoländer Meeresuntersuchungen 34, 115-122. Legendre P, Fortin MJ (2010) Comparison of the Mantel test and alternative approaches for detecting complex multivariate relationships in the spatial analysis of genetic data. Mol Ecol Resour 10, 831-844. Lozupone C, Lladser ME, Knights D, Stombaugh J, Knight R (2011) UniFrac: an effective distance metric for microbial community comparison. ISME Journal 5, 169-172. Marcelino VR, Verbruggen H (2016) Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Scientific Reports 6, 31508. Martiny JB, Bohannan BJ, Brown JH, Colwell RK, Fuhrman JA, et al. (2006) Microbial biogeography: putting microorganisms on the map. Nature Reviews: Microbiology 4, 102-112. Martiny JB, Eisen JA, Penn K, Allison SD, Horner-Devine MC (2011) Drivers of bacterial beta-diversity depend on spatial scale. Proceedings of the National Academy of Sciences, USA 108, 7850-7854. 
Meron D, Rodolfo-Metalpa R, Cunning R, Baker AC, Fine M, Banin E (2012) Changes in coral microbial communities in response to a natural pH gradient. ISME Journal 6, 1775-1785. Morlon H, Chuyong G, Condit R, Hubbell S, Kenfack D, et al. (2008) A general framework for the distance-decay of similarity in ecological communities. Ecology Letters 11, 904-917. Nekola JC, White PS (1999) The distance decay of similarity in biogeography and ecology. Journal of Biogeography 26, 867-878. Odum HT, Odum EP (1955) Trophic structure and productivity of a windward coral reef community on Eniwetok Atoll. Ecological Monographs 25, 291-320. CHAPTER 4 94

Oksanen J, Kindt R, Legendre P, O’Hara B, Stevens MHH, Oksanen MJ, Suggests M (2007) The vegan package. Community ecology package 10. Page CA, Willis BL (2007) Epidemiology of skeletal eroding band on the Great Barrier Reef and the role of injury in the initiation of this widespread coral disease. Coral Reefs 27, 257-272. Pica D, Tribollet A, Golubic S, Bo M, Di Camillo CG, Bavestrello G, Puce S (2016) Microboring organisms in living stylasterid corals (Cnidaria, Hydrozoa). Marine Biology Research 12, 573-582. Rohwer F, Seguritan V, Azam F, Knowlton N (2002) Diversity and distribution of coral- associated bacteria. Marine Ecology Progress Series 243, 1-10. Sauvage T, Schmidt WE, Suda S, Fredericq S (2016) A metabarcoding framework for facilitated survey of endolithic phototrophs with tufA. BMC Ecology 16, 8. Schlichter D, Zscharnack B, Krisch H (1995) Transfer of photoassimilates from endolithic algae to coral tissue. Naturwissenschaften 82, 561-564. Soininen J (2012) Macroecology of unicellular organisms - patterns and processes. Environmental Microbiology Reports 4, 10-22. Soininen J, Kokocinski M, Estlander S, Kotanen J, Heino J (2007a) Neutrality, niches, and determinants of plankton metacommunity structure across boreal wetland ponds. Ecoscience 14, 146-154. Soininen J, Korhonen JJ, Karhu J, Vetterli A (2011) Disentangling the spatial patterns in community composition of prokaryotic and eukaryotic lake plankton. Limnology and Oceanography 56, 508-520. Soininen J, McDonald R, Hillebrand H (2007b) The distance decay of similarity in ecological communities. Ecography 30, 3-12. Stamatakis A (2006) RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22, 2688-2690. Sunagawa S, Coelho LP, Chaffron S, Kultima JR, Labadie K, et al. (2015) Structure and function of the global ocean microbiome. Science 348. 
Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. Molecular Biology and Evolution 30, 2725-2729.
Tribollet A (2008) The boring microflora in modern coral reef ecosystems: a review of its roles. In: Current developments in bioerosion, pp. 67-94. Springer.
Tribollet A, Radtke G, Golubic S (2011) Bioerosion. In: Encyclopedia of Geobiology (eds. Reitner J, Thiel V), pp. 117-134. Springer Netherlands.
Tyberghein L, Verbruggen H, Pauly K, Troupin C, Mineur F, De Clerck O (2012) Bio-ORACLE: a global environmental dataset for marine species distribution modelling. Global Ecology and Biogeography 21, 272-281.
Wang J, Shen J, Wu Y, Tu C, Soininen J, et al. (2013) Phylogenetic beta diversity in bacterial assemblages across ecosystems: deterministic versus stochastic processes. ISME Journal 7, 1310-1321.

Wang Q, Garrity GM, Tiedje JM, Cole JR (2007) Naïve Bayesian Classifier for rapid assignment of rRNA sequences into the new bacterial taxonomy. Applied and Environmental Microbiology 73, 5261-5267.
Webster R, Boag B (1992) Geostatistical analysis of cyst nematodes in soil. Journal of Soil Science 43, 583-595.
Weinstein DK, Sharifi A, Klaus JS, Smith TB, Giri SJ, Helmle KP (2016) Coral growth, bioerosion, and secondary accretion of living orbicellid corals from mesophotic reefs in the US Virgin Islands. Marine Ecology Progress Series 559, 45-63.
White TJ, Bruns T, Lee SJWT, Taylor JW (1990) Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: PCR protocols: a guide to methods and applications, pp. 315-322.
Yamazaki SS, Nakamura T, Yamasaki H (2008) Photoprotective role of endolithic algae colonized in coral skeleton for the host photosynthesis. In: Photosynthesis. Energy from the Sun (eds. Allen JF, Gantt E, Golbeck JH, Osmond B), pp. 1391-1395. Springer Netherlands, Dordrecht.
Zinger L, Boetius A, Ramette A (2014) Bacterial taxa-area and distance-decay relationships in marine environments. Molecular Ecology 23, 954-964.


4.5 Supplementary Material

Supplementary figure S1. Distance-decay relationships for the bacterial (16S rDNA) and phototrophic (UPA and tufA) communities found in coral skeletons, highlighting the site where the corals were surveyed.


Supplementary figure S2. Bacterial species accumulation curves, based on the 16S rDNA marker. A) Percentage of the total number of OTUs observed in each colony that is recovered with increasing skeleton samples. These accumulation curves were obtained by randomizing the samples and storing the recovered OTU percentage 100 times. B) Percentage of the total number of OTUs observed in each colony that is recovered with increasing distance between samples.
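The randomization described in this caption can be sketched as follows. This is a minimal illustration in Python, not the code used for the thesis: the function name `accumulation_curve`, the toy colony and the OTU labels are hypothetical, and each skeleton sample is assumed to be available as a set of OTU identifiers.

```python
import random


def accumulation_curve(samples, n_permutations=100, seed=0):
    """Mean percentage of a colony's OTUs recovered after 1..n samples.

    `samples` is a list of sets of OTU identifiers (an assumed input format).
    The sample order is shuffled `n_permutations` times and the cumulative
    percentage of recovered OTUs is averaged over the shuffles.
    """
    rng = random.Random(seed)
    all_otus = set().union(*samples)  # every OTU observed in the colony
    totals = [0.0] * len(samples)
    for _ in range(n_permutations):
        order = samples[:]
        rng.shuffle(order)
        seen = set()
        for i, otus in enumerate(order):
            seen |= otus
            totals[i] += 100.0 * len(seen) / len(all_otus)
    return [t / n_permutations for t in totals]


# Hypothetical colony with four skeleton samples and five OTUs in total.
colony = [{"otu1", "otu2"}, {"otu2", "otu3"}, {"otu1", "otu4"}, {"otu5"}]
curve = accumulation_curve(colony)
print([round(value, 1) for value in curve])
```

The last point of the curve is 100% by construction, because the denominator is the number of OTUs observed in that colony rather than an estimate of its true richness.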


Supplementary figure S3. Species accumulation curves of eukaryotic green algae based on the tufA marker. A) Percentage of the total number of OTUs observed in each colony that is recovered with increasing skeleton samples. These accumulation curves were obtained by randomizing the samples and storing the recovered OTU percentage 100 times. B) Percentage of the total number of OTUs observed in each colony that is recovered with increasing distance between samples.


Supplementary figure S4. Species accumulation curves of photosynthetic eukaryotes and cyanobacteria based on the UPA marker. A) Percentage of the total number of OTUs observed in each colony that is recovered with increasing skeleton samples. These accumulation curves were obtained by randomizing the samples and storing the recovered OTU percentage 100 times. B) Percentage of the total number of OTUs observed in each colony that is recovered with increasing distance between samples.


Supplementary Figure S5: Alpha rarefaction curves showing the number of OTUs per number of sequences for the different skeleton samples (coloured lines).


Supplementary Figure S6: Maximum Likelihood tree of ITS coral sequences. Reference sequences from previously identified corals are in black and the OTUs retrieved from the corals analysed in this study are in red. High bootstrap support values (>80) in internal nodes are indicated. Bootstrap values from terminal nodes and values smaller than 80 are omitted.


Supplementary table S1: Slopes of the distance-decay relationships and results of Mantel analyses to test the significance of the correlation between distance and community similarity. Analyses were based on Sørensen similarity distance matrices (presence-absence data) and on UniFrac distance matrices (taking into consideration phylogenetic relatedness between species and their relative abundance).

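| Statistic | Intracolony 16S rDNA | Intracolony UPA | Intracolony tufA | Between colonies 16S rDNA | Between colonies UPA | Between colonies tufA |
|---|---|---|---|---|---|---|
| Sørensen-based DDR slope | -0.1494 | -0.0301 | -0.0612 | -0.1349 | -0.1871 | -0.1682 |
| Sørensen-based Mantel r | 0.5069 | 0.1853 | 0.2095 | 0.0429 | 0.0108 | -0.0569 |
| Sørensen-based Mantel P | 0.0038* | 0.1408 | 0.1323 | 0.3114 | 0.3799 | 0.7795 |
| UniFrac-based DDR slope | -0.0222 | -0.0042 | -0.0109 | -0.0513 | -0.0705 | -0.0580 |
| UniFrac-based Mantel r | 0.2891 | 0.0154 | 0.0144 | 0.0308 | -0.0129 | -0.0185 |
| UniFrac-based Mantel P | 0.0532 | 0.4631 | 0.4347 | 0.3387 | 0.4062 | 0.5562 |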
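The kind of calculation summarised in supplementary table S1 can be sketched in plain Python. The sketch rests on several assumptions that the caption does not state: the Mantel statistic is taken as the Pearson correlation between geographic distance and Sørensen dissimilarity (1 - similarity), the P value is a one-sided permutation estimate, and the DDR slope is an ordinary least-squares slope of similarity on untransformed distance (log-transformed axes are also common). All function names, sample positions and OTU sets are hypothetical toy values, not the thesis data.

```python
import math
import random


def sorensen(a, b):
    """Sørensen similarity between two sets of OTUs (presence-absence)."""
    return 2 * len(a & b) / (len(a) + len(b))


def upper_triangle(matrix):
    """Flatten the values above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def moment_sums(x, y):
    """Sums of cross-products and squares around the means of x and y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy, sxx, syy


def ols_slope(x, y):
    """Least-squares slope of y regressed on x."""
    sxy, sxx, _ = moment_sums(x, y)
    return sxy / sxx


def pearson(x, y):
    """Pearson correlation coefficient."""
    sxy, sxx, syy = moment_sums(x, y)
    return sxy / math.sqrt(sxx * syy)


def mantel(m1, m2, n_perm=999, seed=0):
    """Mantel r and one-sided permutation P value for two square matrices."""
    rng = random.Random(seed)
    n = len(m1)
    x = upper_triangle(m1)
    observed = pearson(x, upper_triangle(m2))
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # permute rows and columns of m2 together
        permuted = [[m2[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper_triangle(permuted)) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Toy transect: sample positions (cm) and OTU sets that overlap with neighbours.
positions_cm = [1, 2, 3, 5, 9]
samples = [set("abc"), set("bcd"), set("cde"), set("efg"), set("ghi")]
n = len(samples)
distance = [[abs(positions_cm[i] - positions_cm[j]) for j in range(n)] for i in range(n)]
similarity = [[sorensen(samples[i], samples[j]) for j in range(n)] for i in range(n)]
dissimilarity = [[1 - s for s in row] for row in similarity]

ddr_slope = ols_slope(upper_triangle(distance), upper_triangle(similarity))
mantel_r, mantel_p = mantel(distance, dissimilarity)
print(ddr_slope, mantel_r, mantel_p)
```

Permuting rows and columns together, rather than shuffling the pairwise values directly, keeps the dependence structure of a distance matrix intact, which is the reason a Mantel test is used instead of an ordinary correlation test on the pairwise values.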


Supplementary table S2. The strength of the distance-decay relationship (DDR-slope) for endolithic bacteria (16S rDNA) within colonies of Porites spp. in comparison with other organisms. The DDR slope for phototrophs was not significantly different from zero therefore is not shown. Studies deriving a taxa-area exponent (z-value) from the DDR slope following the method of Harte et al (1999) are included. It is important to note that z-values and DDR slope values can vary substantially according to geographic scale (Martiny et al 2011), sequencing approach (Terrat et al 2015) and methodologies (Zinger et al 2014). The z-values provided in Horner-Devine et al (2004) represent an average across several studies, therefore scale and sequencing technologies are variable.

| Organism | z-value | DDR slope | Scale | Sequencing approach | Reference |
|---|---|---|---|---|---|
| Plants | 0.228 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Birds | 0.149 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Butterflies | 0.101 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Earthworms | 0.092 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Ants | 0.088 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Diatoms | 0.066 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Ciliates | 0.060 | - | - | - | Reviewed in Horner-Devine et al 2004 |
| Salt marsh bacteria (97% OTUs) | 0.020 | -0.039 | 3cm - 300m | Cloning and sequencing | Horner-Devine et al 2004 |
| Lake sediment bacteria | 0.009 | -0.018 | 1cm - 1400m | Fingerprint (T-RFLP) | Barreto et al 2014 |
| Lake sediment archaea | 0.016 | -0.031 | 1cm - 1400m | Fingerprint (T-RFLP) | Barreto et al 2014 |
| Desert soil fungi | 0.074 | -0.147 | 1m - 100km | Fingerprint (ARISA) | Green et al 2004 |
| Rainforest soil diazotrophs | 0.060 | - | 1-200m | High-throughput sequencing (MiSeq) | Tu et al 2016 |
| Temperate forest soil diazotrophs | 0.108 | - | 1-200m | High-throughput sequencing (MiSeq) | Tu et al 2016 |
| Soil bacteria and archaea (97% OTUs) | 0.032 | -0.063 | - | High-throughput sequencing (454) | Terrat et al 2015 |
| Endolithic bacteria (97% OTUs) | 0.075 | -0.149 | Intracolony: 0.4cm - 199cm | High-throughput sequencing (MiSeq) | This study |
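The paired z-values and DDR slopes in supplementary table S2 are consistent with the relation slope = -2z, which is how the method attributed to Harte et al (1999) is usually applied. The short Python sketch below checks that relation against the table rows that report both quantities. The function name `z_from_ddr_slope` is hypothetical, and the relation itself is stated here as an assumption rather than taken from the caption.

```python
def z_from_ddr_slope(slope):
    """Taxa-area exponent z, assuming the DDR slope equals -2z."""
    return -slope / 2.0


# (DDR slope, z-value) pairs copied from the rows that report both values.
reported = {
    "Salt marsh bacteria (97% OTUs)": (-0.039, 0.020),
    "Lake sediment bacteria": (-0.018, 0.009),
    "Lake sediment archaea": (-0.031, 0.016),
    "Desert soil fungi": (-0.147, 0.074),
    "Soil bacteria and archaea (97% OTUs)": (-0.063, 0.032),
    "Endolithic bacteria (97% OTUs)": (-0.149, 0.075),
}

for organism, (slope, z_reported) in reported.items():
    z = z_from_ddr_slope(slope)
    print(f"{organism}: derived z = {z:.4f}, reported z = {z_reported}")
```

Under this assumption every derived value agrees with the reported z-value to within rounding at the third decimal place.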


Supplementary table S3. Coral skeleton samples analysed (in section 4.2) and geographical coordinates. Sequences have been deposited in the Sequence Read Archive and accession numbers are provided (SRA ID). Coral colonies P1 – P3 were identified as Porites lutea, while P4 – P8 were identified as Porites lobata.

| Sample ID | Colony | Sample | Locality | Latitude | Longitude | SRA ID |
|---|---|---|---|---|---|---|
| VRM0051 | P1 | P1.1 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0052 | P1 | P1.2 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0053 | P1 | P1.3 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0054 | P1 | P1.4 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0055 | P1 | P1.5 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0055 | P1 | P1.5 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0056 | P1 | P1.6 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0056 | P1 | P1.6 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0057 | P1 | P1.7 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0058 | P1 | P1.8 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0059 | P1 | P1.9 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0060 | P1 | P1.10 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0061 | P1 | P1.11 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0062 | P1 | P1.12 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0063 | P2 | P2.1 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0064 | P2 | P2.2 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0065 | P2 | P2.3 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0066 | P2 | P2.4 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0067 | P2 | P2.5 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0067 | P2 | P2.5 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0068 | P2 | P2.6 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0069 | P2 | P2.7 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0070 | P2 | P2.8 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0071 | P2 | P2.9 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0072 | P2 | P2.10 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0073 | P2 | P2.11 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0074 | P2 | P2.12 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0076 | P3 | P3.1 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0077 | P3 | P3.2 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0078 | P3 | P3.3 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0079 | P3 | P3.4 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0080 | P3 | P3.5 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0081 | P3 | P3.6 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0081 | P3 | P3.6 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0082 | P3 | P3.7 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0083 | P3 | P3.8 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0084 | P3 | P3.9 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0085 | P3 | P3.10 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0086 | P3 | P3.11 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0087 | P3 | P3.12 | Paradise beach WA | -23.153188 | 113.768027 | SRP073961 |
| VRM0096 | P4 | P4.1 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0097 | P4 | P4.2 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0098 | P4 | P4.3 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0099 | P4 | P4.4 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0100 | P4 | P4.5 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0100 | P4 | P4.5 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0101 | P4 | P4.6 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0101 | P4 | P4.6 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0102 | P4 | P4.7 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0103 | P4 | P4.8 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0104 | P4 | P4.9 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0105 | P4 | P4.10 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0106 | P4 | P4.11 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0107 | P4 | P4.12 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0108 | P5 | P5.1 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0109 | P5 | P5.2 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0110 | P5 | P5.3 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0111 | P5 | P5.4 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0112 | P5 | P5.5 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0112 | P5 | P5.5 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0113 | P5 | P5.6 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0113 | P5 | P5.6 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0114 | P5 | P5.7 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0115 | P5 | P5.8 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0116 | P5 | P5.9 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0117 | P5 | P5.10 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0118 | P5 | P5.11 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0119 | P5 | P5.12 | Loc 2 WA | -23.203155 | 113.765309 | SRP073961 |
| VRM0516 | P6 | P6.1 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0517 | P6 | P6.2 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0518 | P6 | P6.3 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0519 | P6 | P6.4 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0520 | P6 | P6.5 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0521 | P6 | P6.6 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0522 | P6 | P6.7 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0523 | P6 | P6.8 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0524 | P6 | P6.9 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0525 | P6 | P6.10 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0526 | P6 | P6.11 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0527 | P6 | P6.12 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0529 | P7 | P7.1 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0530 | P7 | P7.2 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0531 | P7 | P7.3 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0532 | P7 | P7.4 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0533 | P7 | P7.5 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0534 | P7 | P7.6 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0535 | P7 | P7.7 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0536 | P7 | P7.8 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0537 | P7 | P7.9 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0538 | P7 | P7.10 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0539 | P7 | P7.11 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0540 | P7 | P7.12 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0542 | P8 | P8.1 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0543 | P8 | P8.2 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0544 | P8 | P8.3 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0545 | P8 | P8.4 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0546 | P8 | P8.5 | Research beach QLD | -23.443498 | 151.911965 | To be provided |
| VRM0547 | P8 | P8.6 | Research beach QLD | -23.443498 | 151.911965 | To be provided |


Online material: The following supplementary tables can be found at: https://www.dropbox.com/sh/g9thuyr5ohlqzc6/AACG1ZKtQw45l85BNwkv-ohka?dl=0

Supplementary table S4. Samples of Porites sp. and environmental data.
Supplementary table S5. Distances between samples.

Supplementary References:
Barreto DP, Conrad R, Klose M, Claus P, Enrich-Prast A (2014) Distance-decay and taxa-area relationships for bacteria, archaea and methanogenic archaea in a tropical lake sediment. PLoS One 9, e110128.
Green JL, Holmes AJ, Westoby M, Oliver I, Briscoe D, et al. (2004) Spatial scaling of microbial eukaryote diversity. Nature 432, 747-750.
Horner-Devine MC, Lage M, Hughes JB, Bohannan BJ (2004) A taxa-area relationship for bacteria. Nature 432, 750-753.
Martiny JB, Eisen JA, Penn K, Allison SD, Horner-Devine MC (2011) Drivers of bacterial beta-diversity depend on spatial scale. Proceedings of the National Academy of Sciences, USA 108, 7850-7854.
Terrat S, Dequiedt S, Horrigue W, Lelievre M, Cruaud C, et al. (2015) Improving soil bacterial taxa-area relationships assessment using DNA meta-barcoding. Heredity 114, 468-475.
Tu Q, Deng Y, Yan Q, Shen L, Lin L, et al. (2016) Biogeographic patterns of soil diazotrophic communities across six forests in the North America. Molecular Ecology 25, 2937-2948.
Zinger L, Boetius A, Ramette A (2014) Bacterial taxa-area and distance-decay relationships in marine environments. Molecular Ecology 23, 954-964.

CHAPTER 5 EVOLUTIONARY DYNAMICS OF CHLOROPLAST GENOMES IN LOW LIGHT: A CASE STUDY OF THE ENDOLITHIC GREEN ALGA OSTREOBIUM QUEKETTII

Chapter 5 108

Evolutionary Dynamics of Chloroplast Genomes in Low Light: A Case Study of the Endolithic Green Alga Ostreobium quekettii

Vanessa R. Marcelino1,*, Ma Chiela M. Cremen1, Christopher J. Jackson1, Anthony A. W. Larkum2, and Heroen Verbruggen1
1School of Biosciences, University of Melbourne, VIC 3010, Australia
2Plant Functional Biology and Climate Change Cluster, University of Technology Sydney, NSW 2007, Australia

*Corresponding author: E-mail: [email protected]. Accepted: August 20, 2016 Data deposition: Chloroplast genome sequences have been deposited in the European Nucleotide Archive and GenBank under the accession numbers LT593849, KX808496, KX808497 and KX808498.

Abstract

Some photosynthetic organisms live in extremely low light environments. Light limitation is associated with selective forces as well as reduced exposure to mutagens, and over evolutionary timescales it can leave a footprint on species’ genomes. Here, we present the chloroplast genomes of four green algae (Bryopsidales, Ulvophyceae), including the endolithic (limestone-boring) alga Ostreobium quekettii, which is a low light specialist. We use phylogenetic models and comparative genomic tools to investigate whether the chloroplast genome of Ostreobium corresponds to our expectations of how low light would affect genome evolution. Ostreobium has the smallest and most gene-dense chloroplast genome among Ulvophyceae reported to date, matching our expectation that light limitation would impose resource constraints reflected in the chloroplast genome architecture. Rates of molecular evolution are significantly slower along the phylogenetic branch leading to Ostreobium, in agreement with the expected effects of low light and energy levels on molecular evolution. We expected the ability of Ostreobium to perform photosynthesis in very low light to be associated with positive selection in genes related to the photosynthetic machinery, but instead, we observed that these genes may be under stronger purifying selection. Besides shedding light on the genome dynamics associated with a low light lifestyle, this study helps to resolve the role of environmental factors in shaping the diversity of genome architectures observed in nature.

Key words: genome streamlining, photosynthesis, rates of evolution, boring algae, stoichiogenomics.

Introduction

Light is rapidly attenuated under water, yet some photosynthetic organisms thrive in extremely low light marine habitats (Shashar and Stambler 1992; Mock and Kroon 2002; Larkum, Douglas, et al. 2003). Specialized lifestyles may leave a footprint on organisms’ genomes (Dutta and Paul 2012; Raven et al. 2013). For example, high-light and low light strains of the cyanobacterium Prochlorococcus have different genome sizes, GC contents and rates of molecular evolution, as well as other genome features that have been associated with their niche specialization (Hess et al. 2001; Rocap et al. 2003; Dufresne et al. 2005; Paul et al. 2010). Similar studies targeting the nuclear genomes of eukaryotic algae have also begun to emerge (see Raven et al. 2013 for a review). Different ecotypes of the microalga Ostreococcus, for example, show distinctive genome traits (Jancek et al. 2008), although in this case it is not clear whether low light has played a role.

The production of high-energy cofactors (ATP and NADPH) and the uptake of nitrogen are modulated by light intensity (MacIsaac and Dugdale 1972; Cochlan et al. 1991; Kirk 1994) and therefore it is logical to expect that the genome architecture of lineages living under low light conditions is influenced by resource constraints. Selection for saving resources and shortening replication times, in addition to random genetic drift, have been associated with the loss of genes, intergenic spacers and introns, a process known as genome streamlining (Giovannoni et al. 2005; Lynch 2006; Hessen et al. 2010; Wolf and Koonin 2013). Genome architecture can also be affected

Genome Biol. Evol. 8(9):2939–2951. doi:10.1093/gbe/evw206 Chapter 5 109

by limited supply of key elements such as nitrogen and phosphorus: different nucleotides and amino acids differ in their atomic composition, so molecules containing less atoms of the limiting nutrient may provide a selective advantage in certain niches (Acquisti et al. 2009; Elser et al. 2011; Raven et al. 2013). The Prochlorococcus strain with the smallest genome and highest content of nitrogen-poor molecules is found in surface waters, where irradiance is higher but nutrients are more depleted than in the habitat of the low light strain (Rocap et al. 2003; Dufresne et al. 2005). One could expect that when light is low enough to restrict growth rates and nitrogen uptake, organisms with small genomes and a high proportion of nitrogen-poor molecules may have better evolutionary fitness.

Sunlight may also leave footprints in a genome by directly or indirectly altering molecular rates of evolution (the molecular pacemaker). Light is a major contributor to environmental energy including solar radiation, thermal energy and chemical (metabolic) energy (Clarke and Gaston 2006). Environmental energy stimulates metabolism at many levels, and it is known that energy-rich habitats are often characterized by higher evolutionary rates (Davies et al. 2004; Clarke and Gaston 2006). Solar radiation, especially ultraviolet (UV), also plays a direct mutagenic role and may thus accelerate molecular evolution (Rothschild 1999; Willis et al. 2009). Thermal and chemical energy also depend on light: light incidence increases temperatures (e.g., in the tropics) and supports primary productivity (and consequently increases the energy available for metabolism and growth). Oxidative DNA damage generally occurs during metabolic reactions; therefore higher metabolic rates can lead to higher mutation rates (Gillooly et al. 2005). Generation times also play into it, being shorter and fixing mutations (on populations) more rapidly when the environmental energy is higher, which often happens when there is a combination of higher temperatures, metabolic rates and solar radiation (Rohde 1992; Wright and Rohde 2013). As a consequence of all these factors, it is reasonable to expect that organisms living in low-energy areas, like shaded habitats, have relatively slower rates of molecular evolution.

Challenging environments may impose particular selective regimes, which could leave a footprint of positive selection in genes undergoing adaptation. Changes in proteins that provide higher fitness in a given circumstance (e.g., low light) can be detected at the molecular level by an excess of nonsynonymous substitutions over synonymous ones (Yang 1998). Evidence of positive selection for example in the Rubisco gene (involved in carbon fixation) in has been associated with its adaptation to the declining levels of atmospheric CO2 since their origination in the Ordovician (Raven and Colmer 2016). In cases of organisms living in extremely low light, it would be reasonable to expect positive selection in genes related to the photosynthetic machinery, reflecting adaptation to low light. To our knowledge, this idea has never been tested in eukaryotic algae.

The siphonous green alga Ostreobium is a convenient organism to investigate photosynthesis under low light conditions (Fork and Larkum 1989; Koehne et al. 1999; Wilhelm and Jakob 2006). Ostreobium has an endolithic (limestone-boring) lifestyle: it bores into carbonate substrates and populates all sorts of marine limestones worldwide, including shells and coral skeletons. Only a small portion of the available light reaches Ostreobium in its usual habitat: ~99% of the light can be attenuated by the first millimeter of limestone (Nienow et al. 1988; Matthes et al. 2001). Other photosynthetic organisms living on the limestone substrate can further attenuate light: the living tissue of corals and their zooxanthellae, for example, absorb 95–99.9% of the available light (Halldal 1968; Schlichter et al. 1997). Even under these extreme low light conditions, Ostreobium carries out oxygenic photosynthesis (Kühl et al. 2008). Cyanobacteria coexisting with Ostreobium enhance their light interception by manufacturing far red-absorbing chlorophylls (Chl d and f; Chen and Blankenship 2011), whereas Ostreobium has a special chlorophyll antenna that allows it to harvest far red light (Magnusson et al. 2007). Ostreobium is also able to grow in quite deep waters, being abundant even at depths over 200 m where only a handful of algal species can persist (Littler et al. 1985; Dullo et al. 1995; Aponte and Ballantine 2001). Here the light is filtered strongly towards the blue end of the spectrum, with a peak at ~470–480 nm (Larkum and Barrett 1983) and a different light harvesting strategy is employed: the carotenoid siphonaxanthin transfers light energy to chlorophyll and the reaction centers (Kageyama et al. 1977). Thus the success of Ostreobium in terms of its cosmopolitan distribution is associated not only with its efficiency in light utilization but also its ability to employ a range of light harvesting strategies (Fork and Larkum 1989; Schlichter et al. 1997; Magnusson et al. 2007; Tribollet 2008), for which the underlying genomic basis has never been explored.

The light-driven genomic traits of Ostreobium can only be investigated in a comparative framework. While algal nuclear genome sequences are still scarce, chloroplast genomes are better sampled and constitute a powerful tool for molecular evolutionary studies (Lemieux et al. 2014). Ostreobium belongs to the Bryopsidales (Ulvophyceae), a diverse order of seaweeds for which only a handful of chloroplast genomes are available (Leliaert and Lopez-Bautista 2015). Additional chloroplast genomes of species from this order can help us investigate genomic traits correlated to low light in Ostreobium.

The goal of this study is to evaluate the evolutionary dynamics of the chloroplast genome of the low light alga Ostreobium using comparative and phylogenetic methods. Because comparative analyses in a phylogenetic context require a sufficiently large sample of genomes, we present the chloroplast genomes of four green algae, including Ostreobium quekettii and members from three other families in the same order, all previously uncharacterized. We used a


combination of stoichiogenomics (the study of elemental composition of macromolecules; Elser et al. 2011) and models of molecular rate variation to investigate our expectations for a lineage adapted to low light conditions. Our first expectation related to light-dependent resource limitation: if the Ostreobium lineage has evolved in low-energy and low-nutrient conditions, its chloroplast genome can be expected to be smaller, more compact (i.e., with fewer intergenic spacers, introns and repeats) and contain less nitrogen than the chloroplast genomes of related algae. Our second expectation was that the phylogenetic branch leading to Ostreobium has slower rates of molecular evolution (i.e., mutation rates) than other branches in the phylogeny due to fewer mutations induced by UV and slower generation times often associated with low energy niches. Lastly, we would expect genes related to its photosynthetic machinery to have experienced positive selection and enabled Ostreobium’s highly efficient light utilization.

Materials and Methods

Sequencing, Assembly and Annotation

Total genomic DNA of Ostreobium quekettii, Halimeda discoidea, Derbesia sp. and Caulerpa cliftonii was extracted using a modified cetyl trimethylammonium bromide (CTAB) method described in Cremen et al. (2016) and sequenced on an Illumina platform. The collection sites and library preparation details are described in the Supplementary Materials. Sequences were submitted to the European Nucleotide Archive and GenBank (accession numbers LT593849, KX808496, KX808497 and KX808498).

Sequences were assembled using CLC Genomics Workbench 7.5.1 (http://www.clcbio.com). Circularity and scaffold regions were resolved by comparing the CLC assembly with assemblies generated independently with MEGAHIT (Li et al. 2015), SOAPdenovo2 (Luo et al. 2012) and SPADES (Nurk et al. 2013). Details about the assembly settings and quality checks are reported in the Supplementary Materials. A combination of automated pipelines and manual editing was used to annotate the chloroplast genomes, which is also described in the Supplementary Materials.

Comparative Analysis

In order to compare Ostreobium with other Ulvophyceae, the chloroplast genomes of Bryopsis plumosa (NC_026795), Tydemania expeditionis (NC_026796), Ulva sp. (KP720616), Pseudendoclonium akinetum (AY835431) and Oltmannsiellopsis viridis (NC_008099), available in GenBank, were included in our comparative analysis. Genome features were extracted with Geneious 9.0.4 (Kearse et al. 2012). Hypothetical ORFs with <300 bp were excluded and the tilS gene was re-annotated as a pseudogene (not a CDS) in Tydemania and Bryopsis, where it has a frame shift or a stop codon in the middle of the gene. The number of repeats, including tandem and palindromic repeats, was calculated with the Geneious implementation of Phobos v.3.3.11 (Mayer 2007) and with the Emboss suite (http://www.bioinformatics.nl/emboss-explorer/); see the Supplementary Materials for details.

Nitrogen (N) content quantification was based on the counts of N atoms per nucleotide or amino acid using the formula described in Acquisti et al. (2009): $\sum_i n_i p_i$, where $n_i$ is the number of N atoms in the i-th base and $p_i$ is the proportion of each base in the chloroplast genome. For the nucleotide counts we used $n_C = n_G = 4$ and $n_A = n_T = 3.5$ (Acquisti et al. 2009). For the coding DNA sequences (exons) we used $n_A = 5$, $n_T = 2$, $n_G = 5$, and $n_C = 3$. For amino acid counts (the theoretical proteome) we used n = 2 for asparagine, glutamine, lysine and tryptophan; n = 3 for histidine; n = 4 for arginine; and n = 1 for other amino acids. Copy number and expression levels play a major role in N utilization, but neither qPCR nor transcriptome analysis could be carried out because our source materials were of different developmental stages and environmental conditions. Instead, we investigated N-content in coding sequences and amino acids on a gene by gene basis, in addition to doing so at the whole chloroplast genome level. Assuming that expression levels of genes correlate among species, the gene by gene approach should reduce the problem of differential expression between genes and make for more realistic among-species comparisons. Finally, we also evaluated whether the average length of coding sequences is smaller in Ostreobium, as gene size reduction has been observed in some endosymbionts with reduced genome sizes (Charles et al. 1999).

Phylogeny, Rates of Evolution and Selection Analysis

The coding sequences of all species were aligned at the amino acid level using a locally installed version of MAFFT v7.215 (Katoh et al. 2002), with multithreading and default parameters, and then the aligned amino acid sequences were converted back to nucleotides using RevTrans (Wernersson and Pedersen 2003). The ftsH, rpoB, rpoC1, rpoC2 and ycf1 genes could not be reliably aligned (according to a visual assessment) and were excluded along with the tilS pseudogene from downstream analyses. A maximum likelihood phylogeny was built using RAxML (Stamatakis 2006) with a GTR + Γ model, a partitioning strategy separating 1st, 2nd and 3rd codon positions, and a rapid bootstrap search of 500 replicates. Oltmannsiellopsis viridis, Pseudendoclonium akinetum and Ulva sp. were used as outgroups.

In order to test whether DNA mutation (substitution) rates were slower in the Ostreobium lineage, we studied lineage-specific rates of molecular evolution using the baseml program from the PAML v.4.7 package (Yang 2007). We chose to perform this test at the nucleotide (rather than at the amino acid) level because we expect environmental energy to affect


rates of molecular evolution at the nucleotide level (i.e., regardless of whether the mutations are synonymous or nonsynonymous). We compared the fit of a model with unique rates of evolution across all branches (global clock) to a model with a different rate for the Ostreobium lineage (local clock) using the Akaike Information Criterion (AIC). Because rates of molecular evolution inherently vary among species, a model with two rates is likely to fit the data better than a single-rate model. While this is taken into consideration when calculating model fit (AIC penalizes parameter-rich models), we also verified the rates of molecular evolution under a relaxed clock model, whereby rates are free to vary on all branches of the phylogeny (see Supplementary Materials).

To evaluate whether photosynthetic genes have been under positive selection in the Ostreobium lineage, we excluded gene alignments containing fewer than four species and grouped (concatenated) genes into 15 gene classes (cf. Wicke et al. 2011). We analyzed this data set using the branch model implemented in PAML (Yang 1998, 2007) and the random effects branch-site model (branch-site REL) implemented in HyPhy (Kosakovsky Pond et al. 2005, 2011). The branch model was run with the codeml program, using the F3×4 codon model (Goldman and Yang 1994; Yang 2007). We compared the fit of a model with differential dN/dS ratio (ω) for Ostreobium and the background lineages, to a model with a universal ω for all branches (the null hypothesis) using the Akaike information criterion (AIC). This approach directly tests our hypothesis, but has a risk of returning a good fit for poor models because the null hypothesis (universal ω) may be overly simple (see Kosakovsky Pond et al. 2011). Therefore we also used the branch-site REL, which allows detecting positive selection in all branches of the phylogeny and the proportion of sites under selection (Kosakovsky Pond et al. 2005, 2011). We evaluated whether positive selection had occurred in the Ostreobium lineage with the likelihood ratio test and P values (with the Holm correction procedure) implemented in the branch-site REL method in HyPhy (Kosakovsky Pond et al. 2011).

Results

Four New Chloroplast Genomes of Bryopsidales

The sequence data of Ostreobium quekettii, Halimeda discoidea, Derbesia sp. and Caulerpa cliftonii were assembled into complete (circular mapping) chloroplast genomes (fig. 1 and supplementary fig. S1, Supplementary Material online). The mean coverage was 235× for Ostreobium, 3,983× for Halimeda, 1,116× for Caulerpa and 469× for Derbesia (supplementary fig. S2, Supplementary Material online). Two gapped scaffold regions in Halimeda seem to have a (possibly polymorphic) number of repeats. One of these gaps was closed with an alternative assembler software (SPADES, Nurk et al. 2013) and the other was coded as a stretch of Ns (see supplementary fig. S2, Supplementary Material online). The main genome features including their sizes are shown in table 1.

Gene content of Ostreobium is similar to related algae but it lacks the chloroplast envelope membrane protein gene (cemA) that is present in all other Ulvophyceae sequenced to date (supplementary table S1, Supplementary Material online). The tRNA(Ile)-lysidine synthase gene (tilS) seems to be a pseudogene in Ostreobium, Halimeda and Derbesia as it contains multiple in-frame stop codons. We could not identify it at all in Caulerpa cliftonii (i.e., no tBLASTx hits with e-values < 0.001 and identity > 50%, using Bryopsis plumosa as reference), although this pseudogene has been found in another Caulerpa species (Zuccarello et al. 2009). None of our chloroplast genomes have the organelle division inhibitor factor gene (minD), supporting the notion that this gene has been lost from the chloroplast genome of Bryopsidales (Leliaert and Lopez-Bautista 2015). Like Ulva, Bryopsis and Tydemania, the chloroplast genomes sequenced here do not have the quadripartite architecture often found in green algae and land plants (Lemieux et al. 2000; Pombert 2005). Despite an overall highly conserved gene content, the Ulvophyceae genomes have multiple rearrangements as indicated in the Mauve alignment (fig. 2).

Genome Economics

In order to evaluate some of our expectations regarding light-driven resource limitations on chloroplast genomes, we compared the chloroplast genome of Ostreobium with those of the eight other algae from the class Ulvophyceae in terms of size, compactness (gene-density) and nitrogen content. With 81,997 bp, Ostreobium has the smallest and most gene-dense chloroplast genome of all Ulvophyceae sequenced to date (table 1 and fig. 3). The size reduction in the Ostreobium chloroplast genome is not caused by gene loss (78 of 79 common plastid genes are present, supplementary table S1, Supplementary Material online) but by a reduction of intergenic spacers, introns and repeats (table 1 and fig. 3). Intergenic spacers compose only 11.9% of the Ostreobium chloroplast genome, compared with an average of 25.4% (std 8.4%) in other Ulvophyceae. Ostreobium also has a small number of introns, missing even the highly conserved tRNA-Leu (uaa) group I intron (Simon et al. 2003) that is present in other Bryopsidales chloroplast genomes (Leliaert and Lopez-Bautista 2015; this study). Nitrogen utilization in Ostreobium did not differ substantially from other algae, either in the nucleotide composition of the complete chloroplast DNA, the coding regions, or the amino acids of predicted proteins (table 1). Likewise, the N counts on a gene by gene basis did not reveal any obvious pattern (supplementary table S2, Supplementary Material online). The average gene length in Ostreobium was found to be similar to related algae


[Figure 1 image: circular gene map labeled "Ostreobium quekettii SAG6.99, 81,997 nt", with a gene-category legend (ATP synthesis, genetic systems, metabolism, photosystems, ribosomal proteins, transport, unknown, RNA) and a GC content track scaled from 0.0 to 1.0.]

FIG. 1. Gene map of the Ostreobium quekettii chloroplast genome. Genes are colored by their known function.

(median difference of gene sizes = 0, supplementary table S3, Supplementary Material online).

Rates of Evolution

To investigate whether the molecular pacemaker along the branch leading to Ostreobium is slower than in the remainder of the tree, we constructed a Maximum Likelihood (ML) phylogeny from the chloroplast genomes (71 genes concatenated, 47,559 bp, fig. 2) and fitted two models of molecular evolution to the same data set. We found that a model with differential rates of evolution for the branch leading to Ostreobium and the remaining branches of the phylogeny fits the data much better (ΔAIC = 92) than a model with a homogeneous rate across the entire tree. The branch rate parameter values estimated by ML are 0.81 for the Ostreobium branch versus 1.00 for the remainder of the tree. In other words, the relative rate of molecular evolution along the Ostreobium branch is 19% slower than along the other branches of the phylogeny.

A similar result was obtained by calculating the rates of molecular evolution with a relaxed molecular clock, but some branches other than the Ostreobium branch also had slower rates of molecular evolution (supplementary fig. S3, Supplementary Material online). Except for Bryopsis and Ostreobium, all other Bryopsidales showed a relatively fast rate. The rate estimated for the Ostreobium branch corresponds to 65% of the rates averaged across all other branches of the phylogeny.

Selection on Genes Related to Photosynthesis

Our third expectation was that genes related to the photosynthetic pathway have experienced positive selection in the


Table 1
Summary of the Chloroplast Genome Features of Ostreobium quekettii and Comparison with Other Ulvophyceae Chloroplast Genomes

Species                       Genome     N content  N content   N content  GC       Introns  Repeats   Tandem       Palind.  Int.             Accession
                              size (bp)  genome     coding DNA  proteome   (%)               (50 bp+)  repeats (a)  seqs     spacers (%) (b)  number
Oltmannsiellopsis viridis     151,933    3.702      3.698       1.361      40.5     10       84        5            652      39.57            NC_008099
Pseudendoclonium akinetum     195,867    3.657      3.699       1.379      31.5     28       100       22           418      37.46            AY835431
Ulva sp.                      99,983     3.626      3.669       1.366      25.3     5        12        2            410      22.67            KP720616
Ostreobium quekettii          81,997     3.656      3.692       1.369      31.9     6        8         1            100      11.96            LT593849
Bryopsis plumosa              106,859    3.650      3.692       1.359      30.8     13       12        1            161      20.40            NC_026795
Derbesia sp.                  115,765    3.644      3.685       1.374      29.7     12       8         5            146      19.09            KX808497
Caulerpa cliftonii            131,135    3.688      3.675       1.378      37.6     11       15        7            115      25.74            KX808498
Halimeda discoidea (c)        122,075    3.653      3.681       1.363      32.2     14       19        11           112      19.96            KX808496
Tydemania expeditionis        105,200    3.668      3.656       1.377      32.8     11       7         1            72       18.73            NC_026796

NOTE: Nitrogen (N) content in Genome and Coding DNA is based on nucleotides; N content in Proteome is based on amino acid counts. (a) Only tandem repeats of 15–1,000 bp were included in the count. (b) Excluding ORFs < 300 bp. (c) Halimeda has one scaffold with an unknown number of repeats, annotated with 100 Ns. Palind. seqs, palindromic repeats; Int. spacers, intergenic spacers.
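The genome N content values in table 1 can be related to base composition with simple arithmetic. The sketch below assumes (this is an assumption for illustration, not a method stated in this section) that genome N content is the mean number of nitrogen atoms per nucleotide counted over both DNA strands. Adenine and guanine each contain five N atoms, cytosine three and thymine two, so an A-T pair carries 7 N atoms and a G-C pair 8. The function name is hypothetical.

```python
# Nitrogen atoms per base (standard chemistry).
N_ATOMS = {"A": 5, "C": 3, "G": 5, "T": 2}


def n_per_nucleotide_from_gc(gc_percent):
    """Mean N atoms per nucleotide of double-stranded DNA with the given GC%.

    ASSUMPTION: both strands are counted, so every A is paired with a T
    and every G with a C.
    """
    gc = gc_percent / 100.0
    at_pair = N_ATOMS["A"] + N_ATOMS["T"]  # 7 N atoms per A-T pair
    gc_pair = N_ATOMS["G"] + N_ATOMS["C"]  # 8 N atoms per G-C pair
    return (at_pair * (1 - gc) + gc_pair * gc) / 2


# (GC content in %, genome N content) as listed in table 1.
table1 = {
    "Oltmannsiellopsis viridis": (40.5, 3.702),
    "Pseudendoclonium akinetum": (31.5, 3.657),
    "Ulva sp.": (25.3, 3.626),
    "Ostreobium quekettii": (31.9, 3.656),
    "Bryopsis plumosa": (30.8, 3.650),
    "Derbesia sp.": (29.7, 3.644),
    "Caulerpa cliftonii": (37.6, 3.688),
    "Halimeda discoidea": (32.2, 3.653),
    "Tydemania expeditionis": (32.8, 3.668),
}

for species, (gc, reported) in table1.items():
    estimate = n_per_nucleotide_from_gc(gc)
    print(f"{species}: estimated {estimate:.3f}, table 1 reports {reported:.3f}")
```

Under this assumption the estimates fall within 0.01 of the tabulated genome values for all nine species, which suggests that genome-wide N content here largely tracks GC content.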

FIG. 2. Mauve alignment of chloroplast genomes available for algae of the class Ulvophyceae, including the endolithic alga Ostreobium quekettii and the three seaweeds sequenced in this study. Colored boxes indicate regions of synteny (collinear blocks, identified by the Progressive Mauve algorithm). The species are sorted according to a Maximum Likelihood phylogeny based on a concatenated alignment of the coding sequences of the chloroplast genomes; bootstrap values are indicated near branch nodes.

lineage leading to Ostreobium. We concatenated genes encoding different subunits of the same protein to improve signal from short gene alignments. Using the branch model of Yang (1998), we tested whether the ω ratio (dN/dS) of the branch leading to Ostreobium differs from the background ω for other lineages in the phylogeny. If they do differ significantly, and if ω is >1, then positive selection could be inferred (Yang 1998). However, we found no indication that genes in the branch leading to Ostreobium have been under positive selection (table 2 and supplementary table S4, Supplementary


[Figure 3 image: bar chart on a 0% to 100% axis with one bar per species (Oltmannsiellopsis viridis, Ulva sp., Pseudendoclonium akinetum, Ostreobium quekettii, Derbesia sp., Bryopsis plumosa, Caulerpa cliftonii, Tydemania expeditionis, Halimeda discoidea), divided into genic, intronic and intergenic fractions.]

FIG. 3. Proportion of genes, introns and intergenic spacers in the chloroplast genomes of algae of the class Ulvophyceae. Only ORFs >300 bp were included in the count. The percentage of intronic regions includes the intronic ORFs present in some species.
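The intergenic spacer comparison quoted under Genome Economics can be recomputed from the "Int. spacers (%)" column of table 1. The following is a minimal sketch using only the Python standard library; treating the reported spread as a sample standard deviation (n - 1 denominator) is an assumption.

```python
from statistics import mean, stdev

# Intergenic spacer content (%) per species, copied from table 1.
spacers = {
    "Oltmannsiellopsis viridis": 39.57,
    "Pseudendoclonium akinetum": 37.46,
    "Ulva sp.": 22.67,
    "Ostreobium quekettii": 11.96,
    "Bryopsis plumosa": 20.40,
    "Derbesia sp.": 19.09,
    "Caulerpa cliftonii": 25.74,
    "Halimeda discoidea": 19.96,
    "Tydemania expeditionis": 18.73,
}

focal = "Ostreobium quekettii"
others = [value for species, value in spacers.items() if species != focal]

avg_others = mean(others)
sd_others = stdev(others)  # ASSUMPTION: sample standard deviation

# Species with the lowest share of intergenic spacers.
most_compact = min(spacers, key=spacers.get)

print(f"{focal}: {spacers[focal]:.2f}% intergenic")
print(f"other Ulvophyceae: mean {avg_others:.2f}%, sd {sd_others:.2f}%")
print(f"lowest spacer content: {most_compact}")
```

The eight other species give a mean of about 25.45% and a sample standard deviation of about 8.39%, in line with the 25.4% and 8.4% quoted in the text, and Ostreobium has the lowest spacer content of the nine genomes.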

Material online). Instead, we observed that most of the proteins related to the photosynthetic machinery have a stronger signature of purifying selection in the Ostreobium lineage than in other branches of the phylogeny (ΔAIC > 4, table 2). We note though that these results should be interpreted with prudence given the method's susceptibility to returning a good fit for poor models (see Kosakovsky Pond et al. 2011). In order to verify these results in light of an alternative method, we performed a second analysis using the random effects branch-site model.

The second analysis with the branch-site REL model, which can detect selection in all branches of the tree and parts of the alignment without having to specify lineages of interest a priori (Kosakovsky Pond et al. 2011), confirmed that there are no signatures of positive selection along the Ostreobium lineage (supplementary table S5, Supplementary Material online). As in the previous analysis, the branch-site REL model suggests that several gene classes have experienced stronger purifying selection in the lineage leading to Ostreobium: many of the genes show smaller ω values in the Ostreobium lineage (both mean ω and ω1, the latter representing purifying selection) and a higher proportion of sites under the purifying selective regime when compared with the average values obtained for all other branches (supplementary table S5, Supplementary Material online).

Discussion

An Economical Genome

The endolithic alga Ostreobium has a remarkably small and compact chloroplast genome (figs. 1 and 3; table 1). We found that the economic nature of the Ostreobium chloroplast genome is not accomplished by a replacement of expensive nucleotides or amino acids (i.e., containing more N atoms) by more economic ones, but by an overall reduction of intergenic regions (fig. 3). Energy limitation resulting from the low light niche that this alga occupies may have contributed to an evolutionary reduction of the genome size. Due to the limited light available for photosynthesis, saving energy in any aspect of its cell biology including genome replication and transcription would result in a selective advantage. Introns significantly increase the costs of transcription (Lehninger et al. 1993; Castillo-Davis et al. 2002). Likewise, repeats and intergenic spacers consume resources, so these can be under selection towards reduction in energy-poor environments (Dufresne et al. 2005; Giovannoni et al. 2005).

Besides natural selection, neutral factors such as random genetic drift and population sizes can also shape genome architecture (Lynch 2006; Lynch et al. 2006) and may have contributed to the chloroplast genome streamlining in Ostreobium. Genome reduction resulting from neutral evolution (or from a relaxation of purifying selection) is typically observed in obligate parasitic or symbiotic species, which tend to have small effective population sizes and therefore a higher influence of genetic drift (Mira et al. 2001; Wolf and Koonin 2013). Genes that are no longer essential to survival are under nearly neutral evolution, so the reduced genomes of parasitic species typically show substantial gene loss (e.g., loss of genes involved in photosynthesis) and, sometimes, an accumulation of pseudogenes (Mira et al. 2001; de Koning and Keeling 2006; McNeal et al. 2007; Wicke et al. 2013; Yan et al. 2015). Ostreobium, in contrast, has a tightly packed chloroplast genome, virtually no gene loss (except for cemA) and no sign of gene size


reduction, supporting a considerable role of adaptive processes on its genome streamlining.

Table 2
Omega Values (dN/dS) for the Different Gene Classes in the Chloroplast Genomes of Ulvophycean Algae

                      Single-ω model   Two-ω model
Gene class            ω global         ω background   ω Ostreobium   ΔAIC
Photosynthetic light reactions
  atp                 0.025            0.028          0.009          25.043
  pet                 0.032            0.034          0.016          3.625
  psa                 0.019            0.021          0.007          29.851
  psb                 0.029            0.031          0.013          40.538
Photosynthetic dark reactions
  chl                 0.022            0.024          0.012          4.749
  ccsA                0.025            0.025          0.032          1.938
  rbcL                0.020            0.023          0.007          13.922
Translation and protein-modifying enzymes
  clp                 0.014            0.017          0.004          2.825
  infA                0.039            0.044          0.020          0.660
  rpl                 0.039            0.040          0.029          0.643
  rps                 0.035            0.036          0.022          1.179
  tufA                0.023            0.025          0.011          1.516
Proteins not related to photosynthesis
  accD                0.036            0.041          0.017          1.379
  cys                 0.027            0.027          6.291          2.000
  rpo                 0.003            0.003          0.002          1.749

NOTE: Two models were tested: a single ω for all lineages and a model with different ω values for Ostreobium and all other species. The goodness of fit of the two-ω over the single-ω model is given by ΔAIC.

Evidently, not all photosynthetic organisms living in low light environments have reduced genome sizes. Acaryochloris marina is a shade specialist with an 8.3 Mb genome, which is large for a cyanobacterium (Swingley et al. 2008; Larsson et al. 2011). In this case, a different mechanism can be speculated on: by producing chlorophyll d, Acaryochloris may not experience the same resource constraints that Ostreobium does, and as it occupies a relatively uncompetitive niche, this cyanobacterium could be under relaxed purifying selection which might culminate in genome expansion (see Swingley et al. 2008; Larsson et al. 2011). Wolf and Koonin (2013) proposed the existence of two phases in genome evolution: an explosive innovation phase that leads to an increase in genome complexity followed by a longer reductive phase. It is possible that the Acaryochloris genome size reflects its recent innovation/adaptive phase while the Ostreobium lineage, which has occupied an endolithic low light lifestyle for more than 500 million years (Vogel and Brett 2009; Marcelino and Verbruggen 2016), possibly has been in a reductive stage over a longer timespan. The genus Acaryochloris, however, is much older than 500 Ma (Sánchez-Baracaldo 2015), and in order to test this hypothesis it would be necessary to know when the genus acquired chlorophyll d and when it transitioned to a shaded lifestyle. Naturally, algae living in well-lit habitats can also have small chloroplast genomes. Small cells tend to have small genomes, and small chloroplast genomes have been observed in picoplanktonic species like Ostreococcus and other Prasinophyceae species (Derelle et al. 2006; Lemieux et al. 2014). Small genomes in bloom forming species, such as Ostreococcus, could be a selective advantage given their reduced replication times (see Cavalier-Smith 2005); however, at least for prokaryotes, no correlation between duplication times and genome sizes has been observed (Mira et al. 2001). The effects of random genetic drift and population sizes likely play a major role in shaping genome sizes in these cases (Lynch 2006).

Interestingly, small chloroplast genomes have also been observed in other organisms commonly inhabiting resource-poor niches. Plants in the Gnetophytes have a reduced chloroplast genome associated with a reduced number of introns and intergenic regions, in addition to some gene loss, which the authors suggest to be an adaptation to the resource-constrained habitats that these plants occupy (Wu et al. 2009). The recently published chloroplast genome of the palmophyllalean alga Verdigellas peltata, which typically occurs in deep waters and other shaded environments, is also small (79,444 bp long), compact and intronless (Leliaert et al. 2016). These observations support our hypothesis on the effects of low light and resource constraints on chloroplast genome size, though more extremophile species need to be sequenced in order to verify whether this pattern holds.

Our results are restricted to the chloroplast genome. Based on microspectrophotometry estimates, Ostreobium's nuclear genome (2C ≈ 0.5 pg) is on the smaller side of the genome size range among Ulvophyceae (0.1–6.1 pg) (Kapraun 2007). Because no nuclear genomes of Ulvophyceae have been sequenced to date, it is not currently feasible to analyze associations between low light and nuclear genome evolution. We anticipate that when more complete nuclear genomes are sequenced, equivalent analyses of the impact of resource constraints on nuclear genome evolution will follow.

Regarding gene loss, the cemA gene, involved in the uptake of inorganic carbon into chloroplasts (Rolland 1997), is the only gene absent from Ostreobium but present in all other Ulvophyceae (supplementary table S1, Supplementary Material online). Knock-out experiments in Chlamydomonas have shown that cemA is not essential for life or photosynthesis, but that its disruption drastically increases light sensitivity: mutants lacking a functional cemA have a lower threshold level of light perceived as excessive, so they accumulate large amounts of zeaxanthin, which is a pigment that dissipates excess light as heat (Rolland 1997). Consequently, mutants are only able to grow (photoautotrophically) under low light conditions (Rolland 1997). Although the possibility that this gene has been transferred to the nucleus in Ostreobium cannot be completely ruled out, cemA was neither found in other assembled contigs nor in a recently sequenced Ostreobium transcriptome (see section “Searching for


cemA” in the Supplementary Materials). Once the capacity to tolerate high light is lost, there would be strong constraints on subsequent transitions to higher-light habitats, providing a plausible explanation for why Ostreobium lineages have diversified abundantly within the endolithic niche (Marcelino and Verbruggen 2016; Sauvage et al. 2016) but are not known to have diversified out of it (i.e., given origin to nonendolithic species). Endolithic algal species are often light saturated at low light intensities but some experimental studies show that they are able to photoacclimate to light levels approaching full solar irradiance (see Tribollet 2008 for a review). There are high levels of cryptic diversity within endolithic green algae (Marcelino and Verbruggen 2016; Sauvage et al. 2016) and it is not known which species are able to cope with higher levels of light, raising the question of whether cemA has been lost in other lineages of Ostreobium and whether they acquired other mechanisms to tolerate high light.

We expected to observe a larger proportion of nitrogen-poor molecules in the Ostreobium chloroplast genome for several reasons. First, low light irradiance limits the uptake of nitrogen (MacIsaac and Dugdale 1972; Cochlan et al. 1991) and it has been empirically demonstrated that Ostreobium growth is limited by nitrogen and phosphorus in naturally occurring concentrations (Carreiro-Silva et al. 2012). Second, absorption of nutrients may be difficult in endolithic environments due to limited circulation and thicker diffusive boundary layers (see Larkum, Koch, et al. 2003). However, our results indicate that the nitrogen content in the Ostreobium chloroplast genome (and predicted proteome) is similar to those of other algae in the same class (table 1 and supplementary tables S2 and S3, Supplementary Material online).

Several potential explanations can be raised. First, seaweeds in general may naturally be under nitrogen limitation (Vitousek and Howarth 1991; Harrison and Hurd 2001), resulting in all of the examined genomes having similar nitrogen content. Alternatively, genome replication and DNA repair may be less frequent in Ostreobium as a consequence of the reduced environmental energy, slow metabolism and growth; a slower rate of nitrogen intake may therefore be required, and this economic aspect of Ostreobium is not reflected in its genome. Sample size could also be an issue: previous studies on N bias used nuclear genomes (Acquisti et al. 2009) and patterns may not be visible in the smaller chloroplast genome. Nitrogen limitation may also lead to overall genome reduction (Kang et al. 2015) rather than biases in nucleotide and amino acid composition. Finally, nitrogen utilization is largely dependent on the number of copies of the chloroplast genome and expression levels, which cannot be detected in our analyses. If fresh DNA extractions from algae growing in their natural conditions and belonging to the same developmental stage were available, it would be interesting to perform comparative qPCR and transcriptome analyses to test whether this is the case.

In Prochlorococcus, it is the high-light strain that contains less nitrogen in its genome, although it is not substantially different from one of the low light strains (Dufresne et al. 2005). In this case, the nitrogen availability in the water column seems to play a more important role than a restricted nitrogen-uptake ability due to light limitation. Heterotrophic pathways have been observed in the genome of Prochlorococcus, especially in the low light strains, suggesting that they might use other sources of energy in addition to light (García-Fernández and Diez 2004). This potentially mitigates the effects of low irradiance on nitrogen uptake in this organism, which would explain why low light Prochlorococcus strains have more nitrogen in their genomes. A recent review (Raven et al. 2013) suggests a theoretical association between AT/GC ratios in genomes (which could culminate in nitrogen bias) and UV irradiation, but notes that this is not commonly observed in nature because multiple other factors influencing genome content may play a more significant role than light alone.

Slow Rates of Evolution

The results of two independent tests show that Ostreobium has a slower rate of molecular evolution than closely related lineages. Other ulvophytes also seem to have slow rates of molecular evolution, which might be related to other species traits not analyzed here, but in Ostreobium, the most reasonable explanations relate to the effects of the low light niche that this endolithic alga occupies. Sunlight, including UV radiation, induces DNA damage, mutations and rearrangements (Ries et al. 2000; Raven et al. 2013; Kumar et al. 2014). While these changes often get repaired (see Boesch et al. 2011 for mechanisms), the frequency with which remaining mutations are passed through generations dictates the molecular pacemaker (Baer et al. 2007). Following this logic, low light lineages will likely have slower rates of molecular evolution than lineages living in high light conditions, as observed in Ostreobium and in low light strains of Prochlorococcus (Dufresne et al. 2005). In Prochlorococcus, it is likely that the loss of DNA repair genes also contributes to an increase in mutation rates in high light strains (Dufresne et al. 2005).

Sunlight also shapes evolutionary rates through environmental energy, as it sustains primary productivity and ambient temperature. Energy-rich habitats are the epicenter of evolutionary change worldwide (Davies et al. 2004; Jetz and Fine 2012; Wright and Rohde 2013). This environmental energy is positively correlated to metabolic rates in many organisms (Allen et al. 2002) and the by-products of metabolic reactions (e.g., reactive oxygen and nitrogen species) are another major source of mutations (Gillooly et al. 2005; Boesch et al. 2011). It has been proposed that more solar radiation and higher temperatures increase metabolism and growth rates, shortening generation times and increasing mutation rates (Rohde 1992).
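The rate comparison behind this section reduces to a few lines of arithmetic. The sketch below takes the values reported under Rates of Evolution (branch rates of 0.81 for Ostreobium versus 1.00 for the remaining branches, ΔAIC = 92) and the standard definition AIC = 2k − 2 ln L. That the local clock model has exactly one more free parameter than the global clock is an assumption made for illustration, the placeholder log-likelihoods are hypothetical, and the function names are not from any package.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: AIC = 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def percent_slower(focal_rate, background_rate):
    """How much slower the focal branch evolves, in percent of the background rate."""
    return 100 * (background_rate - focal_rate) / background_rate


# Reported ML branch rates: 0.81 (Ostreobium) versus 1.00 (all other branches).
slowdown = percent_slower(0.81, 1.00)

# Reported support for the local clock: delta AIC = AIC(global) - AIC(local) = 92.
delta_aic = 92
extra_params = 1  # ASSUMPTION: the local clock adds a single rate parameter
implied_lnl_gain = (delta_aic + 2 * extra_params) / 2

# Consistency check with placeholder log-likelihoods (hypothetical, not from the study).
lnl_global, k_global = -1000.0, 5
lnl_local, k_local = lnl_global + implied_lnl_gain, k_global + extra_params
recovered_delta = aic(lnl_global, k_global) - aic(lnl_local, k_local)

print(f"Ostreobium branch is {slowdown:.0f}% slower than the background")
print(f"implied log-likelihood gain: {implied_lnl_gain:.1f}")
print(f"recovered delta AIC: {recovered_delta:.1f}")
```

Under the one-extra-parameter assumption, a ΔAIC of 92 corresponds to a log-likelihood improvement of 47 units, far more than the single unit needed to offset the penalty for the additional rate parameter.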


Shorter generations lead to more mutations accumulated per unit of time, so species living in high-energy habitats tend to have faster rates of molecular evolution (Bromham 2011). One could speculate that the low energy niche that Ostreobium occupies results in slow metabolic rates and long generation times (although they are unknown for this alga), culminating in a slow molecular pacemaker. Longer generation times have been associated with slow rates of molecular evolution in tree ferns (Zhong et al. 2014), which are also shade plants (Page 2002).

Selection in the Ostreobium Chloroplast Genome

We did not find evidence for positive selection on genes related to photosynthesis in the lineage leading to Ostreobium (table 2 and supplementary table S5, Supplementary Material online). On the contrary, we observed some signs, though weak, of stronger purifying selection in this lineage. Ostreobium is known to have several features that facilitate low light photosynthesis. It is able to produce red-shifted chlorophylls and uses an uncommon uphill energy transfer from these chlorophylls to photosystem II (Koehne et al. 1999; Wilhelm and Jakob 2006). The photosynthesis-related proteins that are more likely to be affected by low light (e.g., the light harvesting complex superfamily and the pigments involved in light capture) are encoded in the nucleus (Green and Parson 2003), and so innovations in these genes would not be detected in our analysis. The recently sequenced nuclear genome of the seagrass Zostera marina revealed an expanded number of light harvesting complex B genes (Olsen et al. 2016). Like Ostreobium, Zostera is adapted to a light depleted (aquatic) niche when compared with its land plant relatives. We expect that interesting findings will result for Ostreobium with the analysis of transcriptome and nuclear genome data.

Another scenario that may have contributed to not observing selection is that the lineage leading to Ostreobium could have experienced an early burst of positive selection followed by purifying selection, and such a history may go undetected in analyses. If innovations related to low light adaptation appeared early in Ostreobium evolution and increased its fitness, it is expected that they would be immediately followed by purifying selection, especially if the loss of the cemA gene caused intolerance to high light and confined the ancestral endolithic lineage to shaded habitats (where any mutation decreasing photosynthesis performance is likely to lead to decreased fitness). This scenario provides a plausible explanation for the stronger purifying selection on photosynthesis-related genes in the branch leading to Ostreobium when compared with other branches in the phylogeny. The available tools may not have enough power to detect faint episodes of selection, particularly if the data are saturated with synonymous substitutions or if selection occurred at deep internal branches (Kosakovsky Pond et al. 2011; Gharib and Robinson-Rechavi 2013). Although both analyses show some sign of a stronger purifying selection in the Ostreobium lineage, these results should be interpreted with caution as the phylogeny contains long branches (implying long periods of time: Ostreobium, for example, diverged 500 Ma ago), therefore substitutions may have saturated the data to a point where evolution cannot be reliably characterized by the models. Simulations mimicking the evolution of algal chloroplast genomes may help to characterize those methodological limitations. Finally, the power of these analyses will certainly increase as more genomic data of high and low light-adapted lineages become available.

Conclusion

We present the chloroplast genomes of four green algae (Bryopsidales) and investigate the genomic footprints of a low light lifestyle in the endolithic Ostreobium quekettii. This alga has the smallest and most gene-packed chloroplast genome among Ulvophyceae, which is a possible adaptation to light-related resource constraints. The molecular pacemaker is significantly slower in the phylogenetic branch leading to Ostreobium, consistent with a scenario where low energy levels reduce rates of molecular evolution. Unexpectedly, we observed some signs of higher levels of purifying selection in the photosynthesis-related genes in Ostreobium when compared with other algae. It is still unclear whether this result is related to an early episodic positive selection followed by a strong purifying selection or to a methodological limitation, as the current methods may not have the power to detect selection in deep-branching lineages, especially if the data are saturated with substitutions. Sequencing additional chloroplast and nuclear genomes of different Ostreobium lineages and other low light adapted species will help to further clarify the genomic correlates of low light adaptations.

Supplementary Material

Supplementary figures S1–S3 and tables S1–S5 are available at Genome Biology and Evolution online (http://www.gbe.oxfordjournals.org/).

Acknowledgments

This work was supported by the Australian Biological Resources Study (RFL213-08), the Australian Research Council (FT110100585, DP150100705), the Holsworth Wildlife Research Endowment and the Sapere Aude Advanced grant from the Danish Council for Independent Research for the Natural Sciences. The Sophie Ducker Postgraduate Scholarship supported the publication fee. V.R.M. and M.C.M.C. receive a University of Melbourne scholarship. The Caulerpa sample was collected under the DEC Flora permit 10006072. We thank John West for providing the Derbesia strain and Claude Payri for facilitating field work


in PNG. We thank four anonymous reviewers for their helpful comments. We are thankful to Karolina Fucikova, John Raven and the members of the Verbruggen lab for valuable insights during the execution of this study and preparation of the article.

Literature Cited

Acquisti C, Elser JJ, Kumar S. 2009. Ecological nitrogen limitation shapes the DNA composition of plant genomes. Mol Biol Evol. 26:953–956.
Allen AP, Brown JH, Gillooly JF. 2002. Global biodiversity, biochemical kinetics, and the energetic-equivalence rule. Science 297:1545–1548.
Aponte NE, Ballantine DL. 2001. Depth distribution of algal species on the deep insular fore reef at Lee Stocking Island, Bahamas. Deep Sea Res I Oceanogr Res Pap. 48:2185–2194.
Baer CF, Miyamoto MM, Denver DR. 2007. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nat Rev Genet. 8:619–631.
Boesch P, et al. 2011. DNA repair in organelles: pathways, organization, regulation, relevance in disease and aging. Biochim Biophys Acta Mol Cell Res. 1813:186–200.
Bromham L. 2011. The genome as a life-history character: why rate of molecular evolution varies between mammal species. Philos Trans R Soc B Biol Sci. 366:2503–2513.
Carreiro-Silva M, Kiene WE, Golubic S, McClanahan TR. 2012. Phosphorus and nitrogen effects on microbial euendolithic communities and their bioerosion rates. Mar Pollut Bull. 64:602–613.
Castillo-Davis CI, Mekhedov SL, Hartl DL, Koonin EV, Kondrashov FA. 2002. Selection for short introns in highly expressed genes. Nat Genet. 31:415–418.
Cavalier-Smith T. 2005. Economy, speed and size matter: evolutionary forces driving nuclear genome miniaturization and expansion. Ann Bot. 95:147–175.
Charles H, Mouchiroud D, Lobry J, Gonçalves I, Rahbe Y. 1999. Gene size reduction in the bacterial aphid endosymbiont Buchnera. Mol Biol Evol. 16:1820–1822.
Chen M, Blankenship RE. 2011. Expanding the solar spectrum used by photosynthesis. Trends Plant Sci. 16:427–431.
Clarke A, Gaston KJ. 2006. Climate, energy and diversity. Proc Biol Sci. 273:2257–2266.
Cochlan WP, Price NM, Harrison PJ. 1991. Effects of irradiance on nitrogen uptake by phytoplankton: comparison of frontal and stratified communities. Mar Ecol Prog Ser. 69:103–116.
Cremen MCM, Huisman JM, Marcelino VR, Verbruggen H. 2016. Taxonomic revision of Halimeda (Bryopsidales, Chlorophyta) in south-western Australia. Aust Syst Bot. 29:41–54.
Davies TJ, Savolainen V, Chase MW, Moat J, Barraclough TG. 2004. Environmental energy and evolutionary rates in flowering plants. Proc R Soc B Biol Sci. 271:2195–2200.
de Koning AP, Keeling PJ. 2006. The complete plastid genome sequence of the parasitic green alga Helicosporidium sp. is highly reduced and structured. BMC Biol. 4:12.
Derelle E, et al. 2006. Genome analysis of the smallest free-living eukaryote Ostreococcus tauri unveils many unique features. Proc Natl Acad Sci U S A. 103:11647–11652.
Dufresne A, Garczarek L, Partensky F. 2005. Accelerated evolution associated with genome reduction in a free-living prokaryote. Genome Biol. 6:R14.
Dullo W-C, et al. 1995. Factors controlling holocene reef growth: an interdisciplinary approach. Facies 32:145–188.
Dutta C, Paul S. 2012. Microbial lifestyle and genome signatures. Curr Genomics 13:153–162.
Elser JJ, Acquisti C, Kumar S. 2011. Stoichiogenomics: the evolutionary ecology of macromolecular elemental composition. Trends Ecol Evol. 26:38–44.
Fork DC, Larkum AWD. 1989. Light harvesting in the green alga Ostreobium sp., a coral symbiont adapted to extreme shade. Mar Biol. 103:381–385.
García-Fernández JM, Diez J. 2004. Adaptive mechanisms of nitrogen and carbon assimilatory pathways in the marine cyanobacteria Prochlorococcus. Res Microbiol. 155:795–802.
Gharib WH, Robinson-Rechavi M. 2013. The branch-site test of positive selection is surprisingly robust but lacks power under synonymous substitution saturation and variation in GC. Mol Biol Evol. 30:1675–1686.
Gillooly JF, Allen AP, West GB, Brown JH. 2005. The rate of DNA evolution: effects of body size and temperature on the molecular clock. Proc Natl Acad Sci. 102:140–145.
Giovannoni SJ, et al. 2005. Genome streamlining in a cosmopolitan oceanic bacterium. Science 309:1242–1245.
Goldman N, Yang Z. 1994. A codon-based model of nucleotide substitution for protein-coding DNA sequences. Mol Biol Evol. 11:725–736.
Green BR, Parson WW, editors. 2003. Light-harvesting antennas in photosynthesis. Dordrecht: Springer Netherlands.
Halldal P. 1968. Photosynthetic capacities and photosynthetic action spectra of endozoic algae of the massive coral Favia. Biol Bull. 134:411–424.
Harrison PJ, Hurd CL. 2001. Nutrient physiology of seaweeds: application of concepts to aquaculture. Cah Biol Mar. 42:71–82.
Hess WR, et al. 2001. The photosynthetic apparatus of Prochlorococcus: insights through comparative genomics. Photosynth Res. 70:53–71.
Hessen DO, Jeyasingh PD, Neiman M, Weider LJ. 2010. Genome streamlining and the elemental costs of growth. Trends Ecol Evol. 25:75–80.
Jancek S, Gourbière S, Moreau H, Piganeau G. 2008. Clues about the genetic basis of adaptation emerge from comparing the proteomes of two Ostreococcus ecotypes (Chlorophyta, Prasinophyceae). Mol Biol Evol. 25:2293–2300.
Jetz W, Fine PVA. 2012. Global gradients in vertebrate diversity predicted by historical area-productivity dynamics and contemporary environment. PLoS Biol. 10:e1001292.
Kageyama A, Yokohama Y, Shimura S, Ikawa T. 1977. An efficient excitation energy transfer from a carotenoid, siphonaxanthin to chlorophyll a observed in a deep-water species of chlorophycean seaweed. Plant Cell Physiol. 18:477–480.
Kang M, Wang J, Huang H. 2015. Nitrogen limitation as a driver of genome size evolution in a group of karst plants. Sci Rep. 5:11636.
Kapraun DF. 2007. Nuclear DNA content estimates in green algal lineages: Chlorophyta and Streptophyta. Ann Bot. 99:677–701.
Katoh K, Misawa K, Kuma K, Miyata T. 2002. MAFFT: a novel method for rapid multiple sequence alignment based on fast Fourier transform. Nucleic Acids Res. 30:3059–3066.
Kearse M, et al. 2012. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28:1647–1649.
Kirk JTO. 1994. Light and photosynthesis in aquatic ecosystems. Cambridge: Cambridge University Press.
Koehne B, Elli G, Jennings RC, Wilhelm C, Trissl HW. 1999. Spectroscopic and molecular characterization of a long wavelength absorbing antenna of Ostreobium sp. Biochim Biophys Acta Bioenerg. 1412:94–107.
Kosakovsky Pond SL, et al. 2011. A random effects branch-site model for detecting episodic diversifying selection. Mol Biol Evol. 28:3033–3043.
Kosakovsky Pond SL, Frost SDW, Muse SV. 2005. HyPhy: hypothesis testing using phylogenies. Bioinformatics 21:676–679.

Kühl M, Holst G, Larkum AWD, Ralph PJ. 2008. Imaging of oxygen dynamics within the endolithic algal community of the massive coral Porites lobata. J Phycol. 44:541–550.
Kumar RA, Oldenburg DJ, Bendich AJ. 2014. Changes in DNA damage, molecular integrity, and copy number for plastid DNA and mitochondrial DNA during maize development. J Exp Bot. 65:6425–6439.
Larkum AWD, Barrett J. 1983. Light-harvesting processes in algae. Adv Bot Res. 10:1–219.
Larkum AWD, Douglas SE, Raven JA, editors. 2003. Photosynthesis in algae. Dordrecht: Springer Netherlands.
Larkum AWD, Koch EW, Kühl M. 2003. Diffusive boundary layers and photosynthesis of the epilithic algal community of coral reefs. Mar Biol. 142:1073–1082.
Larsson J, Nylander JA, Bergman B. 2011. Genome fluctuations in cyanobacteria reflect evolutionary, developmental and adaptive traits. BMC Evol Biol. 11:187.
Lehninger AL, Nelson DL, Cox MM. 1993. Principles of biochemistry. 2nd edn. New York: Worth Publishers.
Leliaert F, et al. 2016. Chloroplast phylogenomic analyses reveal the deepest-branching lineage of the Chlorophyta, Palmophyllophyceae class. nov. Sci Rep. 6:25367.
Leliaert F, Lopez-Bautista JM. 2015. The chloroplast genomes of Bryopsis plumosa and Tydemania expeditiones (Bryopsidales, Chlorophyta): compact genomes and genes of bacterial origin. BMC Genomics 16:204.
Lemieux C, Otis C, Turmel M. 2000. Ancestral chloroplast genome in Mesostigma viride reveals an early branch of green plant evolution. Nature 403:649–652.
Lemieux C, Otis C, Turmel M. 2014. Six newly sequenced chloroplast genomes from prasinophyte green algae provide insights into the relationships among prasinophyte lineages and the diversity of streamlined genome architecture in picoplanktonic species. BMC Genomics 15:857.
Li D, Liu C-M, Luo R, Sadakane K, Lam T-W. 2015. MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31:1674–1676.
Littler MM, Littler DS, Blair SM, Norris JN. 1985. Deepest known plant life discovered on an uncharted seamount. Science 227:57–59.
Luo R, et al. 2012. SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. Gigascience 1:18.
Lynch M. 2006. Streamlining and simplification of microbial genome architecture. Annu Rev Microbiol. 60:327–349.
Lynch M, Koskella B, Schaack S. 2006. Mutation pressure and the evolution of organelle genomic architecture. Science 311:1727–1730.
MacIsaac JJ, Dugdale RC. 1972. Interactions of light and inorganic nitrogen in controlling nitrogen uptake in the sea. Deep Sea Res Oceanogr Abstr. 19:209–232.
Magnusson SH, Fine M, Kühl M. 2007. Light microclimate of endolithic phototrophs in the scleractinian corals Montipora monasteriata and Porites cylindrica. Mar Ecol Prog Ser. 332:119–128.
Marcelino V, Verbruggen H. 2016. Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Sci Rep. 6:31508.
Matthes U, Turner SJ, Larson DW. 2001. Light attenuation by limestone rock and its constraint on the depth distribution of endolithic algae and cyanobacteria. Int J Plant Sci. 162:263–270.
Mayer C. 2007. Phobos: a tandem repeat search tool. [cited 2016 Sep 2]. Available from: http://www.geneious.com/plugins/phobos-plugin.
McNeal JR, Kuehl JV, Boore JL, de Pamphilis CW. 2007. Complete plastid genome sequences suggest strong selection for retention of photosynthetic genes in the parasitic plant genus Cuscuta. BMC Plant Biol. 7:57.
Mira A, Ochman H, Moran NA. 2001. Deletional bias and the evolution of bacterial genomes. Trends Genet. 17:589–596.
Mock T, Kroon BMA. 2002. Photosynthetic energy conversion under extreme conditions—II: the significance of lipids under light limited growth in Antarctic sea ice diatoms. Phytochemistry 61:53–60.
Nienow JA, McKay CP, Friedmann EI. 1988. The cryptoendolithic microbial environment in the Ross Desert of Antarctica: light in the photosynthetically active region. Microb Ecol. 16:271–289.
Nurk S, et al. 2013. Assembling genomes and mini-metagenomes from highly chimeric reads. In: Deng M, Jiang R, Sun F, Zhang X, editors. Research in Computational Molecular Biology: 17th Annual International Conference, RECOMB 2013, Beijing, China, April 7-10, 2013. Proceedings. Berlin, Heidelberg: Springer. p. 158–170.
Olsen JL, et al. 2016. The genome of the seagrass Zostera marina reveals angiosperm adaptation to the sea. Nature 530:331–335.
Page CN. 2002. Ecological strategies in fern evolution: a neopteridological overview. Rev Palaeobot Palynol. 119:1–33.
Paul S, Dutta A, Bag SK, Das S, Dutta C. 2010. Distinct, ecotype-specific genome and proteome signatures in the marine cyanobacteria Prochlorococcus. BMC Genomics 11:103.
Pombert J-F. 2005. The Chloroplast genome sequence of the green alga Pseudendoclonium akinetum (Ulvophyceae) reveals unusual structural features and new insights into the branching order of Chlorophyte lineages. Mol Biol Evol. 22:1903–1918.
Raven JA, Beardall J, Larkum AWD, Sanchez-Baracaldo P. 2013. Interactions of photosynthesis with genome size and function. Philos Trans R Soc B Biol Sci. 368:20120264–20120264.
Raven JA, Colmer TD. 2016. Life at the boundary: photosynthesis at the soil–fluid interface. A synthesis focusing on mosses. J Exp Bot. 67:1613–1623.
Ries G, et al. 2000. Elevated UV-B radiation reduces genome stability in plants. Nature 406:98–101.
Rocap G, et al. 2003. Genome divergence in two Prochlorococcus ecotypes reflects oceanic niche differentiation. Nature 424:1042–1047.
Rohde K. 1992. Latitudinal gradients in species diversity: the search for the primary cause. Oikos 65:514–527.
Rolland N. 1997. Disruption of the plastid ycf10 open reading frame affects uptake of inorganic carbon in the chloroplast of Chlamydomonas. Embo J. 16:6713–6726.
Rothschild LJ. 1999. The influence of UV radiation on Protistan evolution. J Eukaryot Microbiol. 46:548–555.
Sánchez-Baracaldo P. 2015. Origin of marine planktonic cyanobacteria. Sci Rep. 5:17418.
Sauvage T, Schmidt WE, Suda S, Fredericq S. 2016. A metabarcoding framework for facilitated survey of endolithic phototrophs with tufA. BMC Ecol. 16:8.
Schlichter D, Kampmann H, Conrady S. 1997. Trophic potential and photoecology of endolithic algae living within coral skeletons. Mar Ecol. 18:299–317.
Shashar N, Stambler N. 1992. Endolithic algae within corals – life in an extreme environment. J Exp Mar Bio Ecol. 163:277–286.
Simon D, Fewer D, Friedl T, Bhattacharya D. 2003. Phylogeny and self-splicing ability of the plastid tRNA-Leu group I Intron. J Mol Evol. 57:710–720.
Stamatakis A. 2006. RAxML-VI-HPC: maximum likelihood-based phylogenetic analyses with thousands of taxa and mixed models. Bioinformatics 22:2688–2690.
Swingley WD, et al. 2008. Niche adaptation and genome expansion in the chlorophyll d-producing cyanobacterium Acaryochloris marina. Proc Natl Acad Sci U S A. 105:2005–2010.
Tribollet A. 2008. The boring microflora in modern coral reef ecosystems: a review of its roles. In: Wisshak M, Tapanila L, editors. Current developments in bioerosion. Berlin, Heidelberg: Springer Berlin Heidelberg. p. 67–94.

Vitousek P, Howarth R. 1991. Nitrogen limitation on land and in the sea: how can it occur? Biogeochemistry 13:87–115.
Vogel K, Brett CE. 2009. Record of microendoliths in different facies of the Upper Ordovician in the Cincinnati Arch region USA: the early history of light-related microendolithic zonation. Palaeogeogr Palaeoclimatol Palaeoecol. 281:1–24.
Wernersson R, Pedersen AG. 2003. RevTrans: multiple alignment of coding DNA from aligned amino acid sequences. Nucleic Acids Res. 31:3537–3539.
Wicke S, et al. 2013. Mechanisms of functional and physical genome reduction in photosynthetic and nonphotosynthetic parasitic plants of the broomrape family. Plant Cell 25:3711–3725.
Wicke S, Schneeweiss GM, DePamphilis CW, Müller KF, Quandt D. 2011. The evolution of the plastid chromosome in land plants: gene content, gene order, gene function. Plant Mol Biol. 76:273–297.
Wilhelm C, Jakob T. 2006. Uphill energy transfer from long-wavelength absorbing chlorophylls to PSII in Ostreobium sp. is functional in carbon assimilation. Photosynth Res. 87:323–329.
Willis KJ, Bennett KD, Birks HJB. 2009. Variability in thermal and UV-B energy fluxes through time and their influence on plant diversity and speciation. J Biogeogr. 36:1630–1644.
Wolf YI, Koonin EV. 2013. Genome reduction as the dominant mode of evolution. BioEssays 35:829–837.
Wright SD, Rohde K. 2013. Energy and spatial order in niche and community. Biol J Linn Soc. 110:696–714.
Wu CS, Lai YT, Lin CP, Wang YN, Chaw SM. 2009. Evolution of reduced and compact chloroplast genomes (cpDNAs) in gnetophytes: selection toward a lower-cost strategy. Mol Phylogenet Evol. 52:115–124.
Yan D, et al. 2015. Auxenochlorella protothecoides and Prototheca wickerhamii plastid genome sequences give insight into the origins of non-photosynthetic algae. Sci Rep. 5:14465.
Yang Z. 1998. Likelihood ratio tests for detecting positive selection and application to primate lysozyme evolution. Mol Biol Evol. 15:568–573.
Yang Z. 2007. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 24:1586–1591.
Zhong B, Fong R, Collins LJ, McLenachan PA, Penny D. 2014. Two new fern chloroplasts and decelerated evolution linked to the long generation time in tree ferns. Genome Biol Evol. 6:1166–1173.
Zuccarello GC, Price N, Verbruggen H, Leliaert F. 2009. Analysis of a plastid multigene data set and the phylogenetic position of the marine macroalga Caulerpa filiformis (Chlorophyta). J Phycol. 45:1206–1212.

Associate editor: John Archibald

5.6 Supplementary Materials and Methods

Samples and Sequencing

The Ostreobium quekettii sample (voucher SAG 6.99) was obtained from the Culture Collection of Algae at the University of Göttingen. The Halimeda discoidea sample (voucher HV04923) was collected in New Ireland, Papua New Guinea in 2014. Caulerpa cliftonii (voucher HV03798) was collected in Point Lonsdale (VIC, Australia) in 2013. The strain of Derbesia sp. (WEST4838) was obtained from the culture collection of John West (University of Melbourne).

Total genomic DNA was extracted using a modified cetyl trimethylammonium bromide (CTAB) method as described in Cremen et al. (2016). Library preparation and sequencing were performed by the Georgia Genome Facility (Ostreobium) or by the Genome Center of Cold Spring Harbor Marine Laboratory (Halimeda, Derbesia and Caulerpa). The Ostreobium DNA extract was sheared to ca. 500 bp (Duty 5%, PIP 105, CpB 200, time 80 sec), transferred to a 96-well plate (with other libraries) and processed using the Kapa Biosystems DNA Library Preparation Kit, using ligation with a common adapter stub followed by PCR addition of full-length dual-indexed adapters and dual SPRI size selection. After six cycles of PCR, the amplification was checked by agarose gel electrophoresis and low-concentration samples were subjected to additional amplification. Libraries were purified and their concentration was determined by fluorometry. The libraries were normalized to 10 nM, quantified with qPCR and pooled in equivalent amounts. The pools of libraries were run on the Illumina NextSeq 500 using PE150 High Output settings. The sequencing run generated 11.4 million paired-end reads (2 × 151 bp) for the Ostreobium library. The DNA extracts of Caulerpa, Derbesia and Halimeda were sheared to ca. 350 bp, and the libraries were prepared with a TruSeq Nano LT Kit (Illumina) and sequenced on an Illumina HiSeq 2000 platform (2 × 100 bp). The sequencing run generated 15.7 million paired-end reads for Caulerpa, 24.5 million paired-end reads for Derbesia and 19.4 million paired-end reads for Halimeda.
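As a rough illustration of the sequencing effort, the raw yield per library can be estimated from the read-pair counts and read lengths reported above. The sketch below is not part of the original workflow; the function and variable names are illustrative, and the result is total sequenced bases for the whole DNA extract (nuclear, organellar and any associated bacterial DNA), not chloroplast coverage.

```python
def total_bases(read_pairs: float, read_length: int) -> float:
    """Total sequenced bases for a paired-end run (two reads per pair)."""
    return read_pairs * 2 * read_length


# Read-pair counts and read lengths as reported in the text above.
libraries = {
    "Ostreobium": (11.4e6, 151),
    "Caulerpa": (15.7e6, 100),
    "Derbesia": (24.5e6, 100),
    "Halimeda": (19.4e6, 100),
}

# Raw yield per library in gigabases (Gbp).
yields_gbp = {
    name: total_bases(pairs, length) / 1e9
    for name, (pairs, length) in libraries.items()
}

for name, gbp in yields_gbp.items():
    print(f"{name}: {gbp:.2f} Gbp")
```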

Assembly

Sequences were assembled using CLC Genomics Workbench 7.5.1 (http://www.clcbio.com). Quality trimming was done within CLC using default settings. De novo assembly was done using automatic word (k-mer) and bubble sizes, a minimum contig size of 1000 bp, simple contig production, and default parameters for the rest. The chloroplast contigs were identified using BLAST. Circularity and ambiguous scaffold regions were resolved by comparing the CLC assembly with assemblies generated independently with MEGAHIT (Li et al. 2015), SOAPdenovo2 (Luo et al. 2012) and SPAdes (Nurk et al. 2013). For the MEGAHIT assembly, raw (untrimmed) sequence reads were used, a minimum contig length of 1000 bp was set and default values were used for the remaining settings. For the SOAPdenovo2 assembly, the following settings were used: max. read length = 150 bp for Ostreobium and 101 bp for the other seaweeds; average insert size = 200 bp and 300 bp for Ostreobium and the other algae, respectively; read length cut-off (quality trimming) = 140 and 95 for Ostreobium and the other algae, respectively; cut-off of pair number = 3; minimum aligned length to contigs for a reliable read location = 40 bp. For the SPAdes assembly, sequences were trimmed with Trimmomatic (Bolger et al. 2014) and SPAdes was run using the --careful flag and otherwise default settings. Scaffold regions were checked and resolved by comparing the results of the different assemblers, by “closing the gaps” with GapCloser (Luo et al. 2012) and by mapping the sequence reads to the scaffold in Geneious 9.0.4 (Kearse et al. 2012). The assembly generated by CLC Workbench for Halimeda had two gapped scaffold regions that remain uncertain. One, assembled with the SPAdes assembler, has a coverage peak, suggesting that a higher number of repeats may be present. The other region was not assembled by any of the software packages (the assemblers only identified it as a scaffold based on paired-end reads) and was coded as 100 Ns, as suggested by NCBI in cases where the gap size is unknown. We did not attempt to use PCR to resolve these regions because they do not affect the conclusions of this manuscript. Coverage of all genomes, including a detailed view of the coverage peak on the Halimeda scaffolds, can be found in Supplementary Figure S2. Coverage was calculated with Geneious.
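The convention of coding a gap of unknown size as 100 Ns can be sketched as follows. This is a minimal illustration with hypothetical helper names and toy sequences, not the procedure actually run in Geneious or by the assemblers.

```python
import re

# Placeholder used for a gap whose true size is unknown (100 Ns, as in the text).
UNKNOWN_GAP = "N" * 100


def join_with_unknown_gap(left_contig: str, right_contig: str) -> str:
    """Join two contigs that paired-end reads place next to each other,
    separated by a fixed-length run of Ns marking an unsized gap."""
    return left_contig + UNKNOWN_GAP + right_contig


def find_gaps(sequence: str) -> list[tuple[int, int]]:
    """Return 0-based, half-open (start, end) coordinates of every run of Ns."""
    return [(m.start(), m.end()) for m in re.finditer(r"N+", sequence)]


# Toy example with made-up contigs.
scaffold = join_with_unknown_gap("ACGT", "TTGA")
gaps = find_gaps(scaffold)
```

Locating the N runs afterwards is useful when annotating the scaffold, because features should not be called across a region whose length is only a placeholder.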

Annotation

The sequences were submitted to MFannot (http://megasun.bch.umontreal.ca/RNAweasel), DOGMA (Wyman et al. 2004) and ARAGORN (Laslett 2004) online tools. MFannot was run with the table 11 genetic code and otherwise default parameters. DOGMA was run with a 60% cutoff for protein coding genes, 80% for RNAs, and a BLAST e-value of 1e-5. ARAGORN was run with the following settings: type (tRNA and tmRNA) = both; allow introns = yes; topology = circular; strand = both. The results were converted to the GFF3 file format and inspected manually in Geneious. All resulting annotations were manually compared, vetted, and added to the final annotation layer once their accuracy was verified. Start and stop positions of coding sequences and introns were visually verified by aligning them with sequences of other algae using MAFFT (Katoh et al. 2002). Intron types were determined with RNAweasel (http://megasun.bch.umontreal.ca/RNAweasel). Open reading frames (ORFs) were predicted with the GLIMMER (http://ccb.jhu.edu/software.shtml) plugin in Geneious.

For the comparative analyses, we excluded all hypothetical ORFs shorter than 300 bp and re-annotated the tilS gene as a pseudogene (not a CDS) in species where it has a frame shift or a stop codon in the middle of the gene (Tydemania and Bryopsis). The number of repeats larger than 50 bp was calculated in Geneious, allowing zero mismatches, excluding repeats up to 10 bp longer than a contained repeat, and excluding contained repeats when the longer repeat has a frequency of at least 3. We used the Geneious implementation of Phobos v.3.3.11 (Mayer 2007) to identify tandem repeats with lengths between 15 and 1000 bp, using the “perfect” search mode. Palindromic repeats were calculated with the Emboss suite (http://emboss.bioinformatics.nl/), using default values (minimum length = 10, maximum length = 100, maximum gap = 100 bp). In order to compare the genomes’ synteny (Figure 2), the genomes were rearranged to start at the 16S rDNA position and aligned with the progressive Mauve algorithm implemented in Geneious, using the full alignment option and automated calculation of the minimum locally collinear block score.
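Rearranging a circular genome so that it starts at a chosen feature amounts to rotating the linearised sequence. The sketch below illustrates the idea with a hypothetical function and a toy sequence; it assumes the feature lies on the forward strand (a feature on the reverse strand would additionally require reverse-complementing) and is not the Geneious routine used here.

```python
def rotate_circular(sequence: str, start: int) -> str:
    """Rotate a circular sequence so that the 0-based index `start`
    becomes the first position of the linearised sequence."""
    if not sequence:
        return sequence
    start %= len(sequence)  # wrap indices beyond the sequence length
    return sequence[start:] + sequence[:start]


# Toy example: pretend the feature of interest begins at index 3.
toy_genome = "AAACGTTT"
rotated = rotate_circular(toy_genome, 3)
```

Because the molecule is circular, the rotation changes only the coordinate origin, so gene order and content are unaffected and downstream synteny alignments start from a comparable position in every genome.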

Searching cemA

The cemA gene was lost from the chloroplast genome of Ostreobium. In order to assess whether it has been transferred to the nucleus, we performed BLAST searches in the contigs generated by the CLC assembly and in transcriptome data kindly provided by a member of the Verbruggen lab, who generated these data for an unrelated study. The transcriptome was obtained from the same Ostreobium strain (SAG 6.99) used to sequence the chloroplast genome. We used the cemA gene of Bryopsis plumosa as a reference.

We performed blastn and tblastx searches using the CLC contigs as a BLAST database. No hit was obtained with E-value < 1. The three hits obtained (all with tblastx) had bit scores between 31 and 34.

We performed similar searches (blastn and tblastx) using the contigs generated from the transcriptome data. No hit was obtained with tblastx. With blastn, one hit was obtained with e-value = 0.62 (bit score = 37), and the matching sequence was 27 bp long.
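To give a sense of why hits with bit scores in the 30s are uninformative, the standard BLAST relationship between bit score and expected number of chance hits, E = m × n × 2^(−S′), can be evaluated directly. The sketch below is illustrative only: the query and database lengths are hypothetical placeholders (the real search-space sizes are not reported here), and actual BLAST programs use effective lengths and, for tblastx, translated search spaces.

```python
def expected_chance_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`,
    using E = m * n * 2**(-S'), with m and n the query and database lengths."""
    return query_len * db_len * 2.0 ** (-bit_score)


# Hypothetical search space: a 1,000-letter query against 100 million letters.
QUERY_LEN = 1_000
DB_LEN = 100_000_000

e_at_31 = expected_chance_hits(31, QUERY_LEN, DB_LEN)
e_at_37 = expected_chance_hits(37, QUERY_LEN, DB_LEN)
e_at_60 = expected_chance_hits(60, QUERY_LEN, DB_LEN)
```

Under these assumed lengths, scores in the low 30s correspond to expected chance hits near or above one, whereas a score of 60 would be expected by chance far less than once, which is the kind of hit a genuine nuclear copy of cemA would be expected to produce.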

Rates of evolution

Besides estimating whether rates of evolution are significantly different along the branch leading to Ostreobium with the model selection procedure implemented in PAML (Yang 2007), we also verified the rates of molecular evolution under a relaxed clock model, whereby rates are free to vary on all branches of the phylogeny. We ran a PhyloBayes (Lartillot et al. 2009) analysis with a relaxed lognormal (autocorrelated) clock model, a root prior (610 Ma) and a CAT-GTR substitution model for 131,956 cycles. After verifying convergence with Tracer (Rambaut et al. 2014), we applied a burn-in of 25,000 cycles to summarize the trees. The analysis produced a chronogram (whereby branch lengths represent time) and a phylogram (whereby branch lengths represent the amount of substitutions). We then estimated the amount of molecular change per time unit for each branch by dividing the branch lengths of the phylogram by the corresponding branch lengths of the chronogram.
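The per-branch rate is the amount of substitution on a branch divided by the time that branch spans, consistent with the caption of Supplementary Figure S3 (branch length divided by age). A minimal sketch follows, assuming the two trees share topology and branch labels; the branch names and values are invented for illustration and are not results from this study.

```python
def branch_rates(phylogram: dict[str, float], chronogram: dict[str, float]) -> dict[str, float]:
    """Substitutions per site per time unit for each branch.

    `phylogram` maps branch labels to lengths in substitutions per site;
    `chronogram` maps the same labels to durations in time units (e.g. Ma).
    """
    rates = {}
    for branch, substitutions in phylogram.items():
        duration = chronogram[branch]
        if duration <= 0:
            raise ValueError(f"branch {branch!r} has non-positive duration")
        rates[branch] = substitutions / duration
    return rates


# Invented example: two branches of equal duration but different divergence.
example_phylogram = {"slow_lineage": 0.05, "fast_lineage": 0.30}
example_chronogram = {"slow_lineage": 100.0, "fast_lineage": 100.0}
example_rates = branch_rates(example_phylogram, example_chronogram)
```

With equal durations, the branch that accumulated fewer substitutions gets the lower rate, which is the pattern one would look for on the branch leading to Ostreobium.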

Supplementary Figures:

Supplementary Figure S1: Gene map of the chloroplast genomes of Caulerpa cliftonii, Halimeda discoidea and Derbesia sp.

Supplementary Figure S2. Coverage of the four chloroplast genomes sequenced here, generated by mapping the reads to the final contigs in Geneious. Two regions seem to contain a polymorphic or unknown number of repeats in Halimeda (zoom). One of the regions was assembled but shows a coverage peak, indicating that there might be more repeats than what has been estimated by the assembler. The other gap was filled with 100 Ns (coverage minimum observed in the zoom). Other peaks of coverage can be observed in highly conserved regions (e.g. 16S rRNA gene). These are likely due to bacterial sequences (contaminants or symbiotic).

Supplementary Figure S3. Rates of molecular evolution estimated with a relaxed molecular clock model. Branch lengths and ages were estimated under a relaxed lognormal molecular clock in PhyloBayes (Lartillot et al. 2009), and the rates were calculated by dividing each branch’s length by its age.

Online material:

The following supplementary tables can be found at: https://www.dropbox.com/sh/g9thuyr5ohlqzc6/AACG1ZKtQw45l85BNwkv-ohka?dl=0

Supplementary table S1: Comparison of the protein coding gene content in Ulvophyceae

Supplementary table S2: Nitrogen counts per gene

Supplementary table S3: Comparison of gene lengths between Ostreobium and other algae

Supplementary table S4: dN/dS (dN, dS) among species pairs for the different gene classes.

Supplementary table S5: Results of the Branch-Site REL test of episodic selection (HYPHY)

Supplementary References:

Bolger AM, Lohse M, Usadel B (2014) Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics 30, 2114-2120.
Cremen MCM, Huisman JM, Marcelino VR, Verbruggen H (2016) Taxonomic revision of Halimeda (Bryopsidales, Chlorophyta) in south-western Australia. Australian Systematic Botany 29, 41-54.
Katoh K, Standley DM (2013) MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Molecular Biology and Evolution 30, 772-780.
Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, et al. (2012) Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics 28, 1647-1649.
Lartillot N, Lepage T, Blanquart S (2009) PhyloBayes 3: a Bayesian software package for phylogenetic reconstruction and molecular dating. Bioinformatics 25, 2286-2288.
Laslett D (2004) ARAGORN, a program to detect tRNA genes and tmRNA genes in nucleotide sequences. Nucleic Acids Research 32, 11-16.
Li D, Liu C-M, Luo R, Sadakane K, Lam T-W (2015) MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. Bioinformatics 31, 1674-1676.
Luo R, Liu B, Xie Y, Li Z, Huang W, et al. (2012) SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler. GigaScience 1, 18.
Mayer C (2007) Phobos: a tandem repeat search tool.
Nurk S, Bankevich A, Antipov D, Gurevich A, Korobeynikov A, et al. (2013) Assembling genomes and mini-metagenomes from highly chimeric reads. In: Research in Computational Molecular Biology: 17th Annual International Conference, RECOMB 2013, Beijing, China, April 7-10, 2013. Proceedings (eds. Deng M, Jiang R, Sun F, Zhang X), pp. 158-170. Springer Berlin Heidelberg, Berlin, Heidelberg.
Rambaut A, Suchard MA, Xie D, Drummond AJ (2014) Tracer v1.6. Available from http://beast.bio.ed.ac.uk/Tracer
Wyman SK, Jansen RK, Boore JL (2004) Automatic annotation of organellar genomes with DOGMA. Bioinformatics 20, 3252-3255.
Yang Z (2007) PAML 4: Phylogenetic Analysis by Maximum Likelihood. Molecular Biology and Evolution 24, 1586-1591.

CHAPTER 6 DISCUSSION AND PERSPECTIVES

This thesis provides baseline information for a long-term multidisciplinary study on coral holobionts and hologenomes, as illustrated in Figure 6.1. The initial step to understand a holobiont is to assess “who is there” and “what they are doing”. The second step is to understand the underlying mechanisms generating and maintaining species diversity and functional diversity. These mechanisms became established over evolutionary timescales and can be studied under an evolutionary framework. In this chapter I will discuss how this thesis advanced our knowledge of some aspects of this framework and what I consider to be priority areas for future research.

Figure 6.1. A general framework to study holobionts and hologenomes using corals as a model system. The first steps include identifying the biodiversity and function of the members of the holobiont. The distribution of the diversity and functioning of the communities depends on neutral and niche processes affecting species distributions. Evolutionary processes (e.g. natural selection) dictate the relative importance of biotic and abiotic factors shaping species distributions.

6.1 Biodiversity

The first step towards understanding a holobiont is to characterise the biodiversity it contains. Species diversity was for a long time assessed by morphological identification and/or culturing, and therefore only a few species were thought to occur in coral skeletons (e.g. Titlyanov et al. 2008, Tribollet 2008, Carilli et al. 2010). Developing a cost-effective high-throughput sequencing protocol and an analysis pipeline to identify the prokaryotic and eukaryotic endolithic microbes was, by far, the most time-consuming task of this project. This allowed the discovery of dozens of Ostreobium species and many other algae never reported in live coral skeletons before (Marcelino & Verbruggen 2016 - Chapter 2). The results also revealed a surprisingly high phylogenetic diversity, with many distantly related lineages living inside coral skeletons (Chapters 2 and 3).

One challenge for the near future is to adopt a unified taxonomy for the different endolithic algal clades. Four recent studies report several Ostreobium lineages, each using different markers and different tentative names for Ostreobium clades (Gutner-Hoch & Fine 2011; Del Campo et al. 2016; Marcelino & Verbruggen 2016; Sauvage et al. 2016). There is no correspondence between the Ostreobium clades described in the different studies and no consensus about their names. Without this taxonomic consensus it is difficult to guide future work to identify the ecophysiological traits of the different endolithic lineages. Chloroplast genomes can help to match the different clades identified with the different markers: we recently sequenced chloroplast genomes from four Ostreobium strains (Verbruggen et al. Accepted), and used them to classify and match Ostreobium clades (clades C1 – C4 described in Chapter 2) across the UPA and tufA datasets (Chapter 3 onwards). Describing microbial species is a challenge as most of the lineages sequenced have no cultured and isolated representatives. The most feasible solution would be to attribute names to the main clades (e.g. Marcelino & Verbruggen 2016; Sauvage et al. 2016) and incorporate the classified OTU sequences in reference databases that can be used in future studies.

Besides establishing a baseline of the microbial biodiversity in coral skeletons, this thesis established techniques that can be used to investigate other unexplored microbial groups in the future. For example, a high number of cyanobacterial (over 90) and red algal (over 70) OTUs were retrieved in our biodiversity assessments and may be interesting targets for a more detailed investigation (see for example the work of Yoon et al. 2006 on endolithic red algae). Likewise, fungi and other microbial eukaryotes were observed in our data but not studied in detail. Viruses and Archaea are even less well characterised and likely play important roles in the coral holobiont (Rosenberg et al. 2007). Evidently, a similar approach to study the microbial community present in coral tissues and mucus is equally important to understand the biodiversity of the coral microbiome.

6.2 Distribution

Distribution patterns give insights into the evolutionary processes that generate them and into the functional diversity of a community, providing a link between evolution and biodiversity (Figure 6.1). For example, by observing no influence of pH on the distribution of endolithic microbes we proposed the “tolerant endolith” hypothesis, which postulates that endolithic microorganisms are adapted to a wide pH range due to the daily exposure to pH fluctuations within the skeleton (Chapter 3). In Chapter 4 we observed that endolithic microorganisms are not homogeneously distributed along environmental gradients (e.g. depth and temperature), and we found no evidence for dispersal limitation at broad geographical scales. This implies that environmental filtering plays an important role in shaping endolithic biodiversity. Our results also suggest that the patchy intracolony distribution of bacteria is a consequence of dispersal limitation, raising the question of whether that would be sufficient to cause isolation by distance and bacterial speciation.

Ecophysiological experiments using different endolithic lineages and targeted sampling across relevant environmental gradients will help to better delineate species’ physiological tolerances and ecological niches. One of the main challenges here is to perform these experiments and analyses at the community level to capture the emergent features of the holobiont. Emergent features are the characteristics of a community that cannot be identified when analysing its members in isolation (Konopka 2009; Tan 2016). For example, a microbial species may be less tolerant to heat stress when growing in axenic culture than when it grows in a multispecies consortium. If the community composition (in terms of genes/hologenomes) is the unit undergoing natural selection rather than individual organisms or species (Rosenberg et al. 2007; Doolittle & Zhaxybayeva 2010; Zarraonaindia et al. 2013), then analysing the emergent properties of these communities is a promising avenue to understand hologenome evolution.

6.3 Evolution

How holobionts and their hologenomes evolve is an actively debated question in evolutionary biology (Rosenberg et al. 2007; Arnold 2013; Theis et al. 2016). The limited knowledge about the members of the coral holobiont, especially those living in the skeleton, hinders progress in this field. The evolution of endolithic microbes had not been addressed using molecular tools until very recently (besides this thesis, see also Del Campo et al. 2016; Sauvage et al. 2016). This thesis revealed aspects of the phylogenetic origins of endolithic green algae and the molecular basis underlying their lifestyle.

Phylogenetic origins: When my PhD project started, only two genera of green algae (core Chlorophyta) were known to occur in the skeletons of live corals (Ostreobium and Phaeophila). The first chapter of this thesis shows that the association with an endolithic habitat evolved over 20 times in the phylogenetic history of green algae. Considering that we only sampled tropical corals, this number is likely to increase when other types of marine limestone are assessed. A similarly high diversity of green algae was found in coral rubble and rhodoliths (Sauvage et al. 2016). These results indicate that endolithic filaments compose a large portion of the morphological diversity in green algae, especially in the order Bryopsidales (see also Verbruggen et al. 2009). We obtained many OTUs related to species known to grow out of limestone substrates, like Pseudochlorodesmis spp., suggesting that these algae possibly retain a large portion of their biomass inside hard substrates. Other sequences were related to known seaweeds like Halimeda spp., which may spend part of their life cycle inside limestone. The ability to hide inside hard substrates, even if only partially or for a limited time, may be a selective advantage for avoiding grazing and harsh environmental conditions. It is possible that the association with limestone is ancestral in Bryopsidales and key to the successful diversification of this order. An endolithic lifestyle has also been proposed for ancestral prokaryotes based on the fossil record and on the notion that boring provides protection from extreme environments and UV radiation (see Cockell & Herrera 2008 for a review). The metabarcoding approach will help to further investigate the historical association between limestone substrates and algal diversification.

Genome dynamics: Evolutionary history can leave footprints on a species’ genome. We explored this idea using the chloroplast genome of Ostreobium, and observed that the low light inherent to the endolithic lifestyle could be associated with genome streamlining and reduced rates of molecular evolution (Marcelino et al. 2016 - Chapter 5). The next step is to obtain complete genomes for members of the holobiont and use comparative genomics to test, among other things, whether similar genomic traits can be found in organisms with similar ecological niches, and whether genomes also contain signatures of biotic associations. For example, horizontal gene transfer from bacteria to the endolithic hot-spring red alga Galdieria phlegrea provided this alga with the metabolic capacity to survive in extreme environments (Qiu et al. 2013). The work on Ostreobium’s chloroplast genome dynamics also provided me with valuable training in genome assembly, annotation and comparative methods, which are fundamental skills for continuing research on hologenome evolution.

6.4 The next projects

The functional diversity of microbes associated with corals is probably the most profound knowledge gap currently hindering our understanding of the evolution of holobionts. With our biodiversity assessment, we proposed several possible ecological functions that could be performed by endolithic microbes, including their potential function as a reservoir of the coral microbiome (Chapter 3). These suggestions, however, still need experimental validation. Understanding what the microbes are doing in the holobiont includes understanding their ecological roles, metabolic functioning and interactions with other microorganisms. It has been suggested that interactions among bacterial species can play a more important role in structuring microbial communities than does the environment (Gilbert et al. 2012). Our results from Chapter 4 also suggest that algae-bacteria interactions play a significant role in regulating endolithic community distribution. Below are some perspectives for studying community functioning in the future.

Co-occurrence patterns: Patterns of microbial co-occurrence can guide experimental work and provide insights into the processes driving community structure (Horner-Devine et al. 2007; Gilbert et al. 2012; Williams et al. 2014). Species may co-occur more often than expected by chance because they share a similar ecological niche or have a mutualistic relationship. Species may co-occur less often than expected by chance when they have different physiological tolerances or competitive interactions (Horner-Devine et al. 2007). Co-occurrence networks can be built from the metabarcoding data already available for the endolithic microbiome. My initial explorations with these techniques show that some Ostreobium clades and many bacterial families co-occur more often than expected by chance (Spearman’s ρ ≥ 0.5, P < 0.05, Figure 6.2). Co-occurrence networks can be built at any taxonomic level (from OTUs to phyla) and can help to assess the influence of biotic associations on the community distribution patterns observed in Chapter 4. The information obtained from these networks can be used to guide experimental work to further investigate the metabolic and physiological basis of the interactions between endolithic microbes.
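As an illustration of the thresholding described above (Spearman’s ρ ≥ 0.5, P < 0.05), the following is a minimal sketch of how network edges could be derived from an abundance table. It is not the pipeline used for Figure 6.2: the taxon names and abundances are hypothetical, the P-value is estimated with a simple one-sided permutation test (an assumption, since the text does not specify how P was obtained), and no correction for multiple testing is applied.

```python
import itertools
import random


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def permutation_p(x, y, n_perm=999, seed=0):
    """One-sided P-value: share of shuffles with rho >= the observed rho."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if spearman(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def cooccurrence_edges(abundance, rho_min=0.5, alpha=0.05):
    """Keep taxon pairs with rho >= rho_min and permutation P < alpha."""
    edges = []
    for a, b in itertools.combinations(sorted(abundance), 2):
        rho = spearman(abundance[a], abundance[b])
        if rho >= rho_min and permutation_p(abundance[a], abundance[b]) < alpha:
            edges.append((a, b, round(rho, 2)))
    return edges


# Hypothetical read counts of three taxa across eight skeleton samples.
abundance = {
    "alga_A": [10, 20, 30, 40, 50, 60, 70, 80],
    "bact_X": [1, 3, 2, 5, 6, 8, 7, 9],
    "bact_Y": [9, 1, 7, 3, 8, 2, 6, 4],
}
edges = cooccurrence_edges(abundance)
print(edges)
```

In this toy table only the pair alga_A and bact_X passes both thresholds. With real metabarcoding data the same logic would be applied to the sample-by-taxon table at the chosen taxonomic level, ideally with a correction for the large number of pairwise tests.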

Microbial interactions: Genomics, metabolomics and transcriptomics constitute powerful tools to investigate the molecular and functional basis of microbial associations. By characterising the metabolic pathways in complete genomes it is possible to infer their potential function (e.g. nitrogen fixation). Potential metabolic abilities can be further tested in transcriptomic assays by analysing gene expression in microorganisms exposed to different experimental conditions (e.g. do nitrogen fixation capabilities change in the presence of different microbes?). Metabolomics can help to chart the metabolites involved in the interaction between species (e.g. antibacterial substances and signalling molecules). These are just some examples among several “omics” approaches that can help to understand the functioning of holobionts (see also Voolstra et al. 2015).

Figure 6.2. Network of co-occurring bacterial (blue) and algal (green) genera in the skeletons of Porites sp. collected in Australia, Papua New Guinea and Brazil. The sequences were obtained with the 16S rDNA and the tufA markers. Lines represent strong correlations between taxa (Spearman’s rho > 0.5 and P-values < 0.05). This is a preliminary analysis; the same procedure can be applied at any taxonomic level to characterise co-occurring taxa in the endolithic community.

Evolution of holobionts: Holobionts (and microbial communities in general) may be a unit of selection in evolution, but they are certainly not the only one (Theis et al. 2016). Identifying the cases where co-occurring organisms influence the evolution of each other and cases where they do not can inform us about the prevalence of hologenome evolution in nature. It is difficult to observe evolution in vitro, especially for endolithic organisms like Ostreobium that grow extremely slowly. Fortunately, phylogenetics and modelling are powerful tools to track evolutionary events. Phylogenetic comparative methods can inform us about the relationships among functional traits (including microbial associations) and species’ evolutionary history (Butler & King 2004; Martiny et al. 2015). It would be interesting to test, for example, whether the association between Ostreobium lineages and coral hosts is more conserved than expected by chance, or more conserved than their association with other dead limestone substrates, which might be evidence of hologenome evolution. This approach, however, is based on correlations, and models based on experimentally demonstrated physiological traits are expected to be more powerful. The information derived from experiments, along with the information contained in genomes, transcriptomes and metabolomes, will permit us to model microbial interactions and metabolic networks and to predict how these may change under future environmental conditions (Edwards et al. 2002; Zarraonaindia et al. 2013; Garza & Dutilh 2015).
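The question of whether an association is more conserved than expected by chance can be framed as a simple null-model test. The sketch below is only an assumption about how such a test might look, not a method applied in this thesis: it compares the mean pairwise phylogenetic distance among lineages that carry a trait (here, a hypothetical association with live coral hosts) against a null distribution obtained by shuffling the trait across the tips. The lineage names and distances are invented for illustration; a real analysis would take distances from an inferred phylogeny and would likely rely on dedicated comparative-method software.

```python
import itertools
import random


def mean_pairwise_distance(taxa, dist):
    """Mean phylogenetic distance over all pairs of the given taxa."""
    pairs = list(itertools.combinations(sorted(taxa), 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def tip_shuffle_test(dist, trait, n_perm=999, seed=0):
    """Test whether trait carriers are more closely related than by chance.

    Returns the observed mean pairwise distance among carriers and a
    one-sided P-value from shuffling trait states across the tips.
    """
    rng = random.Random(seed)
    tips = sorted(trait)
    states = [trait[t] for t in tips]
    observed = mean_pairwise_distance([t for t in tips if trait[t]], dist)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(states)
        carriers = [t for t, s in zip(tips, states) if s]
        if mean_pairwise_distance(carriers, dist) <= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical lineages in two clades: short distances within a clade,
# long distances between clades.
clade_a = ["A1", "A2", "A3", "A4", "A5"]
clade_b = ["B1", "B2", "B3", "B4", "B5"]
dist = {}
for x, y in itertools.combinations(clade_a + clade_b, 2):
    same_clade = (x in clade_a) == (y in clade_a)
    dist[frozenset((x, y))] = 0.1 if same_clade else 1.0

# Hypothetical trait: only clade A lineages are found in live coral hosts.
trait = {t: t in clade_a for t in clade_a + clade_b}

observed, p_value = tip_shuffle_test(dist, trait)
print(observed, p_value)
```

A small P-value in this toy example reflects the fact that all trait carriers fall in one clade. Like the comparative approach discussed above, such a test is correlational and says nothing about the mechanism behind the conserved association.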

In summary, this thesis established a baseline of endolithic microbial diversity, distribution and evolution and paved the way to investigate other members of the microbiome in a multidisciplinary way. The next logical step is to fill the knowledge gap about the diversity of functions carried out by microbes in the coral holobiont. Our growing knowledge of microbial communities and the increasing availability of different sorts of data and tools make this an exciting time to study coral holobionts.

6.5 References

Arnold C (2013) The hologenome: A new view of evolution. New Scientist 217, 30-34.

Butler MA, King AA (2004) Phylogenetic comparative analysis: a modeling approach for adaptive evolution. American Naturalist 164, 683-695.

Carilli JE, Godfrey J, Norris RD, Sandin SA, Smith JE (2010) Periodic endolithic algal blooms in Montastraea faveolata corals may represent periods of low-level stress. Bulletin of Marine Science 86, 709-718.

Cockell CS, Herrera A (2008) Why are some microorganisms boring? Trends in Microbiology 16, 101-106.

Del Campo J, Pombert JF, Slapeta J, Larkum A, Keeling PJ (2016) The 'other' coral symbiont: Ostreobium diversity and distribution. ISME Journal.

Doolittle WF, Zhaxybayeva O (2010) Metagenomics and the units of biological organization. Bioscience 60, 102-112.

Edwards JS, Covert M, Palsson B (2002) Metabolic modelling of microbes: the flux-balance approach. Environmental Microbiology 4, 133-140.

Garza DR, Dutilh BE (2015) From cultured to uncultured genome sequences: metagenomics and modeling microbial ecosystems. Cellular and Molecular Life Sciences 72, 4287-4308.

Gilbert JA, Steele JA, Caporaso JG, Steinbruck L, Reeder J, et al. (2012) Defining seasonal marine microbial community dynamics. ISME Journal 6, 298-308.

Gutner-Hoch E, Fine M (2011) Genotypic diversity and distribution of Ostreobium quekettii within scleractinian corals. Coral Reefs 30, 643-650.

Horner-Devine MC, Silver JM, Leibold MA, Bohannan BJ, Colwell RK, et al. (2007) A comparison of taxon co-occurrence patterns for macro- and microorganisms. Ecology 88, 1345-1353.

Konopka A (2009) What is microbial community ecology? ISME Journal 3, 1223-1230.

Marcelino VR, Cremen MC, Jackson CJ, Larkum AA, Verbruggen H (2016) Evolutionary dynamics of chloroplast genomes in low light: a case study of the endolithic green alga Ostreobium quekettii. Genome Biology and Evolution 8, 2939-2951.

Marcelino VR, Verbruggen H (2016) Multi-marker metabarcoding of coral skeletons reveals a rich microbiome and diverse evolutionary origins of endolithic algae. Scientific Reports 6, 31508.

Martiny JB, Jones SE, Lennon JT, Martiny AC (2015) Microbiomes in light of traits: A phylogenetic perspective. Science 350, aac9323.

Qiu H, Price DC, Weber AP, Reeb V, Yang EC, et al. (2013) Adaptation through horizontal gene transfer in the cryptoendolithic red alga Galdieria phlegrea. Current Biology 23, R865-866.

Rosenberg E, Koren O, Reshef L, Efrony R, Zilber-Rosenberg I (2007) The role of microorganisms in coral health, disease and evolution. Nature Reviews: Microbiology 5, 355-362.

Sauvage T, Schmidt WE, Suda S, Fredericq S (2016) A metabarcoding framework for facilitated survey of endolithic phototrophs with tufA. BMC Ecology 16, 8.

Tan (2016) All Together Now: Experimental Multispecies Biofilm Model Systems.

Theis KR, Dheilly NM, Klassen JL, Brucker RM, Baines JF, et al. (2016) Getting the hologenome concept right: an eco-evolutionary framework for hosts and their microbiomes. mSystems 1.

Titlyanov EA, Kiyashko SI, Titlyanova TV, Kalita TL, Raven JA (2008) δ13C and δ15N values in reef corals Porites lutea and P. cylindrica and in their epilithic and endolithic algae. Marine Biology 155, 353-361.

Tribollet A (2008) The boring microflora in modern coral reef ecosystems: a review of its roles. In: Current developments in bioerosion (eds. Wisshak M, Tapanila L), pp. 67-94. Springer, Berlin, Heidelberg.

Verbruggen H, Marcelino VR, Guiry MD, Jackson CJ (Accepted) Phylogenetic position of the coral symbiont Ostreobium (Ulvophyceae) inferred from chloroplast genome data. Journal of Phycology.

Verbruggen H, Vlaeminck C, Sauvage T, Sherwood AR, Leliaert F, De Clerck O (2009) Phylogenetic analysis of Pseudochlorodesmis strains reveals cryptic diversity above the family level in the siphonous green algae (Bryopsidales, Chlorophyta). Journal of Phycology 45, 726-731.

Voolstra C, Miller D, Ragan M, Hoffmann A, Hoegh-Guldberg O, et al. (2015) The ReFuGe 2020 Consortium—using “omics” approaches to explore the adaptability and resilience of coral holobionts to environmental change. Frontiers in Marine Science 2, 68.

Williams RJ, Howe A, Hofmockel KS (2014) Demonstrating microbial co-occurrence pattern analyses within and between ecosystems. Frontiers in Microbiology 5, 358.

Yoon HS, Ciniglia C, Wu M, Comeron JM, Pinto G, Pollio A, Bhattacharya D (2006) Establishment of endolithic populations of extremophilic Cyanidiales (Rhodophyta). BMC Evolutionary Biology 6, 78.

Zarraonaindia I, Smith DP, Gilbert JA (2013) Beyond the genome: community-level analysis of the microbial world. Biology and Philosophy 28, 261-282.

Minerva Access is the Institutional Repository of The University of Melbourne

Author/s: Rossetto Marcelino, Vanessa

Title: Biodiversity, distribution and evolution of endolithic microorganisms in coral skeletons

Date: 2016

Persistent Link: http://hdl.handle.net/11343/129312

File Description: Biodiversity, distribution and evolution of endolithic microorganisms in coral skeletons

Terms and Conditions: Copyright in works deposited in Minerva Access is retained by the copyright owner. The work may not be altered without permission from the copyright owner. Readers may only download, print and save electronic copies of whole works for their own personal non-commercial use. Any use that exceeds these limits requires permission from the copyright owner. Attribution is essential when quoting or paraphrasing from these works.