Transcription Factor Evolution in Eukaryotes and the Assembly of the Regulatory Toolkit in Multicellular Lineages
Total Page:16
File Type:pdf, Size:1020Kb
Transcription factor evolution in eukaryotes and the assembly of the regulatory toolkit in multicellular lineages Alex de Mendozaa,b,1, Arnau Sebé-Pedrósa,b,1, Martin Sebastijan Sestakˇ c, Marija Matejciˇ cc, Guifré Torruellaa,b, Tomislav Domazet-Losoˇ c,d, and Iñaki Ruiz-Trilloa,b,e,2 aInstitut de Biologia Evolutiva (Consejo Superior de Investigaciones Científicas–Universitat Pompeu Fabra), 08003 Barcelona, Spain; bDepartament de Genètica, Universitat de Barcelona, 08028 Barcelona, Spain; cLaboratory of Evolutionary Genetics, Ruder Boskovic Institute, HR-10000 Zagreb, Croatia; dCatholic University of Croatia, HR-10000 Zagreb, Croatia; and eInstitució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain Edited by Walter J. Gehring, University of Basel, Basel, Switzerland, and approved October 31, 2013 (received for review June 25, 2013) Transcription factors (TFs) are the main players in transcriptional of life (6, 15–22). However, it is not yet clear whether the evo- regulation in eukaryotes. However, it remains unclear what role lutionary scenarios previously proposed are robust to the in- TFs played in the origin of all of the different eukaryotic multicellular corporation of genome data from key phylogenetic taxa that lineages. In this paper, we explore how the origin of TF repertoires were previously unavailable. shaped eukaryotic evolution and, in particular, their role into the In this paper, we present an updated analysis of TF diversity emergence of multicellular lineages. We traced the origin and ex- and evolution in different eukaryote supergroups, focusing on pansion of all known TFs through the eukaryotic tree of life, using the various unicellular-to-multicellular transitions. We report genome broadest possible taxon sampling and an updated phylogenetic background. Our results show that the most complex multicellular and/or transcriptome data from several unicellular relatives of fi fi lineages (i.e., those with embryonic development, Metazoa and Metazoa and Fungi (including one lasterean, ve ichthyosporeans, Embryophyta) have the most complex TF repertoires, and that these a nucleariid, and the incertae sedis Corallochytrium limacisporum), repertoires were assembled in a stepwise manner. We also show that and use published data from key, but previously unsampled, eu- asignificant part of the metazoan and embryophyte TF toolkits karyotic lineages, such as Glaucophyta, Haptophyta, Rhizaria, and evolved earlier, in their respective unicellular ancestors. To gain Cryptophyta. We show that an important fraction of the metazoan insights into the role of TFs in the development of both embryo- TF toolkit is not novel but rather appeared in the root of Opis- phytes and metazoans, we analyzed TF expression patterns through- thokonta and/or Holozoa. Similarly, we show that many plant TFs out their ontogeny. The expression patterns observed in both groups are present in unicellular chlorophytes, and many fungal TFs are recapitulate those of the whole transcriptome, but reveal some present in microsporidians and nucleariids. Finally, we analyze the important differences. Our comparative genomics and expression data reshape our view on how TFs contributed to eukaryotic evo- TFome (i.e., the general TF repertoire) expression throughout lution and reveal the importance of TFs to the origins of multicellu- embryonic development in embryophytes and metazoans and show larity and embryonic development. that each phylostratigraphic layer of TFs differentially contributes to successive stages of embryonic development, linking the evolu- phylotypic stage | Holozoa | LECA tionary history of TFs to organismal ontogeny. ranscription factors (TFs) are proteins that bind to DNA in Significance Ta sequence-specific manner (1) and enhance or repress gene – expression (2 4). In response to a broad range of stimuli, TFs Independent transitions to multicellularity in eukaryotes in- coordinate many important biological processes, from cell cycle volved the evolution of complex transcriptional regulation progression and physiological responses, to cell differentiation toolkits to control cell differentiation. By using comparative and development (5, 6). Thus, TFs have a central role in the genomics, we show that plants and animals required richer transcriptional regulation of all cellular organisms, being present transcriptional machineries compared with other eukaryotic in all branches of the tree of life (bacteria, archaea, and eukar- multicellular lineages. We suggest this is due to their orches- yotes). There appears to be a correlation between elaborate trated embryonic development. Moreover, our analysis of regulation of gene expression and the complexity of organisms transcription factor (TF) expression patterns during the de- ’ (7), such that the amount (as a proportion of an organism s total velopment of animals reveal links between TF evolution, spe- gene content) and diversity of TF proteins is expected to be di- cies ontogeny, and the phylotypic stage. rectly correlated with this complexity (8). Indeed, TFs play a crucial role in multicellular eukaryotes. For example, TFs are Author contributions: A.d.M., A.S.-P., T.D.-L., and I.R.-T. designed research; A.d.M., A.S.-P., the master regulators of embryonic development in embryophytes M.S.S., and M.M. performed research; G.T. contributed new reagents/analytic tools; A.d.M., and metazoans (9), and analyses of their embryonic transcrip- A.S.-P., M.S.S., M.M., T.D.-L., and I.R.-T. analyzed data; and A.d.M., A.S.-P., T.D.-L., and I.R.-T. tional profiles support the presence of a phylotypic stage in both wrote the paper. fl lineages (10–14). These studies have also shown that evolution- The authors declare no con ict of interest. arily younger genes tend to be expressed at earlier and later This article is a PNAS Direct Submission. stages of development, whereas the transcriptomes of the middle Data deposition: The sequences reported in this paper have been deposited with the National Center for Biotechnology Information [NCBI BioProject number PRJNA189477 stages (the phylotypic stage) are dominated by ancient genes (10, (Amoebididum parasiticum); and NCBI SRA projects SRX096927 and SRX096925 (Ministeria 13). It remains to be investigated how the evolutionary age and vibrans); SRX377507 (Pirum gemmata); SRX377508 (Abeoforma whisleri); and SRX377514 the expression patterns of the different TFs shift throughout the (Creolimax fragrantissima)]. ontogeny of these lineages and whether TF expression profiles 1A.d.M. and A.S.-P. contributed equally to this work. correlate with the general transcriptome profiles. 2To whom correspondence should be addressed. E-mail: [email protected]. Previous studies have analyzed the evolutionary history and This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10. phylogenetic distribution of TFs in various branches of the tree 1073/pnas.1311818110/-/DCSupplemental. E4858–E4866 | PNAS | Published online November 25, 2013 www.pnas.org/cgi/doi/10.1073/pnas.1311818110 Downloaded by guest on September 23, 2021 Downloaded by guest on September 23, 2021 eages are indicated ( survey the presence, abundance,of an TF (8, 16).(DBDs) Therefore, are univocal we signatures analyzed of DBDs the across presence of eukaryotes a to particular type Lineage-Speci Results de Mendoza et al. (rows) are clustered accordingFig. to 1. abundance and distribution, and species (columns) are grouped according to phylogenetic af Presence and abundance of transcription factors (TFs) in eukaryotes. The heat map depicts absolute TF counts according to the color scale. TFs/DBDs 100 200 300 0 0 5 10 15 20 histogram Color key& Archaeplastida fi c and Paneukaryotic TFs. Unikonta/Amorphea Paneukaryotic cluster Top cluster Embryophyta Metazoa cluster ). Raw data are in Fungi cluster Holozoa cluster cluster cluster Homo sapiens Danio rerio Ciona intestinalis Saccoglossus kowalevskii Drosophila melanogaster d relative contribution of each Metazoa Caenorhabditis elegans Holozoa Capitella teleta Dataset S1 Lottia gigantea Nematostella vectensis Acropora digitifera Hydr a magnipapillata DNA-binding domains Tr ichoplax adhaerens Mnemiopsis leidyi Oscarella carmela Amphimedon queenslandica Opisthokonta . Further taxonomic information is in Monosiga brevicollis Salpingoeca rosetta Capsaspora owczarzaki Ministeria vibrans Sphaeroforma arctica Creolimax fragrantissima Abeofor ma whisleri Pir um gemmata Amoebidium parasiticum Corallochytrium limacisporum Saccharomyces cerevisiae Neurospora crassa Tuber melanosporum Schizosaccharomyces pombe Cryptococcus neoformans Coprinus cinereus Holomycota Ustilago maydis Fungi Mortierella verticillata Phycomyces blakesleeanus Mucor circinelloides Rhizopus oryzae Allomyces macrogynus largest possible diversity, especially aroundods multicellular transitions. to underestimate total counts,1). but It should minimizes be false noted positives thatTF we ( class used a (71 conservative in method that total) tends throughout various eukaryotic lineages (Fig. Batrachochytr ium dendrobatidis Spizellomyces punctatus ). We used a wide taxon sampling strategy optimized to have the Encephalitozoon cuniculi Apusozoa Nematocida parisii Nuclearia sp Thecamonas trahens Dicty ostelium discoideum Polysphondylium pallidum Entamoeba histolytica Amoebozoa Dataset S2 Acanthamoeba castellanii Embryophyta Arabidopsis thaliana Archaeplastida Oryza sativa Selaginella moellendorffii Physcomitrella patens Chlamydomonas reinhardtii Volvox carteri Chlorella