Supplementary Files
The ancestral animal genetic toolkit revealed by diverse choanoflagellate transcriptomes
Authors: Daniel J. Richter1,2, Parinaz Fozouni1,3,4, Michael B. Eisen1, Nicole King1
Author Affiliations:
1 Department of Molecular and Cell Biology and Howard Hughes Medical Institute, University of California, Berkeley, CA 94720-3200, USA
2 Sorbonne Universités, UPMC Univ Paris 06, CNRS UMR 7144, Adaptation et Diversité en Milieu Marin, Équipe EPEP, Station Biologique de Roscoff, 29680 Roscoff, France
3 Medical Scientist Training Program, Biomedical Sciences Graduate Program, University of California, San Francisco, CA 94143, USA
4 Gladstone Institutes, San Francisco, CA 94158, USA
Keywords: Urmetazoan, ancestral gene content reconstruction, evolution, choanoflagellates, transcriptome, Urchoanozoan, innate immunity, extracellular matrix, RNAi
Data Availability:
Raw sequencing reads: NCBI BioProject PRJNA419411 (19 choanoflagellate transcriptomes), PRJNA420352 (S. rosetta polyA selection test)
Transcriptome assemblies, annotations, and gene families: https://dx.doi.org/10.6084/m9.figshare.5686984
Protocols: https://dx.doi.org/10.17504/protocols.io.kwscxee
[Image panels a to s; panels p and q are each shown as two sub-panels (p', p'' and q', q'').]
Supplementary Figure 1. Phase contrast images of species sequenced in this study. All scale bars are 5 µm. (a) Acanthoeca spectabilis, a nudiform loricate. (b) Choanoeca perplexa, a thecate craspedid. (c) Codosiga hollandica, a naked stalked craspedid. (d) Diaphanoeca grandis, a tectiform loricate. (e) Didymoeca costata, a tectiform loricate. (f) Hartaetosiga balthica, a naked stalked craspedid. (g) Hartaetosiga gracilis, a naked stalked craspedid; the image shows a colony of cells attached to the same stalk. (h) Helgoeca nana, a nudiform loricate. (i) Microstomoeca roanoka, a thecate craspedid. (j) Mylnosiga fluctuans, a naked craspedid. (k) Salpingoeca dolichothecata, a thecate craspedid. (l) Salpingoeca helianthica, a thecate craspedid. (m) Salpingoeca infusionum, a thecate craspedid. (n) Salpingoeca kvevrii, a thecate craspedid. (o) Salpingoeca macrocollata, a thecate craspedid. (p) Salpingoeca punica, a thecate craspedid; p' represents a live cell within its theca, and p'' an empty theca. (q) Salpingoeca urceolata, a thecate craspedid; q' represents a live cell within its theca, and q'' a group of empty thecae joined at the base of their stalks. (r) Savillea parva, a nudiform loricate. (s) Stephanoeca diplocostata, a tectiform loricate.
[Plot panel a. x-axis: Distance from 3' End of CDS (0 to 18,000); y-axis: Proportion of CDS (0 to 1); series: 2 Rounds, 4 Rounds, With Length. Plot panel b. Categories: S. rosetta ribosomal, A. machipongoensis genome.]
Supplementary Figure 2. Tests of two versus four rounds of polyA+ selection. (a) Two additional rounds of polyA+ selection do not cause a significant loss of read coverage at the 5’ ends of transcripts. The proportion of transcripts with at least one sequence read mapping at each given distance from their 3’ ends is shown for two rounds (orange) and four rounds (yellow) of polyA+ selection. Also shown is the proportion of total transcripts in the S. rosetta genome at each length (green). The two-round and four-round coverage curves diverge beginning at approximately 10,000 bases from the 3’ end of transcripts, but this region represents only a very small fraction of the total transcripts encoded in the S. rosetta genome. (b) Comparison of two rounds versus four rounds of polyA+ selection during library construction. The number of reads from a culture of S. rosetta mapping either to the S. rosetta ribosomal locus or to the genome of its prey bacterium, A. machipongoensis, is shown. Four rounds of polyA+ selection remove roughly an order of magnitude more non-polyadenylated RNA than two rounds.
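The quantity plotted in panel (a) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `proportion_covered`, the input layout (read intervals as 0-based, half-open coordinates in 5'-to-3' transcript orientation), and the choice of all transcripts as the denominator are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple


def proportion_covered(
    lengths: Dict[str, int],
    reads: Dict[str, List[Tuple[int, int]]],
    distances: Sequence[int],
) -> Tuple[List[float], List[float]]:
    """Return two lists, one value per entry in `distances`.

    covered[i]:     fraction of all transcripts with at least one read
                    overlapping the base located distances[i] bases from
                    the 3' end.
    with_length[i]: fraction of all transcripts long enough to have a base
                    at that distance (the "With Length" curve).

    Assumed conventions (hypothetical, not taken from the source):
    read intervals are (start, end), 0-based and half-open, in 5'->3'
    transcript orientation; distance 0 is the last base of the transcript.
    """
    n = len(lengths)
    if n == 0:
        raise ValueError("at least one transcript is required")

    covered: List[float] = []
    with_length: List[float] = []
    for d in distances:
        n_cov = 0
        n_len = 0
        for tid, length in lengths.items():
            if length <= d:
                continue  # transcript too short to have a base at distance d
            n_len += 1
            pos = length - 1 - d  # convert 3' distance to a 5'-based index
            if any(start <= pos < end for start, end in reads.get(tid, [])):
                n_cov += 1
        covered.append(n_cov / n)
        with_length.append(n_len / n)
    return covered, with_length
```

Calling the function once with the read intervals from the two-round library and once with those from the four-round library would give the two coverage curves; the `with_length` list is the same in both calls because it depends only on transcript lengths.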
1.5