1 Supplementary Material a Major Clade of Prokaryotes with Ancient
Supplementary Material A major clade of prokaryotes with ancient adaptations to life on land Fabia U. Battistuzzi and S. Blair Hedges Data assembly and phylogenetic analyses Protein data set: Amino acid sequences of 25 protein-coding genes (“proteins”) were concatenated in an alignment of 18,586 amino acid sites and 283 species. These proteins included: 15 ribosomal proteins (RPL1, 2, 3, 5, 6, 11, 13, 16; RPS2, 3, 4, 5, 7, 9, 11), four genes (RNA polymerase alpha, beta, and gamma subunits, Transcription antitermination factor NusG) from the functional category of Transcription, three proteins (Elongation factor G, Elongation factor Tu, Translation initiation factor IF2) of the Translation, Ribosomal Structure and Biogenesis functional category, one protein (DNA polymerase III, beta subunit) of the DNA Replication, Recombination and repair category, one protein (Preprotein translocase SecY) of the Cell Motility and Secretion category, and one protein (O-sialoglycoprotein endopeptidase) of the Posttranslational Modification, Protein Turnover, Chaperones category, as annotated in the Cluster of Orthologous Groups (COG) (Tatusov et al. 2001). After removal of multiple strains of the same species, GBlocks 0.91b (Castresana 2000) was applied to each protein in the concatenation to delete poorly aligned sites (i.e., sites with gaps in more than 50% of the species and conserved in less than 50% of the species) with the following parameters: minimum number of sequences for a conserved position: 110, minimum number of sequences for a flank position: 110, maximum number of contiguous non-conserved positions: 32000, allowed gap positions: with half. The signal-to-noise ratio was determined by altering the “minimum length of a block” parameter.
[Show full text]