Predicting the Viability of Archaic Human Hybrids Using a Mitochondrial Proxy
Supplementary Figure Legends

Figure S1. This figure is identical to Figure 1 in the main text but includes the numbers associated with each pairwise comparison that are listed in Table S1. The association between the numbers and the pairs (with percent divergence listed in parentheses) is as follows:

1. Papio_hamadryas-Macaca_mulatta (14.2)
2. Sus_scrofa_domesticus-Babyrousa_celebensis (12.9)
3. Peromyscus_truei_comanche-Peromyscus_nasutus (11.8)
4. Panthera_tigris-Panthera_leo (10.1)
5. Papio_hamadryas-Theropithecus_gelada (9.5)
6. Mus_musculus_musculus-Mus_spretus (8.8)
7. Cavia_fulgida-Cavia_porcellus (8.0)
8. Equus_caballus-Equus_asinus (7.7)
9. Pongo_pygmaeus-Pongo_abelii (7.6)
10. Myodes_rutilus-Myodes_glareolus (7.5)
11. Canis_latrans-Canis_aureus (6.5)
12. Canis_latrans-Canis_lupus (6.4)
13. Papio_cynocephalus-Papio_anubis (6.0)
14. Papio_hamadryas-Papio_anubis (5.3)
15. Peromyscus_polionotus-Peromyscus_maniculatus (4.6)
16. Ursus_arctos-Ursus_maritimus (2.4)
17. Mus_musculus_musculus-Mus_musculus_domesticus (2.3)

Hominini hybrid comparisons:
18. Pan_troglodytes-Homo_sapiens_sapiens_modern (11.1)
19. Pan_paniscus-Homo_sapiens_sapiens_modern (10.8)
20. Homo_sapiens_spp._Denisova-Homo_sapiens_neanderthalensis (2.7)
21. Homo_sapiens_spp._Denisova-Homo_sapiens_sapiens_modern (2.5)
22. Homo_sapiens_spp._Denisova-Homo_sapiens_sapiens_ancient (2.4)
23. Homo_sapiens_neanderthalensis-Homo_sapiens_spp._Sima-de-los-Huesos (2.0)
24. Homo_sapiens_spp._Sima-de-los-Huesos-Homo_sapiens_sapiens_modern (1.9)
25. Homo_sapiens_spp._Sima-de-los-Huesos-Homo_sapiens_sapiens_ancient (1.8)
26. Homo_sapiens_neanderthalensis-Homo_sapiens_sapiens_modern (1.6)
27. Homo_sapiens_neanderthalensis-Homo_sapiens_sapiens_ancient (1.6)
28. Homo_sapiens_spp._Denisova-Homo_sapiens_spp._Sima-de-los-Huesos (1.3)

Felidae hybrid comparisons:
Serval x Cat. Leptailurus_serval-Felis_catus (11.3)
Leopard_Cat x Cat. Prionailurus_bengalensis-Felis_catus (10.9)
Jungle_Cat x Cat. Felis_chaus-Felis_catus (7.5)

Figure S2. A comparison of the relative CYTB divergence values between those hybrid offspring with known degrees of fertility (green and brown circles, see Figure 1) and those pairs that were able to produce live offspring, but for which the fertility of the offspring is unknown (white circles). Divergence values are listed on the y-axis as a percentage. Numbers alongside each circle represent specific species pairs, and their divergence values are listed in parentheses:

1. Castor_canadensis-Castor_fiber (11.7)
2. Ursus_arctos-Ursus_americanus (11.5)
3. Macaca_nemestrina-Macaca_fascicularis (11.2)
4. Macaca_nemestrina-Macaca_mulatta (10.4)
5. Papio_anubis-Theropithecus_gelada (10.2)
6. Lepus_europaeus-Lepus_timidus (9.8)
7. Macaca_thibetana-Macaca_fascicularis (8.9)
8. Mus_musculus_domesticus-Mus_spretus (8.7)
9. Mustela_erminea-Mustela_putorius (8.3)
10. Diceros_bicornis-Ceratotherium_simum_simum (8.2)
11. Macaca_mulatta-Macaca_fascicularis (8.2)
12. Acomys_dimidiatus-Acomys_minous (8.0)
13. Cavia_aperea-Cavia_porcellus (7.9)
14. Loxodonta_africana-Elephas_maximus (7.0)
15. Loxodonta_cyclotis-Loxodonta_africana (4.6)
16. Pan_paniscus-Pan_troglodytes (4.6)
17. Gorilla_beringei_graueri-Gorilla_gorilla_gorilla (4.4)
18. Connochaetes_gnou-Connochaetes_taurinus (2.8)
19. Ceratotherium_simum_cottoni-Ceratotherium_simum_simum (0.9)

The framework and the threshold values established here can be used to predict the relative fertility of hybrid offspring in cases where there is insufficient experimental information.

Figure S3. A comparison of the relative divergence values and their pattern between those calculated using CYTB and those calculated using full mitogenomes and four nuclear genes: ZFY, ZFX, GHR, and CHRNA1. Divergence values are listed on the y-axis for each locus as a percentage. Numbers alongside each circle represent a species pair:

1. Papio_hamadryas-Macaca_mulatta
2. Macaca_nemestrina-Macaca_fascicularis
3. Pan_troglodytes-Homo_sapiens_sapiens
4. Pan_paniscus-Homo_sapiens_sapiens
5. Macaca_nemestrina-Macaca_mulatta
6. Papio_anubis-Theropithecus_gelada
7. Papio_hamadryas-Theropithecus_gelada
8. Macaca_mulatta-Macaca_fascicularis
9. Papio_anubis-Papio_hamadryas
10. Pan_paniscus-Pan_troglodytes

In each case, the distance values of the nuclear genes are smaller than those obtained using CYTB as a result of the slower pace of nuclear evolution. Despite this, a clear threshold between the two categories of fertility amongst the hybrid offspring remains.

Figure S4. Images of H&E-stained testes of an adult male liger (Panthera leo x Panthera tigris) in panels a and b, and of an adult male tiliger (male tiger x female liger) in panels c and d. The testes show clear seminiferous tubule degeneration, with tubules lined only with Sertoli cells in the liger, and tubule degeneration with germ cell arrest in the tiliger.

Source code for a custom Python version 2.7 terminal program to calculate pairwise Hamming distances between the sequences contained within a FASTA file.

#!/usr/bin/python

#import all python 2.7 native modules
import operator, StringIO, itertools, sys, math, os

#import all non-native modules
import distance, Bio

#import various classes from modules
from Bio import SeqIO
from Bio import AlignIO
from Bio.Align import AlignInfo
from itertools import izip, imap
from os import path

#disable screen blanking and the terminal cursor
os.system('setterm -cursor off')

#calculate hamming distance between two strings of equal length
def hamming(str1, str2):
    assert len(str1) == len(str2)
    ne = operator.ne
    return sum(imap(ne, str1, str2))

#count gaps in the consensus sequence
def count_gaps(consensus):
    b = 0
    for a in range(0,len(consensus)):
        if consensus[a] == '-':
            b +=1
    return b

#build a consensus sequence from two sequences
#('-' where either sequence has a gap, 'N' where the bases differ)
def consensus_seq(str1,str2):
    consensus_string = []
    for a, b in zip(range(0,len(str1)),range(0,len(str2))):
        if str1[a] != str2[b]:
            if str1[a]== '-' or str2[b] == '-':
                consensus_string.append('-')
            else:
                consensus_string.append('N')
        else:
            consensus_string.append(str1[a])
    return ''.join(consensus_string)

#compare the gap locations between pairwise sequences
#returns [gaps shared by both, gaps only in str1, gaps only in str2]
def compare_gaps(str1,str2):
    gaps1 = 0
    gaps2 = 0
    gaps3 = 0
    for bp1, bp2 in zip(str1,str2):
        if bp1 == "-" and bp1 == bp2:
            gaps1 +=1
        if bp1 == "-" and bp1 != bp2:
            gaps2 +=1
        if bp2 == "-" and bp2 != bp1:
            gaps3 +=1
    return [gaps1,gaps2,gaps3]

#clear out the command line so the progress counter remains in the same location on the screen
def restart_line():
    sys.stdout.write('\r')
    sys.stdout.flush()

#grab user defined input file from terminal
fasta_file = sys.argv[1]

#create a file to output raw distances to
raw_distance_file = open(sys.argv[2],'w')

#parse the fasta file using biopython to create iterable fasta object
sequences = SeqIO.parse(open(fasta_file),"fasta")

#define a list to append sequence data to
sequence_list = []

#iterate over fasta sequence objects and add their id and sequence to a line separated by a comma
#(the Seq object is converted to a plain string before concatenation)
for record in sequences:
    sequence_list.append(record.id + ',' + str(record.seq))

#create an iterator over all the possible pairwise combinations of all the sequences in the sequence list
pairwise_sequences = itertools.combinations(sequence_list,2)

#print a spacing line in the terminal output
print ""

#count the number of pairwise comparisons and put into a variable
for i, item in enumerate(itertools.combinations(sequence_list,2)):
    no_pairwise_seqs = i

#for each pairwise comparison between sequences
for i, item in enumerate(pairwise_sequences):

    #create an output file like class that can be written to and read from
    output = StringIO.StringIO()

    #format the two strings for comparison into fasta format in two string variables
    str1 = ">"+item[0].split(',')[0]+"\n"+item[0].split(',')[1]+"\n"
    str2 = ">"+item[1].split(',')[0]+"\n"+item[1].split(',')[1]+"\n"

    #create a temporary fasta file on the hard drive
    temp_fasta = open("/home/richard/Documents/temp_fasta.fasta","w")

    #write the fasta formatted sequences to the output class
    output.write(str1+'\n')
    output.write(str2+'\n')

    #store the value of the output class to a variable
    contents = output.getvalue()

    #close the output class removing it from memory
    output.close()

    #write the contents of the variable to the temporary fasta file
    temp_fasta.write(contents)

    #close the fasta file removing it from memory and saving the changes
    temp_fasta.close()

    #grab temp file name and put into a variable
    temp_fasta = "/home/richard/Documents/temp_fasta.fasta"

    #create an alignment using the sequences in the temp file
    alignment = AlignIO.read(open(temp_fasta),"fasta")

    #summary_align = AlignInfo.SummaryInfo(alignment)

    #grab the consensus sequence from the pairwise alignment
    consensus
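
The listing above is cut off before the distance calculation itself. As an illustration only, and not the authors' program, the following minimal Python 3 sketch shows one way a pairwise Hamming distance and a percent divergence could be computed from pre-aligned sequences of equal length. The function names (parse_fasta, percent_divergence, pairwise_divergence), the decision to skip columns in which either sequence has a gap, and the use of the number of compared columns as the denominator are all assumptions made for this sketch; they are not taken from the original code.

```python
from itertools import combinations

GAP = "-"  # gap character, as used in the listing above


def parse_fasta(text):
    """Parse FASTA-formatted text into {id: sequence} (hypothetical helper)."""
    records = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def hamming(seq1, seq2):
    """Raw Hamming distance: number of positions at which the strings differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq1, seq2))


def percent_divergence(seq1, seq2):
    """Mismatches as a percentage of the columns in which neither sequence
    has a gap. This gap handling is an assumption of the sketch."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq1, seq2) if a != GAP and b != GAP]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    mismatches = sum(a != b for a, b in columns)
    return 100.0 * mismatches / len(columns)


def pairwise_divergence(records):
    """Percent divergence for every unordered pair of sequence ids."""
    return {
        (a, b): percent_divergence(records[a], records[b])
        for a, b in combinations(sorted(records), 2)
    }


if __name__ == "__main__":
    # Toy alignment, invented purely for demonstration.
    example = ">a\nACGT-A\n>b\nACGA-A\n>c\nTCGATA\n"
    for pair, value in pairwise_divergence(parse_fasta(example)).items():
        print(pair[0], pair[1], round(value, 1))
```

Skipping gapped columns keeps insertions and deletions from being counted as substitutions. Whether the original program treats gaps this way cannot be determined from the truncated listing, so the percentages from this sketch should not be expected to reproduce the values in the figure legends.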