Supplementary Materials for

Contrasted sex chromosome evolution in primates with and without sexual dimorphism

Rylan Shearn, Emilie Lecompte, Corinne Régis, Sylvain Mousset, Simon Penel, Guillaume Douay, Brigitte Crouau-Roy, Gabriel A.B. Marais

Correspondence to: [email protected]

This PDF file includes:

Supplementary Text S1 to S2
Figs. S1 to S2
Table S1


Supplementary Text

Text S1: Regions of the strepsirrhine X chromosomes with unusual male:female coverage ratio

In Fig. 1, both lemur X chromosomes exhibit regions with a male:female coverage ratio close to 1 (shown in grey) in their X-specific parts, where a ratio of 0.5 is expected. The gray mouse lemur has five such regions, the northern greater galago three. The dot plots of the lemur and human X chromosomes (see Fig. 1 and S1) clearly show that few or no homologous genes are found in those regions, which suggests that they may be homologous to other human chromosomes. This would be consistent with the male:female coverage ratio of 1, typical of autosomal regions, that we found for these regions. To explore this possibility, we extracted the sequences of those regions and ran a tblastn search of all human proteins (human genome version GRCh38) against them. In case of isoforms, the longest protein was kept so that each human gene was present only once. We then filtered the tblastn results by keeping only hits with >80% similarity (based on average nucleotide divergence between lemurs and humans) and e-value < $10^{-9}$. From those, we kept human proteins covered to >80% by hits, using SiLiX (Miele, Penel, & Duret, 2011). Only proteins matching no more than one region were kept. The results of the tblastn are shown in the table below.

Microcebus murinus X chromosome regions*: 30.3-33.2, 41.6-44.1, 46.8-48, 61.5-63.7, 92.7-93.7
Otolemur garnetti X chromosome regions*: 49.5-68.5, 80-84.5, 116-133

Human chromosomes
Chrom. 1    4  54  13  2  1
Chrom. 2    2  4  1
Chrom. 3    1
Chrom. 4    2
Chrom. 5    2  2  1
Chrom. 6    10
Chrom. 7    2
Chrom. 8    1  1  4  1  1  1
Chrom. 9    1
Chrom. 10
Chrom. 11
Chrom. 12   8  1  2  3  15
Chrom. 13   1  1  44
Chrom. 14   4  1  1
Chrom. 15   2  2
Chrom. 16   1  1  1
Chrom. 17   1  3  1
Chrom. 18   2
Chrom. 19   1  3
Chrom. 20   119
Chrom. 21
Chrom. 22   1  1
Chrom. X    2  1  2  3

*coordinates in Mb
Human chromosome with the largest number of homologs is shown in bold


For all regions except one, most homologs that we identified are from human autosomes, which confirms our hypothesis. These homologs are mainly from a single source: chromosomes 1, 8 and 12 for regions 46.8-48, 61.5-63.7, 92.7-93.7 and 41.6-44.1 of the gray mouse lemur X chromosome, and chromosomes 12, 13 and 20 for regions 80-84.5, 116-133 and 49.5-68.5 of the northern greater galago X chromosome. These results can be interpreted in two ways. One possibility is that the assemblies of those lemur X chromosomes wrongly include autosomal scaffolds. Another is that, during the evolution of strepsirrhines, some autosomal fragments were translocated to the PAR and the assembly failed to order these fragments correctly. Our approach cannot distinguish between these possibilities, but in either case our results suggest that these regions are probably assembly errors. Changing the filtering of the tblastn output did not qualitatively change the results. With lower %identity thresholds, we detected autosomal homologs for region 30.3-33.2 (for example, with %identity > 65, we found 2 proteins from chrom. 1, 1 from chrom. 2 and 1 from chrom. 19).
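The hit-filtering step described in this section can be sketched as follows. This is a minimal illustration and not the pipeline used for the analysis: it assumes the tblastn results were saved in BLAST tabular format (`-outfmt 6`, default column order), it treats the percent-identity column as the similarity measure, and the sequence identifiers in the toy input are made up. The coverage step performed with SiLiX is not reproduced.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(handle, min_identity=80.0, max_evalue=1e-9):
    """Yield hits with identity > min_identity and e-value < max_evalue."""
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        if float(hit["pident"]) > min_identity and float(hit["evalue"]) < max_evalue:
            yield hit


# Toy input (hypothetical identifiers): query = human protein, subject = X region.
example = io.StringIO(
    "protA\tregion_46.8-48\t91.2\t300\t26\t0\t1\t300\t1000\t1899\t1e-50\t250\n"
    "protB\tregion_46.8-48\t72.0\t150\t42\t0\t1\t150\t5000\t5449\t1e-20\t90\n"
    "protC\tregion_30.3-33.2\t85.0\t40\t6\t0\t1\t40\t700\t819\t1e-03\t35\n"
)
kept = [hit["qseqid"] for hit in filter_hits(example)]
print(kept)
```

Lowering `min_identity` (for example to 65) corresponds to the relaxed threshold mentioned at the end of this section.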

References

Miele, V., Penel, S., & Duret, L. (2011). Ultra-fast sequence clustering from similarity networks with SiLiX. BMC Bioinformatics, 12, 116. doi: 10.1186/1471-2105-12-116


Text S2: Statistical tests on strata formation

Assuming a constant rate $\lambda$ for the formation of new evolutionary strata, the number $S$ of new strata formed during a time interval $\Delta t$ is Poisson-distributed with parameter $\lambda \Delta t$:

\[
P(S = k) = \frac{(\lambda \Delta t)^{k} \, e^{-\lambda \Delta t}}{k!}. \tag{1}
\]
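Equation (1) can be evaluated directly. The sketch below is illustrative only: the function name and the example rate and duration are arbitrary choices, not values from the analysis.

```python
import math


def poisson_pmf(k, rate, duration):
    """P(S = k) for a Poisson process with the given rate over the given duration."""
    mean = rate * duration
    return mean ** k * math.exp(-mean) / math.factorial(k)


# Arbitrary illustration: 0.01 strata per My over 100 My (one stratum expected).
probabilities = [poisson_pmf(k, 0.01, 100.0) for k in range(4)]
print(probabilities)
```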

We partition the time interval $\Delta t$ into two parts, $\Delta t = \Delta t_1 + \Delta t_2$. During the time interval $\Delta t_i$ we observe the formation of $S_i$ new strata. We want to contrast the following two hypotheses:

• $H_0$: Strata accumulated at a common rate $\lambda_0$.

• $H_1$: Strata accumulated at different rates $\lambda_i$ during the time intervals $\Delta t_i$.

Likelihood ratio test

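Differentiating equation (1) with respect to $\lambda$ and setting the derivative to zero yields the maximum likelihood estimator for $\lambda$:

\[
\hat{\lambda} = \frac{S}{\Delta t}
\quad\Rightarrow\quad
\hat{\lambda}_0 = \frac{S_1 + S_2}{\Delta t_1 + \Delta t_2}
\quad\text{and}\quad
\hat{\lambda}_i = \frac{S_i}{\Delta t_i}. \tag{2}
\]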
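The differentiation step can be written out explicitly. The short derivation below only restates the standard Poisson maximum likelihood argument; $\ell$ is notation introduced here for the log-likelihood of observing $k$ strata during $\Delta t$.

```latex
\begin{align*}
\ell(\lambda) &= \log P(S = k) = k \log(\lambda \Delta t) - \lambda \Delta t - \log k!, \\
\frac{\partial \ell}{\partial \lambda} &= \frac{k}{\lambda} - \Delta t = 0
\quad\Longrightarrow\quad \hat{\lambda} = \frac{k}{\Delta t}.
\end{align*}
```

Because the second derivative, $-k/\lambda^2$, is negative for $k > 0$, this stationary point is a maximum.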

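A likelihood-ratio test can be used to compare the likelihoods $L_0$ and $L_1$ of the models underlying the null and alternative hypotheses $H_0$ and $H_1$:

\begin{align*}
L_0 &= P\left(S = k_1 + k_2 \mid \lambda = \hat{\lambda}_0, \Delta t = \Delta t_1 + \Delta t_2\right), \\
L_1 &= P\left(S_1 = k_1 \mid \lambda_1 = \hat{\lambda}_1, \Delta t = \Delta t_1\right) \times P\left(S_2 = k_2 \mid \lambda_2 = \hat{\lambda}_2, \Delta t = \Delta t_2\right).
\end{align*}

Under the null hypothesis $H_0$, the likelihood-ratio statistic $X = -2 \log \frac{L_0}{L_1}$ is approximately $\chi^2$-distributed with 1 degree of freedom.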
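A likelihood-ratio test of this kind can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: it takes the null likelihood to be the product of the two interval probabilities under the common rate (the usual nested form, which may not correspond exactly to the expression for $L_0$ given in this section), and the counts and durations in the example are arbitrary. The p-value uses the fact that a $\chi^2$ variable with 1 degree of freedom is the square of a standard normal variable.

```python
import math


def poisson_logpmf(k, mean):
    """Log of the Poisson probability of k events when `mean` events are expected."""
    if mean == 0:
        return 0.0 if k == 0 else float("-inf")
    return k * math.log(mean) - mean - math.lgamma(k + 1)


def likelihood_ratio_test(k1, dt1, k2, dt2):
    """Return (X, p) for H0: one common rate versus H1: one rate per interval."""
    rate0 = (k1 + k2) / (dt1 + dt2)      # maximum likelihood estimate under H0
    rate1, rate2 = k1 / dt1, k2 / dt2    # maximum likelihood estimates under H1
    # Assumption: under H0 both intervals share rate0 and are independent.
    log_l0 = poisson_logpmf(k1, rate0 * dt1) + poisson_logpmf(k2, rate0 * dt2)
    log_l1 = poisson_logpmf(k1, rate1 * dt1) + poisson_logpmf(k2, rate2 * dt2)
    x = max(-2.0 * (log_l0 - log_l1), 0.0)  # guard against tiny negative rounding
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(x / 2.0))
    return x, p


# Arbitrary illustration: 8 events versus 2 events over two equal durations.
x_stat, p_value = likelihood_ratio_test(8, 100.0, 2, 100.0)
print(x_stat, p_value)
```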

Exact binomial test

An alternative method consists in using the number $S_1$ of strata formed in the time interval $\Delta t_1$ as the test statistic and computing the conditional probability of observing a value at least as large, given the total number of strata formed in the time interval $\Delta t_1 + \Delta t_2$:

\begin{align*}
P(S_1 \geq k_1 \mid k_1 + k_2, \Delta t_1, \Delta t_2)
&= \sum_{j=k_1}^{k_1+k_2}
\frac{(k_1 + k_2)! \times \lambda^{k_1+k_2} \times \Delta t_1^{\,j} \times \Delta t_2^{\,k_1+k_2-j} \times e^{-\lambda(\Delta t_1 + \Delta t_2)}}
{\left(\lambda(\Delta t_1 + \Delta t_2)\right)^{k_1+k_2} \times e^{-\lambda(\Delta t_1 + \Delta t_2)} \times j!\,(k_1 + k_2 - j)!} \\
&= \sum_{j=k_1}^{k_1+k_2} \binom{k_1 + k_2}{j}
\left(\frac{\Delta t_1}{\Delta t_1 + \Delta t_2}\right)^{j}
\left(\frac{\Delta t_2}{\Delta t_1 + \Delta t_2}\right)^{k_1+k_2-j}, \tag{3}
\end{align*}

where we recognize the binomial distribution. Note that this is now independent of the rate $\lambda$ of the Poisson process. Applying this test is conceptually equivalent to tossing an unbalanced coin $k_1 + k_2$ times, with a probability $p = \frac{\Delta t_1}{\Delta t_1 + \Delta t_2}$ of getting a head, and computing the probability of obtaining a head at least $k_1$ times.
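The binomial tail probability in equation (3) is straightforward to compute. The following sketch uses a hypothetical function name and plugs in the counts and durations given in the Results of this text; it is meant as a worked check rather than the code used for the analysis.

```python
import math


def binomial_tail(k1, k2, dt1, dt2):
    """P(S1 >= k1 | S1 + S2 = k1 + k2), following equation (3)."""
    n = k1 + k2
    p = dt1 / (dt1 + dt2)  # probability that a given stratum falls in interval 1
    return sum(math.comb(n, j) * p ** j * (1 - p) ** (n - j)
               for j in range(k1, n + 1))


# Values from the Results: 3 strata in 188.52 My versus 0 strata in 321.32 My.
p_value = binomial_tail(3, 0, 188.52, 321.32)
print(round(p_value, 4))  # 0.0506
```

With all three strata in the first interval, the sum reduces to a single term, $p^3$, which matches the one-tailed p-value reported in the Results.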

Evolutionary times

The phylogenetic relationships and mean divergence times for the included species were taken from a previously published primate phylogeny and divergence dates (Pozzi et al., 2014, supplementary table 3). Detailed phylogenetic relationships among strepsirrhine lineages (Horvath et al., 2008) were used to infer phylogenetic relationships when species in our analysis were not included in this reference study. The values are reported in the following table.

Node label^a   Mean age^a (My)   Connected nodes or species
4              7.65              Homo sapiens, Pan troglodytes
5              10.63             4, Gorilla gorilla
7              17.29             5, Pongo pygmaeus
38             32.12             7, Macaca mulatta
44^b           46.72             38, Callithrix jacchus
61             74.11             44, 60
50^c           24.24             Eulemur rufus, Hapalemur simus
54^d           43.46             50, Microcebus murinus
55             59.55             54, Daubentonia madagascariensis
56             17.28             Galago senegalensis, Otolemur garnetti
58             36.35             56, Nycticebus coucang
60             66.33             55, 58

^a For the sake of clarity, we report the node labels and ages from our reference primate phylogeny (Pozzi et al., 2014).
^b Callithrix jacchus is not included in the reference phylogeny, but the divergence time between platyrrhini and catarrhini can be used.
^c Neither Eulemur rufus nor Hapalemur simus is included in the reference phylogeny, but the divergence between Eulemur macaco and Lemur catta can be used (node 18 in Horvath et al., 2008).
^d Microcebus murinus is not included in the reference phylogeny, but the divergence with Lepilemur sp. can be used (node 7 in Horvath et al., 2008).
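The node ages above can be turned into total evolutionary durations by summing branch lengths. The text does not spell out this computation, so the sketch below rests on an assumption: each duration is taken as the total branch length of a clade, including its stem branch back to node 61. Under that assumption the sums agree with the $\Delta t_1$ and $\Delta t_2$ values given in the Results. Function and variable names are illustrative.

```python
# Node ages (My) and connections, copied from the table above.
NODE_AGE = {4: 7.65, 5: 10.63, 7: 17.29, 38: 32.12, 44: 46.72, 61: 74.11,
            50: 24.24, 54: 43.46, 55: 59.55, 56: 17.28, 58: 36.35, 60: 66.33}
CHILDREN = {
    4: ["Homo sapiens", "Pan troglodytes"],
    5: [4, "Gorilla gorilla"],
    7: [5, "Pongo pygmaeus"],
    38: [7, "Macaca mulatta"],
    44: [38, "Callithrix jacchus"],
    61: [44, 60],
    50: ["Eulemur rufus", "Hapalemur simus"],
    54: [50, "Microcebus murinus"],
    55: [54, "Daubentonia madagascariensis"],
    56: ["Galago senegalensis", "Otolemur garnetti"],
    58: [56, "Nycticebus coucang"],
    60: [55, 58],
}


def age(node):
    """Internal nodes use the tabulated age; species (tips) have age 0."""
    return NODE_AGE.get(node, 0.0)


def clade_length(node):
    """Sum of the lengths of all branches descending from `node`."""
    total = 0.0
    for child in CHILDREN.get(node, []):
        total += age(node) - age(child) + clade_length(child)
    return total


def duration(clade_root, parent=61):
    """Clade branch lengths plus the stem branch from `parent` (assumption)."""
    return clade_length(clade_root) + age(parent) - age(clade_root)


dt1 = duration(44)  # catarrhini and platyrrhini
dt2 = duration(60)  # strepsirrhini
print(round(dt1, 2), round(dt2, 2))  # 188.52 321.32
```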

Results

The catarrhini and platyrrhini lineages evolved for $\Delta t_1 = 188.52$ My, during which $S_1 = 3$ new strata were formed. The strepsirrhini lineages evolved for $\Delta t_2 = 321.32$ My and no new stratum was formed ($S_2 = 0$). A likelihood-ratio test comparing these rates would be significant ($X = 5.95$, $p = 0.015$); however, so few observations certainly violate the conditions for the convergence of the null distribution to the $\chi^2$ distribution. The result of the exact binomial test approaches statistical significance (one-tailed test, $p = 0.0506$), suggesting that new strata could form at a higher rate in the catarrhini and platyrrhini than in the strepsirrhini lineages. One has to keep in mind, however, that we used mean divergence times. Although 95% confidence intervals are available for the divergence times, we are not able to provide confidence intervals for the durations $\Delta t_1$ and $\Delta t_2$. Moreover, it would make sense to scale durations by generation time. Our approach assumes equal average generation times, but shorter generations in the strepsirrhini lineages would lead to a more significant difference.

References

Horvath, J. E., D. W. Weisrock, S. L. Embry, I. Fiorentino, J. P. Balhoff, P. Kappeler, G. A. Wray, H. F. Willard, and A. D. Yoder. 2008, Development and application of a phylogenomic toolkit: resolving the evolutionary history of Madagascar’s lemurs. Genome Research 18:489–499.

Pozzi, L., J. A. Hodgson, A. S. Burrell, K. N. Sterner, R. L. Raaum, and T. R. Disotell. 2014, Primate phylogenetic relationships and divergence dates inferred from complete mitochondrial genomes. Molecular Phylogenetics and Evolution 75:165–183.

[Figure S1. Panel A: humanX versus Otolemur Chr X (Mbp), with the human regions PAR1, S5, S4, S3, S2, S1 and PAR2 marked. Panel B: Otolemur (M/F) versus Otolemur Chr X (Mbp). Highlighted regions: misA, mPAR.]

Fig. S1. Synteny analysis of northern greater galago and human X chromosomes. (A) Synteny plot of the human and northern greater galago X chromosomes. The human X was used to order the northern greater galago scaffolds (see Methods). Black dots represent orthologous genes between the human and northern greater galago X chromosomes. Human strata are shown following the definitions of Hughes & Rozen (2012) and Skaletsky et al. (2003) (note that old strata have been split into smaller strata in Pandey, Wilson Sayres, & Azad, 2013). (B) M:F read depth ratio along the northern greater galago X chromosome.

[Figure S2. Panels show the M/F ratio for Nycticebus, Galago and Otolemur along Otolemur Chr X (Mbp), and for E. rubriventer, Prolemur, Microcebus and Daubentonia along Microcebus Chr X (Mbp), next to a time axis in Mya. Highlighted regions: misA, mPAR.]

Fig. S2. SNP density analysis. M:F SNP density ratio for all seven strepsirrhine species (see Methods for details).


Table S1. Statistics about the genome sequencing in the 7 species.

Lineage       Genus        Species           Common name              Sex  Read #         Read length  Sequences (Gb)  Coverage* (X)  Sources
Lemuriformes  Daubentonia  madagascariensis  aye-aye                  M    574 860 296    125          71.9            23.2           MNHN, Paris
Lemuriformes  Daubentonia  madagascariensis  aye-aye                  F    807 533 380    150          121.1           39.1           Zoo Frankfurt
Lemuriformes  Microcebus   murinus           gray mouse lemur         M    553 217 340    125          69.2            22.3           MNHN, Brunoy
Lemuriformes  Microcebus   murinus           gray mouse lemur         F    567 375 076    125          70.9            22.9           MNHN, Brunoy
Lemuriformes  Eulemur      rubriventer       red-bellied lemur        M    361 251 832    150          54.2            17.5           Zoo de Lyon
Lemuriformes  Eulemur      rubriventer       red-bellied lemur        F    316 639 574    150          47.5            15.3           Zoo de Lyon
Lemuriformes  Prolemur     simus             greater bamboo lemur     M    242 884 578    150          36.4            11.8           Zoo de Lyon
Lemuriformes  Prolemur     simus             greater bamboo lemur     F    428 087 286    150          64.2            20.7           Zoo de Lyon
Lorisiformes  Nycticebus   coucang           slow loris               M    665 798 842    150          99.9            32.2           MNHN, Paris
Lorisiformes  Nycticebus   coucang           slow loris               F    670 569 564    150          100.6           32.4           MNHN, Paris
Lorisiformes  Galago       senegalensis      senegal bushbaby         M    641 724 580    150          96.3            31.1           MNHN, Paris
Lorisiformes  Galago       senegalensis      senegal bushbaby         F    666 087 196    150          99.9            32.2           MNHN, Paris
Lorisiformes  Otolemur     garnetti          northern greater galago  M    662 474 900    150          99.4            32.1           MNHN, Paris
Lorisiformes  Otolemur     garnetti          northern greater galago  F    2 599 993 104  100          260.0           83.9           EBI**

*based on human genome size (assuming similar genome sizes in humans and all these species)
**SRR016877 to SRR016896 files fastq.gz 1 and 2.
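The sequence and coverage columns follow from the read counts and read lengths. The sketch below shows the arithmetic; the genome size of 3.1 Gb is an assumption made here (the footnote only states that the human genome size was used), chosen because it is consistent with the tabulated ratios. The function name is illustrative.

```python
def sequencing_stats(read_count, read_length, genome_size=3.1e9):
    """Return (sequenced gigabases, fold coverage) for one library.

    genome_size is an assumed human-like genome size in base pairs.
    """
    bases = read_count * read_length
    return bases / 1e9, bases / genome_size


# First row of Table S1: aye-aye male, 574 860 296 reads of 125 bp.
gigabases, fold_coverage = sequencing_stats(574_860_296, 125)
print(round(gigabases, 1), round(fold_coverage, 1))  # 71.9 23.2
```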


References

Hughes, J. F., & Rozen, S. (2012). Genomics and genetics of human and primate Y chromosomes. Annu Rev Genomics Hum Genet, 13, 83-108. doi: 10.1146/annurev-genom-090711-163855

Pandey, R. S., Wilson Sayres, M. A., & Azad, R. K. (2013). Detecting evolutionary strata on the human X chromosome in the absence of gametologous Y-linked sequences. Genome Biol Evol, 5(10), 1863-1871. doi: 10.1093/gbe/evt139

Skaletsky, H., Kuroda-Kawaguchi, T., Minx, P. J., Cordum, H. S., Hillier, L., Brown, L. G., . . . Page, D. C. (2003). The male-specific region of the human Y chromosome is a mosaic of discrete sequence classes. Nature, 423(19 June), 825-837.
