
Supplementary information

Supplementary table

Table S1. Number of H3.3- and H3.5-encoding genes per organism

| No. | Name | Scientific name | Class / phylum | No. of H3.3 | No. of H3.5 |
|---|---|---|---|---|---|
| 1 | Human | Homo sapiens | Sarcopterygii | 2 | 1 |
| 2 | Chimpanzee | Pan troglodytes | Sarcopterygii | 2 | 1 |
| 3 | Pygmy chimpanzee | Pan paniscus | Sarcopterygii | 2 | 1 |
| 4 | Gibbon | Nomascus leucogenys | Sarcopterygii | 2 | 1 |
| 5 | Gorilla | Gorilla gorilla | Sarcopterygii | 2 | 1 |
| 6 | Orangutan | Pongo abelii | Sarcopterygii | 2 | 0 |
| 7 | Baboon | Papio anubis | Sarcopterygii | 2 | 0 |
| 8 | Macaque | Macaca mulatta | Sarcopterygii | 2 | 0 |
| 9 | Marmoset | Callithrix jacchus | Sarcopterygii | 2 | 0 |
| 10 | Mouse | Mus musculus | Sarcopterygii | 2 | 0 |
| 11 | Rat | Rattus norvegicus | Sarcopterygii | 2 | 0 |
| 12 | Prairie vole | Microtus ochrogaster | Sarcopterygii | 2 | 0 |
| 13 | Opossum | Monodelphis domestica | Sarcopterygii | 2 | 0 |
| 14 | Dog | Canis familiaris | Sarcopterygii | 2 | 0 |
| 15 | Pig | Sus scrofa | Sarcopterygii | 2 | 0 |
| 16 | Cow | Bos taurus | Sarcopterygii | 2 | 0 |
| 17 | Horse | Equus caballus | Sarcopterygii | 2 | 0 |
| 18 | Chicken | Gallus gallus | Sarcopterygii | 2 | 0 |
| 19 | Zebra finch | Taeniopygia guttata | Sarcopterygii | 2 | 0 |
| 20 | Lizard | Anolis carolinensis | Sarcopterygii | 2 | 0 |
| 21 | Frog | Xenopus tropicalis | Sarcopterygii | 2 | 0 |
| 22 | Coelacanth | Latimeria chalumnae | Sarcopterygii | 3 | 0 |
| 23 | Tilapia | Oreochromis niloticus | Actinopterygii | 5 | 0 |
| 24 | Zebrafish | Danio rerio | Actinopterygii | 5 | 0 |
| 25 | Fugu | Takifugu rubripes | Actinopterygii | 3 | 0 |
| 26 | Medaka | Oryzias latipes | Actinopterygii | 3 | 0 |
| 27 | Tetraodon | Tetraodon nigroviridis | Actinopterygii | 3 | 0 |
| 28 | Spotted gar | Lepisosteus oculatus | Actinopterygii | 3 | 0 |
| 29 | Stickleback | Gasterosteus aculeatus | Actinopterygii | 3 | 0 |
| 30 | Lamprey | Petromyzon marinus | Hyperoartia | 2 | 0 |
| 31 | Fruit fly | Drosophila melanogaster | Insecta | 2 | 0 |
| 32 | Worm | Caenorhabditis elegans | Nematoda | 2 | 0 |

Supplementary figures

Figure S1. Sequence similarity of H3F3A, H3F3B and H3F3C in metazoa

A. Protein sequence identity computed between human, mouse and zebra finch H3.3 genes (H3F3A and H3F3B) and the H3.3 genes in the analyzed metazoan genomes. The hominid-specific H3F3C is included at the bottom. B. Identities as in (A), computed using coding DNA sequences.

Figure S2. Synteny around H3F3A and H3F3B genes

A, B. Conservation of genes around H3.3 genes between human and other tetrapods (mouse, zebra finch and lizard; top three rows in A) and conservation of genes around H3.3 genes between tetrapods and distant organisms (actinopterygians, fly and worm; the last two rows in A and all rows in B). Sequence conservation (identity) was measured for both nucleotide and protein sequences. The brown graph shows identities between genes around tetrapod H3.3 genes and genes around human H3F3A, or between genes conserved around non-tetrapod H3.3 genes and genes around tetrapod H3F3A. Similarly, the blue graph shows identities between genes conserved around the H3.3-encoding gene in a given organism and genes around human H3F3B or tetrapod H3F3B. A blue star indicates a non-tetrapod gene sharing syntenic genes with tetrapod H3F3B. The highest peaks at gene position 0 (100% protein identity for the brown and blue lines) represent the identity between a given H3.3 gene and tetrapod H3F3A and H3F3B.

Figure S3. Comparison of tetrapod H3.3 genes to related genes in sarcopterygian and non-sarcopterygian lineages

Average sequence similarity was estimated between the CDS of tetrapod (human, mouse, zebra finch) H3.3 genes (H3F3A, x-axis; H3F3B, y-axis) and the CDS of each H3.3 gene in sarcopterygians, actinopterygians and more distant organisms (lamprey, fly). Additionally, the CDS of tetrapod and zebrafish H3.1 and H3.2 genes were included in this analysis. Each point represents a gene, and the organism name is written in the matching color. Sequence similarity is the percentage of identical nucleotides in the sequence.
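The similarity measure defined in the Figure S3 caption (percentage of identical nucleotides) can be illustrated with a minimal sketch. It assumes the two CDS are already aligned to the same length and that gap columns count as mismatches; the function name `percent_identity` and the example sequences are hypothetical and are not taken from the study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions that carry the same nucleotide.

    Assumption: both sequences come from the same alignment, so they have
    equal length; '-' marks a gap and never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    a, b = seq_a.upper(), seq_b.upper()
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


# Toy example: two 6-nt fragments that differ at the last position.
identity = percent_identity("ATGGCT", "ATGGCC")
print(round(identity, 2))
```

Averaging this value over the human, mouse and zebra finch reference genes would give one coordinate of a point in a plot of this kind, under the same assumptions.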

Figure S4. Conservation of H3.3 gene 3’UTRs in metazoa

A, B. Sequence conservation of H3.3 gene 3’UTRs among tetrapods (top two rows in A) and between tetrapods and non-tetrapod organisms (coelacanth, actinopterygians, fly and worm; last three rows in A and all rows in B). The blue line represents the sequence identity between a given H3.3 gene’s 3’UTR and the H3F3B 3’UTR from ten tetrapod organisms (x-axis). Similarly, the brown line shows the sequence identity between the 3’UTR of a given gene and the 3’UTR of tetrapod H3F3A genes. An asterisk marks non-tetrapod H3.3 genes whose 3’UTR sequence is more similar to that of tetrapod H3F3B (blue asterisk) or tetrapod H3F3A (brown asterisk) in the majority of the tetrapod organisms included.

Figure S5. CDS conservation of histone variant genes in tetrapods

A, B. Pairwise nucleotide substitution scores (genetic distances) computed for the two H3.3 genes (H3F3A, brown; H3F3B, blue) and for the H2AFZ gene (red), which was included for comparison. The analysis was performed separately for mammalian genomes (A) and other tetrapod genomes (reptiles, birds and amphibians; B). Scores were compared with a Wilcoxon rank-sum test.
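The comparison described for Figure S5 can be sketched as follows. This is an illustrative outline only: it assumes the distance is the raw number of substitutions divided by the aligned CDS length, and it uses a two-sided rank-sum test with a normal approximation and no tie correction. The function names and the distance values are hypothetical. For real analyses, `scipy.stats.ranksums` provides an equivalent test.

```python
import math


def substitutions_per_site(cds_a: str, cds_b: str) -> float:
    """Number of differing positions divided by the aligned CDS length."""
    if len(cds_a) != len(cds_b) or not cds_a:
        raise ValueError("CDS must be non-empty and aligned to equal length")
    diffs = sum(1 for x, y in zip(cds_a.upper(), cds_b.upper()) if x != y)
    return diffs / len(cds_a)


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns the U statistic of sample x and the approximate p-value.
    """
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # tied values share the mean rank
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_value


# Hypothetical pairwise distances for two genes (not values from the study).
gene_a_distances = [0.01, 0.02, 0.03]
gene_b_distances = [0.10, 0.12, 0.15]
u, p = rank_sum_test(gene_a_distances, gene_b_distances)
print(u, round(p, 3))
```

With samples this small the normal approximation is rough; an exact test would be preferable in practice.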

Figure S6. GC3 of H3.3 CDS

A. Distribution of GC3 scores for tetrapod H3F3A and H3F3B, actinopterygian H3.3, and tetrapod/zebrafish H3.1, H3.2 and H2AFZ genes. B. Distribution of normalized GC3 scores. For a particular gene, the normalized GC3 score is the GC content at the third codon position (GC3) divided by the GC content of the whole coding sequence (GC-CDS). The pink dashed lines represent the averages computed for human ubiquitously expressed genes (UEG) [44].
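The two scores defined in the Figure S6 caption can be written out as a short sketch. It assumes an in-frame CDS whose length is a multiple of three and includes the stop codon in the counts; the function names and the toy sequence are hypothetical. The values are returned as fractions, whereas panel A of the figure reports GC3 in percent.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc3(cds: str) -> float:
    """GC content at third codon positions of an in-frame CDS."""
    if len(cds) % 3 != 0:
        raise ValueError("CDS length must be a multiple of three")
    return gc_fraction(cds[2::3])  # every third base, starting at codon position 3


def normalized_gc3(cds: str) -> float:
    """GC3 divided by the GC content of the whole CDS (GC-CDS)."""
    return gc3(cds) / gc_fraction(cds)


# Toy CDS with codons ATG GCC AAG TAA (not a real histone sequence).
toy_cds = "ATGGCCAAGTAA"
print(gc3(toy_cds), round(normalized_gc3(toy_cds), 2))
```

A normalized score above 1 indicates that third codon positions are richer in G and C than the coding sequence as a whole. A CDS with no G or C at all would make the ratio undefined, which this sketch does not handle.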

Figure S7. Distinct codon usage preferences in the H3.3 genes

A. Correlation between codon usage in the genes specified on the x-axis and the genome-wide codon usage. The box plots represent the lineage distributions of the correlation coefficients calculated between the ‘codon frequencies’ of a corresponding gene and those estimated genome-wide (e.g. all tetrapod H3F3A genes vs. genome-wide frequencies). The brown and blue diamonds provide a reference for human H3F3A and H3F3B, respectively. The pink dashed lines represent the average correlation computed for human ubiquitously expressed genes (UEG) [44]. B. Correlation of human H3F3A and H3F3B ‘codon frequencies’ with those computed for genes associated with cell proliferation (orange) and cell differentiation (green) [41]. Each dot represents an individual gene from the corresponding group. The dotted lines indicate the median correlation coefficient for each group and H3.3 gene. C, D. Comparison of the distributions of correlation scores between human H3F3A ‘amino-acid specific codon frequencies’ and those computed for genes associated with cell proliferation (orange) and cell differentiation (green) [41] (C). Similar comparisons were performed for human H3F3B (D). A vertical dashed line represents the median of the correlation scores obtained for a given H3.3 gene and a given group of genes. Scores were compared with a Mann-Whitney test. E, F. Similar comparisons as in C and D, respectively, based on ‘codon frequencies’.

Figure S1. Heat maps of sequence identity. Panel A: protein sequence identity; panel B: nucleotide sequence identity. Rows: human, mouse and zebra finch H3F3A and H3F3B. Columns: H3F3A, H3F3B and H3F3C genes of Sarcopterygii; H3.3 genes of Actinopterygii (ray-finned fish), lamprey and fly; worm His71 and His72.

Figure S2A. Synteny plots, shown as pairs of CDS and protein panels: mouse, zebra finch and lizard H3F3A and H3F3B vs. human; Tetraodon H3.3 (chr2, chr7, chr16) vs. tetrapods. x-axis: gene positions; y-axis: coding sequence identity or protein sequence identity (0.2 to 1.0). Each panel compares against H3F3A and against H3F3B. Numbered genes labelled in the figure: 1. TRIM65, 2. WBP2, 3. UNC13D, 4. UNK, 5. ITGB4, 6. SAP32BP.

Figure S2B. Synteny plots, shown as pairs of CDS and protein panels, each vs. tetrapods: zebrafish H3.3 (chr3, chr5, chr15, chr24); medaka H3.3 (chr13, chr19); fly H3.3 (chrX, chr2L); worm His71 (chrX) and His72 (chrIII). x-axis: gene positions; y-axis: coding sequence identity or protein sequence identity (0.2 to 1.0). Each panel compares against tetrapod H3F3A and against tetrapod H3F3B.

Figure S3. Scatter plot. x-axis: similarity to tetrapod H3F3A (0.75 to 0.95); y-axis: similarity to tetrapod H3F3B (0.75 to 0.95). Labelled points: human, mouse and zebra finch H3F3A and H3F3B; primate H3F3C; coelacanth H3F3A, H3F3B and H3.3; lamprey H3.3 (GL480101, GL479001); fly H3.3 (chrX, chr2L); ray-finned fish H3.3; tetrapod/zebrafish H3.1 and H3.2 genes.

Figure S4A. 3’UTR sequence identity plots, each vs. tetrapod H3F3A and tetrapod H3F3B: human, mouse, zebra finch and lizard H3F3A and H3F3B; coelacanth H3F3A, H3F3B and H3.3; zebrafish H3.3 (chr3, chr5, chr15b, chr24); tilapia H3.3 (LG8_1, LG8_2, LG10, LG14_1, LG14_2). x-axis: tetrapod organisms (human, chimpanzee, pygmy chimpanzee, orangutan, baboon, macaque, mouse, vole, zebra finch, lizard); y-axis: 3’UTR sequence identity (0.2 to 1.0).

Figure S4B. 3’UTR sequence identity plots, each vs. tetrapod H3F3A and tetrapod H3F3B: stickleback H3.3 (groupI, groupV); Tetraodon H3.3 (chr2, chr7, chr16); medaka H3.3 (chr13, chr19); fugu H3.3 (chr1, chr15); spotted gar H3.3 (LG2, LG13); lamprey H3.3 (GL479001, GL480101); fly H3.3 (chrX, chr2L); worm His71 (chrX) and His72 (chrIII); yeast HHT1 (chrII) and HHT2 (chrXIV). x-axis: tetrapod organisms (human, chimpanzee, pygmy chimpanzee, orangutan, baboon, macaque, mouse, vole, zebra finch, lizard); y-axis: 3’UTR sequence identity (0.2 to 1.0).

Figure S5. Panel A: mammals (human, chimpanzee, orangutan, baboon, macaque, marmoset, cow, mouse, rat, dog, pig). Panel B: reptiles, birds and amphibians (lizard*, zebra finch, chicken, frog). Gene groups: H3F3A, H3F3B, H2AFZ. Axis: number of substitutions / CDS length. p-values shown: 0.930, 0.09, 9.7 × 10^-7, 0.032, 4.1 × 10^-13, 0.013.

Figure S6. Panel A: GC3 (%), scale 0 to 100. Panel B: normalized GC3 (GC3/GC-CDS), scale 0.8 to 1.4. Gene groups on the x-axis: tetrapod H3F3A, H3F3B and H2AFZ; ray-finned fish H3.3 genes; tetrapod/zebrafish H3.1 and H3.2 genes. Legend: human H3F3A; human H3F3B; mean for the human UEG chosen for calibration; mean for all human UEG (3803; data obtained from Eisenberg et al. 2013).

Figure S7. Panel A: comparison with genome-wide codon usage (codon frequencies); y-axis: correlation coefficient; gene groups: tetrapod H3F3A, H3F3B, H2AFZ, H3.1 and H3.2 genes and ray-finned fish H3.3 genes; legend: human H3F3A, human H3F3B, mean for the human UEG chosen for calibration, mean for human UEG (3803; data from Eisenberg et al. 2013). Panel B: correlation of codon usage in H3.3 and proliferation-/differentiation-induced genes (codon frequencies; data from Gingold et al. 2014); x-axis: correlation with H3F3A; y-axis: correlation with H3F3B. Panels C and D: human H3F3A (C) and human H3F3B (D) codon usage vs. cell proliferation and cell differentiation genes (AA-specific codon frequencies; data from Gingold et al. 2014). Panels E and F: the same comparisons based on codon frequencies. In panels C to F, x-axis: correlation coefficient; y-axis: density. p-values shown: 6.91 × 10^-12 (C), 8.30 × 10^-12 (D), 7.19 × 10^-18 (E), 8.58 × 10^-8 (F).
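The codon-usage correlations summarized for Figure S7 can be illustrated with a minimal sketch. It assumes that ‘codon frequencies’ means the relative frequency of each of the 64 codons in a CDS and that the correlation is a Pearson coefficient; the source does not state which coefficient was used, so that choice is an assumption. The function names and the toy sequences are hypothetical.

```python
import math
from itertools import product

# All 64 codons in a fixed order, so that frequency vectors are comparable.
CODONS = ["".join(bases) for bases in product("ACGT", repeat=3)]


def codon_frequencies(cds: str) -> list:
    """Relative frequency of each codon in an in-frame CDS."""
    cds = cds.upper()
    counts = dict.fromkeys(CODONS, 0)
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for i in range(0, usable_length, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skip codons with ambiguous bases such as N
            counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no valid codons found")
    return [counts[c] / total for c in CODONS]


def pearson(u, v) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


# Toy sequences standing in for a gene and a reference set (hypothetical).
gene_freqs = codon_frequencies("ATGGCCAAGGCC")
reference_freqs = codon_frequencies("ATGGCTAAGGCCAAAGCC")
print(round(pearson(gene_freqs, reference_freqs), 3))
```

In a genome-wide comparison, the reference vector would be built by pooling codon counts over all annotated CDS of the genome before normalizing.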