Supplemental Data Supplemental Figures and Legends
[Figure S1, panels A–D: stacked bar charts of haplotype frequency (x-axis 0% to 100%) for the N.JPN, S.JPN, E.KOR, W.KOR and related-species groups. Haplotype legends: oCYP1A_H01 to oCYP1A_H13, oCYP1B1_H01 to oCYP1B1_H11, oCYP5A1_H01 to oCYP5A1_H03, and oCYP20A1_H01 to oCYP20A1_H10. Sample sizes printed beside the bars: N.JPN, n = 10; S.JPN, n = 20 to 26; E.KOR, n = 4; W.KOR, n = 4 (one entry marked n.a.*); related species, n = 0 or 4, depending on the gene.]

[Figure S1, panel E: Human cytochrome P450 gene names: CYP1A1, CYP1B1, CYP2R1, CYP2U1, CYP4V2, CYP5A1, CYP7A1, CYP7B1, CYP8A1, CYP8B1, CYP11A1, CYP17A1, CYP20A1, CYP21A2, CYP24A1, CYP26A1, CYP26B1, CYP27B1, CYP27C1, CYP51A1.]

[Figure S1, panel F: bar chart of RLU (relative light units; y-axis 0 to 2500) for GFP, Tanabe and Maegok, annotated with p = 1.6 × 10^-2 and p = 1.1 × 10^-3.]

Figure S1: Haplotype diversity of medaka CYPs, related to Figure 1. Haplotype frequencies based on CYP1A (A), CYP1B1 (B), CYP5A1 (C) and CYP20A1 (D) amino acid sequences in medaka local populations. Only homozygotes for each amino acid change were included. Colors represent haplotypes in local wild populations; n indicates the number of chromosomes. (E) Table showing the 20 CYP orthologs that were detected in the medaka genome based on searches of the human genome. (F) Enzyme activity of medaka CYP1B1 with HA-tag, and western blotting. CYP1B1 proteins were expressed at comparable levels. The x-axis represents the names of wild medaka populations. Each bar represents the mean ± S.D. from multiple independent samples (n = 3). Statistical comparisons were performed using the Tukey–Kramer test.
Each column was compared against values from the Tanabe population; * not analyzed.

[Figure S2, panel A]

Model                  p*   lnL†       κ‡     ω1§    ω2||   ω3¶
A. one: ω1=ω2=ω3       9    -2452.45   2.582  0.134  0.134  0.134
B. two: ω1=ω2, ω3      10   -2449.61   2.596  0.118  0.118  999.000
C. free: ω1, ω2, ω3    15   -2445.81   2.590  0.000  0.031  999.000

Null hypothesis   Alternative hypothesis   Model     χ2**    d.f.††  P‡‡
ω1=ω2=ω3          ω1, ω2, ω3               A vs. C   13.273  6       0.039
ω1=ω2=ω3          ω1=ω2, ω3                A vs. B   5.681   1       0.017

[Figure S2, panel B: phylogenetic tree with tips TN, KS, SH, MG and LZ (outgroup), group labels S.JPN and W.KOR, and a node labeled Common Ancestor; scale bar 0.01. Branches are annotated with number pairs (0.0/4.3, 1.1/8.8, 1.0/1.1, 2.2/5.4, 3.1/0.0, 2.0/6.5, 18.0/24.6) and amino acid substitutions (T395P; K69Q; D420E, D507A; V171L, I246V; V329I; T38I, I261V), marked as conservative or non-conservative.]

[Figure S2, panel C: bar chart of relative enzyme activity (y-axis 0 to 1.25). Bar labels: Maegok T395P, Tanabe, Maegok, oCYP1B1_H03, oCYP1B1_H08, Ancestral CYP1B1. Annotated p values: 0.3948, 0.3139, 0.0035, 0.0299, 0.0028.]

Figure S2: Reconstruction of ancestral CYP1B1, related to Figure 4. (A) Log likelihood values (top) and likelihood ratio tests (2ΔlnL; bottom) were estimated under different models for the two hypotheses. (B) Phylogenetic tree based on the nucleotide sequences of CYP1B1; LZ was used as the outgroup. The ancestral sequences of medaka CYP1B1 were inferred based on this tree. Amino acid substitutions are indicated under each branch. The numbers on each branch show dN/dS (ω) in Figure 4. The orange arrow shows the amino acids replaced in subsequent mutagenesis analyses. (C) Comparison of enzyme activities between wild-type and mutant enzymes. Constructs were generated using site-directed mutagenesis. This figure indicates that substitution of the 395th amino acid had no effect on CYP1B1 enzyme activity. Maegok T395P was also generated to confirm the effect of the amino acid change. Each bar represents the mean ± S.D. from multiple independent samples (n = 3). Data were analyzed for statistical significance using the Tukey–Kramer test.
* number of parameters; † log likelihood value; ‡ transition/transversion rate ratio; § dN/dS for (TN, KS)–TN branch; || dN/dS for ((TN, KS), (SH, MG))–(TN, KS) branch; ¶ dN/dS for ((TN, KS), (SH, MG))–(SH, MG) branch; ** log likelihood difference (2ΔlnL); †† degrees of freedom; ‡‡ probability values.

[Figure S3, panel A: stacked bar chart of haplotype frequency (x-axis 0% to 100%) for African, European and Asian populations; legend: CYP1B1*1, CYP1B1*2, CYP1B1*3, CYP1B1*4, CYP1B1*5, CYP1B1*6 and residual.]

[Figure S3, panel B: bar chart of Mahalanobis’ generalized distance (x-axis 0 to 1.8) for African, European and Asian populations.]

Figure S3: CYP1B1 and sexual dimorphism in human populations, related to Results and Discussion. (A) CYP1B1 haplotype frequencies based on HapMap and 1000 Genomes data. Africa includes YRI: Yoruban in Ibadan; Europe includes CEU: Utah residents with Northern and Western European ancestry from the CEPH collection; Asia includes JPT: Japanese in Tokyo, Japan and CHB: Han Chinese in Beijing, China. (B) Mahalanobis’ generalized distances (D²) between males and females based on tooth-crown size (modified from Hanihara 1978).

Supplemental Tables

Table S1: Spearman’s rank correlation coefficients, related to Figure 1. Spearman’s rho values are shown in the lower-left portion of the matrix, and p values are shown in the upper-right portion. Variables for correlation tests are shown below. “Mated male” indicates a dummy variable: a female with a Tanabe male (1) or a female with a Maegok male (0). Likewise, “mated female” indicates a dummy variable: a female from the Tanabe (1) or the Maegok (0) population. ΔSL, ΔA-AFL, ΔP-AFL, and ΔAFL ratio are differences between the morphometric data of Tanabe and Maegok males. Significant correlations after Holm correction for multiple comparisons are indicated in bold.
              Mated male  Mated female  ΔSL     ΔA-AFL  ΔP-AFL  ΔAFL ratio
Mated male    -           1.000         1.000   1.000   0.004   0.004
Mated female  0.045       -             1.000   1.000   1.000   1.000
ΔSL           -0.146      0.047         -       0.000   0.008   0.000
ΔA-AFL        -0.146      0.183         0.715   -       1.000   0.015
ΔP-AFL        0.470       -0.167        -0.441  -0.044  -       0.000
ΔAFL ratio    0.470       0.089         -0.647  -0.414  0.827   -

Table S2. Mahalanobis’ generalized distances based on C scores from three variables of medaka morphology, related to Figure 3. Mahalanobis’ distances are shown in the lower-left portion of the matrix, and p values are shown in the upper-right portion. TT, homozygote of Tanabe CYP1B1; TM, heterozygote of Tanabe and Maegok CYP1B1; MM, homozygote of Maegok CYP1B1. Significance was assessed using F-statistics; p values less than 0.001 are shown in bold.

      TT♂     TM♂     MM♂     TT♀    TM♀    MM♀
TT♂   -       0.185   0.737   0.000  0.000  0.000
TM♂   0.693   -       0.655   0.000  0.000  0.000
MM♂   0.201   0.158   -       0.000  0.000  0.000
TT♀   13.130  10.371  11.960  -      0.058  0.633
TM♀   8.203   5.262   6.763   1.262  -      0.132
MM♀   9.253   7.221   8.406   0.357  0.803  -

Table S3. Estimation of time since the most recent common ancestor (TMRCA) using whole mitochondrial genomes, related to Results and Discussion.

Number of segregating sites   θ̂W        Ne         TMRCA mean   TMRCA 95% CI (lower)   TMRCA 95% CI (upper)
3,289                         1342.449   3,015,384  4,987,445    3,527,999              6,489,106

Supplemental Movie

Movie S1. Mating behavior in medaka. We put two males (one from Tanabe and one from Maegok) and one female medaka (from Tanabe or Maegok) into a tank, although the Maegok male cannot be seen in the video. The Tanabe male medaka has larger anal fins than the Maegok male, and grasps the Tanabe female using his anal fin. After slowly descending together, the male and female vibrate their bodies to stimulate the release of sperm and eggs, respectively. The eggs are fertilized at this moment.

Supplemental Experimental Procedures

1. Medaka SNP screening

Identification of orthologous genes
To identify medaka orthologs of human CYP genes, we reconstructed phylogenetic trees and investigated genome syntenies.
The following amino acid sequences from seven species were obtained from the Ensembl database: medaka (Oryzias latipes), human (Homo sapiens), chimpanzee (Pan troglodytes), macaque (Macaca mulatta), zebrafish (Danio rerio), fugu (Takifugu rubripes) and tetraodon (Tetraodon nigroviridis). Phylogenetic analyses were performed using the program MEGA4 [S1]. Pairwise sequence divergences were calculated using the Poisson correction method, and phylogenetic trees were reconstructed using the neighbor-joining (NJ) method [S2]. Statistical reliability of tree branches was evaluated using 1,000 bootstrap replicates [S3]. Genome synteny was examined using Ensembl and Genomicus (http://www.dyogen.ens.fr/genomicus-64.01/cgi-bin/search.pl) [S4].

Samples
We used laboratory stocks of medaka derived from wild populations of the Northern Japanese (N.JPN), Southern Japanese (S.JPN), Eastern Korean (E.KOR) and Western Korean/Chinese (W.KOR) groups, which are distinguished by mitochondrial DNA sequences [13] (Figure 1A). These medaka strains have been maintained for many generations as closed colonies in the Graduate School of Frontier Sciences, University of Tokyo [S5]. A total of 28 individuals were analyzed from 26 locations: northern Japan: Niigata, Ryotsu, Kaga and Odate (n = 2); southern Japan: Tanabe, Takamatsu, Tessei, Kasumi, Urizura, Iwaki, Mishima, Hagi, Okewaki, Kikai, Nago (n = 2), Kochi, Yamaguchi, Akishima and Gushikami; east Korea: Yongchon and Sajin; west Korea: Maegok and Bugang; China: Shanghai; and related species: Luzon (O. luzonensis) and Hainan (O. curvinotus). The original habitat of each strain is described in Katsumura et al. 2009 [13].

DNA extraction
Whole medaka were dissolved in a solution containing 10% SDS, 0.5 M EDTA and proteinase K.