Supplementary Material

Genetic Diversity in the IZUMO1-JUNO Protein-Receptor Pair Involved in Human Reproduction

Jessica Allingham and Wely B. Floriano*

Department of Chemistry, Lakehead University, Thunder Bay Ontario, Canada. E-mail: [email protected], [email protected]

*corresponding author

Table S1: Comprehensive breakdown of the variants in the IZUMO1 gene sequence when unfiltered and filtered with a minor allele frequency (MAF) of 5% using SNPEff (1).

| | No MAF filtering | MAF 5% frequency |
|---|---|---|
| Variants | 192 | 31 |
| Variant rates | 307963 | 1907386 |
| SNPs | 189 | 31 |
| Insertions | 1 | 0 |
| Deletions | 2 | 0 |
| Low Impact Effects | 199 (29.4%) | 23 (21.5%) |
| Moderate Impact Effects | 11 (1.6%) | 1 (0.9%) |
| Modifier Impact Effects | 466 (68.9%) | 83 (77.6%) |
| Missense Mutations | 12 (57.1%) | 1 (33.3%) |
| Silent Mutations | 9 (42.9%) | 2 (66.7%) |
| Downstream Effects | 143 (21.2%) | 29 (27.1%) |
| Intergenic Effects | 20 (3.0%) | 6 (5.6%) |
| Intragenic Effects | 1 (0.1%) | 0 |
| Intron Effects | 132 (19.5%) | 18 (16.8%) |
| Next Protein Effects | 182 (26.9%) | 21 (19.6%) |
| Non-synonymous Coding Effects | 11 (1.6%) | 1 (0.9%) |
| Non-synonymous Start Effects | 1 (0.1%) | 0 |
| Splice Site Region and Intron Effect | 5 (0.7%) | 0 |
| Start Gained Effect | 2 (0.3%) | 0 |
| Synonymous Coding Effect | 9 (1.3%) | 2 (1.9%) |
| Upstream Effects | 157 (23.3%) | 26 (24.3%) |
| UTR 5 Prime Effect | 13 (1.9%) | 4 (3.7%) |

Table S2: Comprehensive breakdown of the variants in the JUNO gene sequence when unfiltered and filtered with a MAF of 5% using SNPEff (1).

| | No MAF filtering | MAF 5% frequency |
|---|---|---|
| Variants | 90 | 7 |
| Variant rates | 1500072 | 19286645 |
| SNPs | 82 | 7 |
| Mixed Variants | 8 | 0 |
| High Impact Effects | 3 (1.8%) | 0 |
| Low Impact Effects | 86 (51.1%) | 3 (30%) |
| Moderate Impact Effects | 19 (11.3%) | 1 (10%) |
| Modifier Impact Effects | 60 (35.7%) | 6 (60%) |
| Missense Mutations | 18 (60%) | 1 (100%) |
| Silent Mutations | 3 (10%) | 0 |
| Downstream Effects | 9 (30%) | 0 |
| Intergenic Effects | 4 (2.4%) | 0 |
| Intron Effects | 53 (31.5%) | 5 (50%) |
| Next Protein Effects | 78 (46.4%) | 3 (30%) |
| Non-synonymous Coding Effects | 18 (10.7%) | 1 (10%) |
| Stop Gained Effects | 3 (1.8%) | 0 |
| Synonymous Coding Effect | 9 (5.4%) | 0 |
| UTR 5 Prime Effect | 3 (1.8%) | 1 (10%) |

Table S3: A list of the 26 different populations sampled by the 1000 Genomes project (2), clustered into five larger population groups, where n signifies the number of individuals in each population group.

| Category | n | Populations included | Population Code |
|---|---|---|---|
| South Asia | 489 | Bengali in Bangladesh | BEB |
| | | Gujarati Indian | GIH |
| | | Indian Telugu in the UK | ITU |
| | | Punjabi in Lahore, Pakistan | PJL |
| | | Sri Lankan Tamil in the UK | STU |
| East Asian | 504 | Japanese in Tokyo, Japan | JPT |
| | | Han Chinese in Beijing, China | CHB |
| | | Southern Han Chinese, China | CHS |
| | | Chinese Dai in Xishuangbanna | CDX |
| | | Kinh in Ho Chi Minh City, Vietnam | KHV |
| Europe | 503 | Northern and Western European ancestry | CEU |
| | | Finnish in Finland | FIN |
| | | British in England and Scotland | GBR |
| | | Iberian populations in Spain | IBS |
| | | Toscani in Italia | TSI |
| America | 347 | Colombian in Medellin, Colombia | CLM |
| | | Mexican Ancestry in Los Angeles | MXL |
| | | Peruvian in Lima, Peru | PEL |
| | | Puerto Rican in Puerto Rico | PUR |
| Africa | 661 | African Caribbean in Barbados | ACB |
| | | African Ancestry in Southwest US | ASW |
| | | Esan in Nigeria | ESN |
| | | Gambian in Western Division | GWD |
| | | Luhya in Webuye, Kenya | LWK |
| | | Mende in Sierra Leone | MSL |
| | | Yoruba in Ibadan, Nigeria | YRI |

Table S4: Tajima's D analysis of various genes under different types of selection (3-8). Population sizes (n) are reported in parentheses.
Tajima's D was calculated using VCFTools (9) in bins of 100 bp for all biallelic sites within the location range of each gene.

| Selection | Gene | Location (GRCh37.p13) | Tajima's D, all populations (n=2,504) | Literature Value | Reference |
|---|---|---|---|---|---|
| Unknown | IZUMO1 | Chr 9 49244073-49250831 | -0.35532 | N/A | N/A |
| | JUNO | Chr 11 94038803-94040858 | -0.77916 | N/A | N/A |
| Neutral | LTA | Chr 6 31539876-31542101 | -0.45138 | 0.746 (n=282), 6 Chinese populations | (5) |
| | TAS2R38 | Chr 7 141463897-141464997 | -0.58725 | 1.078 (n=8,589) | (10) |
| | TBX1 | Chr 22 19744226-19771116 | -0.69686 | -0.25 (n=124) (22 = EUR, 27 = AFR, 24 = ASI, 22 = AMR) | (11) |
| | VTN | Chr 17 26694298-26697373 | -0.61777 | Value not reported | |
| Balancing | ABO | Chr 9 136130563-136150630 | -0.07299 | 2.035 EUR (n=23); 1.772 AFR (n=24) | (4, 6) (12) |
| | BPIFB4 | Chr 20 31669318-316699557 | -0.56074 | Value not reported | (7) |
| | BTN1A1 | Chr 6 26500577-26510653 | -0.57767 | Value not reported | (7) |
| | CDSN | Chr 6 31082865-31088252, complement | 0.333637 | Value not reported | (7) |
| | CLCNKB | Chr 10 16370231-16383821 | -0.34419 | Value not reported | (7) |
| | ERAP2 | Chr 5 96211644-96255420 | -0.30469 | 1.526 (n=180) | (3) |
| | GRIN3A | Chr 9 104331634-104500862, complement | -0.53335 | Value not reported | (7) |
| | HLAA | Chr 6 29910247-29913661 | 0.656452 | 2.9 (n=205) | (7, 13, 14) |
| | HLAB | Chr 6 31321649-31324989, complement | 0.354656 | 2.4 (n=205) | (7, 13, 14) |
| | KRT6C | Chr 12 52862300-52867569, complement | -0.32544 | Value not reported | (7) |
| | KRT84 | Chr 12 52771596-52779417, complement | -0.34531 | Value not reported | (7) |
| | TRIM22 | Chr 11 5710817-5732093 | -0.37645 | Value not reported | (7) |
| Positive | ABHD1 | Chr 2 27346632-27353680 | -0.72632 | Value not reported | (8) |
| | ALMS1 | Chr 2 73612886-73837047 | -0.67787 | Value not reported | (8) |
| | APOBEC3F | Chr 22 39436609-39451977 | -0.67458 | Value not reported | (8) |
| | APOBEC3G | Chr 22 39473010-39483748 | -0.67032 | Value not reported | (8) |
| | CD36 | Chr 7 80231504-80308593 | -0.5766 | Value not reported | (8) |
| | CD58 | Chr 1 117057156-117113715, complement | -0.70801 | Value not reported | (8) |
| | CD72 | Chr 9 35609976-35618862, complement | -0.7711 | Value not reported | (8) |
| | EDAR | Chr 2 109510927-109605828 | -0.66222 | Value not reported | (8, 15) |
| | FAF1 | Chr 1 50906935-51425936, complement | -0.73988 | Value not reported | (8) |
| | GRAP2 | Chr 22 40297086-40369347 | -0.7289 | Value not reported | (8) |
| | HYAL3 | Chr 3 50330259-50336899, complement | -0.73139 | Value not reported | (8) |
| | ITGAE | Chr 17 3617919-3704537, complement | -0.57616 | Value not reported | (8) |
| | KEL | Chr 7 142638201-142659503 | -0.76554 | -2.467 EUR (n=23); -0.823 AFR (n=24) | (6) |
| | LCT | Chr 2 136545415-136594750, complement | -0.61215 | Value not reported | (16) |
| | NPAP1 | Chr 15 24920541-24928593 | -0.74745 | Value not reported | (8) |
| | PRM1 | Chr 16 11374693-11375192, complement | -0.84476 | Value not reported | (8) |
| | PRM2 | Chr 16 11369493-11370337, complement | -0.91328 | Value not reported | (8) |
| | PROS1 | Chr 3 93591881-93692934, complement | -0.80824 | -1.44 (n = 47) (24 = AFR, 23 = EUR) | (17) |
| | RBAK | Chr 7 5085452-5112854 | -0.62683 | Value not reported | (8) |
| | RHCE | Chr 1 25687853-25747363, complement | -0.63686 | Value not reported | (8) |
| | SLC24A5 | Chr 15 48413169-48434926 | -0.73307 | Value not reported | (8) |
| | SYT1 | Chr 12 79257773-79845788 | -0.66917 | Value not reported | (8) |
| | TRPV6 | Chr 7 142568956-142583490 | -0.67565 | -2.865 EUR (n=23); 0.893 AFR (n=24) | (6, 18, 19) |

Table S5: Hardy-Weinberg Equilibrium analysis of the IZUMO1 gene in males only and in the entire population in each of the groups included in the analyzed haplotype (20).

| Population | rs2307018 (All) | rs2307018 (Males) | rs2307019 (All) | rs2307019 (Males) | rs838148 (All) | rs838148 (Males) |
|---|---|---|---|---|---|---|
| AFR | 0.7809 | 1 | 0.7809 | 1 | 0.8287 | 0.9697 |
| AMR | 0.9762 | 0.57 | 0.9762 | 0.57 | 0.9762 | 0.2741 |
| EUR | 0.3691 | 1 | 0.3691 | 1 | 0.8109 | 0.428 |
| EAS | 0.8668 | 0.7768 | 0.8668 | 0.7768 | 0.9064 | 1.00 |
| SAS | 0.3308 | 0.2600 | 0.3308 | 0.2600 | 1 | 0.8749 |
| ASI | 5.83E-06 | 6.91E-05 | 5.83E-06 | 6.91E-05 | 0.3669 | 0.3098 |
| ALL | 2.06E-13 | 4.62E-08 | 2.06E-13 | 4.62E-08 | 0.0039 | 0.0025 |

Table S6: Hardy-Weinberg Equilibrium analysis of the JUNO gene in females only and in the entire population in each of the groups included in the analyzed haplotype (20).

| Population | rs61742524 (All) | rs61742524 (Females) | rs55784852 (All) | rs55784852 (Females) | rs16920146 (All) | rs16920146 (Females) | rs7925833 (All) | rs7925833 (Females) | rs7935583 (All) | rs7935583 (Females) |
|---|---|---|---|---|---|---|---|---|---|---|
| AFR | 0.4875 | 0.0221 | 0.4875 | 0.022 | 0.4875 | 0.0221 | 0.666 | 0.1764 | 0.4875 | 0.0221 |
| AMR | 0.491 | 0.1201 | 0.491 | 0.120 | 0.491 | 0.1201 | 0.4049 | 0.0725 | 0.491 | 0.1201 |
| EUR | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| EAS | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| SAS | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| ASI | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| ALL | 1.36E-20 | 4.25E-10 | 5.26E-20 | 4.25E-10 | 5.26E-20 | 4.25E-10 | 1.17E-20 | 9.28E-06 | 1.78E-20 | 4.25E-10 |

Table S7: FST values in the IZUMO1 gene between the five larger population groups for the entire set of 2504 individuals sampled in the 1000 Genomes project. These FST values were calculated using SNPs with a MAF of at least 1%. For comparison, a genome-wide FST value for the human genome is 0.12. The average of all pairwise values is 0.150.

| | EUR | EAS | AMR | SAS | AFR |
|---|---|---|---|---|---|
| EUR | | 0.296 | 0.023 | 0.020 | 0.080 |
| EAS | 0.296 | | 0.196 | 0.224 | 0.447 |
| AMR | 0.023 | 0.196 | | 0.004 | 0.123 |
| SAS | 0.020 | 0.224 | 0.004 | | 0.085 |
| AFR | 0.080 | 0.447 | 0.123 | 0.085 | |

Table S8: FST values in the JUNO gene between the five larger population groups for the entire set of 2504 individuals sampled in the 1000 Genomes project.