Making Sense out of Y Chromosome Polymorphisms
Or why males are so complicated...
Thomas Krahn

Human Y Chromosome Basics
➔Only in males (exceptions)
➔Inherited in strict paternal line
➔About 58 million bases long
➔Only ~27 Mbases sequenced
➔Highly repetitive
➔Contains pseudoautosomal regions
➔Largest palindromes in human genome

Large Scale ChrY Changes
➔Insertions / Deletions
➔Whole chromosome duplications
➔Ring Y chromosomes
➔Inversions
➔Translocations / Fusion chromosomes
Peter H. Vogt: AZF deletions and Y chromosomal haplogroups. Hum. Reprod. Update 11 (4): 319-336. doi:10.1093/humupd/dmi017

➔Microscopic karyotype
➔FISH with target specific fluorescent probes
➔Male infertility
➔Gender determination (Sports / Olympics)
Turner Syndrome 45,XO : 46,XY : 46,XX = 50:30:20
Premi S, Srivastava J, Panneer G, Ali S (2008) Startling Mosaicism of the Y-Chromosome and Tandem Duplication of the SRY and DAZ Genes in Patients with Turner Syndrome. PLoS ONE 3(11): e3796. doi:10.1371/journal.pone.0003796

Y Chromosome Repeats
Y-STR (DYS19, DYS385, DYF399)
Mini satellites (MSY)
Inverted repeats
Palindromes
Multi palindromes
Parallel repeats (TSPY)
Schaller F, Fernandes AM, Hodler C, Münch C, Pasantes JJ, Rietschel W, Schempp W (2010) Y chromosomal variation tracks the evolution of mating systems in chimpanzee and bonobo. PLoS ONE 5(9): e12482. doi:10.1371/journal.pone.0012482
Peter H. Vogt: AZF deletions and Y chromosomal haplogroups. Hum. Reprod. Update 11 (4): 319-336. doi:10.1093/humupd/dmi017

Y-STR
➔Classical paternal line sibling test
➔Interesting for genealogists (surname correlation)
➔Isolation of a male profile from a mixed trace
➔No contamination problems with female lab personnel
➔Ready-made multiplex kits available (Powerplex Y, Yfiler, Argus Y)
➔Number of markers not sufficient for genealogists because they demand higher resolution

Adding More Markers to PPY
DYS426 and DYS388 are usually slow mutators, but in some haplogroups their mutation frequency suddenly increases.
They have been in the FTDNA database right from the start, but they are absent from the PPY kit.
PPY has some gaps in the JOE and TAMRA (TMR) dye channels, just enough to fill with DYS426 and DYS388.

More single copy Y-STRs
➔Quick & easy to score
➔Not severely influenced by recombination
➔Easy and understandable comparisons for genealogists
➔Plenty of Y-STRs published
➔Many of them have standardized nomenclature (NIST)
➔FTDNA was always market leader in number of Y-STRs (12, 25, 37, 67 and 111 marker panels plus specialty Y-STRs)
➔My goal was always to have ALL markers that the competitors had so that FTDNA customers could compare with all databases.

Why So Many Y-STR?
➔Huge surname projects with 800+ family members
➔Find splits in closely related Y lines
➔Predict haplogroups from Y-STR haplotypes
➔Consistency checks across panels
➔Precisely map Y chromosome deletions

Special STRs: Multi Copy Y-STR
DYS725: Difficult to interpret dinucleotide repeat, but just a few 100 bp next to DYS464. Good to verify unusual DYS464 results.
DYF408: 188 bp segment doesn't actually contain STR repeat units. Good to calibrate molar equivalents.
DYF397: Asymmetric P1/P3 palindromic Y-STR. 2 copies on P1 and 2 copies on P3. Good to distinguish different deletions / duplications.
DYS385, DYS464, DYF399, DYS425, DYF408

DYS385 Kittler Protocol
Kittler R, Erler A, Brauer S, Stoneking M, Kayser M (2003) Apparent intrachromosomal exchange on the human Y chromosome explained by population history. Eur. J. Hum. Genet. 11(4): 304-14.

Using Adjacent SNPs to Separate Loci of Multicopy Y-STRs
[Figure: panels labelled Fluorescein, JOE and TAMRA]
DYS464 Extended Test (DYS464X)

Typing of DYS464X

Y hgroup   DYS464
A          11g-13g-13g-16g
E          14g-15.3g-17g-18g
E3b1       14g-15.3g-17g-18g
G          13g-14g-15g-15g
G2*        12g-12g-12g-13g
I          12g-14g-15g-16g
I1a        12g-14g-14g-16g
I1a3       12g-12g-14g-14g-15g-16g
I1b        11g-14g-14g-14g
I1b        11g-14g-14g-15g
I1b2a      11g-14g-14g-15g
I1b2a1     11g-11g-14g-15g
I1c        14g-15g-15g-16g
J2a1*      12g-13g-15g-16g-16g-16g
N          14g-14.3g
R1a1*      12g-15g-15g-16g
R1b        16c-16c-16g-16g
R1b        15c-16c
R1b        15c-15c-17c-17g
R1b        14c-16c-17c-17g
R1b        14c-15c-16g-17c
R1b        16c-16g
R1b        15c-15c-16c-16g
R1b        14c-15c-17c-17g
R1b        15c-16c-17g-17g
R1b        15c-16c
R1b        15c-15c-15c-15c
R1b        15c-15c-17c-17g
R1b        15c-17c-17c-18g
R1b        15c-15c-17c-18g
R1b        15c-15c-16g-17c
R1b        16c-16c-17c-17g
R1b        14c-15c-15c-15g
R1b        15c-15c-16c-18g
R1b        15c-15c-16c-17.1g
R1b        15c-15c-16g-17c
R1b        13c-15c-17c-17g
R1b        15c-15c-17g-17g
R1b        15c-16c-16c-18g
R1b        15c-15c-16c-17c

R1b usually has 3 C-type alleles and one G-type allele.
Other haplogroups have only G-type alleles.
Exceptions are most likely products of recLOH.

Palindromic Map
RecLOH: Recombination driven Loss Of Heterozygosity
[Figure: map of the P1 palindrome between centromere and telomere. Both arms carry N.N., DYF397, DYF401, DYF387, DYS459, DYF385, DYS724, DYF371, DYF408 (188 bp), DYF399, DYS464 and DYS725, annotated as C-type or T-type. Example allele values are 9 / 39 / 14 on the centromere side and 10 / 36 / 16 on the telomere side; in the second panel the values 10 / 36 / 16 are overlaid on 9 / 39 / 14.]

P1/P2 Deletion Mechanism
Symmetry in the red/red (P1/P2) region allows for another irregular conformation:
[Figure: circle conformation. Labels: P3, recombination breakpoint, DYF397, DYF399 (T-type, ins G), DYS464 (C-type, G), DYS725, DYF408 (188 bp), N.N., DYF371, DYS724, DYF385, DYF387, DYF401]
The circular DNA molecule can't replicate on its own and gets lost in the next cell cycle.
[Figure: the resulting deletion]

Special Y-STRs: DYS389
DYS389 I+II fusion repeat observed
[Figure: repeat blocks labelled TCTG, TCTA, TCTG, TCTA]

SNPs are also affected by ChrY Self-Recombination
[Figure: L88 region in haplogroup J-L26/L27]
[Figure: L88 region in haplogroup E-M2]
[Figure: ChrX and ChrY at L88; L88 region of highly similar ChrX sequence]

Y-SNPs and Haplogroups
➔Haplogroups are defined by "stable" Y-SNPs
➔YCC haplogroup tree (most parsimonious tree)
➔Hundreds of refinements and additions
➔The same characteristic mutation often shows up in completely distinct branches of the tree (.2)
➔Parallel and back mutations happen in real life
➔Those can often be explained by recombination events

Keeping Track of New Y-SNPs and Y Tree Changes
➔Ymap Y chromosome browser contains information about most published Y markers
➔Don't add new marker names when they already exist
➔Info about location, base change, primers, hg association and palindrome position
➔Based on gbrowse
➔Instantly synchronized with our LIMS db
http://ymap.ftdna.com

➔Ytree (Draft Y chromosome tree)
➔Node based structure
http://ytree.ftdna.com
➔New SNPs found are instantly added
➔Automatically keeps a traceable change log

Walk Through the Y Project
[Figure: bar chart, haplogroups A to T on the x-axis, y-axis from 0 to 90; 230 WTY participants from mainly European haplogroups]
Coverage (currently) ~200 kb of Sanger sequences
On average 1.2 new SNPs per participant found
Verification and mapping of new mutations on Ytree

Designing PCR Primers for ChrY
Input Segments: target location
fastacmd: +/- 500 bp
preset P3 params.
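
The "+/- 500 bp" step of the primer design slide can be illustrated with a minimal Python sketch. It assumes the step means "cut out the target plus 500 bp of flanking sequence on each side as the primer design template"; the slides do not show the actual implementation. The function names `flanked_window` and `extract_template` are hypothetical, the reference sequence is passed in as a plain string instead of being fetched with fastacmd, and coordinates are assumed to be 1-based and inclusive.

```python
def flanked_window(chrom_len, target_start, target_end, flank=500):
    """Pad a 1-based inclusive target interval by `flank` bases on each side.

    The window is clamped to the sequence ends, so a target close to the
    start or end of the reference simply gets a shorter flank there.
    Hypothetical helper for illustration only.
    """
    if not (1 <= target_start <= target_end <= chrom_len):
        raise ValueError("target must lie inside the reference sequence")
    start = max(1, target_start - flank)
    end = min(chrom_len, target_end + flank)
    return start, end


def extract_template(sequence, target_start, target_end, flank=500):
    """Return (template, target_offset, target_length).

    `template` is the target plus flanks, `target_offset` is the 0-based
    position of the target inside the template, and `target_length` is the
    number of target bases. Offset and length are the kind of values a
    primer design tool needs so that primers are placed around the target.
    """
    start, end = flanked_window(len(sequence), target_start, target_end, flank)
    template = sequence[start - 1:end]  # convert 1-based inclusive to a slice
    target_offset = target_start - start
    target_length = target_end - target_start + 1
    return template, target_offset, target_length
```

For a single-base target at position 1000 of a 2000 bp reference, `extract_template` returns a 1001 bp template with the target at offset 500. The offset and length could then be handed, together with preset parameters, to whatever primer design tool "P3" refers to; that hand-off is not shown because the slide gives no details about it.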