The Human Y Chromosome: an Evolutionary Marker Comes of Age
Total Page:16
File Type:pdf, Size:1020Kb
REVIEWS THE HUMAN Y CHROMOSOME: AN EVOLUTIONARY MARKER COMES OF AGE Mark A. Jobling* and Chris Tyler-Smith‡ Until recently, the Y chromosome seemed to fulfil the role of juvenile delinquent among human chromosomes — rich in junk, poor in useful attributes, reluctant to socialize with its neighbours and with an inescapable tendency to degenerate. The availability of the near-complete chromosome sequence, plus many new polymorphisms, a highly resolved phylogeny and insights into its mutation processes, now provide new avenues for investigating human evolution. Y-chromosome research is growing up. SATELLITE DNA The properties of the Y chromosome read like a list of genes (FIG. 1):but how complete is this catalogue, what A large tandemly-repeated DNA violations of the rulebook of human genetics: it is not proportion of genes can be identified and what do they array that spans hundreds of essential for the life of an individual (males have it, but do? Efforts to discover genome-wide sequence varia- kilobases to megabases. females do well without it), one-half consists of tandemly tion have identified vast numbers of Y-specific single RECOMBINATION repeated SATELLITE DNA and the rest carries few genes, and nucleotide polymorphisms (SNPs): the Ensembl The formation of a new most of it does not recombine. However, it is because database lists 28,650 at the time of writing, which combination of alleles through of this disregard for the rules that the Y chromosome is might seem enough to provide an extremely detailed meiotic crossing over. such a superb tool for investigating recent human evolu- PHYLOGENETIC TREE of Y-chromosomal lineages. But how Some authors include intrachromosomal gene tion from a male perspective and has specialized, but many of these SNPs are real, and how many are arte- conversion under this heading. important, roles in medical and forensic genetics. facts that are produced by unknowingly comparing As this has been shown on the The Y chromosome needs to be studied in different true Y-chromosomal sequences with similar sequences Y chromosome, they prefer not to ways from the rest of the genome. In the absence of (PARALOGUES) elsewhere on the same or other chromo- refer to it as 'non-recombining'. RECOMBINATION,genetic mapping provides no informa- somes2? Also, are these SNPs a representative set of EUCHROMATIN tion and physical analyses are of paramount impor- sequence variants from the human population as a The part of the genome that is tance, so the availability of the near-complete sequence whole? The answer is no, because of ascertainment decondensed during interphase, of the EUCHROMATIC portions of the chromosome might bias (BOX 1) in the range of populations that were sur- which is transcriptionally active. have even greater impact here than on the rest of the veyed for variation. As well as this possible treasure trove genome. In this review, we are mainly concerned with of unverified SNPs, the availability of Y-chromosome *Department of Genetics, the application of Y-chromosomal DNA variation to sequence means that there are now more than 200 University of Leicester, investigations of human evolution, but there is also con- binary polymorphisms that are well characterized, University Road, siderable overlap with other areas. Questions about phe- and 100–200 potentially useful new MICROSATELLITES,as Leicester LE1 7RH, UK. ‡The Wellcome Trust notype are relevant because phenotypic effects can lead well as the ~30 published polymorphic tri-, tetra- and Sanger Institute, Wellcome to evolutionary changes being influenced by natural pentanucleotide repeat markers. Finally, there is a Trust Genome Campus, selection rather than by gene flow and genetic drift. robust and developing phylogeny of Y-chromosomal Hinxton, Cambridgeshire It seems that the more we know about the Y chro- haplotypes (FIG. 3) that are defined by binary polymor- CB10 1SA, UK. 1 e-mails: [email protected]; mosome the more questions we have. Its sequence phisms (haplogroups), and a unified nomenclature sys- 3 [email protected] allows us to get a comprehensive picture of its struc- tem that allows diversity data from different research doi:10.1038/nrg1124 ture and organization, and to compile a catalogue of groups to be readily integrated. 598 | AUGUST 2003 | VOLUME 4 www.nature.com/reviews/genetics © 2003 Nature Publishing Group REVIEWS Structural features Protein-coding genes Phenotype Point Copy Expression mutation Locationnumber pattern or Deletion Satellites gene loss >99% X–Y 0 Y–Y repeats 1 estis PAR1 1 >1 Ubiquitous T Other Inversion polymorphism 2 SRY Sex reversal RPS4Y ZFY 3 TGIF2LY 4 PCDH11Y 5 TSPYB None 6 IR3 AMELY None TBL1Y None 7 PRKY None IR1 8 9 TSPYA array 10 IR3 11 cen AZFa 12 USP9Y Infertility Mb DBY 13 UTY TMSB4Y P8 VCY, VCY 14 NLGN4Y 15 P7 16 P6 17 XKRY AZFb P5 CDY2, CDY2 18 XKRY P4 HFSY 19 CYorf15A, CYorf15B SMCY DYZ19 20 EIF1AY PHYLOGENETIC TREE RPS4Y2 21 RBMY1, RBMY1 A diagram that represents the IR2 RBMY1, RBMY1 evolutionary relationships IR2 PRY between a set of taxa (lineages). 22 P3 RBMY1, RBMY1 IR1 PRY BPY2 PARALOGUES 23 P2 DAZ1 Sequences, or genes, that have DAZ2 originated from a common CDY1 24 BPY2 ancestral sequence, or gene, by a P1 DAZ3 duplication event. 25 DAZ4 BPY2 MICROSATELLITE CDY1 26 AZFc A class of repetitive DNA None Yqh sequences that are made up of PAR2 tandemly organized repeats Figure 1 | Y-chromosomal genes. Our present view of the Y chromosome, which is based on DNA sequence information that that are 2-8 nucleotides in was derived largely from a single HAPLOGROUP-R individual1. From left to right: cytogenetic features of the chromosome and their length. They can be highly approximate locations, which are numbered from the Yp telomere. Structural features include three satellite regions (cen, DYZ19 polymorphic and are frequently and Yqh), segments of X–Y identity (PAR1 and PAR2; dark brown) and high similarity (mid brown), and Y–Y repeated sequences used as molecular markers in population genetics studies. in which the regions with greatest sequence identity are designated ‘IR’ for ‘inverted repeat’ and ‘P’ for ‘palindrome’. An inversion polymorphism on Yp that distinguishes haplogroup P from most other lineages is indicated. The locations of the 27 HAPLOGROUP distinct Y-specific protein-coding genes are shown; some are present in more than one copy and their expression patterns are A haplotype that is defined by summarized. Pseudoautosomal genes and Y-specific non-coding transcripts are not shown. On the right, the phenotypes that 111 binary markers, which is more are associated with gene inactivation or loss are indicated; some deletions produce no detectable phenotype (black) and stable but less detailed than one represent polymorphisms in the population, whereas others result in infertility (AZFa, AZFb and AZFc) (red), although the defined by microsatellites. contributions of the individual deleted genes are unclear. NATURE REVIEWS | GENETICS VOLUME 4 | AUGUST 2003 | 599 © 2003 Nature Publishing Group REVIEWS Analysis of the near-complete euchromatic sequence Y chromosome than elsewhere in the nuclear genome, of the Y chromosome has called into question many of which is indeed observed4,5.We also expect it to be more the cliches that were previously associated with this chro- susceptible to genetic drift, which involves random mosome1.Also, this new knowledge of DNA sequence, changes in the frequency of haplotypes owing to sam- genes, polymorphisms and phylogeny, provides a starting pling from one generation to the next. Drift accelerates point for a new generation of studies into Y-chromosome the differentiation between groups of Y chromosomes in diversity in human populations, mutation processes and different populations — a useful property for investigat- the role of the Y chromosome in disease. In this review, ing past events. However, because of drift, the frequencies we briefly cover mutation processes and disease studies, of haplotypes can change rapidly through time, so, quan- and point the reader to more detailed coverage in tifying aspects of these events, such as admixture propor- the bibliography, but largely focus on the use of the Y tions between populations or past demographic changes, chromosome as a marker in studies of human evolution. might be unreliable. Geographical clustering is further influenced by the Special features of the Y chromosome behaviour of men, who are the bearers of Y chromo- Why focus on a piece of the genome that only tells us somes. Approximately 70% of modern societies practice about one-half of the population? Because of its sex- patrilocality6,7:ifa man and a woman marry but are not determining role, the Y chromosome is male specific from the same place, it is the woman who moves rather and constitutively haploid. It passes from father to son, than the man. As a consequence, most men live closer to and, unlike other chromosomes, largely escapes meiotic their birthplaces than do women, and local differentia- recombination (BOX 2).Two segments (the pseudoauto- tion of Y chromosomes is enhanced. By contrast, somal regions) do recombine with the X, but these mtDNA, which is transmitted only by women, is amount to less than 3 Mb of its ~60-Mb length; for the expected to show reduced geographical clustering. This purposes of this review,‘Y chromosome’ refers to the has been shown in studies of Europe8 and island non-recombining majority (also variously known as Southeast Asia9, and also in comparisons of populations NRY, NRPY and MSY). The importance of escaping recom- that practice patrilocality or matrilocality (in which men bination is that haplotypes, which are the combinations move and women do not) in Thailand10;here,matrilocal of allelic states of markers along the chromosome, usu- groups show enhanced mtDNA differentiation and ally pass intact from generation to generation.