Review Comparative Genomics of the MHC: Glimpses Into the Evolution Of
Total Page:16
File Type:pdf, Size:1020Kb
Immunity, Vol. 15, 351–362, September, 2001, Copyright 2001 by Cell Press Comparative Genomics of the MHC: Review Glimpses into the Evolution of the Adaptive Immune System Martin F. Flajnik1 and Masanori Kasahara2,3 especially the section adjacent to the class I region, as 1Department of Microbiology and Immunology the “inflammatory region” (Figure 3) and have suggested University of Maryland at Baltimore that linkage of such genes to each other and to class Room 13-009, 655 West Baltimore Street I/class II genes is somehow advantageous. Yet, there Baltimore, Maryland 21021 is no evidence to support a model that the genes must 2Department of Biosystems Science be linked; many genes in the class III region have no School of Advanced Sciences obvious connections to the adaptive or innate immune The Graduate University for Advanced Studies systems (Klein and Sato, 1998). Hayama 240-0193 One test of whether the linkage of immune-related Japan genes might be beneficial to the organism is to study whether such genes have remained physically linked over long periods of evolutionary time. Here we describe MHCs of nonmammalian vertebrates, emphasizing those Summary features that seem universal as well as those character- istics unique to each group of animals. We also specu- MHC gene organization (size, complexity, gene order) late on MHC origins, concentrating on the role of ancient differs markedly among different species, and yet all en bloc duplications in the genesis of the MHC. nonmammalian vertebrates examined to date have a true “class I region” with tight linkage of genes encod- Origins of the Adaptive Immune System ing the class I presenting and processing molecules. Class I and class II genes have been isolated from all Three paralogous regions of the human genome con- jawed vertebrates (gnathostomes) studied to date (Ka- tain sets of linked genes homologous to various loci sahara et al., 1995; Flajnik et al., 1999a), the cartilaginous in the MHC class I, class II, and/or class III regions, fish (e.g., sharks) being the oldest group (Figure 1). In providing insight into the organization of the “proto fact, all of the genes that define adaptive immunity, MHC” before the emergence of the adaptive immune including T cell receptors (TCR), immunoglobulins (Ig), system in the jawed vertebrates. and MHC (class I/II, LMP, TAP, TAPBP), as well as the recombination-activating genes (RAG1, RAG2) and the Introduction (unknown) mutational machinery acting upon the anti- The human major histocompatibility complex (MHC) is gen receptor genes, are all present in these oldest jawed a large genetic region of almost 4 megabases with well vertebrates (Figure 1, reviewed in Laird et al., 2000). over 100 genes, up to 40% of which are involved to some The two living forms of the older, jawless vertebrates degree in immunity (The MHC Sequencing Consortium, (agnathans), hagfish and lamprey, have yielded no such 1999; Trowsdale, 2001). The genes are historically as- genes, molecules, or mechanisms so far despite exten- sembled into three major regions, class I, class II, and sive scrutiny; furthermore, these animals lack the pri- class III (Figures 2 and 3). The class I region encodes mary and secondary lymphoid organs (thymus and the polymorphic, classical class I (or class Ia) genes, spleen) found in all jawed vertebrates. Thus, the adap- two lineages of nonclassical class I (or class Ib) genes, tive immune system as we know and cherish it arose in and other genes unrelated to the immune system. The its entirety seemingly over a relatively short period of class II region contains the polymorphic class II ␣ and geological time (Bernstein et al., 1996). Some believe  genes, as well as genes encoding the ␥ interferon- that the introduction of a RAG-dependent transposable inducible (immune) proteasome components (LMP2 and element into an existing Ig variable (V)-like gene initiated LMP7) and transporters associated with antigen pro- the adaptive immune system and that a method to gen- cessing (TAP1 and TAP2) that are involved in the produc- erate diversity of antigen-recognition elements became tion and transport of peptides that bind to class Ia mole- manifest (e.g., Agrawal et al., 1998). Because the RAG- cules, respectively. The so-called “extended class II mediated rearrangement break occurs in the part of the region” includes mostly nonimmune-related genes but V gene exon encoding CDR3, and this region is in the does encode the TAP binding protein (TAPBP or ta- center of the TCR that interacts with peptides bound pasin), which is essential for class Ia biosynthesis, and in MHC molecules, it was proposed that ␣/ TCR-like RXRB, one transcription factor that regulates class I molecules may have arisen first in evolution, followed expression (Dey et al., 1992). The central class III region by Ig and ␥/␦ TCR that recognize antigen in its free is the most gene-dense and well-conserved (colinear) form (Davis and Bjorkman, 1988). However, it has been section of the MHC in mammals, with the presence of difficult, using phylogenetic analyses, to discern which various genes involved in innate/adaptive immunity antigen receptor is oldest (Richards and Nelson, 2000), such as the complement components C4 and factor B and as described, the oldest vertebrates with an adap- (Bf), the inflammatory cytokine tumor necrosis factor tive immune system have all of the rearranging antigen (TNF), and HSP70 (among others, Figure 3). Gruen and receptor families found in mammals. Weissman (1997) have christened the class III region, Similarly, it is not known whether class I or class II molecules arose first in evolution. Phylogenetic trees 3 Correspondence: [email protected] made by two groups argue for a class II ancestor (re- Immunity 352 Figure 1. Phylogenetic Relationships among Extant Deuterostome Classes and Phyla Species discussed in the text are shown in parentheses. Divergence times are based on molecular data compiled by Kumar and Hedges (1998). The 2R hypothesis assumes two rounds of genome-wide duplication at indicated stages. It is a matter of debate whether hagfish and lampreys are monophyletic. Mya, million years ago. viewed in Hughes and Yeager, 1997; Klein and Sato, Rached et al., 1999). Of over 100 genes located in the 1998), proposed previously on thermodynamic grounds, HLA complex, nearly 40 genes have one, two, or three i.e., a homodimer with each chain having two domains paralogous copies on specific regions of chromosomes arising first, evolving into a class II heterodimer and 1, 9, and 19, namely 1q21-q25/1p11-p32, 9q32-q34, and finally into class I (Kaufman, 1989). Yet, the trees have 19p13.1-p13.3 (for a complete listing of MHC genes with little statistical support, and it is also possible that the such paralogous copies, see Tables S1–S3 [see Supple- peptide binding region (PBR) structure existed in an mental Tables online at http://www.immunity.com/cgi/ ancestor predating both class I and class II and that content/full/15/3/351/DC1]). With the exception of chro- exons encoding a proto-PBR were shuffled onto an exon mosome 1, the paralogous region is basically confined encoding an Ig-like C1-set domain (Flajnik et al., 1991). to either a long or short arm. There is compelling evi- Finally, it is not known whether classical or nonclassical dence that human chromosome 1 underwent a pericen- class I molecules arose first in evolution, or ancestrally tric inversion after the divergence of the human and whether the PBR groove was used to bind antigen, pep- chimpanzee lineages, which is probably responsible for tidic or otherwise. The nonclassical class I molecules the occurrence of the paralogous regions on both arms CD1 and FcRn are just as old as the peptide binding of chromosome 1. class I, and thus the ancestral class I-like molecule may Figure 2 shows that the genes with paralogous copies have had a function outside the immune system (dis- on chromosomes 1, 9, or 19 are distributed from the cussed below). Such problems can only be addressed extended centromeric border of the HLA to its extended by examining the molecular evolution of the adaptive telomeric border containing a class Ib gene, designated immune system. HFE (the gene, the mutation of which causes hemochro- matosis). Therefore, the paralogous region on chromo- some 6 encompasses the entire MHC and spans at least megabases. Gene families such as NOTCH and PBX 8ف Genome Paralogy and the MHC Genes within a single species that arose by duplication have copies on all four paralogous regions including the are termed paralogous genes or paralogs. Close inspec- MHC. However, the majority of gene families listed in tion of the vertebrate genome occasionally reveals that Tables S1–S3 have copies only on two or three paralo- closely linked sets of paralogous genes are located on gous regions. This suggests the existence of gene fami- more than two, and typically four, different chromo- lies that share paralogous copies only among chromo- somes (Nadeau and Kosowsky, 1991; Lundin, 1993). somes 1, 9, and 19 or between two of them; indeed, a This phenomenon is referred to as genome paralogy, survey of the NCBI database revealed more than 40 such and the chromosomal segments that display genome gene families (Table S4). Thus, when the information in paralogy are called paralogous regions. One of the major Tables S1–S4 is combined, more than 80 gene families developments in the field of MHC genomics is the dis- share at least two paralogous copies among the specific covery that the human genome contains at least three regions of chromosomes 1, 6, 9, and 19. We call these regions paralogous to the MHC (Figure 2, Kasahara et gene families the members of the MHC paralogous al., 1996; Katsanis et al., 1996; Kasahara, 1999; Abi- group.