
Inferring the “Primordial Immune Complex”: Origins of MHC Class I and Antigen Receptors Revealed by Comparative Genomics. This information is current as of September 29, 2021. Yuko Ohta, Masanori Kasahara, Timothy D. O'Connor and Martin F. Flajnik. J Immunol published online 6 September 2019. http://www.jimmunol.org/content/early/2019/09/05/jimmunol.1900597

Supplementary Material: http://www.jimmunol.org/content/suppl/2019/09/06/jimmunol.1900597.DCSupplemental



The Journal of Immunology is published twice each month by The American Association of Immunologists, Inc., 1451 Rockville Pike, Suite 650, Rockville, MD 20852 Copyright © 2019 by The American Association of Immunologists, Inc. All rights reserved. Print ISSN: 0022-1767 Online ISSN: 1550-6606. Published September 6, 2019, doi:10.4049/jimmunol.1900597 The Journal of Immunology

Inferring the “Primordial Immune Complex”: Origins of MHC Class I and Antigen Receptors Revealed by Comparative Genomics

Yuko Ohta,* Masanori Kasahara,† Timothy D. O’Connor,‡,§,¶,‖ and Martin F. Flajnik*

Comparative analyses suggest that the MHC was derived from a prevertebrate “primordial immune complex” (PIC). PIC duplicated twice in the well-studied two rounds of genome-wide duplications (2R) early in vertebrate evolution, generating four MHC paralogous regions (predominantly on human chromosomes [chr] 1, 6, 9, 19). Examining chiefly the amphibian Xenopus laevis, but also other vertebrates, we identified their MHC paralogues and mapped MHC class I, AgR, and “framework” genes. Most class I genes mapped to MHC paralogues, but a cluster of Xenopus MHC class Ib genes (xnc), which previously was mapped outside of the MHC paralogues, was surrounded by genes syntenic to mammalian CD1 genes, a region previously proposed as an MHC paralogue on human chr 1. Thus, this block is instead the result of a translocation that we call the translocated part of the MHC paralogous region (MHCtrans). Analyses of Xenopus class I genes, as well as MHCtrans, suggest that class I arose at 1R on the chr 6/19 ancestor. Of great interest are nonrearranging AgR-like genes mapping to three MHC paralogues; thus, PIC clearly contained several AgR precursor loci, predating MHC class I/II. However, all rearranging AgR genes were found on paralogues derived from the chr 19 precursor, suggesting that invasion of a variable (V) exon by the RAG transposon occurred after 2R. We propose models for the evolutionary history of MHC/TCR/Ig and speculate on the dichotomy between the jawless (lamprey and hagfish) and jawed vertebrate adaptive immune systems, as we found that genes related to variable lymphocyte receptors also map to MHC paralogues. The Journal of Immunology, 2019, 203: 000–000.

*Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, MD 21201; †Department of Pathology, Faculty of Medicine and Graduate School of Medicine, Hokkaido University, Sapporo 060-8638, Japan; ‡Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, MD 21201; §Program in Personalized and Genomic Medicine, University of Maryland School of Medicine, Baltimore, MD 21201; ¶Marlene and Stewart Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201; and ‖Department of Medicine, University of Maryland School of Medicine, Baltimore, MD 21201

ORCID: 0000-0002-0276-1896 (T.D.O.).

Received for publication May 23, 2019. Accepted for publication August 2, 2019.

This project was supported by National Institutes of Health Grants AI140326-26 and AI02877 to Y.O. and M.F.F.

Address correspondence and reprint requests to Dr. Martin F. Flajnik, University of Maryland, Baltimore, 655 West Baltimore Avenue, Room 3-056, Baltimore, MD 21201. E-mail address: mfl[email protected]

The online version of this article contains supplemental material.

Abbreviations used in this article: chr, chromosome; FISH, fluorescence in situ hybridization; huMHCpara, human MHCpara; IgSF, Ig superfamily; L, long; LRR, leucine-rich repeat; MHCpara, MHC paralogue; MHCtrans, translocated part of the MHC paralogous region; NCBI, National Center for Biotechnology Information; NKC, NK complex; PIC, primordial immune complex; S, short; VLR, variable lymphocyte receptor.

Copyright 2019 by The American Association of Immunologists, Inc. 0022-1767/19/$37.50

The “2R hypothesis” has proposed that the early vertebrate genome experienced two rounds of genome-wide duplications (1). Indeed, there are four paralogous clusters of genes in the genomes of all jawed vertebrates, first studied in humans for […] and MHC genes (2, 3). When genes or genetic regions are duplicated, some loci preserve their original function, whereas others are modified (neofunctionalization or subfunctionalization) or may experience differential silencing. Other types of genome modifications may occur, such as translocation of block regions, at times blurring the origins of a particular genetic region.

As mentioned, the MHC was one of the original gene clusters noted for its paralogous regions (or “ohnologues”), found on human chromosomes (chr) 6 (MHC), 1, 9, and 19 (MHC paralogues [MHCpara]) (3, 4). Further analysis using the insulin/relaxin and neurotrophin/neurotrophin receptor family genes revealed that there are additional regions containing paralogous genes in a similar order (5–7), and it has been suggested that the precursors of these regions and MHCpara were syntenic during the preduplication era, but some were translocated over evolutionary time. These detached regions include sections of human chr 12, 14, and 15, and are generally shorter than the original regions; we refer to these detached regions as “minor MHCpara,” and the original four regions as “major MHCpara.”

The MHC harbors many genes involved in adaptive and innate immunity (6, 8). Central to the adaptive immune system, the Ag-presenting MHC class I and class II molecules work in concert with Ag-processing (immunoproteasomes), peptide-transporting (TAP), peptide-editing (DM, TAPBP), and other molecules, to present antigenic peptides recognized by TCR. Precursors of these genes were likely derived from the so-called primordial immune complex (PIC), predating the genome-wide duplications in early vertebrates (9). Indeed, analysis of several invertebrate deuterostome genomes [e.g., amphioxus (Branchiostoma lanceolatum) (10), and a placozoan (Trichoplax adhaerens) (11)] revealed conserved synteny of proteasome and “framework” genes (i.e., nonimmune genes in MHC). To date, and unfortunately, no candidate class I/II genes have been detected in species derived from ancestors predating the jawed vertebrates, and thus most genes strictly involved in adaptive immunity (based on MHC, Ig, TCR) seem to have appeared “suddenly” in a gnathostome ancestor. Because both MHC and MHCpara are derived from a preduplicated precursor region in a common vertebrate ancestor (3, 6, 9), analysis of these regions from different extant vertebrates provides insight into the evolutionary history of the MHC and its precursor.

Previous work on the paralogous regions has focused only on mammals. In this study, we took advantage of the published work in humans and focused on the genome of the amphibian Xenopus. Previous studies showed that the Xenopus genome is relatively stable

and preserves some primordial features that were lost in other vertebrates (12), thus serving as a complementary model system to study genome evolution. We used the true diploid Xenopus tropicalis (13) and especially the tetraploid Xenopus laevis (14), in which the genomes have been recently sequenced and analyzed. In combination with comparative genomic analyses, we obtained evidence for the timing of emergence of MHC class I/II and AgR genes. We further propose a model for the evolution of the human chr 1q21.1–23.3 region, including the CD1 genes, and reflect on the dichotomy between the jawed and jawless vertebrate adaptive immune systems.

Materials and Methods

Data mining

We examined gene models (i.e., software-generated conceptual translation) in the scaffolds and genome assembly with subsequent manual validation/annotation. Additionally, we performed tblastn to find genes that were overlooked by the gene-finder software at the web portal. Chromosomal location of Xenopus genes was obtained based on the mapped BAC clones using fluorescence in situ hybridization (FISH) methods described elsewhere (15). All information is publicly available through Xenbase (http://xenbase.org) (X. laevis v7.1 and 9.1, X. tropicalis v8 and 9) and the National Center for Biotechnology Information (NCBI) (http://ncbi.nlm.nih.gov). We found inconsistent assemblies among different X. tropicalis versions as well as between X. laevis and X. tropicalis. More extensive mapping has been done with X. laevis chromosomes, and thus the X. laevis genome was largely used for this study. Genomic data from vertebrates other than Xenopus were obtained from various databases in GenBank at NCBI. Gene models from the X. laevis genome are found at NCBI: VJC1_1310 (ACB47447); VJC1_258 (OCT67647); VJC1_406 (OCT69143-7); class Ib112 (XP_018111305); class Ib145 (OCT68671); class Ib16004 (XP_018109328). Note that these gene model-based sequences are predicted and thus may not always reflect the RNA sequence. We found that most Ig superfamily (IgSF) domains encoded within a single exon are reliable, with occasional inaccurate exon-intron boundaries.

Statistical validation of conserved synteny

Synteny probability calculation was performed using the method described by Danchin et al. (16); we calculated the binomial probability that the Xenopus regions of interest are in synteny with their human counterparts or the probability that the genes were organized by chance. This probability is calculated using a binomial probability as:

$$P(X > x) = 1 - \sum_{i=0}^{x} \binom{n}{i} p^{i} (1 - p)^{n - i}$$

where x is the number of homologous genes of human found in the Xenopus regions and p is the proportion of genes in the hypothesized human region (i.e., number of genes divided by 20,199 total protein-coding genes in the human reference GRC38 dataset at NCBI). This gives the probability that our selected Xenopus regions have the same complement of genes as humans by chance. To keep consistency of gene criteria, we obtained protein-coding genes from the Xenopus_laevis_v2 dataset at NCBI.

For all reported statistics, we included both hypothetical and duplicated genes in Xenopus as a conservative probability estimate of synteny, but with or without these gene subsets, all probabilities provide the same interpretation, if not decreasing the probability of synteny by chance.

Results

Two divergent subgenomes in the tetraploid X. laevis

X. laevis is an allotetraploid (4n) species, generated by hybridization of two divergent ancestral diploid (2n) Xenopus species (subgenomes long [L] and short [S]), and thus its genome contains sets of paired, or homeologous, chromosomes (i.e., 1L ∼ 9L and 1S ∼ 9S; n = 18). These two subgenomes have been independently maintained, with no detectable intergenome recombination (14). Genome-wide analysis further revealed that synteny is generally well conserved between L and S chromosomes, but gene loss, when it occurs [often the case for many adaptive immune genes (17)], is much more frequent on S chromosomes (14). Gene content of the L chromosomes is most similar to the genome of the true diploid X. tropicalis. Although most housekeeping genes are present on both chromosomes, most class I (except a few class I–like genes), AgR, and AgR-like genes discussed in this report were diploidized and thus found only on the L chromosomes, and therefore we focused our analyses on the L chromosomes.

Xenopus MHC and identification of major and minor MHCpara regions

The Xenopus MHC was previously mapped by FISH to chr 8 (18) and now is precisely mapped to 8Lq21. To identify Xenopus MHCpara, we used sets of paralogous hallmark genes that were originally used to identify the human MHCpara (huMHCpara) (3) (e.g., notch1, 2, 3, 4; pbx1, 2, 3, 4; rxra, b, g; and complement c3, 4, 5, a2m). Other conserved paralogues such as brd1, 2, 3, 4 were not all detected in the current Xenopus assemblies and thus were excluded from analyses. As in humans, we found the same four sets of clustered paralogous hallmark genes on Xenopus chromosomes: 8Lq21 (MHC), 4Lq24-25, 8Lp11-12, and 3Lq33-34, as well as orthologs of the human minor MHCpara on 1Lq and 7Lp23-24 (Fig. 1, Table I; hallmark genes in red).

Catalytically active proteasome β subunit genes are all encoded in Xenopus MHCpara

Proteasomes are the most abundant proteins in the cytoplasm and are required for cytosolic protein degradation and recycling pathways (19). Eukaryote proteasomes form a barrel-shaped catalytic tunnel with two identical outer rings composed of seven α-subunits and two identical inner rings composed of seven β-subunits. Only three β-subunits (PSMB5 [LMPX], PSMB6 [LMPY], PSMB7 [LMPZ]) are catalytically active. Upon immune stimulation, expression of three β-subunits, PSMB8 (LMP7), PSMB9 (LMP2), and PSMB10 (MECL1), is upregulated, replacing the constitutive subunits PSMB5, PSMB6, and PSMB7, respectively, to form the “immunoproteasome” that generates peptides preferable for class I binding (19). Because some prokaryotes possess only one type of β-subunit, it has been proposed that the genes encoding the catalytically active β-subunits, psmb5, 6, and 7, were generated by cis-duplication in a eukaryote ancestor, likely present in the proto-MHC (20, 21); indeed, β-subunit genes are found in linkage groups with MHC framework genes in preduplicated genomes in lower deuterostomes such as amphioxus (10, 21) and the placozoan T. adhaerens (11). All three immunoproteasome genes psmb8, 9, and 10 are encoded in the MHC of many ectothermic vertebrates (12, 22). In humans, only PSMB8 and PSMB9 are found in the MHC (chr 6), and PSMB10 on human chr 16 is the result of translocation out of the MHC. Likewise, the constitutive proteasome PSMB7 maps on huMHCpara-9 (i.e., huMHCpara chr 9) (light blue boxes in Fig. 1, Table I), but other PSMB genes were proposed to be translocated from their original location to other genomic regions outside MHCpara (20).

We found that Xenopus psmb6 maps to 3Lq35, in the vicinity of c3 and notch3, a region corresponding to huMHCpara-19, and we previously reported that Xenopus psmb10 maps in the MHC class III region (Fig. 1, Table I) (12), suggesting that the translocation of psmb6 and psmb10 occurred after the amphibian–mammal divergence. PSMB5 is found on human chr 14q11.2 in the vicinity of TCRA/D (14q11.2) and near the IgH chain (14q32.33) loci. This synteny is well conserved in Xenopus, with psmb5 on chr 1Lq14-15, near tcra/d (1Lq15), igh (1Lq14-15), and igl (λ and σ) (Fig. 1, Table I). As mentioned above, from the distribution of human insulin-relaxin genes (5), this region of human chr 14 is a genetic fragment originally linked to an MHC precursor, but translocated during vertebrate evolution, and is designated as a minor MHCpara

FIGURE 1. MHC class I, AgR, and catalytic proteasome β-subunit genes are found in the human and especially Xenopus major and minor MHC paralogous regions. The locations of MHCpara marker genes correspond well between the human and Xenopus genomes. Two minor MHCpara are also shown because these regions contain significant marker genes (e.g., PSMB5) and other paralogues (e.g., TAPBPL) and thus harbor remnants of the ancestral linkage. Marker, or hallmark, genes are indicated in red, and psmb genes are in light blue. MHC class I/II, AgR, and NCR3 homologs are shown in green, blue, and purple, respectively. A VLR homolog, GP1BB, is also shown with a gray box. Corresponding chromosomes in human and Xenopus are shown side-by-side. NK receptor complexes (NKC, LRC) also map in minor MHCpara. Note that both minor MHCpara shown in this figure are likely derived from the human chr 19 precursor (see Fig. 6).

(6, 7, 20) (Fig. 1, Table I). In summary, unlike in humans, all Xenopus psmb genes encoding catalytic proteasome β subunits map to major or minor MHCpara.

Xenopus MHC class I genes map to the descendants of huMHCpara-6/19 precursor

In Xenopus, a single classical class I (class Ia) gene maps to the MHC (23), whereas a cluster of nonclassical class I (class Ib) genes (xnc) (24, 25) was previously mapped to the telomeric region of the MHC chromosome (18). Now we report three additional nonclassical class I genes in the Xenopus genome designated class Ib112, class Ib16004, and class Ib145, based on their original scaffold numbers in ver 4.1 (Table I). All three are single-copy genes on L chromosomes with typical class I domain structures, but the deduced amino acid sequences lack the evolutionarily conserved peptide-binding residues found in all classical class Ia molecules (Supplemental Fig. 1A); note that class Ib112 is highly divergent from class Ia (see below). In addition, consistent with their designation as nonclassical class I genes, these three class I genes are monomorphic (data not shown), have tissue-specific expression, and are expressed at much lower levels than class Ia (Supplemental Fig. 1E).

Whereas Xenopus MHC class Ia and the xnc cluster map to 8Lq21 and 8Lq31-32, respectively, the class Ib145 gene maps between the MHC and xnc (green box in Fig. 1, Table I). Based on phylogenetic analyses, the class Ib145 gene is intermediate in similarity to the Xenopus class Ia and class Ib genes (Supplemental Fig. 2). Interestingly, the class Ib145 gene is surrounded by genes mapping to human chr 14q13.2 (Supplemental Table I), near huMHCpara-14. The class Ib16004 gene, most related to the xnc genes (Supplemental Fig. 2), maps very near (only four genes apart) to psmb6 on 3q33-34 in an MHCpara (Fig. 1, Table I). The human class Ib gene FCGRT encoding the p51 subunit of the neonatal IgG Fc receptor (FcRn) is found in a similar gene location as Xenopus class Ib16004, but we could not establish orthology between these two genes in phylogenetic analyses or synteny (Supplemental Fig. 2). However, the synteny of genes between class Ib16004 and psmb6 on human chr 17p13 is conserved (probability by chance: 3.33 × 10^-16, Table II), further cementing the ancient class I–proteasome gene linkage. Most likely, this part of the MHCpara was translocated later in the vertebrate lineage.

Most conspicuously, the Xenopus class Ib112 gene maps between psmb5 and IgL on Xenopus chr 1Lq12 (Fig. 1, Table I), the region corresponding to the minor huMHCpara-14 described above that also contains TCRA/D and IgH/L genes. Consistent with its location on the ancient paralogue, class Ib112, like CD1, clusters outside of all other vertebrate class Ia and class Ib genes in the maximum likelihood tree, and somewhat less so in the neighbor-joining tree (Supplemental Fig. 2). We detected reptilian class I genes orthologous to Xenopus class Ib112 (Fig. 2A) that, where it was possible to examine, also map to this interesting paralogous region (Fig. 2B). Upon closer examination of the Xenopus chr 1L region, we found that class Ib112 is surrounded by genes that map to human chr 19p13 (Supplemental Fig. 3). Conservation of synteny was further evaluated, with a probability by chance of <1 × 10^-16 (Table II). It should be noted that the so-called UT class Ib genes in opossum (26) (also with reptilian orthologs) are also linked to the psmb10 gene in an MHCpara (GenBank accession NC_008801.1: region 685896657-705364100 [www.ncbi.nlm.nih.gov]). In summary, all three Xenopus class Ib genes map to MHCpara most likely derived from the chr 6/19 precursor, and two of them are linked to genes encoding constitutive catalytic proteasome β subunits.

Note that the positions of class Ib16004 and class Ib145 in the phylogenetic trees do not conform well to the ancient origins that we propose (Supplemental Fig. 2). At least in the case of class Ib145, its location on the same chromosome as the xnc and MHC might subject class Ib145 to gene conversion events that blur its age (e.g., the high similarity of class Ia to class Ib145 in the N-terminal region of the α2 domain and low similarity in the rest of the molecule, Supplemental Fig. 1A). Because class Ib16004 lies in a paralogous region on a different chromosome than MHC/XNC, its clustering with Xenopus xnc class Ib genes in the trees is difficult to reconcile with its proposed origins at 1R. Considering the numerous class Ib genes in the frog genome (25), we speculate that there may be opportunities for gene conversion or other unknown mechanisms even among nonhomologous chromosomes.

Evidence of en bloc translocation of MHCpara and identification of MHCtrans

As mentioned above, a large cluster of xnc class Ib genes maps to the telomere of the Xenopus MHC chr 8Lq31-32 (18), which is not

Table I. Chromosomal locations of genes in human and Xenopus genomes

MHC and MHCpara

Genes | Human chr. | X. laevis chr.(a) | Scaffold (v7.1) | Position (v7.1)(b) | FISH BAC | Position (v9.1)(b)

MHC-6
TAPBP | 6p21.3 | 8Lq14-21 | 50694 | 6,954,053..6,969,886 | 108L10 | 50,739,635..50,755,022
RXRB | 6p21.3 | 8Lq14-21 | 50694 | 7,102,020..7,117,938 | 106L10 | 50,887,807..50,903,520
PSMB8 | 6p21.3 | 8Lq21 | 75398 | 274,622..283,843 | 290K18 | 78,508,537..78,523,720
PSMB9 | 6p21.3 | 8Sq21 | 12933 | 4,797,175..4,812,351 | 044A14 | 78,508,537..78,523,720
PBX2 | 6p21.3 | 8Lq21 | 75398 | 378,082..396,079 | 114D22 | 51,636,291..51,653,761
NOTCH4 | 6p21.3 | 8Lq21 | 75398 | 337,685..353,721 | 114D22 | 59,569,525..51,611,344
C4 | 6p21.33 | 8Lq21 | 75398 | 475,934..520,806 | 114D22 | 51,733,637..51,778,496
PSMB10 | 16q22.1 | 8Lq21 | 75398 | 524,686..539,832 | 114D22 | 51,782,368..51,796,858

MHCpara-1
NOTCH2 | 1p13-p11 | 4Lq25 | 78978 | 84,826..126,901 | 055J23 | 110,037,275..110,044,215
PBX1 | 1q23 | 4Lq24 | 47606 | 5,480,539..5,556,446 | 036M06 | 99,325,939..99,347,128
RXRG | 1q22-q23 | 4Lq25 | 78978 | 1,407,934..1,480,769 | 055J23 | 111,399,108..111,408,361

MHCpara-9
NOTCH1 | 9q34.3 | 8Lp12 | 37448 | 2,529,375..2,559,128 | 030B08 | 4,177,800..4,228,940
RXRA | 9q34.3 | 8Lp | 255149 | 96,949..211,615 | NA | 5,266,355..5,268,209
PBX3 | 9q33.3 | 8Lp11 | 403228 | 523,205..572,639 | 020L15 | 11,095,878..11,257,315
PSMB7 | 9q33.3 | 8Lp11-12 | 3586 | 2,248,619..2,282,130 | 227M14 | 9,754,478..9,780,669
C5 | 9q33.2 | 8Lp | 86205 | 1,227,102..1,317,602 | NA | 5,816,244..5,865,712

MHCpara-19
NOTCH3 | 19p13.2 | 3Lq33-34 | 171831 | 677,258..734,233 | 079J11 | 125,881,103..125,938,078
C3 | 19p13.3 | 3Lq34-35 | 175714 | 455,613..739,326 | 322O09 | 134,274,206..134,300,156
PSMB6 | 17p13.2 | 3Lq35 | 16004 | 50,127..57,691 | 017J04 | 139,511,604..139,519,183
PBX4 | 19p13.11 | NA | NA | NA | NA | NA

MHCpara-14 (minor)
IgLσ | NA | 1Lq12 | 39437 | 417,923..418,230 | 031N23 | 98,280,301..98,295,182
TRA | 14q11.2 | 1Lq15 | 29869 | 458,946..459,559 | 039F04 | 140,207,982..140,211,379
TRD | 14q11.2 | 1Lq15 | 272406 | 116,704..184,681 | 130J21 | 140,946,814..140,951,210
IgHMC | 14q32-33 | 1Lq14-15 | 13576 | 6,811,972..7,160,435 | 312E22 | 139,040,662..139,059,333
PSMB5 | 14q11.2 | 1Lq14 | 13576 | 6,389,514..6,394,129 | 244A12 | 138,627,523..138,632,499
IgLλ | 22q11.22 | 1Lq21 | 162663 | 1..140,765 | 159H19 | 153,417,276..153,418,351

MHCpara-12 (minor)
TAPBPL | 12p13.31 | 7Lp23-24 | 79772 | 4,980,784..7,959,403 | 225A12 | 7,950,550..7,960,609
LAG3 | 12p13.31 | 7Lp23-24 | 79772 | 5,304,485..5,317,904 | 225A12 | 7,593,489..7,606,908
CD4 | 12p13.31 | 7Lp23-24 | 79772 | 5,359,817..5,371,805 | 225A12 | 7,539,588..7,551,576
A2M | 12p13.31 | 7Lp24 | 131666 | 1,275,208..1,307,453 | 307G18 | 5,334,645..5,366,890
CLEC2B | 12p13.31 | 7Lp24 | 131666 | 693,899..709,661 | 307G18 | 5,932,418..5,948,215

Class Ia/Ib and AgR genes

Gene | Human chr | X. laevis chr.(a) | Scaffold (v7.1) | Position (v7.1)(b) | FISH BAC | Position (v9.1)(b) | Domains

MHC class I and class I–like
112 | | 1Lq12 | 72621 | 122,476..126,293 | 085N05 | 102,130,692..102,139,541 | α1,2,3; α1,2; α2
145 | | 8Lq25 | 265107 | 1,565,727..1,581,290 | 012C13 | 87,117,299..87,129,697 | α1,2,3
Class Ia | 6p21.3 | 8Lq21 | 75396 | 164,448..242,219 | 290K18 | 51,482,854..51,498,908 | α1,2,3
XNC | | 8Lq31-32 | 26819 | 3,427,830..3,826,756 | 156D07 | 110,198,845..110,862,792 | α1,2,3
16004 | | 3Lq35 | 16004 | 123,032..130,911 | 017J04 | 139,582,763..139,592,397 | α1,2,3
CD1 | 1q22-23 | | | | | | α1,2,3
MR1 | 1q25.3 | | | | | | α1,2,3
FCGRT | 19q13.33 | | | | | | α1,2,3
PROCR | 20q11.2 | | | | | | α1,2
ZAG | 7q22.1 | | | | | | α1,2,3
ULBP RAET | 6q25 | | | | | | α1,2,3

AgR-like
1310 | | 8Lp12 | 127590 | 359,968..365,248 | 209G21 | 1,072,952..1,075,438 | VC
258 | | 8Lq14-21 | 50694 | 22,116..25,167 | 106L10 | 43,808,224..43,815,003 | VC
406 | Lost? (1q22) | 8Lq31-32 | 115163 | Multigene family 221,846..1754,674 | 033B12 | 104,468,021..106,003,445 | VC
PTCRA | 6p21.1 | | | | | | C (loss of V?)
IgLκ | 2p12 | 1Lp32-34 | 109418 | 2,725,506..2,725,994 | 213L05 | 9,199,747..9,212,091 | VC
 | | | 3467 | 177,260..183,220 | 146J08 | |
TCRβC | 7q34 | 7Lp23-24 | 230427 | 307,269..307,610 | 191H14 | 315,991...316,317 | VC
TCRγC | 7P14 | 6Lp12-13 | 19169 | 498,099..551,608 | 045F01 | 62,074,212..62,074,523 | VC

NKp30 homolog
NKp30 | 6p21.3 | 4Lq25 | 35524 | Multigene family 2,568,835..2,569,428 | 166F02 | 118,024,408..118,452,984 | V
XMIV | (6p21.3) | 8Lq21 | 75398 | Multigene family 1,530,600..1,631,611 | 154P18 | 52,754,412..52,854,193 | V

(a) Mapping location based on v9.1.
(b) Beginning..end of positions in the scaffolds.

assigned as an MHCpara (Figs. 1, 3, Table I, Supplemental Table I). In the MHC of Xenopus and other nonmammalian vertebrates, low numbers (or only one) of class Ia genes (22) are closely linked to the polymorphic psmb and tap genes (27, 28), forming a primordial “class I region” (29). Coevolution among the genes in the class I region has been suggested: there is a strong linkage disequilibrium

Table II. Probabilistic calculation of Xenopus synteny with human for regions of interest

Region | No. of Genes in Hypothesized Human Region(a) | P(b) | Homologs in Xenopus Region | Total in Xenopus Region | Probability Human and Xenopus Share Genes by Chance
VJC1_1310 | 327 | 1.62 × 10^-2 | 172 | 220 | 3.89 × 10^-15
Class Ib112 | 1158 | 5.73 × 10^-2 | 139 | 279 | <1 × 10^-16
MHC | 216 | 1.07 × 10^-2 | 88 | 106 | <1 × 10^-16
MHC without butyrophilins | 150 | 7.43 × 10^-3 | 85 | 103 | <1 × 10^-16
Class Ib16004 | 35 | 1.73 × 10^-3 | 23 | 51 | 3.33 × 10^-16
GP1BB | 181 | 8.96 × 10^-3 | 66 | 78 | <1 × 10^-16

(a) Based on the human reference GRC38, with 20,199 total genome-wide protein-coding genes.
(b) Proportion of the […] found in the hypothesized syntenic region.

between the bony fish [psmb and class Ia (medaka) (30) and psmb, tap and class I (zebrafish) (31)] and shown functionally in birds [tap and class Ia (32)]. The XNC loci were likely generated via cis-duplication of MHC class I genes and the subsequent translocation to a telomeric location, perhaps to limit recombination/gene conversion between the single MHC class Ia gene and class Ib (xnc) genes. A similar organization is found for the chicken MHC (B locus), where class Ib along with several class II genes map separately from the MHC in the telomeric region of the same chromosome (Y or Rfp-Y locus) (33) (see below). This secondary region also presumably arose by cis-duplication of MHC genes followed by translocation, but the situation in frogs and chicken is thought to have developed via convergent evolution. We further predict that the splitting of Xenopus class Ib genes from the MHC to the telomere likely allowed expansion of xnc genes and drove neofunctionalization. For example, xnc10-restricted NKT-like cells have been identified in Xenopus (34), and other xnc genes have prospective NKT partners (35, 36).

We found that the XNC region contains many genes mapping to human chromosomal region 1q21.1–23.3 (Fig. 3B, Supplemental Table I), specifically a block region surrounding CD1 genes (dotted box in Fig. 1). Previously, the 1q21.1–23.3 region was proposed to be a part of huMHCpara-1 (37). However, the proposed MHCpara regions are spread broadly over human chr 1, presumably because of a pericentric inversion on this chromosome (more details below), and thus the integrity of the conservation of the huMHCpara-1 has been questioned (37).

CD1 molecules are similar to MHC class Ia in their protein structure, association with β-2 microglobulin, and Ag-presentation capacity (38, 39). CD1 molecules, however, do not present peptide Ags to conventional T cells but rather lipid Ags to unconventional T cells such as NKT cells and γδ T cells, and thus are categorized as class Ib (40). Unlike MHC class Ia, which is expressed ubiquitously, CD1 expression is usually limited to APC, and the CD1 Ag-loading machinery is similar to that of MHC class II (41). It was originally proposed that CD1 genes were generated during 2R and subfunctionalized (42). However, the discovery of cd1 genes in the chicken MHC did not conform well to the 2R hypothesis (43–45). So far, two major hypotheses have been proposed to explain the timing of cd1 emergence and genome evolution: Salomonsen et al. (44) proposed that cd1 was generated by tandem duplication of MHC genes at the primordial state (0R), and paralogous copies were silenced in all paralogous regions during genome duplications rather than being a direct product of 2R. Miller et al. (45) proposed that cd1 may have arisen more recently, and cd1 genes were later translocated to an MHCpara in mammals. The discovery that cd1 genes map to Chinese alligator huMHCpara-19 (46) (Fig. 4, Supplemental Table I) strongly suggests that cd1 arose pre-2R (reviewed in Refs. 47 and 48). Our discovery of the human chr 1q21.1–23.3 region containing genes whose Xenopus counterparts map to the XNC locus suggests a compromise scenario in which the block of human 1q21.1–23.2 genes, including CD1, was the result of secondary translocation following the intrachromosomal translocation from the MHC (Fig. 4). One caveat is the synteny of cd1 genes in various bird species, in which the cd1 genes are found in various linkage groups that are not consistent with each other, and most of them are not in MHCpara (Supplemental Table I): human chr 1q25 (mallard and swan goose); 9q22.31 (egret, pigeon, crow, finch, manakin, killdeer, falcon, cuckoo, ibis); and 6q22.31 (eagles). If the synteny on 1q25 and 9q22.31 represents the original location, MHC class I could have existed even in the 0R ancestor (Fig. 1).

In this article, we propose the following scenario (Fig. 4): cd1 was generated by tandem duplication from an MHC class I/II precursor, most likely pre-2R. Subsequently, the class I/II/cd1 genes were cis-duplicated and a block region was translocated to the telomeric region (translocated part of the MHCpara region [MHCtrans]), which allowed expansion of class Ib/cd1 genes. Later, a block region was further translocated to human chr 1q21.1–23.3, coincidentally in huMHCpara-1. During the process, MHC and CD1 loci experienced differential gene loss (loss of MHC class II and CD1 in Xenopus MHCtrans, and loss of MHC genes on human chr 1q21.1–23.3). Finally, expansion of certain genes occurred (class Ib genes [xnc] in Xenopus and CD1 genes in mammalian species including humans). Because most genes mapping to human chr 1q21–23.3 are in the Xenopus XNC region [including KIRREL (49)], whereas all hallmark genes for huMHCpara-1 map to Xenopus 4Lq24-25 with no homologs in both the XNC and 4Lq24-25 regions, translocation seems to be the simplest explanation. Note that the 3′-end of this translocation is at the telomere (Fig. 3A, Supplemental Table I), and the 5′-end contains large clusters of olfactory (OR) and vomeronasal (VNR) genes; both the telomere and repetitive genes may have played a role either in the translocation (especially the telomeric location) or original duplication.

To further examine the evolutionary timing of en bloc translocation of the 1q21.1–23.3 region, we searched for huMHCpara-1 orthologous regions in several representative vertebrates (Fig. 5A). As mentioned earlier, the huMHCpara-1 spreads onto both arms of chr 1, proposed to be partially a result of a pericentric inversion (37). For example, hallmark genes are split onto both arms of chr 1: NOTCH2 maps to 1p13-p11, whereas RXRG and PBX1 map to 1q23.3 (Fig. 5B). Similarly, notch2 maps separately from rxrg and pbx1 in the opossum genome. However, hallmark genes are closely linked in all nonmammalian species (on chr 8 in chicken; on chr 4q25 in Xenopus; and in the elephant shark genome) (Fig. 5B), suggesting that the pericentric inversion must have occurred in a mammalian ancestor. As in Xenopus, orthologous genes on human chr 1q21.1–23.3 are found on chicken chr 25. Therefore, both regions orthologous to 1q21.1–23.3 in chicken and Xenopus are found on different chromosomes, and thus it seems likely that the translocation of 1q21.1–23.3 region occurred

FIGURE 2. Evolutionarily conserved MHC class Ib112 among lower vertebrates. (A) Alignment of the class Ib112 genes from Xenopus and reptiles. Dots show residues identical to X. tropicalis 112. Dashes show deletions. An asterisk (*), 8, and b denote peptide-binding residues that are evolutionarily conserved among classical class Ia, CD8 binding sites, and β-2 microglobulin binding sites, respectively. Typical conserved amino acid residues for IgSF domains are highlighted in blue. GenBank accession numbers (obtained from ncbi.nlm.nih.gov) of the class Ib112: Chmy (Chelonia mydas: green sea turtle) XP_007069382; Pesi (Pelodiscus sinensis: Chinese soft-shell turtle) XP_014430793, XP_006126776, XP_014430792, XP_014430791, XP_014430790, XP_014430790; Chpib (Chrysemys picta bellii: painted turtle) XP_005313900, XP_008175642; Alsi (Alligator sinensis: Chinese alligator) XP_006037953; Almi (Alligator mississippiensis: American alligator) XP_019343116. (B) Conserved synteny of class Ib112 in amphibians and reptiles. Each box indicates a single gene. Red boxes represent the 112 class Ib genes. The number of genes varies depending on the species, and these genes could only be found in amphibians and reptiles. Data were retrieved from NCBI (www.ncbi.nlm.nih.gov/gene/).

FIGURE 3. Human chr 1q21.2–23.3 is likely a translocated MHCpara. (A) Comparative mapping of the Xenopus XNC region (top) and human chr 1q21-23 region (bottom). The gene cluster, including CD1 (purple boxes), mapping to the human chr 1q21-23 region has its counterparts in the Xenopus XNC region. The XNC maps to the telomeric region of the Xenopus MHC chromosome, and this region was proposed to be the result of translocation from the MHC. Other immune genes, such as slamf and fcr-like, are also found in this linkage group, suggesting the ancient linkage of these genes to the PIC (also refer to Fig. 5A). Furthermore, the presence of uninterrupted IgL-like (VJC1406) genes (shown in blue boxes) provides a strong case for ancestral MHC–AgR linkage. Marker genes such as KIRREL are shown in red boxes. Only relevant genes are shown in this figure, and the complete list of X. laevis genes is provided in Supplemental Table I. The solid bar on the far right end of the Xenopus chromosome indicates the telomere. (B) Conserved synteny of novel AgR-like VJC1406 (PRARP) genes among frog, birds, and reptiles. VJC1406 genes, most related to IgL genes, were found in other vertebrate species besides Xenopus and synteny is well conserved. Triangles indicate the 5′ to 3′ gene orientation. Red triangles represent AgR IgL-like genes, VJC1406. The number of genes varies depending on the species, and these loci have been lost in humans and bony fish. Although the genes are present in cartilaginous fish, there is no information on synteny. Synteny is consistent with previously published data (73); however, we focused more on the context of particular genes found in MHCtrans.

after the bird–mammal separation (Fig. 5A).
Note that unlike Xenopus, the chicken MHC is not found on chr 25 (rather on chr 16); however, both chr 16 and 25 are microchromosomes, and we predict that these two chromosomes were split during bird evolution. There seems to have been different genome modifications among mammalian species, having multiple chromosomal breakpoints before the rodent/artiodactyla divergence (data not shown).

In summary, we propose that the CD1 region in mammals is a result of a translocation event, by chance, into huMHCpara-1, and thus there is no strong evidence of class I genes on MHCpara-1 or -9. This is consistent with our hypothesis that a class I precursor gene may have arisen after 1R on only one of the duplicated chromosomes, chr 6/19 (Figs. 4, 6). Contrary to the existing hypothesis that class II predates class I (50–53), we further propose that class I emerged first in evolution because

we have not found MHC class II genes anywhere outside the bona fide MHC or paralogous regions (54).

When did the original MHCtrans (red arrow in Fig. 4) arise in evolution? We found it in amphibians (Fig. 5A), but it may be older. Families of class Ib genes in cartilaginous fish that are currently unmapped (55) may be a part of this original MHCtrans. Besides class I and AgR-like genes (see below) in MHCtrans, other immune-related genes such as fcrl (56, 57) and slamf (58) are also found in this region (Figs. 3, 5). Unlike class I and AgR, however, slamf and fcrl per se are not found in bona fide MHCpara and thus likely emerged soon after 2R in early vertebrates. We further predict that their origin, most likely, is from constant (C) 2–type IgSF precursors that were present in the PIC (e.g., KIR genes found on huMHCpara-19 are also derived from these precursors).

Emergence of AgR precursor in the PIC

Linkage of TCR- and Ig-like genes in association to the primordial MHC has been previously suggested (59, 60). AgRs bear a rare, specialized C1-type IgSF domain (61) like those found in MHC class I/II, and thus one might predict their linkage to the primordial MHC. Human TCRA/D genes are found near PSMB5 (chr 14q11 in Fig. 1, Table I), also suggesting ancestral linkage of TCR to MHC. In the Xenopus genome, in addition to the close linkage of tcrad-psmb5, the igh locus (62) and igl genes (especially the λ isotype) are closely linked (Xen1q in Fig. 1). These locations strongly support the ancestral linkage of precursor AgR genes to the proto-MHC.

AgRs have a variable (V) domain with a signature IgSF “G” β-strand encoded in a separate element; in the germline of the most simple IgL, the V element encodes strands “A–F” and the J (joining) element encodes the “G” strand (61) (also shown in Supplemental Fig. 1B, 1C). It has been proposed that genes containing a single uninterrupted VJ element (i.e., exon) were present in the primordial MHC, near to genes encoding C1-IgSF domains. Genes encoding these VJ and C1 domains likely combined to become AgR precursors (59, 60), and the RAG transposon (63–65) split one of the VJ single genes into separate V- and J- genetic elements (V-J). One candidate for such a precursor is the NCR3 gene encoding NKp30 (66). NCR3 contains a single VJ exon and maps to the MHC in most studied vertebrates (67). In Xenopus, a cluster of ncr3 genes maps to an MHCpara, 4Lq25 (68), whereas there is another set of genes having exactly the same domain structure (xmiv) mapping to the MHC (12) (dark purple boxes in Fig. 1, Supplemental Fig. 3). Whether ncr3 is immediately related to the ancestor of the AgR precursor or not, the xmiv and ncr3 genes are clearly derived from a common VJ precursor gene that was linked to the primordial MHC (Fig. 6, Supplemental Fig. 3) (67). Recently, genes with VJ-C2 structure were discovered in amphioxus, an invertebrate deuterostome (69). Whether these genes are immediate relatives of the VJ ancestor or divergent descendants is debatable; however, one of the lancelet VJ-C2 genes maps adjacent to the kirrel gene, which maps next to CD1 genes in human chr 1q (dotted red box in Fig. 1), strongly supporting its relationship to the VJ precursor.

In addition to the previously identified IgH and L chains, and all four types of TCR genes, there are three novel Xenopus genes that encode a single VJ and a C1-IgSF domain, like TCR or IgL chains in “pre-RAG transposon” state. All three genes are found in MHCpara and we designate them VJC1258, VJC1406, and VJC11310 based on their domain structure and scaffold number in ver 4.1 (light purple boxes in Fig. 1, Table I). VJC11310 is a single-copy gene (Supplemental Fig. 1B) mapping to Xenopus MHCpara-8Lp11-12. Preliminary BlastP analysis exhibited high identity with IgL from various vertebrates with highest similarity to the anole lizard (∼4 × 10⁻³¹), and spiny dogfish (shark; 5 × 10⁻²⁵). VJC11310 was previously reported to be a “germline-joined igl chain” (GenBank accession ACB47447 [www.ncbi.nlm.nih.gov]) (70). However, we mapped all three known rearranging IgL isotypes (λ, κ, σ) to Xenopus chr 1, whereas VJC11310 maps to a different MHCpara region (surrounding genes mapping in the huMHCpara-9 [Supplemental Fig. 3]; linkage probability by chance 3.89 × 10⁻¹⁵ [Table II]), making it highly unlikely that VJC11310 is a bona fide IgL.

VJC1258 is also a single-copy gene (Supplemental Fig. 1C), maps upstream of the MHC, and is expressed in the Xenopus thymus (by northern blotting, data not shown). BlastP analysis using the VJ domain exhibited highest identity with IgL from various vertebrates with the highest match to coelacanth (4 × 10⁻³¹) and large flying fox (2 × 10⁻³⁰), whereas the C domain matched various cartilaginous fish IgH and IgL with much lower E-values ranging from 1 × 10⁻⁹ to 9 × 10⁻⁵. The PreTα (PTCRA) gene, which encodes a single C1-IgSF domain and is so far found only in mammalian species (71), also maps upstream of the human MHC (striped box in Fig. 1). The prediction is that PTCRA originally had a V(J) domain, but it was lost in evolution (72). It is possible that Xenopus VJC1258 was related to a precursor of preTα before loss of the V(J) domain, but phylogenetic analysis of VJC1258 and all AgR including preTα did not support this scheme (data not shown). Moreover, BlastP analysis using the C domain did not select PreTα in any other species, suggesting VJC1258 is not closely related to preTα. Regardless of their function and orthology to other genes, mapping of these AgR-like genes to all MHCpara strongly supports the idea that an AgR precursor was present at the 0R stage (i.e., PIC) (Fig. 6).

We also mapped a cluster of VJC1406 genes (Supplemental Fig. 1D) to the scaffolds with xnc genes in the MHCtrans region

FIGURE 4. Hypothetical scenario for the origins of the CD1 region. We propose that the CD1 gene was originally generated by tandem duplication of MHC class I/II precursor genes in the MHC, followed by subfunctionalization. Subsequently, part of the MHC region was translocated and differentially silenced, leaving MHC class I/II genes in the MHC, with CD1 in the translocated MHC region (MHCtrans). This MHCtrans was later translocated into another chromosome, coincidentally an MHCpara (shown here on human chr 1). Dotted boxes indicate silenced/pseudogenes. 2R, second round of whole-genome duplication. See text for further discussion.

FIGURE 5. (A) Origin of the translocated MHC (MHCtrans) region and its subsequent translocation in placental mammals. The region we describe as MHCtrans is found at the telomere of the MHC chromosome in Xenopus, chickens, and marsupials, suggesting that the translocation of MHCtrans to non-MHC chromosomes occurred in placental mammals. MHCtrans contains other immune genes such as SLAMF, FcR-like, and IgL-like (VJC1406), suggesting that ancestors of these genes were present in the primordial MHC (e.g., C2 and VJ-IgSF–encoding genes). MHC and MHCtrans are shown in solid green and dotted red boxes, respectively. (B) Inferring the timing of the p-q split on human chr 1. Chromosomal locations of the NOTCH2 gene relative to the other marker genes, RXRG and PBX1, correlate with the evolutionary timing of the p-q split between birds and mammals. The timing of the p-q split also correlates well with the translocation of MHCtrans into the 1q21.1–23.3 region (indicated by the hatched box including the KIRREL gene). MR1 is a nonclassical class I that maps to human chr 1, outside of the CD1 region, is found only in the mammalian lineage, and its evolutionary origin is unknown.

along with the genes mapping to human 1q21.1–23.3 (Fig. 3A, Supplemental Table I). Again, linkage of MHC class I to AgR-like genes is clear. We found VJC1406 orthologs in many species of reptiles, birds, and other species; during preparation of our article, VJC1406 orthologs were reported from chicken and named PRARP. PRARP were likely lost in mammals and teleost fish but are present in coelacanth and likely in sharks (73). The authors did not conclude that PRARP were AgR-like genes or MHC associated, but the chicken prarp genes were expressed in lymphocytes and thus potentially have an immune function, and they were proposed as candidates for invasion by the RAG transposon. Regardless of their functions, their synteny is well conserved among different vertebrate species (73) (Fig. 3B). In our study, we found a clear linkage of these genes to MHC class I genes in the MHCtrans region of lower vertebrates (Fig. 3), further confirming the hypothesis that VJ-IgSF were present in the PIC.

In summary, VJ- and C1-IgSF–containing AgR-like genes are present in both major and minor MHCpara regions and MHCtrans, showing that they were present in the PIC before 1R. The consistent linkage of AgR-like and MHC class I genes on chromosomes derived from chr 6/19 after 1R further demonstrates that the presence of AgR precursors in the PIC predates the emergence of bona fide MHC class I genes (Fig. 6).

Evolution of TCR genes

In a previous study, we proposed a scenario for the evolutionary emergence of TCRD/A and IgH genes (6, 62). In this study, we further examined the genome evolution of the TCRB/G genes. Whereas TCRA and TCRD genes are encoded in the minor huMHCpara-14, TCRB and TCRG genes map at both ends of human chr 7 (Fig. 7A, Table III). Hood and colleagues (74) proposed that this split arrangement is an evolutionarily derived situation, and TCRB and TCRG had been originally closely linked, like the extant TCRA/D genes, but were separated via a pericentric inversion. In Xenopus, tcrb and tcrg are found on different chromosomes (tcrb 7Lq23-24; tcrg 6Lp12-13 [Table III]). However, the Xenopus tcrb gene maps

near tapbpl and cd4/lag3 genes, which are found in the NK cell complex (NKC) on human chr 12p13.31 (Figs. 1, 7B, Table III). The NKC is also considered as a minor MHCpara, based on 1) the presence of the marker gene A2M (homolog of C3,4,5) (6); 2) the presence of the TAPBP paralogue, TAPBPL (75) (TAPBP maps to the MHC); 3) mapping of chicken C-type lectin NK receptor genes to the MHC (6, 76, 77), whereas the C-type lectin NK receptor genes map to the mammalian NKC; and 4) studies of neurotrophin gene distribution in jawed vertebrates (7). Thus, tcrb linkage to an MHCpara also suggests an ancestral linkage of TCR precursor genes to the primordial MHC. In contrast, Xenopus tcrg may have been translocated to an unrelated region (chr 6) having no connection to the MHCpara.

We decided to further examine the linkage status of TCRB and TCRG genes in other vertebrate genomes. Other mammals (e.g., pig, mouse), besides humans, have a linkage of TCRB to NKC genes (Fig. 7A, Table III). Linkage of tcrb to the NKC is also seen in birds (e.g., chicken and turkey). Linkage of tcrb to the NKC has not been documented in bony fish: In the primitive bony fish, spotted gar, tcrb is linked to genes on human chr 14q24.1 and 15q15 on LG7, whereas a2m and tapbpl map to LG26. Synteny of tcrg is conserved among vertebrate species; like Xenopus, tcrg was found on a separate chromosome in all nonprimate species. However, in opossum, tcrg is linked to tapbpl, suggesting a remnant linkage of tcrg to NKC.

In summary, the combined data favor the existing hypothesis that TCRB and TCRG were indeed originally linked in minor huMHCpara-12, followed by a chromosome split to human chr 7 and secondary translocation of block regions containing TCRG (Fig. 7B). Alternatively, TCRB and TCRG were differentially silenced after translocation from their original location. In either scenario, the splitting up of the two genes and subsequent translocation(s) were involved in positioning tcrb and tcrg at either end of human chr 7.

Based on the distribution of the orthologous genes found on Xenopus chr 1q (Supplemental Fig. 3), we speculate that huMHCpara-12 split from huMHCpara-14. Also, a block region containing the iglλ gene (human chr 22q11) is derived from huMHCpara-14 (linkage probability by chance <1 × 10⁻¹⁶ [Table II]). Therefore, our analysis suggests that all rearranging AgR are likely derived from the huMHCpara-19 precursor. Invasion of the RAG transposon likely happened on huMHCpara-19 after 2R, splitting the VJ element into separate V and J elements, and the various pairs of AgR genes are suggested to have been generated via cis duplications. This theme is discussed further below (Fig. 8).

Discussion

We have conducted a genome survey for loci involved in adaptive immunity and propose hypotheses for the origins of the PIC (Fig. 6). We also uncovered evidence of an en bloc translocation

FIGURE 6. Emergence and genomic evolution of MHC class I/II and AgR genes. We hypothesize that an AgR precursor was present in the PIC prior to the first round of whole-genome duplication (1R). Subsequently, MHC class I/II arose on the chr 6/19 precursor after 1R. We anticipate that the RAG transposon insertion did not occur until after the second round of genome duplication (2R) and separated the VJ exon into separate exons only in genes on the chr 19 precursor. Hallmark genes are indicated in red and psmb genes are in light blue. MHC class I/II, AgR, complement, and NK receptors are shown in green, blue, orange, and purple, respectively.

FIGURE 7. (A) TCRβ genes are found in the minor MHCpara. Human chr 12p, harboring the NKC, was identified as a minor MHCpara and contains marker genes like TAPBPL and A2M (shown in red and orange boxes). Although no TCR gene maps in the human NKC, TCRβ genes from many species are closely linked to the region orthologous to the NKC, suggesting the linkage of TCR genes to the PIC. Moreover, the TCRγ (Figure legend continues)

Table III. Chromosomal locations of TCRβ and TCRγ genes in various vertebrates

Species     Chromosome   Gene     Position(a)
Human       12p13.2      KLRD1    10,238,385..10,329,607
(NKC)
            12p13.31     A2M      9,067,708..9,115,962
            12p13.31     CD4      6,789,472..6,820,810
            12p13.31     LAG3     6,772,483..6,778,455
            12p13.31     TAPBPL   6,451,655..6,472,006
            7q34         TCRβ     142,299,011..142,813,287
            7p14.1       TCRγ     38,240,024..38,368,055
Pig         5            klrd1    61,583,868..61,596,985
            5            a2m      65,274,903..65,320,342
            5            cd4      66,326,568..66,353,856
            5            lag3     66,364,099..66,369,484
            5            tapbpl   66,649,711..66,658,562
            18           tcrb     7,715,206..7,823,795
            9            tcrg     119,542,537..119,635,982
Mouse       6            a2m      121,636,166..121,679,238
            6            cd4      124,864,693..124,888,248
            6            lag3     124,904,359..124,912,434
            6            tapbpl   125,223,927..125,231,923
            6            klrd1    129,588,092..129,598,775
            6            tcrb     40,891,296..41,558,371
            13           tcrg     19,178,042..19,356,476
Opossum     8            a2m      104,682,643..104,771,506
            8            cd4      108,220,454..108,260,998
            8            lag3     108,170,654..108,179,156
            8            klrk1    113,517,720..113,533,133
            8            tcrb     205,270,812..205,335,586
            6            tapbpl   290,987,908..290,993,624
            6            tcrg     283,848,252..283,942,577
Chicken     1            a2m      76,229,983..76,255,770
            1            tapbpl   76,876,884..76,889,825
            1            lag3     77,194,590..77,202,789
            1            cd4      77,208,503..77,219,970
            1            tcrb     78,071,772..78,072,534
            1            klrdr1   78,423,947..78,430,724
            2            tcrg     49,292,467..49,295,949
Turkey      1            tcrb     74,734,696..74,742,685
            1            lag3     75,575,531..75,581,610
            1            cd4      75,588,055..75,599,611
            1            tapbpl   75,900,408..75,911,755
            1            a2m      79,842,550..79,855,332
            6            tcrg     47,636,597..47,652,020
Salmon      2            tapbpl   10,225,084..10,231,543
            2            klrd1    24,161,287..24,177,629
            2            cd4-2    30,978,314..30,984,887
            2            cd4-1    30,986,632..31,013,703
            9            a2m      108,156,058..108,189,009
            1            tcrb     3,348,168..3,354,302
            20           tcrg     9,074,301..9,083,342
Zebrafish   16           tapbpl   9,899,183..9,911,977
            16           cd4-1    12,021,001..12,055,289
            16           cd4-2    12,057,069..12,072,262
            16           clec     29,030,785..29,042,169
            15           a2m      21,178,237..21,196,748
            17           tcrb     48,395,034..48,401,797 (C)
            2            tcrg     31,873,021..31,902,832 (V)

(a) Beginning..end of positions in chromosomes.

of the loci surrounding the CD1 genes (Figs. 4, 5A). Finally, we provide compelling evidence for the timing of the emergence of MHC class I(/II) and AgR in a gnathostome ancestor (Figs. 6, 7B) and have uncovered nonrearranging AgR-like genes in MHCpara that may be related to the Ig/TCR ancestor.

Emergence of IgSF Ag receptors and PIC

It has been previously predicted that AgR precursor genes were linked to the proto-MHC and translocated later in evolution (59, 60, 78). To address this hypothesis, we mapped AgR/AgR-like genes on Xenopus chromosomes and uncovered several nonrearranging genes with structures similar to TCR and IgL chains: a single uninterrupted VJ-type IgSF domain followed by a C1-IgSF domain. It has also been speculated (60) that modern AgRs were generated by recruitment of C1-IgSF in the preadaptive immune complex followed by the RAG transposon splitting a VJ gene into V- and J- genetic elements (V-J). Thus, extant VJ-IgSF–containing genes are potentially descendants of such precursor genes (69, 73). Like other immune genes directly involved in Ag recognition, all AgR-like genes described in this report are diploidized in the tetraploid X. laevis, and therefore they likely play roles in immunity (18, 73). As mentioned above, NCR3, another gene encoding a VJ-type domain, maps to the human (and other vertebrates, including sharks [M.E. Janes, L. Du Pasquier, M.F. Flajnik, and Y. Ohta, manuscript in preparation]) MHC (Fig. 1), and an amphioxus VJ gene (69) linked to a kirrel homolog further supports the hypothesis that the AgR precursor was present in the PIC at 0R.

Mapping of Ig and TCR genes in several vertebrates to MHCpara indicates that all of the extant AgR seemed to be derived from an ancestral chr 19 paralogue. This suggests that an uninterrupted VJ element was split by the RAG transposon, and after duplication, one duplicate acquired a diversity (D) element, generating paired receptor genes (74). Hood et al. (79) suggested an ancestral VJ homodimer, which, after the RAG transposon invasion and gene duplication, gave rise to a heterodimeric receptor. As proposed by Davis and Bjorkman (80), the original receptor may have been TCR α/β-like, because the RAG rearrangement break at CDR3 makes the most sense for an MHC-restricted AgR (i.e., the most diverse part of the AgR binding to the true Ag, peptide, or another original type of Ag) in the MHC groove. We previously proposed (59) that the original AgR was derived from NK-like receptors that recognized MHC-like molecules encoded either in the PIC or the proto-MHC, and we now provide evidence for such candidate receptors. Subsequent duplication of the paired TCR genes and translocation may have relieved the pressure of MHC restriction, allowing the duplicated receptor to bind free Ags, like γ/δ TCR today. Another duplication in cis may have occurred [as previously suggested (62)] on huMHCpara-14, generating IgH/L by a cis-duplication of the neighboring (TCRA/D) pair: the two sets of loci (TCRA/D and IgH/L) are still linked in extant vertebrates including Xenopus (62).

Class I, CD1, and class II

We also identified novel class I genes and mapped them in MHCpara derived from the chr 6/19 precursor after 1R. Our analyses suggest that MHC class I likely arose after the first round of genome duplication rather than prior to 1R (Fig. 6). The previous proposals (43–45) were partially supported by the presence

gene is also linked to TAPBPL in the opossum genome, further suggesting that a precursor of TCRβ/γ was in the primordial MHC. CD4 and LAG3 (yellow boxes) define the region and contain genes encoding IgSF domains related to MHC. (B) Evolution of TCR and IgH/L from AgR precursors. We predict that a common AgR precursor with an “uninterrupted” VJ exon came together with a C1-IgSF domain that was supplied by neighboring genes in the PIC. RAG insertion split the VJ exon into separate V and J exons, D fragments were generated, and became the TCR/IgH/L precursor, all post-2R. During the 2R duplication, this precursor region was further split into two as the precursors of αδ and γβ TCR lineages, consistent with Hood’s hypothesis (74). The TCRα/δ precursor then cis-duplicated and generated IgH/L, as previously suggested (74). The TCR γβ precursor split and was subsequently translocated as separate genes, as detailed in the text. Chromosome numbers are based on human locations. X denotes gene loss; a dot (•) denotes centromere.
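The chromosome assignments in Table III can be checked with a few lines of code. The sketch below is only an illustration, not the analysis used in the paper: it copies a subset of Table III rows (human, mouse, opossum, chicken) and asks whether tcrb or tcrg shares a chromosome with the marker genes a2m or tapbpl. The names `ROWS`, `MARKERS`, `chromosome_id`, `find`, `shares_chromosome_with_markers`, and `gap_bp` are hypothetical helpers introduced for this example, and sharing a chromosome is a much weaker criterion than the synteny comparison described in the text.

```python
import re

# Subset of Table III rows: (species, chromosome label, gene, start, end).
# Gene symbols are lowercased here for uniform lookup (an assumption of
# this sketch); positions are copied from the table.
ROWS = [
    ("human", "12p13.31", "a2m", 9_067_708, 9_115_962),
    ("human", "12p13.31", "tapbpl", 6_451_655, 6_472_006),
    ("human", "7q34", "tcrb", 142_299_011, 142_813_287),
    ("human", "7p14.1", "tcrg", 38_240_024, 38_368_055),
    ("mouse", "6", "a2m", 121_636_166, 121_679_238),
    ("mouse", "6", "tapbpl", 125_223_927, 125_231_923),
    ("mouse", "6", "tcrb", 40_891_296, 41_558_371),
    ("mouse", "13", "tcrg", 19_178_042, 19_356_476),
    ("opossum", "8", "a2m", 104_682_643, 104_771_506),
    ("opossum", "6", "tapbpl", 290_987_908, 290_993_624),
    ("opossum", "8", "tcrb", 205_270_812, 205_335_586),
    ("opossum", "6", "tcrg", 283_848_252, 283_942_577),
    ("chicken", "1", "a2m", 76_229_983, 76_255_770),
    ("chicken", "1", "tapbpl", 76_876_884, 76_889_825),
    ("chicken", "1", "tcrb", 78_071_772, 78_072_534),
    ("chicken", "2", "tcrg", 49_292_467, 49_295_949),
]

# Marker genes chosen in this sketch to stand for the NKC-region paralogue.
MARKERS = {"a2m", "tapbpl"}


def chromosome_id(label):
    """Drop any cytogenetic band, e.g. '12p13.31' -> '12'."""
    return re.match(r"\d+", label).group()


def find(species, gene):
    """Return the table row for a species/gene pair."""
    for row in ROWS:
        if row[0] == species and row[2] == gene:
            return row
    raise KeyError((species, gene))


def shares_chromosome_with_markers(species, gene):
    """True if the gene lies on the same chromosome as any marker gene."""
    target = chromosome_id(find(species, gene)[1])
    return any(
        row[0] == species
        and row[2] in MARKERS
        and chromosome_id(row[1]) == target
        for row in ROWS
    )


def gap_bp(species, gene_a, gene_b):
    """Base pairs between two genes, or None if on different chromosomes."""
    a, b = find(species, gene_a), find(species, gene_b)
    if chromosome_id(a[1]) != chromosome_id(b[1]):
        return None
    return max(0, max(a[3], b[3]) - min(a[4], b[4]))


for species in ("human", "mouse", "opossum", "chicken"):
    for gene in ("tcrb", "tcrg"):
        print(species, gene, shares_chromosome_with_markers(species, gene))
```

With these rows, human tcrb and tcrg both sit on chr 7, away from the chr 12 markers, whereas mouse and chicken tcrb and opossum tcrg each share a chromosome with at least one marker. `gap_bp("chicken", "tcrb", "tapbpl")` gives the distance separating the two chicken genes on chr 1.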

FIGURE 8. Dichotomy of vertebrate adaptive immune system. Because VLR homologs identified by Pancer (i.e., LRR–carboxy-terminal containing genes) were also mapped in MHCpara regions, we anticipate that a VLR precursor was present at least in the chr 6/19 common ancestor. We argue that the MHC/TCR/Ig system emerged and expanded in the jawed vertebrates soon after 2R as a consequence of the RAG transposon, and the VLR system was superseded (see text).

of CD1 genes on huMHCpara-1. In contrast, we present evidence that the 1q21.1–23.3 region, including the CD1 genes, was secondarily translocated from another location, which itself was translocated from the MHC (MHCtrans) (red arrow in Fig. 6); thus, the presence of CD1 on huMHCpara-1 was likely the result of a chance event and not a genome-wide duplication. There is, however, an alternative explanation: duplication of both MHC and MHCtrans may have been generated on both loci on chr 1 and 6 but differentially silenced during 2R. We think this scenario is unlikely because some housekeeping genes would have remained in other MHCpara as homologs, as we commonly see in the tetraploid X. laevis genome compared with the diploid X. tropicalis (14). KIRREL homologs, KIRREL 2 (19q13.12) and KIRREL 3 (11q24.2), are found in major and minor huMHCpara (68, 78), whereas KIRREL maps to human chr 1q23.1 but maps in Xenopus MHCtrans. Furthermore, kirrel maps adjacent to notch in the Drosophila genome, presumably an ancestral linkage (16). Although this is only one example, the distribution of KIRREL genes adds another layer of support to our hypothesis that the MHCtrans was initially translocated from the MHC (Fig. 5A). The presence of a cd1 gene in Chinese alligator on huMHCpara-19 (46) further suggests that CD1 emerged after 1R but before 2R and was differentially silenced in reptiles and birds (Fig. 4). Regardless of the precise timing of CD1’s emergence, we propose that class II arose later and may have co-opted the CD1 pathway of Ag presentation. We found no class II genes outside of the MHC.

The overarching hypothesis is that all constituents/domains of current adaptive (and some innate) immune genes were genetically linked in the PIC (9), which predated the MHC (6), and these PIC components were “mixed and matched” to generate the precursors of modern immune genes (9), especially the VJ and C1-IgSF domains that are fundamental components of the adaptive immune system (e.g., Igs, TCR, MHC class I/II, B2M) (81–83). It was previously predicted that Ig/TCR/MHC precursor genes originated in the MHC based on preliminary evidence (6, 60). In addition to MHCpara, genes linked in the MHCtrans region also provide an indication of the primordial linkage of AgR/MHC; as mentioned above, other genes, like KIRREL, FcRL, and SLAMF, map to MHCtrans, corresponding to the human 1q21.1–23.3 region (Figs. 3A, 5A, Supplemental Table I). Therefore, other domains such as C2-IgSF (building blocks of FcRL and SLAMF) and B30.2 (building block for butyrophilin) (11) were also present in the PIC and likely used as raw material to generate new sets of immune genes. In addition, the synteny of SLAMF and CD1 genes may be another example of functional clustering, because SLAM family members are involved in NKT cell development in the thymus (84).

Jawless and jawed vertebrate immunological big bangs and the MHCpara

Finally, we also speculate on the dichotomy between the jawless and jawed vertebrate adaptive immune systems. Leucine-rich repeat (LRR) domain-containing variable lymphocyte receptor (VLR) genes are rearranging adaptive immune genes unique to jawless vertebrates (lamprey and hagfish) (85). LRR domains are also present in many other proteins such as TLR (86, 87), which are predicted to be encoded in PIC because toll is linked to MHC paralogous hallmark genes in Drosophila (16). Pancer identified three VLR homologous genes based on the presence of an LRR carboxy-terminal domain (88), and, surprisingly, we found all three genes mapping to MHCpara regions: GP1BB is closely linked to IgLλ on human chr 22q11 (Figs. 1, 6) and Xenopus chr 1q (Supplemental Fig. 3) (linkage probability by chance <1 × 10⁻¹⁶ [Table II]); Xenopus gp1ba and gp9 could not be mapped, but human GP1BA maps closely to PSMB6 on human chr 17p13.2, and GP9 maps on human chr 3q21.3, a region also designated as minor MHCpara (60). Both GP1BB and GP1BA were mapped on chromosomes derived from huMHCpara-19. This unexpected result strongly suggests that the precursor of VLR genes was also in PIC or an ancestral MHCpara. We have searched the lamprey and hagfish genomes for synteny of the VLR genes but could not map any linked genes. Better assembly of the lamprey and hagfish genomes could provide genetic evidence for further confirmation. Depending on the precursor of human chr 3, the VLR predecessor could have been present either at 1R or 0R (PIC). In either scenario, our model predicts that VLR predates the emergence of rearranging

IgSF-containing AgR. At this point, we have no working hypothesis for why VLRs would be encoded in the MHCpara besides the basic idea that many immune gene families seem to be conceived in these regions.

There was an expansion of gene families and neofunctionalization [e.g., globin genes (89)] in early jawed vertebrates shortly after 2R, which was perpetuated in the gnathostome lineage. In contrast, the jawless fish either maintained the primordial state or evolved novel globin genes (89). We suggest that such a major dichotomy occurred for the immune system as well (Fig. 8): adaptive immunity likely emerged in the jawless vertebrates in the first “Big Bang” with major features such as clonal selection of lymphocytes bearing somatically generated Ag receptors, emergence of the thymus, and appearance of lymphocyte subsets (90). In our scenario, as opposed to a model proposing parallel evolution of VLR and Ig/TCR systems, the VLR system emerged during the first Big Bang, and then was superseded by the Ig/TCR system after invasion of a VJ-IgSF gene by the RAG transposon at 2R. As previously suggested, RAG-mediated rearrangement provides a distinct advantage over APOBEC-mediated recombination in that the CDR3 loop can be wildly different in size (91), accommodating either a rich adaptive repertoire or one that is more innate in nature. We suggest that the RAG transposon invasion at 2R was the only innovative event that initiated a second Big Bang of adaptive immunity, resulting in the emergence of immunoproteasomes, emergence and expansion of AgR, and the first appearance of SLAM family members, all of which likely occurred on the chr 6 and chr 19 ancestral paralogues. Other features of the gnathostome adaptive immune system, such as emergence of secondary lymphoid tissues, expansion of cytokine and chemokine networks, and appearance of a complex thymic architecture, also occurred over a short period of evolutionary time, in some cases under the influence of genes mapping to MHC paralogous regions, e.g., TNF (92) and B7 family members (68).

Acknowledgments

We thank Hanover Matz and Dr. Louis Du Pasquier for critical reading of the manuscript and advice on the nonrearranging AgR-like genes.

Disclosures

The authors have no financial conflicts of interest.

References

1. Ohno, S. 1970. Evolution by Gene Duplication. Springer-Verlag, New York.
2. Lundin, L. G. 1993. Evolution of the vertebrate genome as reflected in paralogous chromosomal regions in man and the house mouse. Genomics 16: 1–19.
3. Kasahara, M. 1997. New insights into the genomic organization and origin of the major histocompatibility complex: role of chromosomal (genome) duplication in the emergence of the adaptive immune system. Hereditas 127: 59–65.
4. Darbo, E., E. G. Danchin, M. F. Mc Dermott, and P. Pontarotti. 2008. Evolution of major histocompatibility complex by “en bloc” duplication before mammalian radiation. Immunogenetics 60: 423–438.
5. Olinski, R. P., L. G. Lundin, and F. Hallböök. 2006. Conserved synteny between the Ciona genome and human paralogons identifies large duplication events in the molecular evolution of the insulin-relaxin gene family. Mol. Biol. Evol. 23: 10–22.
6. Flajnik, M. F., and M. Kasahara. 2010. Origin and evolution of the adaptive immune system: genetic events and selective pressures. Nat. Rev. Genet. 11: 47–59.
7. Hallböök, F. 1999. Evolution of the vertebrate neurotrophin and Trk receptor gene families. Curr. Opin. Neurobiol. 9: 616–621.
8. Horton, R., L. Wilming, V. Rand, R. C. Lovering, E. A. Bruford, V. K. Khodiyar, M. J. Lush, S. Povey, C. C. Talbot, Jr., M. W. Wright, et al. 2004. Gene map of the extended human MHC. Nat. Rev. Genet. 5: 889–899.
9. Abi Rached, L., M. F. McDermott, and P. Pontarotti. 1999. The MHC big bang. Immunol. Rev. 167: 33–44.
10. Danchin, E. G., and P. Pontarotti. 2004. Towards the reconstruction of the bilaterian ancestral pre-MHC region. Trends Genet. 20: 587–591.
11. Suurväli, J., L. Jouneau, D. Thépot, S. Grusea, P. Pontarotti, L. Du Pasquier, S. Rüütel Boudinot, and P. Boudinot. 2014. The proto-MHC of placozoans, a region specialized in cellular stress and ubiquitination/proteasome pathways. J. Immunol. 193: 2891–2901.
12. Ohta, Y., W. Goetz, M. Z. Hossain, M. Nonaka, and M. F. Flajnik. 2006. Ancestral organization of the MHC revealed in the amphibian Xenopus. J. Immunol. 176: 3674–3685.
13. Hellsten, U., R. M. Harland, M. J. Gilchrist, D. Hendrix, J. Jurka, V. Kapitonov, I. Ovcharenko, N. H. Putnam, S. Shu, L. Taher, et al. 2010. The genome of the Western clawed frog Xenopus tropicalis. Science 328: 633–636.
14. Session, A. M., Y. Uno, T. Kwon, J. A. Chapman, A. Toyoda, S. Takahashi, A. Fukui, A. Hikosaka, A. Suzuki, M. Kondo, et al. 2016. Genome evolution in the allotetraploid frog Xenopus laevis. Nature 538: 336–343.
15. Uno, Y., C. Nishida, C. Takagi, N. Ueno, and Y. Matsuda. 2013. Homoeologous chromosomes of Xenopus laevis are highly conserved after whole-genome duplication. Heredity 111: 430–436.
16. Danchin, E. G. J., L. Abi-Rached, A. Gilles, and P. Pontarotti. 2003. Conservation of the MHC-like region throughout evolution. Immunogenetics 55: 141–148.
17. Du Pasquier, L., J. Schwager, and M. F. Flajnik. 1989. The immune system of Xenopus. Annu. Rev. Immunol. 7: 251–275.
18. Courtet, M., M. Flajnik, and L. Du Pasquier. 2001. Major histocompatibility complex and immunoglobulin loci visualized by in situ hybridization on Xenopus chromosomes. Dev. Comp. Immunol. 25: 149–157.
19. Tanaka, K. 2013. The proteasome: from basic mechanisms to emerging roles. Keio J. Med. 62: 1–12.
20. Kasahara, M., M. Hayashi, K. Tanaka, H. Inoko, K. Sugaya, T. Ikemura, and T. Ishibashi. 1996. Chromosomal localization of the proteasome Z subunit gene reveals an ancient chromosomal duplication involving the major histocompatibility complex. Proc. Natl. Acad. Sci. USA 93: 9096–9101.
21. Abi-Rached, L., A. Gilles, T. Shiina, P. Pontarotti, and H. Inoko. 2002. Evidence of en bloc duplication in vertebrate genomes. Nat. Genet. 31: 100–105.
22. Kelley, J., L. Walter, and J. Trowsdale. 2005. Comparative genomics of major histocompatibility complexes. Immunogenetics 56: 683–695.
23. Shum, B. P., D. Avila, L. Du Pasquier, M. Kasahara, and M. F. Flajnik. 1993. Isolation of a classical MHC class I cDNA from an amphibian. Evidence for only one class I locus in the Xenopus MHC. J. Immunol. 151: 5376–5386.
24. Flajnik, M. F., M. Kasahara, B. P. Shum, L. Salter-Cid, E. Taylor, and L. Du Pasquier. 1993. A novel type of class I gene organization in vertebrates: a large family of non-MHC-linked class I genes is expressed at the RNA level in the amphibian Xenopus. EMBO J. 12: 4385–4396.
25. Edholm, E. S., A. Goyos, J. Taran, F. De Jesús Andino, Y. Ohta, and J. Robert. 2014. Unusual evolutionary conservation and further species-specific adaptations of a large family of nonclassical MHC class Ib genes across different degrees of genome ploidy in the amphibian subfamily Xenopodinae. Immunogenetics 66: 411–426.
26. Krasnec, K. V., A. T. Papenfuss, and R. D. Miller. 2016. The UT family of MHC class I loci unique to non-eutherian mammals has limited polymorphism and tissue specific patterns of expression in the opossum. BMC Immunol. 17: 43.
27. Nonaka, M., C. Yamada-Namikawa, M. F. Flajnik, and L. Du Pasquier. 2000. Trans-species polymorphism of the major histocompatibility complex-encoded proteasome subunit LMP7 in an amphibian genus, Xenopus. Immunogenetics 51: 186–192.
28. Ohta, Y., S. J. Powis, R. L. Lohr, M. Nonaka, L. Du Pasquier, and M. F. Flajnik. 2003. Two highly divergent ancient allelic lineages of the transporter associated with antigen processing (TAP) gene in Xenopus: further evidence for co-evolution among MHC class I region genes. Eur. J. Immunol. 33: 3017–3027.
29. Nonaka, M., C. Namikawa, Y. Kato, M. Sasaki, L. Salter-Cid, and M. F. Flajnik. 1997. Major histocompatibility complex gene mapping in the amphibian Xenopus implies a primordial organization. Proc. Natl. Acad. Sci. USA 94: 5789–5791.
30. Tsukamoto, K., M. Sakaizumi, M. Hata, Y. Sawara, J. Eah, C. B. Kim, and M. Nonaka. 2009. Dichotomous haplotypic lineages of the immunoproteasome subunit genes, PSMB8 and PSMB10, in the MHC class I region of a teleost medaka, Oryzias latipes. Mol. Biol. Evol. 26: 769–781.
31. McConnell, S. C., K. M. Hernandez, D. J. Wcisel, R. N. Kettleborough, D. L. Stemple, J. A. Yoder, J. Andrade, and J. L. de Jong. 2016. Alternative haplotypes of antigen processing genes in zebrafish diverged early in vertebrate evolution. Proc. Natl. Acad. Sci. USA 113: E5014–E5023.
32. Kaufman, J. 2015. Co-evolution with chicken class I genes. Immunol. Rev. 267: 56–71.
33. Miller, M. M., and R. L. Taylor, Jr. 2016. Brief review of the chicken major histocompatibility complex: the genes, their distribution on chromosome 16, and their contributions to disease resistance. Poult. Sci. 95: 375–392.
34. Edholm, E. S., L. M. Albertorio Saez, A. L. Gill, S. R. Gill, L. Grayfer, N. Haynes, J. R. Myers, and J. Robert. 2013. Nonclassical MHC class I-dependent invariant T cells are evolutionarily conserved and prominent from early development in amphibians. Proc. Natl. Acad. Sci. USA 110: 14342–14347.
35. Edholm, E. S., M. Banach, and J. Robert. 2016. Evolution of innate-like T cells and their selection by MHC class I-like molecules. Immunogenetics 68: 525–536.
36. Edholm, E. S., M. Banach, K. Hyoe Rhoo, M. S. Pavelka, Jr., and J. Robert. 2018. Distinct MHC class I-like interacting invariant T cell lineage at the forefront of mycobacterial immunity uncovered in Xenopus. Proc. Natl. Acad. Sci. USA 115: E4023–E4031.
37. Kasahara, M. 1999. The chromosomal duplication model of the major histocompatibility complex. Immunol. Rev. 167: 17–32.
38. Calabi, F., and C. Milstein. 1986. A novel family of human major histocompatibility complex-related genes not mapping to chromosome 6. Nature 323: 540–543.
39. Martin, L. H., F. Calabi, and C. Milstein. 1986. Isolation of CD1 genes: a family of major histocompatibility complex-related differentiation antigens. Proc. Natl. Acad. Sci. USA 83: 9154–9158.
40. Zajonc, D. M. 2016. The CD1 family: serving lipid antigens to T cells since the Mesozoic era. Immunogenetics 68: 561–576.
41. Jayawardena-Wolf, J., and A. Bendelac. 2001. CD1 and lipid antigens: intracellular pathways for antigen presentation. Curr. Opin. Immunol. 13: 109–113.
42. Kasahara, M., J. Nakaya, Y. Satta, and N. Takahata. 1997. Chromosomal duplication and the emergence of the adaptive immune system. Trends Genet. 13: 90–92.
43. Maruoka, T., H. Tanabe, M. Chiba, and M. Kasahara. 2005. Chicken CD1 genes are located in the MHC: CD1 and endothelial protein C receptor genes constitute a distinct subfamily of class-I-like genes that predates the emergence of mammals. Immunogenetics 57: 590–600.
44. Salomonsen, J., M. R. Sørensen, D. A. Marston, S. L. Rogers, T. Collen, A. van Hateren, A. L. Smith, R. K. Beal, K. Skjødt, and J. Kaufman. 2005. Two CD1 genes map to the chicken MHC, indicating that CD1 genes are ancient and likely to have been present in the primordial MHC. Proc. Natl. Acad. Sci. USA 102: 8668–8673.
45. Miller, M. M., C. Wang, E. Parisini, R. D. Coletta, R. M. Goto, S. Y. Lee, D. C. Barral, M. Townes, C. Roura-Mir, H. L. Ford, et al. 2005. Characterization of two avian MHC-like genes reveals an ancient origin of the CD1 family. Proc. Natl. Acad. Sci. USA 102: 8674–8679.
46. Yang, Z., C. Wang, T. Wang, J. Bai, Y. Zhao, X. Liu, Q. Ma, X. Wu, Y. Guo, Y. Zhao, and L. Ren. 2015. Analysis of the reptile CD1 genes: evolutionary implications. Immunogenetics 67: 337–346.
47. Flajnik, M. F., J. F. Kaufman, P. Riegert, and L. Du Pasquier. 1984. Identification of class I major histocompatibility complex encoded molecules in the amphibian Xenopus. Immunogenetics 20: 433–442.
48. Rogers, S. L., and J. Kaufman. 2016. Location, location, location: the evolutionary history of CD1 genes and the NKR-P1/ligand systems. Immunogenetics 68: 499–513.
49. Donoviel, D. B., D. D. Freed, H. Vogel, D. G. Potter, E. Hawkins, J. P. Barrish, B. N. Mathur, C. A. Turner, R. Geske, C. A. Montgomery, et al. 2001. Proteinuria and perinatal lethality in mice lacking NEPH1, a novel protein with homology to NEPHRIN. Mol. Cell. Biol. 21: 4829–4836.
50. Hughes, A. L., and M. Nei. 1993. Evolutionary relationships of the classes of major histocompatibility complex genes. Immunogenetics 37: 337–346.
51. Kaufman, J. F., C. Auffray, A. J. Korman, D. A. Shackelford, and J. Strominger. 1984. The class II molecules of the human and murine major histocompatibility complex. Cell 36: 1–13.
52. Kaufman, J. 2018. Unfinished business: evolution of the MHC and the adaptive immune system of jawed vertebrates. Annu. Rev. Immunol. 36: 383–409.
53. Dijkstra, J. M., and T. Yamaguchi. 2019. Ancient features of the MHC class II presentation pathway, and a model for the possible origin of MHC molecules. Immunogenetics 71: 233–249.
54. Flajnik, M. F., C. Canel, J. Kramer, and M. Kasahara. 1991. Which came first, MHC class I or class II? Immunogenetics 33: 295–300.
55. Bartl, S., M. A. Baish, M. F. Flajnik, and Y. Ohta. 1997. Identification of class I genes in cartilaginous fish, the most ancient group of vertebrates displaying an adaptive immune response. J. Immunol. 159: 6097–6104.
56. Guselnikov, S. V., T. Ramanayake, A. Y. Erilova, L. V. Mechetina, A. M. Najakshin, J. Robert, and A. V. Taranin. 2008. The Xenopus FcR family demonstrates continually high diversification of paired receptors in vertebrate evolution. BMC Evol. Biol. 8: 148.
57. Guselnikov, S. V., T. Ramanayake, J. Robert, and A. V. Taranin. 2009. Diversity of the FcR- and KIR-related genes in an amphibian Xenopus. Front. Biosci. 14: 130–140.
58. Guselnikov, S. V., P. P. Laktionov, A. M. Najakshin, K. O. Baranov, and A. V. Taranin. 2011. Expansion and diversification of the signaling capabilities of the CD2/SLAM family in Xenopodinae amphibians. Immunogenetics 63: 679–689.
59. Flajnik, M. F., and M. Kasahara. 2001. Comparative genomics of the MHC: glimpses into the evolution of the adaptive immune system. Immunity 15: 351–362.
60. Du Pasquier, L., I. Zucchetti, and R. De Santis. 2004. Immunoglobulin superfamily receptors in protochordates: before RAG time. Immunol. Rev. 198: 233–248.
61. Williams, A. F., and A. N. Barclay. 1988. The immunoglobulin superfamily--domains for cell surface recognition. Annu. Rev. Immunol. 6: 381–405.
62. Parra, Z. E., Y. Ohta, M. F. Criscitiello, M. F. Flajnik, and R. D. Miller. 2010. The dynamic TCRδ: TCRδ chains in the amphibian Xenopus tropicalis utilize antibody-like V genes. Eur. J. Immunol. 40: 2319–2329.
63. Schatz, D. G., M. A. Oettinger, and D. Baltimore. 1989. The V(D)J recombination activating gene, RAG-1. Cell 59: 1035–1048.
64. Oettinger, M. A., D. G. Schatz, C. Gorka, and D. Baltimore. 1990. RAG-1 and RAG-2, adjacent genes that synergistically activate V(D)J recombination. Science 248: 1517–1523.
65. Agrawal, A., Q. M. Eastman, and D. G. Schatz. 1998. Transposition mediated by RAG1 and RAG2 and its implications for the evolution of the immune system. Nature 394: 744–751.
66. Pende, D., S. Parolini, A. Pessino, S. Sivori, R. Augugliaro, L. Morelli, E. Marcenaro, L. Accame, A. Malaspina, R. Biassoni, et al. 1999. Identification and molecular characterization of NKp30, a novel triggering receptor involved in natural cytotoxicity mediated by human natural killer cells. J. Exp. Med. 190: 1505–1516.
67. Ohta, Y., and M. F. Flajnik. 2015. Coevolution of MHC genes (LMP/TAP/class Ia, NKT-class Ib, NKp30-B7H6): lessons from cold-blooded vertebrates. Immunol. Rev. 267: 6–15.
68. Flajnik, M. F., T. Tlapakova, M. F. Criscitiello, V. Krylov, and Y. Ohta. 2012. Evolution of the B7 family: co-evolution of B7H6 and NKp30, identification of a new B7 family member, B7H7, and of B7’s historical relationship with the MHC. Immunogenetics 64: 571–590.
69. Chen, R., L. Zhang, J. Qi, N. Zhang, L. Zhang, S. Yao, Y. Wu, B. Jiang, Z. Wang, H. Yuan, et al. 2018. Discovery and analysis of invertebrate IgVJ-C2 structure from Amphioxus provides insight into the evolution of the Ig superfamily. J. Immunol. 200: 2869–2881.
70. Wu, Q., Z. Wei, Z. Yang, T. Wang, L. Ren, X. Hu, Q. Meng, Y. Guo, Q. Zhu, J. Robert, et al. 2010. Phylogeny, genomic organization and expression of lambda and kappa immunoglobulin light chain genes in a reptile, Anolis carolinensis. Dev. Comp. Immunol. 34: 579–589.
71. Del Porto, P., L. Bruno, M. G. Mattei, H. von Boehmer, and C. Saint-Ruf. 1995. Cloning and comparative analysis of the human pre-T-cell receptor alpha-chain gene. Proc. Natl. Acad. Sci. USA 92: 12105–12109.
72. Saint-Ruf, C., K. Ungewiss, M. Groettrup, L. Bruno, H. J. Fehling, and H. von Boehmer. 1994. Analysis and expression of a cloned pre-T cell receptor gene. Science 266: 1208–1212.
73. Fu, Y., Z. Yang, J. Huang, X. Cheng, X. Wang, S. Yang, L. Ren, Z. Lian, H. Han, and Y. Zhao. 2019. Identification of two nonrearranging IgSF genes in chicken reveals a novel family of putative remnants of an antigen receptor precursor. J. Immunol. 202: 1992–2004.
74. Glusman, G., L. Rowen, I. Lee, C. Boysen, J. C. Roach, A. F. Smit, K. Wang, B. F. Koop, and L. Hood. 2001. Comparative genomics of the human and mouse T cell receptor loci. Immunity 15: 337–349.
75. Du Pasquier, L. 2000. Relationships among the genes encoding MHC molecules and the specific antigen receptors. In MHC Evolution, Structure and Function. L. Du Pasquier, and M. Kasahara, eds. Springer-Verlag, Tokyo, p. 53–65.
76. Trowsdale, J. 2001. Genetic and functional relationships between MHC and NK receptor genes. Immunity 15: 363–374.
77. Kaufman, J., S. Milne, T. W. Göbel, B. A. Walker, J. P. Jacob, C. Auffray, R. Zoorob, and S. Beck. 1999. The chicken B locus is a minimal essential major histocompatibility complex. Nature 401: 923–925.
78. Du Pasquier, L. 2004. Speculations on the origin of the vertebrate immune system. Immunol. Lett. 92: 3–9.
79. Hood, L., M. Kronenberg, and T. Hunkapiller. 1985. T cell antigen receptors and the immunoglobulin supergene family. Cell 40: 225–229.
80. Davis, M. M., and P. J. Bjorkman. 1988. T-cell antigen receptor genes and T-cell recognition. [Published erratum appears in 1988 Nature 335: 744.] Nature 334: 395–402.
81. Du Pasquier, L., and I. Chrétien. 1996. CTX, a new lymphocyte receptor in Xenopus, and the early evolution of Ig domains. Res. Immunol. 147: 218–226.
82. Du Pasquier, L. 2002. Several MHC-linked Ig superfamily genes have features of ancestral antigen-specific receptor genes. Curr. Top. Microbiol. Immunol. 266: 57–71.
83. Ohta, Y., T. Shiina, R. L. Lohr, K. Hosomichi, T. I. Pollin, E. J. Heist, S. Suzuki, H. Inoko, and M. F. Flajnik. 2011. Primordial linkage of β2-microglobulin to the MHC. J. Immunol. 186: 3563–3571.
84. Godfrey, D. I., S. Stankovic, and A. G. Baxter. 2010. Raising the NKT cell family. Nat. Immunol. 11: 197–206.
85. Pancer, Z., C. T. Amemiya, G. R. Ehrhardt, J. Ceitlin, G. L. Gartland, and M. D. Cooper. 2004. Somatic diversification of variable lymphocyte receptors in the agnathan sea lamprey. Nature 430: 174–180.
86. Anderson, K. V., L. Bokla, and C. Nüsslein-Volhard. 1985. Establishment of dorsal-ventral polarity in the Drosophila embryo: the induction of polarity by the toll gene product. Cell 42: 791–798.
87. Anderson, K. V., G. Jürgens, and C. Nüsslein-Volhard. 1985. Establishment of dorsal-ventral polarity in the Drosophila embryo: genetic studies on the role of the toll gene product. Cell 42: 779–789.
88. Rogozin, I. B., L. M. Iyer, L. Liang, G. V. Glazko, V. G. Liston, Y. I. Pavlov, L. Aravind, and Z. Pancer. 2007. Evolution and diversification of lamprey antigen receptors: evidence for involvement of an AID-APOBEC family cytosine deaminase. Nat. Immunol. 8: 647–656.
89. Hoffmann, F. G., J. C. Opazo, and J. F. Storz. 2012. Whole-genome duplications spurred the functional diversification of the globin gene superfamily in vertebrates. Mol. Biol. Evol. 29: 303–312.
90. Flajnik, M. F. 2018. A cold-blooded view of adaptive immunity. Nat. Rev. Immunol. 18: 438–453.
91. Hsu, E. 2011. The invention of lymphocytes. Curr. Opin. Immunol. 23: 156–162.
92. Collette, Y., A. Gilles, P. Pontarotti, and D. Olive. 2003. A co-evolution perspective of the TNFSF and TNFRSF families in the immune system. Trends Immunol. 24: 387–394.