Primordial Linkage of β2-Microglobulin to the MHC
Yuko Ohta, Takashi Shiina, Rebecca L. Lohr, Kazuyoshi Hosomichi, Toni I. Pollin, Edward J. Heist, Shingo Suzuki, Hidetoshi Inoko and Martin F. Flajnik
J Immunol published online 14 February 2011
http://www.jimmunol.org/content/early/2011/02/14/jimmunol.1003933

Supplementary Material: http://www.jimmunol.org/content/suppl/2011/02/14/jimmunol.1003933.DC1


The Journal of Immunology is published twice each month by The American Association of Immunologists, Inc., 1451 Rockville Pike, Suite 650, Rockville, MD 20852 Copyright © 2011 by The American Association of Immunologists, Inc. All rights reserved. Print ISSN: 0022-1767 Online ISSN: 1550-6606. Published February 14, 2011, doi:10.4049/jimmunol.1003933 The Journal of Immunology

Primordial Linkage of β2-Microglobulin to the MHC

Yuko Ohta,*,1 Takashi Shiina,†,1 Rebecca L. Lohr,* Kazuyoshi Hosomichi,†,‡ Toni I. Pollin,§ Edward J. Heist,¶ Shingo Suzuki,† Hidetoshi Inoko,†,1 and Martin F. Flajnik*,1

β2-Microglobulin (β2M) is believed to have arisen in a basal jawed vertebrate (gnathostome) and is the essential L chain that associates with most MHC class I molecules. It contains a distinctive molecular structure called a constant-1 Ig superfamily domain, which is shared with other adaptive immune molecules including MHC class I and class II. Despite its structural similarity to class I and class II and its conserved function, β2M is encoded outside the MHC in all examined species from bony fish to mammals, but it is assumed to have translocated from its original location within the MHC early in gnathostome evolution. We screened a nurse shark bacterial artificial chromosome library and isolated clones containing β2M genes. A gene present in the MHC of all other vertebrates (ring3) was found in the bacterial artificial chromosome clone, and the close linkage of ring3 and β2M to MHC class I and class II genes was determined by single-strand conformational polymorphism and allele-specific PCR. This study satisfies the long-held conjecture that β2M was linked to the primordial MHC (Ur MHC); furthermore, the apparent stability of the shark genome may yield other genes predicted to have had a primordial association with the MHC specifically and with immunity in general. The Journal of Immunology, 2011, 186: 000–000.

The adaptive immune system as defined in humans arose abruptly in a jawed vertebrate (gnathostome) ancestor ∼500 million years ago. The major players of adaptive immunity, the rearranging Ag receptors (Ig and TCR), the Ag-presenting molecules (MHC class I and class II), and molecules involved in Ag processing (e.g., immunoproteasomes and the TAPs) are all present in sharks as the oldest extant jawed vertebrates but absent in all invertebrates and jawless fish (1). The MHC encodes the class I and class II proteins, which present foreign peptides to T cells to initiate adaptive immune responses, as well as the Ag processing molecules and a large number of other genes involved in various immune functions. The class I and class II tertiary structures are nearly identical, composed of four external domains, the two membrane-proximal domains being members of the Ig superfamily (IgSF) and the membrane-distal domains forming a unique structure called the peptide-binding region (PBR). However, the chain composition differs between class I and class II molecules: class II molecules are heterodimers of α- and β-chains each consisting of one half of the PBR, one IgSF domain, and transmembrane/cytoplasmic regions, whereas class I molecules are composed of an H or α-chain and the requisite L chain, β2-microglobulin (β2M), the former comprising the entire PBR, one IgSF domain, and transmembrane/cytoplasmic regions, and the latter only one IgSF domain. The remarkable similarity of the class I and class II structures clearly suggests that they were generated from a common ancestor, presumably by tandem duplication; thus, it has been assumed that class I, β2M, and class II genes were tightly linked at one point in evolution (1), although it is debatable whether the ancestor of class I/II molecule was class I- or class II-like or an unrecognizable common ancestor (2–5).

In all jawed vertebrates except teleost fish, a taxon having a highly modified genome correlating with a genome-wide duplication early in teleost evolution (6, 7), both MHC class I and II genes are closely linked within the MHC (8). Although class I genes are encoded in a region downstream of the class II region (2–4 Mb) in the MHC of most mammals, a single or low number of class I genes are found in close proximity to class I processing and class II genes in most nonmammalian vertebrates (except teleost fish), in what is predicted to be the primordial organization (9–15). β2M is encoded in diverse regions outside the MHC in all the species examined to date, including mammals (16), birds (17), amphibians (18), and bony fish (19), and therefore the lack of linkage of β2M to the MHC and inconsistent synteny around β2M have been assumed to be a result of repeated translocations out of the MHC over evolutionary time or to serial translocations after the early loss of MHC linkage (20).

In this study, we characterized the nurse shark (Ginglymostoma cirratum) single-copy β2M gene and mapped it to the MHC. The primitive synteny preserved in this extant vertebrate validates early suppositions regarding MHC evolution and further suggests that other ancient features of the MHC also may be preserved.

*Department of Microbiology and Immunology, University of Maryland, Baltimore, MD 21201; †Division of Molecular Life Science, Department of Genetic Information, Tokai University School of Medicine, Isehara, Kanagawa 243-1143, Japan; ‡Division of Human Genetics, Department of Integrated Genetics, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan; §Department of Medicine, University of Maryland, Baltimore, MD 21201; and ¶Department of Zoology, Fisheries and Illinois Aquaculture Center, Southern Illinois University Carbondale, Carbondale, IL 62901

1 Y.O., T.S., H.I., and M.F.F. contributed equally to this work.

Received for publication November 30, 2010. Accepted for publication January 6, 2011.

This work was supported by National Institutes of Health Grant AI27877 (to Y.O., R.L.L., and M.F.F.) and by Scientific Research on Priority Areas “Comparative Genomics” (20017023) from the Ministry of Education, Culture, Sports, Science and Technology of Japan (to T.S., K.H., S.S., and H.I.). E.H. was supported by the Department of Zoology at Southern Illinois University Carbondale and by the W.W. Diehl Endowed Professorship of Biology to J. Carrier at Albion College.

The sequences presented in this article have been submitted to the DNA Data Bank of Japan under accession number AB571627 and to GenBank under accession numbers HM625830 and HM625831.

Address correspondence and reprint requests to Yuko Ohta, Department of Microbiology and Immunology, University of Maryland, 685 West Baltimore Street, Baltimore, MD 21201. E-mail address: [email protected]

The online version of this article contains supplemental material.

Abbreviations used in this article: BAC, bacterial artificial chromosome; C1, constant-1; IgSF, Ig superfamily; LOD, log of the odds; β2M, β2-microglobulin; NJ, neighbor-joining; PBR, peptide-binding region; ssCP, single-strand conformational polymorphism; ZFP, zinc finger protein.

Copyright © 2011 by The American Association of Immunologists, Inc. 0022-1767/11/$16.00


Materials and Methods

Animals
Genomic DNA was isolated from RBCs for mapping analysis from the nurse shark family as previously described (21). The procedure of animal use was reviewed and approved by the Institutional Animal Care and Use Committee at the University of Maryland.

Bacterial artificial chromosome library screening
The 17 bacterial artificial chromosome (BAC) filters with 11-fold genomic coverage (22) were screened with radiolabeled full-length β2M or ring3 probes under high-stringency conditions (23). Membranes were exposed to x-ray film for various lengths of time to obtain positive signals and the desired background. Putative positive clones were then re-spotted on nylon membranes for colony hybridization and tested by Southern blotting to confirm true positives. BAC insert DNA was isolated using the PhasePrep BAC DNA kit (Sigma-Aldrich), and the sequence was determined by shotgun sequencing at the sequencing facility at Tokai University with ×7.5 coverage.

Sequence alignment and phylogenetic tree
Amino acid sequences of constant-1 (C1)-IgSF domains were aligned using the ClustalX2 program with minor adjustments. A rooted neighbor-joining (NJ) bootstrapped (1000 runs) phylogenetic tree (24) was constructed, and the consensus tree was then viewed with the TreeView program (25).

Database searches
Genome synteny in various species was retrieved and analyzed from publicly available Web sites as noted. Genes from mouse, chicken, human, opossum, and zebrafish were retrieved from GenBank (http://www.ncbi.nlm.nih.gov), and information on other genomes was retrieved from the following Web sites: elephant shark genome (http://blast.fugu-sg.org/); Anolis genome (http://genome.ucsc.edu/cgi-bin/hgGateway?db=anoCar1); Xenopus genome (http://genome.jgi-psf.org/Xentr4/Xentr4.home.html); and Fugu genome (http://genome.jgi-psf.org/Takru4/Takru4.home.html).

In-house EST collection
We constructed the cDNA library using the Gateway System (Invitrogen) from adult nurse shark pancreas. To eliminate Ig genes, we first hybridized with Ig H and L chain probes under high-stringency conditions. Negative colonies (∼8000) were then manually picked and sequenced from the vector end. All draft sequences were blastx searched against GenBank databases, and we obtained ∼1150 sequences not specific to the pancreatic enzymes (Y. Ohta and M.F. Flajnik, personal observations).

Single-strand conformation polymorphism analysis
Nurse shark ring3 primers were designed based on the sequence obtained from BAC GC_614H19 clone. Multiple primers were tried, and we selected the primer set anchoring exons 4 and 5 for the single-strand conformation polymorphism (ssCP) analysis. The primers were exon 4 forward, 5′-GTTAACACCTGCACCAAAAT-3′; and exon 5 reverse, 5′-ATTGGGACCTGAGACACAGT-3′. PCR was performed at 94°C for 4 min, followed by 35 cycles of 94°C for 1 min, 62°C for 1 min, and 72°C for 1 min, with a final extension of 72°C for 10 min using 2–500 ng genomic DNA as template. The ∼1340-bp PCR product was cleaned by gel extraction. The ssCP gel (0.5× MDE gel; Cambrex Bio Science Rockland) was run at 16°C for 30 h in 0.6× Tris/borate/EDTA buffer with 1 W constant power.

Allele-specific PCR
Nurse shark β2M sequences were obtained from family 2 with known MHC haplotypes. PCR was performed using a forward primer in intron 2 (NSB2mint2For: 5′-TTACACATCACCACCACCTC-3′) and a reverse primer designed from the IgSF exon (exon 3) (NSB2mex3Rev: 5′-GATTGATTCAGTAGC-3′). We amplified β2M gene fragments from several animals carrying different maternal and paternal haplotype combinations to find allele-specific polymorphisms. After we identified a two-nucleotide deletion in intron 2 in the paternal haplotype in animals belonging to groups “i” and “j” (p3), allele-specific primers were designed for each gene in which deletions are positioned at the third and fourth nucleotide positions at the 3′-end of primers. PCR was performed using a combination of allele-specific and NSB2mex3Rev primers at 94°C for 4 min, followed by 35 cycles of 94°C for 1 min, 58°C for 1 min, and 72°C for 1 min, with a final extension of 72°C for 10 min using 2–500 ng genomic DNA as template. We also found animals with the “CC-deletion” allele in two other families (1 and 3).

Northern blotting
Total RNA was isolated from various nurse shark tissues by using the TRIzol reagent (Invitrogen). Twenty micrograms of total RNA was electrophoresed and blotted onto Optitran Nitrocellulose membrane (Schleicher & Schuell). The membrane was hybridized with full-length shark probes and washed under high-stringency conditions (23).

Southern blotting
Genomic DNA (10 μg) was digested with various restriction enzymes to obtain useful RFLP in unrelated sharks with multiple enzymes. The IgSF exon was used to determine the number of loci for β2M under high-stringency conditions (23). Hybridization with MHC class I leader and α1 domain probe was performed under low-stringency conditions (23). To determine the MHC groups in the shark family 2, we digested genomic DNA with HindIII and hybridized with radiolabeled probe including the leader–α1 domains of MHC class I under high-stringency conditions.

Sequence analysis of MHC class I alleles and sire designation
MHC class I sequences were obtained from PCR amplification with primers from α1 domain forward, 5′-GGTCGGTTATGTGGATGATC-3′; and α2 domain reverse, 5′-TTGCAGCCACTCGATACA-3′. PCR amplification was performed for 4 min at 94°C, followed by 35 cycles of 94°C for 1 min, 56°C for 1∼2 min, 72°C for 1 min, and a final extension at 72°C for 10 min. An ∼550-bp fragment amplicon was cloned into the pCRII TA cloning vector (Invitrogen), and individual clones were sequenced. Nurse shark families 2 and 3 were genotyped using 12 DNA microsatellite markers and assigned sires (E.J. Heist, J.C. Carrier, H.L. Pratt, and T.C. Pratt, submitted for publication).

Statistical analysis of linkage
We used parametric linkage analysis to formally assess the evidence for linkage of β2M to the MHC region in the offspring of deletion-carrying sires. This approach assesses the odds of the likelihood of obtaining the observed data set if the two loci are linked versus if the loci are not linked, showing as a log of the odds (LOD) score. The paternal sibships were determined based on consolidated data from combination of Southern blotting, sequencing of MHC class Ia alleles, and microsatellite analyses (shown in Table I).

The LOD score is calculated as follows when parental phase (linkage status) is known: $\mathrm{LOD} = \log_{10}\{[\theta^{R}(1-\theta)^{NR}]/(0.5)^{R+NR}\}$, where θ is the recombination fraction, NR is the number of nonrecombinant offspring, and R is the number of recombinant offspring.

Because the parental phase was unknown in the current study due to a lack of grandparental genotypes, a phase ambiguous LOD score was first calculated for each family by taking the log of the average odds for the two possible phases (1 and 2 in Table I), and the resulting LOD scores were then summed over the two families to obtain the LOD score at a given recombination fraction. LOD scores were calculated at recombination fractions between 0 and 0.5 to obtain the recombination fraction where the LOD score was maximized (26). The corresponding p value was calculated using a one-sided $\chi^{2}$ test of $\mathrm{LOD} \times 2\,(\log_{e} 10)$ (27).

Results

Characterization of nurse shark β2M
Cartilaginous fish are the oldest living vertebrates having an adaptive immune system centered upon Ig, TCR, and MHC (1). When it was suggested that class I and class II genes may have evolved in separate linkage groups from studies of teleost fish (28), we demonstrated in family studies that the two MHC classes were closely linked in two shark species, nurse shark and banded houndshark (21). To gain further insight into the primordial MHC organization, we have isolated many shark genes associated with adaptive immunity, including β2M. The full-length β2M clone was found in an in-house EST collection (GenBank accession number HM625831), as well as from a previously published genomic sequence (GenBank accession number GQ865623) (29), and the deduced amino acid sequence was aligned with β2M from other species (S1). As was noted in previous studies, evolutionarily conserved residues are either found in all C1-IgSF (or just IgSF) domains (29, 30) or are predicted to be at class I α-chain interaction sites (31). Some cartilaginous fish β2M have potential N-glycosylation sites that are rare in tetrapods but present in several bony fish species (32). Consistent with previous studies (33, 34), phylogenetic tree analysis revealed that cartilaginous fish β2M clustered with the orthologous proteins and to the IgSF domains of MHC class IIA/DMA, suggesting that they share the most recent common ancestor (Fig. 1A). Also consistent with previous studies (33), the IgSF domains of class IIB and class Ia shared the most recent common ancestor. β2M expression pattern seems to coincide with MHC class I expression (Fig. 1B).

Mapping of β2M to the MHC in family studies
Two families of nurse sharks previously were used to map several genes to the MHC (21, 36, 37). All of these families showed multiple paternity, at least five fathers in family 1 and seven in family 2. Southern blotting analysis using many restriction enzymes demonstrated that β2M is a single-copy gene (five representative digestions are shown in Fig. 1C); unfortunately, no RFLPs were obtained to test the linkage status, and thus we sequenced the gene from animals with different MHC haplotypes, hoping to find polymorphisms. A two-nucleotide deletion was detected in one of the paternal β2M alleles “p3” from groups “i” (p3/m2) and “j,” (p3/m1) from family 2 with 39 members (Fig.

FIGURE 1. A, Phylogenetic tree analysis of β2M. GenBank accession numbers used for this analysis are as follows. β2M: M17987 (human), X69084 (bovine), NM_009735 (mouse), Y00441 (rat), P01885 (rabbit), P01886 (guinea pig), M84767 (chicken), P21612 (turkey), AAM98336 (opossum), BQ389924 (X. tropicalis), AAF37230 (X. laevis), L05536 (carp), NP_571238 (zebrafish), L63534 (trout), CAA10761 (cod), AAG17535 (salmon), CAB61324 (Siberian sturgeon), AAN40738 (Japanese flounder), CAD44965 (African barb), O42197 (catfish), CA330181 (Fugu), AAN62852 (skate), and CX197532 (dogfish). Class IIa: AAF66123 (nurse shark), AAL58430 (X. laevis), AAA59760 (human), AAV40625 (rat), NP_001001762 (chicken), XP_001376764 (opossum). Class IIb: AAF82681 (nurse shark), AAB86437 (human), NP_001008884 (rat), BAA02845 (X. laevis), NP_001038144 (chicken), AAB68822 (opossum). Class Ia: BAD92354 (human), AAC53397 (rat), AAL59857 (nurse shark), NP_001079241 (X. laevis), AAG28835 (chicken), NP_001165308 (opossum). IgM: AAD21191 (opossum), P01871 (human), AAH92586 (rat). DMB: ABB85336 (X. laevis), NP_002109 (human), NP_942035 (rat). DMA: NP_006111 (human), NP_942036 (rat), ACY01474 (chicken), XP_001377359 (opossum). The NJ tree was rooted with the fourth constant IgSF domains of IgM, and bootstrapping analysis was done after 1000 runs. Values are noted at the branch nodes, and the asterisk (*) indicates no significant value. The scale indicates divergence time (genetic distance). Teleost fish that underwent a third round of genome expansion (“3R”) are omitted from this analysis because the sequences were more divergent and skewing the tree topology. DM genes have not been identified in any fish. B, Expression profiles of β2M, class Ia, and ring3 via Northern blotting. Twenty micrograms of total RNA isolated from various nurse shark tissues was loaded onto the gel, blotted, and hybridized with full-length shark β2M and ring3 probes and washed under high-stringency conditions (23). Nucleoside-diphosphate kinase (NDPK) (35) was used as a loading control. C, There is only one β2M locus in the nurse shark genome. Genomic Southern blot analysis was performed under low-stringency conditions (23) using the IgSF exon with three wild sharks (a, b, c) whose DNA was digested with five different restriction enzymes (from left to right: BamHI, EcoRI, HindIII, PstI, and SacI).

FIGURE 2. The shark β2M is linked to the MHC. A, The two-nucleotide (CC) deletion polymorphism was found in intron 2 of β2M sequences in the “p3” paternal allele from siblings belonging to groups “i” and “j.” Thus, allele-specific primers were designed based on this polymorphism. All primers are underlined. The ends of coding regions are boxed. The (AG) at the end of intron 2 is underlined. B, PCR was carried out with a combination of allele-specific and universal NSB2mEx3Rev reverse primers. Presence or absence of the amplicon using the “p3”-specific primers was used for typing (top gel) family 2 with 39 offspring. Maternal primers were used for the positive control (bottom gel). Forward primers are indicated on the left side of the gels, and mother and sibling numbers are indicated above the gel along with MHC groups (36). C, Allele-specific PCR in families 1 and 3. In family 1, only two animals, both belonging to MHC group “h,” possessed the “CC-deletion” allele; in family 3, two animals belonging to group “g” had this allele. We partially typed family 3 based on the MHC groups by sequencing of the PBR of the class Ia alleles (maternal and paternal alleles are designated as numbers above the gel) and by Southern blotting with a probe containing MHC class Ia leader and α1 domains (small dot, band for maternal haplotype 1; large dot, maternal haplotype 2). The “p2” allele of the “g” group is the only haplotype possessing the “CC-deletion” allele of β2M. D, Plot of LOD scores at corresponding recombination fractions. The sums of the two families were used (Supplemental Table I).
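The allele-specific typing described in panels A and B can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the intron fragment `WILD_TYPE` is a made-up sequence (the real sequence is shown only in the figure), the primer length of 13 nt is arbitrary, and annealing is modeled as an exact match, which is a deliberate simplification of real PCR behavior. The sketch only shows why placing the CC dinucleotide at the third and fourth positions from the 3′ end lets each primer amplify one allele and not the other.

```python
# Hypothetical intron fragment carrying the CC dinucleotide (NOT the real shark sequence).
WILD_TYPE = "GATTACAGGTACCTGAAGCTT"
CC_START = WILD_TYPE.index("CC")

# The "CC-deletion" allele is the same fragment with the two C residues removed.
DELETION = WILD_TYPE[:CC_START] + WILD_TYPE[CC_START + 2:]


def forward_primer(template, three_prime_end, length=13):
    """Return the forward primer whose 3' end sits at index three_prime_end (exclusive)."""
    return template[three_prime_end - length:three_prime_end]


# Primer for the undeleted allele: CC occupies positions 3 and 4 from the 3' end.
INS_PRIMER = forward_primer(WILD_TYPE, CC_START + 4)
# Primer for the deletion allele: the deletion junction sits at the same distance
# from the 3' end, so the last bases pair only with the deleted template.
DEL_PRIMER = forward_primer(DELETION, CC_START + 2)


def amplifies(primer, template):
    """Simplified annealing model: the primer must match the template exactly."""
    return primer in template


def type_allele(template):
    """Report which allele-specific primer would give a product on this template."""
    return {
        "ins": amplifies(INS_PRIMER, template),
        "del": amplifies(DEL_PRIMER, template),
    }


print(type_allele(WILD_TYPE))
print(type_allele(DELETION))
```

In this toy model, presence or absence of a product with the deletion-specific primer plays the role of the top gel in panel B, and the other primer plays the role of the positive control.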

2A), and allele-specific PCR was performed in all members of the nurse shark families in our collection (Fig. 2B). Family 1 had two positive members that shared the same paternal MHC haplotype (group “h”) (Fig. 2C). In family 2, all seven members of groups “i” and “j” bearing the paternal MHC haplotype “p3” were positive as well as one other offspring belonging to the “e′” group. Family 3 with 29 offspring, which had not been MHC-typed previously, was tested, and two members were positive for the β2M polymorphism (Fig. 2C). Typing of this family by Southern blotting as well as sequencing of the class Ia alleles in all offspring showed that these two animals share the same paternal MHC haplotype (Fig. 2C). Thus, a total of 11 of 12 siblings positive for the β2M polymorphism in three families showed precise cosegregation with certain MHC haplotypes. In addition, 73 of 74 siblings with many other haplotypes lacked this polymorphism, further strongly indicating that β2M does not segregate independently of the MHC. The one discordant animal in family 2 (sib 36, group “e′”) was also typed by microsatellite analysis and shown to have been sired by the same father as offspring in the “i” and “j” groups; thus this father had the MHC haplotypes “p3” and “p6” (E.J. Heist et al., submitted for publication), consistent with a paternal intra-MHC recombination event in sib 36. To quantify formally the evidence for linkage of β2M to the MHC, we considered all offspring of the two deletion-carrying sires (found within families 2 and 3) as assigned by Southern blotting with class I probes (Fig. 2C) (36), sequences of MHC class I alleles (Fig. 2C, Table I), and microsatellite analysis (E.J. Heist et al., submitted for publication) (Table I). Family 1 sires have not been microsatellite-characterized, and therefore family 1 was not included in the analysis. We performed a parametric linkage analysis (26) to evaluate the evidence for β2M and MHC synteny and obtained a maximum LOD score of 3.14 [1378:1 odds of linkage versus no linkage, equivalent to $p = 7 \times 10^{-5}$ (27)] at a θ of 0.056 (Supplemental Table I, Fig. 2D).

β2M is adjacent to MHC-linked Ring3
Ring3 (or BRD2) is a putative nuclear transcriptional regulator and a nuclear kinase required for early development (38–41) with no defined immune functions but nevertheless linked to the MHC of all other gnathostomes and to the “proto-MHC” in lower deuterostomes (42). A portion of ring3 was initially cloned via degenerate PCR from nurse shark spleen cDNA, and this short fragment was used as a probe to isolate a full-length cDNA from a phage library. BLAST searches and phylogenetic tree analysis confirmed the orthology of nurse shark ring3 to that of other species (GenBank accession number HM625830) (Fig. 3A). The nurse shark ring3 is ubiquitously expressed (Fig. 1B). To ensure that the shark ring3 is linked to the MHC as in all other species examined (8), we performed ssCP analysis using siblings of family 2 (Fig. 3B). Two distinguishing ring3 bands corresponding with the maternal MHC allele m2 were found in those siblings possessing this allele (groups “i” and “d” in Fig. 3) with 100% fidelity, demonstrating that ring3 is closely linked to the MHC and further confirming the β2M linkage. We identified other BAC clones that were either β2M- or ring3-single-positive; unfortunately, none of them was positive for other MHC genes, again consistent with larger intergenic distances in sharks compared with those of other species (36). Chen et al. (29) drew a premature conclusion of non-MHC linkage; however, determining the linkage status of β2M (or almost any gene) based on a single BAC sequence is not sufficient for the shark genome, where there are large intragenic and intergenic distances. Several nurse shark BAC clones (22) were isolated with the ring3 and β2M probes, and some of them were positive for both genes. As previously reported (29), the β2M gene contains at least three exons, having a similar genomic organization and size to other species. The shark ring3 gene spans ∼20 kb and contains 12 exons, which is approximately twice as large as mammalian ring3 genes (e.g., 12.8 kb and 9.7 kb for human and mouse, respectively), consistent with a larger gene size found in most shark MHC genes (36). Sequencing through an entire BAC clone (GC_614H19) confirmed that the β2M and ring3 genes were adjacent to each other ∼45 kb apart (Fig. 4).

Genetic descent of β2M
The chromosomal location of the β2M gene varies greatly among vertebrate species (Fig. 5). Genomic synteny is well conserved in

Table I. List of sibs used for statistical analysis

            MHC (a)                  Haplotypes               μ-Satellite   Phase
Sib No.     Old Group   New Group    MHC Class Ia    β2m      Sire          1     2

Family 2
15          i           i            m2/p3           del      4             NR    R
30          i           i            m2/p3           del      4             NR    R
21          j           j            m1/p3           del      4             NR    R
25          j           j            m1/p3           del      4             NR    R
31          j           j            m1/p3           del      4             NR    R
33          j           j            m1/p3           del      4             NR    R
39          j           j            m1/p3           del      4             NR    R
20          e′          e′           m1/p6           ins      4             NR    R
32          e′          e′           m1/p6           ins      4             NR    R
36          e′          e′           m1/p6           del      4             R     NR
28          g′          g′           m2/p6 (b)       ins      4             NR    R
13          c           g′           m2/p6 (b)       ins      4             NR    R

Family 3
8                       g            m2/p2           del      2             NR    R
23                      g            m2/p2           del      2             NR    R
6                       d            m1/p4           ins      2             NR    R
7                       d            m1/p4           ins      2             NR    R
9                       d            m1/p4           ins      2             NR    R
19                      d            m1/p4           ins      2             NR    R

(a) Old group is taken from Ref. 28, and new groups are assigned in this study.
(b) MHC class Ia sequences revealed that sib 13 is further categorized with group g′ in this study.
del, CC-deletion haplotype; NR, nonrecombinant; R, recombinant.
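The phase-ambiguous LOD calculation described in Materials and Methods can be sketched in Python from the counts in Table I. This is an illustrative reconstruction, not the authors' software: the function names, the grid search with a step of 0.001, and the use of `math.erfc` for the 1-degree-of-freedom χ² tail are assumptions made here. The counts are read from the table under phase 1 (family 2: 11 nonrecombinant and 1 recombinant; family 3: 6 nonrecombinant and 0 recombinant); under phase 2 the labels swap.

```python
import math

# (nonrecombinant, recombinant) counts under phase 1, read from Table I.
FAMILIES = {
    "family 2 (sire 4)": (11, 1),
    "family 3 (sire 2)": (6, 0),
}


def phase_known_odds(theta, n_nonrec, n_rec):
    """Likelihood at recombination fraction theta relative to free recombination (0.5)."""
    linked = theta ** n_rec * (1.0 - theta) ** n_nonrec
    unlinked = 0.5 ** (n_rec + n_nonrec)
    return linked / unlinked


def phase_ambiguous_lod(theta, n_nonrec, n_rec):
    """log10 of the odds averaged over the two possible parental phases."""
    odds_phase1 = phase_known_odds(theta, n_nonrec, n_rec)
    odds_phase2 = phase_known_odds(theta, n_rec, n_nonrec)  # labels swapped
    return math.log10(0.5 * (odds_phase1 + odds_phase2))


def total_lod(theta):
    """Sum the per-family phase-ambiguous LOD scores."""
    return sum(phase_ambiguous_lod(theta, nr, r) for nr, r in FAMILIES.values())


def maximize_lod(step=0.001):
    """Grid search over 0 < theta <= 0.5 (theta = 0 is skipped to avoid log10(0))."""
    thetas = [i * step for i in range(1, int(round(0.5 / step)) + 1)]
    best = max(thetas, key=total_lod)
    return best, total_lod(best)


def one_sided_p(lod):
    """One-sided p value from a 1-df chi-square statistic of 2 * ln(10) * LOD."""
    chi2 = 2.0 * math.log(10.0) * lod
    two_sided = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square(1 df) upper tail
    return 0.5 * two_sided


best_theta, max_lod = maximize_lod()
p_value = one_sided_p(max_lod)
print(f"theta = {best_theta:.3f}, LOD = {max_lod:.2f}, "
      f"odds = {10 ** max_lod:.0f}:1, p = {p_value:.1e}")
```

Under these assumptions the maximum falls close to θ = 1/18, which is what one expects from 1 recombinant among 18 informative offspring, and the resulting LOD score and p value are in line with the values reported in the text.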

the region of chicken β2M relative to humans except for deletions of certain genes (43), and the same seems to be true for the Anolis lizard in which the synteny near the β2M gene (GenBank accession number FG703784, etc.) is conserved (genomic scaffold-670, 634,364 bp) (Supplemental Table II). Mouse β2M is linked to the so-called minor histocompatibility complex on chromosome 2 (16) and is located within a small region syntenic to human chromosome 15 (43). Notably, a smaller syntenic block is embedded with genes mapping to human chromosome 14q11.2 in a marsupial, the opossum. Although these regions can be accounted for by block translocations or syntenic breakpoints, synteny is not conserved in species from lower vertebrate classes as β2M is surrounded by genes mapping to various human chromosomes. The amphibian Xenopus β2M is linked to the genes mapping to human chromosomes 16 and 17 (genomic scaffold-673). In zebrafish, β2M (chromosome 4) is surrounded by genes mapping to human chromosome 12p12, and various locations in the human genome have syntenic regions on the Fugu scaffold-171 (638,182 bp). As mentioned above, the teleost fish experienced a recent genome-wide duplication (“3R”), and there is another β2M locus in the zebrafish genome that is ∼60% similar to its paralogue at the amino acid level. Notably, the second β2M locus is found at the telomeric region of chromosome 8 and is distantly linked to a class IIA gene and two class Ib genes of the L-lineage (44) (Supplemental Table II). Although the β2M linkage is not very close (i.e., 6.5 Mbp apart) in this chromosomal region (considering the rapid reorganization of syntenic regions in the teleost fish), this linkage group of class II/class I/β2M is likely a vestige of the primordial synteny. Combining all of the evidence, our study in nurse shark demonstrates that β2M was originally encoded in the MHC, and from extensive database analysis in many taxa, this gene underwent multiple translocations in gnathostomes, either stepwise or independently from the MHC (Fig. 5).

FIGURE 3. A, Phylogenetic tree analysis of Ring3 and homologues. GenBank accession numbers used in this analysis are as follows. Ring3 (BRD2): CAM25760 (human), AAY34703 (bovine), CAI11405 (dog), CAA15819 (mouse), CAE83937 (rat), XP_001369391 (opossum), CAN13285 (pig), CAA65449 (chicken), BAC82511 (quail), AAI68574 (X. tropicalis), AAI30180 (X. laevis), CAK04960 (zebrafish-1), CAD54663 (zebrafish-2), ABQ59684 (salmon), BAD93258 (medaka). Additional accession numbers for Ring3 homologues used for this analysis are the following: BRD3: AAI29055 (X. laevis), NP_031397 (human), NP_075825 (mouse), XP_001365890 (opossum), XP_425330 (Chicken). BRDT: NP_473395 (mouse), NP_997072 (human), XP_537079 (dog). BRD4: NP_490597 (human), NP_065254 (mouse), NP_001104751 (zebrafish), AAH76786 (X. laevis). BRD1: NP_001157300 (horse), XP_698063 (zebrafish), NP_001085846 (X. laevis), CAG30294 (human). Gene names are noted after species name. BRD1 does not map to an MHC paralogous region, whereas BRDT, BRD3, and BRD4 are found in the MHC paralogous regions. The tree was constructed using the NJ method, rooted with BRD1, and bootstrapping analysis was done with 1000 runs. Values are noted at the branch nodes, and an asterisk (*) indicates no significant value. The scale indicates the divergence time. B, The shark ring3 maps to the MHC. Primers from exons 4 and 5 were used for PCR amplification and ssCP analysis. The ∼1440-bp amplicon from the siblings along with mother shark genomic DNA were loaded on an 0.5× MDE gel. Under these conditions, “m2” was identified as two distinctive bands indicated as arrows. Mother and sibling numbers are indicated above the gel along with MHC groups and haplotype combinations from previous work (36).

Discussion
Compared with other vertebrate models (e.g., chicken or teleost fish), the shark genome seems to be stable, first demonstrated with the linkage of MHC class I and II genes (21), which was lost in bony fish (28), and later with linkage conservation of genes found in the mammalian MHC class III region (37). These MHC linkage data are consistent with global genomic studies in the elephant shark suggesting that cartilaginous fish have greater preservation of synteny than is found in any teleost model (45, 46). The β2M linkage to the shark MHC demonstrated here is likely the primordial condition, thus further supporting the conservation of the cartilaginous fish genome. Furthermore, the close proximity of class I, class II, and β2M is consistent with the theory that they were derived from a common ancestor by tandem (cis) duplication. The close linkage of β2M and class I may have regulated their original coordinated expression and upregulation. Class I and β2M expression is nearly identical in the nurse shark (Fig. 1B), but in other vertebrates β2M is made in excess (47). Furthermore, the number of β2M loci is expanded in rainbow trout (48) and polyploid Xenopus species (18).

Unlike class II genes, class I genes are extraordinarily plastic. Besides the MHC-linked classical class Ia genes, there are also many nonclassical class Ib genes with varied functions, some encoded in the MHC and others not. The majority of class Ib proteins associates with β2M as well, and it has been speculated that there was an advantage of translocation of β2M out of the MHC so that it would not be subject to duplications and deletions (19), like class I genes in many vertebrates. Consistent with the idea of maintaining genomic stability, but in contrast to class I and class II genes, both β2M and ring3 genes are in a very stable part of the shark MHC, with very few polymorphisms and transposable elements (Fig. 4); there was no polymorphism detected by using restriction enzymes/Southern blotting with either the ring3 or β2M probe. Although there are a few bony fish species in which

FIGURE 4. Map of BAC clone GC_614H19. Gene orientation is indicated as arrows and exons are shown in boxes. Only one exon for ZFP112-like gene was identified based on the similarity to other species. The positions of repetitive elements are shown above the map classified into four different categories. The total interspersed repeats are found in ∼5.35% of the sequences, consisting of ∼4.74% of LINEs and ∼0.63% of simple repeats. Each exon is indicated as a box, and transcriptional orientations are shown with an arrow in the 59 to 39 direction. The sequence has been deposited in the DNA Data Bank of Japan under accession number AB571627. the number of b2M loci has been expanded (49), and there are two early after the emergence of tetrapods, as no DM genes have been loci in the tetraploid Xenopus laevis (18), generally these species found in the teleost or cartilaginous fish; the maximum likelihood b are exceptions. There seems to be only one 2M locus in the nurse and Bayesian inference trees favor this scenario (S2). The NJ tree Downloaded from shark genome, because genomic Southern blotting with many (Fig. 1B), however, suggests that shark class IIA and IIB genes restriction enzymes yielded a single band with an exon-specific cluster with class II genes from other species rather than at the probe (Fig. 1C). basal position of class II/DM, suggesting that sharks may indeed The primordial linkage of b2M to the MHC does not contribute possess DM. to the debate on which gene came first, class I or class II. Among An orthologous gene related to the ancestor of ring3 is present the various IgSF domains, the C1-type is a rare form, found pri- in the urochordate (e.g., amphioxus) “proto-MHC” (42), and thus http://www.jimmunol.org/ marily in molecules associated with adaptive immunity (50). the MHC-linkage of ring3 in sharks is not surprising. 
To determine Therefore, it is reasonable to propose that C1-type IgSF-encoding the linkage status in other cartilaginous fish species, we examined genes like b2M were present in the “proto-MHC,” which then the elephant shark genome. Current analyses of the elephant acquired the PBR from another gene family. Furthermore, it has shark genome (46) has yielded only short (,1 kbp) scaffolds been speculated that all molecules containing C1-type IgSF do- (AAVX01540028.1) in which we only identified the b2M C1 do- mains arose from a common ancestor, and thus an Ig/TCR pre- main. Three scaffolds were found to contain some exons of the ele- cursor may have originated from the “proto-MHC” (20). Con- phant shark ring3 gene [AAVX01538535 (754 bp), AAVX01069837 sistent with previous studies dating back almost 30 y (3, 5, 33, 34), (5232 bp), AAVX01012433 (4324 bp)]; however, the assembly is still

our phylogenetic analysis demonstrated a common origin for the in its early stages. Further progress in this genome project will reveal by guest on September 28, 2021 class IIA/DMB/b2M and the class Ia/DMA/class IIB lineages, and the synteny around b2M and all of the other MHC genes and likely all of these genes share an ancestral C1 domain-encoding exon provide insight into the natural history of the adaptive immune system that emerged after the split between Ag receptors and MHC genes by revealing other genes that have been translocated out of the (Fig. 1B). Whereas class IIA, b2M, class IIB, and class Ia share an MHC during vertebrate evolution. For example, there is good evi- immediate common ancestor that arose by tandem duplication dence from various vertebrates that both IgSF- and C-type lectin- from the ancestral molecule, each DM gene was apparently gen- containing NK cell receptor genes (in humans, they are encoded in erated by tandem duplications of class IIA and class IIB, perhaps leukocyte receptor complex and NK complex, respectively) and the

FIGURE 5. Inconsistent synteny of b2M among vertebrate species. Genomic synteny of b2M is not consistent in bony fish and Xenopus, suggesting that multiple translocations of b2M occurred over evolutionary time. An asterisk (*) indicates the location of the b2M gene, and brackets indicate the genomic regions corresponding with the particular human chromosome. The detailed gene assignments can be obtained in Supplemental Table II. IgH and TCRa loci are marked in opossum chromosome 1. 8 LINKAGE ANALYSIS OF b2-MICROGLOBULIN

MHC were genetically linked at an early point in vertebrate evolution 16. Michaelson, J. 1981. Genetic polymorphism of beta 2-microglobulin (B2m) (20, 51, 52), suggesting that NK receptors co-evolved with MHC maps to the H-3 region of chromosome 2. Immunogenetics 13: 167–171. 17. Riegert, P., R. Andersen, N. Bumstead, C. Do¨hring, M. Dominguez-Steglich, proteins. We have found a fragment of a zinc finger protein (ZFP), J. Engberg, J. Salomonsen, M. Schmid, J. Schwager, K. Skjødt, and J. Kaufman. ZFP112-like, in BAC clone GC_614H19, adjacent to b2M (Fig. 1996. The chicken beta 2-microglobulin gene is located on a non-major histo- compatibility complex microchromosome: a small, G+C-rich gene with X and Y 5). ZFP112 is found on human chromosome 19q13.2 near FcRn boxes in the promoter. Proc. Natl. Acad. Sci. USA 93: 1243–1248. (19q13.3), a nonclassical class Ib molecule, and the leukocyte re- 18. Stewart, R., Y. Ohta, R. R. Minter, T. Gibbons, T. L. Horton, P. Ritchie, ceptor complex (19q13.4). This region had been suggested to be an J. D. Horton, M. F. Flajnik, and M. D. Watson. 2005. Cloning and character- ization of Xenopus beta2-microglobulin. Dev. Comp. Immunol. 29: 723–732. MHC paralogous region by pericentric inversion of 19p13.1. Whether 19. Ono, H., F. Figueroa, C. O’hUigin, and J. Klein. 1993. Cloning of the beta 2- the nurse shark ZNF112 is a pseudogene or divergent from human/ microglobulin gene in the zebrafish. Immunogenetics 38: 1–10. rodent ZFP112 genes, the linkage of ZFP112 suggests that the linkage 20. Flajnik, M. F., and M. Kasahara. 2010. Origin and evolution of the adaptive immune system: genetic events and selective pressures. Nat. Rev. Genet. 11: 47– of NK receptor(s) and MHC could be preserved in the shark genome. 59. Furthermore, we found b2M on the same chromosome as TCRa/d in 21. Ohta, Y., K. Okamura, E. C. McKinney, S. Bartl, K. Hashimoto, and horse (chromosome 1), cow (chromosome 10), and both TCRa/d and M. F. Flajnik. 2000. 
Primitive synteny of vertebrate major histocompatibility complex class I and class II genes. Proc. Natl. Acad. Sci. USA 97: 4712–4717. Ig in the opossum genome (Fig. 5, Supplemental Table II). In ad- 22. Luo, M., H. Kim, D. Kudrna, N. B. Sisneros, S. J. Lee, C. Mueller, K. Collura, dition, Ag receptor loci and other genes involved in immune defense A. Zuccolo, E. B. Buckingham, S. M. Grim, et al. 2006. Construction of a nurse shark (Ginglymostoma cirratum) bacterial artificial chromosome (BAC) library (e.g., B7 ligands and Fc-like receptors) are linked to genes related to and a preliminary genome survey. BMC Genomics 7: 106. the Xenopus MHC (Y. Ohta and M.F. Flajnik, manuscripts in prep- 23. Bartl, S., M. A. Baish, M. F. Flajnik, and Y. Ohta. 1997. Identification of class I aration), and cathepsins S and L are found on MHC paralogous genes in cartilaginous fish, the most ancient group of vertebrates displaying an adaptive immune response. J. Immunol. 159: 6097–6104. regions in mammals (20). Such evidence is consistent with our hy- 24. Saitou, N., and M. Nei. 1987. The neighbor-joining method: a new method for Downloaded from pothesis that Ag receptors (TCR, Ig), NK receptors, and other genes reconstructing phylogenetic trees. Mol. Biol. Evol. 4: 406–425. involved in Ag processing and generally in immune function might 25. Page, R. D. 1996. TreeView: an application to display phylogenetic trees on personal computers. Comput. Appl. Biosci. 12: 357–358. have been linked in a “pre-adaptive immune complex” in the an- 26. Ott, J. 1999. Analysis of Human Genetic Linkage. The Johns Hopkins University cestral configuration. Press, Baltimore, MD. 27. Lander, E., and L. Kruglyak. 1995. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nat. Genet. 11: 241–

Acknowledgments 247. http://www.jimmunol.org/ ´ We thank Dr. Mike Criscitiello and Caitlin Doremus for critical reading. 28. Sato, A., F. Figueroa, B. W. Murray, E. Malaga-Trillo, Z. Zaleska-Rutczynska, H. Su¨ltmann, S. Toyosawa, C. Wedekind, N. Steck, and J. Klein. 2000. Non- linkage of major histocompatibility complex class I and class II loci in bony Disclosures fishes. Immunogenetics 51: 108–116. 29. Chen, H., S. Kshirsagar, I. Jensen, K. Lau, C. Simonson, and S. F. Schluter. 2010. The authors have no financial conflicts of interest. Characterization of arrangement and expression of the beta-2 microglobulin locus in the sandbar and nurse shark. Dev. Comp. Immunol. 34: 189–195. 30. Williams, A. F., and A. N. Barclay. 1988. The immunoglobulin superfamily— References domains for cell surface recognition. Annu. Rev. Immunol. 6: 381–405. 31. Saper, M. A., P. J. Bjorkman, and D. C. Wiley. 1991. Refined structure of the 1. Flajnik, M. F., and L. Du Pasquier. 2008. Evolution of the immune system. In human histocompatibility antigen HLA-A2 at 2.6 A resolution. J. Mol. Biol. 219:

Fundamental Immunology. W. E. Paul, ed. Lippincott Williams & Wilkins, by guest on September 28, 2021 277–319. Philadelphia, p. 56–124. 32. Criscitiello, M. F., R. Benedetto, A. Antao, M. R. Wilson, V. G. Chinchar, 2. Flajnik, M. F., C. Canel, J. Kramer, and M. Kasahara. 1991. Which came first, N. W. Miller, L. W. Clem, and T. J. McConnell. 1998. Beta 2-microglobulin of MHC class I or class II? Immunogenetics 33: 295–300. ictalurid catfishes. Immunogenetics 48: 339–343. 3. Kaufman, J. F., and J. L. Strominger. 1982. HLA-DR light chain has a poly- 33. Flajnik, M. F., K. Miller, and P. L. Du. 2003. Evolution of the immune system. In morphic N-terminal region and a conserved immunoglobulin-like C-terminal Fundamental Immunology. W. E. Paul, ed. Lippincott Williams & Wilkins, region. Nature 297: 694–697. Philadelphia, p. 519–570. 4. Klein, J., and C. O’hUigin. 1993. Composite origin of major histocompatibility 34. O’hUigin, C., H. Su¨ltmann, H. Tichy, and B. W. Murray. 1998. Isolation of mhc complex genes. Curr. Opin. Genet. Dev. 3: 923–930. class II DMA and DMB cDNA sequences in a marsupial: the gray short-tailed 5. Hughes, A. L., and M. Nei. 1993. Evolutionary relationships of the classes of opossum (Monodelphis domestica). J. Mol. Evol. 47: 578–585. major histocompatibility complex genes. Immunogenetics 37: 337–346. 35. Kasahara, M., C. Canel, E. C. McKinney, and M. F. Flajnik. 1991. Molecular 6. Meyer, A., and Y. Van de Peer. 2005. From 2R to 3R: evidence for a fish-specific cloning of nurse shark cDNAs with high sequence similarity to nucleoside di- genome duplication (FSGD). Bioessays 27: 937–945. phosphate kinase genes. In Evolution of the Major Histocompatibility Complex. 7. Postlethwait, J., A. Amores, W. Cresko, A. Singer, and Y. L. Yan. 2004. Sub- J. Klein, ed. Springer-Verlag, New York, p. 491–499. function partitioning, the teleost radiation and the annotation of the human ge- 36. Ohta, Y., E. C. McKinney, M. F. Criscitiello, and M. F. Flajnik. 2002. 
Protea- nome. Trends Genet. 20: 481–490. some, transporter associated with antigen processing, and class I genes in the 8. Flajnik, M. F., and M. Kasahara. 2001. Comparative genomics of the MHC: nurse shark Ginglymostoma cirratum: evidence for a stable class I region and glimpses into the evolution of the adaptive immune system. Immunity 15: 351–362. MHC haplotype lineages. J. Immunol. 168: 771–781. 9. Nonaka, M., C. Namikawa, Y. Kato, M. Sasaki, L. Salter-Cid, and M. F. Flajnik. 37. Terado, T., K. Okamura, Y. Ohta, D. H. Shin, S. L. Smith, K. Hashimoto, 1997. Major histocompatibility complex gene mapping in the amphibian Xenopus T. Takemoto, M. I. Nonaka, H. Kimura, M. F. Flajnik, and M. Nonaka. 2003. implies a primordial organization. Proc. Natl. Acad. Sci. USA 94: 5789–5791. Molecular cloning of C4 gene and identification of the class III complement 10. Kaufman, J., S. Milne, T. W. Go¨bel, B. A. Walker, J. P. Jacob, C. Auffray, region in the shark MHC. J. Immunol. 171: 2461–2466. R. Zoorob, and S. Beck. 1999. The chicken B locus is a minimal essential major 38. Denis, G. V., M. E. McComb, D. V. Faller, A. Sinha, P. B. Romesser, and histocompatibility complex. Nature 401: 923–925. C. E. Costello. 2006. Identification of transcription complexes that contain the 11. Clark, M. S., L. Shaw, A. Kelly, P. Snell, and G. Elgar. 2001. Characterization of double bromodomain protein Brd2 and chromatin remodeling machines. J. the MHC class I region of the Japanese pufferfish (Fugu rubripes). Immunoge- Proteome Res. 5: 502–511. netics 52: 174–185. 39. Sinha, A., D. V. Faller, and G. V. Denis. 2005. Bromodomain analysis of Brd2- 12. Matsuo, M. Y., S. Asakawa, N. Shimizu, H. Kimura, and M. Nonaka. 2002. dependent transcriptional activation of cyclin A. Biochem. J. 387: 257–269. Nucleotide sequence of the MHC class I genomic region of a teleost, the medaka 40. Denis, G. V., C. Vaziri, N. Guo, and D. V. Faller. 2000. RING3 kinase trans- (Oryzias latipes). 
Immunogenetics 53: 930–940. activates promoters of cell cycle regulatory genes through E2F. Cell Growth 13. Michalova´, V., B. W. Murray, H. Su¨ltmann, and J. Klein. 2000. A contig map of Differ. 11: 417–424. the Mhc class I genomic region in the zebrafish reveals ancient synteny. J. 41. Denis, G. V., and M. R. Green. 1996. A novel, mitogen-activated nuclear kinase Immunol. 164: 5296–5305. is related to a Drosophila developmental regulator. Genes Dev. 10: 261–271. 14. Phillips, R. B., A. Zimmerman, M. A. Noakes, Y. Palti, M. R. Morasch, L. Eiben, 42. Danchin, E. G., and P. Pontarotti. 2004. Towards the reconstruction of the S. S. Ristow, G. H. Thorgaard, and J. D. Hansen. 2003. Physical and genetic bilaterian ancestral pre-MHC region. Trends Genet. 20: 587–591. mapping of the rainbow trout major histocompatibility regions: evidence for 43. Jones, C. T., D. R. Morrice, I. R. Paton, and D. W. Burt. 1997. Gene homologs on duplication of the class I region. Immunogenetics 55: 561–569. human chromosome 15q21-q26 and a chicken microchromosome identify a new 15. Shiina, T., J. M. Dijkstra, S. Shimizu, A. Watanabe, K. Yanagiya, I. Kiryu, conserved segment. Mamm. Genome 8: 436–440. A. Fujiwara, C. Nishida-Umehara, Y. Kaba, I. Hirono, et al. 2005. In- 44. Dijkstra, J. M., T. Katagiri, K. Hosomichi, K. Yanagiya, H. Inoko, M. Ototake, terchromosomal duplication of major histocompatibility complex class I regions T. Aoki, K. Hashimoto, and T. Shiina. 2007. A third broad lineage of major in rainbow trout (Oncorhynchus mykiss), a species with a presumably recent histocompatibility complex (MHC) class I in teleost fish; MHC class II linkage tetraploid ancestry. Immunogenetics 56: 878–893. and processed genes. Immunogenetics 59: 305–321. The Journal of Immunology 9

45. Kulski, J. K., T. Shiina, T. Anzai, S. Kohara, and H. Inoko. 2002. Comparative 49. Shum, B. P., K. Azumi, S. Zhang, S. R. Kehrer, R. L. Raison, H. W. Detrich, and genomic analysis of the MHC: the evolution of class I duplication blocks, di- P. Parham. 1996. Unexpected beta2-microglobulin sequence diversity in in- versity and complexity from shark to man. Immunol. Rev. 190: 95–122. dividual rainbow trout. Proc. Natl. Acad. Sci. USA 93: 2779–2784. 46. Venkatesh, B., E. F. Kirkness, Y. H. Loh, A. L. Halpern, A. P. Lee, J. Johnson, 50. Du Pasquier, L. 2000. Relationships among the genes encoding MHC molecules N. Dandona, L. D. Viswanathan, A. Tay, J. C. Venter, et al. 2007. Survey se- and the specific antigen receptors. In MHC Evolution, Structure and Function. L. quencing and comparative analysis of the elephant shark (Callorhinchus milii) Du Pasquier, and M. Kasahawa, eds. Springer-Verlag, Tokyo, p. 53–65. genome. PLoS Biol. 5: e101. 51. Rogers, S. L., T. W. Go¨bel, B. C. Viertlboeck, S. Milne, S. Beck, and 47. Ploegh, H. L., L. E. Cannon, and J. L. Strominger. 1979. Cell-free translation of the mRNAs for the heavy and light chains of HLA-A and HLA-B antigens. Proc. J. Kaufman. 2005. Characterization of the chicken C-type lectin-like receptors Natl. Acad. Sci. USA 76: 2273–2277. B-NK and B-lec suggests that the NK complex and the MHC share a common 48. Magor, K. E., B. P. Shum, and P. Parham. 2004. The beta 2-microglobulin locus ancestral region. J. Immunol. 174: 3475–3483. of rainbow trout (Oncorhynchus mykiss) contains three polymorphic genes. J. 52. Trowsdale, J. 2001. Genetic and functional relationships between MHC and NK Immunol. 172: 3635–3643. receptor genes. Immunity 15: 363–374. Downloaded from http://www.jimmunol.org/ by guest on September 28, 2021 Supplemental Figure legends

S1

Amino acid alignment of shark β2M sequences compared to other vertebrate species using the

clustalX2 program with minor adjustment. The mature proteins are indicated with an arrow.

Identity to human β2M is shown as a percentage at the end of each sequence. The conserved

IgSF cysteine and tryptophan residues are shaded. The positions of strands (based on the human

β2M) are indicated as a-g above the alignment, and the contact residues with the α1 (‘1’), α2 (‘2’),

and α3 (‘3’) domains of human HLA-A2 are noted above the alignment (Saper, M. A. et al.,

1991. J. Mol. Biol. 219:277-319). GenBank accession numbers used for this analysis are shown in

the Figure 1a legend. Potential glycosylation sites in cartilaginous fish sequences are underlined.

The elephant shark Ig exon was predicted from the genomic scaffold AAVX01540028.1 based

on amino acid similarity and the predicted exon-intron boundaries.

S2

Further phylogenetic tree analysis of β2M using different algorithms. Multiple sequence

alignments were created using the ClustalW2 Sequence Alignment program of the Molecular

Evolution Genetics Analysis software 4 (MEGA4: http://www.megasoftware.net/ (Tamura, K. et al.,

2007. Mol. Biol. Evol. 24:1596-1599)). Phylogenetic trees were constructed using the model-based a) Maximum Likelihood (ML) method (PhyML 3.0: http://www.atgc-montpellier.fr/phyml/

(Guindon, S., and O. Gascuel, 2003. Syst. Biol. 52:696-704)) and b)

Bayesian inference (BI) method (MrBayes 3.1.2: http://mrbayes.csit.fsu.edu/index.php). All

analyses were conducted on the aligned 37 amino acid sequences. We used Modelgenerator

v0.85 (http://bioinf.may.ie/software/modelgenerator/ (Keane, T. M. et al., 2006. BMC Evol. Biol. 6:29)) to estimate the most likely model of sequence evolution. Based on maximum likelihood values, the Akaike Information Criterion (AIC; Posada, D. and Buckley, T. R.

2004. Syst. Biol. 53:793-808), and the Bayesian Information Criterion (BIC; Schwarz, G. 1978. The Annals of Statistics 6:461-464), the WAG+I model was selected as the most likely model for

the ML method (-lnL = 7645.97, AIC = 15475.95) and the BI method (-lnL = 7645.97, BIC = 15725.22).

The Bayesian analysis was run using the Metropolis-coupled Markov Chain Monte Carlo

(MCMC) algorithm from randomly generated starting trees for 1,000,000 generations with sampling every 100 generations and ML trees were assessed by 1,000 bootstrap replicates.

GenBank accession numbers used for these trees are shown in Figure 1 legend.

Supplemental Table I: LOD score calculation

θ        Family 2                Family 3                Combined
         "R"   "NR"  LOD Score   "R"   "NR"  LOD Score   LOD Score
0        1     11    -∞          0     6     -∞          -∞
0.01     1     11    1.26        0     6     1.48        2.74
0.02     1     11    1.52        0     6     1.45        2.97
0.03     1     11    1.64        0     6     1.43        3.07
0.04     1     11    1.72        0     6     1.40        3.12
0.05     1     11    1.77        0     6     1.37        3.14
0.056    1     11    1.78        0     6     1.35        3.14
0.06     1     11    1.79        0     6     1.34        3.14
0.07     1     11    1.81        0     6     1.32        3.13
0.08     1     11    1.82        0     6     1.29        3.10
0.09     1     11    1.82        0     6     1.26        3.07
0.1      1     11    1.81        0     6     1.23        3.04
0.2      1     11    1.55        0     6     0.92        2.47
0.3      1     11    1.08        0     6     0.58        1.66
0.4      1     11    0.48        0     6     0.21        0.69
0.49     1     11    0.01        0     6     0.00        0.01
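
The table does not state the formula behind the LOD scores. For θ > 0, the tabulated values appear consistent with the standard two-point LOD score for phase-unknown families, in which the likelihood is averaged over the two possible parental phases. The Python sketch below is written under that assumption; the function and variable names are illustrative and are not taken from the paper, and θ = 0 is deliberately excluded.

```python
import math

def lod_phase_unknown(theta, recombinants, nonrecombinants):
    """Two-point LOD score for one phase-unknown family (assumed formula)."""
    if not 0.0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    n = recombinants + nonrecombinants
    # Average the likelihood over the two equally likely parental phases.
    likelihood = 0.5 * (
        theta ** recombinants * (1 - theta) ** nonrecombinants
        + theta ** nonrecombinants * (1 - theta) ** recombinants
    )
    # With no linkage (theta = 0.5) each meiosis contributes a factor of 0.5.
    null_likelihood = 0.5 ** n
    return math.log10(likelihood / null_likelihood)

# Recombinant ("R") and nonrecombinant ("NR") counts from Supplemental Table I.
FAMILIES = {"Family 2": (1, 11), "Family 3": (0, 6)}

def combined_lod(theta):
    """Sum the per-family LOD scores at a given recombination fraction."""
    return sum(lod_phase_unknown(theta, r, nr) for r, nr in FAMILIES.values())

if __name__ == "__main__":
    for theta in (0.01, 0.05, 0.1, 0.2, 0.4):
        per_family = [lod_phase_unknown(theta, r, nr) for r, nr in FAMILIES.values()]
        print(theta, [round(x, 2) for x in per_family], round(combined_lod(theta), 2))
```

Summing the per-family scores assumes the two families are independent. Under that assumption the combined score peaks near θ ≈ 0.05 to 0.06, matching the maximum combined LOD of 3.14 in the table.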

Supplemental Table II: List of genes around β2m in various species. Human chr position Opossum Human chr Chicken Human chr Anolis Human chr chr15 chr1 chr10 Scaffold-670 USP50 15q21.1 BTBD7 14q32.12-q32.13 634,364 bp USP8 15q21.2 C14orf130 14q32.12 GABPB2 15q21.2 C14orf142 14q32.12 HDC 15q21-q22 C14orf109 14q32.12 SLC27A2 15q21.2 ITPK1 14q31.1 ATP8B4 15q21.2 CHGA 14q32 DTW1 15q21.2 GOLGA5 14q32.12-q32.13 FGF7 15q15-q21.1 LGMN 14q32.1 C15orf33 15q21.1-q21.2 CPSF2 14q31.1 GALK2 15q21.1 NDUFB1 14q32.12 COPS2 15q21.2 FBLN5 14q32.1 KIAA0256 15q21.1 TC2N 14q32.12 KRT8P24 15q21.1 CATSPERB 14q32.12 EID1 15q21.1-q21.2 GZMB 14q11.2 SHC4 15q21.1-q21.2 STXBP6 14q12 CEP152 15q21.1 NOVA1 14q FBN1 15q21.1 FOXG1 14q13 DUT 15q15-q21.1 PRKD1 14q11 SLC12A1 15q15-q21.1 KIA1333 14q12 MYEF2 15q21.1 SCFD1 14q12 SLC24A5 15q21.1 COCH 14q12-q13 olfactory receptors SEMA6D 15q21.1 STRN3 14q13-q21 SHF 15q21.1 SQRDL 15q15 AP4S1 14q12 SLC28A2 15q15 PLDN 15q21.1 HECTD1 14q12 ALPK3 15q25.2 C15orf21 15q21.1 HEATR5A 14q12 MALT1 18q21 15q11.2-q21.3; SLC30A4 15q21.1 C14orf126 14q12 ZNF592 15q25.3 C15orf48 15q21.1 ARHGAP5 14q12 SEC11A 15q25.3 BRUNOL6 15q24 SPATA5L1 15q21.1 SLC30A4 15q21.1 DUOXA2 15q15.1 PARP6 15q23 GATM 15q21.1 SPATA5L1 15q21.1 DUOXA2 15q15.1 SLC28A2 15q15 SLC28A2 15q15 GATM 15q21.1 DUOX2 15q15.3 ALPK3 15q25.2 SHF 15q21.1 SLC28A2 15q15 SORD 15q15.3 ZNF592 15q25.3 DUOX1 15q15.3 DUOX1 15q15.3 RBPMS2 15q22.31 SHE 1q21.3 Human chr position Opossum Human chr Chicken Human chr Anolis Human chr chr15 chr1 chr10 Scaffold-670 DUOXA1 15q21.1 DUOXA1 15q21.1 ZNF609 15q22.31 DUOXA1 15q21.1 DUOXA2 15a15.1 DUOXA2 15q15.1 TRIP4 15q22.31 DUOX2 15q15.3 15q22.1- DUOX2 15q15.3 DUOX2 15q15.3 CSNK1G1 q22.31 DUOX1 15q15.3 SORD 15q15.3 SORD 15q15.3 PPIB 15q21-q22 SORD 15q15.3 C15orf43 15q21.1 C15orf43 15q21.1 SNX22 15q22.31 C15orf43 15q21.1 TRIM69 15q21.1 TRIM69 15q21.1 SNX1 15q22.31 TRIM69 15q21.1 B2M 15q21-q22.2 B2M 15q21-q22.2 B2M 15q21-q22.2 B2M 15q21-q22.2 PATL2 15q21.1 PATL2 15q21.1 RPS17 15q SNX1 
15q22.31 SPG11 15q14 SPG11 15q14 CPEB1 15q25.2 TNS1 2q35-q36 EIF3J 15q15.3 EIF3J 15q15.3 AP3B2 15q PP1B 2p23 15q22.1- CTDSPL2 15q15.3 CTDSPL2 15q15.3 NR2E3 15q22.32 CSNK1G1 q22.31 15q15.3 CASC4 15q15.3 MYO9A 15q22-q23 TRIP4 15q22.31 CASC4 15q15.3 FRMD5 15q15.3 SENP8 15q23 ZNF609 15q22.31 FRMD5 15q15.3 WDR76 15q15.3 GRAMD2 15q23 OAZ2 15q22.31 ACTBP7 15q15.3 SERINC4 15q15.3 PKM2 15q22 WDR76 15q15.3 SERF2 15q15.3 PARP6 15q23 MFAP1 15q15-q21 ELL3 15q15.3 CELF6 15q24 HYPK 15q15.3 PDIA3/ERp57 15q15 ARIH1 15q24 SERINC4 15q15.3 STRCP 15q15.3 BBS4 15q22.3-q23 SERF2 15q15.3 CKMT1A 15q15 ADPGK 15q24.1 ELL3 15q15.3 HISPPD2A 15q15.3 NEO1 15q22.3-q23 PDIA3/ERp57 15q15 MAP1A 15q13-qter HCN4 15q24-q25 CATSPER2P1 15q15.3 TP53BP1 15q15-q21 NPTN 15q22 STRCP 15q15.3 TUBGCP4 15q15 CD276 15q23-q24 CKMT1A 15q15 ADAL 15q15.3 ZP3 7q11.23 HISPPD2B 15q15.3 TGM7 15q15.2 LOXL1 15q22 CATSPER2 15q14 TGM5 15q15.2 MUC1 1q21 STRC 15q15.3 EPB42 15q15-q21 PPCDC 15q24.2 CKMT1B 15q15 CCNDBP1 15q14-q15 SCAMP5 15q24.2 HISPPD2A 15q15.3 TMEM62 15q15.2 RPP25 15q24.2 MAP1A 15q13-qter UBR1 15q13 MPI 15q22-qter TP53BP1 15q15-q21 CDAN1 15q15.2 CSK 15q23-q25 Human chr position Opossum Human chr Chicken Human chr chr15 chr1 chr10 TUBGCP4 15q15 CEP27 15q15.1 CYP1A1 15q24.1 ZSCAN29 15q15.3 LRRC57 15q15.1 SCAMP2 15q23-q25 ADAL 15q15.3 SNAP23 15q15.1 CYP1A4 15q24.1 LCMT2 15q15.3 ZFP106 15q15.1 EDC3 15q24.1 TGM7 15q15.2 CAPN3 15q15.1-q21.1 CLK3 15q24 TGM5 15q15.2 GANC 15q15.2 ACTG1 17q25 EPB42 15q15-q21 TMEM87A 15q15.1 SENP8 15q23 CCNDBP1 15q14-q15 VPS39 15q15.1 SEMA7A 15q22.3-q23 TMEM62 15q15.2 PLA2G4F 15q15.1 CYP11A1 15q23-q24 UBR1 15q13 EHD4 15q11.1 STRA6 15q24.1 FDPSL4 15q15.2 MAPKBP1 15q15.1 ISLR 15q23-q24 TTBK2 15q15.2 MGA 15q14 PML 15q22 CDAN1 15q15.2 LTK 15q15.1-q21.1 STOML1 15q24-q25 STARD9 15q15.2 ITPKA 15q14-q21 PML 15q22 CEP27 15q15.1 RTF1 15q15.1 ZP3 7q11.23 LRRC57 15q15.1 NDUFAF1 15q11.2-q21.3 ZP3 7q11.23 SNAP23 15q15.1 NUSAP1 15q15.1 COMMD4 15q24.2 ZFP106 15q15.1 OIP5 15q15.1 NEIL1 15q24.2 CAPN3 
15q15.1-q21.1 CHP 15q13.3 MAN2C1 15q11-q13 GANC 15q15.2 INOC1 15q15.1 SIN3A 15q24.2 TMEM87A 15q15.1 CHAC1 15q15.1 PTPN9 15q24.2 VPS39 15q15.1 DLL4 15q14 SNUPN 15q24.2 PLA2G4F 15q15.1 VPS18 15q14-q15 SH3PX3 15q24.2 PLA2G4D 15q15.1 RHOV 15q13.3 CSPG4 15q24.2 PLA2G4E 15q15.1 SPINT1 15q15.1 LINGO1 15q24.3 EHD4 15q11.1 PPP1R14D 15q15.1 LOC415345 SPTBN5 15q21 ZFYVE19 15q15.1 HMG20A 15q24 MAPKBP1 15q15.1 DNAJC17 15q15.1 SGK269 15q24.3 MGA 15q14 GCHFR 15q15 TSPAN3 15q24.3 TYRO3 15q15.1-q21.1 FAM82C 15q15.1 PSTPIP1 15q24-q25.1 TCEB1P2 15q15.1 RAD51 15q15.1 RCN2 15q23 RPAP1 15q15.1 CASC5 15q14 SCAPER 15q24

Human chr position Opossum Human chr Chicken Human chr chr15 chr1 chr10 LTK 15q15.1-q21.1 RPUSD2 15q13.3 LOC770273 ITPKA 15q14-q21 CCDC32 15q15.1 ETFA 15q23-q25 RTF1 15q15.1 CHST14 15q15.1 C15orf27 15q24.2 NDUFAF1 15q11.2-q21.3 BAHD1 15q15.1 NRG4 15q24.2 NUSAP1 15q15.1 IVD 15q14-q15 FBXO22 15q24.2 OIP5 15q15.1 DISP2 15q15.1 UBE2Q2 15q24.2 CHP 15q13.3 PLCB2 15q15 CHRNB4 15q24 EXDL1 15q15.1 BUB1B 15q15 CHRNA3 15q24 FAM92A2 15q15.1 BMF 15q14 CHRNA5 15q24 CYCSP2 15q15.1 SRP14 15q22 PSMA4 15q25.1 INOC1 15q15.1 EIF2AK4 15q15.1 AGPHD1 15q25.1 CHAC1 15q15.1 GPR176 15q14-q15.1 IREB2 15q25.1 DLL4 15q14 THBS1 15q15 CRABP1 15q24 VPS18 15q14-q15 ERVK6 7p22.1 WDR61 15q25.1 RHOV 15q13.3 RASGRP1 15q15 DNAJA4 15q25.1 SPINT1 15q15.1 FAM98B 15q14 ACSBG1 15q23-q24 PPP1R14D 15q15.1 SPRED1 15q14 IDH3A 15q25.1-q25.2 ZFYVE19 15q15.1 TMCO5 15q14 CIB2 15q24 DNAJC17 15q15.1 MEIS2 15q14 TBC1D2B 15q24.3-q25.1 GCHFR 15q15 C15orf41 15q14 LOC770502 FAM82C 15q15.1 ATPBD4 15q14 HERC1 15q22 RAD51 15q15.1 AQR 15q14 FBXL22 15q22.31 CASC5 15q14 ACTC1 15q11-q14 USP3 15q22.3 RPUSD2 15q13.3 ARHGAP11A 15q13.2 CA12 15q22 CCDC32 15q15.1 SCG5 15q13-q14 RAB8B 15q22.2 MRPL42P5 15q13.3 GREM1 15q13-q15 RPS27L 15q22.1 CHST14 15q15.1 FMN1 15q13.3 TPM1 15q22.1 BAHD1 15q15.1 TMCO5 15q14 LOC770647 IVD 15q14-q15 SGTA 19p13 LOC415374 C15orf23 15q15.1 RYR3 15q14-q15 LOC770688 DISP2 15q15.1 AVEN 15q13.1 LOC427491 C15orf52 15q15.1 CHRM5 15q26 VPS13C 15q22.2

Human chr position Opossum Human chr Chicken Human chr chr15 chr1 chr10 PLCB2 15q15 C15orf24 15q14 RORA 15q22.2 PAK6 15q14 C15orf29 15q14 NARG2 15q22.2 C15orf56 15q15.1 TMEM85 15q14 ANXA2 15q21-q22 BUB1B 15q15 SLC12A6 15q13-q15 LOC770837 BMF 15q14 NOLA3 15q14-q15 BNIP2 15q22.2 SRP14 15q22 C15orf55 15q14 LOC770897 EIF2AK4 15q15.1 AGPAT7 15q14 GTF2A2 15q22.2 H3F3AP1 15q15.1 UNC13C 15q21.3 GCNT3 15q21.3 GPR176 15q14-q15.1 WDR72 15q21.3 OTUD7A 15q13.3 FSIP1 15q14 ONECUT1 15q21.1-q21.2 KLF13 15q12 THBS1 15q15 C1orf123 1p32.3 LOC415381 C15orf54 15q14 KIAA1370 15q21.2 TRPM1 15q13-q14 C15orf53 15q14 ARPP16/19 15q21.2 MTMR15 15q13.2-q13.3 RASGRP1 15q15 MYO5A 15q21 MPHOSPH10 FAM98B 15q14 MYO5C 15q21 MCEE 2p13.3 SPRED1 15q14 GNB5 15q21.2 APBA2 15q11-q12 TMCO5 15q14 MAPK6 15q21 LOC415387 MEIS2 15q14 LEO1 15q21.2 TJP1 15q13 C15orf41 15q14 TMOD3 15q21.1-q21.2 LOC771170 ATPBD4 15q14 TMOD2 15q21.1-q21.2 TARSL2 15q26.3 NANOGP8 15q14 LYSMD2 15q21.2 TM2D3 15q26.3 ZNF770 15q14 SCG3 15q21 ADAL 15q15.3 AQR 15q14 DMXL2 15q21.2 LARP6 15q23 ACTC1 15q11-q14 GLDN 15q21.2 LRRC49 15q23 GJD2 15q14 TNFAIP8L3 15q21.2 THSD4 15q23 GOLGA8B 15q14 AP4E1 15q21.2 CHRNA7 15q14 GOLGA8A 15q11.2 PSL2 15q21.2 FAM81A 15q22.2 AGPAT7 15q14 TRPM7 15q21 MYO1E 15q21-q22 C15orf55 15q14 UBP50 15q21.1 RNF111 15q21 NOLA3 15q14-q15 USP8 15q21.2 LOC771357 SLC12A6 15q13-q15 GABPB2 15q21.2 SLTM 15q22.1 TMEM85 15q14 HDC 15q21-q22 FAM63B 15q21.3-q22.1

Human chr position Opossum Human chr Chicken Human chr chr15 chr1 chr10 C15orf29 15q14 SLC27A2 15q21.2 ADAM10 15q22 PGBD4 15q14 DTWD1 15q21.2 LIPC 15q21-q23 C15orf24 15q14 FGF7 15q15-q21.1 AQP9 15q22.1-q22.2 CHRM5 15q26 GALK2 15q21.1 ALDH1A2 15q21.3 AVEN 15q13.1 COPS2 15q21.2 GRINL1B 4q12 RYR3 15q14-q15 KIAA0256 15q21.1 CGNL1 15q21.3 FMN1 15q13.3 SHC4 15q21.1-q21.2 TCF12 15q21 GREM1 15q13-q15 CEP152 15q21.1 LOC427522 SCG5 15q13-q14 FBN1 15q21.1 LOC768435 C15orf45 15q13.3 DUT 15q15-q21.1 ZNF280D 15q21.3 ARHGAP11A 15q13.2 SLC12A1 15q15-q21.1 LOC426295 FAM7A1 15q13.3 SLC24A5 15q21.1 TEX9 15q21.3 CHRNA7 15q13.4 SEMA6D 15q21.1 RFXDC2 15q21.3 DEPDC1P1 15q13.3 PNRC2(?) 1p36.11 NEDD4 15q OTUD7A 15q13.3 CMA1 14q11.2 LOC768699 LOC400347 15q13.3 C14orf124 14q12 PRTG 15q21.3 LOC283711 15q13.3 KIAA0323 14q12 LOC768739 KLF13 15q12 CBLN3 14q12 PYGO1 15q21.1 LOC283710 15q13.3 KIAA1305 14q12 DYX1C1 15q21.3 TRPM1 15q13-q14 NFATC4 14q11.2 CCPG1 15q21.1 MTMR10 15q13.3 ADCY4 14q12 RAB27A 15q15-q21.1 MTMR15 15q13.2-q13.3 LTB4R 14q11.2-q12 RSL24D1 15q21 LTB4R2 14q11.2-q12 LOC768911 CIDEB 14q12 LOC768953 C14orf21 14q12 UNC13C 15q21.3 DHRS1 14q12 LOC769010 RABGGTA 14q11.2 LOC415414 TGM1 14q11.2 WDR72 15q21.3 TINF2 14q12 LOC769065 GMPR2 14q12 LOC769080 NEDD8 14q12 KIAA1370 15q21.2-q21.3 MDP-1 14q12 ARPP19 15q21.2

Opossum Human chr Chicken Human chr chr1 chr10 CHMP4A 14q12 MYO5A 15q21 TM9SF1 14q11.2 MYO5C 15q21 IPO4 14q12 GNB5 15q21.2 REC8 14q11.2-q12 NR13 IRF9 14q11.2 LOC415418 RNF31 14q11.2 MAPK6 15q21 PSME2 14q11.2 LOC769405 C14orf122 14q11.2 LEO1 15q21.2 PSME1 14q11.2 TMOD3 15q21.1-q21.2 FIT1 14q12 TMOD2 15q21.1-q21.2 WDR23 14q12 LYSMD2 15q21.2 PCK2 14q12 LYSMD2 15q21.2 NRL 14q11.1-q11.2 SCG3 15q21 CPNE6 14q11.2 AP4E1 15q21.2 DHRS4 14q11.2 LOC769161 DHRS2 14q11.2 TNFAIP8L3 15q21.2 DHRS2 14q11.2 CYP19A1 15q21.1 APIG 14q11.2 GLDN 15q21.2 THTPA 14q11.2 DMXL2 15q21.2 ZFHX2 14q11.2 LOC769781 NGDN 14q11.2 SEMA6D 15q21.1 MYH7 14q12 MYEF2 15q21.1 MYH6 14q12 SLC12A1 15q15-q21.1 CMTM5 14q11.2 DUT 15q15-q21.1 IL25 14q11.2 FBN1 15q21.1 EFS 14q11.2-q12 CEP152 15q21.1 SLC22A17 14q11.2 SHC4 15q21.1-q21.2 PABPN1 14q11.2-q13 LOC770176 BCL2L2 14q11.2-q12 SECISBP2L 15q21.1 PPP1R3E 14q11.2 COPS2 15q21.2 HOMEZ 14q11.2 GALK2 15q21.1-q21.2 SLC7A8 14q11.2 GALK2 15q21.1-q21.2

Opossum Human chr Chicken Human chr chr1 chr10 CEBPE 14q11.2 FGF7 15q15-q21.1 C14orf119 14q11.2 LOC770289 CDH24 14q11.2 DTWD1 15q21.2 PSMB11 14q11.2 LOC770318 PSMB5 14q11.2 LOC770338 C14orf93 14q11.2 SPATA5L1 15q21.1 C14orf94 14q11.2 LOC770357 PRMT5 14q11.2-q21 SLC30A4 15q21.1 RBM23 14q11.2 PLDN 15q21.1 REM2 14q11.2 SQRDL 15q15 LRP10 14q11.2 ITGB1BP3 19p13.3 MRPL52 14q11.1 SPPL2A 15q21.2 SLC7A7 14q11.2 TRPM7 15q21 OXA1L 14q11.2 USP8 15q21.2 OR6J1 14q11.2 GABPB2 1q21.3 ABHD4 14q11.2 HDC 15q21-q22 DAD1 14q11-q12 GATM 15q21.1 TCRdelta 14q11.2 PDE8A 15q25.3 TCRalpha 14q11.2 LOC415456 METTL3 14q11.1 FSD2 15q25.2 TOX4 14q11.2 WHDC1 15q25.2 RAB2B 14q11.2 HOMER2 15q24.3 CHD8 14q11.2 LOC415459 SUPT16H 14q11.2 C15orf40 15q25.2 RPGRIP1 14q11 BTBD1 15q24 HNRNPCL1 1p36.21 TM6SF1 15q24-q26 C14orf176 14q11.2 HDGFRP3 15q25.2 RNASE8 14q11.2 BNC1 15q25.2 TPPP2 14q11.2 SH3GL3 15q24

Zebrafish Human chr Zebrafish Human chr Fugu Human chr Xenopus Human chr chr4 chr8 scaffold-171 scaffold-673 CCDC59 12q21.31 MRPS25 3p25 NP_001017590 638,182bp 556,841bp DMBT1 10q26.13 NR2C2 3p25 NP_001116766 EFCAB4B 12p13.32 LOC402377 9q33.2 XP_001339446 PRMT8 12p13.3 hyp protein XP_002663293 TSPAN11 12p11.21 WDR46 6p21.3 NP_956917 SEMA3A 7p12.1 NADK 1p36.33 XP_002663292 PIM3 22q13 NADK 1p36.33 NP_001017580 PCLO 7q11.23-q21.3 WDR46 6p21.3 XP_002663288 ANKRD16 10p15.1 hyp protein XP_002663291 GDI2 10p15 LOC100288842 9q33.2 XP_002663290 ASB13 10p15.1 UBE2D4 7p13 NP_001082922 NET1 10p15 DBNL 7p13 NP_001018536 MKLN1 7q32 ARID5A 2q11.2 XP_695746 PODXL 7q32-q33 COX5B 2cen-q13 XP_001342184 CYB5A3 22q13.2-q13.31 GINS4 8p11.21 NP_001003546 BTBD11 12q23.3 PPP1R3B 8p23.1 NP_998345 CRY1 12q23-q24.1 PHYHIPL 10q11 XP_001342239 TNNT1 19q13.4 NR6A1 Pq33.3 NP_571331 ENDOU 12q13.1 TCF7L1 2p11.2 NP_571371 BRAF 7q34 VPS13A 9q21 NP_001112365 MRPS33 7q32-q34 GNA14 9q21 XP_683989 GYS2 12q23 LOC100129514 7q34 CEP78 9q21.2 XP_693164 LDHB 12q22 CNOT4 7q22-qter CABIN1 22q11.23 XP_700057 SLC25A3 12p12.3 MKRN1 7q34 TBX6 16p11.2 NP_571133 TMPO 12p11.2-p11.1 CECR6 22q11.1 KLH22 22q11.21 XP_002663287 STRAP 12q12.3 IL17RA 22q11.1 KLH22 22q11.21 XP_002663286 BICD1 22q12.3 DOC2A 16p11.2 PIM3 22q13 NPY1R 4q31.3-q32 NP_571511 LARGE 7q32-q34 CCDC95 16p11.2 PIM3 22q13 KCTD9 8p21.1 NP_001013585 BRAF 7q34 HIRIP3 16p11.2 PIM3 22q13 ANKRD39 2q11.2 NP_001107112 MRPS33 7q34 DUSP14 17q12 PIM3 22q13 FGFR1 8p12 NP_694494 MKLN1 22q11.1 SEZ6L2 16p11.2 PIM3 22q13 LGI3 8p21.3 NP_001034768 CECR6 9p13.3 NME2 17q21.3 RERGL 12p12.3 CHMP7 8p21.3 NP_957075 IL17RA 12p12.3 NUFIP2 17q11.2 C9orf100 9p13.3 SP3 2q31 NP_001037800 C9orf100 12p12.3 TAOK2 16p11.2 Zebrafish Human chr Zebrafish Human chr protein ID Fugu Human chr Xenopus Human chr chr4 chr8 scaffold-171 scaffold-673 LMO3 12p12.3 KIAA1310 2p12-p11.2 NP_001029348 LMO3 7q22.2 CDIPT 16p11.2 MGST1 12p12.3-p12.1 BMP1 8p21.3 NP_001035126 DERA 6p22.2-p21.3 ZNF DERA 12p12.3 
DNM1L 12p11.21 XP_695250 RINT1 10p13 NLK 17q11.2 HELB 12q14.3 RAB7A 3q21.3 NP_001002178 PRL? 15q21-q22.2 COL8A2? 1p34.2 IRAK3 12q14.3 UBL4A Xq28 NP_956594 HSPA14 12q14 TNFRSF12A 16p13.3 B2M 15q21-q22.2 B2M 15q21-q22.2 NP_998291 B2M 12q14.3 B2M 15q21-q22.2 IRAK3 12q14.3 ACSL6 5q31 XP_002663285 LEMD3 12p11 TRIM72 16p11.2 MSRB3 12q14.3 BIN3 8p21.3 NP_001116516 MSRB3 12p11.21 PYCARD 16p12-p11.2 LEMD3 12q14 ENTPD4 8p21.3-p21.2 NP_001002419 CAPRIN2 12p11 NLRP1 17p13.2 WIF1 12q14.3 ANTXR1 2p13.1 XP_001332841 IPO8 12p11.21 Histon H2A VHLL 1q22 RHOBTB2 8p21.3 XP_001332198 NLRP1 part? 17q13.2 TMBIM4 12q14.1-q15 IDO2 8p11.21 NP_001077323 HIST1H1? 6p21.3 LOC795849 STC1 8p21-p11.2 NP_001038922 HIST1H1? 6p21.3 GRIP1 12q14.3 NKX2-6 8p21.2 NP_571494 Histon H2A CAND1 12q14 NKX3-1 8p21 XP_682960 Histon H2B DYRK2 12q15 SLC25A37 8p21.2 NP_001035060 Histon H3 IFNG1-2 12q14 ENTPD4 8p21.3-p21.2 NP_001002419 Histon H4 IFNG1-1 12q14 SLC25A37 XP_002663282 PYCARD 16p12-p11.2 IL26 12q15 STC1 8p21-p11.2 XP_002663281 Histon H2A IL22 12q15 NKX2-6 8p21.2 XP_002663280 Histon H2B MDM1 12q15 STC1 8p21-p11.2 XP_002663283 Histon H4 CALU 7q32.1 ENTPD4 8p21.3-p21.2 XP_002663284 Histon H4 OPN1SW 7q32.1 AGTPBP1 9q21.33 NP_001019616 Histon H2B mir729 NTRK2 9 XP_002663279 TRIM25? 
17q23.2 TNPO3 7q32.1 RMI1 9q21.32 NP_956474 Histon H3 IRF5 7q32 HNRNPK 9q21.32-q21.33 NP_998159 Histon1 H4 PNPLA8 7q31 C9orf64 9q21.32 NP_001077027 PYCARD 16p12-p11.2 NRCAM 7q31.1-q31.2 GKAP1 9q21.32 NP_998237 CNTN1 12q11-q12 hyp protein XP_002663276 PDZRN4 12q12 FRMD3 9q21.32 XP_002663278 GXYLT1 12q12 FRMD3 9q21.32 XP_002663277 YAF2 12q12 RASEF 9q21.32 XP_699335 ZCRB1 12q12 OPN5 6p12.3 XP_001340602 Zebrafish Human chr Zebrafish Human chr protein ID chr4 chr8 PPHLN1 12q12 NFU1 2p15-p13 NP_001116180 PRICKLE1 12q12 SYP Xp11.23-p11.22 NP_001025413 PUS7L 12q12 NLRC3 16p13.3 XP_693445 TWF1 12q12 NLRC3 16p13.3 XP_001339109 NELL2 12q13.11-q13.12 PLP2 Xp11.23 NP_001154917 UTP20 12q23 PRICKLE3 Xp11.23-p11.22 XP_698649 ARL1 12q23.2 FAM110B 8q12.1 XP_001343536 TFEC 7q31.2 MFSD2A 1p34.2 NP_001104673 FGFBP2? 4p16 FKBP1B 2p23.3 NP_956239 ZMYND10 3p21.3 SNPH 20p13 XP_699720 PRL 6p22.2-p21.3 RAD21L1 20p13 NP_001073519 NUAK1 12q23.3 GOLM1 9q21.33 NP_956573 PLXNB2 22q13.33 DDX51 12q24.33 NP_001003864 PLXNB2 22q13.33 AAK1 2p14 XP_697452 PLXNB2 22q13.33 PER1 1pter-q24 NP_956969 ABCC9 12p12.1 TP73 1p36.3 NP_899183 KCNJ8 12p11.23 WDR8 1p36.3 NP_956187 CMAS 12p12.1 TPRG1L 1p36.32 XP_001922801 KIAA1644 22 PIM1 6p21.2 XP_002663274 ZNF629 16p11.2 PIM2 Xp11.23 XP_002663273 AKR1B1 7q35 PIM2 Xp11.23 XP_002663272 AKR1B1 7q35 PIM2 Xp11.23 XP_001919964 PRDM4 12q23-q24.1 PIM2 Xp11.23 XP_696811 PSMC2 7q22.1-q22.3 CLCN6 1p36 XP_696527 TSPAN33 7q32.1 NPPA 1p36.21 NP_942095 SMO 7q32.3 NPPB 1p36.21 XP_002663270 ZC3HC1 7q32.2 NPPC 2q24-qter NP_001122213 NPR2 9p21-p12 XP_696543 PRDM16 1p36.23-p33 XP_001922927 PLCH2 1p36.32 XP_002663269 GRIK1 21q22.11 NP_001138274 DPP9 19p13.3 NP_001070781 KANK1 9p24.3 XP_001919098 Zebrafish Human chr protein ID chr8 NDUFA7 19p13.2 NP_001003436 PIM2 Xp11.23 XP_696777 KIAA2013 1p36.22 NP_001107084 PLOD1 1p36.22 NP_001071210 MFN2 1p36.22 NP_001121726 PRDM16 1p36.23-p33 XP_001337090 PLCH2 1p36.32 XP_700421 PEX10 1p36.32 NP_001005994 MTHFR 1p36.3 NP_001121727 PIM1 6P21.2 
XP_001336296
PPIH                    1p34.1                        NP_001009902
TBX1                    1p34                          NP_001119929
ARHGEF16                1p36.3                        NP_001116755
ESPN                    1p36.31                       NP_001116754
HES2                    1p36.31                       NP_001116717
HES2                    1p36.31                       NP_001038818
ACOT7                   1p36                          NP_001004617
GPR153                  1p36.31                       XP_001919466
hyp protein                                           XP_001921235
ADAMTS13                9q34                          XP_692137
ADAMTS13                9q34                          XP_002663267
UBAP2                   9p13.3                        NP_001075950
PABPC1                  8q22.2-q23                    NP_956133
MAK16                   8p12                          NP_775346
YWHAB                   13                            NP_998310
PDPR                    16q22.1                       NP_956771
PIM                     1-6p21.2; 2-Xp11.23; 3-22q13  XP_002663264
SLC18A1                 8p21.3                        XP_700171
EGR3                    8p23-p21                      XP_002663257
SLC18A1                 8p21.3                        XP_002663262
PPIF                    10q22-q23                     NP_997923
OGDH                    7p14-p13                      NP_957073
ADAM9                   8p11.22                       XP_002663261
NUDT18                  8p21.3                        NP_001017843
FAM160B2                8p21.3                        NP_001071234
HMGXB3                  5q32                          XP_002663259
PIM2                    Xp11.23                       XP_697934
ANGPTL7                 1p36                          NP_001006073
MTOR                    1p36.2                        NP_001070679
hyp protein                                           NP_001116781
QARS                    3p21.31                       NP_957507
OGG1                    3p26.2                        NP_001116780
sult1st7                16p12.1?                      NP_001132953
sult1st8                2q12.3?                       NP_001132954
sult1st1                16p12.1?                      NP_891986
sult1st2                2q12.3?                       NP_899190
PIM2                    Xp11.23                       XP_694306
PIM2                    Xp11.23                       XP_694387
sult1st3                                              XP_002663266
TAS1R1                  1p36.23                       NP_001034920
CKAP2L                  2q13                          XP_001919249
hyp protein                                           NP_001116175
PURB                    7p13                          NP_001122194
H2AFV                   7p13                          XP_001921188
ARL6IP5                 3p14                          XP_002663253
retr-1-like                                           XP_002663256
ARL6IP5                 3p14                          NP_001017698
TIMM17B                 Xp11.23                       NP_001107065
CRAT                    9q34.1                        NP_001073466
hyp protein                                           NP_001107120
CACNA1D                 3p14.3                        XP_001333514
AEBP1                   7p13                          XP_696022
STAR                    8p11.2                        NP_571738
GRK5                    10q24-qter                    XP_695841
LSM1                    8p11.2                        NP_001003551
BAG4                    8p11.23                       XP_001337521
RTDR1                   22q11.2                       XP_700258
MHC class II alpha                                    XP_699949
MHC class I: L-lineage                                XP_001920787
MHC class I: L-lineage                                NP_001017904
hyp protein                                           XP_002663252
GABRB3                  15q11.2-q12                   XP_694878
GPR34                   Xp11.4-p11.3                  NP_001007218
MPP1                    Xq28                          NP_999857
NDFIP1                  5q31.3                        NP_001002503
CHM4B                   20q11.22                      NP_998622
C8orf41                 8p12                          XP_002663251