Molecular Immunology 46 (2009) 457–472

The B7 family of immunoregulatory receptors: A comparative and evolutionary perspective

John D. Hansen a,∗, Louis Du Pasquier b, Marie-Paule Lefranc c, Virginie Lopez d, Abdenour Benmansour e, Pierre Boudinot e

a US Geological Survey, Western Fisheries Research Center, Seattle, WA 98115, USA
b University of Basel, Institute of Zoology and Evolutionary Biology, Vesalgasse 1, CH-4051 Basel, Switzerland
c Institut Universitaire de France, Laboratoire d’ImmunoGénétique Moléculaire, Université Montpellier II, UPR CNRS 1142, France
d UMR 6632, Équipe Évolution biologique et Modélisation, Université de Aix Marseille/CNRS, case 19, 3, place Victor-Hugo, 13331 Marseille Cedex 03, France
e Institut National de la Recherche Agronomique, Unité de Virologie et Immunologie Moléculaires, 78352 Jouy-en-Josas Cedex, France

Article history: Received 8 October 2008; Accepted 9 October 2008; Available online 9 December 2008

Keywords: B7; CD80; CD86; B7-H1; B7-DC; B7-H3; B7-H4; Co-stimulation

Abstract

In mammals, T cell activation requires specific recognition of the peptide–MHC complex by the TcR together with co-stimulatory signals. Important co-stimulatory receptors expressed by T cells are the molecules of the CD28 family, which regulate T cell activation, proliferation and tolerance. These receptors recognize B7s and B7-homologous (B7H) molecules that are typically expressed by antigen-presenting cells. In teleost fish, typical T cell responses have been described and the TcR, MHC and CD28/CTLA4 genes have been characterized. In contrast, the members of the B7 gene family have only been described in mammals and birds and have yet to be addressed in lower vertebrates. To learn more about the evolution of components guiding T cell activation in vertebrates, we performed a systematic genomic survey for the B7 co-stimulatory and co-inhibitory IgSF receptors in lower vertebrates with an emphasis on teleost fish. Our search identified fish sequences that are orthologous to B7, B7-H1/B7-DC, B7-H3 and B7-H4 as defined by sequence identity, phylogeny and combinations of short or long-range syntenic relationships. However, we were unable to identify clear orthologs for B7-H2 (CD275, ICOS ligand) in bony fish, which correlates with our prior inability to find ICOS in fish. Interestingly, our results indicate that teleost fish possess a single B7.1/B7.2 (CD80/86) molecule that likely interacts with CD28/CTLA4, as the ligand-binding regions seem to be conserved in both partners. Overall, our analyses imply that gene duplication (and loss) have shaped a molecular repertoire of B7-like molecules that was recruited for the refinement of T cell activation during the evolution of the vertebrates. Published by Elsevier Ltd.

∗ Corresponding author. E-mail address: [email protected] (J.D. Hansen).

0161-5890/$ – see front matter. Published by Elsevier Ltd. doi:10.1016/j.molimm.2008.10.007

1. Introduction

The activation of T lymphocytes is finely tuned by a combination of signals delivered through the TcR–CD3 complex and accessory signals that can be either stimulatory or inhibitory. The initial signal through the MHC/peptide/TcR complex, termed “signal 1”, is not sufficient to induce full activation of naïve T lymphocytes, which thus requires an additional, antigen-independent co-stimulatory signal (signal 2). This co-stimulatory signal is provided by interactions between B7.1 (CD80) and B7.2 (CD86) ligands on APCs and CD28 expressed on T cells (Freeman et al., 1993). Once it has received this confirmatory signal from the APC (signal 2), the armed T cell will only require signal 1 for future activation and effector functionality against non-self. In mammals, both B7.1 and B7.2 are required for full activation of the naïve T cell by providing a balance of activating and inhibitory signals. However, these two receptors display distinct expression patterns: B7.1 is inducible, while B7.2 is constitutively expressed on APCs, is up-regulated upon activation in APCs (Larsen et al., 1994) and is required for the generation of mature DC repertoires. In contrast, the ligation of B7.1 and B7.2 with CTLA4 exerts an inhibitory effect on T cell activation, blocking Th2 responses and maintaining peripheral tolerance. Accordingly, comparison of B7.1-KO, B7.2-KO and B7.1/.2-KO mice and blocking antibodies suggests a complex role for these receptors in autoimmunity (Larsen et al., 1994; Lenschow et al., 1995; Poussin et al., 2003; Salomon and Bluestone, 2001). Mammalian B7.1 and B7.2 are membrane-bound receptors containing one IgSF V domain, one IgSF C domain, a transmembrane region and rather divergent intracytoplasmic regions. Additionally, a splicing variant lacking the TM, which is expressed by non-activated monocytes, has been described for B7.2. The secretion of B7.2deltaTM induces proliferation and cytokine production by both naive and memory T cells (Greenfield et al., 1998), providing an example of additional structural complexities of B7-mediated co-stimulation. B7.1 and B7.2 genes have been found in many mammalian species and are tightly linked on human chromosome 3 and mouse chromosome 16 (Table 1).

Table 1
Properties of the B7 family in mammals.

B7 receptor | Other names | Surface expression | Ligand | Outcome of interaction | Location in human genome
B7.1 | CD80 | Induced on B, T, DC and monocyte lineages | CD28, CTLA4 | CS/CI | 3q13
B7.2 | CD86 | Constitutive on B, DC, monocytes and T cells (upon induction). Up-regulated during activation | CD28, CTLA4 | CS/CI | 3q21
B7-H2 | CD275, B7RP-1, LICOS, GL50, Icosl, GL50-B, ICOS-L, B7h, MGI:1354701, Ly115l | B, DC, mono lineage and some T-cell subsets | ICOS | CS | 21q22
B7-H1 | CD274, PD-L1, B7-H, PDCD1LG1, PDL1 | Constitutive on B, DC, monocytes and T cell (upon induction). Up-regulated during activation | PD-1 and B7.1 | CI | 9p24
B7-DC | CD273, PD-L2, PDL2, Btdc, 18731, bA574F11.2, PDCD1LG2 | Induced upon activation in DC and monocyte lineages | PD-1 | CI | 9p24
B7-H3 | CD276, B7-H3 | Induced upon activation on B, T, DC and monocyte lineages | ? | CS/CI | 15p24
B7-H4 | B7X, B7S1, FLJ22418, VTCN1 | Induced on B, T, DC, some NKs and monocyte lineages | ? | CI | 1p13.1

CS, co-stimulatory; CI, co-inhibitory.

Aside from B7.1 and B7.2, five additional receptors have been characterized and named “B7-homologs” (B7-H), owing to shared structural features with the primary B7 molecules. Three of these B7-H receptors bind members of the CD28 family, providing functional support to the notion that the B7 family mirrors the diversity of the CD28-related receptors for evoking T-cell stimulatory/inhibitory pathways in the immune system: B7-H1 (PDL-1) and B7-DC (PDL-2) bind the programmed cell death (PD-1) receptor (Keir et al., 2008), and B7-H2 (CD275) is the ligand of ICOS (inducible co-stimulator) (Wang et al., 2000). B7-H1 and B7-DC deliver negative co-stimulatory signals through PD-1. B7-H1 is constitutively expressed on multiple cell types in mice including activated B, T, myeloid and DC and also on endothelial cells. In contrast, B7-DC is restricted to DC and is induced by IL4, while B7-H1 is primarily regulated by IFNγ. Both genes are up-regulated during T cell activation and B7-DC has a higher affinity for PD-1 than B7-H1. Surprisingly, expression cloning revealed that B7-H1 not only interacts with PD-1 but also with B7.1 (but not with B7.2). The exact function of B7-H1 expressed by T cells is still unknown, but the large distribution of these receptors has to be compared to the broad expression pattern for their main ligand, PD-1. Overall, the PD-1:PD-ligand pathway is critical for peripheral tolerance (Fife et al., 2006; Keir et al., 2008) and plays an essential role in chronic viral infections by balancing immune responses to pathogens and subsequent CMI-mediated tissue damage (Barber et al., 2006; Petrovas et al., 2006). The delivery of an inhibitory signal through B7-H1 and B7-DC has been well documented in the context of anti-tumor immunity since many tumors express B7-H1 and thus down-regulate specific T cell responses.

In contrast, B7-H2 delivers positive co-stimulatory signals through ICOS, a member of the CD28 family. It is constitutively expressed on the surface of a broad range of cell types including B cells, macrophages, DC, some T-cell subsets and on certain epithelial and endothelial cells. ICOS is induced on CD4+ and CD8+ T cells during T cell activation and the ICOS/B7-H2 pathway is critical for the delivery of T-cell help to B cells to promote humoral immunity. The B7-H2 knockout mice demonstrated that B7-H2 is required for T helper cell activation, differentiation, and the expression of effector cytokines as well as for the development of NKT cells (Chung et al., 2008; Nurieva et al., 2003).

The ligands of the last two members of the B7 family (B7-H3 and B7-H4) have yet to be identified, but their domain composition, sequence similarities and functional properties group them with the B7/B7H receptors. Mammalian B7-H3 has been detected on the surface of T cells, B cells, DC, macrophages and on specific carcinoma cells. B7-H3 transcripts are up-regulated upon in vitro IFNγ stimulation (in Th1 responses), but are down-regulated during Th2 responses (Suh et al., 2003). The functional properties of the B7-H3 receptor seem to be rather complex. A B7-H3Ig fusion protein binding to activated T cells provided a positive co-stimulatory response leading to T-cell proliferation, cytotoxicity and IFNγ production, suggesting a positive co-stimulatory function (Chapoval et al., 2001; Sun et al., 2002). Soluble versions of B7-H3 are also released from monocytes, DC and activated T cells via proteolytic processing, resulting in the binding and activation of T cells by the soluble receptor (Zhang et al., 2008). However, B7-H3 KO mice (Suh et al., 2003) showed increased T cell responses, accelerated EAE and a severe hyperinflammatory response, supporting an inhibitory function. The last member of the B7 family, B7-H4, has the same overall domain composition as the other members of the B7 family but, in contrast, B7-H4 is bound to the APC via a GPI-linked (glycosyl phosphatidylinositol) anchor. Human and mouse B7-H4 expression can be induced on freshly isolated T cells, B cells, DC, and monocytes and B7-H4 mRNA is broadly expressed in many lymphoid and non-lymphoid tissues, including tumors. Overall, B7-H4 is a negative regulator of T cell responses (Prasad et al., 2003; Sica et al., 2003; Zang et al., 2003), while the expression of B7-H4 on tumors suggests that B7-H4 is involved in evasion from anti-tumor immunity (Choi et al., 2003). Surprisingly, B7-H4 KO mice develop normal cytotoxic T-lymphocyte reactions against viral infection (Suh et al., 2006), and thus the precise role of this receptor has yet to be fully understood.

The B7 family members are therefore engaged in multiple stimulatory and/or inhibitory pathways, being expressed on antigen-presenting cells and, for the most part, binding a ligand on T cells. They represent a dedicated subset of a large extended family of B7-related proteins including MOG and butyrophilins (Henry et al., 1999), which includes molecules involved in innate immunity. While B7s and B7H molecules share structural features and sequence similarity, it is not yet clear whether they all bind one or several members of the CD28 family and to what extent they also recognize other members of the B7 family. The complexity of their expression patterns reflects the complexity of their functional contributions to the regulation of immune responses. A comparative analysis with their homologs involved in the less complex stimulation pathways of the lower vertebrates may therefore provide clues about their functional relevance. Teleost CD28 and CTLA4 orthologs which possess conserved B7 ligand-binding sites have been recently described, suggesting that they likely engage a B7 counterpart (Bernard et al., 2007, 2006). We therefore made a survey of the available fish genomic and transcript databases to search for B7 family members and we identified several sequences showing most of the hallmarks of their mammalian counterparts. Here we provide a phylogeny-based classification of these molecules and a comparative analysis that is intended to shed light on the origins of T-cell co-stimulation.

2. Materials and methods

2.1. Identification and sequence characterization of B7 family members

An orderly approach utilizing current genome drafts and EST indices was taken to identify B7 and B7-H genes from the various vertebrates. In short, the majority of searches involved using mammalian, avian and teleost sequences to search the various databases using TBLASTN. For EST searches, sequences were used as queries for TBLASTN analysis of the EST indices in Genbank (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and at the Institute for Genomic Research (http://compbio.dfci.harvard.edu/tgi/tgipage.html). ESTs representing partial sequences were used to search EST indices using BLASTn and overlapping sequences were assembled using the Assembler function of MacVector. Individual genomes were searched using TBLASTN at ENSEMBL (www.ensembl.org/index.html) or BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat). In addition, TBLASTN searches were conducted for Tetraodon (www.genoscope.cns.fr/externe/tetranew/), medaka (http://dolphin.lab.nig.ac.pj/medaka/) and the elephant shark (Callorhinchus milii, http://esharkgenome.imcb.a-star.edu.sg/) at the specified URLs. Once consensus sequences had been assembled, they were used for BLASTP analysis of the NR database at Genbank. Amino acid sequences of the B7 members were then scanned for domain architecture using CDART (http://www.ncbi.nlm.nih.gov/Structure/lexington/lexington.cgi?cmd=rps) and SMART (http://smart.embl-heidelberg.de/). Amino acid alignments and IMGT numbering were performed using Clustal W and tools found at IMGT (IMGT®, the International ImMunoGeneTics information system, http://imgt.cines.fr/) (Lefranc et al., 2008). Secondary structures and IMGT Collier de Perles were based upon the IMGT unique numbering system for IgSF V and V-like domains and for C and C-like domains using IMGT tools (Lefranc et al., 2005). Signal peptides, TM domains and putative N-linked glycosylation sites were identified using SignalP (www.cbs.dtu.dk/services/SignalP/), TMpred (www.ch.embnet.org/software/TMPRED_form.html), TMHMM (www.cbs.dtu.dk/services/TMHMM/) and NetNGlyc v1.0 (www.cbs.dtu.dk/services/NetNGlyc/), respectively. GPI predictions for B7-H4 were made using the “Big PI” predictor at http://mendel.imp.ac.at/gpi/cgi-bin/gpi_pred.cgi. Finally, phylogenetic comparisons were performed using MEGA 3.1 (Kumar et al., 2004). Briefly, ClustalW alignments of the IgSF domains (single or tandem) were used for the generation of Neighbor-Joining trees with Poisson correction, deletion of gaps and bootstrap analysis (1000 replicates).

2.2. Synteny analysis using C.A.S.S.I.O.P.E

The syntenic relationships of the B7 family were initially examined using BLAT (http://genome.ucsc.edu/cgi-bin/hgBlat) and Ensembl, and a more refined analysis of synteny was then performed using C.A.S.S.I.O.P.E (Clever Agent System for Synteny Inheritance and Other Phenomena in Evolution). Searching for biologically relevant conserved genomic regions requires both phylogenetic orthology assessment and statistical testing for the genes of the relevant regions in as many genomes as possible. The process developed in C.A.S.S.I.O.P.E. integrates these two important steps in a single automated process: (1) the phylogeny: orthologous/paralogous genes are determined by the aggregation of three phylogenetic methods using the Figenix platform (Gouret et al., 2005), in contrast to over-simplistic BLAST approaches. Additionally, phylogenetic information allows reconstruction of the evolutionary history and thereby more accurate ancestral genome reconstruction; (2) a statistical test: C.A.S.S.I.O.P.E. utilizes a specific statistical test (Danchin and Pontarotti, 2004) to assess the significance of the predicted, conserved gene clusters.

3. Results

3.1. B7.1 and B7.2 family members

In mammals, B7.1 and B7.2 are responsible for signal 2, which is absolutely required for the activation of naïve T cells. Sequences similar to B7.1 and B7.2 were easily identified in various mammalian databases and in the avian databases using TBLASTN with human or murine B7.1 or B7.2 as queries. However, the identification of teleost B7 genes was more elusive. The fish B7 homologs (which we have named B7R, for B7-related) were in fact retrieved using a fish sequence that was initially identified as an Ig domain-containing molecule. Sequences showing significant identity with either B7.1 or B7.2 were identified from zebrafish, fugu, Atlantic salmon, stickleback, fat-head minnow and a cichlid species. Only one sequence could be identified per fish species even when the complete genome sequence was available, suggesting that there is a single, common B7 receptor in teleosts. The human B7.1 and B7.2 amino acid sequences are ∼30% identical to each other, and the B7R sequences display roughly equivalent levels of identity to B7.1 and B7.2 (∼25%).

A partial B7R EST was also identified from the spiny dogfish (Squalus acanthias), an elasmobranch, indicating the presence of
B7 genes in the earliest jawed vertebrates. Interestingly, the spiny dogfish sequence identified B7-H2 (ICOSL) as the top 4 matches in BLASTP analyses (29% identity over 212 AA) whereas the fish B7R sequences all retrieved either B7.1 or B7.2 sequences as the top matches (avg. 25% identity over 240 AA) when used in BLASTP queries of the NR.

The B7R sequences displayed the characteristic structure of the B7 family including a prototypical V-like domain, a C-like domain, a transmembrane hydrophobic stretch and a short intra-cytoplasmic region. An alignment of the human, murine, avian and fish B7 sequences was then performed using ClustalW. Additionally, IMGT numbering and modeling were applied to the Ig V and C domains to produce an optimized alignment (Fig. 1). Both V and C domains displayed conserved 23C and 104C residues that are required for the Ig fold, but all fish B7R molecules lacked the canonical 41W in the V domain. Instead, 41W is replaced with other amino acids (I/F/L/V) which, like tryptophan, possess non-polar side chains. Overall, the mammalian B7.1/B7.2 and fish B7R sequences are not highly similar in amino acid identity, but a few common structural features appear from the collier de perle representation of the V domains (see Supplementary Fig. I). In particular, the short C → C and D → E loops appear to be a typical feature of the B7 molecules.

Interestingly, the elasmobranch B7-H2-like sequence maintains 41W in the presumptive V domain. The C-like domain was slightly degenerated and was not easily recognized using the conserved domain database (CDD) at NCBI or via SMART analysis.

Although the fish B7R and tetrapod B7 sequences do not show high sequence identity (avg. 25–30% AA identity) to each other, the multiple alignments highlight important conserved features. Previously we reported that the CDR3 loop of fish CD28 and CTLA4 receptors possesses the conserved (L/F/M)(Y/F)PPP(I/L/F) motif that interacts with the B7.1 and B7.2 receptors (Bernard et al., 2007, 2006). This motif was absolutely conserved in all CD28 and CTLA4 sequences from fish and tetrapods, suggesting that fish should possess B7 orthologs with a conserved CD28/CTLA4 binding site. Human CTLA4 contacts residues within the B7.1 or B7.2 IgSF V domain and this interaction involves the PPP-containing loop at the top surface of the CTLA-4 V domain and at the side of the B7 V domain. The positions of the B7 V domain involved in interaction with CTLA4 are located in the C, C′, C″, F and G strands and they are slightly different in B7.1 and B7.2 (Fig. 1). The alignment suggests that the biochemical properties of the CD28/CTLA4 interacting residues are rather well conserved at most of the key positions between fish B7R and mammalian B7.1 and/or B7.2. The potential contact residues at positions 40, 42, 52 and 55 that are critical in the B7.1 V domain for CD28/CTLA4 association are well conserved among the teleost sequences (40Y, 42Q, 51F and 54G). Interestingly, the conserved positions among the fish sequences imply that the overall contact interface is more similar to that described for human B7.1 than for B7.2. Site-directed mutagenesis of several conserved residues in the ABED β-sheet of the B7.1 C domain completely abrogated the binding to CTLA4, implying that the C domain plays an important, although indirect, role in the interaction between CTLA4 and B7. Most of these positions are well conserved in the fish B7R sequences including positions 1 (F/Y), 4 (P) and 6 (I/L/V/F), which constitute critical residues for binding CD28/CTLA4. Taken together, these observations support our hypothesis that the fish B7R receptors bind CD28 and CTLA4 via the P-rich loops and that the geometry of the interaction has been conserved among gnathostomes.

The alignment of the B7R constant domains also reveals conserved cysteine residues (positions 10 and 19 in strands A and B) that are limited to the teleost sequences. This is reminiscent of the signature for the cortical thymocyte marker in Xenopus (CTX) constant domain, which may reflect special constraints on the structure of the domain. The transmembrane regions did not display obvious similarity. Finally, the intra-cytoplasmic regions do not contain tyrosine-based signaling motifs such as ITAMs or ITIMs. In addition, the fish and avian sequences lack the RRNE motif found in mice that is required for co-stimulation and capping (Doty and Clark, 1998). However, the same study indicated that a serine residue near the RRNE motif was also required for co-stimulation; all of the B7 and B7R intra-cytoplasmic sequences contain serine and proline residues, the role of which remains to be determined.

3.2. B7-H1 and B7-DC sequences in fish and other vertebrates

B7-H1 and B7-DC are the natural ligands for PD-1, a CD28 family member, and together they are partially responsible for peripheral tolerance of T cells via inhibitory signals. Chicken and murine B7-H1 and DC were used as queries in TBLASTN searches to identify orthologous sequences in fish and other tetrapods. Single sequences displaying roughly equivalent identity to chicken and mammalian B7-H1 and DC were identified for salmonids, 2 species of pufferfish and representative cyprinids, as well as distinct B7-H1 and DC genes in amphibians. Thus for all bony fish species examined, we were only able to define a single B7-H1/DC gene per genome compared to the two genes encoding B7-H1 and DC in tetrapods. Similar to other B7 family members, all of the B7-H1 and DC sequences identified within this study are composed of a signal sequence, extracellular IgSF V and C domains followed by a transmembrane region and a relatively short intra-cytoplasmic domain. As shown in Fig. 2, all B7-H1 and DC molecules display conservation of the canonical cysteines in the IgSF V-like (23C and 104C) and C-like (23C and 104C) domains that are required for the Ig fold. In addition, all of the sequences except human and mouse (V domain) encode canonical 41W residues in the C strands for both the V and C-like domains. The V domain of human B7-H1, B7-DC and fish B7-H1 also shared a motif GXE in the C–C loop, and the same length of the D → E loop, as evidenced by the collier de perle representation (Supplementary Fig. I). Moreover, all sequences contain 3–4 N-linked glycosylation sites (NXS/T) within the extracellular domains including the presence of a conserved N-linked site in or near the EF loop of the C domain.

Overall, the teleost B7-H1/DC sequences displayed slightly higher levels of identity (avg. 28–34% ID over 215 AA) to the tetrapod B7-H1 sequences, followed directly by the B7-DC genes, in a situation similar to that found for the B7R and the B7.1 and B7.2 genes. We have tentatively named the teleost sequences B7-H1 based upon the relative conservation of residues implicated in PD-1 association for B7-H1. Similar to CD28 and CTLA4, PD-1 interacts with the B7-H1/DC ligands through interaction of the PD-1 V domain with the B7-H1/DC V domains. One major difference is that the PD-1 CDR3 loop lacks the conserved proline-rich motif found in CD28 and CTLA4. Butte et al. (2007) identified 10 residues in murine B7-H1 via site-directed mutagenesis that are involved in PD-1 recognition. In particular, 7 residues impact PD-1 binding while three additional residues resulted in increased affinities for PD-1 and were considered as stabilizing residues (Fig. 2). Those residues in the IgSF V domain of murine B7-H1 and DC which are critical for PD-1 association are well conserved between chickens and mammals and to some degree between Xenopus and the other tetrapods. Several of these residues are also moderately conserved within the gnathostomes. Aside from interacting with PD-1, it was recently shown that mammalian B7-H1 can also interact with B7.1 and that the region of interaction overlaps with that for B7.1/PD-1. Based upon this, it has been suggested that CD28, CTLA4 and B7-H1 could all compete for B7.1 binding and similarly, B7.1 and PD-1 may compete for binding to B7-H1 (Butte et al., 2007). Interestingly, chemically cross-linked lysine residues that mapped the

Fig. 1. Multiple sequence alignment of B7 and B7R genes from fish and tetrapods. Sequences are numbered according to the unique IMGT numbering system for V-like and C-like domains and include IMGT region delimitations (Lefranc et al., 2005, 2008). Signal sequences were removed. Conserved cysteine residues involved in the IgSF fold are highlighted in red on yellow background. V-domain: positions involved in the human B7 dimer interface are in bold black lettering with green background. Amino acids directly involved in the B7.1 or .2/hCTLA4 interface are indicated with an @ symbol. Positions that are found in the human B7.1/B7.2/CTLA4 interfaces that are strictly conserved are in black lettering with red background (i.e. 40Y) and those with common biochemical properties are in red. In addition, amino acid residues involved in the human B7.1/CTLA4 and B7.2/CTLA4 interface that are conserved in fish are shown in black lettering with blue background while those residues possessing similar biochemical properties are in blue lettering. C-domain: positions found in the C domain that contribute to B7/CTLA4 binding are indicated with a # symbol. The connecting peptide (CP), transmembrane region (TM) and cytoplasmic domain are also shown. Proline residues in the IC are shown in red with blue background. Accession numbers or genome positions are as follows: Homo sapiens B7.1: AAA58390, Mus musculus B7.1: X60958, Gallus gallus B7.1: AJ851659, Homo sapiens B7.2: AAA58389, Mus musculus B7.2: AAH13807, Gallus gallus B7.2: CAJ18297, Danio rerio B7R: CN017839, Pimephales promelas B7R: DT293786, Fugu rubripes B7R-scaffold 172 (chrUn231812808), Gasterosteus aculeatus B7R: DN656970, Salmo salar B7R: DW580717, Lipochromis sp. B7R: DB860025 and Squalus acanthias B7R/B7-H2: ES651617. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
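The identity percentages quoted in the Results (for example the roughly 25% identity between fish B7R and mammalian B7.1/B7.2) and the Poisson-corrected distances used for the Neighbor-Joining trees in Section 2.1 both derive from pairwise comparisons within an alignment such as the one in Fig. 1. The following Python sketch illustrates that arithmetic only; it is not the MEGA implementation. The function name and the toy sequences are hypothetical, and dropping every column that contains a gap in either sequence is an assumption about how "deletion of gaps" was applied.

```python
import math


def identity_and_poisson_distance(seq_a, seq_b, gap="-"):
    """Return (fraction identical, Poisson-corrected distance) for two
    pre-aligned amino acid sequences of equal length.

    Assumption: columns with a gap in either sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    # Keep only the columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")

    matches = sum(1 for a, b in columns if a == b)
    identity = matches / len(columns)

    # Poisson correction: d = -ln(1 - p), where p = 1 - identity is the
    # observed proportion of differing sites.
    distance = math.inf if identity == 0 else -math.log(identity)
    return identity, distance


# Hypothetical toy alignment (not taken from Fig. 1).
identity, distance = identity_and_poisson_distance("MKV-LC", "MRVALC")
print(f"identity = {identity:.2f}, Poisson distance = {distance:.3f}")
```

In the toy example, four of the five ungapped columns match. At identities near 25% the correction becomes large (d = -ln 0.25, about 1.39 substitutions per site), which may help explain why deep branches among such divergent sequences are difficult to resolve.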

Fig. 2. Multiple sequence alignment of B7-H1 and DC related sequences from teleost fish and tetrapods. Sequences are numbered according to the unique IMGT numbering system for V-like and C-like domains and include IMGT region delimitations (Lefranc et al., 2005, 2008). Signal sequences were removed. Conserved cysteine residues involved in the IgSF fold as well as those located in the TM are highlighted in red with yellow background. Positions with white lettering on red background correspond to amino acids that interact with PD-1 and are relatively conserved in the alignment. Residues in black lettering on red background indicate mutations that increase affinity for PD-1. Murine lysine residues with grey background demarcate the boundaries of the B7/B7-H1 interaction site (note that K119 and K124 mark the remaining boundary). Putative ITM motifs in the IC region are in red on yellow background. Finally, potential N-linked glycosylation sites (N-X-S/T) are underlined. Accession numbers or genome positions are as follows: Homo sapiens B7H1: AAF25807, Mus musculus B7H1: AAG18509, Gallus gallus B7H1: AI980757–BU254037–XM_424811, Xenopus tropicalis B7H1-Scaf 86 (2029395-2024440), Oncorhynchus mykiss B7H1R: CA366631, Danio rerio B7H1R: DN833503–EV760159, Gasterosteus aculeatus B7H1R: DT956717-chrXIV (7425302.27368), Pimephales promelas B7H1R: DT159564–DT173258, Homo sapiens B7DC: BO331153, Mus musculus B7DC: AAD33892, Gallus gallus B7DC: CF251191 and chrZ (28228704–31137), Xenopus tropicalis B7DC-Scaf 86 (1998704.1996153). (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
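The potential N-linked glycosylation sites underlined in Fig. 2 follow the N-X-S/T sequon. As a rough illustration of how such candidate sites can be located in a protein sequence, the Python sketch below performs a plain motif scan. It is not the NetNGlyc predictor named in Section 2.1, which does more than pattern matching. The function name and the example peptide are hypothetical, and excluding proline at the X position is a common convention assumed here rather than something stated in the text.

```python
import re

# N-X-S/T sequon. The lookahead lets overlapping sequons all be reported.
# Assumption: X may be any residue except proline.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein):
    """Return the 1-based positions of the Asn residue of every N-X-S/T match."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, not one of the sequences in Fig. 2.
print(find_sequons("MNGSAANPTQNNTS"))
```

For the hypothetical peptide, the scan reports the asparagines at positions 2, 11 and 12 and skips the N-P-T triplet at position 7. A scan of this kind only flags candidates; whether a site is actually glycosylated depends on its structural context.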

B7-H1/B7.1 interaction boundaries via surface plasmon resonance analysis for B7-H1 are well conserved within the vertebrate B7-H1 sequences, implying that teleost B7-H1R could interact with B7.1 (Butte et al., 2008).

3.3. B7-H3 family members

By far the most conserved B7 family member found in this study is B7-H3, as sequences were readily retrieved from representative elasmobranch, teleost and tetrapod species using standard TBLASTN analysis and conserved synteny examination. Overall, the mature proteins (minus leader segment) display >50% amino acid identity between the various vertebrates. Higher levels of identity were maintained within individual vertebrate classes, as exemplified by the teleosts, which had an average amino acid identity of 62% across the extracellular portion of the predicted protein.

A multiple sequence alignment (Fig. 3) was generated for the mature proteins to exemplify the conserved nature of B7-H3. The various vertebrate sequences display absolute conservation within the IgSF V-like and C-like domains of the cysteine residues required for the Ig fold. B7-H3, like other B7 family members, is a glycoprotein. Analysis of potential N-linked glycosylation sites revealed absolute conservation of two N-linked sites, one within the E strand of the V domain and one within the D strand of the C-like domain. Strong conservation of additional glycosylation sites was also observed at the C and D strand intersection and within the F strand of the C-like domain, and within the V domain B strand and CDR1 of the teleost sequences, suggesting that these sites are likely important for B7-H3 function. Following the IgSF extracellular domains, there is a short connecting peptide region containing a conserved proline residue that is immediately followed by a highly conserved transmembrane domain (68% average AA identity, 86% between trout and zebrafish) which includes a conserved cysteine residue but no charged residues. The cytoplasmic region contains two conserved C residues, which may be important for biological activity. The first conserved C residue is found at the transmembrane/cytoplasmic junction. The second is located 8 positions downstream in the cytoplasmic region and is immediately followed by a series of conserved acidic and basic residues, suggesting that this region interacts with additional proteins through charge–charge interactions. In silico post-translational analysis did not reveal any conserved phosphorylation motifs. The genomic structure is identical between 2 teleosts (zebrafish and tetraodon) and tetrapod B7-H3 (data not shown).

3.4. B7-H4 family members

Homologs of the human B7-H4 sequences were detected using TBLASTN in different fish genomes including zebrafish, tetraodon, and fugu. Additional sequences similar to B7-H4 were found in EST databases from rainbow trout, Atlantic salmon, minnow and flounder. Overall, the teleost amino acid sequences were 30–35% identical across the extracellular region to their mammalian counterparts. B7-H4 homologs were also identified from the genome assembly and EST indices of chicken and Xenopus. Fig. 4 illustrates that these proteins all contain a typical IgSF V-like domain and a degenerated IgSF C-like domain that is missing one or both of the canonical 23C and 104C residues that maintain the Ig fold. In addition, the G strand of the V domain contains a diglycine bulge (GxG motif) for some species, which is reminiscent of a J segment. However, this feature is not strictly conserved and one of the G residues forming the putative GxG bulge is missing in three of the teleosts. The carboxy-terminal portion of these proteins lacks recognizable TM domains but they all contain predicted GPI linkage sites, both of which are characteristic of B7-H4.

[...] fish B7-H4 is hindered by the low similarity with their mammalian counterparts and by the absence of obvious motifs conserved in the molecule. However, all sequences maintain a conserved N-linked glycosylation site in the V domain (E strand). We also examined the genomic structure of zebrafish and tetraodon B7-H4. They possess the same exon/intron structure (splice sites and phase) as that found in mammals: the first exon encodes the signal sequence, followed by exons 2 and 3, which code for the V and C domains, and exon 4, which encodes the carboxy-terminus.

3.5. Phylogenetic analysis of the B7 family members

All sequences of the B7 family members were added to a single multiple alignment to assess their phylogenetic relationship(s) using either single IgSF domains (V or C) or both IgSF domains together. Both analyses produced nearly identical trees for the phylogeny of the B7 family. In addition, trees produced by the neighbor-joining, minimum evolution and maximum parsimony methods had nearly identical branching orders (data not shown), thus lending additional support for the designation of the B7 and B7H clades.

In general, the newly identified teleost B7R sequences formed a distinct sister-clade from the main branch that separates the B7 and B7H sequences. Overall, the B7 (B7.1/.2 and B7R) sequences do not form a monophyletic clade, likely due to the highly divergent nature of these genes (Fig. 5). The B7R clade includes a recently deposited sequence for trout B7R (Zhang et al., accession ACH58052), which grouped tightly with the Atlantic salmon sequence (91% AA identity across the V and C domains). The branch grouping all fish B7R is supported by a fair bootstrap value (>70%) but the group B7.1/B7R/B7.2 is not, although a similar consensus tree is observed with all the different phylogeny algorithms. Interestingly, grouping of chicken B7.1 and B7.2 with their respective counterparts is not supported by high bootstrap values either.

In contrast, each group of the fish B7H-related sequences clustered in well-defined B7H sister clades (i.e. B7-H3) within the overall B7H monophyletic clade, which is supported by good bootstrap scores. The fish sequences reproducibly cluster with their respective mammalian counterparts, although with moderate bootstrap scores. Within the overall B7-H1/DC clade, there was a relatively clear demarcation of both B7-H1 and DC within the tetrapods and a single sister clade corresponding to teleost B7-H1R, implying that all of these genes derived from a common B7-H1 molecule. Finally, the phylogenetic analysis also suggests that the B7-H3 and B7-H4 sequences form distinct sister clades within the overall B7H branch (Fig. 5). It should be noted that the “fish” B7-H3 group includes two elasmobranch sequences (L. erinacea, a skate, and C. milii, a chimaera), further strengthening the phylogenetic groupings of the B7-H3 clade and indicating that the origins of this group predate the emergence of the gnathostomes.

Additional sequences displaying similarity with the B7 genes were also retrieved from medaka and rainbow trout EST databases. However, these last sequences were more similar to mammalian butyrophilins than to the B7.1 and B7.2 genes. Therefore we did not consider these butyrophilin-like molecules as B7 counterparts, as they clustered neither with the other fish B7R and B7 molecules in the phylogenetic analysis (data not shown) nor with the B7H sequences.

Although our phylogenetic analysis was not always supported by high bootstrap scores (i.e. >80%), the analysis supports our hypothesis that the fish B7R, B7H1, B7H3 and B7H4 are likely orthologs of the mammalian genes and are therefore expected to be derived from common ancestral genes. It also suggests a clear and ancient difference between the B7 and the B7H clades. The low bootstrap
The classification of values may simply reflect the high evolutionary pressures exerted 464 J.D. Hansen et al. / Molecular Immunology 46 (2009) 457–472 : in the Oncorhynchus mykiss : CD301231–BJ067228, Xenopus laevis nique IMGT numbering system for V-like and C-like domains ally, acidic residues found in the cytoplasmic domain are indicated in : BM488497–BU249770, Gallus gallus : DT378804. (For interpretation of the references to color in this figure legend, the reader is referred to : AAH19436, Leucoraja erinacea Mus musculus : AAK15438, AAVX01064573 and -Scaf Homo sapiens ). Conserved cysteine residues involved in the IgSF Ig fold are highlighted in red with yellow background. Additionally conserved cysteines located Callorhinchus milii Lefranc et al., 2005, 2008 : CD759065–EE719264, Danio rerio Multiple sequence alignment of B7-H3 sequences (lacking leader sequences) from teleost fish and tetrapods. Sequences are numbered according to the u and includes IMGT region delimitationsTM ( and cytoplasmic domainbold are and also blue. highlighted Accession in numbers yellow or with genome red positions background. are as Potential follows: N-linked glycosylation sites (N-X-S/T) are underlined. Fin Fig. 3. CA346702–CA352727, the web version of the article.) J.D. Hansen et al. / Molecular Immunology 46 (2009) 457–472 465

Fig. 4. Multiple sequence alignment of mature B7-H4 sequences from teleost fish and tetrapods. Sequences are numbered according to the unique IMGT numbering system for V and C domains and include IMGT region delimitations (Lefranc et al., 2005, 2008). Conserved cysteine residues involved in the IgSF Ig fold are highlighted in red on yellow background. Potential N-linked glycosylation sites (N-X-S/T) are underlined. SMART analysis did not reveal typical TM regions; potential GPI linkage sites are indicated by magenta background (IMGT prediction) or by bold lettering with red background (prediction by http://mendel.imp.ac.at/sat/gpi/gpi_server.html). Putative C-terminal hydrophobic peptides removed upon addition of the GPI anchor are underlined. Accession numbers or genome positions are as follows: Homo sapiens: AAP37283, Mus musculus: AAP37284, Gallus gallus: chr1 (82647831.82658137), Xenopus tropicalis: CR439118, Danio rerio: chr9 (13865193.13868734), Tetraodon nigroviridis: chr2 (17459002.17460514), Fugu rubripes: chrUn (318872930.318873762), Oncorhynchus mykiss: BX911540, Salmo salar: DY726411 and Pimephales promelas: DT117185. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of the article.)
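The N-X-S/T annotation used in the figure legends can be illustrated with a short scan for candidate sequons. This is only a minimal sketch: the function name and the toy peptide are hypothetical, and the exclusion of proline at the X position is a common convention assumed here rather than something stated in the legend.

```python
import re

# N-X-S/T sequon. The lookahead lets overlapping sites (e.g. "NNST") be reported.
# Excluding proline at X is an assumption; the legend only states N-X-S/T.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return the 1-based position of the Asn in every N-X-S/T match."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Toy peptide (hypothetical, not a B7-H4 sequence).
toy = "MKNATLLNPSGGNNSTQ"
print(find_sequons(toy))  # prints [3, 13, 14]
```

Positions are relative to the ungapped input string; mapping them onto IMGT alignment positions, as done in the figures, would need the gapped alignment and is not attempted here.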

Fig. 5. Phylogenetic analysis of the B7 family in vertebrates. The tree was constructed from CLUSTAL-generated amino acid alignments for the V and C domains combined (∼180 amino acids as deduced by SMART analysis) using the neighbor-joining method. Tree topology was evaluated by bootstrapping 1000 times, with percentages shown at nodes. Brackets denote fish (F) and tetrapod (T) groupings. Accession numbers and gene locations in addition to those found in Figs. 1–4 are as follows: Mesocricetus auratus B7.1: BAC24767, Rattus norvegicus B7.1: EDM11215, Oryctolagus cuniculus B7.1: BAA08643, Canis familiaris B7.1: AAF17293, Sus scrofa B7.1: AAL58443, Felis catus B7.1: AAB53575, Canis familiaris B7.2: AAF17297, Felis catus B7.2: BAB11688, Sus scrofa B7.2: AAV74621, Oncorhynchus mykiss B7R: EU927452, Oryzias latipes B7R: AM306315, Salmo salar B7-H1R: DW562815 and DY708145, Tetraodon nigroviridis B7-H1R: CAF93166, Fugu rubripes B7-H1R: chrUn (268841252.268843284), Rattus norvegicus B7-H1: EDM13096, Ornithorhynchus anatinus B7-H1: XP_001506123, Ambystoma tigrinum B7-H1: CN065102, Tetraodon nigroviridis B7-H3: chr5 (1823258.1824878), Fugu rubripes B7-H3: chrUn (316485061.31648775), Oryzias latipes B7-H3: DK244157 and DK216942, Gasterosteus aculeatus B7-H3: DW637643, Gallus gallus B7-H2: CF257773, Mus musculus B7-H2: AAH29227, Rattus norvegicus B7-H2: XP_001079346, Canis familiaris B7-H2: XP_544918, Homo sapiens B7-H2: AAG01176.
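The bootstrap percentages at the nodes come from resampling alignment columns with replacement and asking how often a grouping is recovered. The sketch below shows only that resampling idea. It does not rebuild a neighbor-joining tree per replicate as the published analysis did; instead it uses a much simpler stand-in, namely whether two sequences are each other's nearest neighbour by p-distance. All names and the toy alignment are hypothetical.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0
    return sum(x != y for x, y in pairs) / len(pairs)


def _nearest(rep: dict[str, str], query: str) -> str:
    """Name of the sequence closest to `query` in one resampled alignment."""
    others = (n for n in rep if n != query)
    return min(others, key=lambda n: p_distance(rep[query], rep[n]))


def bootstrap_pair_support(aln: dict[str, str], a: str, b: str,
                           replicates: int = 1000, seed: int = 0) -> float:
    """Share of replicates in which a and b are mutual nearest neighbours."""
    rng = random.Random(seed)
    length = len(next(iter(aln.values())))
    hits = 0
    for _ in range(replicates):
        # Sample alignment columns with replacement (the bootstrap step).
        cols = [rng.randrange(length) for _ in range(length)]
        rep = {name: "".join(seq[c] for c in cols) for name, seq in aln.items()}
        if _nearest(rep, a) == b and _nearest(rep, b) == a:
            hits += 1
    return hits / replicates


# Toy alignment (hypothetical sequences, equal length, no real B7 data).
toy = {
    "fishA": "ACDEFGHIKL",
    "fishB": "ACDEFGHIKL",
    "mamA": "ACDQFGWIKM",
    "mamB": "TCDQFGWIRM",
}
print(bootstrap_pair_support(toy, "fishA", "fishB", replicates=200))
```

Ties are broken by input order in this sketch, which a real tree-based bootstrap would not do; the value it returns is therefore only a rough analogue of a node percentage.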

on these receptors, for example by pathogen subversion. The presence of B7/butyrophilin-related sequences in the genomes of fish iridoviruses (YP_164128; gb|AAV91039; gb|AAV91038) adds support to this interpretation.

3.6. Identification of conserved syntenies involving B7 family members

As our phylogenetic analysis of the B7 family was inconclusive for the origin(s) of the B7 and B7H genes, we searched for conserved syntenies involving B7-related genes and other genes in the B7 genomic neighborhood. For this purpose, we used a dedicated program (C.A.S.S.I.O.P.E) that has been developed to compare phylogenetic trees of all genes in a genomic region, leading to the validation of true ortholog gene sets. The C.A.S.S.I.O.P.E methodology establishes that synteny groups correspond to sets of real orthologs by congruent phylogenetic analyses and statistical validation, which avoids the many pitfalls of the classical reverse-blast analysis. When this approach provides a clear demonstration that several markers are true orthologs in two species, it firmly establishes that the B7-related sequences flanked by these markers have also been generated by speciation (i.e. are true orthologs). Taken together, this is critical evidence that sequence similarity is due to inheritance and not convergence, and therefore builds a solid framework for further analysis of the evolution of functional properties governing T cell activation.

Conserved genes in the vicinity of the B7R genes were identified in different fish species. Several genomic markers located on each side of B7R were identified in humans, mice and chickens, which defined two groups of conserved synteny that were located in the same region in humans and chickens, and in two clusters on different chromosomes in mice. Unfortunately, the B7 genes were located in a totally different region on another chromosome in tetrapods (Fig. 6), suggesting that genomic translocations affecting this region during vertebrate evolution preclude the detection of a direct conserved synteny involving B7R and B7.1/.2.

In contrast, a conserved synteny involving the B7-H1 genes and 5 markers (CDC37, AK3, RCL1, JAK2 and a putative ORF) was identified in stickleback, Tetraodon and mammals (Fig. 7). Syntenic relationships (Fig. 7) were also established for regions flanking the B7-H3 sequences, including 6 markers in zebrafish, tetraodon, human, mouse and chicken: neuroplastin (NPTN), hyperpolarization activated cyclic nucleotide-gated potassium channel 4 (HCN4), lysyl oxidase-like 1 (LOXL1), secretory carrier membrane protein 5 (SCAMP5), poly (ADP-ribose) polymerase family member 6 (PARP6) and the pyruvate kinase-muscle gene (PKM2). These genetic markers were also linked to the Xenopus B7-H3 sequence on Scaf 103 (data not shown), thereby further supporting the orthologous relationships of the B7-H3 genes. Sun et al. (2002) previously reported the presence of 2 differentially spliced versions of B7-H3 containing either 2 or 4 IgSF domains, which was likely due to tandem duplication of the V and C domains within the lineage leading to all primates. We analyzed the various genome drafts and found that the duplication event which generated the 4Ig B7-H3 molecule is probably confined to primates, as examination of rodent, avian, amphibian and teleost genomes by BLAT analyses was negative for the presence of duplicated VC domains for B7-H3. Finally, microsyntenic clustering involving four markers around B7-H4 in zebrafish, pufferfish, mouse and human confirmed the identification of the teleost sequences as true B7-H4 orthologs (Fig. 7). This conserved cluster involved the genes encoding the transcription termination factor 2 (TTF2), a dsDNA-dependent ATPase which acts as a transcription termination factor, mannosidase alpha 1A2 (MAN1A2) and the “family with sequence similarity 46” (FAM46C) gene.

Fig. 6. Genes flanking B7R but not B7R/B7 itself display conserved short-range synteny in fish and tetrapods. The C.A.S.S.I.O.P.E program was used to establish the syntenic relationships of genes in the vicinity of fish B7R. The location of each marker on the corresponding chromosome is indicated in the table.

Fig. 7. Syntenic relationships of the B7H genes in vertebrates. The C.A.S.S.I.O.P.E program was used to establish syntenic relationships for B7-H1, B7-H3 and B7-H4. Gene locations and identifiers in the various genomes (Ensembl) are shown.
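The short-range synteny argument reduces to asking which flanking markers are shared around a focal gene in two genomes. The sketch below illustrates that comparison only. It is not the C.A.S.S.I.O.P.E procedure, which additionally validates orthology of each marker by congruent phylogenies. The marker names are those given in the text for B7-H3, but the gene orders, the filler genes and the function names are hypothetical, and gene symbols are assumed to be already mapped to shared ortholog names.

```python
def flanking(genes: list[str], focal: str, window: int) -> set[str]:
    """Genes within `window` positions on either side of `focal` on one chromosome."""
    i = genes.index(focal)
    return set(genes[max(0, i - window): i] + genes[i + 1: i + 1 + window])


def shared_markers(order_a: list[str], order_b: list[str],
                   focal: str, window: int = 5) -> set[str]:
    """Markers found near the focal gene in both genomes."""
    return flanking(order_a, focal, window) & flanking(order_b, focal, window)


# Hypothetical gene orders: real marker names, invented positions and fillers.
tetrapod_like = ["NPTN", "HCN4", "B7-H3", "LOXL1", "SCAMP5", "PARP6", "PKM2"]
teleost_like = ["geneX", "PKM2", "PARP6", "B7-H3", "HCN4", "geneY", "NPTN", "LOXL1"]

print(sorted(shared_markers(tetrapod_like, teleost_like, "B7-H3", window=3)))
# prints ['HCN4', 'NPTN', 'PARP6']
```

Widening the window trades sensitivity for specificity: a larger window recovers markers displaced by local rearrangements but also admits unrelated neighbours, which is one reason marker orthology needs independent validation.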

3.7. Linkage groups and long-distance conserved syntenies: tracking the likely origin of the B7 family

As noted earlier (Fig. 6), the establishment of syntenic relationships for the primordial B7 genes in the vertebrates was not as straightforward as that for the B7H genes. In addition, searching for classical synteny groupings to track the origin of the B7/B7H family appeared inadequate due to long-term sequence drift and extensive recombination within the genomes. However, tracking long-range syntenies of well-defined molecules can be used to define weakly linked, ancient gene groups that were duplicated into large paralogous sets (also known as syntenic blocks). This is made possible by the fact that intra-chromosomal recombination events are more frequent than inter-chromosomal recombination events (O’Brien et al., 1999; Richard et al., 2003). Therefore, tracking the evolution of gene sets corresponding to old and long-range synteny blocks provides a synthetic view of paralogs in a genomic context, which could help future nomenclature revisions based on evolutionary origin. Such an approach revealed the existence of an ancestral cluster of IgSF genes defined by four sets of paralogs located in human chromosomes 1p, 3q, 11, 21q (and 19q), which contain many genes coding for membrane receptors involved in APC/lymphocyte interactions or the development of the nervous system (Daeron et al., 2008; Du Pasquier et al., 2004). These particular paralogous regions (1p, 3q, 11, 21q and 19q) track the common origin of many well-known regulatory and signaling molecules of the immune system, including genes of the B7 family: CD2, CD3 (γ, δ, ε and ζ), BTLA (CD272), members of the leukocyte receptor complex (LRC), and B7.1 (3q), B7.2 (3q, close to B7.1) and B7-H2 (21q). We therefore looked for the fish homologs of the markers located in the mammalian paralogous regions (1p, 19q, 11p and 21q), which led to the identification of two linkage groups in both stickleback and medaka (Fig. 8). In such an approach, the current nomenclature may be misleading since it has been built mainly from sequence similarity rather than solid phylogenetic assignment. Often one name is assigned rather than another based on minute sequence differences. Moreover, the phylogenetic analysis is sometimes difficult due to gene loss and different rates of genetic drift or selection, and the functional equivalence of these forces is not a reliable criterion for a common origin. However, this approach showed that these two teleost genomic regions represent an obvious reshuffling of the genes encountered in the 4 paralogous regions identified in human, including the B7R. Surprisingly, even markers from the human 19q region, which contains the LRC and is thought to be a fragment of chromosome 21, are retrieved (Daeron et al., 2008), supporting the hypothesis that human regions 21q21 and 19q13 constitute two parts of the same ancestral complex. Interestingly, the homologs of three markers belonging to the 3q linkage group containing B7.1 and B7.2 (JAM3, LSAMP and CD96) were encoded on the same chromosome as the B7R gene (Fig. 8). In addition, the markers JAM3, LSAMP, CD47 and CD166 all have homologs on stickleback LGVII. These observations suggest that the teleost B7R gene belonged to the same overall linkage group that includes B7.1/B7.2. In contrast, we did not find any B7H molecules in these long-range synteny blocks or any obvious syntenic relationships of the fish B7 genes with the major histocompatibility regions.

Fig. 8. Long-range synteny analysis of genes flanking B7R suggests an ancient common origin of B7R and 4 paralogous regions in mammals that include B7.1. Gene identifiers and locations in the genomes of medaka, stickleback and humans are shown. Positions were determined using Ensembl and BLAST/BLAT analysis. Circles define the location of B7R (CD80 homologue) in the appropriate genomes.

4. Discussion

In this article, we identified several new sequences of B7 family members in teleost fish. The fish B7R genes appear to be likely orthologs of B7.1 and B7.2 based upon several criteria including best Blast scores (% identity), phylogenetic analysis of Ig domains and the presence of conserved contact residues for binding the PPP-loop of CD28 and CTLA4. The fish counterparts of B7H1, B7H3 and B7H4 were also retrieved and constitute the likely orthologs of these B7H receptors. Our conclusions are also based upon our ability to establish syntenic relationships, both long- and short-range, for the B7 and B7H genes. These observations indicate that prototypical B7 and B7H molecules were present in the common ancestor of teleosts and mammals, defining ancient clades of stimulatory/inhibitory receptors. This is also supported by the finding of elasmobranch sequences related to B7-H2 and B7-H3, indicating that processes governing T-cell co-stimulation and inhibition were present at the origin of the gnathostomes.

Previously we demonstrated that teleost CD28 and CTLA4 possess canonical B7 ligand-binding motifs, implying that they should interact with a B7 molecule in a classical fashion (Bernard et al., 2007, 2006). This observation prompted our search for conserved motifs potentially involved in receptor/ligand interactions in the fish B7R sequences. The conserved positions in the fish B7R alignment matched well with key positions involved in the binding of B7.1/B7.2 to CD28 and CTLA4, thereby further supporting the view that the B7/CD28 pathway is a very ancient regulatory pathway for lymphocytes. Moreover, the teleost CD28 molecules are authentic co-stimulatory receptors, as they are capable of transducing strong activation signals in a mammalian context (Bernard et al., 2006). The function of fish CTLA4 is not as well defined, as we were previously unable to establish any signaling capacity for trout CTLA4. However, the significant conservation of the B7-binding region in the CTLA-4 V domain and the presence of the GxG motif responsible for high-affinity interactions between B7 and CTLA4 suggest that fish CTLA4 most likely binds B7R and may act as an inhibitory receptor by competition with CD28.

The discovery of B7-H1 orthologs in fish was slightly unexpected, as we could not identify a fish PD-1 homolog. It is difficult to exclude, on the basis of unproductive Blast and synteny searches, that a PD-1 homolog is present in the genomes of fish. However, we scanned all available databases for sequences similar to PD-1 and found only counterparts of other members of the immunoglobulin super gene family. In retrospect, two very intriguing reports have recently shown a clear interaction between B7.1 and B7-H1 in mammals, suggesting that this interaction may have been the primordial one for B7-H1. Therefore, the fish B7-H1 may have originally bound B7R in the absence of PD-1 (Butte et al., 2007, 2008). The B7-H1 receptor would then have acquired the capacity to bind PD-1 after its appearance in tetrapods. Interestingly, the binding interfaces are well conserved for both B7/CD28 and B7H1/PD-1, implying that the fish B7-H1 could bind B7R. The presumptive roles of B7-H3 and B7-H4 in primitive vertebrates are more elusive since their ligand(s) have yet to be defined in mammals. However, it is striking that the B7-H3 sequences are the most conserved among the whole family, suggesting that the modalities of ligand recognition/interaction for B7-H3 have been maintained during vertebrate evolution to fulfill an important functional role.

Aside from the structural signatures that imply functional conservation, the genomic and phylogenetic analyses provided pertinent information about the origins of the B7 family members. Although the accumulation of genomic rearrangements makes the classical short-range synteny approach nearly impossible for tracking the history of B7, we could assign it to one of the ancient sets of paralogous regions that has been maintained from tunicates to mammals and which contains many genes involved in APC/lymphocyte interactions (Du Pasquier et al., 2004). The functional and evolutionary significance of these particular syntenies awaits further investigation in other genomes such as those for elasmobranchs, but our observations reinforce the hypothesis that B7/B7R genes belong to a primordial set of immune receptors. These observations prompted us to investigate more ancient species such as the lamprey, an agnathan. Unfortunately, we could not find any B7-like genes in the lamprey genomic contigs, but sequences that harbor all the hallmarks of regulatory receptors, with a V domain, a C domain, a TM and an intra-cytoplasmic region with Y-based signaling motifs, are present in agnathans (Pancer et al., 2004; Haruta et al., 2006). The final assemblies of the lamprey and elasmobranch genomes will clarify whether such sequences are related to the paralogous sets described above and whether these receptors represent ancestors of the B7s.

In contrast to B7, the evolutionary links between the B7H genes and the paralogous regions described earlier were more elusive. While the human B7-H2 belongs to the linkage group located on chromosome 21, B7-H2 appears to be missing in teleosts, thus precluding further speculation about the existence of the linkage group to which it belongs (defined by DSCAM and NCAM2, see Fig. 8). For B7-H1, B7-H3 and B7-H4, it was difficult to determine whether they come from the linkage groups and were translocated, or whether they represent an older independent origin. Thus, we were not able to track the common ancestor of the B7H genes, and our observations simply show that the diversification of the B7H subset (into B7-H1, B7-H3, B7-H4) occurred before the split between teleosts and tetrapods (Fig. 9).

We therefore propose the following evolutionary scenario for signal 2 molecules: the common ancestor of fish and tetrapods expressed CD28, CTLA4, B7R and different B7H membrane receptors, with the B7R molecule binding both CD28/CTLA4 and B7-H1. CD28-, B7- and B7H-clades further diversified during vertebrate evolution in a class-specific way, leading to unique binding opportunities between newly emerging members of the CD28 and B7H clades. In this scenario, PD-1, B7-DC, B7.1 and B7.2 were produced

by gene duplication during tetrapod evolution and selected by coevolution with their ligands. When a duplication product was compatible (i.e. binding resulting in a “signal”) with a pre-existing receptor, it was co-opted into a new co-evolving unit constrained by the previous interaction. Such a model would explain the single B7-H1/DC ortholog in fish in the absence of PD-1. However, it is surprising that fish do not have more diverse CD28 and B7 families than mammals, since they have undergone one or more additional rounds of whole genome duplication, thus providing additional opportunities for recruitment and/or specialization of the duplicated products. Therefore, it appears that most of the additional copies have been lost, probably reflecting a functional simplicity of the fish lymphocyte activation system or, more generally, that gene loss is a common occurrence post-genome duplication in teleosts (Postlethwait et al., 2004; Semon and Wolfe, 2007). Thus, the teleost B7R and B7H molecules potentially reflect internal constraints due to the (regulatory) structure of the fish immune system. While mammals and birds have two B7 receptors with different functional properties, fish appear to have only one B7R per species, possibly mirroring the primordial pathway for delivering signal 2 that was established at the origin of cell-mediated immunity. B7-H2 and its ligand ICOS were not found in our teleost searches, but a B7-H2-related sequence is found in sharks and Xenopus (scaffold 201; 753441), as well as a putative ICOS (Bernard et al., 2007) in Xenopus and in other tetrapods. Since ICOS delivers positive stimulatory signals that are important for differentiation of T helper 1 and 2 cells, its absence in fish could reflect a fundamental difference in the regulation of Th1 and Th2 differentiation. Finally, teleosts possess typical B7-H1, B7-H3 and B7-H4 genes that in mammals are involved in T-cell tolerance and inhibition. Such receptors are logical members of the primordial system for controlling adaptive immunity: the regulation of T cell activation and tolerance. Of note, the high degree of sequence conservation of B7-H1 and B7-H3 suggests that they were tightly constrained by critical interactions for which potential mutations or gene loss could be detrimental, thus implying directional selection for these genes. Interestingly, while mammals possess two PD-1 ligands (B7-H1 and B7-DC), fish encode only one, whose ligand may be B7R. The recruitment of PD-1 as a B7-H1 ligand, coupled with the differentiation of two receptors (B7-H1 and DC), appears as a consequence of the step-wise expansion observed in the adaptive immune systems in tetrapods. The differentiation of new dedicated lymphoid tissues and microenvironments (lymph nodes and true germinal centers) during tetrapod evolution likely led to the creation of compartmentalization favorable to the step-wise expansion of the regulatory network of molecular interactions, which is reflected in the history of the CD28 and B7 families.

Fig. 9. Schematic depiction of the hypothetical evolution of the B7 family in vertebrates. 2R reflects the genome duplication events that are common to all gnathostomes, while 3R signifies the additional teleost-specific genome duplication event. Asterisks denote that partial sequences were found for B7-H1 and DC in the chicken and Xenopus genomes.

Acknowledgements

The authors would like to thank Pierre Pontarotti for critical comments in regard to the manuscript and for help with conserved synteny discovery. We also thank Olivier Chabrol for the bioinformatic development of CASSIOPE and Yong-An Zhang and Oriol Sunyer for sharing the trout B7R sequence prior to its release in GenBank. This work has been supported by U.S. Geological Survey base funding and the Institut National de la Recherche Agronomique. IMGT is supported by the CNRS and by the Ministère de l’Education Nationale, de l’Enseignement Supérieur et de la Recherche (Université Montpellier II Plan Pluri-Formation, Genopole Montpellier-Languedoc-Roussillon, Agence Nationale de la Recherche [ANR-06-BIOSYS-0005-01], European Union 6th PCRD ImmunoGrid project [IST-2004-0280069], Région Languedoc-Roussillon). The use of trade, firm or corporation names in this publication is for the information and convenience of the reader. Such use does not constitute an official endorsement or approval by the U.S. Department of Interior or the U.S. Geological Survey of any product or service to the exclusion of others that may be suitable.

Appendix A. Supplementary data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.molimm.2008.10.007.

References

Barber, D.L., Wherry, E.J., Masopust, D., Zhu, B., Allison, J.P., Sharpe, A.H., Freeman, G.J., Ahmed, R., 2006. Restoring function in exhausted CD8 T cells during chronic viral infection. Nature 439, 682–687.
Bernard, D., Hansen, J.D., Du Pasquier, L., Lefranc, M.P., Benmansour, A., Boudinot, P., 2007. Costimulatory receptors in jawed vertebrates: conserved CD28, odd CTLA4 and multiple BTLAs. Dev. Comp. Immunol. 31, 255–271.
Bernard, D., Riteau, B., Hansen, J.D., Phillips, R.B., Michel, F., Boudinot, P., Benmansour, A., 2006. Costimulatory receptors in a teleost fish: typical CD28, elusive CTLA4. J. Immunol. 176, 4191–4200.
Butte, M.J., Keir, M.E., Phamduy, T.B., Sharpe, A.H., Freeman, G.J., 2007. Programmed death-1 ligand 1 interacts specifically with the B7-1 costimulatory molecule to inhibit T cell responses. Immunity 27, 111–122.
Butte, M.J., Pena-Cruz, V., Kim, M.J., Freeman, G.J., Sharpe, A.H., 2008. Interaction of human PD-L1 and B7-1. Mol. Immunol. 45, 3567–3572.
Chapoval, A.I., Ni, J., Lau, J.S., Wilcox, R.A., Flies, D.B., Liu, D., Dong, H., Sica, G.L., Zhu, G., Tamada, K., Chen, L., 2001. B7-H3: a costimulatory molecule for T cell activation and IFN-gamma production. Nat. Immunol. 2, 269–274.
Choi, I.H., Zhu, G., Sica, G.L., Strome, S.E., Cheville, J.C., Lau, J.S., Zhu, Y., Flies, D.B., Tamada, K., Chen, L., 2003. Genomic organization and expression analysis of B7-H4, an immune inhibitory molecule of the B7 family. J. Immunol. 171, 4650–4654.
Chung, Y., Nurieva, R., Esashi, E., Wang, Y.H., Zhou, D., Gapin, L., Dong, C., 2008. A critical role of costimulation during intrathymic development of invariant NK T cells. J. Immunol. 180, 2276–2283.
Daeron, M., Jaeger, S., Du Pasquier, L., Vivier, E., 2008. Immunoreceptor tyrosine-based inhibition motifs: a quest in the past and future. Immunol. Rev. 224, 11–43.
Danchin, E.G., Pontarotti, P., 2004. Statistical evidence for a more than 800-million-year-old evolutionarily conserved genomic region in our genome. J. Mol. Evol. 59, 587–597.
Doty, R.T., Clark, E.A., 1998. Two regions in the CD80 cytoplasmic tail regulate CD80 redistribution and T cell costimulation. J. Immunol. 161, 2700–2707.
Du Pasquier, L., Zucchetti, I., De Santis, R., 2004. Immunoglobulin superfamily receptors in protochordates: before RAG time. Immunol. Rev. 198, 233–248.
Fife, B.T., Griffin, M.D., Abbas, A.K., Locksley, R.M., Bluestone, J.A., 2006. Inhibition of T cell activation and autoimmune diabetes using a B cell surface-linked CTLA-4 agonist. J. Clin. Invest. 116, 2252–2261.
Freeman, G.J., Gribben, J.G., Boussiotis, V.A., Ng, J.W., Restivo Jr., V.A., Lombard, L.A., Gray, G.S., Nadler, L.M., 1993. Cloning of B7-2: a CTLA-4 counter-receptor that costimulates human T cell proliferation. Science 262, 909–911.
Gouret, P., Vitiello, V., Balandraud, N., Gilles, A., Pontarotti, P., Danchin, E.G., 2005. FIGENIX: intelligent automation of genomic annotation: expertise integration in a new software platform. BMC Bioinform. 6, 198.
Greenfield, E.A., Nguyen, K.A., Kuchroo, V.K., 1998. CD28/B7 costimulation: a review. Crit. Rev. Immunol. 18, 389–418.
Haruta, C., Suzuki, T., Kasahara, M., 2006. Variable domains in hagfish: NICIR is a polymorphic multigene family expressed preferentially in leukocytes and is related to lamprey TCR-like. Immunogenetics 58, 216–225.
Henry, J., Miller, M.M., Pontarotti, P., 1999. Structure and evolution of the extended B7 family. Immunol. Today 20, 285–288.
Keir, M.E., Butte, M.J., Freeman, G.J., Sharpe, A.H., 2008. PD-1 and its ligands in tolerance and immunity. Annu. Rev. Immunol. 26, 677–704.
Kumar, S., Tamura, K., Nei, M., 2004. MEGA3: Integrated software for Molecular Evolutionary Genetics Analysis and sequence alignment. Brief Bioinform. 5, 150–163.
Larsen, C.P., Ritchie, S.C., Hendrix, R., Linsley, P.S., Hathcock, K.S., Hodes, R.J., Lowry, R.P., Pearson, T.C., 1994. Regulation of immunostimulatory function and costimulatory molecule (B7-1 and B7-2) expression on murine dendritic cells. J. Immunol. 152, 5208–5219.
Lefranc, M.P., Giudicelli, V., Kaas, Q., Duprat, E., Jabado-Michaloud, J., Scaviner, D., Ginestoux, C., Clement, O., Chaume, D., Lefranc, G., 2005. IMGT, the international ImMunoGeneTics information system. Nucleic Acids Res. 33, D593–D597.
Lefranc, M.P., Giudicelli, V., Regnier, L., Duroux, P., 2008. IMGT, a system and an ontology that bridge biological and computational spheres in bioinformatics. Brief Bioinform. 9, 263–275.
Lenschow, D.J., Ho, S.C., Sattar, H., Rhee, L., Gray, G., Nabavi, N., Herold, K.C., Bluestone, J.A., 1995. Differential effects of anti-B7-1 and anti-B7-2 monoclonal antibody treatment on the development of diabetes in the nonobese diabetic mouse. J. Exp. Med. 181, 1145–1155.
Nurieva, R.I., Mai, X.M., Forbush, K., Bevan, M.J., Dong, C., 2003. B7h is required for T cell activation, differentiation, and effector function. Proc. Natl. Acad. Sci. U.S.A. 100, 14163–14168.
O’Brien, S.J., Menotti-Raymond, M., Murphy, W.J., Nash, W.G., Wienberg, J., Stanyon, R., Copeland, N.G., Jenkins, N.A., Womack, J.E., Marshall Graves, J.A., 1999. The promise of comparative genomics in mammals. Science 286 (458–62), 479–481.
Pancer, Z., Mayer, E.W., Klein, J., Cooper, M.D., 2004. Prototypic T cell receptor and CD4-like coreceptor are expressed by lymphocytes in the agnathan sea lamprey. Proc. Natl. Acad. Sci. U.S.A. 101, 13273–13278.
Petrovas, C., Casazza, J.P., Brenchley, J.M., Price, D.A., Gostick, E., Adams, W.C., Precopio, M.L., Schacker, T., Roederer, M., Douek, D.C., Koup, R.A., 2006. PD-1 is a regulator of virus-specific CD8+ T cell survival in HIV infection. J. Exp. Med. 203, 2281–2292.
Postlethwait, J., Amores, A., Cresko, W., Singer, A., Yan, Y., 2004. Subfunction partitioning, the teleost radiation and the annotation of the human genome. Trends Genet. 20, 481–490.
Poussin, M.A., Tuzun, E., Goluszko, E., Scott, B.G., Yang, H., Franco, J.U., Christadoss, P., 2003. B7-1 costimulatory molecule is critical for the development of experimental autoimmune myasthenia gravis. J. Immunol. 170, 4389–4396.
Prasad, D.V., Richards, S., Mai, X.M., Dong, C., 2003. B7S1, a novel B7 family member that negatively regulates T cell activation. Immunity 18, 863–873.
Richard, F., Messaoudi, C., Bonnet-Garnier, A., Lombard, M., Dutrillaux, B., 2003. Highly conserved chromosomes in an Asian squirrel (Menetes berdmorei, Rodentia: Sciuridae) as demonstrated by ZOO-FISH with human probes. Chromosome Res. 11, 597–603.
Salomon, B., Bluestone, J.A., 2001. Complexities of CD28/B7: CTLA-4 costimulatory pathways in autoimmunity and transplantation. Annu. Rev. Immunol. 19, 225–252.
Semon, M., Wolfe, K.H., 2007. Reciprocal gene loss between Tetraodon and zebrafish after whole genome duplication in their ancestor. Trends Genet. 23, 108–112.
Sica, G.L., Choi, I.H., Zhu, G., Tamada, K., Wang, S.D., Tamura, H., Chapoval, A.I., Flies, D.B., Bajorath, J., Chen, L., 2003. B7-H4, a molecule of the B7 family, negatively regulates T cell immunity. Immunity 18, 849–861.
Suh, W.K., Gajewska, B.U., Okada, H., Gronski, M.A., Bertram, E.M., Dawicki, W., Duncan, G.S., Bukczynski, J., Plyte, S., Elia, A., Wakeham, A., Itie, A., Chung, S., Da Costa, J., Arya, S., Horan, T., Campbell, P., Gaida, K., Ohashi, P.S., Watts, T.H., Yoshinaga, S.K., Bray, M.R., Jordana, M., Mak, T.W., 2003. The B7 family member B7-H3 preferentially down-regulates T helper type 1-mediated immune responses. Nat. Immunol. 4, 899–906.
Suh, W.K., Wang, S., Duncan, G.S., Miyazaki, Y., Cates, E., Walker, T., Gajewska, B.U., Deenick, E., Dawicki, W., Okada, H., Wakeham, A., Itie, A., Watts, T.H., Ohashi, P.S., Jordana, M., Yoshida, H., Mak, T.W., 2006. Generation and characterization of B7-H4/B7S1/B7x-deficient mice. Mol. Cell Biol. 26, 6403–6411.
Sun, M., Richards, S., Prasad, D.V., Mai, X.M., Rudensky, A., Dong, C., 2002. Characterization of mouse and human B7-H3 genes. J. Immunol. 168, 6294–6297.
Wang, S., Zhu, G., Chapoval, A.I., Dong, H., Tamada, K., Ni, J., Chen, L., 2000. Costimulation of T cells by B7-H2, a B7-like molecule that binds ICOS. Blood 96, 2808–2813.
Zang, X., Loke, P., Kim, J., Murphy, K., Waitz, R., Allison, J.P., 2003. B7x: a widely expressed B7 family member that inhibits T cell activation. Proc. Natl. Acad. Sci. U.S.A. 100, 10388–10392.
Zhang, G., Hou, J., Shi, J., Yu, G., Lu, B., Zhang, X., 2008. Soluble CD276 (B7-H3) is released from monocytes, dendritic cells and activated T cells and is detectable in normal human serum. Immunology 123, 538–546.