Gene 481 (2011) 83–92


Gene

journal homepage: www.elsevier.com/locate/gene

Claudin-18 gene structure, regulation, and expression is evolutionary conserved in mammals

Özlem Türeci a,⁎, Michael Koslowski b, Gerd Helftenbein a, John Castle b, Christoph Rohde a, Karl Dhaene c, Gerhard Seitz d, Ugur Sahin e,b

a Ganymed Pharmaceuticals AG, Freiligrathstr. 12, 55131 Mainz, Germany
b TRON – Translational Oncology and Immunology GmbH, Langenbeckstr. 1, 55131 Mainz, Germany
c Algemeen Stedelijk Ziekenhuis Aalst, Department of Pathology, Merestraat 80, B-9300 Aalst, Belgium
d Hospital Bamberg, Department of Pathology, Buger Str. 80, 96049 Bamberg, Germany
e III. Department of Internal Medicine, Division of Translational and Experimental Oncology, Johannes Gutenberg University, Obere Zahlbacherstr. 63, 55131 Mainz, Germany

Article history:
Accepted 15 April 2011
Available online 4 May 2011
Received by A.J. van Wijnen

Keywords:
Tight junction
Claudins
Ortholog
Conservation

Abstract

Claudin-18 isoform 2 (CLDN18.2) is one of the few members of the human claudin family of tight junction molecules with strict restriction to one cell lineage. The objective of the current study was to compare molecular structure and tissue distribution of this gastrocyte-specific molecule in mammals. We show here that the CLDN18.2 sequence is highly conserved, in particular with regard to functionally relevant domains, in mouse, rat, rabbit, dog, monkey and human and also in lizards. Moreover, promoter regions of orthologs are highly homologous, including the binding site of the transcription factor cyclic AMP–responsive element binding protein (CREB), which is known to regulate activation of human CLDN18.2. Employing RT-PCR and immunohistochemistry, we found that, analogous to the human gene, all orthologous CLDN18.2 transcripts and proteins are exclusively expressed in differentiated gastric cells. Gene structure, promoter elements and RNA expression pattern of the lung tissue-specific Claudin-18 isoform 1 (CLDN18.1) are likewise homologous across species. These findings exemplify phylogenetic conservation of lineage-specific members of a multigene family. Given that CLDN18.2 is a novel drug target candidate, our data are also relevant for drug development, as they reveal all six investigated mammalian species to be suitable models for testing the safety of CLDN18.2-targeting regimens.

© 2011 Elsevier B.V. All rights reserved.

Abbreviations: CLDN18.2, Claudin-18 isoform 2; CLDN18.1, Claudin-18 isoform 1; CREB, cyclic AMP response element-binding protein; RT-PCR, reverse transcription polymerase chain reaction; TJ, tight junction; BLAST, basic local alignment search tool; nr, non-redundant protein database; RefSeq, NCBI Reference Sequence database; T/EBP/NKX2.1, homeodomain transcription factor; EBI, European Bioinformatics Institute; IgG, immunoglobulin G; HRP, horseradish peroxidase; FITC, fluorescein isothiocyanate; FFPE, formalin fixed paraffin-embedded; min, minutes; PBS, phosphate buffered saline; DAB, 3,3′-diaminobenzidine; kb, kilobases; UCSC, University of California Santa Cruz; aa, amino acids; bp, base pairs; TM, transmembrane domain; ECL, extracellular loop; PDZ, PDZ protein domain; MUPP-1, multi-PDZ domain protein 1; ICH, The International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use; CPMP, Committee for Proprietary Medicinal Products; ID, identifier; IHC, immunohistochemistry; NTC, no template control.

⁎ Corresponding author at: Ganymed Pharmaceuticals AG, Freiligrathstr. 12, 55131 Mainz, Germany. Tel.: +49 6131 101; fax: +49 6131 114.
E-mail address: [email protected] (Ö. Türeci).

0378-1119/$ – see front matter © 2011 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2011.04.007

1. Introduction

In multicellular organisms, maintenance of compartments that differ in fluid and solute composition is ensured by epithelial and endothelial cell layers, which control the passage of water and solutes (Smetana, 1947). The paracellular aspect of this control is mainly attributed to cell–cell contact sites known as tight junction (TJ) strands (Farquhar and Palade, 1963). TJs have additional roles, such as maintaining cell polarity by forming a barrier that prevents lateral diffusion of membrane proteins and lipids, and recruiting signaling molecules for regulation of proliferation, differentiation, motility and other cellular functions (Stevenson et al., 1988).

Tight junctions require claudin molecules for their formation (Furuse et al., 1998). The claudin family comprises more than two dozen four-pass transmembrane proteins with similar sequence and structure, but divergent tissue distribution. Whereas most claudins are active in multiple tissues or even expressed ubiquitously, a few claudins are restricted to single tissues (reviewed in (Krause et al., 2008)). Some cell types express unique claudin species (Fujita et al., 2006). For correct tissue-specificity of transcriptional activation, individual claudin genes are controlled by unique regulatory mechanisms.

Tight junction strands generally contain multiple claudin species that may interact in homo- and heterophilic ways (Gonzalez-Mariscal et al., 2003; Morita et al., 1999). It is hypothesized that the distinct composition of claudins plays a major role in variable physiological properties of TJs in different tissues (Anderson, 2001; Mitic et al., 2000; Rahner et al., 2001). Furthermore, due to their crucial function in compartment separation, TJs are an evolutionarily well-conserved component of vertebrate cytoarchitecture (Kollmar et al., 2001). However, phylogenetic analysis of claudins has been largely done in the context of evolutionary aspects concerning tight junction conservation (Krause et al., 2008) and little is known about the evolutionary conservation of single claudins.

Recently, a member of the claudin multigene family, isoform 2 of claudin-18 (CLDN18.2), appeared among the hits in a combined in silico data mining and wet bench strategy to identify gastrocyte lineage–specific cell surface molecules for use as therapeutic antibody targets (Sahin et al., 2008). We showed that expression of CLDN18.2 in normal tissues is strictly confined to differentiated epithelial cells of the gastric mucosa and is absent from the gastric stem cell zone. CLDN18.2 is retained upon malignant transformation and is expressed in a significant proportion of primary gastric cancers and metastases thereof. Moreover, we found frequent ectopic activation of CLDN18.2 in pancreatic, esophageal, ovarian, and lung tumors. The closely related splice variant isoform 1 (CLDN18.1), in contrast, is exquisitely restricted to cells of lung tissue (Niimi et al., 2001).

The strict lineage specificity of both CLDN18 variants in normal human tissues prompted us to consider this gene as a model to investigate how gene structure, regulation and expression of highly

Fig. 1. Phylogenetic conservation of gene structure, transcript sequence and cis-acting regulatory promoter elements of CLDN18. (A) Genomic organization of the human CLDN18 gene on 3q22, which was used as reference. CLDN18.1 and CLDN18.2 result from alternative splicing of the first exon. (B) Screenshot of the UCSC genome browser shows an alignment of the human CLDN18 gene locus against genomic sequences of primate, mammals, and vertebrates. (Top) RefSeq transcripts feature alternative first exons for transcripts encoding isoform 1 or isoform 2 (Kent et al., 2002). (Middle) Summarized conservation across 46 vertebrates by PhastCons, with taller bars indicating higher conservation (Siepel et al., 2005). (Lower) Cross-species alignments for selected species, with bars indicating conservation (Blanchette et al., 2004). (C) Alignment of the Claudin18.1 and Claudin18.2 promoter sequences. Cis-acting regulatory elements described for the human CLDN18 isoforms are boxed. The genomic sequence of O. cuniculus for the Claudin18.1 promoter was not available.

selective members of multigene families with housekeeping function evolve across species.

2. Materials and methods

2.1. Sequence retrieval and analysis

Orthologs of human CLDN18 isoforms were detected by BLAST searches either in the non-redundant protein database (nr) at NCBI (http://blast.ncbi.nlm.nih.gov/Blast.cgi) or in all available species databases at the Ensembl server site (http://www.ensembl.org/Multi/blastview). In each case the human CLDN18.2 protein sequence (RefSeq ID: NP_001002026) served as query. Where RefSeq NM transcripts and NP proteins were not available, we selected Ensembl identifiers. A BLAST search of the human CLDN18.1 and CLDN18.2 proximal core promoter sequences up to −280 bp, encompassing the T/EBP/NKX2.1 binding sites (CLDN18.1) and the CREB binding site (CLDN18.2), against the genomic sequence databases of all selected species led to the orthologous promoter sequences. All alignments were conducted with the ClustalW2 program via the web interface at the EBI webserver (http://www.ebi.ac.uk/Tools/clustalw2/index.html).

2.2. RNA extraction and cDNA synthesis

Fresh frozen human tissue specimens were purchased from different commercial providers. Monkey (Macaca fascicularis) tissue sections were purchased from LPT (Hamburg, Germany) and dog (Canis familiaris, Beagle) tissues from Aurigon (Tutzing, Germany); rabbit (Oryctolagus cuniculus), rat and mouse tissue sections were prepared in house. RNA preparation and first strand cDNA synthesis were performed as described elsewhere (Grunwald et al., 2006; Koslowski et al., 2006).

2.3. RT-PCR profiling

For end-point RT-PCR analysis, degenerate sense primers specific for CLDN18.1 (5′-GTG TTC CAr TAy GAr GGG CTs TGG-3′) or CLDN18.2 (5′-CTG ATy GGG wTT GCv GGC ATy ATT GC-3′) orthologs from monkey, dog, rabbit, rat and mouse were used in combination with a degenerate antisense primer (5′-CCA GAA GTT rGT nAC sAG CAT GTT GG-3′) at an annealing temperature of 64 °C in a 35 cycle PCR reaction. Real-time quantitative analysis of transcript expression was performed using the ABI PRISM 7900 Sequence Detection System instrument and software (Applied Biosystems). Expression of 18S rRNA (sense: 5′-CGA TGC TCT TAG CTG AGT GTC-3′; antisense: 5′-TAA CCA GAC AAA TCG CTC CAC-3′; 65 °C) was assessed in hexamer-primed cDNAs using the QuantiTect SYBR Green PCR Kit (Qiagen). CLDN18 transcripts were amplified in a 40 cycle PCR reaction using the same primers used for conventional RT-PCR. The relative expression levels of CLDN18 transcripts were computed using ΔΔCT calculation with respect to the internal 18S rRNA standard to normalize for variances in the quality of RNA and the amount of input cDNA.

2.4. First and second antibodies

For immunohistochemistry, four different antibodies with IgG constant regions from three different species were available. This facilitated detection of CLDN18 across different species. As a general rule, we chose secondary antibodies derived from a different species than the first antibody they were combined with. Rabbit antisera anti-CLDN18.2/n-term (generated at Eurogentec) and anti-CLDN18/mid (purchased from Zymed) were used on fixed paraffin embedded tissue in combination with the second antibody Powervision Poly HRP goat-anti-rabbit (ImmunoLogic). Anti-CLDN18/mid was the only one among the four antibodies that was cross-reactive with CLDN18.1. Mouse IgG2a muMAB362 and its chimerized human IgG1 variant iMAB362, which recognize a complex conformational epitope, were used for staining of cryo sections. Here, Powervision Poly HRP goat-anti-mouse (ImmunoLogic) and Rabbit-anti-human IgG (Jackson ImmunoResearch Laboratories) followed by incubation with Powervision Poly HRP goat-anti-rabbit (ImmunoLogic), respectively, were used as second antibodies. None of these first/second antibody sets was used for staining of tissues from a species from which one of the two antibodies was derived. The anti-CD20 antibody Rituximab was used as negative control.

For some experiments, antibodies were conjugated with FITC (by Squarix) and used in combination with Rabbit-anti-FITC-HRP (AbD Serotec) as second antibody, again to prevent unspecific staining by direct binding of the secondary antibodies to tissue sections.

2.5. Immunohistochemistry

Tissue microarrays were either prepared by standard techniques from frozen (cryo) or formalin fixed paraffin-embedded (FFPE) tissue sections or purchased from the companies Biomax (Rockville, USA) and Biochain (Hayward, USA). Commercial tissue microarrays were treated according to the suppliers' instructions prior to antibody treatment.

Sections of FFPE tissues were deparaffinized and treated for 30 min at 95 °C with DakoCytomation Target Retrieval solution (DakoCytomation, Glostrup, Denmark) for epitope retrieval. Cryo sections were fixed for 10 min in cold acetone at −20 °C, air-dried and either directly processed or stored at −80 °C. Sections were rehydrated in PBS before staining. Staining of both FFPE and cryo sections was performed according to the same protocol. Endogenous peroxidases were quenched by incubation in 0.3% H2O2 in PBS. After washing with PBS, unspecific antibody binding sites were blocked with blocking buffer (10% goat serum in PBS), followed by incubation with the primary antibody diluted in blocking buffer. Samples were then washed 3 times with PBS and incubated with the respective secondary antibody. Visualization was performed using the peroxidase substrates DAB (DakoCytomation) or Nova Red substrate kit (Vector Laboratories, Burlingame, CA, USA). After counterstaining with Hematoxylin, dehydration and mounting, sections were analyzed microscopically.
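Two computational details of Section 2.3 can be made concrete with a minimal Python sketch. The first function expands the degenerate primers quoted in the text into the concrete oligonucleotides they encode, using the standard IUPAC ambiguity codes. The second shows the usual 2^-ΔΔCT form of relative quantification against the 18S rRNA reference. The function names are hypothetical, and the choice of calibrator sample and the assumption of roughly 100% amplification efficiency are illustrative assumptions; the text does not specify either.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences as printed in Section 2.3 (lower case marks degenerate positions).
CLDN18_1_SENSE = "GTG TTC CAr TAy GAr GGG CTs TGG"
CLDN18_2_SENSE = "CTG ATy GGG wTT GCv GGC ATy ATT GC"
ANTISENSE = "CCA GAA GTT rGT nAC sAG CAT GTT GG"


def expand_primer(primer: str) -> list[str]:
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.replace(" ", "").upper()]
    return ["".join(combo) for combo in product(*choices)]


def relative_expression(ct_target: float, ct_ref: float,
                        ct_target_cal: float, ct_ref_cal: float) -> float:
    """Relative quantity by the 2^-ddCt method.

    Assumption (not stated in the text): a calibrator sample is used and
    amplification efficiency is close to 100% for target and reference.
    """
    delta_sample = ct_target - ct_ref              # normalize sample to 18S rRNA
    delta_calibrator = ct_target_cal - ct_ref_cal  # normalize calibrator to 18S rRNA
    return 2.0 ** -(delta_sample - delta_calibrator)


if __name__ == "__main__":
    for name, primer in [("CLDN18.1 sense", CLDN18_1_SENSE),
                         ("CLDN18.2 sense", CLDN18_2_SENSE),
                         ("antisense", ANTISENSE)]:
        print(name, len(expand_primer(primer)), "variants")
    # Hypothetical Ct values, for illustration only.
    print(relative_expression(24.0, 10.0, 25.0, 10.0))
```

Counting the expanded variants gives a quick sense of how degenerate each primer pool is; the Ct values in the usage lines are invented and carry no experimental meaning.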

Table 1 CLDN18 splice variants in six mammalian species are listed by either RefSeq IDs or Ensembl IDs. Isoforms 1 and 2 nomenclature are standardized according to the human nomenclature.

Species   Chromosome   Isoform 1 transcript   Isoform 1 protein    Isoform 2 transcript   Isoform 2 protein
Human     3            NM_016369              NP_057453            NM_001002026           NP_001002026
Dog       23           ENSCAFT00000011986     ENSCAFP00000011114   ENSCAFT00000011985     ENSCAFP00000011113
Macaque   3            ENSMMUT00000013455     ENSMMUP00000012607   ENSMMUT00000013453     ENSMMUP00000012605
Mouse     9            NM_019815              NP_062789            NM_001194921           NP_001181850
Rabbit    14           ENSOCUT00000027967     ENSOCUP00000018995   ENSOCUT00000014966     ENSOCUP00000012862
Rat       8            NM_001014096           NP_001014118         ENSRNOT00000044524     ENSRNOP00000041474

Fig. 2. Conservation of protein sequence and composition of CLDN18 isoforms across mammals. Alignment of the Claudin18.1 and Claudin18.2 protein sequences of primate and mammalian species. Transmembrane domains (TMs) 1–4 are shown in gray. The consensus claudin family signature located in the first extracellular loop (ECL1) is boxed. Negatively charged residues in ECL1 are in red, positively charged residues in green. Conserved cysteines are depicted in yellow.

3. Results

3.1. Gene structure and transcript sequence of CLDN18 orthologs

The human CLDN18 gene locus on chromosome 3q22 covers approximately 35 kb and is organized in 6 exons and 5 introns (Fig. 1A). Alternative usage of exons 1a and 1b gives rise to the two isoforms CLDN18.1 (RefSeq ID: NM_016369, NP_057453) and CLDN18.2 (RefSeq ID: NM_001002026, NP_001002026), which code for closely related proteins (Fig. 1A and B top). To gain insight into the evolutionary conservation of Claudin-18, we used the human CLDN18 protein to search for orthologous proteins in macaque, mouse, rat, dog, and rabbit utilizing sequence similarity search and genome alignments. Starting with the human CLDN18 protein sequences to predict and annotate the gene structures of orthologs, we were able to find both isoforms in all species (Table 1). Interspecies sequence conservation can be computed using a probabilistic model that describes the process of DNA substitution at each site (Siepel et al., 2005). This summarization of conservation can be viewed using the “Conservation by PhastCons” genome track found in the UCSC Genome Browser (Kent et al., 2002) and allowed us to examine interspecies conservation (Fig. 1B middle). Additionally, we analyzed the alignments of individual species genomes to the human genome, generated through the “multiz” algorithm that performs a progressive genome-genome alignment to generate the best-in-genome pairwise alignment (Blanchette et al., 2004). We exploited this tool to obtain cross-species “multiz” alignments for selected representatives of mammals (Pan troglodytes, Macaca mulatta, Mus musculus, Rattus

Fig. 3. Conserved cell lineage specificity of CLDN18 isoforms. Tissue distribution of CLDN18 isoforms in different species was analyzed by (A) end point and by (B) quantitative RT-PCR. NTC; no template control. (C) Immunohistochemistry with FITC-labeled splice-variant 2 specific antibody IMAB362. Rituximab was used as negative control. (D) Immunohistochemical analysis of CLDN18 protein expression with anti-claudin-18/mid antiserum, which is reactive with both splice variants. (E) Epithelia from the antral region of the stomach were stained with anti-claudin-18/n-term antiserum.


norvegicus, Oryctolagus cuniculus, Canis familiaris) and vertebrates (chicken, lizard, Xenopus tropicalis) (Fig. 1B, lower, “Multiz Alignments of 46 Vertebrates”).

Across the entire locus these analyses revealed higher conservation across primates, some conservation in mammals, and less in chicken, lizard and frog. The intronic and intergenic regions of chicken, lizard and frog have no sequence conservation to human. All exons and thus the entire coding region, in contrast, show high levels of conservation, which extends to chicken, lizard and frog. For example, between the human and rabbit coding regions of CLDN18.2, 243 of 261 aa residues (94%) are identical and 254 aa residues are similar, without any gaps. The protein coding region of CLDN18.2 exon 1 contains 220 nucleotides. Between human and rabbit, 71 of 73 aa residues are identical (97%). At the nucleotide level, there are no insertions or deletions in this alternative exon across human, macaque, mouse, rat, dog and rabbit, and 81% of the nucleotides are identical. The CLDN18.1 first exon shows a similarly high level of conservation, with only four non-identical residues and no gaps between human and rabbit, for example. In summary, the CLDN18 isoforms are phylogenetically highly conserved with regard to gene structure and nucleotide sequence.

3.2. Promoter regions of CLDN18 orthologs

We have previously reported that the activation of human CLDN18.2 depends on the binding of the transcription factor cyclic AMP–responsive element binding protein (CREB) to its unmethylated consensus site within a CpG island (Sahin et al., 2008). To determine whether this mechanism plays a role in the regulation of other CLDN18.2 orthologs, we fragmented the CLDN18.2 promoter along lines of evolutionary conservation. A well-conserved 180 bp hypothetical core promoter exists immediately upstream of the transcription start site, and the overall sequence conservation shows a sharp decline further upstream (Fig. 1B). We analyzed this region and found high conservation even to lizards and chicken. Moreover, sequence alignment shows that among the six primates and mammals, the CREB binding site TGACGTG is perfectly conserved, as is an unusually high number of nucleotides (37 of 55) between the start of the CREB binding site and the transcription start site.

Interestingly, the lung-specific isoform 1 of mouse CLDN18 has been shown to be activated by T/EBP/NKX2.1 (Niimi et al., 2001), a homeodomain-containing transcription factor that is restricted to alveolar type II and Clara cells of the lung and to cells in brain and thyroid. We found that the two T/EBP/NKX2.1 binding sites reported as being responsible for activation of this gene in lung cells are also present in promoter regions of all investigated orthologs (Fig. 1C). Moreover, the entire 280 bp sequence region proximal to the transcription start is nearly identical for all orthologous genes.

3.3. Protein sequences of CLDN18 orthologs

Next, we compared protein sequences of orthologs from mouse, rat, rabbit, dog and primates for the splice variants of human CLDN18 (Fig. 2) and found an overall sequence identity higher than 84% across orthologs of each isoform.

Overall sequence identity between human CLDN18.1 and CLDN18.2 is 91%, with the differences restricted to the N-terminus that is encoded by exons 1a and 1b. The 73 amino acid sequence encoded by CLDN18.2 exon 1a is about 95% identical across the six species, as is the N-terminal sequence region encoded by CLDN18.1 exon 1b.

Both splice variants of human CLDN18 harbor four hydrophobic transmembrane (TM) domains, a short intracellular N-terminal sequence, two extracellular loops (ECL) of 73 and 23 aa and a cytoplasmic C-terminal sequence (71 aa). This overall structure and the framing of aforementioned elements, which dictate the tetraspanin membrane


topology of CLDN18, are conserved in all analyzed orthologs (Fig. 2). Only the mouse sequence contains a four amino acid insertion, located in the second extracellular domain.

Most interestingly, other essential sequence elements were also represented in orthologs. These include the consensus claudin family signature [GN]-L-W-x(2)-C-x(7,9)-[STDENQH]-C located in the first ECL, which harbors two cysteines (aa 52 and 63) that potentially form an intramolecular disulfide bond to stabilize protein conformation (Angelow and Yu, 2009). Moreover, the two conserved cysteines at positions 103 and 193, which serve as palmitoylation sites, as well as the C-terminal PDZ binding motif, which mediates direct interaction with tight junction-associated cytoplasmic proteins such as ZO-1, -2, and -3 or multi-PDZ domain protein (MUPP-1), are also conserved.

The larger first ECL of claudin family members affects the paracellular tightness and the selective ion permeability (Krause et al., 2009), whereas the shorter second ECL is thought to be responsible for narrowing of the paracellular cleft and has a holding function between the opposing cell membranes. When we compared human CLDN18 splice variants, we found a considerable difference in the total number and distribution of negatively and positively charged residues in ECL1 of CLDN18.1 and CLDN18.2, suggesting differences in selective ion permeability (Fig. 2). Interestingly, in orthologs of each splice variant the pattern of charged amino acids was found to be conserved.

Altogether, these data imply conservation of the cellular function of CLDN18 variants in mammals and primates.

3.4. Tissue distribution of CLDN18 orthologs

To determine experimentally whether cell type specificity of CLDN18.2 expression is evolutionarily conserved, we assayed a comprehensive set of more than 30 different tissue and organ types in six mammals.
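As a side note on the claudin family signature quoted in Section 3.3, the PROSITE-style pattern [GN]-L-W-x(2)-C-x(7,9)-[STDENQH]-C can be translated into a regular expression and scanned against a protein sequence. The following is a minimal Python sketch; the helper name `prosite_to_regex` is hypothetical, it handles only the syntax elements that occur in this signature (residue classes, `x` wildcards and repeat counts), and the peptides in the usage lines are synthetic toy strings, not CLDN18 sequence.

```python
import re

CLAUDIN_SIGNATURE = "[GN]-L-W-x(2)-C-x(7,9)-[STDENQH]-C"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex.

    Supports only what the signature above needs: single residues,
    bracketed residue classes, 'x' wildcards and (n) / (n,m) repeats.
    Exclusion classes and anchors are deliberately not handled.
    """
    pieces = []
    for element in pattern.split("-"):
        core, _, repeat = element.partition("(")
        core = "." if core == "x" else core  # 'x' means any residue
        if repeat:
            pieces.append(core + "{" + repeat.rstrip(")") + "}")
        else:
            pieces.append(core)
    return "".join(pieces)


def find_signature(sequence: str):
    """Return (start, end) of the first signature match, or None."""
    match = re.search(prosite_to_regex(CLAUDIN_SIGNATURE), sequence)
    return match.span() if match else None


if __name__ == "__main__":
    print(prosite_to_regex(CLAUDIN_SIGNATURE))
    # Synthetic toy peptides, for illustration only.
    print(find_signature("MAAGLWRSCAAAAAAAASCVV"))
    print(find_signature("GLWRSCAAAAAASC"))
```

The spacing x(7,9) between the two cysteines is consistent with the positions given in the text (aa 52 and 63): nine variable residues and one residue from the [STDENQH] class lie between them.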

CLDN18.2 transcript-specific expression was analyzed both by end point (Fig. 3A) and by quantitative RT-PCR (Fig. 3B). Multiple samples in up to 34 different tissue types per species were investigated (Table 2). We classified CLDN18.2 as present when the signal was reproducible and greater than 1/10 of the maximum signal obtained in gastric mucosa. Meta-analysis of these data showed expression to be restricted to gastric mucosa in the primates and mammals we studied. Splice variant 1 of CLDN18, in contrast, was transcriptionally activated only in lung tissue of all species.

Exclusive expression of CLDN18.2 in stomach was confirmed on protein level by screening up to 16 different tissues per species by immunohistochemistry. For human, monkey and mouse tissues three different antibodies were used to confirm data with independent reagents. A selection of IHC data obtained with CLDN18 splice variant 2 specific antibodies is summarized in Table 2 and exemplified in Fig. 3C. Both clearly confirm restriction to gastric tissue. The isoform-cross-reactive antibody anti-CLDN18/mid, in contrast, additionally stained epithelial cells of lung tissue (Fig. 3D).

We previously reported that in human stomach mucosa CLDN18.2 staining is observed in differentiated exocrine and endocrine cells of gastric glands and not in cells of the neck zone harboring gastric stem cells (Sahin et al., 2008). We found a similar zonated staining pattern in monkey and mouse stomach tissue sections (Fig. 3E).

Conservation of tissue expression pattern is in line with our finding that promoter gene regions and sites essential for transcriptional regulation of CLDN18 are highly homologous in all orthologs.
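The presence call described for the RT-PCR data (signal reproducible and greater than 1/10 of the maximum gastric mucosa signal) amounts to a simple threshold rule. A minimal Python sketch is given below under stated assumptions: the function names are hypothetical, signals are taken to be on an arbitrary linear scale, and the example values are invented. Summarizing the calls as "positive/tested" is one plausible way to arrive at entries of the kind shown in Table 2, not a description of the authors' actual workflow.

```python
def is_present(signal: float, gastric_max: float, reproducible: bool) -> bool:
    """Call a transcript present if the signal is reproducible and
    strictly greater than one tenth of the maximum gastric signal."""
    return reproducible and signal > gastric_max / 10.0


def positive_fraction(samples, gastric_max: float) -> str:
    """Summarize (signal, reproducible) pairs as 'positives/tested'."""
    positives = sum(is_present(signal, gastric_max, reproducible)
                    for signal, reproducible in samples)
    return f"{positives}/{len(samples)}"


if __name__ == "__main__":
    # Invented example values on an arbitrary scale, for illustration only.
    tissue_samples = [(0.5, True), (0.02, True), (0.3, False)]
    print(positive_fraction(tissue_samples, gastric_max=1.0))
```

Note that a signal exactly at the 1/10 threshold is not called present, matching the "greater than" wording of the text.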

Table 2 Expression of CLDN18.2 transcript and protein orthologs as assayed by RT-PCR and immunohistochemistry. The following splice variant 2 specific antibodies were used for IHC data summarized in this table (and others as independent confirmation for data not shown here or exemplified in Fig. 3): Human and monkey tissues were stained with anti-CLDN18/n-term (FFPE), muMAB362 and IMAB362-FITC (cryo) and data were combined in the table. Rabbit tissue was stained with muMAB362, whereas dog, mouse and rat tissues were stained with IMAB362 (cryo).

Human Monkey Dog Rabbit Rat Mouse

RT-PCR IHC RT-PCR IHC RT-PCR IHC RT-PCR IHC RT-PCR IHC RT-PCR IHC

Abdominal organs
Esophagus 0/6 0/5 0/2 0/2 0/1 0/1 1/1 0/1 0/1 0/1 0/2 0/1
Stomach 10/11 16/16 2/2 3/3 1/1 1/1 1/1 1/1 1/1 1/1 2/2 1/1
Intestine – 0/3 0/1 0/5 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Colon 0/3 0/13 0/2 0/2 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Liver 0/4 0/4 0/2 0/3 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Pancreas 0/5 0/4 0/2 0/2 0/1 0/1 0/1 0/1 0/1 0/1 0/1 0/1
Gall bladder – 0/3 0/2 0/2 – – – – – – 0/2 –

Urogenital organs
Breast 0/5 0/7 0/2 0/2 – – 0/1 0/1 – 0/1 – –
Ovary 0/5 0/7 0/1 0/1 – – 0/1 0/1 0/1 0/1 0/1 0/1
Kidney 0/4 0/13 0/2 0/3 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Testis 0/4 0/3 0/1 0/1 – – 0/1 0/1 0/1 0/1 0/1 0/1
Cervix 0/2 0/2 0/1 – – – – – – – – –
Placenta 0/3 0/3 – – – – – – – – – –
Uterus 0/4 0/1 0/1 0/1 – – 0/1 0/1 0/1 – 0/1
Fallopian tube – – 0/1 0/1 – – – – – – –
Urinary bladder 0/5 0/3 0/2 0/2 – – 0/1 0/1 0/1 0/1 0/2 0/1
Ureter – – 0/2 0/2 – – 0/1 0/1 0/1 0/1 – –
Prostate 0/5 0/6 0/1 0/1 – – 0/1 0/1 0/1 0/1 – 0/1
Endometrium – 0/3 – – – – – – – – – –

Cardiopulmonary system
Lung 0/10 0/5 0/2 0/3 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Heart 0/5 0/3 0/1 0/2 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Endothelia (aorta) – 0/2 0/2 0/2 0/1 0/1 0/1 0/1 0/1 0/1 – 0/1

Nervous system
Total brain 0/5 – – – 0/1 0/1 – 0/1 0/1 – – –
Cerebellum 0/1 0/3 0/2 0/2 – 0/1 – – 0/1 0/1 0/2 0/1
Cerebrum – 0/2 0/2 0/2 – – – – – – 0/2 –
Pituitary gland 0/1 0/3 0/2 0/2 – – – – – – – –
Cerebral cortex 0/1 0/4 0/2 0/2 0/1 – – – 0/1 – – –
Spinal cord – – 0/1 0/2 – – – – 0/1 0/1 – 0/1
Retina 0/1 0/3 0/2 0/2 – – 0/1 0/1 0/1 0/1 – 0/1
Inner ear – – 0/2 0/2 – – – 0/1 – 0/1 0/1

Hematopoietic system
Bone marrow 0/1 0/5 1/2 0/2 – 0/1 – – 0/1 0/1 – 0/1
Lymph nodes 0/5 0/3 0/4 0/4 – – – – 0/1 0/1 0/2 0/1
Thymus 1/2 – 0/2 0/2 – – 0/1 0/1 0/1 0/1 0/2 0/1
Tonsil 0/2 – – – – – – – – – –
Spleen 0/3 0/3 0/2 0/3 – – 0/1 0/1 0/1 0/1 0/2 0/1
Blood cells 0/5 – – 0/2 – 0/1 0/1 0/1 0/1 0/1 0/1

Others
Adrenal gland – 0/3 0/2 0/2 – 0/1 – – 0/1 0/1 – –
Salivary gland – 0/3 – – – – – – – – – –
Skeletal muscle 0/4 0/3 – 0/3 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1
Thyroid 0/3 0/3 0/2 0/2 – – 0/1 0/1 0/1 0/1 – 0/1
Skin 0/3 0/3 0/1 0/2 0/1 0/1 0/1 0/1 0/1 0/1 0/2 0/1

4. Discussion

The findings described in this study are of relevance for two areas of investigation.

One is the analysis of gene expression evolution, of which the underlying patterns are poorly understood. The advancement of high-throughput technologies for oligonucleotide microarrays and sequencing has enabled characterization of the expression of thousands of genes simultaneously, opening the door for studies into the general principles of gene expression evolution (Brazma and Vilo, 2000; Holter et al., 2001). Reported findings with regard to disparity of orthologs, however, are ambiguous (Liao and Zhang, 2006; Yanai et al., 2004). There are in fact genes which, consistent with common sense, are conserved across species with regard to coding sequence, function as well as tissue distribution (Li et al., 1997). Others, in contrast, differ largely from their orthologous counterparts and are, for example, activated in entirely different tissues (Pao et al., 2006). Most likely, sequence and expression profile of different categories of genes evolve under different phylogenetic constraints and there is no general rule. Claudins are particularly interesting model genes, as they have to fulfill two seemingly contradictory requirements. Being building blocks of tight junctions and thus conferring essential housekeeping functions, claudins have to be expressed throughout all epithelial cells regardless of their lineage. On the other hand, their housekeeping function has to manifest itself in a tissue-specific way (for example, distinct ion and solute permeabilities). These requirements are met by a combinatorial claudin code, as claudins constitute a family with multiple members, which occur in various tissue-specific combinations. With CLDN18 as a model, our data imply that specialization of members of such gene families occurs early in the evolution of mammals.

The second area impacted by our data is drug development. CLDN18.2 is frequently activated in various cancer types, is absent from indispensable and vital normal organs, is expressed on the cell surface and belongs to a gene family that is involved in tumor-promoting cellular functions. All this makes CLDN18.2 an attractive target for antibody based therapies. Two drug development programs are underway. One features iMAB362, a recombinant chimeric monoclonal antibody against CLDN18, which is highly specific for splice variant 2 (Sahin et al., 2008). This antibody has been shown to execute potent antitumoral activity in vitro and in animal models (manuscript in preparation) and is currently in clinical development. The other uses virus-like particles decorated with the extracellular loop of CLDN18.2 as vaccine to induce immune responses recognizing CLDN18.2 expressing cancer cells (Klamp et al., 2011) and is currently being evaluated for pre-clinical proof of efficacy.

According to regulatory guidelines, development of such targeted drugs involves testing of their safety in a relevant animal species. Relevant animal species are those in which the test material is pharmacologically active (ICH S6 CPMP/ICH/302/95). Minimal requirements are that animals express the target epitope and demonstrate a similar tissue distribution profile as humans. Our findings show high conservation of the CLDN18.2 sequence, functional domains and expression, with stomach as the only organ relevant for potential on-target toxicity. Together with the notion that tight junction molecules have the same functional role across species, necessary (though not sufficient) conditions for suitability of these mammals as relevant preclinical models for safety pharmacology are fulfilled.

Acknowledgements

We thank Gunda Brandenburg and Dirk Usener for their assistance.

References

Anderson, J.M., 2001. Molecular structure of tight junctions and their role in epithelial transport. News Physiol. Sci. 16, 126–130.
Angelow, S., Yu, A.S., 2009. Structure-function studies of claudin extracellular domains by cysteine-scanning mutagenesis. J. Biol. Chem. 284, 29205–29217.
Blanchette, M., Kent, W.J., Riemer, C., Elnitski, L., Smit, A.F., Roskin, K.M., Baertsch, R., Rosenbloom, K., Clawson, H., Green, E.D., Haussler, D., Miller, W., 2004. Aligning multiple genomic sequences with the threaded blockset aligner. Genome Res. 14, 708–715.
Brazma, A., Vilo, J., 2000. Gene expression data analysis. FEBS Lett. 480, 17–24.
Farquhar, M.G., Palade, G.E., 1963. Junctional complexes in various epithelia. J. Cell Biol. 17, 375–412.
Fujita, H., Chiba, H., Yokozaki, H., Sakai, N., Sugimoto, K., Wada, T., Kojima, T., Yamashita, T., Sawada, N., 2006. Differential expression and subcellular localization of claudin-7, -8, -12, -13, and -15 along the mouse intestine. J. Histochem. Cytochem. 54, 933–944.
Furuse, M., Fujita, K., Hiiragi, T., Fujimoto, K., Tsukita, S., 1998. Claudin-1 and -2: novel integral membrane proteins localizing at tight junctions with no sequence similarity to occludin. J. Cell Biol. 141, 1539–1550.
Gonzalez-Mariscal, L., Betanzos, A., Nava, P., Jaramillo, B.E., 2003. Tight junction proteins. Prog. Biophys. Mol. Biol. 81, 1–44.
Grunwald, C., Koslowski, M., Arsiray, T., Dhaene, K., Praet, M., Victor, A., Morresi-Hauf, A., Lindner, M., Passlick, B., Lehr, H.A., Schafer, S.C., Seitz, G., Huber, C., Sahin, U., Tureci, O., 2006. Expression of multiple epigenetically regulated cancer/germline genes in nonsmall cell lung cancer. Int. J. Cancer 118, 2522–2528.
Holter, N.S., Maritan, A., Cieplak, M., Fedoroff, N.V., Banavar, J.R., 2001. Dynamic modeling of gene expression data. Proc. Natl. Acad. Sci. U. S. A. 98, 1693–1698.
Kent, W.J., Sugnet, C.W., Furey, T.S., Roskin, K.M., Pringle, T.H., Zahler, A.M., Haussler, D., 2002. The human genome browser at UCSC. Genome Res. 12, 996–1006.
Klamp, T., Schumacher, J., Huber, G., Kuehne, C., Meissner, U., Selmi, A., Hiller, T., Kreiter, S., Tuereci, O., Sahin, U., 2011. Highly specific auto-antibodies against claudin-18 isoform 2 induced by a chimeric HBcAg virus-like particle vaccine kill tumor cells and inhibit growth of lung metastases. Cancer Res., accepted for publication.
Kollmar, R., Nakamura, S.K., Kappler, J.A., Hudspeth, A.J., 2001. Expression and phylogeny of claudins in vertebrate primordia. Proc. Natl. Acad. Sci. U. S. A. 98, 10196–10201.
Koslowski, M., Sahin, U., Huber, C., Tureci, O., 2006. The human X chromosome is enriched for germline genes expressed in premeiotic germ cells of both sexes. Hum. Mol. Genet. 15, 2392–2399.
Krause, G., Winkler, L., Mueller, S.L., Haseloff, R.F., Piontek, J., Blasig, I.E., 2008. Structure and function of claudins. Biochim. Biophys. Acta 1778, 631–645.
Krause, G., Winkler, L., Piehl, C., Blasig, I., Piontek, J., Muller, S.L., 2009. Structure and function of extracellular claudin domains. Ann. N.Y. Acad. Sci. 1165, 34–43.
Li, J., Rhodes, J.C., Askew, D.S., 1997. Evolutionary conservation of putative functional domains in the human homolog of the murine His-1 gene. Gene 184, 169–176.
Liao, B.Y., Zhang, J., 2006. Evolutionary conservation of expression profiles between human and mouse orthologous genes. Mol. Biol. Evol. 23, 530–540.
Mitic, L.L., Van Itallie, C.M., Anderson, J.M., 2000. Molecular physiology and pathophysiology of tight junctions I. Tight junction structure and function: lessons from mutant animals and proteins. Am. J. Physiol. Gastrointest. Liver Physiol. 279, G250–G254.
Morita, K., Furuse, M., Fujimoto, K., Tsukita, S., 1999. Claudin multigene family encoding four-transmembrane domain protein components of tight junction strands. Proc. Natl. Acad. Sci. U. S. A. 96, 511–516.
Niimi, T., Nagashima, K., Ward, J.M., Minoo, P., Zimonjic, D.B., Popescu, N.C., Kimura, S., 2001. Claudin-18, a novel downstream target gene for the T/EBP/NKX2.1 homeodomain transcription factor, encodes lung- and stomach-specific isoforms through alternative splicing. Mol. Cell. Biol. 21, 7380–7390.
Pao, S.Y., Lin, W.L., Hwang, M.J., 2006. In silico identification and comparative analysis of differentially expressed genes in human and mouse tissues. BMC Genomics 7, 86.
Rahner, C., Mitic, L.L., Anderson, J.M., 2001. Heterogeneity in expression and subcellular localization of claudins 2, 3, 4, and 5 in the rat liver, pancreas, and gut. Gastroenterology 120, 411–422.
Sahin, U., Koslowski, M., Dhaene, K., Usener, D., Brandenburg, G., Seitz, G., Huber, C., Tureci, O., 2008. Claudin-18 splice variant 2 is a pan-cancer target suitable for therapeutic antibody development. Clin. Cancer Res. 14, 7624–7634.
Siepel, A., Bejerano, G., Pedersen, J.S., Hinrichs, A.S., Hou, M., Rosenbloom, K., Clawson, H., Spieth, J., Hillier, L.W., Richards, S., Weinstock, G.M., Wilson, R.K., Gibbs, R.A., Kent, W.J., Miller, W., Haussler, D., 2005. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 15, 1034–1050.
Smetana, H., 1947. The permeability of the renal glomeruli of several mammalian species to labelled proteins. Am. J. Pathol. 23, 255–267.
Stevenson, B.R., Anderson, J.M., Bullivant, S., 1988. The epithelial tight junction: structure, function and preliminary biochemical characterization. Mol. Cell. Biochem. 83, 129–145.
Yanai, I., Graur, D., Ophir, R., 2004. Incongruent expression profiles between human and mouse orthologous genes suggest widespread neutral evolution of transcription control. OMICS 8, 15–24.