Oncogene (2003) 22, 4287–4300 © 2003 Nature Publishing Group All rights reserved 0950-9232/03 $25.00 www.nature.com/onc

Profiling, comparison and validation of gene expression in gastric carcinoma and normal stomach

Karin A Oien*,1,2, J Keith Vass3, Ian Downie2, Grant Fullarton4 and W Nicol Keith1

1Cancer Research UK Department of Medical Oncology, Cancer Research UK Beatson Laboratories, University of Glasgow, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK; 2University Department of Pathology, Glasgow Royal Infirmary, 84 Castle Street, Glasgow G4 0SF, UK; 3Beatson Institute for Cancer Research, Cancer Research UK Beatson Laboratories, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK; 4Department of Surgery, Gartnavel General Hospital, 1053 Great Western Road, Glasgow G12 0YN, UK

Gastric carcinoma is the fourth most common cause of cancer death worldwide but its molecular biology is poorly understood. We catalogued the genes expressed in two gastric adenocarcinomas and normal stomach, using serial analysis of gene expression (SAGE), and compared the profiles on-line with other glandular epithelia. Candidates were validated by Northern blotting and immunohistochemistry. A total of 29 480 transcripts, derived from 10 866 genes, were identified. In all, 1% of the genes were differentially expressed (≥fivefold difference plus P-value ≤0.01) between cancers and normal stomach. The most abundant transcripts included ribosomal and mitochondrial proteins, of which most were upregulated in the tumours, as were other widely expressed genes including transcription factors, signalling molecules (serine/threonine protein kinases), thymosin beta 10 and collagenase I. Transcripts abundant in normal stomach were functionally important, including gastrin, immunoglobulin alpha, lysozyme, MUC5, pS2 and pepsinogens, which were among 55 gastric-specific genes. Many transcripts were minimally characterized or new, some cancer-associated genes reflected their intestinal morphology, and some normal gastric genes had previously been considered as pancreatic carcinoma markers. The gastric carcinoma profiles resembled other tumours', supporting the existence of common cancer-associated targets. These data provide a catalogue from which to develop markers for better diagnosis and therapy of gastric carcinoma.
Oncogene (2003) 22, 4287–4300. doi:10.1038/sj.onc.1206615

Keywords: stomach; stomach neoplasms; adenocarcinoma; gene expression profiling; serial analysis of gene expression (SAGE)

*Correspondence: KA Oien, Cancer Research UK Department of Medical Oncology, Cancer Research UK Beatson Laboratories, University of Glasgow, Garscube Estate, Switchback Road, Glasgow G61 1BD, UK; E-mail: [email protected]
Received 13 August 2002; revised 21 March 2003; accepted 24 March 2003

Introduction

Gastric carcinoma is a common tumour with a high mortality (Fuchs and Mayer, 1995; Blok et al., 1997). Worldwide, it is the fourth most common cause of cancer death (Parkin, 2001). It ranks fifth in Scotland, where the overall 5-year survival for gastric carcinoma is only around 12%, which compares poorly with 44% for colonic carcinoma (Harris et al., 1998). The reason is that most patients with gastric carcinoma in the Western world develop symptoms and are diagnosed when disease is locally advanced, often with metastases (Roukos, 2000). Curative surgery is then no longer possible, and effective adjuvant therapies are not yet well established. Much better survival figures of over 80% can be achieved, however, when the tumour is identified and treated at an earlier clinical stage (Roukos, 2000).

Over 90% of cancers in the stomach are carcinomas (specifically, adenocarcinomas), but these are heterogeneous (Fuchs and Mayer, 1995; Blok et al., 1997). Many classifications exist: the two most common subdivide tumours by anatomical location (Owen, 1986), into distal and proximal, and by histopathology, into intestinal and diffuse cancers (Laurén, 1965). These tumour subtypes differ in their epidemiology and aetiology, as well as in their molecular biology. Most of the classic cancer-associated molecular alterations have been found in gastric carcinomas (Hanahan and Weinberg, 2000), and some, such as loss of functional p53 and telomerase activation, are common to all tumour subtypes (Blok et al., 1997; Chan et al., 1999). Downregulation of cell adhesion molecules, such as cadherins, however, is associated mainly with diffuse cancers; E-cadherin mutations were recently the first germline changes to be identified as predisposing to (diffuse) gastric carcinoma (Guilford et al., 1998). Conversely, such abnormalities are uncommon in tumours of intestinal type, which in molecular terms resemble colonic carcinomas, frequently containing adenomatous polyposis coli (APC) mutations (Chan et al., 1999).

Such classical molecular studies of gastric carcinoma have been extremely fruitful, but have of necessity focused on small numbers of genes. Changes in the global pattern of gene transcription may be equally significant, and technological advances now permit their investigation (Carulli et al., 1998). Gastric carcinoma has been studied using differential display, Clontech cDNA membrane arrays and cDNA microarrays (Ebert et al., 2000; Jung et al., 2000; Yoshikawa et al., 2000; El-Rifai et al., 2001; Wang et al., 2001; Hippo et al., 2002; Mori et al., 2002). However, we chose to use serial analysis of gene expression (SAGE), which produces comprehensive, quantitative and reproducible gene expression profiles (Velculescu et al., 2000).

SAGE is based on generating clones of concatenated (linked) short sequence tags derived from mRNA from the target tissue (Velculescu et al., 1995; Zhang et al., 1997). Each tag is 9 or 10 bp long and represents a single mRNA transcript; each clone insert contains up to 40 tags joined together. Sequencing of multiple concatenates therefore efficiently describes the pattern and numbers of genes expressed. The mRNA transcript corresponding to the short SAGE tag is identified via genetic databases. As SAGE is labour-intensive and hence limited to small numbers of specimens, resulting candidate genes are usually validated in a larger sample set. SAGE has already been applied to the other common adenocarcinomas, of colon, , breast, and prostate, and to squamous lung carcinoma and other cancers (Velculescu et al., 1995; Zhang et al., 1997; Hibi et al., 1998; Lal et al., 1999; Nacht et al., 1999; Velculescu et al., 1999; Hough et al., 2000; Waghray et al., 2001) but, to our knowledge, no SAGE data for gastric carcinoma have yet been published in the literature.

Since the most common subtype of gastric carcinoma is distal and intestinal (Fuchs and Mayer, 1995), we have generated SAGE profiles of two such tumours, along with one sample of normal gastric antral (distal) mucosa. From the resulting catalogues of gene expression, we have highlighted abundant, differentially expressed and gastric-specific transcripts and have validated selected genes in a wider panel of normal and tumour gastrointestinal tissues.

Results

Generation of gastric SAGE tag libraries

SAGE libraries were created from two gastric adenocarcinomas, both of distal, intestinal subtype, and from normal gastric antral mucosa. After excluding inappropriate tags (resulting from duplicate ditags or linker sequences), we obtained a total of 29 480 tags, or transcripts, derived from 10 866 different genes (unique tags). The libraries from tumour 1, tumour 2 and normal stomach contained 10 222, 10 825 and 8433 tags, respectively (Table 1). To permit direct comparison, each library was normalized to a total of 10 000 tags (at which level, a tag present 10 times has an abundance of 0.1%). The normalized, rather than absolute, tag counts are used henceforth.

Global analysis of gastric SAGE libraries

Table 1 presents a global analysis of the three SAGE libraries. The most common tag was CTCCCCCAA, which was found 781 times (7.8%) in normal stomach and represents immunoglobulin alpha (IgA). Only around seven genes were expressed at levels of 1% or over in each library. In contrast, the vast majority (97–98%) of transcripts, including the classical housekeeping genes, were present at levels in single figures (below 0.1%). The global figures for each library were similar.

Global comparison of gastric carcinoma and normal SAGE libraries

Differential expression between libraries was defined as a difference in tag ratio of fivefold or more, combined with a P-value of 0.01 or less. When tags from the two tumours were pooled and compared with normal stomach, 106 tags were differentially expressed (0.97% of the total genes in the three libraries), of which 47 were higher in the tumours and 59 were higher in normal stomach. These genes are listed in Table 2. When compared individually with normal stomach, 76 tags (1.09%) and 125 tags (1.53%) were differentially expressed in tumours 1 and 2, respectively, of which 85 tags were upregulated in either tumour. The two tumours themselves differed by 64 tags (0.75%). These results are in keeping with their histopathology: tumour 1 was better differentiated and of an earlier stage.

Global comparison of gastric libraries digitally with SAGE libraries from other glandular epithelial tissues and mesothelium

The normal gastric SAGE library was compared digitally with normal breast, colonic, ovarian, pancreatic and prostatic tissue and with normal mesothelium. In total, 55 genes (0.65% of the total genes in the normal

Table 1 Summary of generation and overall analysis of SAGE tag libraries

Library         Sequences  Tags   Tags at ≥1%  Tags at ≥0.5%  Tags at ≥0.1%  ≥ Two tags  Unique tags (a)
Tumour 1 (T1)   825        10222  7            21             124            1008        4284
Tumour 2 (T2)   690        10825  6            16             115            1008        5350
Normal (N)      987        8433   7            19             107            822         3671

(a) Here we have simply given the number of unique tags obtained. Owing to inevitable sequencing errors, this is usually regarded as a slight overestimate of the true number of different genes expressed and a correction which removes around 7% of tags is sometimes applied (Velculescu et al., 1995; Zhang et al., 1997; Velculescu et al., 2000).
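The normalization described in the Results (each library scaled to 10 000 tags, so that a normalized count of 10 corresponds to 0.1%) and the roughly 7% correction mentioned in the Table 1 footnote can be sketched as follows. This is a minimal illustration only: the function names and the raw counts in `hypothetical_raw` are assumptions made for the example and do not come from the paper; only the library totals are taken from Table 1.

```python
NORMALISED_TOTAL = 10_000  # library size used for comparison in the text

# Total tags per library, from Table 1.
LIBRARY_TOTALS = {"T1": 10222, "T2": 10825, "N": 8433}


def normalise(raw_count, library_total, target=NORMALISED_TOTAL):
    """Scale a raw tag count to a library of `target` tags."""
    return raw_count * target / library_total


def abundance_percent(normalised_count, target=NORMALISED_TOTAL):
    """Express a normalized tag count as a percentage of the library."""
    return 100.0 * normalised_count / target


def corrected_unique_tags(unique_tags, error_fraction=0.07):
    """Apply the approximate sequencing-error correction from the footnote."""
    return unique_tags * (1.0 - error_fraction)


# Hypothetical raw counts for a single tag (illustrative values only).
hypothetical_raw = {"T1": 12, "T2": 0, "N": 84}
normalised = {
    lib: normalise(count, LIBRARY_TOTALS[lib])
    for lib, count in hypothetical_raw.items()
}
for lib, value in normalised.items():
    print(lib, round(value, 1), f"{abundance_percent(value):.2f}%")
```

With these definitions, a normalized count of 781 (the IgA tag in normal stomach) corresponds to an abundance of about 7.8%, in line with the figure quoted in the text.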

Table 2 Tags which were highly expressed, differentially expressed between gastric tumours and normal tissue, specifically expressed in the stomach or otherwise of interest


(a) Where a tag is assigned to a gene, the tag clearly matches a well-characterized mRNA transcript. * indicates that gene expression has been investigated by Northern blotting or immunohistochemistry.
(b) The two tumour samples and the normal sample are represented by T1, T2 and N, respectively. The absolute tag counts are normalized to 10 000 total tags per sample to permit direct comparison. Genes expressed highly, differentially or (tissue-)specifically in the tumours are shaded in mid grey, whereas the equivalents in normal stomach are shaded in light grey. Genes which are highly or (tissue-)specifically expressed in both tumour and normal libraries but which are not differentially expressed are shaded in black.
(c) The 20 most abundant ('high') tags in each library are listed. Ts indicates both tumours.
(d) Differential expression between the tumours and normal stomach was defined as a difference of fivefold or more combined with a P-value of 0.01 or less. 'Pool' indicates that the genes were differentially expressed in a comparison using the pooled tumours. 'T either' indicates that the tag was differentially expressed only in an individual comparison between one or other tumour and normal gastric antrum.
(e) Tissue-specific gene expression was defined as in note d.
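The two-part criterion in note d (a difference of fivefold or more combined with a P-value of 0.01 or less) can be illustrated with a short sketch. The statistical test behind the paper's P-values is not stated in this section, so a two-sided Fisher exact test is used here purely as a stand-in. The substitution of 0.5 for a zero count when computing the ratio is likewise an assumption, as are all function names and the example counts.

```python
from math import comb

FOLD_THRESHOLD = 5.0    # "fivefold or more" (note d)
P_THRESHOLD = 0.01      # "P-value of 0.01 or less" (note d)
ZERO_SUBSTITUTE = 0.5   # assumption: avoids division by zero in the ratio


def fold_difference(count_a, count_b):
    """Ratio of the larger to the smaller normalized tag count."""
    high, low = max(count_a, count_b), min(count_a, count_b)
    return high / max(low, ZERO_SUBSTITUTE)


def fisher_two_sided(count_a, count_b, total_a=10_000, total_b=10_000):
    """Two-sided Fisher exact P-value for integer tag counts in two libraries.

    Stand-in test only; not necessarily the method used in the paper.
    """
    k = count_a + count_b
    denominator = comb(total_a + total_b, k)

    def pmf(x):
        # Hypergeometric probability of x of the k tags falling in library A.
        return comb(total_a, x) * comb(total_b, k - x) / denominator

    p_observed = pmf(count_a)
    lowest, highest = max(0, k - total_b), min(k, total_a)
    p_value = sum(
        p for p in (pmf(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def is_differential(count_a, count_b):
    """Apply both thresholds from note d to a pair of tag counts."""
    return (
        fold_difference(count_a, count_b) >= FOLD_THRESHOLD
        and fisher_two_sided(count_a, count_b) <= P_THRESHOLD
    )


# Hypothetical normalized counts (tumour, normal), for illustration only.
for pair in [(50, 5), (10, 8), (0, 3)]:
    print(pair, is_differential(*pair))
```

Requiring both conditions matters for rare tags: in the last hypothetical pair the ratio exceeds fivefold once zero is replaced by 0.5, but three tags are too few to reach the P-value threshold, so the tag is not called differential.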

gastric library) were specifically overexpressed in normal stomach. These genes are also listed in Table 2. The pooled gastric adenocarcinomas were compared with SAGE tumour libraries of breast ductal carcinoma, ovarian serous adenocarcinoma, colonic, pancreatic and prostatic adenocarcinomas, and malignant mesothelioma of epithelial type. A total of 20 genes (0.23% of the total genes in the pooled gastric tumour libraries) were specifically overexpressed in gastric carcinoma. Eight gastric-specific genes were common to both tumour and normal comparisons. Of the 47 tags more highly expressed in the pooled tumours than in normal stomach, only five were also gastric-specific. In contrast, of the 59 genes more highly expressed in normal stomach compared with the pooled tumours, 33 were also gastric-specific.

Investigation of individual SAGE tags

Table 2 contains the tags differentially expressed between the gastric carcinomas and normal stomach, plus the gastric-specific tags. The 20 most abundant tags in each library are also listed and included many ribosomal and mitochondrial proteins, of which many were more highly expressed in the tumours. Most other transcripts upregulated in the tumours are expressed in many cell types and will henceforth be termed 'widely expressed'. These included growth factors, signal transduction molecules, transcription factors, and genes involved in protein turnover and cellular invasion. Conversely, most cytoskeletal proteins were downregulated in the tumours.

Most genes that were more highly expressed in normal stomach than in gastric carcinoma play a role in normal gastric function (Owen, 1986), and many were gastric-specific compared with other epithelial libraries. The SAGE profiles were thus self-validating. Such genes included the gut hormone gastrin, the antibacterial lysozyme, and gastric mucin (MUC5) and trefoil factors pS2 and spasmolytic polypeptide that protect glandular epithelia. Some genes were only recently characterized: lipocalin 2, prostate stem cell antigen (PSCA), and a new gene designated CA11. Others had not been reported in the stomach, such as aquaporin 5. Many differentially or specifically expressed tags corresponded only to uncharacterized cDNAs (expressed sequence tags or ESTs) or entirely lacked a match in the genetic databases.

A few genes, such as the intestinal trefoil factor, were upregulated in gastric carcinoma, but were not widely expressed; instead, their normal location was the intestine.

Northern blotting

To validate and expand the SAGE profiles, we studied selected transcripts in a wider panel of 19 gastrointestinal tumour and normal tissues and cell lines. Where the genes had been previously minimally characterized, Northern blotting for mRNA was used and broadly corroborated the SAGE profiles (Figure 1).

Gastrin was expressed highly and only in the normal gastric antrum. The new gene CA11 was expressed highly and only in normal stomach, in all areas. Prostate stem cell antigen (PSCA) was indeed present in normal gastric mucosa and in four of the eight gastric tumours. Lipocalin 2 (neutrophil gelatinase-associated lipocalin) was present in normal antrum and the two SAGE tumours, although its relative levels differed slightly from those expected.

As predicted, intestinal trefoil factor was identified in seven of the eight gastric tumours, and in normal colon and its adenocarcinoma. Of the widely expressed genes, thymosin beta 10 was indeed upregulated in the gastric tumours and cell lines, but was also abundant in the tumour and normal oesophagus and colon. Id1 was highly expressed in the two SAGE tumours, as predicted, but was present at similar levels in normal stomach, and was in fact lower in the other gastric tumours and, perhaps unexpectedly, absent from the three cell lines. Prothymosin alpha tended to be more highly expressed in the tumours than in the normal samples but its relative levels in the SAGE samples were rather low. Overall, Northern blotting showed significant differences in expression levels between tumours, even of the same subtype.

Heat-shock protein 90 alpha (Hsp90a) was included because of a tag, ACGCAGGGA, which was abundant in the tumours but low in normal stomach. SAGEmap on-line suggested a match to the Unigene cluster of Hsp90a. However, by Northern blotting, Hsp90a was low and not differentially expressed (Figure 1). In fact, one of the three gene sequences in this cluster was a chimaera composed partly of Hsp90a and partly of an unknown gene. The tag of interest corresponded only to the latter, which remains uncharacterized; in this instance, the problem with validation arose from the Unigene databases rather than the SAGE profiles.

Immunohistochemistry

Better characterized genes for which antibodies were available were validated by IHC, which provides localization and identifies the gene's protein product rather than its mRNA transcript. Nevertheless, IHC corroborated the SAGE profiles. Figure 2 shows the three SAGE samples stained for gastrin, lysozyme, pS2, spasmolytic polypeptide and MUC5. Table 3 lists the IHC results for the full gastrointestinal panel.

As predicted, lysozyme, pS2, MUC5 and spasmolytic polypeptide were highly expressed in normal gastric mucosa. Lysozyme, pS2 and MUC5 were present in all of the distal gastric carcinomas, of both histological subtypes, although their levels varied and were generally lower than in normal tissue. In the three SAGE samples, the relative IHC staining paralleled the mRNA profiles, with one exception: SAGE had predicted lysozyme to be absent from tumour 2, but although some areas were indeed negative by IHC (Figure 2), focal positive staining was present elsewhere.

Spasmolytic polypeptide, which was absent or very low in the SAGE tumours, was present in only two

tumours in the wider panel, neither of distal, intestinal type. As predicted, gastrin was expressed only in normal antrum, in the G-cells (Figure 2). All samples contained keratin 8, as expected for simple glandular epithelia, except for normal oesophagus and oesophageal squamous carcinoma, which comprise stratified squamous epithelium. Overall, IHC staining in the tumours, but not in the normal mucosa, was heterogeneous, both within and between cancers, even those of the same subtype.

Discussion

These are the first global profiles of gene expression in gastric carcinoma and normal stomach created using SAGE. We have generated two libraries of gastric adenocarcinoma of distal, intestinal type, and one library of normal gastric antral mucosa. Numerous transcripts have been identified which are: highly expressed, differentially expressed between normal and tumour stomach, or gastric-specific by comparison with normal and tumour breast, colon, ovary, pancreas, prostate and mesothelium. Selected genes have been validated in a wider panel of 19 gastrointestinal tissues by Northern blotting and immunohistochemistry.

Our data are internally consistent: the validation studies corroborated the gastric SAGE profiles for 13 out of 15 transcripts. This correlation is at least as good as previous SAGE papers in which between 7 and 50% of differentially expressed tags were not corroborated (Zhang et al., 1997; Hibi et al., 1998; de Waard et al., 1999; Waghray et al., 2001). Moreover, although mRNA abundance can be a poor indicator of protein levels (Pradet-Balade et al., 2001), we found that, for a given gene, the relative level of mRNA in different samples correlated well with the relative protein level as gauged by immunohistochemical staining (Figure 2). However, while gene expression in normal tissue from one site was uniform, it varied significantly within and between tumours, even of the same type, at both mRNA and protein levels (Figures 1 and 2 and Tables 2 and 3); such tumour heterogeneity is a well-recognized phenomenon (Zhang et al., 1997; Hibi et al., 1998; Lal et al., 1999; Perou et al., 2000).

Each of our three SAGE gene expression libraries yielded around 10 000 tags, or transcripts, derived from an average of 4300 genes. These numbers are slightly smaller than those in other recent SAGE publications, with upwards of 20 000 tags (Zhang et al., 1997; Hibi et al., 1998; Nacht et al., 1999; Velculescu et al., 1999; Hough et al., 2000; Waghray et al., 2001). However, the global distribution of genes in our gastric libraries is

Figure 1 Northern blotting for selected genes identified by SAGE. RNA was isolated from 19 gastrointestinal tumour and normal tissues and cell lines, as indicated along the top row, of which further details are listed in Table 3


Figure 2 IHC for selected genes identified by SAGE. A range of 19 gastrointestinal tumour and normal tissues and cell lines was stained, as detailed in Table 3, but these photomicrographs depict representative areas from the two distal, intestinal tumours (T1 and T2) and normal stomach (antrum) (N) which were subjected to SAGE. The magnification is the same throughout. The number in the bottom left-hand corner is the normalized SAGE tag count for comparison. Spasm polypep indicates human spasmolytic polypeptide (TFF2).

similar to these previous SAGE profiles, as are the underlying library creation statistics, with only around seven transcripts expressed at levels of 1% or more, and over 97% present at levels of 0.1% or less.

Only 1% of the transcripts were differentially expressed between the pooled gastric carcinoma libraries and normal stomach. This figure is similar to previous SAGE papers comparing other adenocarcinomas with their normal tissues (Zhang et al., 1997; Hibi et al., 1998; Nacht et al., 1999; Velculescu et al., 1999; Hough et al., 2000; Waghray et al., 2001). When compared individually with normal stomach, the moderately differentiated gastric carcinoma yielded more differentially expressed tags (T2: 125 tags) than the well-differentiated tumour (T1: 76 tags), so the molecular pathology agrees with the morphology. The 99% of transcripts which were therefore expressed at similar levels in normal and tumour tissues would presumably be unlikely to make good diagnostic or therapeutic targets, unless of course the resulting protein differed in its abundance, stability or functional state.

In total, 55 gastric-specific genes (0.65%) emerged from the comparison of normal stomach with other normal glandular epithelia (breast, colon, ovary, pancreas and prostate) and mesothelium. In Velculescu et al's (1999) paper on the 'human transcriptome', estimates of the proportion of tissue-specific genes in normal samples varied from 0.09% (keratinocytes) to 1.76% (colon). Our figure is in keeping with this range.

Genes specifically overexpressed in gastric carcinoma compared with other adenocarcinomas and malignant mesothelioma were fewer in number: 20 (0.24%). Comparable data for tissue-specific genes in tumours are difficult to find. However, extrapolation from a study using Digital Differential Display, one of NCBI's data-mining tools, comparing adenocarcinomas of breast, colon, lung, ovary, pancreas and prostate, suggests that the number of tissue-specific genes in each tumour type is around 16 (Scheurle et al., 2000), again consistent with our figure. That there were fewer tissue-specific genes in gastric carcinoma than in normal stomach is in keeping with the tumours reverting to a less specialized, that is, less differentiated state.

The most abundant transcripts in all three gastric SAGE libraries included ribosomal and mitochondrial genes, and these were generally upregulated in the tumours. This agrees with the previous global expression profiles and the comparisons of cancers and the corresponding normal tissues in general (Zhang et al., 1997; Hibi et al., 1998; de Waard et al., 1999; Velculescu et al., 1999; Perou et al., 2000), and of gastric carcinoma and normal stomach in particular (Salesiotis et al., 1995; Ebert et al., 2000; Jung et al., 2000; Yoshikawa et al., 2000; El-Rifai et al., 2001; Wang et al., 2001; Hippo et al., 2002; Mori et al., 2002). Again, this high requirement for energy generation and protein production makes biological sense.
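The headline proportions can be reproduced with simple arithmetic from the counts reported in the Results: 47 and 59 differentially expressed tags out of 10 866 unique tags, and the numbers of gastric-specific tags in each group (5 and 33). The sketch below is illustrative only; the variable and function names are assumptions for the example.

```python
def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


UNIQUE_TAGS_ALL_LIBRARIES = 10_866   # unique tags across the three libraries
UP_IN_TUMOURS, UP_IN_NORMAL = 47, 59  # pooled tumours versus normal stomach

differential = UP_IN_TUMOURS + UP_IN_NORMAL
share_differential = percent(differential, UNIQUE_TAGS_ALL_LIBRARIES)

# Gastric-specific tags within each differentially expressed group.
specific_among_tumour_up = percent(5, UP_IN_TUMOURS)
specific_among_normal_up = percent(33, UP_IN_NORMAL)

print(f"differential tags: {differential} ({share_differential:.2f}% of unique tags)")
print(f"gastric-specific among tumour-upregulated tags: {specific_among_tumour_up:.1f}%")
print(f"gastric-specific among normal-upregulated tags: {specific_among_normal_up:.1f}%")
```

The first figure comes out at just under 1%, matching the statement that about 99% of transcripts were expressed at similar levels. The contrast between the last two percentages (roughly one in ten versus more than half) is the numerical basis for the argument that the tumours are less differentiated than normal stomach.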

Most other transcripts upregulated in the tumours were 'widely expressed' and many had classical proto-oncogenic functions. Only five of the 47 tags upregulated in the pooled tumours were gastric-specific, compared with 33 of 59 tags more highly expressed in normal stomach, again in keeping with cellular de-differentiation during malignant transformation. Of the transcripts further investigated, Id1 was expressed in the gastric carcinomas, and at lower levels in normal stomach. Id1 is a member of the ID (inhibitor of differentiation) family of proteins, which inhibit basic helix–loop–helix transcription factors; high Id1 levels correlate with a more aggressive phenotype in breast carcinoma (Norton, 2000). Id1's upregulation in the SAGE tumour samples is thus predictable, but the similar levels in normal stomach, lower levels in the other gastric carcinomas and absence from the cell lines are surprising. Also upregulated in the tumours were thymosin beta 10 and prothymosin alpha. Thymosins are small proteins that were originally isolated from the thymus (Huff et al., 2001). Prothymosin alpha is now thought to play a role in cell proliferation, and its upregulation in gastric cancer has been previously reported (Mitani et al., 2000). Beta thymosins bind monomeric (globular) actin and thymosin beta 10 is known to be overexpressed in carcinomas compared with normal tissues, although the stomach has not previously been studied (Huff et al., 2001). Thymosins represent interesting candidate targets, against which drugs are already in development (Huff et al., 2001). The well-recognized phenomenon of tumour heterogeneity thus leads to far more variable levels of expression between samples for tumour-associated genes than for those transcripts more abundant in normal tissue.

Other genes upregulated in the tumours also have well-recognized roles in carcinogenesis, including growth factors (hepatocyte growth factor); their receptors (fibroblast growth factor receptor); and signal transducers including guanylate kinase and putative serine/threonine protein kinases. The last are interesting since they and the related tyrosine kinases are often overexpressed in cancers, may be relatively tissue-specific, and have proven to make excellent therapeutic targets, for drugs such as Glivec (STI571) (Blume-Jensen and Hunter, 2001). Genes involved in invasion and metastasis, such as collagenase I, were similarly overexpressed. Our gene expression profiles are consistent with the general cancer literature (Hanahan and Weinberg, 2000), and with the cDNA array studies of gastric tumours, in which upregulated transcripts include growth factors and genes involved in the cell cycle, adhesion and invasion (El-Rifai et al., 2001; Hippo et al., 2002; Mori et al., 2002). This supports the existence of common molecular targets in cancers for diagnosis and therapy.

Unlike other widely expressed genes, most cytoskeletal proteins were highly expressed in normal stomach and downregulated in the tumours, including keratin 8, an intermediate filament specific to simple glandular epithelia; desmoplakin, a major component of the desmosome, the intercellular junction; and alpha actinin, profilin, cofilin and gelsolin, which contribute to the microfilament cytoskeleton (Janke et al., 2000). This altered expression pattern is shared with other carcinomas, and indeed functional assays suggest that many cytoskeletal proteins act as tumour suppressor genes (Janke et al., 2000).

Most genes that were more highly expressed in normal stomach play a role in gastric function, and many were also gastric-specific. The stomach acts as a reservoir for food and mechanically churns and mixes it with gastric juice, which contains hydrochloric acid (a sterilizing agent that denatures proteins and activates digestive enzymes), the protease pepsin, intrinsic factor, and gastric lipase (Owen, 1986). There is early absorption of water, ions, short-chain fatty acids and alcohol. The mucosal lining must be protected from proteolytic and acid attack, and the activity of the stomach must be coordinated with the rest of the gut through neurohormonal mechanisms. One such gut hormone is gastrin, which we found to be gastric-specific and present only in normal antral mucosa (Figures 1 and 2). This is as expected: gastrin is secreted by antral G cells in response to food entering the stomach and stimulates the gastric body mucosa to secrete acid and pepsinogens (Owen, 1986); the latter were also gastric-specific. Gastrin is only occasionally identifiable in gastric adenocarcinomas although it is a major product of gastric carcinoid (neuroendocrine) tumours (Berner and Nesland, 1991).

Defence against microorganisms is an important role of the gut (Owen, 1986). Immunoglobulins, especially IgA splice variants, and the antibacterial proteins lysozyme and lipocalin 2 were abundant in normal stomach (the SAGE sample had a chronic gastritis) and were mostly downregulated in the tumours, and gastric-specific. IgA and lysozyme are well characterized and known to be highly expressed in gastritis and in carcinomas of intestinal type, especially when well differentiated as in tumour 1 (Isaacson, 1982). Lipocalins bind small lipophilic molecules, including bacterial-derived lipopolysaccharides; lipocalin 2 has been reported in normal glandular epithelia but not, until now, in cancers (Friedl et al., 1999).

The mucin MUC5 and trefoil factors pS2 and human spasmolytic polypeptide were similarly highly expressed in normal stomach, downregulated in the tumours, especially in the less well-differentiated Tumour 2, and gastric-specific. However, although pS2 emerges as specific to stomach compared with other normal tissues, it is also overexpressed in breast and other adenocarcinomas (Wong et al., 1999). Mucins are high molecular weight glycoproteins, and trefoil factors (TFFs) are small peptides resistant to acid, enzymes and heat (Wong et al., 1999; Machado et al., 2000). Both are normally coexpressed in a site-specific manner in glandular epithelia (Figure 2), where they act synergistically to protect and repair the mucosal surface. In a previous study, 60% of 96 gastric carcinomas expressed pS2 and MUC5, but only 10% contained spasmolytic polypeptide (Machado et al., 2000), in keeping with our findings. Moreover, in a gene knockout model, mice lacking pS2 developed hyperplasia, dysplasia and

Table 3 Immunohistochemistry for selected genes in a range of gastrointestinal tissues

Tissue or cell sample (a)                            MUC5 (b)  pS2   Spasm polypep (c)  Lysozyme
Gastric tumour 1 distal, intestinal                  +         +++   −                  +++
Gastric tumour 2 distal, intestinal                  ++        +/−   −                  ++
Gastric tumour 3 distal, intestinal                  +/−       +     −                  +
Gastric tumour 4 distal, diffuse                     ++        +     −                  ++
Gastric tumour 5 distal, diffuse                     ++        +     +++                +++
Gastric tumour 6 proximal, intestinal and solid      +         +     +                  +
Gastric tumour 7 proximal, diffuse and intestinal    −         +     −                  −
Gastric tumour 8 proximal, mixed                     −         −     −                  +/−
AGS gastric adherent cells                           +         ++    −                  +++
KATO-III gastric spherical cells                     −         +/−   −                  ++
MKN-45 gastric spherical cells                       −         +     −                  ++
OE-19 oesophageal adherent cells                     −         +     −                  ++
Stomach normal antrum                                +++       +++   +++                +++
Stomach normal body                                  +++       +++   +++                ++
Stomach normal cardia                                +++       ++    +++                +++
Oesophagus normal squamous mucosa                    −         −     −                  +
Oesophagus squamous carcinoma                        −         −     −                  −
Small intestine normal                               −         −     −                  ++
Colon normal mucosa                                  −         +/−   −                  −
Colon adenocarcinoma                                 +         +++   −                  ++

(a) The gastric tumours are described in detail in Table 4. All four cell lines are derived from adenocarcinomas.
(b) The + and − symbols are used to indicate the intensity of the staining, where − means no staining, +/− means focal weak staining and +++ means the strongest positive staining. Figure 2 shows immunohistochemical staining in the tissues used for SAGE.
(c) Spasm polypep indicates human spasmolytic polypeptide (TFF2).

carcinoma of the stomach, which suggests that pS2 may act as a gastric-specific tumour suppressor gene (Lefebvre et al., 1996). It is possible that some of the less well-characterized genes identified in this study may play a similar role.

PSCA was expressed in normal stomach and downregulated in gastric carcinomas. This agrees with a recent paper characterizing PSCA, which was first identified through its overexpression in prostate cancer (Bahrenberg et al., 2000). (PSCA is different from prostate-specific antigen (PSA).) Its function is unknown, but a role in cell adhesion has been proposed (Bahrenberg et al., 2000).

Another major function of the gut is fluid and ion transport. Aquaporin 5 was more highly expressed in normal antrum and Tumour 1; it has not previously been identified in the stomach but is found in the salivary glands, and is one of a family of integral membrane proteins which act as water channels (Ma and Verkman, 1999). Proteolipid protein 2 (colonic epithelium-enriched A4 protein) had a similar gastric expression pattern; it shows features of an ion channel (Breitwieser et al., 1997), and although not previously described in the stomach or in humans, proteolipid protein 2 is abundant in gastric EST libraries in Unigene (Wheeler et al., 2001).

Most transcripts more highly expressed in normal stomach (Table 2) were thus involved in gastric function. In some cases, the genes were well characterized, and for others their role could be inferred from the existing data on gastrointestinal physiology. However, we also identified known genes with unexpected expression patterns, as well as many new genes. For example, one tag was abundant in normal stomach, absent from the tumours, and gastric-specific, yet initially it matched only ESTs. These were used to create a cDNA probe for Northern blotting which confirmed the SAGE results. Simultaneously, the gene was identified by differential display by a Japanese group who designated it CA11 and reported the same expression pattern (Yoshikawa et al., 2000). CA11 is one of the most common transcripts in normal stomach, which suggests that it is functionally important, and demonstrates the power of mRNA profiling in the identification of novel genes.

One gene that was expressed in the gastric carcinomas, and absent from normal stomach, was thought until recently to be specific to ovarian and mesothelial tissues. Mesothelin is a glycoprotein which may function in cell adhesion (Chang and Pastan, 1996). It is present in normal mesothelium and malignant mesothelioma (Chang and Pastan, 1996), and is also highly expressed in ovarian carcinomas compared with nontransformed ovarian epithelium and with other adenocarcinomas (Hough et al., 2000; Scheurle et al., 2000). Mesothelin has thus been proposed as an ovarian/mesothelial-specific marker, but this is contradicted by our study and others: mesothelin has recently been identified in gastric cancer tissues and cell lines (El-Rifai et al., 2001; Hippo et al., 2001); in the latter, its expression is associated with peritoneal metastasis (Hippo et al., 2001). This is not surprising given mesothelin's normal location, and renders it a candidate therapeutic target.

Some genes expressed in the gastric carcinomas and absent from normal stomach reflected the histological 'intestinal' tumour phenotype (Laurén, 1965). This expression pattern was shown by intestinal TFF, the third TFF, which is normally found in the small intestine and colon (Wong et al., 1999); sulphotransferase 1A1, which detoxifies xenobiotics and endogenous compounds (Harris et al., 2000); and butyrate response factor 2, which is normally found in the colon where the short-chain fatty acid butyrate is produced by bacterial

Table 4 Clinicopathological details of the gastric adenocarcinoma samples

Tumour number  Tumour site                      Anatomical subtype  Histological subtype            Histological differentiation  Tumour stage (a)
1              Body                             Distal              Intestinal                      Well                          T1 N0
2              Antrum                           Distal              Intestinal                      Moderate                      T3 N0
3              Antrum                           Distal              Intestinal                      Moderate                      T3 N1
4              Antrum and body                  Distal              Diffuse                         Poor                          T1 N0
5              Antrum                           Distal              Diffuse                         Poor                          T3 N2
6              Gastroesophageal junction (GEJ)  Proximal            Intestinal and solid            Poor                          T3 N2
7              GEJ                              Proximal            Diffuse and intestinal          Poor                          T3 N1
8              GEJ                              Proximal            Intestinal, solid and basaloid  Poor                          T2 N0

a Here T and N refer to classical tumour staging (local tumour spread and presence of lymph node metastases) (Fuchs and Mayer, 1995), but throughout the rest of the paper, T and N indicate the SAGE tumour and normal samples.

fermentation and acts as the main mucosal energy source (Gibson et al., 1999). Once again, the molecular pathology and morphology of the tumours correlate.

Some transcripts that were highly expressed in normal gastric antral mucosa, downregulated in gastric carcinoma, and tissue-specific by comparison with other normal glandular epithelia, turned out to be abundant also in pancreatic adenocarcinoma. Spasmolytic polypeptide, lipocalin 2 and PSCA were highlighted in a recent SAGE study as being upregulated in pancreatic adenocarcinoma by comparison with normal pancreatic ductal cells (Argani et al., 2001), an expression pattern shared with MUC5 and MUC1 (Scheurle et al., 2000), such that in both papers these genes were proposed as new markers of pancreatic cancer. In fact, pancreatic carcinomas often exhibit a gastric phenotype (Sessa et al., 1990). The mechanisms underlying such metaplasias are unknown but may involve aberrant expression of the developmental homeobox genes (Beck et al., 2000). In any case, these phenomena emphasize the need to examine a range of tissues when evaluating potential markers, preferably including techniques which enable target localization.

In conclusion, these are the first global profiles of gene expression created, using SAGE, for the stomach. The molecular pathology correlated with the morphology. The profiles of gastric carcinoma resembled those of other tumours, which supports the existence of common cancer-associated molecular targets. Gene expression in normal stomach differed from other normal glandular tissues but agreed with existing literature. Many novel transcripts have been identified. These molecular portraits increase our knowledge about the genes involved in normal gastric function and in malignant change in the stomach, and provide a catalogue of candidates from which to develop markers for better diagnosis and therapy of gastric carcinoma.

Materials and methods

Primary tissues and cell lines

Primary tissues were dissected by a histopathologist (KAO) from resection specimens, in accordance with contemporary ethical practice, then samples were frozen for later RNA extraction or fixed and processed for diagnostic histopathology. Primary tumours and normal mucosa were collected from oesophagus, stomach and colon: the gastric adenocarcinomas are detailed in Table 4. Gastric and oesophageal cancer cell lines (AGS, KATO-III and OE19 from European Collection of Cell Cultures, Wiltshire, UK; MKN-45 from Deutsche Sammlung von Mikroorganismen und Zellkulturen, Braunschweig, Germany) were grown according to the instructions provided.

RNA isolation and SAGE

Total RNA was isolated using TRIzol Reagent (Invitrogen, Paisley, UK). For SAGE, mRNA was purified using the Poly(A)Pure kit (Ambion Inc, TX, USA). The SAGE samples were two gastric adenocarcinomas, both distal and intestinal but from different patients; and normal gastric antral mucosa from a patient without gastric cancer. SAGE was performed as described previously (Velculescu et al., 1995; Zhang et al., 1997). Tag sequences were analysed, using SAGE software (version 304) and Genbank databases, to produce a catalogue of expressed genes (the SAGE protocol and software may be obtained via http://www.sagenet.org/sage_protocol.htm). Tags and matching transcripts were further investigated using the National Center for Biotechnology Information (NCBI)’s on-line bioinformatics tools, SAGEmap and Unigene (http://www.ncbi.nlm.nih.gov/SAGE/ and http://www.ncbi.nlm.nih.gov/UniGene/index.html) (Lal et al., 1999; Wheeler et al., 2001).

To identify differentially expressed genes, the normal and tumour gastric libraries were compared using SAGE software and Microsoft Access and Excel programs. Various definitions of differential expression have been used in previous SAGE studies, including: threefold, fivefold or 10-fold differences in tag ratio; P-values below 0.05, 0.01 or 0.001; and combinations thereof (Velculescu et al., 1995; Zhang et al., 1997; Hibi et al., 1998; Lal et al., 1999; Nacht et al., 1999; Velculescu et al., 1999; Hough et al., 2000; Waghray et al., 2001). Our relatively stringent definition was a difference of fivefold or more combined with a P-value of 0.01 or less.

Genes expressed specifically in the stomach were identified by digital comparison with other glandular epithelia and mesothelium. One normal and one tumour SAGE library from each of breast, colon, ovary, pancreas, prostate and mesothelium were downloaded from the SAGEmap web-site (details available on request) (Lal et al., 1999). Normal stomach was compared pairwise with the normal libraries, and the pooled gastric carcinomas were compared with the tumours. Genes that appeared to be expressed specifically in the stomach on the basis of these six comparisons were then checked on-line against the many other normal and tumour libraries in SAGEmap (Lal et al., 1999; Wheeler et al., 2001).
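The definition of differential expression used in this section (a difference of fivefold or more together with a P-value of 0.01 or less) can be illustrated with a minimal sketch. The P-values in the study came from the SAGE software; the two-sided Fisher's exact test below is a swapped-in stand-in, not that software's method. The function names, the 0.5 pseudocount used when a tag is absent from one library, and the example tag counts are all hypothetical and chosen only for illustration.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_two_sided(count_a, total_a, count_b, total_b):
    """Two-sided Fisher's exact test for one tag observed in two libraries.

    Assumption: this stands in for the SAGE software's own P-value.
    """
    k = count_a + count_b
    log_denom = log_comb(total_a + total_b, k)

    def log_prob(x):
        # Hypergeometric probability that x of the k tag copies fall in library A.
        return log_comb(total_a, x) + log_comb(total_b, k - x) - log_denom

    observed = log_prob(count_a)
    lo, hi = max(0, k - total_b), min(k, total_a)
    # Sum every outcome that is no more likely than the observed one.
    p = sum(exp(lp) for lp in map(log_prob, range(lo, hi + 1))
            if lp <= observed + 1e-9)
    return min(1.0, p)


def fold_change(count_a, total_a, count_b, total_b, pseudo=0.5):
    """Fold difference between library-normalised tag frequencies.

    Assumption: a zero count is replaced by a hypothetical pseudocount
    so that the ratio stays defined.
    """
    freq_a = max(count_a, pseudo) / total_a
    freq_b = max(count_b, pseudo) / total_b
    return max(freq_a / freq_b, freq_b / freq_a)


def is_differential(count_a, total_a, count_b, total_b,
                    min_fold=5.0, max_p=0.01):
    """Apply both thresholds: fold difference >= min_fold and P <= max_p."""
    fold = fold_change(count_a, total_a, count_b, total_b)
    p = fisher_two_sided(count_a, total_a, count_b, total_b)
    return fold >= min_fold and p <= max_p


if __name__ == "__main__":
    # Hypothetical counts in two libraries of 10 000 tags each.
    print(is_differential(50, 10000, 2, 10000))  # large, well-supported difference
    print(is_differential(5, 10000, 1, 10000))   # about fivefold, but too few tags
```

Requiring both thresholds matters for low-abundance tags: a tag seen five times in one library and once in the other reaches roughly fivefold, yet the counts are too small to give a P-value of 0.01 or less, so it is not called differential.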

Validation by Northern blotting and immunohistochemistry (IHC)

To validate the SAGE profiles, selected transcripts were studied in a wider gastrointestinal panel of 16 normal and tumour tissues and three cell lines (detailed in Figure 2 and Table 4). Where the expression of the gene in the gut was previously minimally investigated, Northern blotting was used. For better characterized genes, where antibodies were available, IHC was used.

Multiple Northern blots were produced using the NorthernMax kit (Ambion Inc), identically loaded with 30 µg total RNA per lane. cDNA probes were generated from total RNA by reverse transcription–polymerase chain reaction (primer sequences available on request). The probe was labelled with [α-32P]dATP (Amersham Ltd, Buckinghamshire, UK) using the Strip-EZ PCR kit (Ambion Inc) then hybridized and washed according to the manufacturer’s instructions.

IHC was performed on tissue processed for diagnostic histopathology. Cultured cells were pelleted in soft agar, and then similarly formalin-fixed and paraffin-embedded. Sections were stained using primary antibodies (pS2, MUC5AC from Lab Vision UK Ltd, Suffolk, UK; human spasmolytic polypeptide from Novocastra Laboratories Ltd, Newcastle upon Tyne, UK; gastrin, lysozyme and cytokeratin 8 from Dako Ltd, Cambridge, UK) and then the VECTASTAIN Universal Elite ABC-Peroxidase Kit (Vector Laboratories Ltd, Peterborough, UK) according to the manufacturers’ instructions.

Acknowledgements

We thank Mr Robert McFarlane for sequencing services; Dr Kenneth W Kinzler for SAGE software and advice; Cancer Research UK and the University of Glasgow for research funding; and the Association of Clinical Pathologists, Keystone Symposia and Pfizer Limited for travel grants.

Note added in proof

Since our manuscript was submitted, a further relevant and interesting paper has been published. This used SAGE to study one gastric adenocarcinoma, of proximal location (from the gastro-oesophageal junction), plus normal gastric body mucosa. These are different from our SAGE tissue samples, but are also from the stomach. (El-Rifai W, Moskaluk CA, Abdrabbo MK, Harper J, Yoshida C, Riggins GJ, Frierson HF and Powell SM. (2002). Cancer Res., 62, 6823–6826).

References

Argani P, Rosty C, Reiter RE, Wilentz RE, Murugesan SR, Leach SD, Ryu B, Skinner HG, Goggins M, Jaffee EM, Yeo CJ, Cameron JL, Kern SE and Hruban RH. (2001). Cancer Res., 61, 4320–4324.
Bahrenberg G, Brauers A, Joost HG and Jakse G. (2000). Biochem. Biophys. Res. Commun., 275, 783–788.
Beck F, Tata F and Chawengsaksophak K. (2000). BioEssays, 22, 431–441.
Berner A and Nesland JM. (1991). Histol. Histopathol, 6, 317–323.
Blok P, Craanen ME, Offerhaus GJA and Tytgat GNJ. (1997). Q. J. Med., 90, 735–749.
Blume-Jensen P and Hunter T. (2001). Nature, 411, 355–365.
Breitwieser GE, McLenithan JC, Cortese JF, Shields JM, Oliva MM, Majewski JL, Machamer CE and Yang VW. (1997). Am. J. Physiol., 272, C957–C965.
Carulli JP, Artinger M, Swain PM, Root CD, Chee L, Tulig C, Guerin J, Osborne M, Stein G, Lian J and Lomedico PT. (1998). J. Cell Biochem., 30 (Suppl), 286–296.
Chan AO, Luk JM, Hui WM and Lam SK. (1999). J. Gastroenterol Hepatol., 14, 1150–1160.
Chang K and Pastan I. (1996). Proc. Natl. Acad. Sci. USA, 93, 136–140.
de Waard V, van den Berg BMM, Veken J, Schultz-Heienbrok R, Pannekoek H and van Zonneveld AJ. (1999). Gene, 226, 1–8.
Ebert MP, Gunther T, Hoffmann J, Yu J, Miehlke S, Schulz HU, Roessner A, Korc M and Malfertheiner P. (2000). Cancer Res., 60, 1995–2001.
El-Rifai W, Frierson Jr HF, Harper JC, Powell SM and Knuutila S. (2001). Int. J. Cancer, 92, 832–838.
Friedl A, Stoesz SP, Buckley P and Gould MN. (1999). Histochem. J., 31, 433–441.
Fuchs CS and Mayer RJ. (1995). N. Engl. J. Med., 333, 32–41.
Gibson PR, Rosella O, Wilson AJ, Mariadason JM, Rickard K, Byron K and Barkla DH. (1999). Carcinogenesis, 20, 539–544.
Guilford P, Hopkins J, Harraway J, McLeod M, McLeod N, Harawira P, Taite H, Scoular R, Miller A and Reeve AE. (1998). Nature, 392, 402–405.
Hanahan D and Weinberg RA. (2000). Cell, 100, 57–70.
Harris RM, Picton R, Singh S and Waring RH. (2000). Life Sci, 67, 2051–2057.
Harris V, Sandridge AL, Black RJ, Brewster DH and Gould A. (1998). Cancer Registration Statistics Scotland 1986–1995. ISD Scotland Publications: Edinburgh.
Hibi K, Liu Q, Beaudry GA, Madden SL, Westra WH, Wehage SL, Yang SC, Heitmiller RF, Bertelsen AH, Sidransky D and Jen J. (1998). Cancer Res., 58, 5690–5694.
Hippo Y, Taniguchi H, Tsutsumi S, Machida N, Chong JM, Fukayama M, Kodama T and Aburatani H. (2002). Cancer Res., 62, 233–240.
Hippo Y, Yashiro M, Ishii M, Taniguchi H, Tsutsumi S, Hirakawa K, Kodama T and Aburatani H. (2001). Cancer Res., 61, 889–895.
Hough CD, Sherman-Baust CA, Pizer ES, Montz FJ, Im DD, Rosenshein NB, Cho KR, Riggins GJ and Morin PJ. (2000). Cancer Res., 60, 6281–6287.
Huff T, Muller CS, Otto AM, Netzker R and Hannappel E. (2001). Int. J. Biochem. Cell. Biol., 33, 205–220.
Isaacson P. (1982). Gut, 23, 578–588.
Janke J, Schluter K, Jandrig B, Theile M, Kolble K, Arnold W, Grinstein E, Schwartz A, Estevez-Schwarz L, Schlag PM, Jockusch BM and Scherneck S. (2000). J. Exp. Med., 191, 1675–1686.
Jung MH, Kim SC, Jeon GA, Kim SH, Kim Y, Choi KS, Park SI, Joe MK and Kimm K. (2000). Genomics, 69, 281–286.
Lal A, Lash AE, Altschul SF, Velculescu V, Zhang L, McLendon RE, Marra MA, Prange C, Morin PJ, Polyak K, Papadopoulos N, Vogelstein B, Kinzler KW, Strausberg RL and Riggins GJ. (1999). Cancer Res., 59, 5403–5407.
Laurén P. (1965). Acta Pathol. Microbiol Scand., 64, 31–49.
Lefebvre O, Chenard MP, Masson R, Linares J, Dierich A, LeMeur M, Wendling C, Tomasetto C, Chambon P and Rio MC. (1996). Science, 274, 259–262.
Ma T and Verkman AS. (1999). J. Physiol., 517, 317–326.
Machado JC, Nogueira AM, Carneiro F, Reis CA and Sobrinho-Simoes M. (2000). J. Pathol., 190, 437–443.
Mitani M, Kuwabara Y, Kawamura H, Sato A, Hattori K and Fujii Y. (2000). World J. Surg., 24, 455–458.
Mori M, Mimori K, Yoshikawa Y, Shibuta K, Utsunomiya T, Sadanaga N, Tanaka F, Matsuyama A, Inoue H and Sugimachi K. (2002). Surgery, 131, S39–S47.
Nacht M, Ferguson AT, Zhang W, Petroziello JM, Cook BP, Gao YH, Maguire S, Riley D, Coppola G, Landes GM, Madden SL and Sukumar S. (1999). Cancer Res., 59, 5464–5470.
Norton JD. (2000). J. Cell Sci., 113, 3897–3905.
Owen DA. (1986). Am. J. Surg. Pathol., 10, 48–61.
Parkin DM. (2001). Lancet Oncol., 2, 533–543.
Perou CM, Sorlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, Huge O, Pergamenschikov A, Williams C, Zhu SX, Lonning PE, Borresen-Dale AL, Brown PO and Botstein D. (2000). Nature, 406, 747–752.
Pradet-Balade B, Boulme F, Beug H, Mullner EW and Garcia-Sanz JA. (2001). Trends Biochem. Sci., 26, 225–229.
Roukos DH. (2000). Cancer Treat. Rev., 26, 243–255.
Salesiotis AN, Wang CK, Wang CD, Burger A, Li H and Seth A. (1995). Cancer Lett., 91, 47–54.
Scheurle D, De Young MP, Binninger DM, Page H, Jahanzeb M and Narayanan R. (2000). Cancer Res., 60, 4037–4043.
Sessa F, Bonato M, Frigerio B, Capella C, Solcia E, Prat M, Bara J and Samloff IM. (1990). Gastroenterology, 98, 1655–1665.
Velculescu VE, Madden SL, Zhang L, Lash AE, Yu J, Rago C, Lal A, Wang CJ, Beaudry GA, Ciriello KM, Cook BP, Dufault MR, Ferguson AT, Gao YH, He TC, Hermeking H, Hiraldo SK, Hwang PM, Lopez MA, Luderer HF, Mathews B, Petroziello JM, Polyak K, Zawel L, Zhang W, Zhang XM, Zhou W, Haluska FG, Jen J, Sukumar S, Landes GM, Riggins GJ, Vogelstein B and Kinzler KW. (1999). Nat. Genet., 23, 387–388.
Velculescu VE, Vogelstein B and Kinzler KW. (2000). Trends Genet., 16, 423–425.
Velculescu VE, Zhang L, Vogelstein B and Kinzler KW. (1995). Science, 270, 484–487.
Waghray A, Schober M, Feroze F, Yao F, Virgin J and Chen YQ. (2001). Cancer Res., 61, 4283–4286.
Wang G, Wang M, You W and Li H. (2001). Zhonghua Yi Xue Yi Chuan Xue Za Zhi, 18, 43–47.
Wheeler DL, Church DM, Lash AE, Leipe DD, Madden TL, Pontius JU, Schuler GD, Schriml LM, Tatusova TA, Wagner L and Rapp BA. (2001). Nucleic Acids Res., 29, 11–16.
Wong WM, Poulsom R and Wright NA. (1999). Gut, 44, 890–895.
Yoshikawa Y, Mukai H, Hino F, Asada K and Kato I. (2000). Jpn. J. Cancer Res., 91, 459–463.
Zhang L, Zhou W, Velculescu VE, Kern SE, Hruban RH, Hamilton SR, Vogelstein B and Kinzler KW. (1997). Science, 276, 1268–1272.
