<<

View metadata, citation and similar papers at core.ac.uk brought to you by CORE

provided by Elsevier - Publisher Connector

Genomics 88 (2006) 173–184 www.elsevier.com/locate/ygeno

Correlation of expression by comparative analysis of real-time PCR profiling data

Sunita Badola a, Heidi Spurling a, Keith Robison a, Eric R. Fedyk a, Gary A. Silverman b, ⁎ Jochen Strayle c, Rosana Kapeller a,1, Christopher A. Tsu a,

a Millennium Pharmaceuticals, Inc., 40 Landsdowne Street, Cambridge, MA 02139, USA b Department of Pediatrics, University of Pittsburgh School of Medicine, Magee-Women’s Hospital, 300 Halket Street, Pittsburgh, PA 15213, USA c Bayer HealthCare AG, 42096 Wuppertal, Germany Received 2 December 2005; accepted 27 March 2006 Available online 18 May 2006

Abstract

Imbalanced protease activity has long been recognized in the progression of disease states such as and inflammation. , the largest family of endogenous protease inhibitors, target a wide variety of and cysteine and play a role in a number of physiological and pathological states. The expression profiles of 20 serpins and 105 serine and cysteine proteases were determined across a panel of normal and diseased human tissues. In general, expression of serpins was highly restricted in both normal and diseased tissues, suggesting defined physiological roles for these protease inhibitors. A high correlation in expression for a particular serpin–protease pair in healthy tissues was often predictive of a biological interaction. The most striking finding was the dramatic change observed in the regulation of expression between proteases and their cognate inhibitors in diseased tissues. The loss of regulated serpin–protease matched expression may underlie the imbalanced protease activity observed in pathological states. © 2006 Elsevier Inc. All rights reserved.

Keywords: Protease; Serpin; Cancer; Imbalance; Expression profiling; correlation; PCR

Proteases play critical roles in many normal processes such We have chosen to study the largest family of endogenous as development, , and immunity. Therefore, protease inhibitors, the serpins, which target a wide variety of regulatory mechanisms must exist to ensure that homeostasis serine and cysteine proteases [2,3]. Over 1000 serpins have is maintained. In this regard, there are endogenous inhibitors been identified across all species, including 37 human forms for most of the known protease families [1]. A protease and [4]. Their function is mediated by a unique suicide substrate its cognate inhibitor may normally be closely matched in site mechanism, in which cleavage of an exposed substrate-like and amount of expression. An imbalance in expression sequence (reactive site loop, RSL) causes a dramatic confor- patterns may arise as either a causative or responsive element mational change in the serpin, thereby trapping the protease in a in disease. covalently bound, inactive state [5]. The first known serpin, α1- antitrypsin (SERPINA1), is a secreted glycoprotein that is found in high levels in the serum [6]. It is produced primarily in the Abbreviations: serpin, inhibitor; RSL, reactive site loop; UUI, uterine urinary incontinence; BM-MNC, bone marrow mononuclear cells; liver, but also by a few inflammatory and epithelial cell types IBD, inflammatory bowel disease; DRG, dorsal root ganglion; CHF, chronic [7]. A primary site of action is the lung, where SERPINA1 heart failure; COPD, chronic obstructive pulmonary disease; PBMC, peripheral inhibits human , preventing tissue damage. blood mononuclear cells; BPH, benign prostate hyperplasia; SMC, smooth Hereditary loss of SERPINA1 has been implicated in chronic β muscle cell; B2M, 2-microglobulin; siRNA, small interfering RNA. liver disease and emphysema, as well as an increased risk of ⁎ Corresponding author. Fax: +1 617 551 8906. E-mail address: [email protected] (C.A. Tsu). liver and other [6,8]. Conversely, elevated plasma levels 1 Current address: Renegade Therapeutics, One Broadway, 14th floor, of the serpin squamous cell carcinoma antigens SCCA1 and 2 Cambridge, MA 02142, USA. (SERPINB3 and SERPINB4) are an established marker for this

0888-7543/$ - see front matter © 2006 Elsevier Inc. All rights reserved. doi:10.1016/j.ygeno.2006.03.017 174 S. Badola et al. / Genomics 88 (2006) 173–184 disease [9,10]. Although they interact in vitro with S, Particular attention was paid to disease relevance and site L, and K, or G and , respectively, the of expression in the selection of serpins. This resulted in the biological roles of SERPINB3 and SERPINB4 remain un- selection of most of the intracellular ov-serpins (clade B) [2]. known [11,12]. Here we suggest that expression profiling Although an effort was made to avoid secreted serpins, as site correlation may aid in the identification of novel in vivo of action may not correlate well with site of expression, we functions of serpins and potential target proteases. did select a small number of secreted serpins (clades A, E, We utilized real-time PCR (TaqMan PCR) to determine the and I), based primarily on biochemical characterization and expression profiles of 20 serpins and 105 serine and cysteine disease biology. Several novel, previously uncharacterized proteases across a panel of 58 samples from a collection of 13 serpins were also selected. In total, 20 serpins (Table 1) were cell lines and 31 normal and 14 diseased human tissues. In chosen such that the method could be validated with known general, highly restricted expression of serpins was observed interactions and that new serpin–protease pairs might be in normal and diseased tissues, suggesting defined physio- identified de novo. logical roles for these protease inhibitors. Furthermore, in specific cases, the expression correlations observed in lung, Serpin expression profile ovarian, breast, and tissues were altered compared to normal tissues. The expression profile of 20 serpins across 58 samples from a collection of 13 cells and 31 normal and 14 diseased human tissues Results is depicted in Fig. 1A. The normalized relative expression is represented by a light to dark color grid, which shows low to high Protease and serpin selection criteria levels of mRNA transcripts. For example, SERPINB1 and SERPINB8 are ubiquitously expressed, whereas SERPINI2 is The entire human genomic complement of proteases (greater expressed only in pancreas. In general, good agreement was found than 500) and serpins (37) is relatively large [1,13]. Criteria between the serpin mRNA expression patterns in this work and were therefore used to select a subset for transcription profiling. previously reported results (mRNA and/or ). This would Factors included known disease link, known target or substrate, include serpins A1, A3, and A4 (liver, pancreas, and kidney) site of expression (tissue, intracellular or secreted), protein [7,14,15]; A5 (pancreas, kidney, liver, and ovary) [16];B8 structure (fold, essential catalytic or inhibitory residues), and (digestive tract, lung, liver, pituitary, skin, and tonsil) [17];I1 homology to of known activity or disease relevance. (brain cortex, hypothalamus, pituitary, and spinal cord) [18];andI2 Aspartyl and metalloproteases were removed from consider- (pancreas) [19]. The most striking and unexpected finding is the ation, as serine (, ) and cysteine proteases restricted expression of most serpins. The majority of the serpins are account for the majority of proteases known to interact with expressed at “high” levels in four or fewer, often related, tissues in serpins. In this way, a total of 81 trypsin, 10 subtilase, and 14 our panel (Fig. 1A). This supports the hypothesis that serpin cysteine proteases were selected for profiling (Supplemental expression is highly regulated and that imbalanced expression may Table 1). lead to pathological states.

Table 1 Serpin sequences Nomenclature GenBank Common name RSL (P1/P1′) Putative targets SERPINA1 X01683 α1-Antitrypsin GTEAAGAMFLEAIPM/SIPPE Trypsin, SERPINA3 X68733 α1-Antichymotrypsin GTEASAATAVKITLL/SALVE , chymase, SERPINA4 L19684 GTEAAAATSFAIKFF/SAQTN SERPINA5 M68516 inhibitor GTRAAAATGTIFTFR/SARLN , SERPINA9 AX392967 Sequence 9 from Patent WO0214358 GTEATAATTTKFIVR/SKDGP Unknown SERPINA11 BD130048 Human serine protease and serpin polypeptide GTEAGAASGLL/SQPPSLNTM Unknown SERPINA12 AX192197 Sequence 13 from Patent WO0149729 GTEGAAGTGAQTLPM/ETPLV Unknown SERPINA13 AX163789 Sequence 1 from Patent WO0138534 GSEAAAATSIQL/TPGPRPDL Unknown SERPINB1 M93056 Leukocyte elastase inhibitor GTEAAAATAGIATFC/MLMPE Neutrophil elastase, cathepsin G SERPINB2 J03603 inhibitor-2 GTEAAAGTGGVMTGR/TGHGG uPA SERPINB4 U19557 Squamous cell carcinoma antigen 2 GVEAAAATAVVVVEL/SSPST Cathepsin G, chymase SERPINB7 AF027866 Megsin GTEATAATGSNIVEK/QLPQS SERPINB8 L40377 Cytoplasmic antiproteinase 2 GTEAAAATAVVRNSR/CSRME Trypsin, SERPINB10 U35459 Bomapin GTEAAAGSGSEIDIR/IRVPS Trypsin, thrombin SERPINB11 AF419954 Epipin GTEAAAATGDSIAVK/SLPMR Unknown SERPINB12 AF411191 Yukopin GTQAAAATGAVVSER/SLRSW Trypsin SERPINB13 AF169949 Hurpin, headpin GTEAAAATGIGFTVT/SAPGH Cathepsins L, K SERPINE3 AX407123 Sequence 5 from Patent WO0226981 GTKASGATALLLLKRSR/IPI Unknown SERPINI1 Z81326 GSEAAAVSGMIAISR/MAVLY Trypsin, tPA SERPINI2 BC027859 Pancpin GSEAATSTGIHIPVIM/SLAQ Unknown Serpins are listed by standard nomenclature [2], GenBank nucleic acid accession number, and common name. RSL sequences were derived by alignment with SERPINA1. RSL cleavage sites between P1/P1′ in bold are known or proposed based on serpin structure. Putative protease targets were taken from [2,3]. S. Badola et al. / Genomics 88 (2006) 173–184 175

The tight regulation of serpin expression is represented in RSL contains two putative cleavage sites, RNSR↓ and RCSR↓, Figs. 1B and 1C. SERPINA11 (Fig. 1B), a previously unknown which conform to the minimal furin consensus sequence patent sequence, exhibits a restricted expression pattern typical RXXR↓ [24–26]. Collectively, these data suggest that SER- of many serpins: very high expression in normal liver, with >30- PINB8 may play a role in the regulation of furin activity in vivo. fold less expression in normal breast, prostate BPH (benign The expression profile of SERPINE3 (Fig. 2C) is more prostate hyperplasia), and breast tumor. SERPINB4 (SCCA2) restricted than that of SERPINB8, with highest levels found in (Fig. 1C) shows dramatically elevated levels of expression in neural tissues, such as brain, pituitary, and spinal cord (Fig. 1A). lung and prostate tumor tissues versus the corresponding normal PCSK4, a member of the family, tissues. This finding is confirmatory of many studies in which correlates in expression with SERPINE3, with an acceptable SERPINB4 has been identified as a marker for lung and other range of correlation among most of the normal tissues. The types of cancer [20,21]. Low levels of expression were also seen correlation is >0.70 in neural tissues such as brain, spinal cord, in normal tonsil, normal pituitary, hemangioma tumor, colon and pituitary gland (Fig. 2C and data not shown). Although the IBD (inflammatory bowel disease), and breast tumor. primary site of PCSK4 expression has been reported to be in the testes (not included in this study), our results show significant Expression correlation in normal tissues expression in many other tissues such as neural, erythroid, kidney, and pancreas [27]. While the target of SERPINE3 is One of the major challenges is to elucidate the functions of unknown, PCSK4 is a highly specific , preferring a serpins and proteases whose functions are unknown. Mining consensus sequence of KPXR↓XP [28]. A similar sequence is expression profiles is one method to uncover gene function. It found in the RSL of SERPINE3, KRSR↓IP. These data suggest is well reported in the literature that that encode proteins SERPINE3 as the cognate inhibitor of PCSK4 in vivo. that participate in the same pathway or are part of the same protein complex are often coregulated [22]. Clusters of genes Expression correlation in oncology samples with related functions often exhibit similar expression patterns across a diverse collection of samples. Tissue-specific gene Increased protease activity has been implicated in the genesis expression can be used to predict tissue-specific function. of many tumors and metastases [29,30]. We postulate that Therefore, we explored the relationship between serpins and increased protease activity may be due to the loss of matched proteases based on the coregulation of expression in normal expression between a protease and its paired serpin. To test this human tissues. hypothesis we analyzed the expression correlation of serpin– We analyzed the expression correlation of 20 serpins (Table protease pairs across a panel of five different normal and tumor 1) and 105 proteases (Supplemental Table 1) across a panel of tissues (breast, lung, colon, ovary, and prostate) (Fig. 3). We 25 normal human tissues (Supplemental Table 2). A heat map assumed that at least four scenarios are likely to occur during displaying the results of this analysis is shown in Fig. 2A and tumorigenesis and they are described below: the correlation values are listed in Supplemental Table 1. Only serpin–protease pairs with a correlation value >0.7 were (1) Expression of serpin is elevated to compensate increased investigated further (see below). This cutoff value eliminated protease expression in the diseased tissue. most, if not all, spurious/weak correlations. We also observed (2) Protease expression increases, while that of the serpin that several serpins shared distinct banding patterns of high remains unchanged, leading to imbalanced protease correlation (Fig. 2A). This primarily reflects predominant activity. expression in liver of serpins A1, A3, A4, and A11 with pro- (3) Serpin expression drops, while that of the protease teases trypsin, KAL, IF, HTRA4, FA10, FA12, FA7, and Eos, or remains the same or increases, leading to imbalanced in tonsil of serpins B7, B10, B11, and B13 with proteases protease activity. KLK10, KLK12, and KLK13. (4) Serpin expression increases, while that of the protease We used this information to match patterns remains the same or decreases. of proteases with their putative inhibitors as a starting point to understand functional relationships and roles played by serpin– We present examples of the different scenarios as illustrated protease pairs under physiological conditions. To exemplify our by serpin–protease pairs of high expression correlation in findings, the data obtained from two serpin–protease pairs are normal tissues. Expression of SERPINB8 and furin (Fig. 3A) depicted here. SERPINB8 is coexpressed with the subtilase- remains balanced in normal and tumor tissues from breast, type protease furin (Fig. 2B) and SERPINE3 with the subtilase colon, lung, and ovary. Although two- to threefold increases in proprotein convertase, type 4 (PCSK4) (Fig. 2C). furin levels are observed in colon, lung, and prostate tumors, the SERPINB8 is broadly expressed across the panel, most level of expression of SERPINB8 increases proportionally. In notably in liver, skeletal muscle, artery, heart, and smooth this example, serpin expression accompanies the increased muscle cell (SMC) coronary tissues. Furin expression matches levels of protease in tumor samples. The elevated expression of that of SERPINB8 in most cases, with a correlation of 0.8. furin in lung, head and neck, and prostate cancer has been SERPINB8 inhibits furin in vitro with an overall Ki of 54 pM previously reported [29,31,32]. This may allow for the [23]. Although SERPINB8 is known to inhibit several other increased secretion of growth factors and matrix metallopro- trypsin- and subtilase-type proteases in vitro, it is notable that the teases, which in turn leads to increased tumorigenesis and 176 S. Badola et al. / Genomics 88 (2006) 173–184 invasiveness. However, if SERPINB8 expression levels rise, as remains unchanged. Slight increases in TMPRSS3 expression observed here, to compensate for elevated furin expression were also seen in colon and prostate. TMPRSS3 has been levels, this effect would be ameliorated. previously identified as a marker in pancreatic and ovarian Expression of SERPINA5 and the trypsin-type protease cancer [33,34]. It is expressed on the surface of cells and transmembrane protease, serine 3 (TMPRSS3) (Fig. 3B), is appears to possess cleavage specificity after basic amino acids, matched in normal breast, colon, ovary, and prostate with low such as Arg and Lys [34]. SERPINA5 () is a levels of unmatched TMPRSS3 expression in the normal lung. known inhibitor of thrombin and several other coagulation TMPRSS3 is elevated significantly, as much as 10- to 20-fold, proteases, all of which possess a preference for basic amino in breast and ovary tumors, while SERPINA5 expression acids [35,36]. S. Badola et al. / Genomics 88 (2006) 173–184 177

Fig. 1. Serpin expression profile across 58 normal and diseased tissues. (A) Serpin expression profile/heat map. Gene expression profile of 20 serpins across 13 cell types and 31 normal and 14 diseased human tissues. Real-time PCR reagents were designed to interrogate each serpin and experiments were performed on cDNA generated from the cells and different clinical human tissues. Each clinical tissue sample is a pool of three different individuals to account for biological variability. Boxes highlighted in red represent samples expressing serpin gene at high levels, green represents moderate levels, and yellow represents no/low expression. Gray boxes indicate no data available for the serpin in that specific sample. (B) SERPINA11 gene expression profile across 58 human tissues and cells. SERPINA11 shows restricted expression in breast normal, breast tumor, liver normal, prostate BPH, and tonsil normal. Liver normal exhibits the highest level of expression compared to the rest of the tissues. (C) SERPINB4 gene expression profile across 58 human tissues and cells. SERPINB4 shows restricted expression in colon IBD, hemangioma tumor, lung normal, pituitary gland, and prostate tumor. Lung tumor and prostate tumor show a high level of expression compared to the rest of the tissues.

Matched or slight overexpression of SERPINB1 relative observed in normal tissues (Fig. 3D). Expression of HABP2 to the subtilase-type protease 2 (TPP2) in colon tumor is elevated 18-fold versus colon normal, whereas is seen in normal tissues (Fig. 3C). The elevated expression the levels of SERPINA1 remain unchanged. (Small changes in of TPP2 in colon tumor (∼10-fold) is met by increased lung tumor expression versus lung normal are not significant for expression of SERPINB1. The normal role of TTP2 is the either HABP2 or SERPINA1.) HABP2 is a multidomain serine processing of peptides for eventual display in the MHC class protease found primarily in plasma [43–45]. It has been I pathway. TPP2 is found as a multimer of 135-kDa subunits identified in other studies as a significant marker for lung and primarily functions as a relatively nonspecific tripeptidyl adenocarcinoma [46]. SERPINA1 is associated with lung , though endoproteolytic cleavage can occur disease, and the coexpression with HABP2 suggests these two [37–39]. SERPINB1 is among the most broadly expressed proteins interact in vivo. However, the fact that both the serpin serpins, suggesting a general role. It inhibits a wide variety and the protease are also found in plasma may complicate the of elastase- and chymotrypsin-like serine proteases by analysis. utilizing two separate sites on its RSL [40–42].One function of SERPINB1 may be to protect from Discussion elastase and other serine proteases that are highly expressed in these cells. Both SERPINB1 and TPP2 are cytosolic Serpin expression is highly tissue specific in both normal and proteins and may interact. As the activity and specificity of diseased tissues TPP2 are not currently well understood, the potential interaction of SERPINB1 and TPP2 merits further Strikingly, the majority of the serpins examined in this study investigation. exhibited tightly regulated expression across a wide variety of Low, but matched, expression of SERPINA1 and the trypsin- tissues, suggesting physiological roles that may be much more type protease hyaluronan binding protein 2 (HABP2) is precise than simply the general regulation of . This is 178 S. Badola et al. / Genomics 88 (2006) 173–184

Fig. 2. Correlation of serpin and protease expression in normal tissues. (A) Correlation plot of serpins versus proteases in normal tissues. Heat map displaying expression correlation >0.7 between serpins and proteases in normal human tissues. Correlation values were calculated as described under Materials and methods from serpin and protease expression values in 25 normal human tissues (Supplemental Table 2). Each serpin was matched against 105 proteases and correlation values >0.7 were further investigated (Supplemental Table 1). Several distinct banding patterns of high correlation reflect similar tissue-specific serpin expression in liver, pancreas, and kidney (SERPINA1, A3, and A4) or tonsil (SERPINB7, 10, 11, and 13). (B) SERPINB8 with subtilase/furin. Expression profile of SERPINB8 with furin (GenBank X17094) shows correlation value of 0.70 compared across 25 normal tissues (Supplemental Table 2). SERPINB8 is broadly expressed across the tissue panel, most notably in liver, skeletal muscle, artery, heart, and SMC, coronary. Furin expression matches that of SERPINB8 in most cases, with a correlation of 0.8. (C) SERPINE3 with subtilase/PCSK4 (proprotein convertase / type 4). Expression profile of SERPINE3 with PCSK4 (GenBank BC036354) shows correlation value of >0.7 compared across neural tissues only (exception for this pair) and has acceptable correlation compared with the rest of the 25 different normal tissues (Supplemental Table 2). The expression profile of SERPINE3 is more restricted than that of SERPINB8, with highest levels found in neural tissues, such as brain, pituitary, and spinal cord. PCSK4, a member of the proprotein convertase family, correlates in expression with SERPINE3, with an overall value of 0.9. S. Badola et al. / Genomics 88 (2006) 173–184 179

Fig. 2 (continued). amply illustrated by such serpins as I1 (neural tissue), I2 subtilase-type proteases are intracellular, while trypsin types (pancreas), or B4 (lung cancer). This is consistent with the are secreted. This makes precise localization more difficult for numerous tissue-specific functions involving serpins, including the trypsin-type proteases and may complicate expression blood coagulation and , spermatogenesis, neural correlation efforts in some cases. function, and prohormone conversion [2,3]. Also of interest is Of note is the fact that multiple serpins and proteases can that the expression patterns observed in this study often appear be expressed in the same tissue, such as liver, or that one clade-specific. Most clade A serpins are expressed in the major serpin may associate with multiple proteases [2,3], making it secretory organs such as liver, pancreas, and kidney. Other difficult to assign partners on the basis of correlation alone. clades, such as clade B, appear more broadly expressed, but Therefore, we promote the use of correlation data as a first definitive analysis must await further profiling of the remaining step to understanding expression patterns and to narrow down human serpins. That serpin function might follow phylogenetic possible serpin–protease pairings. However, we recognize classification would greatly aid in the study of previously that correlation data alone are insufficient for thorough uncharacterized serpins and the identification of cognate evaluation of serpin–protease pairs and need to be combined proteases. with other biological and biochemical information to understand the full impact of serpin–protease pairings on Correlation analysis of normal tissue expression identified both tissue homeostasis. In addition, as discussed above, when existing and plausible serpin–protease pairs serpin and/or protease is secreted, this approach may not be ideal, as site of expression may not correlate with site of Our correlation method clearly identified an existing action in these cases. interaction in one example (SERPINB8–furin) [23] and suggested a novel interaction (SERPINE3–PCSK4) in Serpin and protease expression varies dramatically in cancer, another. Given the small subset of proteases and inhibitors resulting in an imbalance that may play a role in disease that was examined in this study, this would indicate that progression or response expression correlation is a useful tool in addition to standard biochemical and cell biological methods. The technique does As the proteases examined in this study were often chosen on the appear to be more robust with subtilase- versus trypsin-type basis of disease relevance, expression correlation may identify a serine proteases and cysteine proteases. This may reflect serpin as an important regulator of potentially pathogenic activity in differing biological roles and general location of each normal tissues. Such knowledge may be critical in understanding protease class or the particular mechanism of interaction of the role of relatively obscure proteases such as TMPRSS3 and each serpin with its protease target. As a general rule, HABP2 in cancer. 180 S. Badola et al. / Genomics 88 (2006) 173–184

We were intrigued by the large changes in protease expression was remarkable. These serpins may represent a legitimate attempt that were observed. Ten-fold or greater increase was not by the body to counterbalance overproduction of proteases. Thus uncommon. That the serpin expression sometimes compensated recognized, they may provide new insight into the mechanism and

Fig. 3. Normalized expression profile of serpin and protease pairs across five different normal and tumor tissues (breast, lung, colon, ovary, and prostate). The serpin– protease pairs were selected based on >0.7 correlation values across 25 normal human tissues (Supplemental Tables 1 and 2). (A) SERPINB8 and subtilase/furin. Expression of SERPINB8 and furin exhibited a correlation value of 0.70 in normal tissue comparison. Expression of SERPINB8 and furin remains balanced in normal and tumor tissues from breast, colon, lung, and ovary, despite two- to threefold increases in furin levels in the colon and lung tumors. (B) SERPINA5 and trypsin/ TMPRSS3 (transmembrane protease, serine 3). Expression of SERPINA5 and TMPRSS3 (GenBank AF201380) exhibited a correlation value of 0.73 in normal tissue comparison. Expression of TMPRSS3 was upregulated in breast, colon, and ovary tumors. (C) SERPINB1 and subtilase/TPP2 (tripeptidyl peptidase II). SERPINB1 and TPP2 (GenBank D49742) expression exhibited a correlation value of 0.78 in normal tissue comparison. Expression of SERPINB1 was upregulated in colon and lung tumors, and SERPINB1 levels matched the elevated expression of TPP2 in colon tumor. (D) SERPINA1 and trypsin/HABP2 (hyaluronan binding protein 2). SERPINA1 and HABP2 (GenBank M73047) expression exhibited a correlation value of 0.99 in normal tissue comparison. Expression of HABP2 was upregulated in colon and lung tumor tissue samples. Elevated expression of HABP2 in colon and lung tumor is unmatched by SERPINA1. S. Badola et al. / Genomics 88 (2006) 173–184 181

Fig. 3 (continued). treatment of the disease. Computational tools such as presented here addressed by the use of siRNA to knockdown expression of can provide useful information in this regard. TMPRSS3 and/or SERPINB5 in these tumors to establish For example, our data show that TMPRSS3 expression relevance to disease initiation and/or progression. increases 10- to 20-fold in breast and ovary tumors compared to normal breast and ovary tissues. The expression level of its Relationship between gene function and transcriptional potential inhibitor, SERPINB5, identified in this study (Fig. 3B) coexpression remains unchanged, leading to the hypothesis that dysregulation of TMPRSS3 activity may trigger certain events that are Correlations among the transcript levels in normal tissues necessary for either the initiation or the maintenance of were examined to illuminate the relationship between proteases oncogenesis in these tumor types. This hypothesis can be and serpins. The assumption is that alterations in gene 182 S. Badola et al. / Genomics 88 (2006) 173–184 expression will manifest themselves at the level of biological purity and integrity of the RNA were assessed on the Agilent 2100 bioanalyzer pathways or coregulated gene sets, rather than at the level of with the RNA 6000 Nano Labchip reagent set (Agilent Technologies, CA, USA). The RNA was quantified spectrophotometrically and treated with DNase individual genes [47]. In this approach, genes that have similar (Ambion, TX, USA) for 30 min at 37°C. DNase-treated RNAwas extracted from expression patterns across a set of normal tissue samples are the samples and stored at −80°C. hypothesized to have a functional relationship, such as physical interaction between the encoded proteins, though coexpression cDNA generation of transcripts does not necessarily imply a causal relationship among proteins [48]. Gene-to-gene relationships based on First-strand cDNA synthesis was performed with the ABI reverse transcription transcript levels are purely subjective and need further validation system (Applied Biosystems, CA, USA) according to the manufacturer's protocol. The cDNA quality was assessed by generating expression profiles of 18S ribosomal by functional relationship, e.g., biological and biochemical and β2-microglobulin (B2M) genes with real-time PCR (ABI 7700; Applied assays. In this paper we have made an attempt to bring certain Biosystems). cDNA samples were stored at −20°C. serpins and proteases together based on transcript-level correlation in various normal tissues. Several pairings have Primers and TaqMan probe design been identified, laying the foundation for future experiments. mRNA sequences for all proteases and serpins were derived from GenBank (for accession numbers, see Table 1 and Supplemental Table 1) and primers and Materials and methods TaqMan probes were designed with Primer Express software, version 2.0 (Applied Biosystems). We targeted the design at the 3′end of the open reading Protease and serpin selection frame of the gene. All primers were obtained from MWG Biotech (NC, USA) and 6-carboxyfluorescein-labeled probes were obtained from Applied Biosys- A variety of bioinformatic methods were used over the course of this project, tems. For the normalization of our results, we used VIC-labeled B2M reagents due to the evolving nature of human sequence databases during the execution of from Applied Biosystems. Each of the probes was quenched by 6- this project. Expressed sequence tag (EST) databases were either searched carboxytetramethylrhodamine. directly or clustered with d2 [49]. Genomic sequences were scanned with FGENESH [50] or GenScan [51] to generate potential transcripts. These datasets Primer and probe quality control were then searched using TBLASTN with a defined set of protein probes or translated HMMER searches using a Paracel GeneMatcher and appropriate Each primer and probe pair was tested for PCR efficiency in a synthetic models from PFAM [52]. Protein probes for BLAST [53] were first examined for template system. Primer and probe pairs were thoroughly tested by 7 logs of repetitive regions or domains not of interest to the searches and these regions dilution with synthetic template and also on human control cDNA. Reagents that masked out prior to conducting the search. Phrap [54] and BLASTN were used to passed the criteria were validated for use in expression profiling experiments. identify overlapping ESTs or genomic predictions that overlapped each other or All of the protease and protease inhibitor assays provided the best result with the EST hits. Using the clustered sequence results, GenBank accession numbers, highest primer and probe concentration (900 nM primers and 250 nM probe), gene symbols, and annotations were assigned and catalogued. multiplexed with a housekeeping gene at primer limited concentration (200 nM primers and 200 nM probe). Tissue and cell samples TaqMan real-time PCR Human biological materials were collected from two major sources: fee-for- service tissue providers and academic collaborators. The tissue samples were TaqMan PCR assays were performed on an ABI Prism 7700 sequence collected and frozen by liquid nitrogen within 3 h or less of surgery. In all cases, detection system (Applied Biosystems). Multiplexed assay was performed with informed consent granting use of the tissue for research purposes was obtained protease or serpin reagent sets with the human B2M gene as an endogenous from the patients prior to sample collection. Personal identifiers were stripped control. All quantitative assays designed using ABI guidelines were run using from the samples and the samples were made anonymous prior to shipment to the same universal thermal cycling parameters. Cycling parameters were run at Millennium's central biorepository, the Molecular and Biologic Resource Center. 50°C for 2 min, 95°C for 10 min to activate the DNA polymerase and then 40 All tissue samples were evaluated by a pathologist by hematoxylin and eosin cycles of 95°C for 15 s, 60°C for 1 min. stain and stored at −80°C before being processed into cDNA. We generated a panel of 58 different samples that were collected from 31 normal tissues, 14 Data analysis diseased tissues, and 13 cell types. To account for biological variability, the clinical tissue samples consist of pooled cDNA samples containing representa- Data were analyzed using SDS 1.7 (Applied Biosystems) software. The tive specimens from three different human donors, with both genders first step was to generate an amplification plot for every sample, which represented, whenever possible. Oncology pools often contain representative showed ΔRn on the y axis (where Rn is the fluorescence emission intensity types of different common tumors of that organ, so as to conduct as wide a survey of the reporter dye normalized to a passive reference) against the cycle as possible. Tumor sample pools are identified only by the organ of origin. number on the x axis. From each amplification plot, a threshold cycle (Ct) value was calculated, which was defined as the cycle at which a statistically

Tissue selection criteria for correlation analysis significant increase in ΔRn was first detected, and was displayed in the graph as the intercept point of the amplification plot and threshold. The

Tissues were chosen for transcript analysis to evaluate oncology and obtained Ct values were exported to an Excel spreadsheet for further inflammation paradigms. Normal tissues represent all critical organs in the analysis. human body except some of the neural tissues such as brain due to generally high levels of transcriptional activity. Diseased tissues are more relevant to represent Relative expression calculation oncology and inflammation-related diseases, e.g., tumors, IBD, and COPD. Relative quantitation is used to compare the changes in steady-state mRNA RNA extraction levels of two or more genes to each other, with one of them acting as an endogenous control [55]. Normalization of target gene expression levels was Total RNA extraction was carried out using RNA-STAT (Tel-Test, Inc., TX, performed to compensate intra- and interkinetic RT-PCR variations. Data USA) according to the manufacturer's protocol (RNA Stat 60 User Bulletin). The normalization was carried out against an endogenous unregulated gene S. Badola et al. / Genomics 88 (2006) 173–184 183 transcript. Based on the exponential amplification of the target gene, as well as [7] R.J. Coakley, C. Taggart, S. O'Neill, N.G. McElvaney, Alpha1-antitrypsin the normalizer, the amount of amplified molecules at the threshold cycle was deficiency: biological answers to clinical questions, Am. J. Med. Sci. 321 given by (2001) 33–41. [8] H.A. Chapman Jr., G.P. Shi, Protease injury in the development of COPD: Relative Expression ¼ 2−∂∂CT Thomas A. Neff Lecture, Chest 117 (2000) 305S–309S. AACT ¼ ACT;q AC T;cb [9] H. Kato, T. Torigoe, Radioimmunoassay for tumor antigen of human cervical squamous cell carcinoma, Cancer 40 (1977) 1621–1628. This equation is based on the precondition that the efficiencies of target and [10] G.A. Silverman, et al., SCCA1 and SCCA2 are proteinase inhibitors that ∂ reference amplification are approximately equal and CT,q is the difference map to the serpin cluster at 18q21.3, Tumour Biol. 19 (1998) 480–487. ∂ between target and endogenous reference control and CT,cb is the calibrator [11] C. Schick, et al., Squamous cell carcinoma antigen 2 is a novel serpin that sample (no template control). inhibits the chymotrypsin-like proteinases cathepsin G and chymase, J. Biol. Chem. 272 (1997) 1849–1855. Calculation of correlation values [12] C. Schick, et al., Cross-class inhibition of the cysteine proteinases cathepsins K, L, and S by the serpin squamous cell carcinoma antigen 1: a The Spotfire functional genomics version 7.3 software (Spotfire, Inc., kinetic analysis, Biochemistry 37 (1998) 5258–5266. MA, USA) tool was used to analyze the data generated from proteases and [13] A.J. Barrett, Bioinformatics of proteases in the MEROPS database, Curr. serpins on the same subset of 25 normal tissue samples: adipose, adrenal Opin. Drug Discov. Dev. 7 (2004) 334–341. gland, artery, bladder, breast, colon, dorsal root ganglion, heart, kidney, [14] N. Kalsheker, S. Morely, K. Morgan, Gene regulation of the serine liver, lung, lymph node, ovary, nerve, pancreas, pituitary gland, prostate, proteinase inhibitors alpha1-antitrypsin and alpha1-antichymotrypsin, spinal cord, skeletal muscle, small intestine, spleen, stomach, thymus, tonsil, Biochem. Soc. Trans. 30 (2002) 93–98. and vein (Supplemental Table 2). The Spotfire Profile search tool was [15] G.X. Zhou, L. Chao, J. Chao, Kallistatin: a novel human tissue kallikrein utilized to search for similar expression profiles between two genes, serpin inhibitor. Purification, characterization, and reactive center sequence, J. and protease, based on the Pearson momentum correlation algorithm. Values Biol. Chem. 267 (1992) 25873–25880. of the correlation coefficient r range from +1 to −1, where +1 indicates [16] M. Laurell, A. Christensson, P.-A. Abrahamsson, J. Stenflo, H. Lilja, highest similarity, while a completely opposite profile would give a −1 r Protein C inhibitor in human body fluids: seminal plasma is rich in value. Expression profiles with identical shape had higher correlation. inhibitor antigen deriving from cells throughout the male reproductive The gene expression profiles of the 25 normal tissues listed above were used to system, J. Clin. Invest. 89 (1992) 1094–1101. search for similarity between serpins and proteases. If the correlation value of two [17] M.C. Strik, et al., Distribution of the human intracellular serpin protease genes was greater than 0.7, we arbitrarily defined the expression of those two genes inhibitor 8 in human tissues, J. Histochem. Cytochem. 50 (2002) as closely correlated. For each serpin, a set of proteases was selected according to 1443–1453. their correlation values and the best matched proteases were investigated further by [18] E. Miranda, D.A. Lomas, Neuroserpin: a serpin to think about, Cell. Mol. evaluating the expression profile in normal and diseased tissues. Life Sci. 63 (2006) 709–722. [19] K. Ozaki, et al., Isolation and characterization of a novel human pancreas- specific gene, pancipin, that is down-regulated in pancreatic cancer cells, Acknowledgments Genes Cancer 23 (1998) 179–185. [20] H. Kato, Expression and function of squamous cell carcinoma antigen, We thank Kristine Burke, Kim Tsui, Helen Mclaughlin, and Anticancer Res. 16 (1996) 2149–2153. the members of the Molecular Technology Department at [21] S. Cataltepe, et al., Co-expression of the squamous cell carcinoma antigens 1 and 2 in normal adult human tissues and squamous cell carcinomas, J. Millennium Pharmaceuticals for their excellent technical Histochem. Cytochem. 48 (2000) 113–123. assistance. We give special thanks to Steve Tirrell and Bill [22] T.R. Hughes, et al., Functional discovery via a compendium of expression Campbell for their continuous support of this work. profiles, Cell 102 (2000) 109–126. [23] J.R. Dahlen, F. Jean, G. Thomas, D.C. Foster, W. Kisiel, Inhibition of Appendix A. Supplementary data soluble recombinant furin by human proteinase inhibitor 8, J. Biol. Chem. 273 (1998) 1851–1854. [24] J.R. Dahlen, D.C. Foster, W. Kisiel, Expression, purification, and Supplementary data associated with this article can be found, inhibitory properties of human proteinase inhibitor, Biochemistry 36 in the online version, at doi:10.1016/j.ygeno.2006.03.017. (1997) 14874–14882. [25] S.S. Molloy, P.A. Bresnahan, S.H. Leppla, K.R. Klimpel, G. Thomas, Human furin is a calcium-dependent serine endoprotease that recognizes References the sequence Arg-X-X-Arg and efficiently cleaves anthrax toxin protective antigen, J. Biol. Chem. 267 (1992) 16396–16402. [1] N.D. Rawlings, D.P. Tolle, A.J. Barrett, Evolutionary families of peptidase [26] N.C. Rockwell, J.W. Thorner, The kindest cuts of all: crystal structures of inhibitors, Biochem. J. 378 (2004) 705–716. Kex2 and furin reveal secrets of precursor processing, Trends Biochem. [2] G.A. Silverman, et al., The serpins are an expanding superfamily of Sci. 30 (2004) 80–87. structurally similar but functionally diverse proteins: evolution, mecha- [27] N.G. Seidah, et al., Testicular expression of PC4 in the rat: molecular nism of inhibition, novel functions, and a revised nomenclature, J. Biol. diversity of a novel germ cell-specific Kex2/subtilisin-like proprotein Chem. 276 (2001) 33293–33296. convertase, Mol. Endocrinol. 6 (1992) 1559–1570. [3] D. van Gent, P. Sharp, K. Morgan, N. Kalsheker, Serpins: structure, [28] S. Basak, M. Chretien, M. Mbikay, A. Basak, In vitro elucidation of substrate function and molecular evolution, Int. J. Biochem. Cell Biol. 35 (2003) specificity and bioassay of proprotein convertase 4 using intramolecularly 1536–1547. quenched fluorogenic peptides, Biochem. J. 380 (2004) 505–514. [4] N.D. Rawlings, D.P. Tolle, A.J. Barrett, MEROPS: the peptidase database, [29] D.E. Bassi, H. Mahloogi, A.J. Klein-Szanto, The proprotein convertases Nucleic Acids Res. 32 (2004) D160–D164. furin and PACE4 play a significant role in tumor progression, Mol. [5] J.A. Huntington, R.J. Read, R.W. Carrell, Structure of a serpin–protease Carcinog. 28 (2000) 63–69. complex shows inhibition by deformation, Nature 407 (2000) 923–926. [30] E. Skrzydlewska, M. Sulkowska, M. Koda, S. Sulkowski, Proteolytic– [6] Z. Sun, P. Yang, Role of imbalance between neutrophil elastase and alpha antiproteolytic balance and its regulation in , World J. 1-antitrypsin in cancer development and progression, Lancet Oncol. 5 Gastroenterol. 11 (2005) 1251–1266. (2004) 182–190. [31] K. Uchida, L.R. Chaudhary, Y. Sugimura, H.D. Adkisson, K.A. Hruska, 184 S. Badola et al. / Genomics 88 (2006) 173–184

Proprotein convertases regulate activity of prostate epithelial cell serine proteases through efficient reactions at two active sites, Biochem- differentiation markers and are modulated in human prostate cancer istry 40 (2001) 15762–15770. cells, J. Cell. Biochem. 88 (2003) 394–399. [43] J. Sumiya, et al., Isolation and characterization of the plasma hyaluronan- [32] J.A. Schalken, et al., fur gene expression as a discriminating marker for binding protein (PHBP) gene (HABP2), J. Biochem. (Tokyo) 123 (1997) small cell and nonsmall cell lung carcinomas, J. Clin. Invest. 80 (1987) 983–990. 1545–1549. [44] N.H. Choi-Miura, et al., Purification and characterization of a novel [33] C.A. Iacobuzio-Donahue, et al., Highly expressed genes in pancreatic hyaluronan-binding protein (PHBP) from human plasma: it has three EGF, ductal adenocarcinomas: a comprehensive characterization and compari- a kringle and a serine protease domain, similar to hepatocyte growth factor son of the transcription profiles obtained from three major technologies, activator, J. Biochem. (Tokyo) 119 (1996) 1157–1165. Cancer Res. 63 (2003) 8614–8623. [45] N.H. Choi-Miura, M. Yoda, K. Saito, K. Takahashi, M. Tomita, [34] L.J. Underwood, et al., Ovarian tumor cells express a novel multi-domain Identification of the substrates for plasma hyaluronan binding protein, cell surface serine protease, Biochim. Biophys. Acta 1502 (2000) 337–350. Biol. Pharm. Bull. 24 (2001) 140–143. [35] J.A. Huntington, M. Kjellberg, J. Stenflo, Crystal structure of protein C [46] K.K. Wang, et al., Novel candidate tumor marker genes for lung inhibitor provides insights into hormone binding and heparin activation, adenocarcinoma, Oncogene 21 (2002) 7598–7604. Structure 11 (2003) 205–215. [47] H.K. Lee, A.K. Hsu, J. Sajdak, J. Qin, P. Pavlidis, Coexpression analysis of [36] L. Yang, C. Manithody, T.D. Walston, S.T. Cooper, A.R. Rezaie, human genes across many microarray data sets, Genome Res. 14 (2004) Thrombomodulin enhances the reactivity of thrombin with protein C 1085–1094. inhibitor by providing both a for the serpin and allosterically [48] V.K. Mootha, et al., PGC-1alpha-responsive genes involved in oxidative modulating the activity of thrombin, J. Biol. Chem. 278 (2003) phosphorylation are coordinately downregulated in human diabetes, Nat. 37465–37470. Genet. 34 (2003) 267–273. [37] U. Seifert, et al., An essential role for tripeptidyl peptidase in the [49] J. Burke, D. Davison, W. Hide, d2_cluster: a validated method for generation of an MHC class I epitope, Nat. Immunol. 4 (2003) 375–379. clustering EST and full-length cDNA sequences, Genome Res. 9 (1999) [38] B. Tomkinson, A.K. Jonsson, Characterization of cDNA for human 1135–1142. tripeptidyl peptidase II: the N-terminal part of the enzyme is similar to [50] V.V. Solovyev, A.A. Salamov, C.B. Lawrence, Identification of human subtilisin, Biochemistry 30 (1991) 168–174. gene structure using linear discriminant functions and dynamic program- [39] R.M. Balow, B. Tomkinson, U. Ragnarsson, O. Zetterqvist, Purification, ming, Proc. Int. Conf. Intell. Syst. Mol. Biol. 3 (1995) 367–375. substrate specificity, and classification of tripeptidyl peptidase II, J. Biol. [51] C. Burge, S. Karlin, Prediction of complete gene structures in human Chem. 261 (1986) 2409–2417. genomic DNA, J. Mol. Biol. 268 (1997) 78–94. [40] E. Remold-O'Donnell, J.C. Nixon, R.M. Rose, Elastase inhibitor: [52] S.R. Eddy, G. Mitchison, R. Durbin, Maximum discrimination hidden characterization of the human elastase inhibitor molecule associated with Markov models of sequence consensus, J. Comput. Biol. 2 (1995) 9–23. monocytes, macrophages, and neutrophils, J. Exp. Med. 169 (1989) [53] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, D.J. Lipman, Basic local 1071–1086. alignment search tool, J. Mol. Biol. 215 (1990) 403–410. [41] E. Remold-O'Donnell, J. Chin, M. Alberts, Sequence and molecular [54] B. Ewing, P. Green, Base-calling of automated sequencer traces using characterization of human monocyte/neutrophil elastase inhibitor, Proc. phred. II. Error probabilities, Genome Res. 8 (1998) 186–194. Natl. Acad. Sci. U. S. A. 89 (1992) 5635–5639. [55] K.J. Livak, T.D. Schmittgen, Analysis of relative gene expression data [42] J. Cooley, T.K. Takayama, S.D. Shapiro, N.M. Schechter, E. Remold- using real-time quantitative PCR and the 2(−Delta Delta C(T)) Method, O'Donnell, The serpin MNEI inhibits elastase-like and chymotrypsin-like Methods (Duluth) 25 (2001) 402–408.