Published OnlineFirst January 9, 2008; DOI: 10.1158/1535-7163.MCT-07-0565


Early detection of ovarian cancer using group biomarkers

Alain B. Tchagang,1 Ahmed H. Tewfik,2 Melissa S. DeRycke,3 Keith M. Skubitz,4 and Amy P.N. Skubitz3

Departments of 1Biomedical Engineering, 2Electrical and Computer Engineering, 3Laboratory Medicine and Pathology, and 4Medicine, University of Minnesota, Minneapolis, Minnesota

Abstract

One reason that ovarian cancer is such a deadly disease is that it is not usually diagnosed until it has reached an advanced stage. In this study, we developed a novel algorithm for group biomarkers identification using gene expression data. Group biomarkers consist of coregulated genes across normal and different stage diseased tissues. Unlike prior sets of biomarkers identified by statistical methods, genes in group biomarkers are potentially involved in pathways related to different types of cancer development. They may serve as an alternative to the traditional single biomarkers or combination of biomarkers used for the diagnosis of early-stage and/or recurrent ovarian cancer. We extracted group biomarkers by applying biclustering algorithms that we recently developed on the gene expression data of over 400 normal, cancerous, and diseased tissues. We identified several groups of coregulated genes that encode for secreted proteins and exhibit expression levels in ovarian cancer that are at least 2-fold (in log2 scale) higher than in normal ovary and nonovarian tissues. In particular, three candidate group biomarkers exhibited a conserved biological pattern that may be used for early detection or recurrence of ovarian cancer with specificity greater than 99% and sensitivity equal to 100%. We validated these group biomarkers using publicly available gene expression data sets downloaded from a NIH Web site (http://www.ncbi.nlm.nih.gov/geo). Statistical analysis showed that our methodology identified an optimum combination of genes that have the highest effect on the diagnosis of the disease compared with several computational techniques that we tested. Our study also suggests that single or group biomarkers correlate with the stage of the disease. [Mol Cancer Ther 2008;7(1):27–37]

Received 8/15/07; revised 11/2/07; accepted 12/4/07.

Grant support: University of Minnesota Graduate Program Grant-in-Aid of Research, Artistry, and Scholarship Program and NIH/National Cancer Institute grant NIH R01CA106878 (A.P.N. Skubitz).

The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked advertisement in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Requests for reprints: Alain B. Tchagang, Department of Biomedical Engineering, University of Minnesota, 312 Church Street Southeast, Minneapolis, MN 55455. Phone: 612-624-5285. E-mail: [email protected]

Copyright © 2008 American Association for Cancer Research. doi:10.1158/1535-7163.MCT-07-0565

Introduction

Epithelial ovarian cancer is the most lethal form of gynecologic cancer and the fourth leading cause of cancer death among women in developed countries, claiming about 15,000 lives in the United States each year (1). One reason it is so deadly is that ovarian cancer is not usually diagnosed until it has reached an advanced stage. Early detection can help prolong or save lives, but clinicians currently have no specific and sensitive screening method and the disease displays very subtle symptoms (2). The well-known CA-125 blood test and imaging techniques, such as ultrasound and computed tomographic scan, or the combination of the CA-125 blood test with one of these imaging techniques, are useful for tracking patients already diagnosed with ovarian cancer but have not proven sensitive enough to be used as an early diagnostic test (3).

In recent years, large-scale gene expression analyses have been done to identify differentially expressed genes in ovarian carcinoma (4–11). A common goal of these studies was to identify potential tumor markers for the diagnosis of early-stage ovarian cancer as well as to use these markers as targets for improved therapy and treatment of the disease during all stages.

Numerous computational tools have been developed to analyze gene expression data for biomarker discovery (12–19). Most focus on differential gene expression, which is tested by a simple calculation of the fold changes by t test, F test, scoring methods (12), or cluster analysis (13). Many other computational techniques based on a supervised learning approach have also been developed (e.g., support vector machine, ref. 15; naive Bayes method and Fisher discriminant analysis, refs. 16–19). Although most of these approaches have been successful in uncovering interesting patterns that can be used to discriminate between healthy and diseased tissues, computational tools for the identification of potential blood biomarkers are still not well developed or do not take into account all of the input variables. Most approaches only compare healthy and diseased tissues of the corresponding disease and do not take into consideration other tissues in the body that may produce the same protein as the diseased tissue. Therefore, potential biomarkers identified using these approaches may introduce false positives in a diagnostic blood test.

In this study, we first used a computational tool that we recently developed (20) to identify single biomarkers. Then, we used a second novel algorithm that we recently developed to identify group biomarkers (20–22). Group biomarkers correspond to a set of single biomarkers that exhibit coherent behavior across an ordered set of ovarian

Mol Cancer Ther 2008;7(1). January 2008

Downloaded from mct.aacrjournals.org on September 29, 2021. © 2008 American Association for Cancer Research. Published OnlineFirst January 9, 2008; DOI: 10.1158/1535-7163.MCT-07-0565


cancer tissue samples, representing distinct stages of the disease. That is, their expression level increases or decreases coherently during the progression of the disease. This unique pattern shows a correlation or coregulation among the set of genes that belong to the same group biomarkers, suggesting that they respond similarly to the same environmental conditions. Prior studies on different organisms have examined several biclusters of coregulated genes and showed that the genes in a given bicluster typically participate in a single pathway (23).

Our methodology for identifying single or group biomarkers is based on unifying techniques that are well understood and developed in the literature: gene expression data analysis, biclustering algorithms, and receiver operating characteristics (ROC) curves. Furthermore, our approach for identifying blood biomarkers is based on the observation that, if we are looking for biomolecular patterns in the blood that are caused by ovarian cancer, those patterns should only be present in the gene expression data of ovarian cancer tissue samples compared with the gene expression data of normal ovary tissue samples or any nonovarian healthy or diseased tissue samples.

We implemented our approach using the computer program Matlab and applied it to a comprehensive set of well-defined gene expression data corresponding to normal ovary, ovarian cancer, and nonovarian tissue samples. We identified three candidate group biomarkers that encode for secreted proteins, membrane proteins, and/or extracellular matrix proteins. These three candidate group biomarkers clearly discriminate between the sample sets, and they are promising candidates to be used for early detection or recurrence of ovarian cancer using a blood test. Statistical analysis showed that these group biomarkers have a much better detection performance than single biomarkers and combinations of biomarkers identified using other computational approaches. Our data also suggest that single or group biomarkers correlate with the stage of the disease.

Materials and Methods

Tissue Samples

Table 1 lists all of the tissue samples used in this study. They can be classified into four different sample sets: normal ovary, ovarian cancer, normal nonovarian, and diseased nonovarian. Normal ovaries were obtained from 62 women. Seven borderline ovarian tumors were obtained; these tumors are considered to be of low malignant potential and were not staged. Next, we obtained tissue samples of stage III or IV serous epithelial ovarian cancer derived from two different sites: 22 from the ovary itself and 16 from the omentum. Tissues were ranked from normal to low malignancy to highly malignant as follows: normal ovaries, borderline ovarian tumors, primary serous epithelial ovarian tumors present in ovarian tissues, and serous epithelial ovarian tumors present in omental tissues. None of the patients had been treated with chemotherapy before surgical resection of the tissues (10, 11).

Tissues were obtained from the University of Minnesota Cancer Center Tissue Procurement Facility on approval by the University of Minnesota Institutional Review Board. Tissue Procurement Facility employees obtained signed consent from each patient, allowing procurement of excess waste tissue and access to medical records. Bulk tumor and normal tissues were identified, dissected, and snap frozen in liquid nitrogen within 15 to 30 min of resection from the patient. Tissue sections were made from each sample, stained with H&E, and examined independently by two pathologists to confirm the pathological state of each sample. The integrity of the RNA was verified before use in gene array experiments (10, 11).

Gene Expression Matrix

The gene expression data were determined by Gene Logic using the Affymetrix GeneChip HG_U95A, which contains 12,651 known genes and 48,000 expressed sequence tags. The gene expression data were normalized using Affymetrix M.A.S. 4.0.1, and the log-floor data transform with a floor value of 1 was done (24). After this process, the data ranged from 0 to 4. The data were then organized into three matrices defined as follows: matrix A is a 12,651 × 62 matrix that represents the gene expression of the 62 normal ovary tissue samples; matrix B = [B1 B2 B3] is a 12,651 × 45 matrix that represents the gene expression of the 45 ovarian cancer tissue samples; submatrix B1 is a 12,651 × 7 matrix representing the gene expression of the 7 borderline ovarian cancer tissues; submatrix B2 is a 12,651 × 22 matrix representing the gene expression of the 22 papillary serous adenocarcinoma tumors; submatrix B3 is a 12,651 × 16 matrix representing the gene expression of the 16 omentum papillary serous adenocarcinomas; and matrix C is a 12,651 × 319 matrix that represents the gene expression of the 319 nonovarian tissues.

Identification of Single Biomarkers in Ovarian Carcinoma

Biomarkers specific for ovarian cancer should be highly expressed in ovarian cancer samples and low or absent in other samples, including normal ovaries and nonovarian tissues. Mathematically, they should correspond to the set of genes that are up-regulated in ovarian cancer tissue samples compared with normal ovary tissue samples and each set of nonovarian tissue samples. In this study, we assume that for a given gene to correspond to a potential biomarker it should be at least 2-fold (log2 scale) up-regulated in ovarian cancer tissue samples compared with normal ovary tissue samples and each set of nonovarian tissue samples [that is, log2(y / x) ≥ 2 and log2(y / z) ≥ 2, where x, y, and z correspond to the expression level of a gene in the healthy ovary, ovarian cancer, and nonovarian data, respectively]. Also, the corresponding gene must exhibit a sensitivity of ≥90% for a specificity of ≥90%, with accuracy of ≥90%. Identification of such a pattern is done in this study using the combination of the Robust Biclustering Algorithm (20) and the ROC approach we will define here. To develop a diagnostic assay for the detection of ovarian cancer using a blood test, biomarkers should also correspond to genes that encode for predicted secreted


proteins, membrane proteins, and/or extracellular matrix proteins. These types of proteins are more likely to be present in the blood than proteins localized to the cell nucleus or cytoplasm.

Table 1. Tissue samples used to generate gene expression data

| Tissue samples | No. samples | Age (y), mean (range) |
|---|---|---|
| Normal ovarian tissues | | |
| Normal ovary | 62 | 51 (28–79) |
| Ovarian cancer tissues | | |
| Borderline ovarian cancer | 7 | 51 (25–81) |
| Papillary serous adenocarcinoma | 22 | 58 (29–79) |
| Omentum; papillary serous adenocarcinoma | 16 | 57 (29–79) |
| Normal nonovarian tissues | | |
| Adipose | 13 | 52 (14–86) |
| Cervix | 17 | 50 (34–62) |
| Colon | 16 | 57 (24–87) |
| Kidney | 12 | 60 (38–89) |
| Liver | 14 | 50 (22–90) |
| Lung | 18 | 55 (32–76) |
| Myometrium | 90 | 50 (14–84) |
| Skeletal muscle | 10 | 40 (14–75) |
| Small intestine | 10 | 62 (20–83) |
| Uterus | 17 | 46 (30–73) |
| Diseased nonovarian tissues | | |
| Degenerative surface of bone | 18 | 63 (43–85) |
| Kidney clear cell adenocarcinoma | 3 | 79 (67–89) |
| Gallbladder with chronic inflammation | 14 | 35 (12–68) |
| Liver fibrosis | 8 | 51 (33–67) |
| Myometrium leiomyoma | 33 | 47 (26–87) |
| Tonsils with lymphoid hyperplasia | 26 | 21 (10–42) |

Biclustering Approach

We used the Robust Biclustering Algorithm that we recently developed (20) to identify biclusters with constant values. Consider the gene expression matrices A, B, and C defined above, with set of rows or genes $G = \{g_1, g_2, \ldots, g_N\}$ and sets of conditions or tissue samples $S_A = \{s_1^A, s_2^A, \ldots, s_{M_1}^A\}$, $S_B = \{s_1^B, s_2^B, \ldots, s_{M_2}^B\}$, and $S_C = \{s_1^C, s_2^C, \ldots, s_{M_3}^C\}$, respectively. We define a bicluster with constant values, that is, a subset of genes whose expression level stays constant across a subset of conditions or tissue samples, as $M_k^A = \{I_k^A, J_k^A\}$, $M_l^B = \{I_l^B, J_l^B\}$, and $M_m^C = \{I_m^C, J_m^C\}$ or as submatrices $M_k^A = [M_k^A(i,j)]$, $M_l^B = [M_l^B(i,j)]$, and $M_m^C = [M_m^C(i,j)]$ of A, B, and C, respectively. The I's correspond to subsets of the genes G, the J's correspond to subsets of the tissue samples $S_A$, $S_B$, or $S_C$, and M(i,j) corresponds to the expression level of the ith gene under the jth condition, with $i \in I$ and $j \in J$. Identification of potential biomarkers can be done using Eq. (1) below, with $1 \le k \le N_A$, $1 \le l \le N_B$, and $1 \le m \le N_C$:

$$I = \bigcup_{k,l,m} \left( I_k^A \cap I_l^B \cap I_m^C \right) \tag{1}$$

In Eq. (1), $N_A$ corresponds to the number of biclusters $M_k^A = \{I_k^A, J_k^A\}$ with constant values x in the healthy ovary tissue samples data set, $N_B$ corresponds to the number of biclusters $M_l^B = \{I_l^B, J_l^B\}$ with constant values y (y >> x) in the ovarian cancer tissue samples data set, and $N_C$ corresponds to the number of biclusters $M_m^C = \{I_m^C, J_m^C\}$ with constant values z (z << y) in the nonovarian tissue sample data set.

Here, we considered each set of ovarian cancer tissue data separately, and the expression level of the gene considered in the ovarian cancer tissue samples should be at least 2-fold (log2 scale) greater than the expression level of the same gene in normal ovary tissue samples and in each set of nonovarian tissue samples. Ideally, when dealing with blood biomarkers, we would like x = z = 0.

The statistical performance of a given bicluster M = [M(i,j)] with constant value was then evaluated using the following equation: for all rows M(i,:) of M = [M(i,j)],

$$\max\big(M(i,:)\big) - \min\big(M(i,:)\big) \le \delta \tag{2}$$

with $\delta \to 0$ (that is, δ is a small positive real number).

ROC Approach

Given the gene expression data as defined above, the ROC approach first assumes that all genes correspond to potential biomarkers. Then, it uses the following criterion, based on the detection performance exhibited by their respective ROC curves, to select the ones with high specificity corresponding to high sensitivity and high accuracy.

For a given screening cutoff point, let a be the number of healthy ovary and nonovarian tissue samples (healthy and diseased) that screen positive, b the number of ovarian cancer tissue samples that screen positive, c the number of healthy ovary and nonovarian tissue samples that screen negative, and d the number of ovarian cancer tissue samples that screen negative. The sensitivity of a potential blood biomarker (Se) is the number of ovarian cancer tissue samples that screen positive divided by the total number of ovarian cancer tissue samples: Se = b / (b + d). The specificity of a potential blood biomarker (Sp) is the number of healthy ovary and nonovarian tissue samples that screen negative divided by the total number of healthy ovary and nonovarian tissue samples: Sp = c / (c + a). Using these variables, we compute the ROC function of each potential blood biomarker using the following equation: Se = f(1 - Sp). Basically, Se = f(1 - Sp) describes the relationship between the true-positive rate (sensitivity) and the false-positive rate (1 - specificity) for different screening cutoff points. Finally, the ROC methodology keeps all genes capable of achieving a specificity, a corresponding sensitivity, and an accuracy greater than the specified thresholds. The resultant family of genes will correspond to biomarkers that may then be evaluated for their use in the detection of ovarian cancer using a blood test.

The P of each identified single biomarker, that is, the probability of observing the given result, or one more extreme, by chance if the null hypothesis is true, was estimated using a two-sided test.
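The ROC selection step described above can be illustrated with a minimal Python sketch. It is not the authors' Matlab implementation. The function names, the rule that a sample screens positive when its expression level is at or above the cutoff, and the accuracy formula (b + c) / (a + b + c + d) are assumptions made for illustration; the text defines only Se and Sp explicitly.

```python
def confusion_counts(cancer, reference, cutoff):
    """Return (a, b, c, d) as defined in the text for one gene and one cutoff.

    Assumption of this sketch: a sample screens positive when its
    expression level is >= cutoff.
    """
    b = sum(1 for v in cancer if v >= cutoff)      # cancer samples, positive
    d = len(cancer) - b                            # cancer samples, negative
    a = sum(1 for v in reference if v >= cutoff)   # non-cancer samples, positive
    c = len(reference) - a                         # non-cancer samples, negative
    return a, b, c, d


def roc_points(cancer, reference):
    """List (cutoff, Se, Sp, accuracy) for every observed expression level.

    Both sample lists are assumed to be non-empty.
    """
    points = []
    for cutoff in sorted(set(cancer) | set(reference)):
        a, b, c, d = confusion_counts(cancer, reference, cutoff)
        se = b / (b + d)                   # sensitivity, Se = b / (b + d)
        sp = c / (c + a)                   # specificity, Sp = c / (c + a)
        acc = (b + c) / (a + b + c + d)    # accuracy (assumed definition)
        points.append((cutoff, se, sp, acc))
    return points


def passes_roc_criterion(cancer, reference, threshold=0.90):
    """True if some cutoff reaches Se, Sp, and accuracy >= threshold."""
    return any(se >= threshold and sp >= threshold and acc >= threshold
               for _, se, sp, acc in roc_points(cancer, reference))
```

Here `cancer` would hold one gene's expression levels across the ovarian cancer samples of a given stage, and `reference` the levels of the same gene across the normal ovary and nonovarian samples.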


Table 2. Fifty-four genes identified by the single biomarker algorithm to be up-regulated in ovarian cancer tissue samples compared with normal ovary tissue samples and nonovarian tissue samples

| Fragment name | Gene name | Known gene symbol | Fold change, P (ovarian cancer borderline / primary / omentum columns; only reported pairs listed) |
|---|---|---|---|
| 33454_at | Agrin | AGRN | 2.2, 5.2e-07; 2.4, 5.1e-21; 2.3, 5.5e-16 |
| 757_at | Annexin A2, Annexin A2 pseudogene 2 | ANXA2 | 3.0, 3.8e-04; 3.1, 1.6e-07 |
| 35099_at | Apolipoprotein L1 | APOL1 | 2.1, 1.9e-06 |
| 2011_s_at | BCL2-interacting killer (apoptosis-inducing) | (BIK, KIAA1654) | 4.7, 8.3e-13 |
| 35822_at | B-factor properdin | BF* | 6.0, 3.7e-15 |
| 41534_at | BH-protocadherin (brain-heart) | PCDH7 | 2.0, 1.7e-04 |
| 1620_at | Cadherin 6, type 2, K-cadherin (fetal kidney) | CDH6 | 2.6, 3.1e-03; 4.4, 6.2e-15; 5.2, 1.8e-18 |
| 41660_at | Cadherin, EGF LAG seven-pass G-type receptor 1 (flamingo homologue, Drosophila) | CELSR1* | 3.5, 2.5e-04; 3.8, 5.4e-10; 3.8, 9.0e-09 |
| 36499_at | Cadherin, EGF LAG seven-pass G-type receptor 2 (flamingo homologue, Drosophila) | CELSR2* | 2.2, 2.0e-10; 2.0, 4.7e-19; 2.0, 3.8e-19 |
| 37890_at | CD47 antigen (Rh-related antigen, integrin-associated signal transducer) | CD47 | 2.1, 9.0e-06; 2.4, 1.1e-21; 2.7, 6.5e-22 |
| 39008_at | Ceruloplasmin (ferroxidase) | CP | 4.1, 7.4e-12 |
| 431_at | Chemokine (C-X-C motif) ligand 10 | CXCL10 | 3.1, 5.12e-15 |
| 36197_at | Chitinase 3-like 1 (cartilage glycoprotein-39) | CHI3L1 | 5.2, 2.7e-12 |
| 33904_at | Claudin 3 | CLDN3 | 6.5, 1.7e-15; 6.6, 1.0e-11 |
| 35276_at | Claudin 4 | CLDN4 | 5.9, 1.6e-16; 5.5, 1.6e-11 |
| 38482_at | Claudin 7 | CLDN7 | 5.1, 1.5e-10 |
| 37534_at | Coxsackie virus and adenovirus receptor | CXADR | 4.6, 5.0e-07 |
| 35453_at | Dermatan sulfate proteoglycan 3 | DSPG3 | 2.8, 3.7e-16 |
| 36643_at | Discoidin domain receptor family, member 1 | DDR1 | 2.1, 7.2e-06; 2.4, 5.3e-18; 2.2, 7.6e-12 |
| 1007_s_at | Discoidin domain receptor family, member 1 | DDR1 | 2.1, 2.5e-09; 2.1, 2.0e-28; 2.1, 1.6e-18 |
| 41586_at | Fibroblast growth factor 18 | FGF18 | 2.9, 1.7e-05 |
| 41587_g_at | Fibroblast growth factor 18 | FGF18 | 3.7, 1.6e-16 |
| 534_s_at | Folate receptor 1 (adult) | FOLR1 | 2.5, 2.7e-05; 2.6, 4.9e-09 |
| 821_s_at | Folate receptor 1 (adult) | FOLR1 | 2.4, 8.0e-11; 3.1, 7.6e-13 |
| 38749_at | G protein-coupled receptor 39, LY6/PLAUR domain containing 1 | GPR39, LYPDC1* | 6.0, 3.1e-27; 5.8, 7.3e-54; 5.6, 8.6e-51 |
| 406_at | Integrin β4 | ITGB4, (A) | 3.2, 1.4e-07; 2.9, 7.4e-13; 2.4, 1.5e-06 |
| 37554_at | Kallikrein 6 (neurosin, zyme) | KLK6 | 5.2, 2.9e-19 |
| 38143_at | Kallikrein 7 (chymotryptic, stratum corneum) | KLK7, (C) | 3.3, 2.1e-03; 4.4, 4.1e-04; 4.7, 1.4e-04 |
| 37131_at | Kallikrein 8 (neuropsin/ovasin) | KLK8, (C) | 5.4, 5.4e-20; 6.1, 1.6e-70; 6.2, 1.5e-71 |
| 36838_at | Kallikrein 10 | KLK10 | 2.6, 3.1e-16; 3.0, 1.5e-14 |
| 40035_at | Kallikrein 11 | KLK11 | 4.7, 9.9e-15; 5.2, 1.3e-13 |
| 36929_at | Laminin β3 | LAMB3 | 3.7, 1.9e-11 |
| 35280_at | Laminin γ2 | LAMC2* | 5.3, 1.6e-08; 5.0, 2.6e-28 |
| 39583_at | Leucine-rich repeat neuronal 5 | LRRN5 | 2.4, 1.1e-03 |
| 32821_at | Lipocalin 2 (oncogene 24p3) | LCN2*, (A) | 6.3, 2.8e-08; 5.1, 5.7e-13; 4.9, 6.6e-10 |
| 40093_at | Lutheran blood group (Auberger b antigen included) | LU | 2.6, 3.5e-14; 2.8, 1.1e-14 |
| 32072_at | Mesothelin | MSLN, (B), (C) | 3.4, 2.5e-05; 4.1, 2.9e-19; 4.5, 1.5e-17 |
| 38784_g_at | Mucin 1, transmembrane | MUC1, (B) | 4.9, 4.7e-13; 4.8, 2.8e-26; 4.6, 1.4e-16 |
| 927_s_at | Mucin 1, transmembrane | MUC1 | 4.5, 5.7e-07; 4.9, 7.5e-18; 4.1, 1.5e-12 |
| 38783_at | Mucin 1, transmembrane | MUC1 | 6.2, 7.5e-07; 6.5, 6.6e-16; 5.9, 5.5e-12 |
| 1083_s_at | Mucin 1, transmembrane | MUC1 | 3.6, 3.6e-06; 4.1, 2.3e-17; 3.9, 5.9e-13 |
| 35912_at | Mucin 4, tracheobronchial | MUC4 | 3.1, 7.9e-05 |
| 32625_at | Natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A) | NPR1 | 2.4, 1.4e-06 |
| 33483_at | Neuromedin U | NMU | 4.3, 3.2e-19 |
| 35663_at | Neuronal pentraxin II | NPTX2 | 2.0, 1.9e-07 |
| 1985_s_at | Nonmetastatic cells 1, protein (NM23A) expressed in; nonmetastatic cells 2, protein (NM23B) | (NME1, NME2) | 2.1, 1.6e-11 |
| 33783_at | Plexin B1 | PLXNB1* | 2.3, 1.7e-04; 2.9, 2.2e-12; 2.8, 6.9e-10 |
| 34780_at | Plexin B2 | PLXNB2 | 2.1, 9.4e-08 |
| 41106_at | Potassium intermediate/small conductance calcium-activated channel, subfamily N, member 4 | KCNN4 | 4.7, 9.3e-07 |
| 41470_at | Prominin 1 | PROM1 | 3.8, 7.2e-05; 3.4, 3.0e-04 |
| 32275_at | Secretory leukocyte protease inhibitor (antileukoproteinase) | SLPI* | 4.2, 1.3e-04; 4.2, 7.6e-11; 4.1, 2.3e-08 |
| 39075_at | Sialidase 1 (lysosomal sialidase) | NEU1 | 2.5, 4.5e-06 |
| 35207_at | Sodium channel, non-voltage-gated 1α | SCNN1A* | 5.6, 3.7e-06; 6.0, 2.9e-17; 6.2, 6.7e-14 |
| 36609_at | Solute carrier family 1 (glial high-affinity glutamate transporter), member 3 | (DKFZP547J0410, SLC1A3) | 2.2, 1.6e-08 |
| 35277_at | Spondin 1, extracellular matrix protein | SPON1 | 2.2, 5.3e-10 |
| 575_s_at | Tumor-associated calcium signal transducer 1 | TACSTD1* | 5.4, 4.4e-05; 5.6, 3.5e-13; 5.4, 1.1e-09 |
| 291_s_at | Tumor-associated calcium signal transducer 2 | TACSTD2* | 4.7, 2.4e-06 |
| 33218_at | V-erb-b2 erythroblastic leukemia viral oncogene homologue 2, neuro/glioblastoma-derived oncogene homologue (avian) | ERBB2 | 2.0, 2.7e-05; 2.1, 8.5e-16 |
| 33933_at | WAP four-disulfide core domain 2 | WFDC2, (B) | 4.8, 7.9e-06; 5.3, 1.5e-17; 5.2, 2.1e-13 |
| 1887_g_at | Wingless-type MMTV integration site family, member 7A | WNT7A*, (A) | 3.0, 7.8e-22; 2.3, 7.1e-22; 3.4, 2.2e-33 |

NOTE: "(A)," "(B)," and "(C)" mark genes that belong to group biomarkers "A," "B," and "C," respectively. Fold change is relative to normal ovary tissues; P values are relative to normal ovary tissues and nonovarian tissues. Selection criteria: up-regulated in ovarian cancer tissue samples at least 2-fold (in log2 scale) compared with normal ovary tissue samples and each set of nonovarian tissue samples; specificity greater than or equal to 90% corresponding to sensitivity greater than or equal to 90%; genes code for proteins that are secreted, extracellular, or membranous.
*Genes not previously linked in the literature to ovarian cancer.
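The fold-change part of the selection criteria in the note above, together with the constancy check of Eq. (2), can be sketched in Python as follows. This is an illustrative simplification and not the Robust Biclustering Algorithm of ref. 20: it checks constancy across all samples of a set rather than searching for subsets, it uses the mean of each set as the representative levels x, y, and z, and it adds a small `eps` to avoid dividing by zero. All function and parameter names are hypothetical.

```python
import math
from statistics import mean


def is_constant_row(values, delta):
    """Eq. (2): a row counts as constant when max - min <= delta."""
    return max(values) - min(values) <= delta


def log2_fold(y, ref, eps=1e-6):
    """log2(y / ref); eps is an assumed guard against a zero reference level."""
    return math.log2((y + eps) / (ref + eps))


def is_single_biomarker_candidate(cancer, normal_ovary, nonovarian_sets,
                                  min_log2_fold=2.0, delta=1.0):
    """Check one gene against the fold-change and constancy criteria.

    cancer          -- expression levels in one set of ovarian cancer samples
    normal_ovary    -- expression levels in the normal ovary samples
    nonovarian_sets -- list of lists, one per nonovarian tissue type
    """
    if not is_constant_row(cancer, delta):
        return False
    y = mean(cancer)  # representative cancer level (assumption: the mean)
    for ref in [normal_ovary] + list(nonovarian_sets):
        if not is_constant_row(ref, delta):
            return False
        # log2(y / x) >= 2 and log2(y / z) >= 2, as in the text
        if log2_fold(y, mean(ref)) < min_log2_fold:
            return False
    return True
```

A gene passing this filter would still need to meet the ROC criterion and encode a secreted, membrane, or extracellular protein before being listed as a candidate.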

Identification of Group Biomarkers in Ovarian Carcinoma

Identification of group biomarkers was done using a randomly selected set of 40 of the 62 normal ovary tissues and 30 of the 45 ovarian cancer tissues (5 of the 7 borderline ovarian cancer tissues, 15 of the 22 papillary serous adenocarcinoma tumors, and 10 of the 16 omentum papillary serous adenocarcinoma metastases). Briefly, we applied Eqs. (1) and (2) and the ROC approach on the randomly selected set of data to uncover potential single biomarkers. Then, the gene expression data of the single biomarkers identified were sorted according to the progression of the disease. Given that we only had three different stages (normal ovary, borderline ovarian cancer, and


primary ovarian cancer) and two different sites of ovarian cancer (ovary and the omentum), the stages were repeated periodically every three samples. Data were organized as D = [D1 D2 D3 D1 D2 D3 ... D1 D2 D3], where D1 is a column vector representing the expression level of one of the 40 randomly selected normal ovary tissues, D2 is a column vector representing the expression level of one of the 5 randomly selected borderline ovarian tissues, and D3 is one of the 15 randomly selected papillary serous adenocarcinomas or one of the 10 randomly selected omentum papillary serous adenocarcinoma metastases. Also, because we only had 5 randomly selected borderline samples, the maximum number of columns or tissue samples in D that we could have was 15. We therefore produced several D matrices, which used the same borderline data. Different matrices had different combinations of randomly selected normal ovary tissues and papillary serous adenocarcinoma of ovarian tumors or omentum papillary serous adenocarcinoma ovarian tumors. In all, we examined 8 such matrices using the order preserving biclustering algorithm of Tchagang and Tewfik (21, 22) and retained the genes that appeared as many times as possible in the same bicluster. The order preserving biclustering algorithm has the advantage of being insensitive to the relative position of each tissue of a given kind. That is, it produces the same output under any permutation of the positions of the normal ovary tissues, papillary serous adenocarcinoma of ovarian tumors, or omentum papillary serous adenocarcinoma of ovarian tumors within each group of tissues (positions D1, D2, or D3).

In our problem, the bicluster identification step of refs. 21, 22 consists of two substeps. In the first substep, the procedure enumerates all combinations of K tissues, where K ≥ Kmin, the prespecified minimum number of tissues in a valid bicluster, from the given M_D tissues in matrix D that could potentially appear in a valid bicluster. For each subset of K tissues, it then uses a row sort procedure that allows us to focus on the coherent evolutions of gene expression levels rather than the raw or processed expression levels. The output of this step is a matrix that contains the rank of each of the K tissues for each row (gene) when the expression levels at each tissue for the given gene are ordered in a nondecreasing manner. This matrix is referred to as the "tissue rank matrix" and is used as the input to the main bicluster identification routine (22). In the second substep, the main bicluster identification routine identifies all valid coherent evolution patterns involving all genes and a set of K tissues simultaneously through a fast row sorting procedure. Note that this allows the algorithm to identify all the possible valid biclusters without an exhaustive enumeration of all possible K! permutations of the K tissues. The procedure will also yield biclusters of genes where a subset of genes are coherently up-regulated and another subset coherently down-regulated across the K tissues. A final pruning step eliminates all biclusters that are completely included in larger ones (22).

The statistical significance of each identified group biomarker with G genes is assessed using Eq. (3), that is, the upper bound of the tail probability that a random data set of size I × J will contain an order preserving bicluster with G or more genes in it (21).

$$Z(J, G) = J! \sum_{i=G}^{I} \binom{I}{i} \left(\frac{1}{J!}\right)^{i} \left(1 - \frac{1}{J!}\right)^{I-i} \tag{3}$$

As long as that upper bound probability is smaller than any desired significance level, the identified group biomarker will be statistically significant.

Results

Single Biomarker Algorithm

Using Eqs. (1) and (2) with δ < 1 and the ROC approach, we identified 54 genes that are up-regulated in ovarian cancer tissue samples at least 2-fold (log2 scale) compared with normal ovary tissue samples and each set of nonovarian healthy and diseased tissue samples used in this study (Table 2). The 54 genes achieved a specificity greater than or equal to 90% corresponding to a sensitivity greater than or equal to 90% using the ROC approach. The 54 genes encode for predicted secreted proteins, membrane proteins, and/or extracellular proteins (footnote 5: http://ca.expasy.org/sprot/). Therefore, they represent proteins that have the potential to be present in the blood of ovarian cancer patients and may prove useful in an ovarian cancer diagnostic blood test.

We analyzed each stage of ovarian cancer separately. From our analysis and as shown in the following sections, we found that many of the ovarian cancer biomarkers correlated with the stage of ovarian cancer. That is, some biomarkers did very well on some stages of ovarian cancer but not as well on others. For example, chemokine (C-X-C motif) ligand 10 and chitinase 3-like 1 were found to be up-regulated in the omental metastases but were not up-regulated in the borderline ovarian cancer or primary ovarian cancer tissue relative to normal ovary (Table 2).

The genes listed in Table 2 include most of the potential biomarkers uncovered by previous studies (4–11) as well as an additional 13 that do not appear to have been mentioned in the literature before as potential ovarian cancer biomarkers (indicated by an asterisk). For example, the G protein-coupled receptor 39, LY6/PLAUR domain containing 1 (GPR39, LYPDC1) corresponds to a gene that has not been mentioned in the literature before as a potential ovarian cancer biomarker. GPR39, LYPDC1 is at least 6-fold (log2 scale) up-regulated in ovarian cancer tissue samples compared with normal ovary tissue samples and each set of nonovarian tissue samples used in this study (Fig. 1A). By ROC analysis, GPR39, LYPDC1 achieved a specificity greater than 90% for a sensitivity greater than 90% when used to detect ovarian cancer at each stage (Fig. 1B).

Group Biomarkers

Using the order preserving technique as described above on the gene expression data of the set of single biomarkers listed in Table 2, we identified three potential group biomarkers that exhibited unique and conserved biological


patterns across the ranked data sets that we randomly generated. Because we were looking for the group of genes that exhibited coherent behavior across the largest number of ranked tissue samples in each one of the eight matrices that we randomly generated and because each random matrix had 54 rows (genes) and 15 columns (ranked tissue samples), we fixed Kmin = 15.

The three genes (ITGB4, LCN2, and WNT7A) identified with "A" in Table 2 represent the set of genes that belong to group biomarker "A," Z = 3.7e-92. They exhibit a coherent behavior across the following ranked conditions: normal ovary, borderline ovarian cancer, and primary ovarian cancer. Integrin β4 (ITGB4) encodes for a membrane protein. It is a receptor for laminin and it plays a critical structural role in the hemidesmosomes of epithelial cells (25). It has been shown previously to be up-regulated in ovarian cancer (10). Lipocalin 2 (oncogene 24p3; LCN2) encodes for a secreted protein. It transports small lipophilic substances and forms a heterodimer with type V collagenase (MMP-9). Although LCN2 has been shown to be up-regulated in patients with renal cell carcinoma (26), little is mentioned in the literature about LCN2 and its role in ovarian cancer. Wingless-type MMTV integration site family, member 7A (WNT7A) encodes for a secreted protein that is present in the extracellular matrix. It is a ligand for members of the frizzled family of seven transmembrane receptors (27). It is a developmental protein; signaling by WNT7A allows sexual dimorphic development of the Mullerian ducts (27). WNT7A has been

Figure 1. A, mean expression level of GPR39, LYPDC1 in various tissues. GPR39, LYPDC1 is at least 6-fold (log2 scale) up-regulated in ovarian cancer tissue samples compared with normal ovary tissue samples and each set of nonovarian tissue samples used in this study. B, ROC analysis of GPR39, LYPDC1 on each stage of ovarian cancer [omentum papillary serous adenocarcinomas (x), papillary serous adenocarcinomas of the ovary (n), and borderline ovarian cancer (E)] shows that with a specificity greater than 90%, GPR39, LYPDC1 achieves a sensitivity greater than 90% when used to detect ovarian cancer at each stage.
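The coherent behavior across ranked conditions that defines a group biomarker can be illustrated with a small sketch. It assumes that genes count as coherent when their expression values rank the conditions in the same order; this is a simplification for illustration and not the authors' biclustering algorithm (20–22). The function names, gene labels, and expression values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Ranked conditions, in the order used for group biomarkers "A" and "B".
CONDITIONS = ("normal ovary", "borderline", "primary")


def condition_order(profile: Sequence[float]) -> Tuple[int, ...]:
    """Return the condition indices sorted by increasing expression.

    Ties are broken by condition index because Python's sort is stable.
    """
    return tuple(sorted(range(len(profile)), key=lambda i: profile[i]))


def group_by_order(
    profiles: Dict[str, Sequence[float]]
) -> Dict[Tuple[int, ...], List[str]]:
    """Group genes whose profiles induce the same ordering of conditions."""
    groups: Dict[Tuple[int, ...], List[str]] = {}
    for gene, profile in profiles.items():
        groups.setdefault(condition_order(profile), []).append(gene)
    return groups


# Toy expression values (invented for illustration, not data from the study).
profiles = {
    "gene_a": [1.0, 2.5, 4.0],  # rises from normal to primary
    "gene_b": [0.2, 0.9, 3.3],  # rises from normal to primary
    "gene_c": [5.0, 3.0, 1.0],  # falls from normal to primary
}

groups = group_by_order(profiles)
for order, genes in groups.items():
    print([CONDITIONS[i] for i in order], genes)
```

In this toy input, gene_a and gene_b share one ordering and gene_c falls into a second group with the reverse ordering, which mirrors the up-regulated and down-regulated subsets mentioned in the text.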


shown to be up-regulated in lung cancer patients (28), but no one has shown a role for WNT7A in ovarian cancer. Figure 2A shows the expression profile of the three genes that belong to group biomarker "A" across one of the 8 randomly generated matrices. The three genes that belong to this group behave coherently across these ranked conditions. This pattern is conserved across most of the 8 matrices that we randomly generated (Supplementary Fig. S1).6 This correlation may mean that they respond similarly to the same environmental conditions.

6 Supplementary material for this article is available at Molecular Cancer Therapeutics Online (http://mct.aacrjournals.org/).

Three other genes (WFDC2, MUC1, and MSLN) labeled as "B" in Table 2 belong to group biomarker "B," Z = 3.7e-92. They exhibit a coherent behavior across the following ranked conditions: normal ovary, borderline ovarian cancer, and primary ovarian cancer. WAP four-disulfide core domain 2 (WFDC2) encodes for a secreted protein that is expressed in several tumor cells, such as ovarian, colon, breast, lung, and renal (29). WFDC2, also known as HE4, has been shown to be highly up-regulated in ovarian cancer (30, 31). Mucin 1 (MUC1) encodes for a membrane protein that is also secreted. It may play a role in adhesive functions and in cell-cell interactions, metastasis, and signaling (32). MUC1 may provide a protective layer on epithelial surfaces (32). MUC1 has been shown to be highly up-regulated in ovarian cancer (33). Mesothelin (MSLN) encodes for a membrane protein. Its function is unknown, but it may play a role in cell adhesion. It has multiple transcripts due to alternative splicing. MSLN has been shown to be highly up-regulated in ovarian cancer (11, 34–36). Figure 2B shows the expression profile of the three genes that belong to group biomarker "B" across one of the 8 randomly generated matrices. This pattern is conserved across most of the 8 matrices that we randomly generated (data not shown). The genes in this group behave coherently across these ranked conditions.

Finally, the three genes (MSLN, KLK8, and KLK7) labeled as "C" in Table 2 belong to group biomarker "C," Z = 3.7e-92. They exhibit a coherent behavior across the following ranked conditions: normal ovary, borderline ovarian cancer, and secondary ovarian cancer of the omentum. Kallikrein 8 (neuropsin/ovasin; KLK8) encodes for a secreted protein. KLK8 may be involved in epileptogenesis and hippocampal plasticity. KLK8 has been shown to be highly up-regulated in ovarian cancer (10, 11, 37, 38). Kallikrein 7 (chymotryptic, stratum corneum; KLK7) encodes for a secreted protein. KLK7 is highly up-regulated in ovarian cancer (10, 11) and is present at the apical membrane and in the cytoplasm at the invasive front. Figure 2C shows the expression profile of the three genes that belong to group biomarker "C" across one of the 8 randomly generated matrices. This pattern is conserved across the 8 matrices that we randomly generated (data not shown).

Figure 2. Genes belonging to group biomarkers show a coherent pattern of expression. A, group biomarker "A" contains LCN2 (x), WNT7A (n), and ITGB4 (E). B, group biomarker "B" contains MSLN (x), WFDC2 (n), and MUC1 (E). C, group biomarker "C" contains KLK8 (x), KLK7 (n), and MSLN (E). The Y axis corresponds to the expression level and the X axis shows a series of different samples ranked as follows: normal, borderline, and primary or normal, borderline, and omentum.

Comparison of Group Biomarkers with Other Sets of Biomarkers Obtained with Alternative Statistical Approaches

We next did statistical analysis and validation of our three group biomarkers on the entire set of gene expression data for the ovary tissue samples. Thus, we analyzed data from the 62 normal ovaries, 7 borderline ovarian cancers, 22 papillary serous adenocarcinomas, and 16 omentum papillary serous adenocarcinoma metastases. We also compared the performance of our three group biomarkers with that of the combinations of the best biomarkers identified using other computational approaches: F test, ROC approach, and clustering.

ROC plots of group biomarkers "A" (Fig. 3A), "B" (Fig. 3B), and "C" (Fig. 3C) were compared with the


combination of the six best biomarkers identified using the F test: GPR39, KLK8, LAMC2, LCN2, SCNN1A, and TACSTD1 from the borderline data set; BF, CLDN3, CLDN4, GPR39, KLK8, and SCNN1A from the papillary serous adenocarcinoma data set; and CLDN3, CLDN4, GPR39, KLK8, SCNN1A, and WFDC2 from the omentum papillary serous adenocarcinoma data set. The six best genes identified using the Eisen clustering approach were CDH6, DDR1, GPR39, KLK8, LAMC2, and LCN2 from the borderline data set; CDH6, CLDN3, GPR39, KLK8, MUC1, and WFDC2 from the papillary serous adenocarcinoma data set; and CLDN3, CLDN4, GPR39, KLK8, SCNN1A, and WFDC2 from the omentum papillary serous adenocarcinoma data set. The five best genes identified using the ROC approach were DDR1, ITGB4, KLK8, LCN2, and WNT7A from the borderline data set; DDR1, KLK8, MSLN, MUC1, and WFDC2 from the papillary serous adenocarcinoma data set; and CD47, CLDN3, KLK7, KLK8, and MSLN from the omentum papillary serous adenocarcinoma data set.

With specificity greater than 99%, each of our three group biomarkers achieved 100% sensitivity, with accuracy greater than 99% (Fig. 3). Thus, our group biomarkers outperformed the combination of the best biomarkers identified using the other three computational techniques.

Figure 3. ROC curves comparison of group biomarkers "A" (A), "B" (B), and "C" (C), with the best genes uncovered using other computational techniques: F test (x), ROC approach (.), and Eisen clustering (E).

Group Biomarkers Validation

We validated our three group biomarkers using publicly available sets of gene expression data downloaded from the NIH Web site.7 Data sets GSM139377 to GSM139479 for ovarian cancer and normal ovary tissue samples were made available on April 9, 2007. These data sets contain the gene expression of 99 individual ovarian tumors (37 endometrioid, 41 serous, 13 mucinous, and 8 clear cell carcinomas) and 4 individual normal ovary samples. Each tissue was assayed on an Affymetrix HG_U133A array; the data were processed using the "Ann Arbor quantile-normalized trimmed-mean method" and normalized using "quantile-normalized trimmed-mean, log transformed with log[max(x + 50,0) + 50] using base 10 logarithms." Data sets GSM44671 to GSM44706 for nonovarian tissue samples were made available on April 5, 2005. These data sets contain the expression profiling of 36 types of normal tissue from different organs; RNA samples had been pooled from several donors and then assayed on Affymetrix HG_U133A arrays. To compare these data with the above ovarian cancer and normal ovary gene expression data, we normalized the nonovarian data using the "log transformed with log[max(x + 50,0) + 50] using base 10 logarithms" step.

7 http://www.ncbi.nlm.nih.gov/geo

Table 3 shows the different values of maximum sensitivities for specificity greater than or equal to 99% when our three group biomarkers were used to detect different types and stages of ovarian cancer on the publicly available gene expression data. At least one of our group biomarkers detected each stage and different type of ovarian cancer on the publicly available data set, except for stage II endometrioid and stage III mucinous. Interestingly, with 100% specificity, group biomarker "A" achieved 100% sensitivity on each type of ovarian cancer at stage I of the disease, suggesting a potential usefulness in detecting early-stage ovarian cancer compared with group biomarkers "B" and "C."

Conclusion

In this study, we applied a novel set of biclustering algorithms and a ROC approach on well-defined gene expression data representing ovarian cancer, normal ovary, and nonovarian healthy and diseased tissue samples. We identified many significant patterns that encode for secreted proteins, membrane proteins, and/or extracellular matrix proteins that clearly discriminate between the gene expression data of ovarian cancer, normal ovary, and nonovarian tissues.

The advantage of using a biclustering algorithm is that it allows grouping together of subsets of genes that exhibit the same behavior across subsets of tissue samples. Therefore, the genes that belong to the same bicluster likely have similar responses to the same environmental condition. Thus, a biclustering algorithm approach will give more clinical and biological insight into the tissue samples analyzed and potential biomarkers uncovered.

A major difference between our ROC approach and other computational techniques based on ROC curves is that our


Table 3. Maximum values of sensitivities for specificity greater than or equal to 99%

Stage      Type          No. tissue samples   Group biomarker "A" (%)   Group biomarker "B" (%)   Group biomarker "C" (%)
Stage I    Clear cell     5                   100                       99                        100
           Endometrioid  18                   100                       78                        84
           Mucinous       8                   100                       100                       75
           Serous         4                   100                       100                       99
Stage II   Clear cell     1                   100                       100                       100
           Endometrioid   5                   80                        60                        60
           Mucinous       2                   100                       100                       100
           Serous         3                   100                       100                       100
Stage III  Clear cell     1                   100                       100                       100
           Endometrioid  11                   90                        100                       90
           Mucinous       3                   50                        50                        30
           Serous        30                   100                       99                        100
Stage IV   Clear cell     1                   100                       100                       100
           Endometrioid   3                   100                       100                       100
           Serous         4                   100                       100                       100
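Table 3 reports the maximum sensitivity attainable while specificity stays at or above 99%. A minimal sketch of that calculation for one score per sample is given here. It assumes a sample is called positive when its score exceeds a threshold, and it pools normal ovary and nonovarian samples as controls when computing specificity. The function name, the toy scores, and the use of a single score per sample are illustrative assumptions; how the genes of a group biomarker are combined into a score is not specified in this section.

```python
from typing import Sequence


def max_sensitivity_at_specificity(
    cancer_scores: Sequence[float],
    control_scores: Sequence[float],
    min_specificity: float = 0.99,
) -> float:
    """Best sensitivity over all thresholds whose specificity meets the floor.

    A sample is called positive when its score is strictly above the threshold.
    Both inputs are assumed to be non-empty.
    """
    # Candidate thresholds: every observed score, plus one below the minimum.
    candidates = sorted(set(cancer_scores) | set(control_scores))
    candidates = [candidates[0] - 1.0] + candidates
    best = 0.0
    for threshold in candidates:
        true_neg = sum(1 for s in control_scores if s <= threshold)
        specificity = true_neg / len(control_scores)
        if specificity >= min_specificity:
            true_pos = sum(1 for s in cancer_scores if s > threshold)
            best = max(best, true_pos / len(cancer_scores))
    return best


# Toy scores (invented for illustration, not data from the study).
normal_ovary = [1.0, 1.2, 0.9, 1.1]
nonovarian = [1.3, 0.8, 1.5]
stage_one = [2.4, 2.9, 1.4, 3.1]

# Specificity computed against normal ovary only.
ovary_only = max_sensitivity_at_specificity(stage_one, normal_ovary)
# Specificity computed against normal ovary plus nonovarian tissue.
pooled = max_sensitivity_at_specificity(stage_one, normal_ovary + nonovarian)
print(ovary_only, pooled)
```

With these toy values, using normal ovary alone as the control set gives a sensitivity of 1.0, whereas pooling in the nonovarian samples lowers it to 0.75, which shows how the choice of controls changes the apparent performance of a marker.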

definition of specificity includes the nonovarian tissues whereas others do not (17). Other computational techniques only do a classification based on a comparison between healthy ovary and ovarian cancer tissue samples and do not account for other tissues in the body that may produce the same protein as the ovarian cancer tissue. Therefore, these approaches will result in less specific biomarkers than ours in a diagnostic blood test. The advantages of our ROC approach are 2-fold. A given gene with a high specificity corresponding to a high sensitivity will not only indicate that it is minimally or not expressed in normal ovary tissues and nonovarian tissues but also indicate that it is highly expressed in ovarian cancer tissues. Therefore, it will represent a highly specific and sensitive single biomarker for ovarian cancer detection using a blood test.

This study used the novel approach of group biomarkers as an alternative to the traditional single biomarkers or other combinations of biomarkers used to date for the detection of ovarian cancer using blood tests. Statistical analysis of the potential group biomarkers identified in this study showed that they outperform the combination of the best biomarkers identified using other computational approaches. We believe that our approach outperforms other computational techniques because there exists a correlation or coregulation among the genes that belong to the group biomarkers that we identified. In contrast, other techniques combined potential biomarkers without checking to see whether they are correlated or not.

Interestingly, our group biomarkers contain fewer genes (that is, maximum of three genes per group) and they do better than the combination of the best biomarkers identified by previous approaches, which contain more genes (that is, a minimum of five genes per group). Thus, our methodology identifies an optimum combination of genes that have the highest effect on the diagnosis of a disease. This suggests that the number of genes in a group biomarker is irrelevant, but how they behave together as a group is very important.

We statistically validated the group biomarkers identified in this study using publicly available gene expression data downloaded from a NIH Web site. Because the genes that we identified in this study encode for secreted proteins, they have the potential to be used as tumor markers for the detection of ovarian cancer in a diagnostic blood test. However, additional clinical studies assessing serum levels of the identified putative biomarkers are required to confirm their usefulness in the diagnosis and/or monitoring of ovarian cancer.

Acknowledgments

We thank Gene Logic for providing the gene expression data and Diane Rauch and Sarah Bowell for procuring the tissue samples (University of Minnesota Cancer Center Tissue Procurement Facility).

References

1. American Cancer Society. Cancer facts and figures 2007. Atlanta: American Cancer Society; 2007.
2. Verheijen RHM, Von Mensdorff-Pouilly S, Van Kamp GJ, Kenemans P. CA 125: fundamental and clinical aspects. Cancer Biol 1999;9:117–24.
3. Bast RC, Jr. Early detection of ovarian cancer: new technologies in pursuit of a disease that is neither common nor rare. Trans Am Clin Climatol Assoc 2004;115:233–48.
4. Welsh JB, Zarrinkar PP, Sapinoso LM, et al. Analysis of gene expression profiles in normal and neoplastic ovarian tissue samples identifies candidate molecular markers of epithelial ovarian cancer. Proc Natl Acad Sci U S A 2001;98:1176–81.
5. Hough CD, Cho KR, Zonderman AB, Schwartz DR, Morin PJ. Coordinately up-regulated genes in ovarian cancer. Cancer Res 2001;61:3869–76.
6. Schummer M, Bumgarner RE, Nelson PS, et al. Comparative


hybridization of an array of 21,500 ovarian cDNA's for the discovery of genes overexpressed in ovarian carcinomas. Gene 1999;238:375–85.
7. Ismail RS, Baldwin RL, Fang J, et al. Differential gene expression between normal and tumor derived ovarian epithelial cells. Cancer Res 2000;60:6744–9.
8. Ono K, Tanaka T, Tsunoda T, et al. Identification by cDNA microarray of genes involved in ovarian carcinogenesis. Cancer Res 2000;60:5007–11.
9. Santin AD, Zhan F, Bellone S, et al. Gene expression profiles in primary ovarian serous papillary tumours and normal ovarian epithelium: identification of candidate molecular markers for ovarian cancer diagnosis and therapy. Int J Cancer 2004;112:14–25.
10. Hibbs K, Skubitz KM, Pambuccian SE, et al. Differential gene expression in ovarian carcinoma. Identification of potential biomarkers. Am J Pathol 2004;165:397–414.
11. Skubitz APN, Pambuccian SE, Argenta AP, Skubitz KM. Differential gene expression identifies subgroups of ovarian carcinoma. Translational Res 2006;148:223–48.
12. Hedenfalk I, Duggan D, Chen Y, et al. Gene expression profiles in hereditary breast cancer. N Engl J Med 2001;344:539–48.
13. Eisen MB, Spellman PT, Brown PO, Botstein D. Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A 1998;95:14863–8.
14. Butte AJ, Tamayo P, Slonim D, Golub TR, Kohane IS. Discovering functional relationships between RNA expression and chemotherapeutic susceptibility using relevance networks. Proc Natl Acad Sci U S A 2000;97:12182–6.
15. Furey T, Cristianini N, Duffy N, Bednarski D, Schummer M, Haussler D. Support vector machines classification and validation of cancer tissue samples using microarray expression data. Bioinformatics 2000;16:906–14.
16. Moler EJ, Chow ML, Mian IS. Analysis of molecular profile data using generative and discriminative methods. Physiol Genomics 2000;4:109–26.
17. Debashis G, Chinnaiyan AM. Classification and selection of biomarkers in genomic data using LASSO. J Biomed Biotechnol 2005;2:147–54.
18. Xiong M, Xiangzhong F, Jinying Z. Biomarker identification by feature wrappers. Genome Res 2001;11:1878–87.
19. Dudoit S, Fridlyand J, Speed TP. Comparison of discrimination methods for the classification of tumors using gene expression data. J Am Stat Assoc 2002;97:77–87.
20. Tchagang AB, Tewfik AH. DNA Microarray data analysis: a novel biclustering algorithm approach. EURASIP J App Sig Proc 2006; article ID 59809.
21. Tewfik AH, Tchagang AB, Vertatschitsch L. Parallel identification of gene biclusters with coherent evolution. IEEE Trans Sig Proc 2006;54:2408–17.
22. Tchagang AB, Tewfik AH, Skubitz APN. Analysis of order preserving genes biclusters. Proceedings of IEEE International Workshop on Genomic Signal Processing and Statistics; 2006 May 28-30; College Station, TX; IEEE; 2006.
23. Madeira SC, Oliveira AL. Biclustering algorithms for biological data analysis: a survey. IEEE Trans Comp Biol Bioinf 2004;1:24–45.
24. GeneLogic GX2 Explorer 2.0. A component of the Genesis Enterprise System2 user guide. Gene Logic, Inc.; 2003.
25. Inoue M, Tamai K, Shimizu H, et al. A homozygous missense mutation in the cytoplasmic tail of β4 integrin, G931D, that disrupts hemidesmosome assembly and underlies non-Herlitz junctional epidermolysis bullosa without pyloric atresia. J Invest Dermatol 2000;114:1061–4.
26. Süllentrop F, Moka D, Neubauer S, et al. 31P NMR spectroscopy of blood plasma: determination and quantification of phospholipid classes in patients with renal cell carcinoma. NMR Biomed 2002;15:60–8.
27. Bui TD, Lako M, Lejeune S, et al. Isolation of a full-length human WNT7A gene implicated in limb development and cell transformation, and mapping to 3p25. Gene 1997;189:25–9.
28. Calvo R, West J, Franklin W, et al. Altered HOX and WNT7A expression in human lung cancer. Proc Natl Acad Sci U S A 2000;97:12776–81.
29. Bouchard D, Morisset D, Bourbonnais Y, Tremblay GM. Proteins with whey-acidic-protein motifs and cancer. Lancet Oncol 2006;7:167–74.
30. Hellstrom I, Raycraft J, Hayden-Ledbetter M, et al. The HE4 (WFDC2) protein is a biomarker for ovarian carcinoma. Cancer Res 2003;63:3695–700.
31. Drapkin R, Von Horsten HH, Lin Y, et al. Human epididymis protein 4 (HE4) is a secreted glycoprotein that is overexpressed by serous and endometrioid ovarian carcinomas. Cancer Res 2005;65:2162–9.
32. Komatsu M, Carraway CAC, Fregien NL, Carraway KL. Reversible disruption of cell-matrix and cell-cell interactions by overexpression of sialomucin complex. J Biol Chem 1997;272:33245–54.
33. Feng H, Ghazizadeh M, Konishi H, Araki T. Expression of MUC1 and MUC2 mucin gene products in human ovarian carcinomas. Jpn J Clin Oncol 2002;32:525–9.
34. Chang K, Pastan I. Molecular cloning of mesothelin, a differentiation antigen present on mesothelium, mesotheliomas, and ovarian cancers. Proc Natl Acad Sci U S A 1996;93:136–40.
35. Muminova ZE, Strong TV, Shaw DR. Characterization of human mesothelin transcripts in ovarian and pancreatic cancer. BMC Cancer 2004;4:19.
36. Scholler N, Fu N, Yang Y, et al. Soluble member(s) of the mesothelin/megakaryocyte potentiating factor family are detectable in sera from patients with ovarian carcinoma. Proc Natl Acad Sci U S A 1999;96:11531–6.
37. Borgono CA, Kishi T, Scorilas A, et al. Human kallikrein 8 protein is a favorable prognostic marker in ovarian cancer. Clin Cancer Res 2006;12:1487–93.
38. Magklara A, Scorilas A, Katsaros D, et al. The human KLK8 (neuropsin/ovasin) gene: identification of two novel splice variants and its prognostic value in ovarian cancer. Clin Cancer Res 2001;7:806–11.


Early detection of ovarian cancer using group biomarkers

Alain B. Tchagang, Ahmed H. Tewfik, Melissa S. DeRycke, et al.

Mol Cancer Ther 2008;7:27-37. Published OnlineFirst January 9, 2008.
