Clin Transl Oncol (2008) 10:604-617 DOI 10.1007/s12094-008-0261-2

EDUCATIONAL SERIES Blue Series*

In silico analysis of neoplastic biomarkers for cervix and uterine cancer

Mario A. Rodríguez-Pérez · Alberto Medina-Aunon · Sergio M. Encarnación-Guevara · Sofia Bernal-Silvia · Hugo Barrera-Saldaña · Juan Pablo Albar-Ramírez

Received: 16 July 2008 / Accepted: 14 August 2008

Abstract Worldwide, cervical and uterine cancers are the most deadly cancers in women, with high prevalence, especially in developing countries. The Human Protein Atlas (HPA) portal was explored for proteins expressed in a tissue- or cervix and uterine cancer-specific manner. The group of proteins differentially expressed and with enhanced expression in the glandular and surface epithelial (squamous) cells retrieved from HPA were further explored using the Protein Information and Knowledge Extractor (PIKE) portal to compile biological information that is found in different databases and repositories on the Internet. Thus, the lists of candidate proteins found in the HPA and PIKE portals may be used as a starting point for the discovery and validation of biomarkers for cervix and uterine cancer employing proteomics approaches as described in the present article.

Keywords Cancer · Differential quantitative proteomics · Plasma and serum proteome

*Supported by an unrestricted educational grant from GlaxoSmithKline.

M.A. Rodríguez-Pérez (✉)
Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Blvd. del Maestro esq. Elías Piña, Cd. Reynosa, Tamaulipas, México
e-mail: [email protected]

M.A. Rodríguez-Pérez (✉)
Proteomics Facility, Centro Nacional de Biotecnología, CSIC-CNB, Universidad Autónoma de Madrid, Campus Cantoblanco, C/ Darwin, 3, ES-28049 Madrid, Spain
e-mail: [email protected]

A. Medina-Aunon · J.P. Albar-Ramírez
Unidad de Proteómica, Centro Nacional de Biotecnología, Consejo Superior de Investigaciones Científicas (CSIC-CNB), Madrid, Spain

S.M. Encarnación-Guevara
Centro de Ciencias Genómicas, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México

S. Bernal-Silvia · H. Barrera-Saldaña
Departamento de Bioquímica y Medicina Molecular de la Facultad de Medicina, Universidad Autónoma de Nuevo León, Monterrey, N. L., México

Introduction

Cervix and uterine cancer (CUC) is one of the main causes of death of women worldwide, with a high prevalence, especially in developing countries. The implementation of successful prophylactic programmes of early diagnosis and/or vaccination leads to appropriate treatment and to prevention of infection by the human papillomavirus (HPV) associated with this pathology. The most promising advancement achieved in recent years in the prevention of CUC is the development of a vaccine made from virus-like particles generated by genetic engineering. Meanwhile, in the field of diagnosis, proteomics based on mass spectrometry is being applied to identify new prognostic biomarkers. The advances in this field come from CUC studies using proteomic techniques on biopsies of the cervix of patients with CUC. Also, an important advancement from a biomedical point of view is the identification of protein profiles from plasma or serum samples that have previously been treated to eliminate abundant proteins with little prognostic value. These samples are investigated using a combination of advanced differential quantitative proteomic techniques and multivariate statistical tests for the discovery of novel potential biomarkers, which are associated with clinical, molecular and biochemical data of patients. The objectives of this article are three-fold: (1) to review the advances for the diagnosis of CUC, focusing on the application of proteomics in the search for new diagnostic proteins as biomarkers; (2) to perform an in silico analysis using the Human Protein Atlas (HPA Version 4.0; available from URL: http://www.proteinatlas.org [accessed 1 October 2008]) of the Human Proteome Organization to predict proteins expressed in a differential manner in tissues and in vitro cell lines of patients with cervical and endometrial cancers; and (3) to perform an in silico analysis using the bioinformatics web tool known as the Protein Information and Knowledge Extractor (PIKE; available from URL: http://proteo.cnb.csic.es:8080/pike [accessed 1 October 2008]) to associate differentially expressed proteins in tissues and cell lines from patients with cervical and endometrial cancer with their biological information, which is found in different databases and repositories on the internet.

The importance of cervix and uterine cancer

CUC is one of the main causes of death worldwide, representing 10% of all malignant processes in women [1, 2], and accounted for 471,000 new cases in the year 2000, according to the International Agency for Research on Cancer (IARC). The incidence of CUC is 15 times greater in poor countries compared to industrialised countries [3]. CUC occupies the third place in frequency of tumours worldwide, after breast and colon cancer, and it is a very important public health problem in developing countries, representing almost 30% of all neoplasias in Latin American women. In the last 30 years, the incidence and mortality rates of CUC have decreased more than 75% in developed countries because of prevention programmes based on cytological and colposcopic screening, together with treatment of precancerous CUC lesions [4]. In contrast, in a developing country such as Mexico, the 1990s began with 4280 deaths associated with CUC and ended with 4620 deaths, which represents a notable increase in the registered incidence and mortality, with an average of one woman dead every two hours because of this neoplasia [5] and an annual average growth rate of absolute cases of 0.67%. The INEGI (Mexico) [6], based on its database of vital statistics in 2002, reported that according to the percentage distribution of deaths from malignant tumours in women, 14.4% corresponded to the uterine cervix.

The most common risk factor for CUC is exposure to certain varieties of HPV. An association between HPV and CUC was suggested by zur Hausen et al. [7, 8]. Infection with HPV is a very common sexually transmitted condition with a prevalence of 10–50% in sexually active women, and it has been found in 99.7% of tumoral tissue in cases of CUC [9–12]. There are at least 118 forms of HPV, whose structure is described as consisting of a double-stranded circular molecule of DNA surrounded by a protein capsid [13, 14]. HPV exclusively infects epithelial cells of the basement membrane and increases cell proliferation by expressing its early genes (E5, E6 and E7). As cell proliferation increases, the virus begins to express late genes [3, 4] and begins to replicate and form new viral particles. As the infection progresses, viral episomal DNA integrates with host DNA and part of the viral genome is excluded. The genes E6 and E7 are implicated mainly in oncogenesis [6, 15–17]. There is a proven role of certain HPV genotypes in the pathogenesis of epithelial lesions and they are considered the main risk factor for development of the neoplasia. A study in Mexico has demonstrated oncogenic HPV genotypes in 80% of acetowhite colposcopic lesions, linked to HPV16 in half the cancers of Mexican patients, and demonstrated that the HPV16 Asian-American (aa-c) variant is characteristic of the Mexican mixed-heritage Spanish-American population, and is responsible for neoplasia in younger patients with a less favourable prognosis. It has been stated that the immunologic response controls HPV infection in most women, who present only transient infection. However, in a small proportion of women, the infection becomes persistent and leads to the development of precancerous and finally cancerous lesions.

The great challenge in the CUC field of study is to define the additional factors involved in the persistence of HPV, as well as the genes and their respective proteins involved in the cancerous process from the initial state of disease to its progression to CUC. Since only a small number of women infected with the virus develop CUC, other possible factors seem to be determinants in the process of disease progression and are specific for each population. Besides environmental factors, genetic factors and their products, from the host as well as the virus, are linked and play a fundamental role in this process.

The diagnosis of cervix and uterine cancer

In developing countries, the clinical diagnosis of CUC is based mainly on cytological and histological tests. The Papanicolaou test or Pap smear is frequently used for detecting CUC, but it involves errors of interpretation, together with the lack of infrastructure to perform it in some countries [18–22]. Colposcopy and biopsy are commonly used procedures for early diagnosis of CUC, with the latter being an invasive procedure that requires highly trained expert personnel [21].

Because the disease originates from an infection, the development of a strict protocol that combines the most recent knowledge of proteomics (derived from the use of differential quantitative analytic methods), secondary prevention, clinical diagnosis, treatment and adequate follow-up can effectively help prevent invasive CUC. The direct precursor lesion of CUC is severe dysplasia or high-grade squamous intraepithelial lesion (SIL, which is diagnosed based on cytological study) or type 3 cervical intraepithelial neoplasia (CIN, which is generally diagnosed based on histological study). Most mild to moderate dysplasias (low-grade SIL or CIN 1–2) generally regress (i.e., revert to normal without treatment) and do not progress, with the viral infection being eliminated by the immune response in the 6–12 months following contact with the virus. One to two thirds of the high-grade squamous intraepithelial lesions conceptualised as pre-invasive lesions (CIN 3 = carcinoma in situ), if not treated, will evolve into invasive lesions in a period that varies from a few months to several years. It is suggested that the biological and functional differences among viral variants could have an impact on the aetiology of the cancer. In the same way, knowledge of the geographic distribution and oncogenic potential of these viral variants would provide data that would assist in designing more efficient vaccines and vaccination protocols [23].

Proteomics and cervix and uterine cancer

In the last few years, proteomics has had diverse applications in medicine, drugs, industry and agriculture. In biomedicine, it has been useful in the field of cardiovascular and neuromuscular disease research, and in the study of organ transplant and infertility. However, one of the most promising areas of proteomics is the discovery and identification of protein biomarkers for the diagnosis of different types of cancer.

Candidate biomarkers that could be used in the detection of different types of cancers have been reported [24]. One of the most recent articles was about proteomic analysis of cervical cancer tissue samples performed by Pyo Choi et al. [25] in South Korea. In this article, candidate biomarkers were associated with a cancerous tumour using the peptide mass fingerprint technique MALDI-ToF MS (mass spectrometry based on protein/peptide ionisation induced by laser excitation and matrix-assisted desorption, coupled with time-of-flight analysers). Also, proteins that are differentially expressed were identified by comparing a normal immortalised human keratinocyte cell line (HaCaT) and tissues of patients with CUC infected with high-risk HPV: HPV16 or HPV18. This was done by protein separation using two-dimensional gel electrophoresis (with immobilised pH-gradient strips, i.e. premade gels, for the first dimension and 2D-PAGE for the second separation step). In this study, oncogenes or proto-oncogenes, proteins associated with cell cycle regulation and cell immortalisation, and telomerase activators were identified, among other proteins. The proteins reported by Pyo Choi et al. [25] led to the establishment of the first proteomic database associated with CUC and the diverse genotypes of HPV. Following a similar strategy, studies were done in Mexico to elucidate interaction mechanisms between oncoproteins E6 and E7 from HPV 16 or 18 and the cell proteins p53 and pRb, using the HaCaT cell line as a model [26], and by proteomic analysis of the secretome of cells infected with HPV16 and HPV18 [27]. In Canada, candidate biomarkers, including chaperonin 10 and pyruvate kinase, have been identified in serum-free media of cultured endometrial cancer cells (KLE and HEC-1-A) and cervical cancer (HeLa) cells. In the same study, a total of 203 proteins from the KLE cells, 86 proteins from HEC-1-A cells and 161 proteins from HeLa cells were reported [28]. However, these biomarker studies for CUC should be extended to include a greater number of tissue and blood samples from patients with CUC, with the purpose of incorporating the normalised variation of the different pathogenic processes of the disease. Also, these potential biomarkers should pass the subsequent phases of grading and analytical verification, preferentially in serum and/or plasma samples.

Tumour marker and diagnosis

Proteomics uses a combination of sophisticated techniques, including two-dimensional gel electrophoresis, image analysis and interpretation, liquid chromatography, mass spectrometry, peptide sequencing and the bioinformatics analysis necessary for adequate analysis of results [29, 30]. The term tumour marker has been used to define any cell surface antigen or intracellular protein associated with the tumour and detected in neoplastic tissue using diverse techniques. An ideal marker should be easy to determine, inexpensive, 100% sensitive and 100% specific. However, we now know that no tumour marker fulfils all these characteristics, although many are useful and important in cancer patient assessment [31]. Also, they are an important tool in the diagnosis of certain tumours, since in medical oncology practice a non-invasive, reliable test is always desirable. Several types of cancer in humans have been analysed using proteomic platforms to find diagnostic markers, elucidate disease mechanisms and/or determine therapeutic targets [31]. The cancers studied have been from bladder, kidney, breast, intestine, liver, head and neck, thyroid, ovary and prostate [30]. For example, Bae et al. [33] identified 35 proteins specific for cervical squamous cell carcinoma using differential two-dimensional gels (2-DIGE) and MALDI-ToF. Twelve proteins (pigment epithelium derived factor, annexins A2 and A5, keratins 19 and 20, heat shock protein 27, smooth muscle protein 22 alpha, alpha-enolase, squamous cell carcinoma antigens 1 and 2, glutathione S-transferase and apolipoprotein A1) have been previously reported, but the 21 remaining proteins were a novel result of this in-depth research of the proteome [33].

Fig. 1 Flow diagram showing the steps for advanced searching in the HPA. In the example given, we enquired "which proteins are over-expressed in endometrial cancer AND moderately expressed in colorectal and cervical cancers". The over-expressed level in endometrial cancer was required in at least six patients, and the moderate level in colorectal and cervical cancers was required in at least eight patients. The query generated a list of four hits (genes) matching the criteria of the search string. An antibody ID link button was also provided to view the annotation data and explore the corresponding expression profiles. A link button was also available to open a new window with information from three databases (Ensembl/NCBI/UniProt)
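The logic of the advanced search shown in Fig. 1 (several tissue criteria combined with AND) can be sketched as a simple filter. This is only an illustration under stated assumptions: the antibody identifiers, the staining counts and the function names are hypothetical, the data are held in memory, and nothing here reproduces the HPA portal's actual interface or database.

```python
from typing import NamedTuple


class Criterion(NamedTuple):
    cancer: str        # cancer type chosen in the search form
    staining: str      # "strong", "moderate" or "negative"
    min_patients: int  # minimum number of patients at that staining level


# Hypothetical records: antibody -> cancer type -> staining level -> patients.
# Each cancer type sums to 12 tumours, matching the cohort size in the text.
STAINING = {
    "AB_0001": {
        "endometrial": {"strong": 7, "moderate": 3, "negative": 2},
        "colorectal": {"strong": 0, "moderate": 9, "negative": 3},
        "cervical": {"strong": 1, "moderate": 8, "negative": 3},
    },
    "AB_0002": {
        "endometrial": {"strong": 2, "moderate": 6, "negative": 4},
        "colorectal": {"strong": 0, "moderate": 10, "negative": 2},
        "cervical": {"strong": 0, "moderate": 9, "negative": 3},
    },
    "AB_0003": {
        "endometrial": {"strong": 9, "moderate": 2, "negative": 1},
        "colorectal": {"strong": 5, "moderate": 4, "negative": 3},
        "cervical": {"strong": 0, "moderate": 8, "negative": 4},
    },
}


def matches(profile: dict, criterion: Criterion) -> bool:
    """True if enough patients show the requested staining level."""
    count = profile.get(criterion.cancer, {}).get(criterion.staining, 0)
    return count >= criterion.min_patients


def advanced_search(data: dict, criteria: list) -> list:
    """Return antibodies that satisfy every criterion (logical AND)."""
    return sorted(
        antibody
        for antibody, profile in data.items()
        if all(matches(profile, c) for c in criteria)
    )


# The example query from the figure caption.
query = [
    Criterion("endometrial", "strong", 6),
    Criterion("colorectal", "moderate", 8),
    Criterion("cervical", "moderate", 8),
]
hits = advanced_search(STAINING, query)
print(hits)
```

Adding another `Criterion` to `query` corresponds to clicking "Add tissue search" once more and selecting the AND function.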

There are databases with clinical, genomic and viral information from patients with CUC in several institutions, but in many cases proteomic studies have not been carried out on their clinical samples. Therefore, the search for ideal biomarkers that are produced only by cancer cells, that are specific to a type of cancerous tumour (i.e. do not produce false positive results), and that are detectable from the beginning of the disease (i.e. do not produce false negatives), is essential. As stated by Rifai et al. [34], the search for new biomarkers requires a coherent work methodology with well established techniques that permit validation of the biomarkers. The novel biomarkers should have application and usefulness in daily clinical studies, which is why we emphasise that protein detection should be done in blood plasma or serum, or from exudates or lavage fluid of the tissue or organ in question.

In silico analysis of candidate biomarkers of cervix and endometrial cancer

Although priority is granted to differential quantitative proteomic studies for the discovery of candidate biomarkers, these can also be selected by in silico analysis starting from protein databases. In this way, the search can be established for candidate biomarkers that result from experiments and/or knowledge in the scientific literature and/or public domain (i.e. in websites). The Human Protein Atlas (HPA Version 4.0; available from URL: http://www.proteinatlas.org [accessed 1 October 2008]) of the Human Proteome Organization (HUPO) is a database that contains 6210 antibodies associated with 5,702,812 histological images (each antibody has been used to mark, by immunohistochemistry, sections of healthy and neoplastic tissue); the latest version was released on 18 August 2008 during the 7th HUPO World Congress in Amsterdam, the Netherlands. The HPA also includes the cell lines used for defining protein expression patterns. HPA is certainly a HUPO repository of paramount importance, with as yet unfulfilled potential for basic cancer researchers and clinicians. From this information, a search for proteins differentially expressed between cancer patients and healthy persons can be done (Fig. 1). These candidate biomarkers include proteins specific to tissue and cells of healthy persons, from in vitro cell lines or from tumours. In addition, it might be possible to search for proteins with differential immunoreactivity within a specified type of cancer, i.e., potential prognostic or predictive markers, and for the grade of malignancy in a given cell population, i.e., over-expression or lack of expression of a given protein in a given tumour [35]. On the internet, go to the website http://www.proteinatlas.org (Fig. 1A) and choose both Advanced Search (Fig. 1B) and Add tissue search (Fig. 1C) by clicking the link buttons provided. In Advanced Search (Fig. 1D), set the search criteria by opening the corresponding window. Choose the type of cancer provided in the list and other search criteria, such as tumour or normal cells and the number of patients with strong, moderate or negative staining. Subsequently, click on 'Add tissue search' and select the AND function. Again choose the type of cancer provided in the list and other search criteria, such as tumour or normal cells and the number of patients with strong, moderate or negative staining. Repeat the process for multiple queries by using the AND function. The Search Result (Fig. 1E) of the structured query (hits) is presented as a list of antibodies matching the requested pattern, along with additional information. Thus, the advanced search function in the HPA allowed us to enter queries based on multiple criteria in order to find proteins with a high expression level (strong immunoreactivity) in one tissue type but a low or negative expression level (moderate or negative immunoreactivity) in another tissue type.

Here, we have used the publicly available HPA bioinformatics tool to explore its tremendous potential for giving insights in the field of CUC biomarkers. Specific proteins from glandular and surface epithelial (squamous) cells were searched for in the Protein Atlas database. This resulted in nine and 15 proteins, respectively, which are summarised in Table 1. The search for differentially over-expressed proteins in cervical and endometrial cancer resulted in 13 and four proteins, respectively, which are also summarised in Table 1. In the same way, a search function was used for potential protein biomarkers for prognosis or prediction. In this case, proteins were sought that were augmented in a subset of tumours within a certain type of cancer and absent or with moderate expression in other tumours of patients with the same type of cancer. Since the Protein Atlas holds expression data for 12 individual tumours of each tumour type, it is possible to search for proteins that are differentially expressed within a defined tumour type. However, since the number of tumours analysed is relatively small (12), the identified proteins must be validated in larger patient cohorts to establish their potential role as markers of prognosis or prediction.

From the 151 total proteins found in the HPA associated with CUC, ten proteins have already been reported as markers of cervical and endometrial cancer [30, 36], i.e., cystatin B and C (type A in the HPA), heat-shock proteins or Hsp 90 alpha, Hsp-70, Hsp70/Hsp 90 organising protein (type beta-1 in the HPA), tyrosine protein kinase HCK, FGR, FYN and LYN (type SYK in the HPA) and mitogen-activated protein kinase kinase kinase 8 (type 1 in the HPA). Cystatin B and C are natural inhibitors of cathepsin B. These proteins are also increased in serum samples of patients with colorectal cancer and melanoma (75 and 76 [30]). Hsp70 and 90, and the proto-oncoprotein tyrosine protein kinase (HCK), are increased in tissues of patients with cervical cancer [36]. The oncoprotein mitogen-activated protein kinase kinase kinase 8 is associated with oncoproteins E6/E7 from human papilloma virus types 16 and 18. The search also resulted in 141 proteins that are not associated with CUC in the literature.

Protein Information and Knowledge Extractor (PIKE)

PIKE is a freely accessible bioinformatics tool developed by the Centro Nacional de Biotecnología (CNB)-Proteomics facility (available from URL: http://proteo.cnb.csic.es:8080/pike [accessed 1 October 2008]) that offers an invaluable aid to basic and applied research. In particular, PIKE helps the user to link the data derived from either a proteomics experiment produced in a laboratory or a protein data bank (by means of protein accession codes) with biological and functional information available from a number of biomedical databases through the Internet. The biological and functional information is obtained directly from the main proteomics databases, and it is used for understanding the biological and functional role played by a set of proteins within the context of a particular experiment. These information sources include ExPASy Swiss-Prot (SIB) and the National Center for Biotechnology Information (NCBI–NIH), which provide protein features and annotations; the Gene Ontology consortium (GO), for functional and sub-cellular location classification; the Kyoto Encyclopedia of Genes and Genomes (KEGG), for metabolic pathways; the protein interaction database IntAct (EBI); and, interestingly, Online Mendelian Inheritance in Man (OMIM), a disease library. PIKE can collect all the information loaded in all these databases, or in just one or a subset of them, depending on the set of proteins used in the query. Furthermore, PIKE can be used either independently, starting from a set of proteins derived from a proteomics experiment, or as a complement to another tool that reports proteins from a database, such as HPA. In particular, following the pipeline created by the combination of HPA and PIKE, a non-expert researcher in proteomics can use HPA to retrieve a cohort of proteins involved in a particular disease clinical stage (i.e.,

Table 1 Proteins differentially expressed in endometrial and cervical cancer, and proteins with enhanced expression in the glandular and surface epithelial (squamous) cells

Columns: Advanced search criteria; Proteins retrieved in HPA

Proteins of glandular cells
Advanced search criteria: [cervix, uterine, glandular cells = strong staining] and [uterine, cervix, surface epithelial cells (squamous) = negative staining] and [cervical cancer, tumour cells, ≥6 patients = negative staining] and [endometrial cancer, tumour cells, ≥6 patients = negative staining]
Proteins retrieved in HPA:
1. Galactosylgalactosylxylosylprotein 3-beta-glucuronosyltransferase 1 (EC 2.4.1.135) (beta-1,3-glucuronyltransferase 1) (glucuronosyltransferase-P) (GlcAT-P) (UDP-GlcUA:glycoprotein beta-1,3-glucuronyltransferase) (GlcUAT-P).
2. C-type lectin domain family 4 member A (C-type lectin superfamily member 6) (dendritic cell immunoreceptor) (lectin-like immunoreceptor) (C-type lectin DDB27).
3. Neutrophil gelatinase-associated lipocalin precursor (NGAL) (p25) (25 kDa alpha-2-microglobulin-related subunit of MMP-9) (lipocalin-2) (oncogene 24p3).
4. Mucin-5AC precursor (mucin-5 subtype AC, tracheobronchial) (tracheobronchial mucin) (TBM) (major airway glycoprotein) (gastric mucin) (Lewis B blood group antigen) (LeB) (fragments).
5. Serine/threonine kinase NLK (EC 2.7.11.24) (Nemo-like kinase) (protein LAK1).
6. Progesterone receptor (PR) (nuclear receptor subfamily 3 group C member 3).
7. Protein phosphatase 1 regulatory subunit 1A (protein phosphatase inhibitor 1) (IPP-1) (I-1).
8. Transcription factor Spi-B.
9. Tumour necrosis factor receptor type 1-associated DEATH domain protein (TNFR1-associated DEATH domain protein) (TNFRSF1A-associated via death domain).

Proteins of surface epithelial cells (squamous)
Advanced search criteria: [cervix, uterine, surface epithelial cells (squamous) = strong staining], and [cervix, uterine, glandular cells = negative staining], and [cervical cancer, tumour cells, ≥6 patients = negative staining], and [endometrial cancer, tumour cells, ≥6 patients = negative staining]
Proteins retrieved in HPA:
1. Placental protein 11 precursor (EC 3.4.21.-) (PP11).
2. C-C chemokine receptor type 2 (C-C CKR-2) (CC-CKR-2) (CCR-2) (CCR2) (monocyte chemoattractant protein 1 receptor) (MCP-1-R) (CD192 antigen).
3. Receptor tyrosine-protein kinase erbB-3 precursor (EC 2.7.10.1) (c-erbB3) (tyrosine kinase-type cell surface receptor HER3).
4. G protein-coupled receptor kinase 4 (EC 2.7.11.16) (G protein-coupled receptor kinase GRK4) (ITI1).
5. Keratin, type II cytoskeletal 1 (cytokeratin-1) (CK-1) (keratin-1) (K1) (67 kDa cytokeratin) (hair alpha protein).
6. Keratin, type I cytoskeletal 14 (cytokeratin-14) (CK-14) (keratin-14) (K14).
7. LanC-like protein 3.
8. Putative transcription factor Ovo-like 1 (hOvo1).
9. Retinoic acid receptor responder protein 3 (tazarotene-induced gene 3 protein) (RAR-responsive protein TIG3) (retinoid-inducible gene 1 protein).
10. Ryanodine receptor 2 (cardiac muscle-type ryanodine receptor) (RyR2) (RYR-2) (cardiac muscle ryanodine receptor-calcium release channel) (hRYR-2).
11. Structural maintenance of chromosomes protein 4 (chromosome-associated polypeptide C) (hCAP-C) (XCAP-C homologue).
12. Serine protease inhibitor Kazal-type 5 precursor (lympho-epithelial Kazal-type-related inhibitor) (LEKTI) [Contains: haemofiltrate peptide HF6478; haemofiltrate peptide HF7665].
13. Serine/threonine-protein kinase 40 (EC 2.7.11.1) (SINK-homologous serine/threonine-protein kinase) (Sugen kinase 495) (SgK495).
14. Thrombomodulin precursor (TM) (Fetomodulin) (CD141 antigen).
15. Visual system homeobox 2 (Homeobox protein CHX10) (Ceh-10 homeodomain-containing homologue).

Proteins over-expressed in cervical cancers
Advanced search criteria: [cervical cancer, tumour cells, ≥6 patients = strong staining], and [colo-rectal cancer, tumour cells, ≥8 patients < moderate staining], and [endometrial cancer, tumour cells, ≥8 patients < moderate staining]
Proteins retrieved in HPA:
1. Butyrophilin subfamily 1 member A1 precursor (BT).
2. Cystatin-A (Stefin-A) (Cystatin-AS).
3. Involucrin.
4. Transcription factor jun-B.
5. Kelch-like protein 31 (Kelch repeat and BTB domain-containing protein 1) (Kelch-like protein KLHL) (BTB and kelch domain-containing protein 6).
6. Keratin, type I cytoskeletal 17 (cytokeratin-17) (CK-17) (keratin-17) (K17) (39.1).
7. Keratin, type II cytoskeletal 5 (cytokeratin-5) (CK-5) (keratin-5) (K5) (58 kDa cytokeratin).
8. Keratin, type II cuticular Hb3 (type II hair keratin Hb3) (keratin-83) (K83) (K2.10).
9. Leucine-rich repeat-containing protein 37B precursor (C66 SLIT-like testicular protein).
10. Glucocorticoid receptor (GR) (nuclear receptor subfamily 3 group C member 1).
11. Pyridoxal phosphate phosphatase (EC 3.1.3.74) (PLP phosphatase).
12. Serpin B4 (squamous cell carcinoma antigen 2) (SCCA-2) (Leupin).
13. Tumour protein 63 (p63) (transformation-related protein 63) (TP63) (tumour protein p73-like) (p73L) (p51) (p40) (keratinocyte transcription factor KET) (chronic ulcerative stomatitis protein) (CUSP).

Proteins over-expressed in endometrial cancer
Advanced search criteria: [endometrial cancer, tumour cells, ≥6 patients = strong staining], and [colo-rectal cancer, tumour cells, ≥8 patients < moderate staining], and [cervical cancer, tumour cells, ≥8 patients < moderate staining]
Proteins retrieved in HPA:
1. ATP-binding cassette sub-family A member 3 (ATP-binding cassette transporter 3) (ATP-binding cassette 3) (ABC-C transporter).
2. Golgi SNAP receptor complex member 2 (27 kDa Golgi SNARE protein) (Membrin).
3. Probable palmitoyltransferase ZDHHC6 (EC 2.3.1.-) (Zinc finger DHHC domain-containing protein 6) (DHHC-6) (Zinc finger protein 376) (Transmembrane protein H4).
4. Antibody CAB000468 (no description).

Prognostic proteins in cervical cancer
Advanced search criteria: [cervical cancer, tumour cells, ≥4 patients = strong staining], and [cervical cancer, tumour cells, ≥6 patients < moderate staining], and [skin cancer, tumour cells, ≥6 patients < moderate staining]
Proteins retrieved in HPA:
1. Anterior gradient protein 2 homolog precursor (Secreted cement gland protein XAG-2 homolog) (AG-2) (hAG-2) (HPC8).
2. Arginase-2, mitochondrial precursor (EC 3.5.3.1) (Arginase II) (Non-hepatic arginase) (Kidney-type arginase).
3. Platelet receptor Gi24 precursor.
4. Cystatin-C precursor (Cystatin-3) (Neuroendocrine basic polypeptide) (Gamma-trace) (Post-gamma-globulin).
5. Cystatin-B (Stefin-B) (Liver thiol proteinase inhibitor) (CPI-B).
6. Major histocompatibility complex, class II, DR alpha precursor.
7. Interferon-induced transmembrane protein 2 (Interferon-inducible protein 1-8D).
8. Interferon-induced transmembrane protein 3 (Interferon-inducible protein 1-8U).
9. Mitogen-activated protein kinase kinase kinase 4 (EC 2.7.11.25) (MAPK/ERK kinase kinase 4) (MEK kinase 4) (MEKK 4) (MAP three kinase 1).
10. NACHT, LRR and PYD domains-containing protein 3 (Cold autoinflammatory syndrome 1 protein) (Cryopyrin) (PYRIN-containing APAF1-like protein 1) (Angiotensin/vasopressin receptor AII/AVP-like).
11. Proteasome activator complex subunit 1 (Proteasome activator 28 subunit alpha) (PA28alpha) (PA28a) (Activator of multicatalytic protease subunit 1) (11S regulator complex subunit alpha) (REG-alpha) (Interferon gamma up-regulated I-5111 protein) (IGUP I-51).
12. Tyrosine-protein phosphatase non-receptor type 1 (EC 3.1.3.48) (Protein-tyrosine phosphatase 1B) (PTP-1B).
13. RING finger protein 13.
14. Chromaffin granule amine transporter (Vesicular amine transporter 1) (VAT1) (Solute carrier family 18 member 1).
15. Non-receptor tyrosine-protein kinase TYK2 (EC 2.7.10.2).
16. WD repeat-containing protein 89.

Prognostic proteins in cervical cancer 1. Aldo-keto reductase family 1 member C3 (EC 1.-.-.-) (Trans-1,2- dihydrobenzene-1,2-diol [cervical cancer, tumor cells, ≥2 patients dehydrogenase) (EC 1.3.1.20) (3-alpha- hydroxysteroid dehydrogenase type 2) (EC 1.1.1.213) = strong staining], and [cervical cancer, (3-alpha-HSD type 2) (3-alpha-HSD type II, brain) (Prostaglandin F synthase). tumor cells, ≥6 patients = negative 2. Arginase-2, mitochondrial precursor (EC 3.5.3.1) (Arginase II) (Non- hepatic arginase) staining], and [skin cancer, tumor cells, (Kidney-type arginase). ≥8 patients = negative staining] 3. C-type lectin domain family 4 member A (C-type lectin superfamily member 6) (Dendritic cell immunoreceptor) (Lectin-like immunoreceptor) (C-type lectin DDB27). 4. -1 (EC 3.6.5.5). 5. Dual specificity protein phosphatase 12 (EC 3.1.3.48) (EC 3.1.3.16) (Dual specificity tyrosine phosphatase YVH1). 6. Transcription factor ETV6 (ETS-related protein Tel1) (Tel) (ETS translocation variant 6). 7. Fatty acid-binding protein, adipocyte (AFABP) (Fatty acid-binding protein 4) (Adipocyte lipid-binding protein) (ALBP) (A-FABP). 8. Host cell factor 2 (HCF-2) (C2 factor). 9. Heat shock protein beta-1 (HspB1) (Heat shock 27 kDa protein) (HSP 27) (Stress-responsive protein 27) (SRP27) (Estrogen-regulated 24 kDa protein) (28 kDa heat shock protein). 10, Interleukin-10 precursor (IL-10) (Cytokine synthesis inhibitory factor) (CSIF). 11. Keratin, type II cytoskeletal 8 (Cytokeratin-8) (CK-8) (Keratin-8) (K8). 12. Leukemia inhibitory factor precursor (LIF) (Differentiation- stimulating factor) (D factor) (Melanoma-derived LPL inhibitor) (MLPLI) (Emfilermin). 13. Mucosa-associated lymphoid tissue lymphoma translocation protein 1 (EC 3.4.22.-) (MALT lymphoma-associated translocation) (Paracaspase). Clin Transl Oncol (2008) 10:604-617 611

Table 1 (continuation) Proteins differentially expressed in endometrial and cervical cancer, and proteins with enhanced expression in the glandular and surface epithelial (squamous) cells
Columns: Advanced search criteria | Proteins retrieved in HPA

14. Mucin-5B precursor (MUC-5B) (Mucin-5 subtype B, tracheobronchial) (High molecular weight salivary mucin MG1) (Sublingual gland mucin) (Cervical mucin).
15. Mucin and cadherin-like protein precursor (Mu-protocadherin).
16. Pleckstrin homology-like domain family A member 2 (Imprinted in placenta and liver protein) (Tumor-suppressing subchromosomal transferable fragment candidate gene 3 protein) (Tumor-suppressing STF cDNA 3 protein) (Beckwith-Wiedemann syndrome chromosomal r).
17. Podocalyxin-like protein 1 precursor.
18. Prostaglandin G/H synthase 2 precursor (EC 1.14.99.1) (Cyclooxygenase-2) (COX-2) (Prostaglandin-endoperoxide synthase 2) (Prostaglandin H2 synthase 2) (PGH synthase 2) (PGHS-2) (PHS II).
19. Retinoic acid receptor responder protein 3 (Tazarotene-induced gene 3 protein) (RAR-responsive protein TIG3) (Retinoid-inducible gene 1 protein).
20. Ryanodine receptor 2 (Cardiac muscle-type ryanodine receptor) (RyR2) (RYR-2) (Cardiac muscle ryanodine receptor-calcium release channel) (hRYR-2).
21. Transcription initiation factor TFIID subunit 12 (Transcription initiation factor TFIID 20/15 kDa subunits) (TAFII-20/TAFII-15) (TAFII20/TAFII15).
22. Trefoil factor 1 precursor (pS2 protein) (HP1.A) (Breast cancer estrogen-inducible protein) (PNR-2).
23. Transmembrane protein 26.
24. Ubiquitin carboxyl-terminal hydrolase isozyme L1 (EC 3.4.19.12) (EC 6.-.-.-) (UCH-L1) (Ubiquitin thioesterase L1) (Neuron cytoplasmic protein 9.5) (PGP 9.5) (PGP9.5).
25. WSC domain-containing protein 1.
26. Zinc finger SWIM domain-containing protein 5.
27. Antibody CAB000358 (no description).

Advanced search criteria: Prognostic proteins in endometrial cancer [endometrial cancer, tumor cells, ≥4 patients = strong staining], and [endometrial cancer, tumor cells, ≥6 patients < moderate staining], and [skin cancer, tumor cells, ≥6 patients < moderate staining]
Proteins retrieved in HPA:
1. Uncharacterized protein ENSP00000382483.
2. N-acetyl-beta-glucosaminyl-glycoprotein 4-beta-N-acetylgalactosaminyltransferase 2 (EC 2.4.1.244) (NGalNAc-T2) (Beta-1,4-N-acetylgalactosaminyltransferase III) (Beta4GalNAc-T3) (Beta4GalNAcT3).
3. Bcl-2-binding component 3 (p53 up-regulated modulator of apoptosis) (JFY-1).
4. Dachshund homolog 1 (Dach1).
5. Dual specificity protein phosphatase 12 (EC 3.1.3.48) (EC 3.1.3.16) (Dual specificity tyrosine phosphatase YVH1).
6. Receptor tyrosine-protein kinase erbB-3 precursor (EC 2.7.10.1) (c-erbB3) (Tyrosine kinase-type cell surface receptor HER3).
7. Interleukin-10 precursor (IL-10) (Cytokine synthesis inhibitory factor) (CSIF).
8. CGMP-inhibited 3',5'-cyclic phosphodiesterase A (EC 3.1.4.17) (Cyclic GMP-inhibited phosphodiesterase A) (CGI-PDE A).
9. Progesterone receptor (PR) (Nuclear receptor subfamily 3 group C member 3).
10. Alpha-1-antichymotrypsin precursor (ACT) (Cell growth-inhibiting gene 24/25 protein) [Contains: Alpha-1-antichymotrypsin His-Pro-less].
11. Tyrosine-protein kinase SYK (EC 2.7.10.2) (Spleen tyrosine kinase).
12. Transmembrane protein 126B.
13. Transmembrane protease, serine 2 precursor (EC 3.4.21.-) (Serine protease 10) [Contains: Transmembrane protease, serine 2 non-catalytic chain; Transmembrane protease, serine 2 catalytic chain].

Advanced search criteria: Prognostic proteins in endometrial cancer [endometrial cancer, tumor cells, ≥2 patients = strong staining], and [endometrial cancer, tumor cells, ≥6 patients = negative staining], and [skin cancer, tumor cells, ≥8 patients = negative staining]
Proteins retrieved in HPA:
1. Angiotensin-converting enzyme, somatic isoform precursor (EC 3.4.15.1) (Dipeptidyl carboxypeptidase I) (Kininase II) (CD143 antigen) [Contains: Angiotensin-converting enzyme, somatic isoform, soluble form].
2. AH receptor-interacting protein (AIP) (Aryl-hydrocarbon receptor-interacting protein) (Immunophilin homolog ARA9) (HBV X-associated protein 2) (XAP-2).
3. Protein EAN57.
4. Cadherin-2 precursor (Neural cadherin) (N-cadherin) (CD325 antigen) (CDw325).
5. Dachshund homolog 1 (Dach1).
6. Estrogen receptor (ER) (Estradiol receptor) (ER-alpha) (Nuclear receptor subfamily 3 group A member 1).
7. Gap junction delta-2 protein (Gap junction alpha-9 protein) (Connexin-36) (Cx36).


8. Hepatocyte nuclear factor 4-gamma (HNF-4-gamma) (Nuclear receptor subfamily 2 group A member 2).
9. Heat shock protein HSP 90-alpha (HSP 86) (Renal carcinoma antigen NY-REN-38).
10. Keratin, type I cytoskeletal 23 (Cytokeratin-23) (CK-23) (Keratin-23) (K23).
11. Laminin subunit gamma-1 precursor (Laminin B2 chain).
12. Mitogen-activated protein kinase kinase kinase kinase 1 (EC 2.7.11.1) (MAPK/ERK kinase kinase kinase 1) (MEK kinase kinase 1) (MEKKK 1) (Hematopoietic progenitor kinase).
13. .
14. Podocalyxin-like protein 1 precursor.
15. Rhomboid, veinlet-like 6 isoform 2.
16. Protein-associating with the carboxyl-terminal domain of ezrin (Ezrin-binding protein PACE-1) (SCY1-like protein 3).
17. Alpha-1-antichymotrypsin precursor (ACT) (Cell growth-inhibiting gene 24/25 protein) [Contains: Alpha-1-antichymotrypsin His-Pro-less].
18. Monocarboxylate transporter 8 (MCT 8) (MCT 7) (Solute carrier family 16 member 2) (X-linked PEST-containing transporter).
19. Tumor necrosis factor receptor superfamily member 12A precursor (Fibroblast growth factor-inducible immediate-early response protein 14) (FGF-inducible 14) (Tweak-receptor) (TweakR) (CD266 antigen).
20. Villin-1.
21. Antibody HPA013409 (no description).

Advanced search criteria: Proteins in benign vs. malignant cervix and uterine [cervical cancer, tumor cells, ≥6 patients = strong staining], and [endometrial cancer, tumor cells, ≥6 patients = strong staining], and [cervix, uterine, glandular cells < moderate staining]
Proteins retrieved in HPA:
1. Uncharacterized protein ENSP00000382160.
2. Antibody HPA000452 (no description).
3. Butyrophilin subfamily 3 member A1 precursor (CD277 antigen).
4. Cyclin-dependent kinase inhibitor 2A, isoform 4 (p14ARF) (p19ARF).
5. Serine/threonine-protein kinase Chk2 (EC 2.7.11.1) (Cds1).
6. Claudin-3 (Clostridium perfringens enterotoxin receptor 2) (CPE-receptor 2) (CPE-R 2) (Ventral prostate.1 protein homolog) (HRVP1).
7. Macrophage colony-stimulating factor 1 receptor precursor (EC 2.7.10.1) (CSF-1-R) (Fms proto-oncogene) (c-fms) (CD115 antigen).
8. Erlin-2 (Endoplasmic reticulum lipid raft-associated protein 2) (Stomatin-prohibitin-flotillin-HflC/K domain-containing protein 2) (SPFH domain-containing protein 2).
9. Flap endonuclease 1 (EC 3.1.-.-) (Flap structure-specific endonuclease 1) (FEN-1) (Maturation factor 1) (MF1) (hFEN-1) (DNase IV).
10. FK506-binding protein 3 (EC 5.2.1.8) (Peptidyl-prolyl cis-trans isomerase) (PPIase) (Rotamase) (25 kDa FKBP) (FKBP-25) (Rapamycin-selective 25 kDa immunophilin).
11. Heat shock-related 70 kDa protein 2 (Heat shock 70 kDa protein 2).
12. Heat shock protein beta-1 (HspB1) (Heat shock 27 kDa protein) (HSP 27) (Stress-responsive protein 27) (SRP27) (Estrogen-regulated 24 kDa protein) (28 kDa heat shock protein).
13. Interleukin-1 receptor-like 2 precursor (IL-1Rrp2) (Interleukin-1 receptor-related protein 2) (IL1R-rp2).
14. Interleukin-7 receptor subunit alpha precursor (IL-7R-alpha) (CD127 antigen) (CDw127).
15. Kallikrein-6 precursor (EC 3.4.21.-) (Protease M) (Neurosin) (Zyme) (SP59) (Serine protease 9) (Serine protease 18).
16. Keratin, type I cytoskeletal 14 (Cytokeratin-14) (CK-14) (Keratin-14) (K14).
17. Keratin, type I cytoskeletal 17 (Cytokeratin-17) (CK-17) (Keratin-17) (K17) (39.1).
18. Myomesin-2 (Myomesin family member 2) (M-protein) (165 kDa titin-associated protein) (165 kDa connectin-associated protein).
19. Protein NDRG1 (N-myc downstream-regulated gene 1 protein) (Differentiation-related gene 1 protein) (DRG-1) (Reducing agents and tunicamycin-responsive protein) (RTP) (Nickel-specific induction protein Cap43) (Rit42).
20. Pre-B-cell leukemia transcription factor-interacting protein 1 (Hematopoietic PBX-interacting protein).
21. Serine/threonine-protein kinase N1 (EC 2.7.11.13) (Protein kinase C-like 1) (Protein-kinase C-related kinase 1) (Protein kinase C-like PKN) (Serine-threonine protein kinase N) (Protein kinase PKN-alpha).
22. Serine/threonine-protein kinase 33 (EC 2.7.11.1).
23. (Phosphoprotein p19) (pp19) (Oncoprotein 18) (Op18) (Leukemia-associated phosphoprotein p18) (pp17) (Prosolin) (Metablastin) (Protein Pr22).


Advanced search criteria: Proteins in benign vs. malignant cervix and uterine [cervical cancer, tumor cells, ≥6 patients = strong staining], and [endometrial cancer, tumor cells, ≥6 patients = strong staining], and [cervix-uterine, surface epithelial cells (squamous) < moderate staining]
Proteins retrieved in HPA:
1. Multidrug resistance-associated protein 4 (ATP-binding cassette sub-family C member 4) (MRP/cMOAT-related ABC transporter) (Multi-specific organic anion transporter-B) (MOAT-B).
2. Uncharacterized protein ENSP00000382160.
3. Alpha-actinin-1 (Alpha-actinin cytoskeletal isoform) (Non-muscle alpha-actinin-1) (F-actin cross-linking protein).
4. Alpha-actinin-2 (Alpha-actinin skeletal muscle isoform 2) (F-actin cross-linking protein).
5. Uncharacterized protein ACTN3.
6. Alpha-actinin-4 (Non-muscle alpha-actinin 4) (F-actin cross-linking protein).
7. Antibody HPA000452 (no description).
8. HLA class II histocompatibility antigen, DM alpha chain precursor (MHC class II antigen DMA).
9. Beta-2-microglobulin precursor [Contains: Beta-2-microglobulin form pI 5.3].
10. Beta-1,4-galactosyltransferase 1 (EC 2.4.1.-) (Beta-1,4-GalTase 1) (Beta4Gal-T1) (b4Gal-T1) (UDP-galactose:beta-N-acetylglucosamine beta-1,4-galactosyltransferase 1) (UDP-Gal:beta-GlcNAc beta-1,4-galactosyltransferase 1) [Includes: Lactose synthase A pr].
11. B1 bradykinin receptor (BK-1 receptor) (B1R).
12. Butyrophilin subfamily 3 member A1 precursor (CD277 antigen).
13. Claudin-3 (Clostridium perfringens enterotoxin receptor 2) (CPE-receptor 2) (CPE-R 2) (Ventral prostate.1 protein homolog) (HRVP1).
14. Eukaryotic translation initiation factor 4E-binding protein 1 (4E-BP1) (eIF4E-binding protein 1) (Phosphorylated heat- and acid-stable protein regulated by insulin 1) (PHAS-I).
15. Erlin-2 (Endoplasmic reticulum lipid raft-associated protein 2) (Stomatin-prohibitin-flotillin-HflC/K domain-containing protein 2) (SPFH domain-containing protein 2).
16. Constitutive coactivator of PPAR-gamma-like protein 1 (Protein FAM120A).
17. Filamin-B (FLN-B) (Beta-filamin) (Actin-binding-like protein) (Thyroid autoantigen) (Truncated actin-binding protein) (Truncated ABP) (ABP-280 homolog) (ABP-278) (Filamin 3) (Filamin homolog 1) (Fh1).
18. Flotillin-2 (Epidermal surface antigen) (ESA).
19. Golgi integral membrane protein 4 (Golgi phosphoprotein 4) (Golgi integral membrane protein, cis) (GIMPc) (Golgi-localized phosphoprotein of 130 kDa) (Golgi phosphoprotein of 130 kDa).
20. Golgi membrane protein 1 (Golgi phosphoprotein 2) (Golgi membrane protein GP73).
21. Huntingtin-interacting protein 1 (HIP-I).
22. Endoplasmin precursor (Heat shock protein 90 kDa beta member 1) (94 kDa glucose-regulated protein) (GRP94) (gp96 homolog) (Tumor rejection antigen 1).
23. Heat shock protein 105 kDa (Heat shock 110 kDa protein) (Antigen NY-CO-25).
24. 5-hydroxytryptamine receptor 2B (5-HT-2B) (Serotonin receptor 2B).
25. Interleukin-7 receptor subunit alpha precursor (IL-7R-alpha) (CD127 antigen) (CDw127).
26. Keratin, type I cytoskeletal 14 (Cytokeratin-14) (CK-14) (Keratin-14) (K14).
27. Keratin, type I cytoskeletal 17 (Cytokeratin-17) (CK-17) (Keratin-17) (K17) (39.1).
28. Keratin, type I cytoskeletal 18 (Cytokeratin-18) (CK-18) (Keratin-18) (K18) (Cell proliferation-inducing gene 46 protein).
29. Keratin, type I cytoskeletal 19 (Cytokeratin-19) (CK-19) (Keratin-19) (K19).
30. NAD-dependent malic enzyme, mitochondrial precursor (EC 1.1.1.38) (NAD-ME) (Malic enzyme 2).
31. Lactadherin precursor (Milk fat globule-EGF factor 8) (MFG-E8) (HMFG) (Breast epithelial antigen BA46) (MFGM) [Contains: Lactadherin short form; Medin].
32. Mucin 1 isoform 5 precursor.
33. Mucin-16 (MUC-16) (Ovarian carcinoma antigen CA125) (Ovarian cancer-related tumor marker CA125) (CA-125).
34. Myelin expression factor 2 (MyEF-2) (MST156).
35. Myomesin-2 (Myomesin family member 2) (M-protein) (165 kDa titin-associated protein) (165 kDa connectin-associated protein).
36. Pre-B-cell leukemia transcription factor-interacting protein 1 (Hematopoietic PBX-interacting protein).
37. Glucosidase 2 subunit beta precursor (Glucosidase II subunit beta) (Protein kinase C substrate, 60.1 kDa protein, heavy chain) (PKCSH) (80K-H protein).
38. Lysosome membrane protein 2 (Lysosome membrane protein II) (LIMP II) (Scavenger receptor class B member 2) (85 kDa lysosomal membrane sialoglycoprotein) (LGP85) (CD36 antigen-like 2).


39. 45 kDa calcium-binding protein precursor (Cab45) (Stromal cell-derived factor 4) (SDF-4).
40. Single-minded homolog 1.
41. Superoxide dismutase [Mn], mitochondrial precursor (EC 1.15.1.1).
42. Ubiquitin carboxyl-terminal hydrolase 10 (EC 3.1.2.15) (Ubiquitin thioesterase 10) (Ubiquitin-specific-processing protease 10) (Deubiquitinating enzyme 10).
43. Antibody CAB000025 (no description).

The table summarises all search criteria used for cervical and endometrial cancer in the HPA. Each tested search string is given together with the hits retrieved for that search.
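The search strings in Table 1 combine conditions of the form [tissue, cell type, ≥N patients, staining level]. The following Python sketch shows one way such a combined criterion could be evaluated against per-patient staining annotations. It is an illustration only, not the HPA implementation: the names `Criterion`, `count_matching` and `satisfies`, the ordering of staining levels (negative < weak < moderate < strong) and the example profiles are all assumptions made for this sketch.

```python
from dataclasses import dataclass

# Assumed ordering of staining levels, used to evaluate "< moderate".
LEVELS = {"negative": 0, "weak": 1, "moderate": 2, "strong": 3}


@dataclass(frozen=True)
class Criterion:
    tissue: str        # e.g. "cervical cancer"
    op: str            # "=" (exactly this level) or "<" (below this level)
    level: str         # one of the keys of LEVELS
    min_patients: int  # the "≥N patients" threshold


def count_matching(stainings, criterion):
    """Count patients whose staining level meets the criterion."""
    target = LEVELS[criterion.level]
    if criterion.op == "=":
        return sum(1 for s in stainings if LEVELS[s] == target)
    if criterion.op == "<":
        return sum(1 for s in stainings if LEVELS[s] < target)
    raise ValueError(f"unsupported operator: {criterion.op!r}")


def satisfies(profile, criteria):
    """True if the profile meets every criterion (logical AND)."""
    return all(
        count_matching(profile.get(c.tissue, []), c) >= c.min_patients
        for c in criteria
    )


# First search string of Table 1 (prognostic proteins in cervical cancer).
criteria = [
    Criterion("cervical cancer", "=", "strong", 4),
    Criterion("cervical cancer", "<", "moderate", 6),
    Criterion("skin cancer", "<", "moderate", 6),
]

# Hypothetical per-patient staining profiles for two proteins.
profile_a = {
    "cervical cancer": ["strong"] * 4 + ["weak"] * 6 + ["moderate"] * 2,
    "skin cancer": ["negative"] * 8 + ["weak"] * 2,
}
profile_b = {
    "cervical cancer": ["strong"] * 2 + ["moderate"] * 10,
    "skin cancer": ["negative"] * 10,
}

hit_a = satisfies(profile_a, criteria)
hit_b = satisfies(profile_b, criteria)
```

Under these assumptions, a protein is retrieved only when all three bracketed conditions hold at once, which is why the same tissue can appear twice in one search string with different staining thresholds.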

Table 2 The biological and functional information gathered using PIKE for 10 proteins that have already been reported as markers of cervical and endometrial cancer in the literature and found in the HPA as well
Columns: Protein name | Biological and functional information retrieved when using PIKE

1. Cystatin B | Cysteine protease inhibitor activity.
2. Cystatin-C | As an inhibitor of cysteine proteinases, this protein is thought to serve an important physiological role as a local regulator of this enzyme activity. Cystatin C is found in various body fluids, such as the cerebrospinal fluid and plasma. It is expressed at the highest levels in the epididymis, vas deferens, brain, thymus and ovary, and the lowest in the submandibular gland. Disease: Genetic variations in CST3 are associated with age-related macular degeneration type 11 (ARMD11).
3. Heat shock protein HSP 90-alpha | Molecular chaperone. It has ATPase activity (by similarity).
4. Putative heat shock 70 kDa protein 7 | ATP binding. Disease: Stress response.
5. Stress-induced-phosphoprotein 1 | It mediates the association of the molecular chaperones HSC70 and HSP90 (HSPCA and HSPCB).
6. Proto-oncogene tyrosine-protein kinase Fyn | Implicated in the control of cell growth. Plays a role in the regulation of intracellular calcium levels, with isoform 2 showing a greater ability to mobilise cytoplasmic calcium in comparison to isoform 1. Required in brain development and mature brain function with important roles in the regulation of axon growth, axon guidance and neurite extension. It blocks axon outgrowth and attraction induced by NTN1 by phosphorylating its receptor DDC. Isoform 1 is highly expressed in the brain; isoform 2 is expressed in cells of haemopoietic lineages, especially T lymphocytes.
7. Tyrosine-protein kinase HCK | It may serve as part of a signalling pathway coupling the Fc receptor to the activation of the respiratory burst. It may also contribute to neutrophil migration and may regulate the degranulation process of neutrophils. It is expressed predominantly in cells of the myeloid and B-lymphoid lineages.
8. Tyrosine-protein kinase Lyn | It is expressed in primary neuroblastoma tumours.
9. Proto-oncogene tyrosine-protein kinase FGR | Protein tyrosine kinase activity.
10. Mitogen-activated protein kinase kinase kinase 8 | It is required for TLR4 activation of the MEK/ERK pathway. Able to activate NF-kappa-B 1 by stimulating proteasome-mediated proteolysis of NF-kappa-B 1/p105. It plays a role in the cell cycle. The longer form has some transforming activity, although it is much weaker than the activated cot oncoprotein. It is expressed in several normal tissues and human tumour-derived cell lines.
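The proteins in Table 2 are those reported in the literature and also retrieved from the HPA. As a rough illustration of how such an overlap could be computed instead of being compiled by hand, the Python sketch below intersects two lists of accession codes. The function name `common_accessions` is hypothetical, the HPA list simply reuses the six accession codes quoted in the Fig. 2 legend as sample input, and the literature list is invented for the example.

```python
def common_accessions(literature, hpa):
    """Return accession codes found in both lists.

    Codes are compared after trimming whitespace and upper-casing.
    The result follows the order of the HPA list, without duplicates.
    """
    lit = {code.strip().upper() for code in literature if code.strip()}
    seen = set()
    shared = []
    for code in hpa:
        key = code.strip().upper()
        if key in lit and key not in seen:
            seen.add(key)
            shared.append(key)
    return shared


# Sample input: accession codes quoted in the Fig. 2 legend.
hpa_subset = ["P12821", "O00170", "O43247", "P36021", "Q9NP84", "P09327"]

# Invented literature list, for illustration only.
literature_hits = ["p09327", "HYPOTH1", " O00170 "]

shared = common_accessions(literature_hits, hpa_subset)
```

Normalising the codes before comparison guards against trivial mismatches (stray spaces, lower-case letters) when the two lists come from different sources.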

prognostic proteins in cervical cancer), and in addition, to use PIKE to retrieve their biological and functional information.

In this phase of the study, we used PIKE to retrieve biological and functional information on the proteins differentially expressed in endometrial and cervical cancer, and on the proteins with enhanced expression in the glandular and surface epithelial (squamous) cells. Two kinds of submissions (pipelines) were performed. The first (Table 2) summarises the functional comments retrieved by PIKE for the proteins that have already been reported as potential markers of cervical and endometrial cancer both in the literature [30, 36] and by HPA. Certainly, such potential candidate biomarkers must undergo unbiased validation studies before they can find wide application and usefulness in daily clinical studies. As mentioned by Rifai et al. [34], the search for new biomarkers requires a coherent working methodology with well-established quantitative differential proteomics techniques that permit the validation of potential candidates as biomarkers.

The second pipeline (Table 3) contains six out of 10 submissions, one for each subset of proteins according to the disease scenarios described in Table 1 (prognostic proteins in cervical and endometrial cancers [strong vs. moderate], proteins over-expressed in cervical and endometrial cancers, and proteins in benign vs. malignant CUC). Given that the input of PIKE was more specific, we wanted to focus the search on clinical and biomedical features such as OMIM references, diseases and tissue-specific comments.

Table 3 Protein counts in relation to their biological information when using PIKE for differentially expressed, prognostic and benign vs. malignant proteins in endometrial and cervical cancer found in the HPA
Columns: Biological and functional information | Proteins over expressed, cervical cancer | Proteins over expressed, endometrial cancer | Prognostic proteins(a), cervical cancer | Prognostic proteins(a), endometrial cancer | Glandular cells | Surface epithelial cells (squamous)

3D-structure | 5 | 8 | 5 | 7 | 4 (6) | – (11)
Direct protein sequencing | 4 | – | 6 | 4 | 4 (–) | – (12)
Disease mutation | 5 | – | 3 | – | – (–) | 5 (7)
Polymorphism | 6 | 10 | 9 | 8 | 5 (12) | 10 (21)
Phosphoprotein | 5 | 10 | 4 | 5 | – (10) | – (2)
Nucleus | – | – | – | – | 3 (8) | – (3)
Repeat | 4 | 6 | – | – | – (–) | – (13)
GO:0005515-protein binding | 8 | 9 | 5 | 7 | 4 (15) | 7 (–)

(a) Strong vs. moderate staining
The number in bold represents the proteins for the cell type in question and the numbers in normal type represent the proteins in benign vs. malignant cervix and uterine

Fig. 2 Flow diagram showing the steps for searching in PIKE. The search function in PIKE allowed us to enter queries based on several subsets of protein accession codes in order to find biological and functional information for each category of proteins (e.g., proteins over-expressed in cervical cancers). In the example given, we queried “protein subset 4” (P12821, O00170 and O43247, up to P36021, Q9NP84 and P09327, which belongs to the group of prognostic proteins in endometrial cancer [strong vs. negative]), and selected the SwissProt/UniProt database and all parameters (i.e., function, subcellular location, tissue specificity, etc.), including exhaustive search. The information was generated in tables for each submission (protein or subset of proteins) executed. The format of PIKE allowed identification of which proteins are linked with a specific disease and access to clinical and biomedical information

Both approaches were based on the schema described in Fig. 2A. First, the different files of proteins were generated. The list for pipeline A was filled in by hand after extracting the protein accession codes common to the bibliographic review and HPA. Case B created six out of 10 subsets of protein accession codes. Each of them was obtained using HPA according to the criteria described in Table 1 and extracted directly as a text file from the HPA website.

Second, the search parameters were set using the PIKE interface. This stage starts with the selection of the database that contains the selected protein accession codes. In both cases the SwissProt/UniProt option was chosen. Next, we selected the fields to be shown. Depending on the features of each submission, different sets of parameters were selected. In case A, Protein Name, Gene Name, UniProt comments regarding function and cellular location, together with GO terms (Biological Process, Molecular Function and Subcellular location), were selected. In case B, on the other hand, Protein Name, Gene Name, tissue specificity, disease information, and the references from OMIM, KEGG and IntAct were chosen.

Once all the parameters had been entered and the start button clicked, the PIKE algorithm started to search for information on each set of proteins. Fig. 2B shows the results for one of the subsets of proteins derived from HPA. Because of the view offered by PIKE, it is easy to identify which proteins are linked with a specific disease while at the same time checking all the clinical information (Fig. 2C) regarding an entry just by clicking on the desired link. In more detail, each submission returns plenty of information regarding OMIM (diseases) and KEGG (metabolic pathways), which explains the premise of potential biomarkers for each of them (Fig. 2C).

In summary, the in silico analysis of neoplastic protein biomarkers for CUC using HPA yielded 151 candidate protein biomarkers, 10 of which were also cited in the literature. According to PIKE, over-expressed and prognostic proteins in CUC found in HPA showed binding function, polymorphism and important post-translational modifications (PTM) such as phosphorylation. The other 141 proteins remain to be evaluated as potential candidates in unbiased studies of quantitative differential proteomics before we can conclusively validate the use of these bioinformatics tools specifically for CUC biomarker discovery.

In addition, the combination of HPA and PIKE offers the user a valuable key in clinical applications. We have demonstrated how to use both tools to link the information from a disease (without any knowledge about proteins) to the clinical features and related symptoms.

Acknowledgements The present article reports research carried out during a sabbatical financed by the National Council of Science and Technology, México (CONACYT-Reference no. 76346). Mario A. Rodríguez-Pérez holds a scholarship from Comisión de Operación y Fomento de Actividades Académicas/Instituto Politécnico Nacional (IPN). Mario A. Rodríguez-Pérez is grateful for the support offered by the Centro Nacional de Biotecnología of Consejo Superior de Investigaciones Científicas (CNB-CSIC) in Madrid, Spain, where he initiated studies on the search for human biomarkers in bio-fluids associated with different pathological stages by proteomics analysis. Sofia Bernal holds a doctoral scholarship from CONACYT-México and her thesis on biomarkers for risk of progression of HPV infections to cancer is being financed by the PACyT of the Universidad Autónoma de Nuevo León and CONACyT (México).

References

1. Franco EL, Duarte FE, Ferenczy A (2001) Cervical cancer: epidemiology, prevention and the role of human papillomavirus infection. CMAJ 164:1017–1025
2. Pillai MR, Halabi S, Mckallip A et al (1996) The presence of human papillomavirus-16/-18 E6, p53, and bcl-2 protein in cervicovaginal smears from patients with invasive cervical cancer. Cancer Epidemiol Biomarkers Prev 5:329–335
3. Parkin M, Pisani P, Ferlay J (1999) Estimates of the worldwide incidence of 25 major cancers in 1990. Int J Cancer 80:827–841
4. Valdespino-Gomez VM, Valdespino-Castillo VE (2004) Current perspectives in cervical cancer. Ginecol Obstet Mex 72:29–38
5. Palacio-Mejia LS, Rangel-Gomez G, Hernandez-Avila M, Lazcano Ponce E (2003) Cervical cancer, a disease of poverty: mortality differences between urban and rural areas in México. Salud Pub Mex 45:s315–25
6. Instituto Nacional de Estadística, Geografía e Información (I.N.E.G.I.) (2005) Información sobre tumores malignos. México, D.F., Febrero de 2005
7. Zur Hausen H (1976) Condylomata acuminata and human genital cancer. Cancer Res 36:794
8. Malik AI (2005) The role of Human Papilloma Virus (HPV) in the etiology of cervical cancer. J Pak Med Assoc 55:553–558
9. Zur Hausen H (1989) Papillomaviruses in anogenital cancer as a model to understand the role of viruses in human cancer. Cancer Res 49:4677–4681
10. Walboomers JM, Jacobs MV, Manos MM et al (1999) Human papillomavirus is a necessary cause of invasive cervical cancer worldwide. J Pathol 189:12–19
11. Jung WW, Chun T, Sul D et al (2004) Strategies against human papillomavirus infection and cervical cancer. J Microbiol 42:255–266
12. Steenbergen RDM, Wilde JD, Wilting SM et al (2005) HPV-mediated transformation of the anogenital tract. J Clin Virol 32s:s25–s33
13. De Villiers EM, Fauquet C, Broker TR et al (2004) Classification of papillomaviruses. Virology 324:17–27
14. Choi YP, Kang S, Hong S et al (2005) Proteomic analysis of progressive factors in uterine cervical cancer. Proteomics 5:1481–1483
15. Dillner J, Kallings I, Brihmer C et al (1996) Seropositivities to human papillomavirus types 16, 18, or 33 capsids and to chlamydia trachomatis are markers of sexual behaviour. J Infect Dis 173:1394–1398
16. Silins I, Zhaohui W, Avall-Lundvist E et al (1999) Serological evidence for protection by human papillomavirus (HPV) type 6 infection against HPV type 16 cervical carcinogenesis. J Gen Virol 80:2931–2936
17. Zur Hausen H (2002) Papillomaviruses and cancer: from basic studies to clinical application. Nat Rev Cancer 2:342–350
18. Laara E, Day EN, Hakama M (1987) Trends in mortality from cervical cancer in the Nordic countries: association with organized screening programs. Lancet 1:1247–1249
19. Lazcano-Ponce E, Rascón-Pacheco R, Lozano R, Velasco E (1996) Mortality from carcinoma of the uterine cervix in México: impact of screening 1980–1990. Acta Cytol 40:506–512
20. Lazcano-Ponce E, Alonso P, Ruiz-Moreno JÁ, Hernández-Avila M (2003) Recommendations for cervical cancer screening programs in developing countries. The need for equity and technological development. Salud Publica Mex 45:449–462
21. Lin YW, Lai HC, Lin CY et al (2006) Plasma proteomic profiling for detecting and differentiating in situ and invasive carcinomas of the uterine cervix. Int J Gynecol Cancer 16:1216–1224
22. Koss LG (1993) Cervical (PAP) smear: new directions. Cancer 71:1406–1412
23. De la Cruz-Hernández E, Contreras-Paredes A, Lizano-Soberón M (2006) Toward cervical cancer prevention: strategies employed in the development of HPV vaccines. Rev Invest Clin 58:586–597
24. Ciordia S, De los Rios V, Albar JP (2006) Contributions of advanced proteomics technologies to cancer diagnosis. Clin Transl Oncol 8:566–580
25. Pyo Choi Y, Kang S, Hong S et al (2005) Proteomic analysis of progressive factors in uterine cervical cancer. Proteomics 5:1481–1483

26. Calderón-González KG et al (2007) Estandarización del análisis proteómico de queratinocitos inmortalizados con los genes E6, E7 y E6-7 del HPV-16. II Simposio Mexicano de Espectrometría de Masas. Proteómica Celular y Molecular. The city of Guanajuato, México
27. Checa Rojas A et al (2007) Análisis del secretoma de líneas celulares de CaCU. op cit.
28. Li H, DeSouza LV, Ghanny S et al (2007) Identification of candidate biomarker proteins released by human endometrial and cervical cancer cells using two-dimensional liquid chromatography/tandem mass spectrometry. J Proteome Res 6:2615–2622
29. Monteoliva L, Albar JP (2004) Differential proteomics: an overview of gel and non-gel based approaches. Brief Funct Genomic Proteomic 3:220–239
30. Chen J, Kähne Röcken C, Götze T et al (2004) Proteome analysis of gastric cancer metastasis by two-dimensional gel electrophoresis and matrix assisted laser desorption/ionization-mass spectrometry for identification of metastasis-related proteins. J Proteome Res 3:1009–1016
31. Martinez-Cedillo J (2004) Marcadores tumorales séricos: aplicación clínica. Gamo 3:76–81
32. Alaiya A, Franzen B, Auer G, Linder S (2000) Cancer proteomics: from identification of novel markers to creation of artificial learning models for tumor classification. Electrophoresis 21:1210–1217
33. Bae SM, Lee CH, Cho YL et al (2005) Two-dimensional gel analysis of protein expression profile in squamous cervical cancer patients. Gynecol Oncol 99:26–35
34. Rifai N, Gillette MA, Carr SA (2006) Protein biomarker discovery and validation: the long and uncertain path to clinical utility. Nat Biotechnol 24:971–983
35. Björling E, Lindskog C, Oksvold P et al (2008) A web-based tool for in silico biomarker discovery based on tissue-specific protein profiles in normal and cancer tissues. Mol Cell Proteomics 7:825–844
36. Yoon SH, Cho HI, Kim TG (2005) Activation of B cells using Schneider 2 cells expressing CD40 ligand for the enhancement of antigen presentation in vitro. Exp Mol Med 37:567–574