A NOVEL APPROACH TO IDENTIFICATION OF

DIAGNOSTIC MARKERS IN PROSTATE CANCER

by

INNA SHYSHYNOVA

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy

Dissertation Adviser: Professor Andrei Gudkov

Department of

CASE WESTERN RESERVE UNIVERSITY

August, 2006

TABLE OF CONTENTS

CHAPTER ...... PAGE

A NOVEL APPROACH TO IDENTIFICATION OF DIAGNOSTIC MARKERS IN

PROSTATE CANCER ...... 1

TABLE OF CONTENTS...... IV

LIST OF FIGURES ...... IX

LIST OF TABLES...... XI

ACKNOWLEDGMENTS ...... XIII

LIST OF ABBREVIATIONS...... XIV

I. INTRODUCTION ...... 1

1. Tumor markers...... 1

2. The prostate...... 5

3. Prostate cancer ...... 5

3.1 Diagnosis of prostate cancer ...... 6
3.2 Prostate cancer treatment strategies ...... 7

4. Prostatic Acid Phosphatase (PAP) ...... 9

5. Prostate-Specific Antigen (PSA) ...... 10

5.1 PSA as prostate cancer marker and its advantages ...... 10
5.2 Drawbacks of PSA-based diagnostics of prostate cancer ...... 11
5.3 PSA is transcriptionally repressed by p53 ...... 12

6. Prostate-Specific Membrane Antigen (PSMA) ...... 15

7. The search for novel markers in prostate cancer ...... 15

7.1 The attributes of an ideal tumor marker ...... 16
7.2 Methods of prostate tumor markers search ...... 19
7.3 Potential PCa markers ...... 20

II. THE GENERAL STRATEGY OF THE STUDY ...... 26

III. IDENTIFICATION OF NOVEL MARKERS IN PROSTATE CANCER BY IN

SILICO EXPRESSION PROFILING...... 29

1. Rationale ...... 29

2. Introduction...... 29

2.1 General strategy of EST data mining ...... 29
2.2 EST clustering ...... 30
2.3 Commonly used criteria of selection ...... 31
2.3.1 Digital Differential Display (DDD) ...... 31
2.3.2 Pool specificity ...... 32
2.3.3 Guilt By Association (GBA) ...... 32
2.4 A novel in silico profiling approach ...... 33

3. Materials and Methods ...... 33

3.1 Step 1: Assignment of ESTs to their parent transcripts ...... 33
3.2 Step 2: cDNA library selection and tissue pool assembly ...... 34
3.3 Step 3: Selection of candidate with desired digital expression profile ...... 35
3.3.1 Reconstruction of expression profiles ...... 35
3.3.2 Selection of potential markers ...... 36
3.4 Software implementation of the algorithms used ...... 41
3.5 Redundancy elimination ...... 41
3.6 The relationship between our approach and the previous work ...... 42

4. Results ...... 43

4.1 The screening summary ...... 43
4.2 Known cancer markers ...... 43

5. Discussion ...... 45

6. Conclusions...... 45

IV. IDENTIFICATION OF NOVEL POTENTIAL PROSTATE CANCER

MARKERS BY IN VIVO EXPRESSION PROFILING ...... 47

1. Rationale ...... 47

2. Introduction...... 47

2.1 Tumor-suppressor genes: tumors and mouse models ...... 48

3. Materials and Methods ...... 49

3.1 Transgenic ...... 49

3.2 RNA isolation ...... 50
3.3 Probes preparation and microarray hybridizations ...... 50
3.4 Microarray data analysis and pre-candidate gene selection ...... 50

4. Identification of the human homologs for the selected mouse genes: the BI-HUMANIZER software ...... 51

4.1 The inner workings of BI-HUMANIZER ...... 52

5. Results ...... 56

5.1 Genes upregulated in the murine prostate after tumor suppressor inactivation ...... 56
5.2 BI-HUMANIZER results ...... 58
5.3 Known cancer markers ...... 58

6. Discussion ...... 60

7. Conclusions...... 60

V. VERIFICATION OF THE SELECTED PRE-CANDIDATE GENES BY

MICROARRAY HYBRIDIZATION...... 61

1. Rationale ...... 61

2. Introduction...... 61

3. Materials and Methods...... 62

3.1 cDNA clone selection and ordering ...... 62
3.2 Microarray printing ...... 127
3.3 Cell culture sources for hybridization probe preparation ...... 127
3.4 Probe preparation and microarray hybridization ...... 127
3.5 Microarray data analysis ...... 130
3.5.1 Quality control ...... 130
3.5.2 Normalization ...... 130
3.5.3 Cluster analysis ...... 131
3.5.4 The selection of candidate genes for validation ...... 131

4. Results ...... 131

4.1 Cluster analysis of expression across samples ...... 131
4.2 Cluster analysis of expression across genes ...... 132
4.3 Selection of candidates for validation of expression profiles ...... 134

5. Discussion ...... 134

6. Conclusions...... 136

VI. VALIDATION OF THE CANDIDATES’ EXPRESSION PATTERNS ...... 138

1. Rationale ...... 138

2. Introduction...... 138

3. Semi-quantitative RT-PCR ...... 138

3.1 Materials and Methods ...... 138
3.2 Results ...... 139

4. Northern blot analysis ...... 142

4.1 Materials and Methods ...... 142
4.2 Results ...... 142

5. Conclusions ...... 146

VII. EXPRESSION OF ADVANCED CANDIDATES IN CLINICAL SAMPLES ...... 147

1. Rationale ...... 147

2. Introduction...... 147

3. Materials and Methods...... 147

3.1 Human tissue samples ...... 147
3.2 Real-time PCR ...... 148
3.3 Androgen-responsiveness of candidate ...... 149

4. Results ...... 151

4.1 Real-time PCR in the cell line collection ...... 151
4.2 Real-time PCR in tissue samples ...... 151
4.3 Androgen-dependence of the advanced candidates ...... 157
4.4 Expanded Real-Time PCR study ...... 160

5. Discussion ...... 165

5.1 KIAA1181/ERGIC-32/ERGIC-1 ...... 168
5.2 MAL2 ...... 168
5.3 MGC13170 ...... 168
5.4 FLJ30428 ...... 169

6. Conclusions ...... 171

VIII. DEVELOPMENT OF ANTIBODY-BASED ASSAYS AND FURTHER

VALIDATION OF ADVANCED CANDIDATES...... 172

1. Rationale ...... 172

2. Materials and Methods...... 172

2.1 Generation of custom polyclonal antibodies ...... 172
2.2 analysis ...... 172
2.3 Real-time PCR ...... 173

3. Results ...... 173

3.1 Analysis of antibodies in the cell line collection ...... 173
3.2 KIAA1181 has two transcripts ...... 176
3.3 Comparison of expression of KIAA1181 transcripts (both transcripts vs. 2nd transcript) ...... 179

4. Discussion ...... 183

5. Conclusions...... 187

CITED LITERATURE ...... 188

LIST OF FIGURES

FIGURE ...... PAGE

FIGURE 1. OPPOSITE REGULATORY EFFECT OF P53 ON PSA AND P21 PROMOTERS...... 14

FIGURE 2. A NOVEL INTEGRATED APPROACH TO IDENTIFICATION OF PROSTATE CANCER

MARKERS. EXPERIMENTAL PLAN...... 28

FIGURE 3. IN SILICO EXPRESSION PROFILES OBTAINED BY OUR SOFTWARE FOR KNOWN

HOUSEKEEPING GENES AND PROSTATE TUMOR MARKERS. X AXIS: TISSUE POOL

NAMES; Y AXIS: RELATIVE EXPRESSION IN AVGE UNITS (SEE TEXT FOR DETAILS). ....38

FIGURE 4. CONVERSION OF THE MOUSE GENES TO THEIR HUMAN HOMOLOGS: BI-

HUMANIZER PROGRAM SCHEMATICS...... 55

FIGURE 5. THE RELATIONSHIP BETWEEN GENE GROUPS UPREGULATED IN P53 KO (RED),

TRAMP (GREEN) AND PTEN+/- (BLUE) MOUSE MODELS...... 57

FIGURE 6. EXPRESSION OF CANDIDATE AND KNOWN PROSTATE MARKERS IN THE PANEL

OF CELL LINES...... 133

FIGURE 7. CANDIDATES’ EXPRESSION PROFILES IN THE CELL LINE COLLECTION BY RT-

PCR. L1R1, L2R2: PRIMER PAIRS FLANKING DIFFERENT OF THE SAME GENE,

SHOWN SEPARATELY WHERE EXPRESSION PATTERNS DIFFER, OTHERWISE TYPICAL

EXPRESSION PATTERN IS SHOWN...... 141

FIGURE 8. CANDIDATES’ EXPRESSION PROFILES IN THE CELL LINE COLLECTION,

DETECTED BY NORTHERN BLOTS...... 144

FIGURE 9. CANDIDATES’ EXPRESSION PROFILES IN HUMAN NORMAL TISSUES, DETECTED

BY NORTHERN BLOTS...... 145

FIGURE 10. RELATIVE EXPRESSION OF CANDIDATE GENES IN CELL LINES...... 153

FIGURE 11A. RELATIVE EXPRESSION OF CANDIDATE GENES IN PROSTATE TISSUE

SAMPLES...... 154

FIGURE 11B. RELATIVE EXPRESSION OF CANDIDATE GENES IN PROSTATE TISSUE

SAMPLES...... 155

FIGURE 11C. RELATIVE EXPRESSION OF CANDIDATE GENES IN PROSTATE TISSUE

SAMPLES...... 156

FIGURE 12. THE CHANGES IN RELATIVE EXPRESSION LEVELS OF CANDIDATES’ GENES

UPON ANDROGEN WITHDRAWAL (CSS) AND RECONSTITUTION (DHT). A. LNCAP

PROSTATE CANCER CELL LINE. B. CWR22R PROSTATE CANCER CELL LINE...... 159

FIGURE 13. EXPRESSION LEVELS OF CANDIDATES, PSA AND PSMA IN HUMAN TISSUE

SAMPLES...... 162

FIGURE 14. EXPRESSION PROFILES OF FOUR CANDIDATE MARKERS, PSA AND PSMA IN

HUMAN PROSTATE SAMPLES. A. UNSORTED. B. AVERAGE LINKAGE HIERARCHICAL

CLUSTERING (UNCENTERED PEARSON CORRELATION)...... 164

FIGURE 15. A NOVEL INTEGRATED APPROACH TO IDENTIFICATION OF PROSTATE CANCER

MARKERS. EXPERIMENTAL PLAN...... 167

FIGURE 16. THE SCHEME OF CUSTOM POLYCLONAL ANTIBODY GENERATION BY

PROTEINTECH GROUP INC...... 174

FIGURE 17: THE WESTERN BLOT WITH CUSTOM POLYCLONAL ANTIBODIES ON THE CELL

LINES COLLECTION (DESCRIBED ABOVE; PROSTATE CELL LINES ARE MARKED BOLD).

A. TWO ANTIBODIES AGAINST MGC13170. B. TWO ANTIBODIES AGAINST KIAA1181. C.

ANTI-KIAA1181B...... 175

FIGURE 18. SEQUENCE-BASED ANALYSIS OF KIAA1181 (SOURCE OF SEQUENCES IS NCBI

DATABASE). A. SEQUENCE ALIGNMENT OF TWO TRANSCRIPTS. B. THE STRUCTURE OF

THE GENE...... 178

FIGURE 19: EXPRESSION OF KIAA1181 TRANSCRIPTS AS BY REAL-TIME PCR. A. PANEL OF

CELL LINES (NORMALIZED TO CWR22R EXPRESSION). B. EXPRESSION PROFILES OF

THE CANDIDATE MARKERS, PSA AND PSMA IN HUMAN PROSTATE SAMPLES...... 180

FIGURE 20: EXPRESSION PROFILES OF THE CANDIDATE MARKERS (WITH THE ADDITION

OF 2ND TRANSCRIPT OF KIAA1181), PSA AND PSMA IN HUMAN PROSTATE SAMPLES...181

LIST OF TABLES

TABLE...... PAGE

I. COMMON CLINICAL USES OF SERUM TUMOR MARKERS...... 3

II. CANCER MARKERS COMMONLY USED IN CLINICAL SETTINGS...... 4

III. ATTRIBUTES OF AN IDEAL TUMOR MARKER ...... 18

IV. POTENTIAL PROSTATE CANCER MARKERS ...... 21

V. POTENTIAL TUMOR MARKERS IDENTIFIED BY OUR SOFTWARE: TUMOR-SPECIFIC

AND PROSTATE-SPECIFIC GENES...... 39

VI. POTENTIAL TUMOR MARKERS IDENTIFIED BY OUR SOFTWARE: GENES HIGHLY

SPECIFIC FOR PROSTATE CANCER...... 40

VII. KNOWN AND POTENTIAL TUMOR MARKERS IDENTIFIED BY IN SILICO PROFILING

APPROACH ...... 44

VIII. KNOWN AND POTENTIAL TUMOR MARKERS IDENTIFIED AS GENES POTENTIALLY

REPRESSED BY P53 AND PRB...... 59

IX. CDNA CLONES PRINTED ON PROSTATE MARKER ARRAY ...... 64

X. HUMAN CELL LINES USED IN MICROARRAY AND NORTHERN BLOT

HYBRIDIZATIONS...... 129

XI. THE CANDIDATE GENES...... 137

XII. RT-PCR PRIMERS AND NORTHERN PROBES FOR CANDIDATE GENES...... 140

XIII. TAQMAN GENE EXPRESSION ASSAYS USED IN REAL-TIME PCR STUDY...... 150

XIV. DIFFERENTIATION AMONG SAMPLE GROUPS BY PSA, PSMA AND FOUR CANDIDATE

MARKERS ...... 163

XV. THE CORRELATION BETWEEN GENE EXPRESSION PROFILES BASED ON REAL-TIME

PCR DATA; PROSTATE TISSUES ...... 164

XVI. THE PROPERTIES OF FOUR ADVANCED CANDIDATE GENES ...... 170

XVII. DIFFERENTIATION AMONG SAMPLE GROUPS BY KIAA1181 TRANSCRIPTS ...... 182

XVIII. THE CORRELATION BETWEEN GENE EXPRESSION PROFILES BASED ON REAL-TIME

PCR DATA (KIAA1181 2ND TRANSCRIPT DATA ADDED); PROSTATE TISSUES...... 182

XIX. EXAMPLES OF CANCER PROGRESSION-RELATED GENES WITH CANCER-SPECIFIC

ALTERNATIVE SPLICING...... 185

ACKNOWLEDGMENTS

I want to express my deepest appreciation and respect to Dr. Andrei Gudkov, my adviser. His continuous support, scientific enthusiasm, magnificent ideas and personal attention were crucial to my work and professional development.

I am also grateful to the members of my dissertation committee: Drs. Edward

Stavnezer, Ganes Sen, David Sedwick, and Hung-Ying Kao for their time and invaluable advice.

I wish to express my sincere gratitude to all my collaborators who made this study possible. First and foremost, I wish to thank Dr. Vadim Krivokrysenko, who was of paramount importance to the work included in this thesis. I want to thank Drs. Katerina

Gurova, Elena Feinstein, Olga Chernova, and Elena Komarova, Ivanda Pavlovska and

Natasha Tararova for their contributions in work and advice.

I would like to thank all members of Dr. Gudkov’s lab, past and present, for their continuing scientific and personal support and just for the sheer joy of their company.

Finally, I would like to extend my thanks to all members of the Department of

Biochemistry at Case Western Reserve University, and all members of the Department of

Molecular Genetics at Lerner Research Institute (Cleveland Clinic) for the assistance, education and friendship they offered.

LIST OF ABBREVIATIONS

AFP Alpha-fetoprotein

BPH Benign prostate hyperplasia

CA Cancer antigen

CI Confidence interval

CDK Cyclin-dependent kinase

cDNA Copy deoxyribonucleic acid

CEA Carcinoembryonic antigen

CSS Charcoal-stripped serum

DDD Digital differential display

DHT Dihydrotestosterone

DMEM Dulbecco's modified Eagle's medium

DRE Digital rectal examination

ECM Extracellular matrix

EGF Epidermal growth factor

EP Electronic profile

EST Expressed sequence tag

FBS Fetal bovine serum

FDA Food and Drug Administration

FOHL Folate hydrolase

FSH Follicle-stimulating hormone

GBA Guilt by association

GEGF Gene Expression and Genotyping Facility

hCG Human chorionic gonadotropin

IGF Insulin-like growth factor

IGFR Insulin-like growth factor receptor

KO Knock-out

LDL Low-density lipoprotein

MIA Melanoma inhibitory activity

NAALADase N-Acetylated-alpha-linked-acidic-dipeptidase

NT Nucleotide

PAP Prostatic acid phosphatase

PCa Prostate cancer

PSA Prostate-specific antigen

PSMA Prostate-specific membrane antigen

PCR Polymerase chain reaction

RNA Ribonucleic acid

RT-PCR Reverse transcriptase polymerase chain reaction

VEGFR Vascular endothelial growth factor receptor

Novel Approach to Diagnostic Markers in Prostate Cancer

Abstract

by

INNA SHYSHYNOVA

Prostate cancer is the most common malignancy and the second leading cause of cancer deaths in men. The effectiveness of its cure depends on a timely and accurate diagnosis of the disease as well as application of appropriate therapeutic strategies. We designed an integrated program aimed at identification of prospective marker genes based on their expression properties and applied it to prostate cancer analysis. Our strategy employs in silico expression profiling (an advanced version of human EST database mining) and microarray-based analysis of genes under negative regulation of tumor suppressor genes. A combined list of prospective candidate genes was assembled, which contained a number of known cancer markers, thus demonstrating the feasibility of the methodology used. Expression profiles of other candidates were characterized in a large set of tumor cell lines and tumor samples using custom cDNA microarray hybridization, RT-PCR and Northern blots. KIAA1181, MAL2, MGC13170, and FLJ30428, our most promising candidates, were investigated by real-time PCR in prostate cancer and normal tissue samples. KIAA1181 and MAL2 were capable of distinguishing between cancerous and cancer-free prostate as effectively as PSA or PSMA. This approach, proven feasible by the present study, can now be extended to marker identification of virtually any cancer or disease.

I. INTRODUCTION

1. Tumor markers

What do people die of? According to mortality statistics, heart diseases are the leading cause of death, with cancer second. However, cancer is the leading cause of death in Americans younger than 85 [1]. Cancer is a progressive disease, which at advanced stages becomes incurable. The effectiveness of cancer treatment depends on a timely and accurate diagnosis as well as application of appropriate therapeutic strategies.

In past decades tumor markers have become recognized as a valuable diagnostic and monitoring tool in cancer treatment, with various clinical applications (Table I) [2]. Serum markers (i.e. molecules that can be detected in a blood sample by immunochemistry) in solid tumors are mostly used for monitoring recurrence of cancer during or after treatment; tissue-based markers (such as those detected by amplification techniques, e.g., estrogen and progesterone receptors or HER-2) are primarily measured to determine prognosis and predict response to therapy [3]. However, nearly all of the tumor markers can be detected in serum samples of patients [4].

Most current tumor marker tests have very limited sensitivity (i.e. the ability to distinguish cancer from healthy tissue) and/or specificity (the ability to pinpoint the tissue origin of the tumor). Tumor markers are generally detectable in all healthy individuals, so it is not their presence in serum but rather their elevated quantity and its kinetics that are indicative of the presence or progression of cancer. Decisions that take into account the concentrations of these cancer-associated molecules should only be made in the context of the entire clinical picture.

The National Academy of Clinical Biochemistry Tumor Marker Practice Guidelines list approximately 150 analytes and methods that are currently under investigation for their utility as markers for various cancers. However, only a handful of those have been approved by the Food and Drug Administration (FDA) for use as tumor markers, with limited usage indications (Table II) [2]. With the exception of prostate-specific antigen (PSA), a secreted marker of prostate cancer, tumor markers do not have sufficient sensitivity or specificity for use in screening of the general asymptomatic population [4]. One of the major goals in cancer research is to find markers that are significantly more sensitive and specific for early cancer detection, as well as for the other uses.

TABLE I

COMMON CLINICAL USES OF SERUM TUMOR MARKERS

Utility | Example

Diagnosis | While not used solely for this purpose, markers can aid in making a diagnosis and in locating the source of cancers that have metastasized.

Monitoring for recurrence | After a patient has been successfully treated, some markers are tested at regular intervals to indicate whether there has been a recurrence of the cancer.

Prognosis and staging | To aid in the estimation of tumor volume, as an indicator of disease progression and aggressiveness, or as an indication of metastatic involvement.

Detection of residual disease | After cancer surgery, testing can be used to indicate whether the entire tumor burden has been successfully removed.

Screening | Used to test patients without symptoms. With the exception of PSA, the screening population is usually reserved for those individuals at high risk for a given cancer (e.g., genetically-linked cancers).

Monitoring treatment | A means to assess the success of treatment by monitoring a patient's response to various treatment regimens; in general, levels will drop if treatment is beneficial and will remain elevated or increase if it is ineffective.

TABLE II

CANCER MARKERS COMMONLY USED IN CLINICAL SETTINGS

Cancer Marker | Primarily in cancers | Description | Utility

AFP (alpha-fetoprotein) | Liver and germ cell cancer of ovaries or testes | Also elevated during pregnancy | Detection

CA 125 (cancer antigen 125) | Ovarian and endometrial | Elevated in ~50% of patients with stage I disease, increasing to above 90% in advanced stages of disease | Monitoring therapy and recurrence

CA 15-3 and CA 27.29 (cancer antigens 15-3 and 27.29) | Breast | Measure different epitopes of the same antigen (MUC-1) | Monitoring therapy and recurrence

CA 19-9 (cancer antigen 19-9) | Pancreatic | Relevant only in Lewis blood group antigen-positive individuals (80-95% of population) | Staging disease, monitoring therapy and recurrence

CEA (carcinoembryonic antigen) | Colorectal, and breast | Elevated in other conditions such as hepatitis, COPD, colitis, and in cigarette smokers | Monitoring therapy

hCG (human chorionic gonadotropin), beta | Testicular and trophoblastic | Not approved by FDA as tumor marker; used in pregnancy tests | Detection and monitoring therapy

PSA (prostate specific antigen), total and free | Prostate | One of few organ-specific markers; not cancer-specific; considered promising and near-ideal prostate cancer marker | Screening, monitoring therapy and recurrence

2. The prostate

The prostate gland is a part of the male reproductive system. Its main function is to liquefy the semen, which is necessary for male fertility. A normal prostatic gland is composed of three distinct cell types: secretory cells, basal cells, and neuroendocrine cells. The predominant secretory cells are characterized by expression of androgen receptor (AR), cytokeratins 8 and 18, CD57, and PSA. Basal cells, located on the basement membrane, express cytokeratins 5 and 14, CD44, and low levels of AR [5, 6].

Neuroendocrine cells, in contrast, are androgen-independent and express and a variety of hormones [7].

Benign prostatic hyperplasia (BPH) is a very common prostate disorder in men over 50. It is a progressive disease characterized by increased prostate size, which results in urinary retention and, in consequence, prostate surgery [8]. Another common disorder of the prostate is prostatitis, an inflammation of the prostate gland caused by chronic or acute infection.

The most likely precursor of prostate cancer is prostatic intraepithelial neoplasia (PIN) [7]. Most men diagnosed with PIN develop prostate cancer within 10 years [9, 10].

PIN is characterized by the disruption of basal membrane and progressive accumulation of chromosomal abnormalities [11-13]. The only method of detection is biopsy; PIN does not significantly elevate serum PSA concentration and cannot be detected by current imaging techniques [10, 14].

3. Prostate cancer

Prostate cancer (PCa) is the most common malignancy and the second leading cause of cancer-related deaths in men [15]. For 2006, the American Cancer Society estimates over 234,000 new cases of prostate cancer and 27,000 deaths in the United States [16]. According to autopsy studies, approximately 29% of men have microscopic evidence of PCa at ages 30-39, which increases to 65% by age 70 [17]; the risk of prostate cancer increases with age and varies with family history and race. But even without initial treatment, only a small proportion of all patients with cancer diagnosed at an early clinical stage die from prostate cancer within 10 to 15 years following diagnosis [18, 19]. Because of this discrepancy between the high prevalence of PCa and its relatively low mortality, an ongoing debate surrounds the optimal treatment, especially in view of increased life expectancy [20, 21]. Regardless of outcome, early detection of PCa is essential (see below).

The etiology of prostate cancer is not fully understood despite the significant advances made elucidating its molecular pathology. No single specific candidate gene or combination of genes has been definitively assigned as responsible for PCa tumorigenesis

[22]. This could be explained in part by unique heterogeneity of prostate cancer, which is not only histologically heterogeneous and multifocal but also genetically multicentric

[23-25].

Morphologically, the normal prostate is composed of three zones and a fibromuscular stromal component. The majority of PCa foci (about 70%) arise in the peripheral zone, 10–20% in the transition zone, and only 5–10% occur in the central zone

[26].

3.1 Diagnosis of prostate cancer

Early and accurate detection of prostate cancer offers the best hope of cure for the disease [27]. The American Cancer Society and the American Urological Association recommend annual examinations for prostate cancer for men at risk (men over 50; men over 40 with a family history of prostate cancer; or men of African-American ancestry).

This is currently done by two procedures. The first is digital rectal examination (DRE), which has little value as a screening test [28, 29]. The main test on which developments in managing prostate cancer depend is the serum concentration of PSA. PSA levels vary with age, prostate size, and the presence of PCa, but can also be raised after ejaculation, prostate biopsy, surgery, or prostatitis. Abnormalities detected by DRE or elevated PSA levels are indications for a biopsy [21]. The traditional cut-point for PSA concentration is 4.0 ng/mL [30]. At present, biopsy is the only method capable of a definite diagnosis of prostate cancer. Due to its multifocal nature, PCa is sometimes difficult to pinpoint; sextant prostate biopsies miss prostate cancer at least 20% of the time, and 12-core biopsy only modestly enhances the cancer detection rate [25, 31, 32]. Moreover, biopsy in itself carries a risk of complications, commonly discomfort and bleeding, and more rarely sepsis [21].

3.2 Prostate cancer treatment strategies

Prostate cancer may include tumors with moderate or full differentiation that could progress rather slowly, and tumors with poor differentiation that could grow rapidly and spread beyond the confines of the prostate. Several strategies are available to treat prostate cancer: surgery (including radical prostatectomy), radiotherapy, hormonal therapy, or combinations of the above. Another option is watchful waiting, when no treatment is administered but the patient remains under close observation.

The Gleason grading system in biopsy or prostatectomy specimens is a measure of biological aggressiveness, and it is currently the best correlate of final pathological stage and subsequent clinical outcome [33, 34]. At early localized stages of PCa, either watchful waiting or surgery/radiotherapy is recommended. Hormonal therapy (androgen ablation) remains the treatment of choice for patients with advanced inoperable prostate cancer [35-37].

Androgen ablation or castration results in rapid prostate atrophy within a few days due to massive apoptosis of prostate epithelial cells [38]. Apoptotic regression of an androgen-dependent tumor can be induced by any procedure that reduces intracellular concentration of dihydrotestosterone by 80% or more [39]. Approximately 70 to 80% of treated patients with metastatic disease will experience symptomatic relief, i.e. reduced bone pain, better performance status, and a general improvement with increased sense of well-being following androgen ablation [36]. Only about 6% of all prostate cancer cases do not respond to androgen deprivation [40, 41].

Hormonal therapy of PCa can be achieved by surgical removal of the testes, by inhibition of pituitary gonadotropin by GnRH agonists or antagonists, by estrogens, or by antiandrogens. Two types of antiandrogens exist: steroidal types, such as cyproterone acetate, and non-steroidal types, such as flutamide, bicalutamide and nilutamide [42]. An estimated 20-40% of men experience a biochemical recurrence (rising PSA levels) within 10 years of definitive prostate cancer treatment [43]. Even though hormonal therapy is not biologically curative, it is capable of delaying death from prostate cancer long enough for a patient to die of unrelated causes [42, 44]. Secondary hormonal therapy strategies are available for androgen-independent prostate cancer that fails to respond to androgen ablation therapy (reviewed in [45]).

Early hormonal therapy improves survival when compared to the same therapy deferred until clinical progression of PCa (i.e. appearance of metastases) [46]; however, the adverse effects and morbidity of long-term therapy must not be underestimated.

Every type of endocrine treatment carries adverse effects, which influence quality of life in different ways. This group of symptoms (referred to as the castration syndrome) includes loss of libido and erectile function, hot flushes, and, in the longer term, osteoporosis, anemia, decrease in muscular strength, fatigue, a decline in physical activity and general vitality, mood changes and depression. These effects on quality of life should be considered when planning to start a castrative treatment in patients who will need it for a long time due to slow progression of the disease [37, 43].

4. Prostatic Acid Phosphatase (PAP)

PAP (prostatic acid phosphatase) was first identified in 1935 [47, 48]. It became

the first candidate marker for the diagnosis of prostate cancer when its high concentrations in human serum and prostatic tissue were linked to primary and metastatic prostate cancer, while reductions in serum PAP levels were found to occur in response to androgen ablation therapy [49, 50]. However, whereas serum PAP levels were elevated in a significant number of men with metastatic disease, fewer than 20% of men with localized prostate cancer exhibited abnormal levels [51].

The necessity of meticulous sample collection and preparation due to contaminating sources of PAP from and leukocytes [52] and the rapid loss of

PAP activity at room temperature hampered the utility of PAP as a prostate cancer marker. Development of a radioimmune assay for PAP in 1975 provided some improvement in test sensitivity [53], but the sensitivity levels were still inadequate for detection of early-stage disease where cure is more likely [48, 54].

5. Prostate-Specific Antigen (PSA)

PSA, or kallikrein-3 (KLK3), was first discovered in 1971 as an antigen specific to normal and diseased human prostate tissue and practically undetectable in the rest of the organism [55, 56]. PSA belongs to the family of kallikreins (KLKs), consisting of serine proteases. PSA is a 33-kDa glycoprotein, normally secreted into the seminal fluid by luminal epithelial cells of the prostatic ducts. The physiological function of PSA is to digest semenogelin I and II, thus liquefying the seminal fluid. The human KLK3 gene is mapped to 19q13 [57] and its expression is tightly dependent on androgen levels. Several kallikrein family-related genes are present in human and mouse genomes; however, the mouse ortholog of KLK3/PSA does not exist [57, 58].

A sensitive and quantitative ELISA-based immunoassay was developed shortly after the discovery of PSA [59]. The pioneering work of Catalona et al. [60] and many subsequent investigations [61, 62] introduced PSA as a valuable diagnostic tool for early detection and monitoring of PCa.

5.1 PSA as prostate cancer marker and its advantages

PSA is presently the only prostate cancer marker approved by the FDA and is frequently described as the most nearly ideal tumor marker available [2, 30]. PSA is the only tumor marker that has sufficient sensitivity and specificity for the screening of the general asymptomatic population [4]. Since the introduction of PSA as a detection and monitoring tool, PCa progression can be diagnosed well before metastases are present, advancing clinical diagnosis by a mean of 10 years [37, 62].

Monitoring the level of prostate-specific antigen (PSA) has created a dramatic shift in the population of patients in whom endocrine treatment is initiated. Patients with recurrent prostate cancer after the failure of local therapy are now diagnosed with recurrence on the basis of a rising PSA level [37]. Watchful waiting, as well, has evolved into active surveillance, using changes in PSA levels and their kinetics to determine when a repeat biopsy or definitive therapy is needed instead of just waiting for metastases to appear [63].

5.2 Drawbacks of PSA-based diagnostics of prostate cancer

First, while observed in most prostate cancer patients, a moderate rise of PSA levels in blood is most likely to be associated with other conditions, such as benign prostate hyperplasia (BPH), prostatitis, or recent ejaculation [21]. PSA is a prostate-specific marker, not a PCa-specific one. About 50% of men with BPH have PSA levels above the threshold of 4.0 ng/mL. As a result, elevated PSA levels lead to numerous false positives and unnecessary biopsies. As many as 75% of men who undergo biopsies due to an elevated PSA do not have prostate cancer [32, 64].

Second, the utility of PSA as a PCa marker is further complicated by the fact that PSA levels are significantly affected by such factors as race and age. The reference level for tumor markers is determined by the 95th percentile of an apparently healthy group [30]. Traditionally the upper reference limit for PSA is 4.0 ng/mL; however, the concentration of PSA at the 95th percentile for white, apparently healthy 40-, 50-, 60-, 70- and 80-year-old men is 2.0, 2.7, 3.8, 5.4, and 7.4 ng/mL, respectively [65]. African-Americans have significantly higher PSA levels than similarly aged white males [66], and there are differences in Latinos and Asians as well [67].
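The effect of age on the interpretation of a single PSA value can be illustrated with a short Python sketch. It uses only the 95th-percentile values quoted above [65] and the traditional 4.0 ng/mL cut-point [30]; the function names and the nearest-age lookup rule are assumptions made for illustration and are not part of any clinical guideline or of the methods used in this study.

```python
# 95th-percentile PSA values (ng/mL) quoted in the text for white,
# apparently healthy men [65]; keys are ages in years.
AGE_SPECIFIC_P95 = {40: 2.0, 50: 2.7, 60: 3.8, 70: 5.4, 80: 7.4}
TRADITIONAL_CUTOFF = 4.0  # ng/mL, the traditional single cut-point [30]


def age_specific_limit(age: int) -> float:
    """Return the quoted limit for the closest listed age.

    The nearest-age rule is a hypothetical simplification for this sketch.
    """
    nearest = min(AGE_SPECIFIC_P95, key=lambda listed: abs(listed - age))
    return AGE_SPECIFIC_P95[nearest]


def flag_psa(psa: float, age: int) -> dict:
    """Compare one PSA value against both kinds of reference limit."""
    return {
        "above_traditional": psa > TRADITIONAL_CUTOFF,
        "above_age_specific": psa > age_specific_limit(age),
    }


# 3.0 ng/mL is below 4.0 but above the limit quoted for 40-year-olds.
print(flag_psa(3.0, 40))
# 5.0 ng/mL is above 4.0 but within the limit quoted for 80-year-olds.
print(flag_psa(5.0, 80))
```

The two calls show the same threshold giving opposite answers at the two ends of the age range, which is the difficulty described in the paragraph above.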

Third, unlike many other tumor markers, PSA levels do not correlate with the progression of PCa and cannot be used to stage PCa in individual patients [30, 68]. No single PSA value is invariably associated with clinical metastasis or cancer-specific survival [43].

Moreover, the levels of PSA immunoreactivity decrease with tumor grade due to cellular de-differentiation [69]. Androgen ablation therapy or progression of tumor to androgen-independent stage might cause the drop or loss of PSA expression. That makes

PSA unreliable for the post-therapeutic prognosis and the monitoring of advanced and especially metastatic disease.

Finally, as many as two thirds of prostate cancers are missed at PSA levels below 4.0 ng/mL, including advanced metastatic cases [20, 29, 70-73]. These data suggest that even such an ideal marker as PSA is not sufficient and additional PCa markers are needed (see below).

5.3 PSA is transcriptionally repressed by p53

Expression of the PSA gene was demonstrated to be directly regulated by binding of activated androgen receptor (AR) [74-76] to three androgen responsive elements (AREs) identified within the PSA promoter, as well as to the enhancer ~4 kb upstream [77-80]. However, transcriptional regulation of the PSA gene is not limited to androgens [81].

The contribution of our laboratory to this scientific effort and our strategic advantage is the finding that PSA is under the negative transcriptional control of p53 [82].

PSA mRNA levels increased 4-fold in the prostatic adenocarcinoma cell line LNCaP upon suppression of p53 by GSE56, a potent dominant-negative p53 mutant [83, 84]. Wild type p53 strongly suppressed, while GSE56 stimulated, PSA promoter-driven transcription and secretion of PSA into culture medium. On a transcriptional level the effect was opposite for the CDKN1A/p21 gene (Figure 1) [82].

Transcriptional repression by p53 partially explains why PSA gets activated in prostate tumors (since the p53 pathway is inhibited, at least in part, in any malignant tumor). We propose that other tumor markers can be found among genes that are under negative control of p53 and other tumor suppressors that are frequently deregulated in prostate cancer, thus providing a novel rational strategy for a cancer markers search (see below).


Figure 1. Opposite regulatory effect of p53 on PSA and p21 promoters.

NOTE: Bars reflect relative CAT activity in lysates of LNCaP cells transiently transfected with either PSA-CAT (upper panel) or p21-CAT (lower panel) constructs in combination with the indicated plasmids. Results are normalized according to transfection efficiency and CAT expression in control cells transfected with insert-free vector. wt – plasmid pLp53SN expressing wild type human p53 cDNA; GSE – plasmid pLGSE56SN expressing GSE56. (1) and (2) indicate plasmid concentration in micrograms. The experiment was repeated three times and showed similar results with variations in relative CAT activity values less than 20% [82].


6. Prostate-Specific Membrane Antigen (PSMA)

PSMA (FOLH1/PSMA) is a membrane-bound glycoprotein that shares with and the M28 family of cocatalytic and is mapped to the chromosome 11p11.2 [85-87]. It possesses two unique enzymatic functions, folate hydrolase (FOLH) and N-Acetylated-Alpha-Linked-Acidic-Dipeptidase (NAALADase) [88, 89]. PSMA was originally defined by the monoclonal antibody 7E11 derived from immunization with the LNCaP cell line [90]. It is expressed in prostatic secretory epithelium and often overexpressed in prostate cancer [91, 92]. PSMA was also found in the neovasculature of most solid tumors, but not in the vasculature of normal tissues [91, 93].

Unlike PSA, PSMA expression is inversely regulated by androgens [94]. As a result, PSMA is significantly increased in both primary and metastatic tumors [95]; furthermore, expression of PSMA increases sharply in proportion to tumor aggressiveness [91, 96-98]. These unique expression properties of PSMA make it an important marker, as well as an attractive target for therapeutic intervention and imaging techniques [87, 99]. The major weakness of PSMA as a clinical marker for early diagnosis is that elevated serum levels have been observed in healthy males (increasing with age) [100] and females and in the serum of patients [101].

7. The search for novel markers in prostate cancer

Given the drawbacks of PSA described above, there is a strong need to identify additional molecular markers of prostate cancer that would supplement PSA in early detection and be suitable for other diagnostic tasks that PSA cannot fulfill. Such tasks include more accurate and sensitive diagnostics; reliable monitoring of the disease (especially at the late stages); differential diagnostics/prognosis of prostate cancer subclasses, and prediction of tumor responsiveness to the treatment.

7.1 The attributes of an ideal tumor marker

If an ideal tumor marker existed, it would perform all of the following functions: aiding cancer diagnosis, prognosis and staging, screening, monitoring treatment and recurrence, and detection of residual disease (Table I). Such a tumor marker would be detectable only when malignancy is present; be specific for the type and site of malignancy; correlate with the amount of malignant tissue present, and respond rapidly to a change in tumor size (Table III). While great efforts are being put forth to find such markers, at the present time no analyte comes close to fulfilling these criteria. However, a few tumor markers perform well in one or more of the capacities described above. Even though there are serious limitations in their utility (mainly sensitivity and specificity), some of the tumor markers do an adequate job with early detection of recurrence or as a means to monitor the efficacy of treatment [30, 102].

Screening tests require high sensitivity to detect early-stage disease, as well as sufficient specificity to protect patients with false-positive results from unnecessary diagnostic evaluations (such as biopsy) [4]. A good marker would also identify the malignancy that is likely to rapidly progress, thus avoiding intervention for indolent disease. PSA is the only tumor marker approved by the FDA for screening of the general population, but it is still controversial and is responsible for both high false-positive [21, 32, 64] and high false-negative rates [20, 29, 70-73]. At the present time no tumor marker (including PSA) has demonstrated a survival benefit in randomized controlled trials of screening in the general population [4, 21].


A good tumor marker would detect only one type of cancer, thereby pinpointing the origin of the disease. Most currently-used tumor markers detect cancers of more than one cancer type (Table II) and are too nonspecific to establish the origin of the tumor [4].

Even PSA expression is not limited to the prostate gland as was once thought; it is also expressed in the and salivary glands [103] as well as breast and periurethral glands in women [104].

The reference interval of tumor markers is another source of controversy. It is usually determined by the 95th percentile of an apparently healthy patient cohort [30].

However, many tumor markers (including PSA, which is often described as “the most ideal tumor marker available”) change due to age or other factors not directly related to cancer [105]. In addition, there are small differences between manufacturers of assay kits that may also influence reference interval determinations [30].

Finally, the levels of a good tumor marker are proportional to the tumor burden, which allows monitoring the effectiveness of treatment. If the treatment is successful, tumor marker levels drop, whereas if the patient is not responding favorably to the therapy, levels may stay flat or actually increase. However, due to the heterogeneity of cancer it is possible for the tumor burden to increase while a particular tumor marker is decreasing. The concentrations might be misleading, and low or decreasing levels do not always indicate patient improvement [30].


TABLE III

ATTRIBUTES OF AN IDEAL TUMOR MARKER

Characteristic | Example
Tumor-specific (no overlap between health and disease) | Released exclusively from organ-specific tumor tissue, but absent or below a threshold in health
Type-specific | Marker is only elevated in response to one cancer type
Detection of early cancer and early prediction of recurrence | High sensitivity: marker is produced at high levels, elevated in all patients with cancer, and detectable before clinical symptoms become apparent
Concentration change relative to total tumor burden | If a tumor grows or gets smaller, this should be reflected by a corresponding increase or decrease in tumor marker levels
Easily measured and cost-effective | Relatively easy to measure with common instrumentation from minimally invasive sampling (i.e., blood or other body fluids)


7.2 Methods of prostate tumor markers search

There is a massive ongoing scientific endeavor to identify novel genes that may serve as markers in prostate cancer. The first few PCa markers were discovered by laborious immunological and enzymological methods [49, 56, 90]. Today, the list of potential prostate tumor markers is rapidly growing thanks to several high-throughput approaches to molecular analysis that allow searching for novel markers in a more efficient and comprehensive manner [106]. The most notable techniques include microarray hybridization experiments [27, 107-113], expression database mining [114-121] and serial analysis of gene expression (SAGE) [122-126]. Recently, with the advent of high-throughput microsequencing and , proteomic approaches became more feasible and were applied to the identification of proteins upregulated in prostate cancer [123, 127, 128].

So far, most high-throughput methods applied to the identification of new PCa markers involve a direct comparison of gene expression levels between large sets of prostate tumor samples (or cell lines) and normal prostate tissue. Genes significantly overexpressed in a malignant prostate are viewed as potential prostate tumor markers or, if they are proven to contribute to the transformed phenotype, as therapeutic targets.

While seemingly straightforward, this approach is hampered by several problems. Variability of gene expression between patients and tumors in cases of prostate cancer is further compounded by the high spatial variability of the tumor: a relatively small biopsy core sample almost always contains stroma, normal epithelium, and multiple tumor foci of varying grades. This spatial variation leads to a low concentration of tumor material in the sample, as well as contamination with normal and necrotic tissue. High equipment and labor costs are also obstacles.

7.3 Potential PCa markers

A handful of potential prostate cancer markers were discovered (Table IV) [48]; however, the utility of molecular markers, other than PSA, in prostate cancer diagnostics is limited (mostly due to limited sensitivity and specificity).

TABLE IV

POTENTIAL PROSTATE CANCER MARKERS

Marker | Chromosome | Molecular weight of a protein (kDa) | Subcellular location | Biochemical function | Biological/cellular function
A2M | 12p13.3-12.3 | 163 | Secreted | Protease inhibitor | Protein carrier
Akt-1 | 14q32.32 | 56 | Nucleus/cytoplasm | Protein kinase | Apoptotic inhibition
AMACR | 5p13.2-q11 | 42 | Mitochondria/peroxisome | Racemase | Stereoisomerization
2 | 1q21 | 11 | Plasma membrane | Calcium and lipid binding | Membrane trafficking
Bax | 19q13.3-.4 | 21 | Cytoplasm/membrane | Bcl-2 binding | Apoptosis
Bcl-2 | 18q21.3 | 26 | Mitochondrial membrane | Membrane permeability | Apoptosis
Cadherin-1 | 16q22.1 | 97 | Plasma membrane | / binding | Cell adhesion
8 | 2q33-34 | 55 | Cytoplasm | Protease | Apoptosis
Catenin | 5q31 | 100 | | Cadherin binding | Cell adhesion
Cav-1 | 7q31.1 | 20 | Plasma membrane | Scaffolding | Endocytosis/signaling
CD34 | 1q32 | 41 | Plasma membrane | Scaffolding | Cell adhesion
CD44 | 11p13 | 82 | Plasma membrane | Hyaluronate binding | Cell adhesion
Clar1 | 19q13.3-.4 | 34 | Nucleus | SH3 binding | Unknown
Cox-2 | 1q25.2-.3 | 69 | Microsomal membrane | synthase | Inflammatory response
CTSB | 8p23.1 | 38 | | Protease | Protein turnover
Cyclin D1 | 11q13 | 34 | Nucleus | CDKb regulation |
DD3 | 9q21-22 | 0 | Nucleus/cytoplasm | Noncoding | Unknown
DRG-1 | 22q12.2 | 43 | Cytoplasm | GTP binding | /differentiation
EGFR | 7p12 | 134 | Plasma membrane | EGF binding | Signaling
EphA2 | 1p36 | 11 | Plasma membrane | kinase | Signaling
ERGL | 15q22-23 | 57 | Plasma membrane | Lectin/mannose binding | Unknown
ETK/BMK | Xp22.2 | 78 | Cytoplasm | Tyrosine kinase | Signaling
EZH2 | 7q36.1 | 85 | Nucleus | Transcription repressor | Homeotic gene regulation
Fas | 11q13.3 | 23 | Plasma membrane | Caspase recruitment | Apoptosis
GDEP | 4q21.1 | 4 | Unknown | Unknown | Unknown
GRN-A | 14q32 | 50 | Secretory granules | Statin | Endocrine function
GRP78 | 9q33.3 | 72 | | Multimeric protein assembly | Cell stress response
GSTP1 | 11q13 | 23 | Cytoplasm | Glutathione reduction | DNA protection
Hepsin | 19q11-13.2 | 45 | Plasma membrane | | Cell growth/morphology
Her-2/Neu | 17q21.1 | 138 | Plasma membrane | Tyrosine kinase | Signaling
 | 7q11.23 | 23 | Cytoplasm | | Cell stress response
 | 6p21.3 | 70 | Cytoplasm | Chaperone | Cell stress response
 | 11q13 | 63 | Cytoplasm | Chaperone | Cell stress response
Id-1 | 20q11.1 | 16 | Nucleus | | Differentiation regulator
IGF-1 | 12q22-23 | 17 | Secreted | IGFR | Signaling
IGF-2 | 11p15.5 | 20 | Secreted | IGFR ligand | Signaling
IGFBP-2 | 2q33-34 | 35 | Secreted | IGF binding | Signaling
IGFBP-3 | 7p13-12 | 32 | Secreted | IGF binding | Signaling/apoptosis
IL-6 | 7p15.3 | 24 | Secreted | | B-cell differentiation
IL-8 | 4q13.3 | 11 | Secreted | Cytokine | activation
KAI1 | 11p11.2 | 30 | Plasma membrane | CD4/CD8 binding | Signaling
Ki67 | 10q25-ter | 358 | Nucleus | Nuclear matrix associated | Cell proliferation
KLF6 | 10p15 | 32 | Nucleus | Transcription factor | B-cell development
KLK2 | 19q13.41 | 29 | Secreted | Protease | Met-Lys/Ser-Arg cleavage
Maspin | 18q21.3 | 42 | Extracellular | Protease inhibitor | Cell invasion suppressor
MSR1 | 8p22 | 50 | Plasma membrane | LDL receptor | Endocytosis
MXI1 | 10q25.2 | 26 | Nucleus | Transcription factor | suppression
MYC | 8q24.12-.13 | 49 | Nucleus | Transcription factor | Cell proliferation
NF-kappaB | 10q24 | 97 | Nucleus | Transcription factor |
NKX3.1 | 8p21 | 26 | Nucleus | Transcription factor | Cell proliferation
OPN | 4q22.1 | 35 | Secreted | Integrin binding | Cell-matrix interaction
p16 | 9p21 | 17 | Nucleus | CDK inhibitor | Cell cycle
p21 | 6p21.2 | 18 | Nucleus | CDK inhibitor | Cell cycle
p27 | 12p13.1-12 | 22 | Nucleus | CDK inhibitor | Cell cycle
p53 | 17p13.1 | 44 | Nucleus | Transcription factor | Growth arrest/apoptosis
PAP | 3q21-23 | 45 | Secreted | Tyrosine phosphatase | Signaling
PART-1 | 5q12.1 | 7 | Nucleus/cytoplasm | Unknown | Unknown
PATE | 11q24.2 | 14 | Plasma membrane | Unknown | Unknown
PC-1 | 5q35 | 32 | Nucleus | RNA binding | Ribosome transport
PCGEM1 | 2q32 | 0 | Nucleus/cytoplasm | Noncoding | Cell proliferation/survival
PCTA-1 | 1q42-43 | 36 | Cytoplasm | Unknown | Cell adhesion
PDEF | 6p21.31 | 38 | Nucleus | Transcription factor | PSA promoter binding
PI3K p85 | 5q12-13 | 84 | Cytoplasm | Lipid kinase | Signaling
PI3K p110 | 1p36.2 | 120 | Cytoplasm | Lipid kinase | Signaling
PIM-1 | 6p21.2 | 36 | Cytoplasm | Protein kinase | Cell differentiation/survival
PMEPA-1 | 20q13.31-33 | 32 | Plasma membrane | NEDD4 binding | Growth regulation
PRAC | 17q21.3 | 6 | Nucleus | Choline/ethanolamine kinase | Unknown
Prostase | 19q13.3-.4 | 27 | Secreted | Serine protease | ECM degradation
Prostasin | 16p11.2 | 36 | Plasma membrane | Serine protease | Cell invasion suppressor
Prostein | 1q32.1 | 60 | Plasma membrane | Unknown | Unknown
PSA | 19q13.3-.4 | 71 | Secreted | Protease | Semen liquefaction
PSCA | 8q24.2 | 13 | Plasma membrane | Unknown | Unknown
PSDR1 | 14q23-24.3 | 35 | Nucleus/cytoplasm | Dehydrogenase reductase |
PSGR | 11p15 | 35 | Plasma membrane | Odorant receptor | Unknown
PSMA | 11p11.2 | 84 | Plasma membrane | Folate hydrolase | Cell stress response
PSP94 | 10q11.23 | 13 | Secreted | FSH inhibitor | Growth inhibition
PTEN | 10q23.3 | 47 | Cytoplasm | Protein/lipid phosphatase | Signaling
RASSF1 | 3p21.31 | 33 | Cytoplasm | Ras binding | Signaling
RB1 | 13q14.2 | 106 | Nucleus | -1 inactivation | Cell cycle
RNAseL | 1q25.3 | 84 | Cytoplasm/mitochondria | RNAse | Viral resistance
RTVP-1 | 12q21.1 | 29 | Plasma membrane | Unknown | Immune response/apoptosis
ST7 | 7q31.2 | 60/85 | Plasma membrane | Unknown | Cell proliferation
STEAP | 7q21.23 | 40 | Plasma membrane | Unknown | Unknown
TERT | 5p15.33 | 127 | Nucleus | Reverse transcriptase | Telomere synthesis
TIMP 1 | Xp11.3-.23 | 23 | Secreted | Protease inhibitor | Cell adhesion
TIMP 2 | 17q25 | 24 | Secreted | Protease inhibitor | Cell adhesion
TMPRSS2 | 21q22.3 | 54 | Plasma membrane | Serine protease | Unknown
TRPM2 | 8p21-12 | 52 | Plasma membrane | Calcium channel | Ion flux
Trp-p8 | 2q37.1 | 120 | Plasma membrane | Calcium channel | Ion flux
UROC28 | 6q23.3 | 17 | Nucleus/cytoplasm | Choline/ethanolamine kinase | Unknown
VEGF | 6p12 | 27 | Secreted | VEGFR binding |

To gain FDA approval and become clinical prostate cancer markers, the candidates need to pass clinical evaluation. This is a costly, labor-intensive, and time-consuming endeavor, so only a handful of the potential PCa markers from Table IV can be reasonably expected to undergo such development. The criteria for choosing the best candidates were suggested by Tricoli et al. [48] in the following manner. First, there should be a biological or therapeutic rationale for choosing the marker, or at least a consistent association with disease presence, disease characteristics such as stage, or disease aggressiveness. Second, there should be an assessment of the strength of marker association with disease outcome. Third, the marker should be assessed as an independent predictor in a multivariate analysis. According to these criteria, the most promising PCa marker candidates are GRN-A [129-131], GSTP1 [132-134], PSCA [135, 136], PSMA (described above), and TERT [137, 138].

Bearing these imperfections in mind, we developed an improved in silico profiling approach in combination with a novel tumor suppressor gene (TSG) inactivation approach to search for new molecular markers of prostate cancer. As a result, we (i) validated the approach by identifying a set of known molecular markers of prostate cancer, and (ii) found new candidate genes with highly specific expression in prostate cancer that we consider to be a potentially important foundation for a new diagnostic tool that may improve methods for detecting prostate cancer.


II. THE GENERAL STRATEGY OF THE STUDY

The aim of our integrated approach was to identify new molecular markers for selective detection of prostate cancer cells based on gene expression properties (Figure 2). First, we selected our initial set of pre-candidates using in silico expression profiling (an advanced version of EST data mining). The distribution of ESTs in the cluster across cDNA libraries roughly follows the expression profile of the transcript over corresponding tissues. Assuming that the number of ESTs sequenced for a particular gene in a given cDNA library reflects its expression in the tissue from which this cDNA library was prepared, we may convert EST abundance data to “relative expression” form [114, 115, 119-121]. Our most important improvements were extension and enrichment of the EST cluster spectrum used in selection; addition of tissue specificity and specimen representation criteria, and taking advantage of most of the available safeguards against data distortion.
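The conversion of EST abundance to "relative expression" form can be illustrated with a minimal sketch. The library names, sizes, counts, and the per-10,000 scaling below are assumptions chosen for illustration, not values or conventions taken from this study.

```python
def relative_expression(est_counts, library_sizes, per=10_000):
    """Scale raw EST counts to counts per `per` sequenced ESTs in each library."""
    return {
        library: per * est_counts.get(library, 0) / size
        for library, size in library_sizes.items()
        if size > 0  # skip empty libraries to avoid division by zero
    }


# Hypothetical library sizes (total ESTs sequenced) and EST counts for one gene.
library_sizes = {"prostate_tumor": 20_000, "normal_liver": 40_000, "normal_brain": 10_000}
gene_est_counts = {"prostate_tumor": 30, "normal_liver": 4}

profile = relative_expression(gene_est_counts, library_sizes)
for library, value in profile.items():
    print(f"{library}: {value:.1f} ESTs per 10,000")
```

Normalizing by library size matters because a raw count of 30 ESTs means something different in a library of 20,000 sequences than in one of 200,000.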

Second, our finding that PSA is a cancer marker partly due to the negative regulation by p53 [82] opened a way for rational, systematic identification of similar markers. We hypothesized that the genes normally downregulated by tumor suppressors are a novel source of potential cancer markers. We searched for such genes by analyzing mouse models of prostate cancer deficient for tumor suppressor genes (TSGs) using cDNA microarray hybridization. Human homologs of genes repressed by tumor suppressors in a mouse prostate provided a second source of candidate genes.

Steps one and two resulted in the generation of a “list of suspects” containing >500 gene candidates that could be prostate cancer markers according to the selection criteria applied. The next step of prospective marker selection was to verify the expression profile of each candidate using microarray hybridization. To that end, a custom cDNA microarray was printed representing all pre-candidate genes plus a necessary set of controls. Using this “prostate marker array”, the expression levels of all pre-candidates were determined in the set of prostate tumor samples, corresponding cell lines and in normal tissues. Further verification of gene expression patterns, and consequently narrowing down of the list of candidates, was done using such approaches as RT-PCR (reverse transcription - polymerase chain reaction), Northern blotting analysis and real-time PCR.

The final step of candidate verification employed validation of advanced candidates in clinical tissue samples obtained from bona fide prostate cancer positive and negative organs and an estimation of the percentage of tumors that are identifiable by a given marker in comparison with existing markers, such as PSA and PSMA.


[Figure 2: flowchart. Panel labels recoverable from the original: "I. Pre-candidate selection" (Human EST databases; in silico expression profiling; Affymetrix microarray; p53, PTEN, TRAMP; TSG-deficient; wt; Genes upregulated in TSG-deficient mouse prostate; Human homologs; Predicted diagnostic markers and therapeutic targets in prostate cancer). "II. Verification by microarray hybridization" (Array printing; cDNA microarray; Cell lines (prostate/non-prostate); PRE-CANDIDATES: Preferentially expressed in prostate cancer cell lines). "III. Validation in clinical material" (CANDIDATES: Primary confirmation by RT-PCR and Northern blotting; Bioinformatics analysis: Any known genes? Similarity to known? Predicted to be secreted/membrane?; ADVANCED CANDIDATES: Real-time PCR; Antibody-based assays).]

Figure 2. A novel integrated approach to identification of prostate cancer markers.

Experimental plan.


III. IDENTIFICATION OF NOVEL MARKERS IN PROSTATE CANCER BY

IN SILICO EXPRESSION PROFILING

1. Rationale

We used an EST data mining approach that relies on publicly available expression data, with the addition of several measures that took advantage of most of the available safeguards against data distortion. We added tissue specificity and specimen representation criteria. We also extended and enriched the EST cluster spectrum used in selection.

2. Introduction

2.1 General strategy of EST data mining

ESTs (Expressed Sequence Tags) are the result of random sequencing of cDNA ends, with a typical length of ~300-400 bp. The distribution of ESTs in a cDNA library roughly corresponds to the expression levels of the corresponding transcripts in the tissue from which the cDNA library was prepared. Such data are convertible to relative expression and are not limited to any subset of genes. These data are also available in sufficiently large amounts (for human, currently >4,500,000 public ESTs from >60 tissues) [122]. ESTs are readily associated with their parent transcripts by homology-based clustering. The overall large number of ESTs allows relatively reliable analysis of expression of the majority of human genes. The mining procedures take advantage of the fact that ESTs are easily assigned to their source cDNA libraries, and library descriptions are available to identify the tissue, organ or cell type used for preparation of cDNA. This approach allows sorting of most cDNA libraries, and the ESTs sequenced from them, by tissue of origin.


To mine the EST database for potentially important genes, several steps need to be followed: a) assemble ESTs into clusters representing the actual transcripts; b) categorize cDNA libraries and assemble them into pools representing the tissues and organs; c) select the EST clusters whose distribution over tissue pools (“digital expression profile”) satisfies the pre-defined criteria. Most published works employing EST data mining for discovery of differentially expressed genes follow this scenario; however, they differ in how they approach these steps [114-121, 139, 140].
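The three steps can be sketched as a small pipeline operating on lookup tables. This is only an illustration under simplifying assumptions: clustering (step a) is taken as already done and supplied as an EST-to-cluster mapping, and all identifiers and the selection criterion are hypothetical.

```python
from collections import Counter, defaultdict


def digital_expression_profiles(est_to_cluster, est_to_library, library_to_pool):
    """Count, for every EST cluster, how many of its ESTs fall into each tissue pool."""
    profiles = defaultdict(Counter)
    for est, cluster in est_to_cluster.items():
        pool = library_to_pool.get(est_to_library.get(est))
        if pool is None:
            continue  # EST comes from a library that was not assigned to any pool
        profiles[cluster][pool] += 1
    return profiles


def select_clusters(profiles, criterion):
    """Step c: keep clusters whose profile satisfies a caller-supplied criterion."""
    return sorted(cluster for cluster, profile in profiles.items() if criterion(profile))


# Hypothetical inputs: step a (clustering) and step b (pool assembly) as lookup tables.
est_to_cluster = {"est1": "clusterA", "est2": "clusterA", "est3": "clusterA",
                  "est4": "clusterB", "est5": "clusterB"}
est_to_library = {"est1": "lib1", "est2": "lib1", "est3": "lib2",
                  "est4": "lib3", "est5": "lib9"}
library_to_pool = {"lib1": "prostate_malignant", "lib2": "prostate_malignant",
                   "lib3": "liver_normal"}  # lib9 was filtered out

profiles = digital_expression_profiles(est_to_cluster, est_to_library, library_to_pool)
selected = select_clusters(profiles, lambda p: p["prostate_malignant"] >= 2)
print(selected)
```

Keeping the criterion as a separate function mirrors the observation above: published approaches share the pipeline but differ in how each step, especially selection, is carried out.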

2.2 EST clustering

Many authors of EST data mining works presented their own EST clustering engines [114, 117, 120] or used existing software packages and performed the assembly procedure in-house [119, 121]. Recent works use the publicly available UniGene database (http://www.ncbi.nlm.nih.gov/UniGene/) as a source of pre-defined EST clusters [115, 116, 118]. The artifacts of EST clustering that affect subsequent data mining include 1) mixed clusters containing ESTs from unrelated transcripts; 2) spurious removal of ESTs from the cluster by filtering procedures, leading to distorted EST counts; 3) representation of one transcript by multiple small clusters, and 4) the inability to distinguish between different splice isoforms of the same mRNA. The first three problems are addressed with varying success by EST clustering packages, but are far from being solved by any single EST assembly procedure. The fourth problem is usually beyond the scope of EST clustering.


2.3 Commonly used criteria of gene selection

There are currently three major criteria for selection of desirable EST distributions (expression profiles). Those are: DDD (Digital Differential Display), various pool specificity measures and GBA (Guilt By Association).

2.3.1 Digital Differential Display (DDD)

DDD is currently the most popular strategy. It selects biologically interesting (using expression ratio) and statistically valid (using Fisher’s exact test) differences between two pools in counts of ESTs derived from the same transcript (EST cluster). It is used by several authors in conjunction with custom EST clustering results [114, 117, 119]. Recently, a public DDD tool became available (http://www.ncbi.nlm.nih.gov/UniGene/info/ddd.html) that allows users to combine pools from available EST libraries, compare them and select UniGene clusters whose distribution over these pools is significantly different. A number of authors have used this tool with relative success [115, 116, 118]. The drawbacks of DDD are its use of Fisher’s exact test and direct comparison between tissue pools. Fisher’s exact test is unnecessarily restrictive and is not entirely relevant to the task [139]. Less strict criteria proposed by Audic and Claverie [139] are now implemented by some authors [117]. The direct comparison of two pools highlights all differences, relevant or not. A large number of the artifacts described above, produced by EST clustering, distorted libraries, errors in pool assembly and/or random omission/inclusion of ESTs, will be picked up by DDD as significant differences. DDD produces the desired result with a reported 10%-30% success rate when selected transcripts are tested by experimental means [114, 115, 118].
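A DDD-style comparison of two pools can be sketched as follows. This is not the code of the public DDD tool; it is a plain two-sided Fisher's exact test written from the hypergeometric definition, and the pool sizes and EST counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x cluster-derived ESTs landing in pool 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-9))  # tolerance for float ties
    return min(1.0, total)


# Hypothetical pools: 12 of 20,000 tumor ESTs vs 2 of 30,000 normal ESTs in one cluster.
tumor_hits, tumor_size = 12, 20_000
normal_hits, normal_size = 2, 30_000

ratio = (tumor_hits / tumor_size) / (normal_hits / normal_size)
p_value = fisher_exact_two_sided(tumor_hits, tumor_size - tumor_hits,
                                 normal_hits, normal_size - normal_hits)
print(f"expression ratio = {ratio:.1f}, Fisher P = {p_value:.2g}")
```

The test only asks whether two counts differ; it has no way to tell a biological difference from one introduced by a clustering artifact or a distorted library, which is the weakness noted above.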


2.3.2 Pool specificity

Pool specificity criteria simply select the transcripts that are highly specific (80-100% of ESTs) for the desired tissue pool(s) [115, 120, 121]. While such a procedure does not contain a statistical measure, the selection is usually statistically significant because authors choose to analyze the largest, and therefore most reliable, EST clusters first. Moreover, such selection is relatively insensitive to the artifacts mentioned above. The percentage of confirmed genes varies between 10% and 40% for different works. Unfortunately, pool specificity criteria are usually used to select only the most extreme distributions. The presumably frequent case of a gene that is expressed at low but detectable levels in most normal tissues and highly expressed in a tumor will be discarded by such criteria.
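A pool specificity filter is simple enough to sketch in a few lines. The 80% cutoff follows the range quoted above; the pool names and EST counts are hypothetical, and the second profile illustrates the kind of gene such a filter discards.

```python
def pool_specificity(profile, target_pools):
    """Fraction of a cluster's ESTs that come from the target pool(s)."""
    total = sum(profile.values())
    if total == 0:
        return 0.0
    return sum(profile.get(pool, 0) for pool in target_pools) / total


def is_pool_specific(profile, target_pools, cutoff=0.8):
    return pool_specificity(profile, target_pools) >= cutoff


prostate = {"prostate_malignant", "prostate_normal"}

# Hypothetical cluster found almost exclusively in prostate libraries.
strict = {"prostate_malignant": 18, "prostate_normal": 1, "liver_normal": 1}

# Hypothetical cluster: high in tumor, but 3 ESTs in each of 10 normal tissues.
broad = {"prostate_malignant": 40}
broad.update({f"normal_tissue_{i}": 3 for i in range(10)})

print(is_pool_specific(strict, prostate))  # passes the 80% cutoff
print(is_pool_specific(broad, prostate))   # rejected despite strong tumor expression
```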

2.3.3 Guilt By Association (GBA)

GBA selects genes whose EST distribution over cDNA libraries closely resembles that of known genes with desirable properties, such as PSA [140]. The co-regulation is estimated using Fisher’s exact test to determine the probability of such EST co-distribution being observed by chance. GBA does not take into account expression levels, only the presence or absence of the transcript in the library. This leads to an inability to detect co-regulation between minimally-expressed and highly-expressed genes due to the small sequenced size of cDNA libraries. The main advantage of the method, its ability to detect co-regulated genes, is also its main disadvantage: when a marker gene used as “bait” is not expressed, most co-regulated genes will also not be expressed. For this reason, GBA is not a good strategy for identification of novel markers that replace or supplement the existing ones.
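The presence/absence logic of GBA can be sketched as a one-sided co-occurrence test. This is an assumed formulation (the hypergeometric upper tail of Fisher's exact test), not the published GBA implementation, and the library sets are hypothetical.

```python
from math import comb


def co_occurrence_p_value(libs_with_bait, libs_with_candidate, all_libraries):
    """Probability of at least the observed library overlap arising by chance.

    Only presence or absence of each gene in a library is used; EST counts are ignored.
    Both gene sets are assumed to be subsets of `all_libraries`.
    """
    n = len(all_libraries)
    k_bait = len(libs_with_bait)
    k_cand = len(libs_with_candidate)
    overlap = len(libs_with_bait & libs_with_candidate)
    denom = comb(n, k_cand)
    tail = sum(comb(k_bait, x) * comb(n - k_bait, k_cand - x)
               for x in range(overlap, min(k_bait, k_cand) + 1))
    return tail / denom


# Hypothetical data: 20 libraries; bait and candidate each seen in 5, sharing 4.
all_libraries = {f"lib{i}" for i in range(20)}
bait_libraries = {f"lib{i}" for i in range(0, 5)}
candidate_libraries = {f"lib{i}" for i in range(1, 6)}

p = co_occurrence_p_value(bait_libraries, candidate_libraries, all_libraries)
print(f"P(overlap >= 4 by chance) = {p:.4f}")
```

Because a library contributes the same evidence whether it holds one EST of a gene or one hundred, differences in expression level are invisible to this test.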


Notably, some works did introduce an additional requirement of high expression only in the tumor, but not in the rest of the analyzed tissue spectrum [114, 116, 120]. Such potential tumor markers are, ideally, detectable in body fluids and as such, are required to be secreted or present in cells of the tumor in question but not in normal human tissues.

2.4 A novel in silico profiling approach

Gene selection procedures mentioned above are either too generic or too narrow to be used for diagnostic marker identification. Less restrictive but more task-oriented selection criteria are clearly needed. This compelled us to design our own approach in order to improve and complement the existing ones. We decided to avoid strict criteria in selection of digital expression profiles. Instead, we used most available safeguards against data distortion and designed a set of task-oriented requirements and scores. This approach can be applied to any tumor type. Here we report the results of applying this strategy to prostate cancer.

3. Materials and Methods

3.1 Step 1: Assignment of ESTs to their parent transcripts

To produce clusters representing their original transcripts, highly similar ESTs are competitively gathered together. Most EST data mining strategies rely on a single EST clustering result [114-121]. Instead, we used three sources of EST clusters. Two independent whole-dbEST clustering results, UniGene (http://www.ncbi.nlm.nih.gov/UniGene/) and one produced by Paracel Inc. (http://www.paracel.com), were used. Methods of EST clustering are different for these two assemblies. This approach increases the possibility that the EST cluster representing a particular gene will be assembled correctly and picked up by the data mining procedure.

The third source was provided by mRNA-seeded clusters. Approximately 200,000 publicly available mRNA sequences, both evidence-based and GenScan-predicted [141], from RefSeq, HTDB, NCBI and Ensembl Human Genome (http://www.ensembl.org) were used as “bait” to assemble clusters of highly similar ESTs. There is no competition for ESTs among clusters in this procedure. Unlike in conventional EST clustering, assembly is driven by mRNA structures and is thus sensitive to major splice rearrangements, allowing identification of tissue- and tumor-specific splice isoforms.

3.2 Step 2: cDNA library selection and tissue pool assembly

To link ESTs to tissues and organs, filtered cDNA libraries are assigned to tissues and then combined into large “tissue pools” to improve statistical reliability [119]. To improve the quality of data, we discarded from analysis cDNA libraries that were (a) of unknown or mixed tissue origin, or (b) too poorly sequenced (N_EST < 1000), or (c) the result of subtraction or pooling of cDNA from unrelated tissues. We did not differentiate between bulk, microdissected, flow-sorted and cell line-derived libraries of the same tissue/organ origin in order to create bigger pools and to cover larger spectra of tumors. Normal tissues surrounding tumor in bulk samples may also yield tumor markers, such as TARP (T-cell receptor gamma-chain alternate reading frame protein) [120]. All selected cDNA libraries were assembled into large tissue pools. Each tissue was represented by any combination of Normal, Malignant, and sometimes Disease pools. Fetal and embryo tissues were assembled in one separate pool. A majority of all human ESTs (~65%) belonged to cDNA libraries acceptable for analysis. We assembled 80 pools ranging in size from 1,200 to 170,000 ESTs (32,000 ESTs on average).
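The library filtering and pooling rules above can be sketched as follows. The `Library` record, its field names, and the example libraries are hypothetical; only the three exclusion criteria and the 1,000-EST minimum come from the text, and the separate fetal/embryo pool is omitted for brevity.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

MIN_ESTS = 1000  # libraries with fewer sequenced ESTs are discarded


@dataclass(frozen=True)
class Library:
    name: str
    tissue: Optional[str]  # None marks unknown or mixed tissue origin
    state: str             # "Normal", "Malignant" or "Disease"
    n_ests: int
    subtracted_or_pooled: bool = False


def is_acceptable(lib):
    """Apply exclusion criteria (a), (b) and (c)."""
    return (lib.tissue is not None
            and lib.n_ests >= MIN_ESTS
            and not lib.subtracted_or_pooled)


def assemble_pools(libraries):
    """Group acceptable libraries by (tissue, state), ignoring how the sample was prepared."""
    pools = defaultdict(list)
    for lib in libraries:
        if is_acceptable(lib):
            pools[(lib.tissue, lib.state)].append(lib)
    return dict(pools)


# Hypothetical libraries: bulk tumor and a cell line end up in the same pool.
libraries = [
    Library("bulk_tumor", "prostate", "Malignant", 5000),
    Library("cell_line", "prostate", "Malignant", 3000),
    Library("mixed_origin", None, "Normal", 8000),
    Library("small_liver", "liver", "Normal", 500),
    Library("subtracted_brain", "brain", "Normal", 4000, subtracted_or_pooled=True),
]

pools = assemble_pools(libraries)
pool_sizes = {key: sum(lib.n_ests for lib in libs) for key, libs in pools.items()}
print(pool_sizes)
```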

3.3 Step 3: Selection of candidate genes with desired digital expression profile

In order to nominate a gene as a potential marker, our algorithm (encoded in PROFILLET software) reconstructed expression profiles across all tissue pools (“electronic Northern”) and then selected potential markers.

3.3.1 Reconstruction of expression profiles

The proportion of gene-derived ESTs was calculated for every tissue pool and presented in average expression (avgE) units, where one unit corresponds to the proportion of gene-derived ESTs in the sum of all pools. The result was an approximation of the gene expression profile (Figure 3), which was used for evaluation of a gene as a potential marker. In such profiles, expression was calculated for every tissue separately, relative to the “other normal” average expression (ON-avgE): the proportion of gene-derived ESTs in the sum of pools representing normal tissues (except the tissue analyzed). ON-avgE served as an estimate of the “rest of the organism”, to which the expression of a marker gene in its tissue was usually compared. As a result, the ON-avgE value was different for every tissue. For example, if a gene was strictly prostate-specific, its ON-avgE equaled 0.0 in prostate, so a nonzero presence in prostate pools would lead to infinite relative expression values (INF). The same was true for strictly tumor-specific genes: their ON-avgE would be equal to zero from the standpoint of every tissue, because these genes were not found in normal tissues (Table V, Table VI). Genes previously unknown as prostate cancer markers were identified by code names that are composed of an EP (“electronic profile”) abbreviation and a number.
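The ON-avgE calculation can be sketched as follows. This is a minimal illustration under assumptions, not the PROFILLET code: the function names and the dictionary-based inputs (`gene_counts`, `pool_sizes`, `pool_tissue`, `normal_pools`) are hypothetical.

```python
import math


def on_avg_e(gene_counts, pool_sizes, pool_tissue, normal_pools, tissue):
    """Proportion of gene-derived ESTs in all normal pools except `tissue`."""
    reference = [p for p in normal_pools if pool_tissue[p] != tissue]
    total = sum(pool_sizes[p] for p in reference)
    hits = sum(gene_counts.get(p, 0) for p in reference)
    return hits / total if total else 0.0


def expression_profile(gene_counts, pool_sizes, pool_tissue, normal_pools):
    """Relative expression of one gene in every pool, in ON-avgE units."""
    profile = {}
    for pool, size in pool_sizes.items():
        proportion = gene_counts.get(pool, 0) / size
        baseline = on_avg_e(gene_counts, pool_sizes, pool_tissue,
                            normal_pools, pool_tissue[pool])
        if baseline > 0:
            profile[pool] = proportion / baseline
        else:
            # No ESTs in the "rest of the organism": any presence is INF.
            profile[pool] = math.inf if proportion > 0 else 0.0
    return profile
```

Because the reference set excludes the tissue being analyzed, a gene found only in prostate pools receives INF in both the Normal and Malignant prostate pools, which is consistent with the INF entries in Table V.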


3.3.2 Selection of potential markers

The following specificity and reliability criteria customized for diagnostic marker selection were introduced:

(a) Specificity as compared to the average normal level of expression. We aimed to identify genes whose expression in tissue of interest was much higher than “normal average expression” (N-avgE) in the adult organism. N-avgE was calculated in the total pool of normal tissues in order to retain genes that may be elevated in multiple tumor types or were expressed normally only in an embryo. We set the exact cutoff at 2x the level of background data noise. This level was estimated for the maximum expression of housekeeping genes (GAPDH, -) in normal human tissues, and was found to be within 5±1 N-avgE. Therefore, minimum relative expression for a tumor-specific peak was set to 10 N-avgE. If a gene was expressed at a high level in a single normal tissue, but was absent from the rest of the normal tissue pool, such violation of a tumor specificity requirement was diluted and could not be detected by this criterion alone.

(b) Statistical validity of the tissue-specific peak. We used the statistical criteria developed by Audic and Claverie for analysis of digital expression profiles [139]. We accepted tumor-specific peaks that differed from the summary normal pool at a significance level of P <= 0.05.

(c) Minimum sample representation. The EST distribution peak for a given transcript in a given tissue needed to originate from at least two independent cDNA libraries. This was a novel selection criterion aimed at identifying genes present in a significant fraction of tumors.


(d) Nomination of potential marker: specificity across tissue spectrum. If a tumor-specific peak of expression was selected as specific and valid, it could be nominated as a potential diagnostic marker. This analysis was carried out for all peaks in the profile to identify potential markers for all tumor pools in the analysis at once.
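Criteria (a) to (c) can be combined as in the sketch below. The probability follows the published Audic and Claverie formula for observing y ESTs in a pool of size N2 given x ESTs in a pool of size N1. The use of a one-sided upper tail, the function names and the argument layout are assumptions made for this illustration, since the text does not state exactly how the test was applied.

```python
from math import exp, lgamma, log


def ac_prob(x, y, n1, n2):
    """Audic-Claverie p(y | x) for pools of n1 and n2 ESTs."""
    r = n2 / n1
    log_p = (y * log(r) + lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
             - (x + y + 1) * log(1 + r))
    return exp(log_p)


def ac_upper_tail(x, y, n1, n2):
    """P(Y >= y | x): chance of a tumor count at least as high as observed."""
    return max(0.0, 1.0 - sum(ac_prob(x, k, n1, n2) for k in range(y)))


def passes_criteria(rel_expr, x_normal, n_normal, y_tumor, n_tumor,
                    n_libraries, min_rel=10.0, alpha=0.05, min_libs=2):
    """Apply criteria (a), (b) and (c) to one tumor-specific peak."""
    specific = rel_expr >= min_rel                       # (a) >= 10 N-avgE
    p_value = ac_upper_tail(x_normal, y_tumor, n_normal, n_tumor)
    valid = p_value <= alpha                             # (b) P <= 0.05
    represented = n_libraries >= min_libs                # (c) >= 2 libraries
    return specific and valid and represented
```

Working in log space with `lgamma` avoids overflow of the factorial terms for abundant transcripts.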

[Figure 3: four bar-chart panels. Housekeeping genes (Y axis 0.00 to 4.50): Actin, beta (ACTB); Glyceraldehyde-3-phosphate dehydrogenase (GAPD). Prostate markers (Y axis 0.00 to 4000.00): Kallikrein 3 (prostate specific antigen) (KLK3); Acid phosphatase, prostate (ACPP).]

Figure 3. In silico expression profiles obtained by our software for known housekeeping genes and prostate tumor markers. X axis: tissue pool names; Y axis: relative expression in avgE units (see text for details).


TABLE V

POTENTIAL TUMOR MARKERS IDENTIFIED BY OUR SOFTWARE: TUMOR-SPECIFIC AND PROSTATE-SPECIFIC GENES

mRNA name or code (a)           EST number              Expression, in  Tumor
                                                        ON-avgE units   specific (b)
                                Total   Nor (c) Mal (c) Nor (c) Mal (c)
EP1                             65      0       18      0       INF     Y
EP2                             13      0       11      0       INF     Y
specific protein (28 kDa)       12      0       9       0       INF     Y
EP4                             8       0       8       0       INF     Y
EP5                             7       0       7       0       INF     Y
EP6                             22      0       5       0       INF     Y
EP7                             13      0       4       0       INF     Y
EP8                             11      0       4       0       INF     Y
small nuclear protein PRAC      7       1       6       INF     INF     -
EP10                            4       0       4       0       INF     Y
EP11                            4       0       4       0       INF     Y
EP12                            4       0       4       0       INF     Y
EP13                            4       0       4       0       INF     Y
EP14                            4       0       4       0       INF     Y
EP15                            4       0       4       0       INF     Y
EP16                            38      0       3       0       INF     Y
EP17                            14      1       3       INF     INF     -
EP18                            13      0       3       0       INF     Y
EP19                            6       1       3       INF     INF     -
EP20                            5       0       3       0       INF     Y
EP21                            3       0       3       0       INF     Y
EP22                            10      0       2       0       INF     Y
EP23                            6       0       2       0       INF     Y
EP24                            4       0       2       0       INF     Y
EP25                            4       0       2       0       INF     Y
EP26                            3       0       2       0       INF     Y
EP27                            3       0       2       0       INF     Y
EP28                            3       0       2       0       INF     Y
EP29                            3       0       2       0       INF     Y
EP30                            2       0       2       0       INF     Y

(a) Bold font in gene names indicates known prostate tumor markers and prostate-specific genes.
(b) Y: predicted as strictly tumor-specific; “-”: not predicted to be strictly tumor-specific.
(c) Nor: Normal prostate pool; Mal: Malignant prostate pool.


TABLE VI

POTENTIAL TUMOR MARKERS IDENTIFIED BY OUR SOFTWARE: GENES HIGHLY SPECIFIC FOR PROSTATE CANCER

mRNA name or code (a)           EST number              Expression, in  Tumor
                                                        ON-avgE units   specific (b)
                                Total   Nor (c) Mal (c) Nor (c) Mal (c)
kallikrein 3 (PSA)              597     143     228     1005    3956    -
kallikrein 2, prostatic         35      19      15      667     1127    -
EP33                            6       0       5       0       176     -
acid phosphatase, prostate      273     83      100     486     1603    -
EP35                            11      1       3       40      105     -
EP36                            6       0       5       0       460     -
EP37                            67      44      20      515     489     -
EP38                            5       0       4       0       385     -
EP39                            16      3       3       105     81      -
microseminoprotein, beta-       594     375     174     376     461     -
EP41                            17      0       8       0       72      -
EP42                            19      0       8       0       70      -
EP43                            7       0       4       0       70      -
EP44                            13      0       2       0       70      -
EP45                            36      0       5       0       67      -
EP46                            8       0       4       0       54      -
EP47                            22      0       2       0       54      -
FGF-binding protein 1 (HBP17)   20      1       2       35      54      -
EP49                            18      0       2       0       54      -
EP50                            12      0       2       0       54      -
EP51                            12      0       2       0       54      -
EP52                            11      0       2       0       54      -
EP53                            4       0       2       0       54      -
EP54                            16      0       3       0       53      -
EP55                            5       0       3       0       53      -
EP56                            122     0       16      0       48      -
EP57                            12      0       4       0       47      -
EP58                            41      0       10      0       45      -
EP59                            43      0       6       0       42      -
T cell receptor gamma locus     63      20      20      45      39      -

(a) Bold font in gene names indicates known prostate tumor markers and prostate-specific genes.
(b) Y: predicted as strictly tumor-specific; “-”: not predicted to be strictly tumor-specific.
(c) Nor: Normal prostate pool; Mal: Malignant prostate pool.


For candidate diagnostic markers, the minimal requirement was specificity for a tumor as compared to the normal tissue from which the tumor arose. We accepted as markers the genes with at least a 3x higher expression in a tumor than in its normal counterpart. Highly specific markers were genes that satisfied this requirement for all normal tissues.
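The nomination rule can be written as a short sketch. The function name, the return labels and the requirement that the tumor value be nonzero are assumptions introduced for this illustration; all expression values are assumed to be in the same units.

```python
def classify_marker(tumor_expr, counterpart_expr, other_normal_exprs, fold=3.0):
    """Classify a tumor peak against its normal counterpart and other tissues.

    tumor_expr: expression in the tumor pool
    counterpart_expr: expression in the normal tissue the tumor arose from
    other_normal_exprs: expression values in all other normal tissues
    """
    def exceeds(normal_expr):
        # At least `fold` times the normal level, and actually expressed.
        return tumor_expr > 0 and tumor_expr >= fold * normal_expr

    if not exceeds(counterpart_expr):
        return "rejected"
    if all(exceeds(e) for e in other_normal_exprs):
        return "highly specific"
    return "candidate"
```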

3.4 Software implementation of the algorithms used

All algorithms were implemented as the PROFILLET software in the Python language (http://www.python.org), with some high-load components rewritten in C++. Several modules supported automation of BLAST searches, conversion of EST clustering data and BLAST results into transcript-specific EST counts per cDNA library, and extraction of library information from dbEST. The expression profiling/candidate selection module used these counts alongside a library classification file to produce counts per tissue and digital expression profiles, and to nominate potential diagnostic markers. The redundancy elimination and annotation modules finalized the list of selected genes.

3.5 Redundancy elimination

For every tumor type, a redundant set of potential markers was produced. Most of the redundant entries were removed automatically. Redundancy determination and removal were based on the overlap between tumor-specific EST sets, because those EST sets, and not the remainder of the cluster, caused the transcript to be selected. The same principle was used to annotate the candidate genes: the sum of UniGene annotations of tumor-specific ESTs was used (it was also used for annotation in our Bi-Humanizer program, see below). This added an extra level of fidelity and protection against mixed EST clusters. Only ESTs that contributed to the selection were used in redundancy elimination, annotation and the experimental validation of candidates.
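A minimal sketch of overlap-based redundancy removal is given below. The greedy strategy and the rule that any shared EST marks two transcripts as redundant are assumptions for illustration; the text does not specify the overlap threshold that was used.

```python
def remove_redundant(candidates):
    """Keep one transcript per group of overlapping tumor-specific EST sets.

    candidates: dict mapping a transcript id to the set of tumor-specific
    EST ids that caused its selection. Transcripts with larger EST sets are
    considered first, so they are the ones retained.
    """
    kept = {}
    ordered = sorted(candidates.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for transcript_id, ests in ordered:
        if all(ests.isdisjoint(other) for other in kept.values()):
            kept[transcript_id] = ests
    return sorted(kept)
```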

3.6 The relationship between our approach and the previous work

Our strategy incorporated many features of its predecessors. It was similar to DDD in two respects. First, we compared two pools against each other; however, one of those pools was always the sum of all tissue pools (akin to the DDD2 procedure in [118]). Second, selection of significant differences employed a statistical criterion. A pool specificity measure was also used to select tumor-specific peaks of expression.

Similarity to GBA may be found in our specificity criteria. GBA used the expression pattern of a known good marker as a pre-defined profile, whereas we used a more permissive set of constraints instead.

The first steps of the selection (tissue specificity plus statistical validation) identified most of the genes that would be selected by DDD if a tumor EST pool was compared against the sum of the remaining EST libraries. However, our initial gene set was much larger, due to a larger number of EST clusters analyzed, more permissive statistical criteria and tumors being excluded from the reference pool for calculations of the average level of expression.

On the other hand, the minimal representation and specificity across tissue spectrum criteria disqualified a number of genes predicted to be tumor- or tissue-specific by other works. This was expected because few authors selected candidate markers according to the expression in other tissues and representation in the samples/cDNA libraries.


4. Results

4.1 The screening summary

We analyzed 419,000 EST-based digital expression profiles encompassing 2,600,000 ESTs that belong to 400 cDNA libraries. The libraries were classified into 80 tissue and tissue state pools. Prostate-specific cDNA libraries were analyzed as three pools (note that all libraries below were derived from tissue samples, not cell lines): a) “Malignant”: 11 libraries, 51,000 ESTs total, and b) “Normal”: 8 libraries, 32,000 ESTs total. A total of 1005 profiles were selected as potential markers for a malignant prostate. After automatic and manual elimination of redundancy, 353 unique pre-candidate genes were identified.

Our analysis identified a large number of strictly tumor-specific and prostate-specific genes, as well as genes highly specific for prostate cancer (Tables V and VI).

4.2 Known cancer markers

The known prostate markers identified by in silico expression profiling represent the proof of principle for our approach. This strategy selected the well-established markers of prostate cancer, such as PSA (kallikrein 3) [56], PSMA [142], kallikrein 2 [143], prostatic acid phosphatase (PAP) [49], and beta-microseminoprotein [144], as highly prostate-specific genes with little or no tumor specificity. We also found in our selection a number of genes known to be highly and specifically expressed in the prostate or prostate cancer, identified using EST data mining, microarray experiments, SAGE analysis and other methods (Table VII).


TABLE VII

KNOWN AND POTENTIAL TUMOR MARKERS IDENTIFIED BY IN SILICO PROFILING APPROACH

Gene symbol Gene name                                               Reference
KLK3        Kallikrein 3 (prostate specific antigen)                [56]
FOHL1       Folate hydrolase (prostate-specific membrane antigen)   [142]
KLK2        Kallikrein 2                                            [143]
ACPP        Prostatic acid phosphatase (PAP)                        [49]
MSMB        Beta-microseminoprotein                                 [144]
TCR-gamma   T cell receptor gamma locus                             [120]
FABP5       Fatty acid binding protein 5 (-associated)              [113, 118]
CRISP3      -rich secretory protein 3 (SGP28)                       [114, 145]
TSPAN1      Tetraspanin 1 (NET-1)                                   [114, 125, 145]
TMEPAI      Transmembrane, prostate androgen induced RNA (PMEPA1)   [126]
PCANAP1     Prostate cancer associated protein 1                    [140]
ELK4        ETS-domain protein (SRF accessory protein 1)            [118]
PRAC        Small nuclear protein PRAC                              [146]
MYC         V-myc myelocytomatosis viral homolog (avian)            [147]
NPM1        Nucleophosmin (nucleolar phosphoprotein B23, numatrin)  [148]
NKX3-1      NK3 transcription factor related, locus 1 (NKX3A)       [149]


5. Discussion

While selecting a number of previously reported prostate cancer markers, our approach did not select certain genes that might have been expected to be in the list of candidates. One of these was androgen receptor (AR), which is known to play a crucial role in prostate cancer. It is normally expressed at relatively low levels and amplified in some androgen-independent prostate tumors [150]. Consistent with this, we found that AR cDNA-derived ESTs are present only in one non-subtracted cDNA library, NCI_CGAP_Pr3. A few emerging markers of prostate cancer were discarded by our analysis due to low tissue/tumor specificity. For example, α-methylacyl-CoA racemase (AMACR) [125] was rejected due to high predicted expression in normal kidney and a low number of ESTs in the prostate pools. Hepsin [109-111, 151] is highly expressed in normal liver and possibly normal kidney. Fatty acid synthase (FASN) [109, 151] is very highly and specifically expressed in normal skin. This is expected because the above three genes are being developed as imaging/biopsy diagnostic markers, whereas our analysis is aimed at identifying very tissue-specific diagnostic markers.

6. Conclusions

We developed an enhanced EST data mining (in silico expression profiling) approach to marker identification. The gene selection procedures reported in the literature are either too generic or too narrow to be used for diagnostic marker identification. Our approach has less restrictive but more task-oriented selection criteria, while using the most available safeguards against data distortion. Another advantage is its sensitivity to major splice rearrangements, allowing identification of tissue and tumor-specific splice isoforms.


This approach yielded 353 prostate cancer marker candidate genes that became the object of the following selection/verification steps (Figure 2).


IV. IDENTIFICATION OF NOVEL POTENTIAL PROSTATE CANCER

MARKERS BY IN VIVO EXPRESSION PROFILING

1. Rationale

Our finding that PSA expression is negatively regulated by p53 [82] opened an approach to a systematic search for cancer markers belonging to the same regulation category: genes that are under negative control of tumor suppressor genes (TSGs). The inactivation of TSGs is a very basic mechanism of cancer development. Many TSG pathways are evolutionarily conserved; therefore, we presumed it should be possible to identify such genes using mouse models of TSG function loss in the prostate. Mouse models of TSG inactivation provide a reliable and abundant source of experimental material, drastically decrease the inter-organism variability inherent in human tumor samples, and allow us to analyze gene expression at the early stages of prostate cancer, before the appearance of secondary genetic changes. We performed a microarray-based comparison of the mouse TSG knockout models with their syngeneic host strains. The models we chose to investigate reflect the tumor suppressors that play a major role in cancer, including prostate cancer, namely p53, PTEN and pRB [152-154].

2. Introduction

One of the commonly adopted approaches in studying molecular signatures of cancer, which are likely to be evolutionarily conserved, is the use of transgenic mouse models. Several mouse models of TSG inactivation in the prostate (for such tumor suppressors as PTEN, RB, and p53) are currently available.


2.1 Tumor-suppressor genes: human tumors and mouse models

PTEN is located in chromosomal region 10q, which is lost in 50-80% of prostate carcinomas, as detected by several independent strategies [155-159]. PTEN encodes a phosphatase acting as an antagonist of phosphoinositol-3-kinase, a key player in several signaling pathways, most of them antiapoptotic. Mice heterozygous for PTEN knockout develop prostate hyperplasia and dysplasia [152, 160]. PTEN is also frequently deleted in human prostate cancer cell lines [161, 162]. The effects of PTEN loss in the prostate are studied using PTEN+/- (heterozygous) mice (homozygous PTEN knockout mice die in utero). PTEN+/- mice display neoplasias in multiple tissues and organs [160]. In particular, early prostate carcinomas (high-grade PIN lesions) are observed in 10% of males at 6 months and in 65% at 12 months of age [163].

Loss of chromosome 13q, including the region of the RB gene, occurs in at least 50% of prostate tumors [164-166]; mutations in the RB gene and loss of pRB protein expression have been reported in a significant proportion of prostate carcinomas [154].

Retinoblastoma protein (pRB) is an inhibitor of cell-cycle progression, mainly by blocking the transcription of S-phase genes through binding to the E2F family of transcription factors. RB homozygous knockout mice die at the 14-15th day of gestation and thus cannot be used as a model of prostate cancer. RB+/- heterozygous mice do not display extensive tumorigenesis with the exception of pituitary tumors arising from cells with loss of the wild-type RB allele [167]. The TRAMP mouse is a model of prostate cancer with inactivated pRB function, which was designed to develop simultaneous loss of p53 and pRB function in the prostate. Such inactivation is accomplished by expression of SV40 major T antigen, a potent inhibitor of pRB and p53 function [168, 169], under a prostate-specific promoter derived from the rat probasin gene [170]. 100% of TRAMP males display at least hyperplasia of the prostate by 10 weeks of age. At 6 months of age, 66% of TRAMP males develop metastatic adenocarcinomas [171].

Mutations in the p53 tumor suppressor gene, a key determinant of cellular response to genotoxic stress, are generally believed to be a late event in the progression of prostate cancer, and are associated with androgen independence, metastasis, and a worse prognosis [159], although this is not a completely clear issue [172]. The nature of the relationship between p53 status and androgen dependence of prostate cancer cells also remains controversial [84, 173]. Loss or mutation of p53 is found by different studies in 10-30% of organ-confined prostate cancer [153, 174] and in 15-50% of advanced metastatic cancer of the prostate [153, 175]. P53 homozygous knockout mice completely lack p53 function, develop tumors of varying origin at 3-6 months of age and die shortly thereafter [176]. Prostate cancer is an infrequent event in such mice, probably due to their early demise from more rapidly developing malignancies.

3. Materials and Methods

3.1 Transgenic animals

P53 knockout (KO) mice (C57BL/6J background) [177], TRAMP mice (C57BL/6J background) [178], and original C57BL/6J mice were purchased from Jackson labs (http://jaxmice.jax.org/). Pten+/- mice (FVB background) and the syngeneic FVB mice were obtained from Dr. Katrina Podsypanina (Departments of Pathology and Medicine, College of Physicians and Surgeons, Columbia University).


3.2 RNA isolation

Male p53 KO, TRAMP and control C57BL/6J mice were sacrificed at the age of 9 weeks, and Pten+/- and control FVB mice at 6 weeks of age, before the pronounced appearance of the tumors and the associated secondary genetic changes. For each model or control strain, prostates were collected from 6 individual mice and assembled into two sets of 3 prostates each (mixing samples within a set serves to reduce biological variability, while sets are used as replicates in completely independent microarray hybridizations to reduce experimental variability). Total RNA was isolated from the prostate sets using Trizol reagent (Invitrogen) as per the manufacturer’s recommendations.

3.3 Probe preparation and microarray hybridizations

Biotin-labeled cRNA probe was synthesized from total RNA using the Affymetrix™ Enzo® BioArray™ HighYield™ RNA Transcript Labeling Kit according to the manufacturer’s instructions. The probe prepared from each prostate set was used in hybridization with an Affymetrix™ Murine Genome Array U74v2 set (arrays U74Av2, U74Bv2 and U74Cv2), representing approximately 36,000 full-length mouse genes and EST clusters from the UniGene database (Build 74). Data acquisition was done using Affymetrix™ Microarray Suite. Probe preparation, microarray hybridization and data acquisition steps were performed at the CCF Affymetrix Core facility according to the manufacturer’s instructions.

3.4 Microarray data analysis and pre-candidate gene selection

Genes potentially repressed by PTEN, as well as p53 and/or pRB, were selected using the GeneSpring™ microarray data analysis package. The criteria for gene selection were set as follows: (1) A reproducibly reliable signal in a TSG inactivation model: P (Present) flag in both replicates; (2) Significant upregulation in the TSG inactivation model compared to normal mice: TSG deficient/Normal expression ratio >= 3; (3) High signal (“expression”) in TSG inactivation model: >= 2x chip-average signal (an arbitrary cutoff, used to select easily detectable genes).
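The three selection criteria can be expressed as a simple filter. This is an illustrative sketch only: the actual selection was performed in GeneSpring, and the record keys (`id`, `flags`, `model_signal`, `control_signal`, `chip_average`) are hypothetical.

```python
def select_upregulated(probes, min_ratio=3.0, min_signal_fold=2.0):
    """Return ids of probes passing criteria (1) to (3).

    probes: iterable of dicts with hypothetical keys:
      id, flags (detection calls for the TSG-model replicates),
      model_signal, control_signal, chip_average.
    """
    selected = []
    for p in probes:
        present = all(flag == "P" for flag in p["flags"])          # (1)
        upregulated = (p["control_signal"] > 0 and
                       p["model_signal"] / p["control_signal"] >= min_ratio)  # (2)
        high = p["model_signal"] >= min_signal_fold * p["chip_average"]       # (3)
        if present and upregulated and high:
            selected.append(p["id"])
    return selected
```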

4. Identification of the human homologs for the selected mouse genes: the BI-HUMANIZER software

For all selected mouse genes, we attempted to find human homologs using NCBI resources together with our custom “humanization” software. The NCBI HomoloGene database (http://www.ncbi.nlm.nih.gov/HomoloGene/) contains calculated cross-species homologs (orthologs and paralogs) for mice and other organisms. HomoloGene relies on the NCBI UniGene EST cluster databases (http://www.ncbi.nlm.nih.gov/UniGene/), also available for humans, mice, etc., to establish transcript identities. Homology between transcripts of different organisms is defined as homology between some members of the respective UniGene clusters. However, such an assignment is frequently promiscuous or spurious, either because the query sequence is only weakly associated with its UniGene cluster or because contaminant sequences in UniGene clusters create false links. In addition, about 12% of accession numbers of Affymetrix array elements were not found in UniGene and thus were not searchable in HomoloGene; another 15% did not have any homologs listed in HomoloGene (leading to only a 73% success rate for the mouse to human homolog search).

For the above reasons, we designed the BI-HUMANIZER software program that uses NCBI HomoloGene, NCBI UniGene and results of sequential NCBI BLAST [179] searches to select the closest human (or mouse) homolog(s) for an arbitrary set of mouse (human) sequences (given as a list of GenBank accession numbers). It also allows annotation of many non-UniGene sequences using UniGene annotation of highly homologous sequences (Figure 4). In essence, it verifies and complements the HomoloGene result using BLAST homology searches. The combined result is more comprehensive and unambiguous, albeit more complex, than HomoloGene results alone.

4.1 The inner workings of BI-HUMANIZER

All algorithms were implemented in the Python language. The BI-HUMANIZER program works by performing the following steps (see Figure 4):

I. BLAST extension and annotation of the query sequences: Every sequence in the query was searched against the same-organism EST database and the same-organism portion of the nucleotide (nt) database to identify highly homologous sequences (at least 80 bp at 97% homology for the EST database and at least 80 bp at 98% for the nt database). This and all following BLAST searches were done with low complexity and repeat filtering. The result was used for two purposes:

A. Query extension: For every sequence in the query, three cDNA sequences were selected, which provided the longest sequence overall and the longest 5’ and 3’ flank extensions. Duplicates were eliminated.

B. For every sequence in the query, the affinity to the same-organism (in this study, mouse) UniGene cluster(s) was determined. This was done by assigning all highly homologous hits to UniGene clusters and counting the hit fraction of “ingredient” UniGene clusters. Minor (<10% of hits) associations were discarded. The number of associating hits was also counted for every “ingredient”.


II. BLAST species crossover: Every query sequence with its extension sequences was BLAST-searched against the EST database and the corresponding portion of the nt database of another organism (in our case, human). Hits with a homology of at least 60 bp and 75% were accepted for further analysis.

III. Handling of the BLAST crossover results: For each original query sequence and its extensions, all cross-species hits were combined and annotated using the other-organism UniGene database as described for stage I (B). The final result of the BLAST crossover was a list of “major ingredient” other-organism UniGene clusters, or, in some cases, clusters of non-UniGene sequences. The strength of putative homologies was expressed as a) a fraction of hits for a particular UniGene “ingredient”; b) an average homology percentage and c) the number of crossover hits for this “ingredient”.

IV. HomoloGene species crossover:

A. The UniGene cluster “ingredients” were found for highly homologous hits to the query sequence in stage I (B) (see above).

B. Cross-species homologs were obtained using HomoloGene, if available. The strength of the putative homology was represented in HomoloGene by combination of a nucleotide homology percentage and designation that identified the link as curated (c), a reciprocal best hit (b), the ternary reciprocal best hit involving some other UniGene organism (B), the best hit from organism A to organism B only (f), and the best in B to A direction only (s).

V. Comparison of HomoloGene and BLAST crossover results: Both results were combined in one list. Curated homologies from HomoloGene overrode any other possible homologies if association of the query sequence with its assigned UniGene cluster was strong enough (>=2 sequences; >=50% of highly homologous sequences were from this cluster). This was also true for homologies not explicitly marked curated, but with genes having identical or very similar annotations and UniGene symbols. Homologies that were in agreement between two analyses overrode all weaker homologies. In case of disagreement between analyses, both possibilities were accepted. Query sequences that did not produce BLAST crossover results were “recycled”: steps I-III and V were then repeated once more with the homology percentage necessary for crossover lowered to 65%.
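Two parts of the procedure above, the UniGene affinity calculation of stage I (B) and the precedence rules of stage V, can be illustrated with a short sketch. The function names and data structures are hypothetical, and the rule for uncurated homologies with identical annotations as well as the “recycling” step are deliberately omitted.

```python
from collections import Counter


def unigene_affinity(hit_clusters, min_fraction=0.10):
    """Map each major "ingredient" cluster to (hit count, hit fraction).

    hit_clusters: one UniGene cluster id per highly homologous BLAST hit.
    Associations below `min_fraction` of all hits are discarded.
    """
    counts = Counter(hit_clusters)
    total = len(hit_clusters)
    return {cluster: (n, n / total)
            for cluster, n in counts.items() if n / total >= min_fraction}


def combine_homologs(homologene, blast_clusters, affinity, query_cluster):
    """Choose homologs for one query according to the stage V precedence.

    homologene: dict mapping a homolog cluster to True if the link is curated
    blast_clusters: set of homolog clusters found by the BLAST crossover
    affinity: output of unigene_affinity() for the query
    query_cluster: the UniGene cluster assigned to the query
    """
    n_hits, fraction = affinity.get(query_cluster, (0, 0.0))
    strong = n_hits >= 2 and fraction >= 0.5
    curated = {c for c, is_curated in homologene.items() if is_curated}
    if strong and curated:
        return curated                      # curated links override the rest
    agreed = set(homologene) & set(blast_clusters)
    if agreed:
        return agreed                       # agreement overrides weaker links
    return set(homologene) | set(blast_clusters)  # disagreement: keep both
```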


[Figure 4: flowchart. Mouse GenBank ID list; I. BLAST retrieval of same-gene mRNA sequences (I.A: best 3’, 5’ mouse mRNA extensions; I.B: determination of UniGene affinities and UniGene annotation); II. BLAST species crossover; III. determination of UniGene affinities (human UniGene); IV. HomoloGene species crossover; V. result comparison and combined results; “recycling” of hitless queries at lower stringency; final list of homologs.]

Figure 4. Conversion of the mouse genes to their human homologs: Bi-Humanizer program schematics.


5. Results

5.1 Genes upregulated in the murine prostate after tumor suppressor inactivation

We selected 83 genes that are upregulated and highly expressed in prostates of p53 KO mice (potentially p53-repressed genes), 157 genes in prostates of TRAMP mice (potentially p53- or pRB-repressed genes), and 46 such genes in PTEN+/- mice. We selected genes with relatively high expression to facilitate easy detection of our pre-candidate markers on the mRNA and/or protein level.

Notably, almost all genes upregulated in p53 KO prostates are also upregulated in TRAMP prostates, with only 4 genes being p53 KO-specific (Figure 5). This reduced the total number of selected genes for these two models to 161. In TRAMP prostates, both pRB and p53 are inactivated, while in p53 KO prostates only p53 is inactive. Accordingly, it is logical to expect the genes upregulated in the p53 KO model to be a subset of the genes upregulated in the TRAMP model. This observation confirms the validity of our experimental system.

For 46 genes identified in PTEN+/- experiments, 36 remained after the removal of duplicates, repeats and Ig sequences. For all 36 sequences, 37 human orthologs and paralogs were found, 32 of those specific for PTEN model. However, PTEN+/- -derived genes were not used in subsequent studies.


[Figure 5: Venn diagram of the three gene groups; region counts shown in the diagram: 4, 79, 78, 3, 0, 6, 37.]

Figure 5. The relationship between gene groups upregulated in p53 KO (red), TRAMP (green) and PTEN+/- (blue) mouse models.


5.2 BI-HUMANIZER results

161 pre-candidates were isolated from p53 KO and TRAMP models. After the removal of entries derived from repeats, Ig and MHC loci, 134 unique genes remained. For 118 (88%) of those, 131 human orthologs and paralogs were identified. Those pre-candidate marker genes were subjected to the subsequent steps of selection and verification (Figure 2).

5.3 Known cancer markers

Among the final list of pre-candidate genes we found a number of previously reported cancer markers (Table VIII), most of which were not known as markers of prostate cancer.

TABLE VIII

KNOWN AND POTENTIAL TUMOR MARKERS IDENTIFIED AS GENES POTENTIALLY REPRESSED BY p53 AND pRB

Gene Symbol | Protein name | Marker properties and literature references
MIA | Melanoma inhibitory activity | Melanoma marker [180]
APOD | Apolipoprotein D | Prostate tumor marker [181]
SLPI | Secretory leukocyte protease inhibitor | Ovarian cancer - potential marker [182]
GPX3 | Glutathione peroxidase 3 (plasma) | Ovarian cancer - potential marker [182]
APOE | Apolipoprotein E | Ovarian cancer - potential marker [182]
PTGDS | Prostaglandin D2 synthase (21kD, brain) | Meningioma marker [183]
ENPEP | Glutamyl aminopeptidase (aminopeptidase A) | Elevated in cervical tumors [184]
CA6 | Carbonic anhydrase VI | Target for anti-cancer treatment [185]

6. Discussion

Our TSG inactivation-based approach to the search for cancer markers was based on the observation that PSA is negatively regulated by the tumor suppressor p53 [82]. The final list of pre-candidate genes contains several known and potential tumor markers (Table VIII), most of which were not previously known as markers of prostate cancer. This result demonstrates the feasibility of our approach and its applicability to other cancer types (see also the Discussion in the next section).

7. Conclusions

We proposed and tested a novel in vivo approach to a cancer marker search. The resulting pre-candidates will join pre-candidates from the in silico profiling approach for further selection and validation (Figure 2).

V. VERIFICATION OF THE SELECTED PRE-CANDIDATE GENES BY MICROARRAY HYBRIDIZATION

1. Rationale

Once the pre-candidate list was assembled from the combined EST data mining and TSG inactivation approaches, the next step was to confirm the expression profiles of the pre-candidate genes in microarray experiments and to produce a manageable list of candidates for further studies. We designed the Prostate Marker Array and tested the expression profiles of our pre-candidate genes across a spectrum of tumor-derived cell lines representing a large variety of human cancers, including a set of prostate cancer samples. Evaluating expression profiles across a large and diverse set of cell lines provides a rapid means of marker verification. The primary goal of this experiment was to identify novel candidate genes that are prostate-specific. The minimal condition for validation was set at 1.5x upregulation (compared to median expression) in at least two human prostate-derived cell lines (at least two replicate experiments in one cell line and at least one replicate experiment in another cell line).
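The minimal validation condition stated above can be expressed as a short predicate. This is only a sketch under stated assumptions: the function name, the input layout (a mapping from cell line to a list of replicate expression ratios relative to the median) and the use of "greater than or equal to" at the 1.5x threshold are illustrative choices, not details taken from the analysis software actually used.

```python
UPREGULATION_THRESHOLD = 1.5  # fold change relative to median expression


def passes_minimal_validation(ratios_by_cell_line, threshold=UPREGULATION_THRESHOLD):
    """Check the minimal validation condition for one gene.

    ratios_by_cell_line maps a prostate-derived cell line name to the list of
    expression ratios (vs. median) from its replicate experiments.
    Assumption: a replicate counts as upregulated when ratio >= threshold.
    """
    # Number of upregulated replicates in each prostate-derived cell line.
    hits = [
        sum(1 for ratio in replicates if ratio >= threshold)
        for replicates in ratios_by_cell_line.values()
    ]
    lines_with_any_hit = sum(1 for h in hits if h >= 1)
    has_line_with_two_hits = any(h >= 2 for h in hits)
    # At least two replicates in one line and at least one in another line.
    return has_line_with_two_hits and lines_with_any_hit >= 2
```

For example, a gene with ratios `{"line_A": [1.6, 1.7], "line_B": [1.5]}` would pass, while one that is upregulated in a single replicate of each of two cell lines would not (the cell line names here are placeholders).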

2. Introduction

We selected 131 genes using mouse models of TSG inactivation, and 353 genes using in silico expression profiling, with 5 genes in common. An additional 30 genes were added as controls. For the majority of these genes, cDNA clones were ordered and printed on a custom cDNA microarray.

3. Materials and Methods

3.1 cDNA clone selection and ordering

The selection of EST-associated cDNA clones described below was performed with the PickEST [177] software. As a side product of our profiling procedures, for each profile we obtained a list of highly homologous ESTs with information about their Clone ID, sequenced length and, for mRNA BLAST-based clusters, the location of the ESTs relative to the parent transcript. We excluded from selection: a) clones with >50 bp of Alu or repetitive element homology; b) cDNA clones with ESTs that belong to multiple UniGene clusters; c) cDNA clones whose ESTs have a UniGene assignment different from that of the majority of ESTs in the same profile; d) cDNA clones present in more than one profile after redundancy elimination.
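The four exclusion rules (a) to (d) can be sketched as a simple filter. This is not the PickEST implementation; the record layout, field names and function names below are hypothetical and chosen only to illustrate how the rules combine.

```python
from dataclasses import dataclass

MAX_REPEAT_BP = 50  # clones with more than 50 bp of Alu/repeat homology are excluded


@dataclass(frozen=True)
class CloneRecord:
    """Hypothetical record describing one cDNA clone within one profile."""
    clone_id: str
    repeat_homology_bp: int          # longest Alu/repetitive-element match, bp
    est_unigene_clusters: frozenset  # UniGene clusters of the clone's ESTs
    n_profiles: int                  # profiles containing the clone after redundancy elimination


def exclusion_reason(clone, profile_majority_cluster):
    """Return the letter of the first exclusion rule that applies, or None."""
    if clone.repeat_homology_bp > MAX_REPEAT_BP:
        return "a"  # Alu or repetitive element homology
    if len(clone.est_unigene_clusters) > 1:
        return "b"  # ESTs belong to multiple UniGene clusters
    if (clone.est_unigene_clusters
            and profile_majority_cluster not in clone.est_unigene_clusters):
        return "c"  # assignment differs from the majority of ESTs in the profile
    if clone.n_profiles > 1:
        return "d"  # clone present in more than one profile
    return None


def eligible_clones(clones, profile_majority_cluster):
    """Keep only the clones that no exclusion rule removes."""
    return [c for c in clones if exclusion_reason(c, profile_majority_cluster) is None]
```

Returning the rule letter rather than a plain boolean makes it easy to tabulate why clones were rejected.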

First preference in selection was given to sequence-verified clones. If available, one such clone was ordered for each candidate transcript. If no sequence-verified cDNA clone was available, the two best non-verified cDNA clones were selected. This limited redundancy of non-verified clones was necessary to increase the reliability of the data and to prevent the loss of transcripts from the array due to a wrong association between an EST and its cDNA clone. If a profile originated from BLAST results, we selected the two clones with the best potential coverage of the mRNA sequence according to the positions of their sequenced (EST) portions. The projected modal length of a cDNA clone was assumed to be ~1000 bp. If a profile was derived from an EST clustering result, the two cDNA clones with the longest sequenced portions were chosen. The selection algorithm favored prostate-derived cDNA clones, which potentially allowed selection of prostate-specific transcripts of candidate genes, if any existed. We also avoided phage-infected ranges of I.M.A.G.E. plates.
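The clone preference rules described above can be illustrated with a small sketch. Several details are assumptions made for the example rather than facts about the software used: the record layout, the interpretation of "best potential coverage" as the pair of clones whose projected 1000 bp spans cover the most mRNA bases, and the use of prostate origin as a tie-breaker.

```python
from dataclasses import dataclass
from itertools import combinations

PROJECTED_CLONE_LEN = 1000  # projected modal cDNA clone length in bp (from the text)


@dataclass(frozen=True)
class Candidate:
    """Hypothetical description of one eligible cDNA clone."""
    clone_id: str
    verified: bool        # sequence-verified clone
    from_prostate: bool   # derived from a prostate cDNA library
    sequenced_len: int    # length of the sequenced (EST) portion, bp
    est_start: int = 0    # EST start on the parent mRNA (BLAST-derived profiles)


def projected_span(c, mrna_len):
    """Interval of the mRNA that the clone is projected to cover."""
    return c.est_start, min(c.est_start + PROJECTED_CLONE_LEN, mrna_len)


def pair_coverage(a, b, mrna_len):
    """Number of mRNA bases covered by the union of two projected spans."""
    (s1, e1), (s2, e2) = sorted([projected_span(a, mrna_len), projected_span(b, mrna_len)])
    overlap = max(0, min(e1, e2) - s2)
    return (e1 - s1) + (e2 - s2) - overlap


def pick_clones(candidates, blast_derived, mrna_len=0):
    """Choose one verified clone, or the two best non-verified clones."""
    verified = [c for c in candidates if c.verified]
    if verified:
        # One sequence-verified clone; prostate origin preferred (assumption).
        return [max(verified, key=lambda c: (c.from_prostate, c.sequenced_len))]
    if len(candidates) <= 2:
        return list(candidates)
    if blast_derived:
        # Pair with the largest projected coverage of the parent mRNA.
        best = max(combinations(candidates, 2),
                   key=lambda p: pair_coverage(p[0], p[1], mrna_len))
        return list(best)
    # EST-cluster profiles: longest sequenced portions, prostate as tie-breaker.
    ranked = sorted(candidates, key=lambda c: (c.sequenced_len, c.from_prostate), reverse=True)
    return ranked[:2]
```

The pairwise search is quadratic in the number of clones per profile, which is acceptable here because each profile contributes only a handful of eligible clones.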

For 489 out of 499 genes, EST clones were available and ordered from Research Genetics (http://www.resgen.com). A total of 734 clones were successfully printed on the Prostate Marker Array (Table IX).

TABLE IX

CDNA CLONES PRINTED ON PROSTATE MARKER ARRAY

cDNA | Primary ID | Type of Primary ID | Original Definition | UGCluster (current) | UGName (current) | UGSymbol (current) | Origin
1 | 1034473 | IMAGE | alpha-methylacyl-CoA racemase | Hs.508343 | Alpha-methylacyl-CoA racemase | AMACR | CONTROL
2 | 81434 | IMAGE | Homo sapiens arachidonate 12-lipoxygenase (ALOX12), mRNA | Hs.422967 | Arachidonate 12-lipoxygenase | ALOX12 | CONTROL
3 | 1338603 | IMAGE | Homo sapiens arachidonate 12-lipoxygenase (ALOX12), mRNA | Hs.422967 | Arachidonate 12-lipoxygenase | ALOX12 | CONTROL
4 | 3278940 | IMAGE | Homo sapiens ATP-binding cassette, sub-family C (CFTR/MRP), member 4 (ABCC4), mRNA | Hs.508423 | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 | ABCC4 | CONTROL
5 | 3887306 | IMAGE | Homo sapiens ATP-binding cassette, sub-family C (CFTR/MRP), member 4 (ABCC4), mRNA | Hs.508423 | ATP-binding cassette, sub-family C (CFTR/MRP), member 4 | ABCC4 | CONTROL
6 | 5186493 | IMAGE | Homo sapiens Lutheran blood group (Auberger b antigen included) (LU), mRNA | Hs.155048 | Basal cell adhesion molecule (Lutheran blood group) | BCAM | CONTROL
7 | 5207444 | IMAGE | Homo sapiens Lutheran blood group (Auberger b antigen included) (LU), mRNA | Hs.155048 | Basal cell adhesion molecule (Lutheran blood group) | BCAM | CONTROL
8 | 345793 | IMAGE | calcium/calmodulin-dependent protein kinase kinase 2, beta | Hs.297343 | Calcium/calmodulin-dependent protein kinase kinase 2, beta | CAMKK2 | CONTROL
9 | 2489888 | IMAGE | Homo sapiens ERGL protein (ERGL), mRNA | Hs.187694 | Complexin 3 | CPLX3 | CONTROL
10 | 3894605 | IMAGE | HSCOXII Homo sapiens mitochondrial mRNA for cytochrome c oxidase subunit II | *MT-CO2 | cytochrome c oxidase subunit II | MT-CO2 | CONTROL
11 | 4154093 | IMAGE | HSCOXII Homo sapiens mitochondrial mRNA for cytochrome c oxidase subunit II | *MT-CO2 | cytochrome c oxidase subunit II | MT-CO2 | CONTROL
12 | 3891282 | IMAGE | I27367 Sequence 3 from patent US 5565323 | *MT-CO3 | cytochrome c oxidase subunit III | MT-CO3 | CONTROL
13 | 3891333 | IMAGE | I27367 Sequence 3 from patent US 5565323 | *MT-CO3 | cytochrome c oxidase subunit III | MT-CO3 | CONTROL
14 | 284701 | IMAGE | Homo sapiens folate hydrolase (prostate-specific membrane antigen) 1 (FOLH1), mRNA | Hs.380325 | Folate hydrolase (prostate-specific membrane antigen) 1 | FOLH1 | CONTROL
15 | NM_004476 | GENBANK | Homo sapiens folate hydrolase (prostate-specific membrane antigen) 1 (FOLH1), mRNA | Hs.380325 | Folate hydrolase (prostate-specific membrane antigen) 1 | FOLH1 | CONTROL
16 | NM_004476 | GENBANK | Homo sapiens folate hydrolase (prostate-specific membrane antigen) 1 (FOLH1), mRNA | Hs.380325 | Folate hydrolase (prostate-specific membrane antigen) 1 | FOLH1 | CONTROL
17 | 2961614 | IMAGE | prostate differentiation factor | Hs.515258 | Growth differentiation factor 15 | GDF15 | CONTROL
18 | 3463930 | IMAGE | family member | Hs.437275 | Histone 1, H2bk | HIST1H2BK | CONTROL

19 | 4646709 | IMAGE | Homo sapiens macrophage myristoylated alanine-rich C kinase substrate (MACMARCKS), mRNA | Hs.75061 | MARCKS-like 1 | MARCKSL1 | CONTROL
20 | 5013486 | IMAGE | Homo sapiens macrophage myristoylated alanine-rich C kinase substrate (MACMARCKS), mRNA | Hs.75061 | MARCKS-like 1 | MARCKSL1 | CONTROL
21 | 784168 | IMAGE | Homo sapiens mRNA; cDNA DKFZp564A072 (from clone DKFZp564A072) | Hs.480311 | PDZ and LIM domain 5 | PDLIM5 | CONTROL
22 | 4760828 | IMAGE | Homo sapiens procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide (protein disulfide isomerase; thyroid hormone binding protein p55) (P4HB), mRNA | Hs.464336 | Procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide | P4HB | CONTROL
23 | 4761374 | IMAGE | Homo sapiens procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide (protein disulfide isomerase; thyroid hormone binding protein p55) (P4HB), mRNA | Hs.464336 | Procollagen-proline, 2-oxoglutarate 4-dioxygenase (proline 4-hydroxylase), beta polypeptide | P4HB | CONTROL
24 | 2267352 | IMAGE | Homo sapiens pyrroline-5-carboxylate reductase 1 (PYCR1), nuclear gene encoding mitochondrial protein, mRNA | Hs.458332 | Pyrroline-5-carboxylate reductase 1 | PYCR1 | CONTROL
25 | 5204965 | IMAGE | Homo sapiens pyrroline-5-carboxylate reductase 1 (PYCR1), nuclear gene encoding mitochondrial protein, mRNA | Hs.458332 | Pyrroline-5-carboxylate reductase 1 | PYCR1 | CONTROL
26 | 3846411 | IMAGE | Homo sapiens prostein protein (LOC85414), mRNA | Hs.278695 | Solute carrier family 45, member 3 | SLC45A3 | CONTROL
27 | 4661500 | IMAGE | Homo sapiens prostein protein (LOC85414), mRNA | Hs.278695 | Solute carrier family 45, member 3 | SLC45A3 | CONTROL
28 | 3950865 | IMAGE | transglutaminase 4 (prostate) | Hs.438265 | Transglutaminase 4 (prostate) | TGM4 | CONTROL
29 | 5260463 | IMAGE | Homo sapiens tumor protein D52 (TPD52), mRNA | Hs.368433 | Tumor protein D52 | TPD52 | CONTROL
30 | 5415187 | IMAGE | Homo sapiens tumor protein D52 (TPD52), mRNA | Hs.368433 | Tumor protein D52 | TPD52 | CONTROL
31 | 814798 | IMAGE | aldehyde dehydrogenase 1 family, member A3 | Hs.459538 | Aldehyde dehydrogenase 1 family, member A3 | ALDH1A3 | CONTROL, EST mining
32 | 3530715 | IMAGE | aldehyde dehydrogenase 6 family, member A1 | Hs.293970 | Aldehyde dehydrogenase 6 family, member A1 | ALDH6A1 | CONTROL, EST mining
33 | 1627940 | IMAGE | methylmalonate-semialdehyde dehydrogenase | Hs.293970 | Aldehyde dehydrogenase 6 family, member A1 | ALDH6A1 | CONTROL, EST mining
34 | 4761053 | IMAGE | cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) | Hs.512599 | Cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) | CDKN2A | CONTROL, EST mining
35 | 3951036 | IMAGE | kallikrein 2, prostatic | Hs.515560 | Kallikrein 2, prostatic | KLK2 | CONTROL, EST mining
36 | 997856 | IMAGE | kallikrein 3, (prostate specific antigen) | Hs.171995 | Kallikrein 3, (prostate specific antigen) | KLK3 | CONTROL, EST mining
37 | 953487 | IMAGE | ESTs | Hs.171995 | Kallikrein 3, (prostate specific antigen) | KLK3 | CONTROL, EST mining
38 | 3950475 | IMAGE | UNKNOWN KALLIKREIN: kallikrein 2, prostatic:12; kallikrein 3, (prostate specific antigen):6; Unknown:2 | Hs.171995 | Kallikrein 3, (prostate specific antigen) | KLK3 | CONTROL, EST mining
39 | NM_001648 | GENBANK | UNKNOWN KALLIKREIN: kallikrein 2, prostatic:12; kallikrein 3, (prostate specific antigen):6; Unknown:2 | Hs.171995 | Kallikrein 3, (prostate specific antigen) | KLK3 | CONTROL, EST mining
40 | NM_001648 | GENBANK | UNKNOWN KALLIKREIN: kallikrein 2, prostatic:12; kallikrein 3, (prostate specific antigen):6; Unknown:2 | Hs.171995 | Kallikrein 3, (prostate specific antigen) | KLK3 | CONTROL, EST mining
41 | 4760751 | IMAGE | Homo sapiens keratin 18 (KRT18), mRNA | Hs.406013 | Keratin 18 | KRT18 | CONTROL, EST mining
42 | 5456792 | IMAGE | Homo sapiens keratin 18 (KRT18), mRNA | Hs.406013 | Keratin 18 | KRT18 | CONTROL, EST mining
43 | 2984862 | IMAGE | keratin 18 | Hs.406013 | Keratin 18 | KRT18 | CONTROL, EST mining
44 | 3349315 | IMAGE | keratin 18 | Hs.406013 | Keratin 18 | KRT18 | CONTROL, EST mining
45 | 3950847 | IMAGE | microseminoprotein, beta- | Hs.255462 | Microseminoprotein, beta- | MSMB | CONTROL, EST mining
46 | 2490680 | IMAGE | T cell receptor gamma locus | Hs.534032 | TCR gamma alternate reading frame protein | TARP | CONTROL, EST mining
47 | 281003 | IMAGE | T cell receptor gamma locus | Hs.534032 | TCR gamma alternate reading frame protein | TARP | CONTROL, EST mining
48 | 3846388 | IMAGE | Homo sapiens tetraspan 1 (TSPAN-1), mRNA | Hs.38972 | Tetraspanin 1 | TSPAN1 | CONTROL, EST mining
49 | 5205772 | IMAGE | Homo sapiens tetraspan 1 (TSPAN-1), mRNA | Hs.38972 | Tetraspanin 1 | TSPAN1 | CONTROL, EST mining
50 | 2320932 | IMAGE | tetraspan 1 | Hs.38972 | Tetraspanin 1 | TSPAN1 | CONTROL, EST mining
51 | 47588 | IMAGE | AF052107 Homo sapiens clone 23620 mRNA sequence | Hs.593538 | Clone 23620 mRNA sequence |  | CONTROL, MusAffy_TRAMP*
52 | 2341042 | IMAGE | AF052107 Homo sapiens clone 23620 mRNA sequence | Hs.593538 | Clone 23620 mRNA sequence |  | CONTROL, MusAffy_TRAMP

53 | 5445812 | IMAGE | 1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta) | Hs.320151 | 1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta) | AGPAT2 | EST mining
54 | 5451348 | IMAGE | 1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta) | Hs.320151 | 1-acylglycerol-3-phosphate O-acyltransferase 2 (lysophosphatidic acid acyltransferase, beta) | AGPAT2 | EST mining
55 | 594322 | IMAGE | 3'-phosphoadenosine 5'-phosphosulfate synthase 1 | Hs.368610 | 3'-phosphoadenosine 5'-phosphosulfate synthase 1 | PAPSS1 | EST mining
56 | 969666 | IMAGE | Homo sapiens, clone IMAGE:3949285, mRNA | Hs.214142 | 5,10-methylenetetrahydrofolate reductase (NADPH) | MTHFR | EST mining
57 | 3028469 | IMAGE | hypothetical protein MGC4549 | Hs.1211 | Acid phosphatase 5, tartrate resistant | ACP5 | EST mining
58 | 2460509 | IMAGE | acid phosphatase, prostate | Hs.433060 | Acid phosphatase, prostate | ACPP | EST mining
59 | 916498 | IMAGE | ESTs | Hs.433060 | Acid phosphatase, prostate | ACPP | EST mining
60 | 942436 | IMAGE | ESTs | Hs.433060 | Acid phosphatase, prostate | ACPP | EST mining
61 | 4520219 | IMAGE | cytosolic acyl coenzyme A thioester hydrolase | Hs.126137 | Acyl-CoA thioesterase 7 | ACOT7 | EST mining
62 | 5457425 | IMAGE | cytosolic acyl coenzyme A thioester hydrolase | Hs.126137 | Acyl-CoA thioesterase 7 | ACOT7 | EST mining
63 | 2805917 | IMAGE | Golgi phosphoprotein 1 | Hs.520207 | Acyl-Coenzyme A binding domain containing 3 | ACBD3 | EST mining
64 | 4805900 | IMAGE | golgi phosphoprotein 1 | Hs.520207 | Acyl-Coenzyme A binding domain containing 3 | ACBD3 | EST mining
65 | 5446744 | IMAGE | hypothetical protein FLJ12443 | Hs.368853 | Acyltransferase like 2 | AYTL2 | EST mining
66 | 5452520 | IMAGE | hypothetical protein FLJ12443 | Hs.368853 | Acyltransferase like 2 | AYTL2 | EST mining
67 | 4761035 | IMAGE | a disintegrin and metalloproteinase domain 10 gene:ADAM10 MIM:602192 from:NT_010289 | Hs.578508 | ADAM metallopeptidase domain 10 | ADAM10 | EST mining
68 | 4807547 | IMAGE | a disintegrin and metalloproteinase domain 10 gene:ADAM10 MIM:602192 from:NT_010289 | Hs.578508 | ADAM metallopeptidase domain 10 | ADAM10 | EST mining
69 | 754367 | IMAGE | adaptor protein containing pH domain, PTB domain and leucine zipper motif | Hs.476415 | Adaptor protein containing pH domain, PTB domain and leucine zipper motif 1 | APPL | EST mining
70 | 3544131 | IMAGE | adenosine kinase | Hs.584739 | Adenosine kinase | ADK | EST mining
71 | 3140402 | IMAGE | ADP-ribosylation factor-like 2 | Hs.502836 | ADP-ribosylation factor-like 2 | ARL2 | EST mining
72 | 3510557 | IMAGE | hypothetical protein FLJ10815 | Hs.10499 | transporter | FLJ10815 | EST mining
73 | 613303 | IMAGE | Homo sapiens cDNA: FLJ21354 fis, clone COL02773 | Hs.471130 | Amyotrophic lateral sclerosis 2 (juvenile) chromosome region, candidate 13 | ALS2CR13 | EST mining
74 | 2544446 | IMAGE | angiopoietin 1 | Hs.369675 | Angiopoietin 1 | ANGPT1 | EST mining
75 | 4805884 | IMAGE | ribosomal protein S26 | Hs.567235 | Ankyrin 2, neuronal | ANK2 | EST mining
76 | 4808209 | IMAGE | ribosomal protein S26 | Hs.567235 | Ankyrin 2, neuronal | ANK2 | EST mining
77 | 941685 | IMAGE | KIAA1085 protein | Hs.591259 | Ankyrin repeat and KH domain containing 1 | ANKHD1 | EST mining
78 | 4858172 | IMAGE | KIAA1085 protein | Hs.591259 | Ankyrin repeat and KH domain containing 1 | ANKHD1 | EST mining
79 | 841229 | IMAGE | hypothetical protein DKFZp564O043 | Hs.157378 | Ankyrin repeat and MYND domain containing 2 | ANKMY2 | EST mining
80 | 2988009 | IMAGE | annexin A10 | Hs.188401 | Annexin A10 | ANXA10 | EST mining

81 | 4429186 | IMAGE | annexin A11 | Hs.530291 | Annexin A11 | ANXA11 | EST mining
82 | 4429256 | IMAGE | annexin A11 | Hs.530291 | Annexin A11 | ANXA11 | EST mining
83 | 510576 | IMAGE | anterior gradient 2 (Xenopus laevis) homolog | Hs.530009 | Anterior gradient 2 homolog (Xenopus laevis) | AGR2 | EST mining
84 | 3506309 | IMAGE | hypothetical protein FLJ14497 | Hs.533655 | Apoptosis-inducing factor (AIF)-like -associated inducer of death | AMID | EST mining
85 | 2966827 | IMAGE | hypothetical protein MGC2840 similar to a putative glucosyltransferase | Hs.503368 | -linked 8 homolog (yeast, alpha-1,3-glucosyltransferase) | ALG8 | EST mining
86 | 3349233 | IMAGE |  | Hs.332422 | Aspartate beta-hydroxylase | ASPH | EST mining
87 | 725117 | IMAGE | astrotactin 2 | Hs.209217 | Astrotactin 2 | ASTN2 | EST mining
88 | 4431205 | IMAGE | ATP-binding cassette, sub-family E (OABP), member 1 | Hs.12013 | ATP-binding cassette, sub-family E (OABP), member 1 | ABCE1 | EST mining
89 | 4539458 | IMAGE | accessory proteins BAP31/BAP29 | Hs.522817 | B-cell receptor-associated protein 31 | BCAP31 | EST mining
90 | 4669719 | IMAGE | accessory proteins BAP31/BAP29 | Hs.522817 | B-cell receptor-associated protein 31 | BCAP31 | EST mining
91 | 4738841 | IMAGE | ESTs, Highly similar to B Chain B, Crystal Structure Of The Human Cdk2 Kinase Complex With Cell Cycle-Regulatory Protein Ckshs1 (Homo sapiens) | Hs.525572 | receptor B2 | BDKRB2 | EST mining
92 | BG434111 | GENBANK | ESTs, Weakly similar to B28096 line-1 protein ORF2 (Homo sapiens) | Hs.500526 | BTAF1 RNA polymerase II, B-TFIID transcription factor-associated, 170kDa (Mot1 homolog, S. cerevisiae) | BTAF1 | EST mining
93 | 998429 | IMAGE | ESTs, Weakly similar to B28096 line-1 protein ORF2 (Homo sapiens) | Hs.500526 | BTAF1 RNA polymerase II, B-TFIID transcription factor-associated, 170kDa (Mot1 homolog, S. cerevisiae) | BTAF1 | EST mining
94 | 4603470 | IMAGE | ESTs, Weakly similar to B28096 line-1 protein ORF2 (Homo sapiens) | Hs.500526 | BTAF1 RNA polymerase II, B-TFIID transcription factor-associated, 170kDa (Mot1 homolog, S. cerevisiae) | BTAF1 | EST mining
95 | 957816 | IMAGE | ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALI (Homo sapiens) | Hs.531614 | BTB (POZ) domain containing 14B | BTBD14B | EST mining
96 | 4520562 | IMAGE | ESTs, Weakly similar to TRHY_HUMAN TRICHOHYALI (Homo sapiens) | Hs.531614 | BTB (POZ) domain containing 14B | BTBD14B | EST mining
97 | 2959664 | IMAGE | BTB (POZ) domain containing 2 | Hs.465543 | BTB (POZ) domain containing 2 | BTBD2 | EST mining
98 | 3162813 | IMAGE | BTB (POZ) domain containing 2 | Hs.465543 | BTB (POZ) domain containing 2 | BTBD2 | EST mining
99 | 453602 | IMAGE | Homo sapiens cDNA FLJ11629 fis, clone HEMBA1004241 | Hs.490203 | Caldesmon 1 | CALD1 | EST mining
100 | 814145 | IMAGE | jumping translocation breakpoint | Hs.372924 | CAMP responsive element binding protein 3-like 4 | CREB3L4 | EST mining

101 | 4538252 | IMAGE | ESTs, Highly similar to B Chain B, Crystal Structure Of The Human Cdk2 Kinase Complex With Cell Cycle-Regulatory Protein Ckshs1 (Homo sapiens) | Hs.374378 | CDC28 protein kinase regulatory subunit 1B | CKS1B | EST mining
102 | 4738617 | IMAGE | ESTs, Weakly similar to A37413 calbindin D, 28K (Homo sapiens) | Hs.197042 | CDNA clone IMAGE:4667929 |  | EST mining
103 | 4738621 | IMAGE | ESTs, Weakly similar to A37413 calbindin D, 28K (Homo sapiens) | Hs.197042 | CDNA clone IMAGE:4667929 |  | EST mining
104 | 2441445 | IMAGE | ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION WARNING ENTRY (Homo sapiens) | Hs.592891 | CDNA clone IMAGE:4820528 |  | EST mining
105 | 2459422 | IMAGE | ESTs, Weakly similar to ALU7_HUMAN ALU SUBFAMILY SQ SEQUENCE CONTAMINATION WARNING ENTRY (Homo sapiens) | Hs.592891 | CDNA clone IMAGE:4820528 |  | EST mining
106 | 773446 | IMAGE | Homo sapiens, clone MGC:17225 IMAGE:4151716, mRNA, complete cds | Hs.497443 | Chromosome 1 open reading frame 37 | C1orf37 | EST mining
107 | 666254 | IMAGE | hypothetical protein | Hs.592297 | Chromosome 14 open reading frame 129 | C14orf129 | EST mining
108 | 2094077 | IMAGE | hypothetical protein MGC13251 | Hs.317821 | Chromosome 14 open reading frame 151 | C14orf151 | EST mining
109 | 767346 | IMAGE | chromosome 14 open reading frame 4 | Hs.179260 | Chromosome 14 open reading frame 4 | C14orf4 | EST mining
110 | 942167 | IMAGE | similar to 40S RIBOSOMAL PROTEIN S24 (S19) gene:LOC94929 InterimID:94929 from:NT_026437 | Hs.509966 | Chromosome 14 open reading frame 58 | C14orf58 | EST mining
111 | 4429153 | IMAGE | hypothetical protein MGC15563 | Hs.442782 | Chromosome 14 open reading frame 94 | C14orf94 | EST mining
112 | 5450302 | IMAGE | hypothetical protein MGC15563 | Hs.442782 | Chromosome 14 open reading frame 94 | C14orf94 | EST mining
113 | 4738025 | IMAGE | 60S ribosomal protein L30 isolog | Hs.274772 | Chromosome 15 open reading frame 15 | C15orf15 | EST mining
114 | 4806924 | IMAGE | 60S ribosomal protein L30 isolog | Hs.274772 | Chromosome 15 open reading frame 15 | C15orf15 | EST mining
115 | 4807252 | IMAGE | 60S ribosomal protein L30 isolog | Hs.274772 | Chromosome 15 open reading frame 15 | C15orf15 | EST mining
116 | 4808079 | IMAGE | 60S ribosomal protein L30 isolog | Hs.274772 | Chromosome 15 open reading frame 15 | C15orf15 | EST mining
117 | 5455969 | IMAGE | hypothetical protein DKFZp761D0211 | Hs.513225 | Chromosome 16 open reading frame 9 | C16orf9 | EST mining
118 | 5457252 | IMAGE | hypothetical protein DKFZp761D0211 | Hs.513225 | Chromosome 16 open reading frame 9 | C16orf9 | EST mining
119 | 4518955 | IMAGE | HSPC154 protein | Hs.532835 | Chromosome 18 open reading frame 55 | C18orf55 | EST mining
120 | 4805634 | IMAGE | HSPC154 protein | Hs.532835 | Chromosome 18 open reading frame 55 | C18orf55 | EST mining
121 | 4520251 | IMAGE | hypothetical protein FLJ20550 | Hs.274422 | Chromosome 20 open reading frame 27 | C20orf27 | EST mining
122 | 4804550 | IMAGE | hypothetical protein FLJ20550 | Hs.274422 | Chromosome 20 open reading frame 27 | C20orf27 | EST mining
123 | 3039219 | IMAGE | ESTs | Hs.472285 | Chromosome 20 open reading frame 74 | C20orf74 | EST mining
124 | 3151868 | IMAGE | ESTs | Hs.472285 | Chromosome 20 open reading frame 74 | C20orf74 | EST mining
125 | 4429335 | IMAGE | hypothetical protein FLJ20627 gene:FLJ20627 LocusID:55005 from:NT_007295 | Hs.486835 | Chromosome 6 open reading frame 96 | C6orf96 | EST mining
126 | 4807986 | IMAGE | hypothetical protein FLJ20627 gene:FLJ20627 LocusID:55005 from:NT_007295 | Hs.486835 | Chromosome 6 open reading frame 96 | C6orf96 | EST mining
127 | 4538208 | IMAGE | ESTs, Moderately similar to S65657 alpha-1C-adrenergic receptor splice form 2 (Homo sapiens) | Hs.86970 | Chromosome 8 open reading frame 53 | C8orf53 | EST mining
128 | 4667676 | IMAGE | ESTs, Moderately similar to S65657 alpha-1C-adrenergic receptor splice form 2 (Homo sapiens) | Hs.86970 | Chromosome 8 open reading frame 53 | C8orf53 | EST mining
129 | 5446972 | IMAGE | hypothetical gene supported by BC009435 gene:LOC91376 InterimID:91376 from:NT_030037 | Hs.19322 | Chromosome 9 open reading frame 140 | C9orf140 | EST mining
130 | 5451391 | IMAGE | hypothetical gene supported by BC009435 gene:LOC91376 InterimID:91376 from:NT_030037 | Hs.19322 | Chromosome 9 open reading frame 140 | C9orf140 | EST mining
131 | 4430397 | IMAGE | Homo sapiens cDNA FLJ31795 fis, clone NT2RI2008812 | Hs.579115 | Coiled-coil domain containing 43 | CCDC43 | EST mining
132 | 5456348 | IMAGE | Homo sapiens cDNA FLJ31795 fis, clone NT2RI2008812 | Hs.579115 | Coiled-coil domain containing 43 | CCDC43 | EST mining
133 | 1632157 | IMAGE | KIAA0633 protein | Hs.99141 | Cordon-bleu homolog (mouse) | COBL | EST mining
134 | 953690 | IMAGE | ESTs | Hs.268326 | C-type lectin domain family 2, member D | CLEC2D | EST mining

135 | 2052086 | IMAGE | hypothetical protein FLJ22969 | Hs.476093 | CUB domain containing protein 1 | CDCP1 | EST mining
136 | 562811 | IMAGE | cullin 5 | Hs.440320 | Cullin 5 | CUL5 | EST mining
137 | 2963100 | IMAGE | cyclin B1 | Hs.23960 | Cyclin B1 | CCNB1 | EST mining
138 | 3508088 | IMAGE | cyclin D1 (PRAD1: parathyroid adenomatosis 1) | Hs.523852 | Cyclin D1 | CCND1 | EST mining
139 | 1253375 | IMAGE | specific granule protein (28 kDa), CRISP-3 | Hs.404466 | Cysteine-rich secretory protein 3 | CRISP3 | EST mining
140 | 1274584 | IMAGE | specific granule protein (28 kDa), CRISP-3 | Hs.404466 | Cysteine-rich secretory protein 3 | CRISP3 | EST mining
141 | 3535771 | IMAGE | cytochrome b-561 | Hs.355264 | Cytochrome b-561 | CYB561 | EST mining
142 | 4738995 | IMAGE | cytoplasmic FMRP interacting protein 1 | Hs.26704 | Cytoplasmic FMR1 interacting protein 1 | CYFIP1 | EST mining
143 | 5451248 | IMAGE | cytoplasmic FMRP interacting protein 1 | Hs.26704 | Cytoplasmic FMR1 interacting protein 1 | CYFIP1 | EST mining
144 | 4830450 | IMAGE | ribosomal protein S17 | Hs.547988 | Cytoplasmic element binding protein 1 | CPEB1 | EST mining
145 | 3454505 | IMAGE | hypothetical protein MGC5528 | Hs.315167 | Defective in sister chromatid cohesion homolog 1 (S. cerevisiae) | DCC1 | EST mining
146 | 1045098 | IMAGE | Homo sapiens deoxyribonuclease II, lysosomal (DNASE2), mRNA | Hs.118243 | Deoxyribonuclease II, lysosomal | DNASE2 | EST mining
147 | 1208024 | IMAGE | Homo sapiens deoxyribonuclease II, lysosomal (DNASE2), mRNA | Hs.118243 | Deoxyribonuclease II, lysosomal | DNASE2 | EST mining
148 | 4428393 | IMAGE | Homo sapiens hypothetical protein FLJ22955 (FLJ22955), mRNA | Hs.463148 | Dephospho-CoA kinase domain containing | DCAKD | EST mining
149 | 5457470 | IMAGE | Homo sapiens hypothetical protein FLJ22955 (FLJ22955), mRNA | Hs.463148 | Dephospho-CoA kinase domain containing | DCAKD | EST mining
150 | 2958243 | IMAGE | MYLE protein | Hs.592051 | Dexamethasone-induced transcript | DEXI | EST mining
151 | 429533 | IMAGE | Homo sapiens, clone IMAGE:3447394, mRNA, partial cds | Hs.515081 | Dipeptidyl-peptidase 9 | DPP9 | EST mining
152 | 79032 | IMAGE | CGI-82 protein | Hs.503453 | Discs, large homolog 2, chapsyn-110 () | DLG2 | EST mining
153 | 4518128 | IMAGE | KIAA0678 protein | Hs.12707 | DnaJ (Hsp40) homolog, subfamily C, member 13 | DNAJC13 | EST mining
154 | 4737439 | IMAGE | KIAA0678 protein | Hs.12707 | DnaJ (Hsp40) homolog, subfamily C, member 13 | DNAJC13 | EST mining
155 | 4477079 | IMAGE | KIAA1181 protein | Hs.509163 | Endoplasmic reticulum-golgi intermediate compartment (ERGIC) 1 | ERGIC1 | EST mining
156 | 4760916 | IMAGE | KIAA1181 protein | Hs.509163 | Endoplasmic reticulum-golgi intermediate compartment (ERGIC) 1 | ERGIC1 | EST mining
157 | 2969739 | IMAGE | endothelial differentiation-related factor 1 | Hs.174050 | Endothelial differentiation-related factor 1 | EDF1 | EST mining
158 | 4299357 | IMAGE | endothelial differentiation-related factor 1 | Hs.174050 | Endothelial differentiation-related factor 1 | EDF1 | EST mining
159 | 1631466 | IMAGE | ESTs, Weakly similar to 2004399A chromosomal protein (Homo sapiens) | Hs.476319 | Enoyl Coenzyme A hydratase domain containing 2 | ECHDC2 | EST mining
160 | 109863 | IMAGE | epithelial membrane protein 2 | Hs.531561 | Epithelial membrane protein 2 | EMP2 | EST mining

161 | 4667989 | IMAGE | secretory protein SEC8 | Hs.592280 | Exocyst complex component 4 | EXOC4 | EST mining
162 | 5296566 | IMAGE | secretory protein SEC8 | Hs.592280 | Exocyst complex component 4 | EXOC4 | EST mining
163 | 4476620 | IMAGE | KIAA1067 protein gene:KIAA1067 LocusID:23265 from:NT_010641 | Hs.533985 | Exocyst complex component 7 | EXOC7 | EST mining
164 | 5445768 | IMAGE | KIAA1067 protein gene:KIAA1067 LocusID:23265 from:NT_010641 | Hs.533985 | Exocyst complex component 7 | EXOC7 | EST mining
165 | 982653 | IMAGE | ESTs | Hs.115792 |  | EXOSC7 | EST mining
166 | 239692 | IMAGE | KIAA1170 protein | Hs.489988 | Family with sequence similarity 40, member B | FAM40B | EST mining
167 | 3501828 | IMAGE | hypothetical protein | Hs.126941 | Family with sequence similarity 49, member B | FAM49B | EST mining
168 | 3610547 | IMAGE | hypothetical protein MGC14128:2; Homo sapiens cDNA: FLJ22033 fis, clone HEP08810, highly similar to HSU43374 Human normal keratinocyte mRNA:1 | Hs.379821 | Family with sequence similarity 83, member A | FAM83A | EST mining
169 | 810089 | IMAGE | ESTs, Weakly similar to I38022 hypothetical protein (Homo sapiens) | Hs.67776 | Family with sequence similarity 83, member H | FAM83H | EST mining
170 | 1650748 | IMAGE | ESTs, Weakly similar to I38022 hypothetical protein (Homo sapiens) | Hs.124951 | Family with sequence similarity 84, member B | FAM84B | EST mining
171 | 2541527 | IMAGE | Fanconi anemia, complementation group E | Hs.302003 | Fanconi anemia, complementation group E | FANCE | EST mining
172 | 5220129 | IMAGE | Fanconi anemia, complementation group E | Hs.302003 | Fanconi anemia, complementation group E | FANCE | EST mining
173 | 4429332 | IMAGE | singed (Drosophila)-like (sea urchin fascin homolog like) | Hs.118400 | Fascin homolog 1, actin-bundling protein (Strongylocentrotus purpuratus) | FSCN1 | EST mining
174 | 5449146 | IMAGE | singed (Drosophila)-like (sea urchin fascin homolog like) | Hs.118400 | Fascin homolog 1, actin-bundling protein (Strongylocentrotus purpuratus) | FSCN1 | EST mining
175 | 4519190 | IMAGE | ferritin, heavy polypeptide 1 | Hs.524910 | Ferritin, heavy polypeptide 1 | FTH1 | EST mining
176 | 4738132 | IMAGE | ferritin, heavy polypeptide 1 | Hs.524910 | Ferritin, heavy polypeptide 1 | FTH1 | EST mining
177 | 3030037 | IMAGE | heparin-binding growth factor binding protein | Hs.1690 | Fibroblast growth factor binding protein 1 | FGFBP1 | EST mining
178 | 3689309 | IMAGE | heparin-binding growth factor binding protein | Hs.1690 | Fibroblast growth factor binding protein 1 | FGFBP1 | EST mining
179 | 2821792 | IMAGE | flap structure-specific endonuclease 1 | Hs.409065 | Flap structure-specific endonuclease 1 | FEN1 | EST mining
180 | 3688745 | IMAGE | follistatin | Hs.9914 | Follistatin | FST | EST mining
181 | 1610546 | IMAGE | hepatocyte nuclear factor 3, alpha | Hs.163484 | Forkhead box A1 | FOXA1 | EST mining
182 | 4520368 | IMAGE | FOS-like antigen-1 | Hs.283565 | FOS-like antigen 1 | FOSL1 | EST mining
183 | 5432425 | IMAGE | FOS-like antigen-1 | Hs.283565 | FOS-like antigen 1 | FOSL1 | EST mining
184 | 213850 | IMAGE | four jointed box 1 (Drosophila) | Hs.39384 | Four jointed box 1 (Drosophila) | FJX1 | EST mining
185 | 3445488 | IMAGE | FtsJ homolog 3 (E. coli) | Hs.463785 | FtsJ homolog 3 (E. coli) | FTSJ3 | EST mining
186 | 810229 | IMAGE | RNA helicase-related protein | Hs.463785 | FtsJ homolog 3 (E. coli) | FTSJ3 | EST mining
187 | 245583 | IMAGE | hypothetical protein FLJ20651 | Hs.130643 | Full-length cDNA clone CS0DB009YL20 of Neuroblastoma Cot 10-normalized of Homo sapiens (human) |  | EST mining

188 | 296889 | IMAGE | hypothetical protein FLJ20651 | Hs.130643 | Full-length cDNA clone CS0DB009YL20 of Neuroblastoma Cot 10-normalized of Homo sapiens (human) |  | EST mining
189 | 362875 | IMAGE | hypothetical protein FLJ20651 | Hs.130643 | Full-length cDNA clone CS0DB009YL20 of Neuroblastoma Cot 10-normalized of Homo sapiens (human) |  | EST mining
190 | 3997821 | IMAGE | FXYD domain-containing ion transport regulator 3 | Hs.301350 | FXYD domain containing ion transport regulator 3 | FXYD3 | EST mining
191 | 4669868 | IMAGE | desmoplakin (DPI, DPII) | Hs.193832 | G patch domain containing 4 | GPATC4 | EST mining
192 | 5451153 | IMAGE | desmoplakin (DPI, DPII) | Hs.193832 | G patch domain containing 4 | GPATC4 | EST mining
193 | 3931152 | IMAGE | hypothetical protein FLJ22684:1; Homo sapiens cDNA FLJ30646 fis, clone CTONG2004716, weakly similar to Rattus norvegicus mRNA for seven transmembrane receptor:1 | Hs.256897 | G protein-coupled receptor 110 | GPR110 | EST mining
194 | 4431307 | IMAGE | hypothetical protein FLJ22684:1; Homo sapiens cDNA FLJ30646 fis, clone CTONG2004716, weakly similar to Rattus norvegicus mRNA for seven transmembrane receptor:1 | Hs.256897 | G protein-coupled receptor 110 | GPR110 | EST mining
195 | 84443 | IMAGE | GA-binding protein transcription factor, beta subunit 2 (47kD) | Hs.511316 | GA binding protein transcription factor, beta subunit 2 | GABPB2 | EST mining
196 | 2160324 | IMAGE | GA-binding protein transcription factor, beta subunit 2 (47kD) | Hs.511316 | GA binding protein transcription factor, beta subunit 2 | GABPB2 | EST mining
197 | 4429988 | IMAGE | GABA(A) receptor-associated protein | Hs.84359 | GABA(A) receptor-associated protein | GABARAP | EST mining
198 | 4666622 | IMAGE | GABA(A) receptor-associated protein | Hs.84359 | GABA(A) receptor-associated protein | GABARAP | EST mining
199 | 927237 | IMAGE | Homo sapiens, clone MGC:17399 IMAGE:3920640, mRNA, complete cds | Hs.435012 | Galactose mutarotase (aldose 1-epimerase) | GALM | EST mining
200 | 2153833 | IMAGE | Homo sapiens, clone MGC:17399 IMAGE:3920640, mRNA, complete cds | Hs.435012 | Galactose mutarotase (aldose 1-epimerase) | GALM | EST mining
201 | 40567 | IMAGE | GCN1 (general control of amino-acid synthesis 1, yeast)-like 1 | Hs.298716 | GCN1 general control of amino-acid synthesis 1-like 1 (yeast) | GCN1L1 | EST mining
202 | 1612075 | IMAGE | ESTs, Moderately similar to env protein (Homo sapiens) | *GENOMIC#1 | GENOMIC#1_Chr15q21.3 | *GENOMIC#1 | EST mining
203 | 2531417 | IMAGE | Homo sapiens cDNA FLJ32252 fis, clone PROST1000167, weakly similar to I | *GENOMIC#2 | GENOMIC#2_Chr16p13.3: OA: Homo sapiens cDNA FLJ32252 fis, clone PROST1000167, weakly similar to | *GENOMIC#2 | EST mining
204 | 1275017 | IMAGE | ESTs | *GENOMIC#3 | GENOMIC#3_Chr3q21 | *GENOMIC#3 | EST mining
205 | 1274324 | IMAGE |  | *GENOMIC#4 | GENOMIC#4_Chr4 | *GENOMIC#4 | EST mining
206 | 2960992 | IMAGE | open reading frame 3 | Hs.6454 | GIPC PDZ domain containing family, member 1 | GIPC1 | EST mining

Homo sapiens cDNA FLJ12534 fis, Glycerol-3-phosphate 207 253241 IMAGE clone NT2RM4000244 Hs.512382 dehydrogenase 2 (mitochondrial) GPD2 EST mining

208 3613853 IMAGE exosome component Rrp41 Hs.97726 Glycosylphosphatidylinositol anchor attachment protein 1 homolog (yeast) GPAA1 EST mining

209 625863 IMAGE Hs.560517 Golgi SNAP receptor complex member 1 GOSR1 EST mining

210 415264 IMAGE LGN protein Hs.584901 G-protein signalling modulator 2 (AGS3-like, C. elegans) GPSM2 EST mining

211 4806780 IMAGE H2A histone family, member Y Hs.586218 H2A histone family, member Y H2AFY EST mining

212 4807024 IMAGE H2A histone family, member Y Hs.586218 H2A histone family, member Y H2AFY EST mining

213 4738295 IMAGE ESTs, Highly similar to CH60_HUMAN 60 KDA HEAT SHOCK PROTEIN, MITOCHONDRIAL PRECURSOR (Homo sapiens) Hs.471014 Heat shock 60kDa protein 1 (chaperonin) HSPD1 EST mining

214 4669901 IMAGE ESTs, Highly similar to CH60_HUMAN 60 KDA HEAT SHOCK PROTEIN, MITOCHONDRIAL PRECURSOR (Homo sapiens) Hs.595053 Heat shock 60kDa protein 1 (chaperonin) HSPD1 EST mining

215 259633 IMAGE DKFZP564G092 protein Hs.51891 Hect domain and RLD 4 HERC4 EST mining

216 4804539 IMAGE Homo sapiens, clone IMAGE:3838859, mRNA, partial cds Hs.289232 Homo sapiens, clone IMAGE:4132557, mRNA EST mining

217 4808067 IMAGE Homo sapiens, clone IMAGE:3838859, mRNA, partial cds Hs.289232 Homo sapiens, clone IMAGE:4132557, mRNA EST mining

218 1902202 IMAGE ESTs Hs.585468 Homo sapiens, Similar to AD038, clone IMAGE:3838464, mRNA EST mining

219 5452681 IMAGE Homo sapiens hypothetical protein FLJ10233 (FLJ10233), mRNA Hs.374448 Homo sapiens, Similar to hypothetical protein FLJ10233, clone IMAGE:5555860, mRNA EST mining

220 4520109 IMAGE Hs.529758 Homo sapiens, Similar to LOC157533, clone IMAGE:4520109, mRNA EST mining

221 4665569 IMAGE Hs.529758 Homo sapiens, Similar to LOC157533, clone IMAGE:4520109, mRNA EST mining

222 840663 IMAGE KIAA1902 protein Hs.372208 HSPC159 protein HSPC159 EST mining

223 2822949 IMAGE hydroxymethylbilane synthase Hs.82609 Hydroxymethylbilane synthase HMBS EST mining

224 4178944 IMAGE hydroxymethylbilane synthase Hs.82609 Hydroxymethylbilane synthase HMBS EST mining

225 3689899 IMAGE Homo sapiens, clone IMAGE:3357862, mRNA, partial cds Hs.118820 Hypothetical protein BC007882 LOC152217 EST mining

226 4537990 IMAGE Homo sapiens, clone IMAGE:4477067, mRNA, partial cds Hs.370904 Hypothetical protein BC012029 LOC152573 EST mining

227 4807108 IMAGE Homo sapiens, clone IMAGE:4477067, mRNA, partial cds Hs.370904 Hypothetical protein BC012029 LOC152573 EST mining

228 4669968 IMAGE hypothetical protein FLJ14466 Hs.55148 Hypothetical protein FLJ14466 FLJ14466 EST mining

229 4760990 IMAGE hypothetical protein FLJ14466 Hs.55148 Hypothetical protein FLJ14466 FLJ14466 EST mining

230 3941377 IMAGE hypothetical protein FLJ20512 Hs.105606 Hypothetical protein FLJ20512 FLJ20512 EST mining

231 2390011 IMAGE Homo sapiens cDNA FLJ32252 fis, clone PROST1000167, weakly similar to SYNAPSIN I Hs.250557 Hypothetical protein FLJ32252 FLJ32252 EST mining

232 205745 IMAGE hypothetical protein Hs.474643 Hypothetical protein HSPC117 HSPC117 EST mining

233 4806332 IMAGE Homo sapiens, clone IMAGE:4428577, mRNA, partial cds Hs.7155 Hypothetical protein LOC129607 LOC129607 EST mining

234 4808048 IMAGE Homo sapiens, clone IMAGE:4428577, mRNA, partial cds Hs.7155 Hypothetical protein LOC129607 LOC129607 EST mining

235 3931227 IMAGE Homo sapiens cDNA FLJ30808 fis, clone FEBRA2001383 Hs.87194 Hypothetical protein LOC283658 LOC283658 EST mining

236 4518991 IMAGE Homo sapiens cDNA FLJ30808 fis, clone FEBRA2001383 Hs.87194 Hypothetical protein LOC283658 LOC283658 EST mining

237 782537 IMAGE Homo sapiens cDNA FLJ30428 fis, clone BRACE2008941 Hs.516398 Hypothetical protein LOC440910 LOC440910 EST mining

238 3641657 IMAGE hypothetical protein MGC10334 Hs.546667 Hypothetical protein MGC10334 MGC10334 EST mining

239 956648 IMAGE ESTs Hs.592178 Hypothetical protein MGC72075 MGC72075 EST mining

240 4429075 IMAGE Homo sapiens hypothetical protein FLJ23338 (FLJ23338), mRNA Hs.411865 Importin 4 IPO4 EST mining

241 5456355 IMAGE Homo sapiens hypothetical protein FLJ23338 (FLJ23338), mRNA Hs.411865 Importin 4 IPO4 EST mining

242 1948645 IMAGE Hs.407506 Inhibin, alpha INHA EST mining

243 4432225 IMAGE inositol polyphosphate-5-phosphatase, 75kD Hs.449942 Inositol polyphosphate-5-phosphatase, 75kDa INPP5B EST mining

244 5449886 IMAGE inositol polyphosphate-5-phosphatase, 75kD Hs.449942 Inositol polyphosphate-5-phosphatase, 75kDa INPP5B EST mining

245 4108362 IMAGE inositol(myo)-1(or 4)-monophosphatase 1 Hs.555086 Inositol(myo)-1(or 4)-monophosphatase 1 IMPA1 EST mining

246 4806231 IMAGE inositol(myo)-1(or 4)-monophosphatase 1 Hs.555086 Inositol(myo)-1(or 4)-monophosphatase 1 IMPA1 EST mining

247 1011586 IMAGE Homo sapiens integrin, alpha 4 (antigen CD49D, alpha 4 subunit of VLA-4 receptor) (ITGA4), mRNA Hs.584767 Integrin, alpha 4 (antigen CD49D, alpha 4 subunit of VLA-4 receptor) ITGA4 EST mining

248 4805867 IMAGE Homo sapiens integrin, alpha 4 (antigen CD49D, alpha 4 subunit of VLA-4 receptor) (ITGA4), mRNA Hs.584767 Integrin, alpha 4 (antigen CD49D, alpha 4 subunit of VLA-4 receptor) ITGA4 EST mining

249 4805372 IMAGE interferon, alpha-inducible protein (clone IFI-6-16) Hs.523847 Interferon, alpha-inducible protein (clone IFI-6-16) G1P3 EST mining

250 5451603 IMAGE interferon, alpha-inducible protein (clone IFI-6-16) Hs.523847 Interferon, alpha-inducible protein (clone IFI-6-16) G1P3 EST mining

251 4429551 IMAGE hypothetical protein MGC10763 Hs.129959 Interleukin 17 receptor C IL17RC EST mining

252 4666443 IMAGE Homo sapiens interleukin 6 receptor (IL6R), mRNA Hs.591492 Interleukin 6 receptor IL6R EST mining

253 4667063 IMAGE Homo sapiens interleukin 6 receptor (IL6R), mRNA Hs.591492 Interleukin 6 receptor IL6R EST mining

254 4429808 IMAGE ESTs, Highly similar to YZA1_HUMAN HYPOTHETICAL PROTEIN (Homo sapiens) Hs.532082 Interleukin 6 signal transducer (gp130, oncostatin M receptor) IL6ST EST mining

255 4431949 IMAGE interleukin-1 receptor-associated kinase 1 gene:IRAK1 MIM:300283 from:NT_025965 Hs.522819 Interleukin-1 receptor-associated kinase 1 IRAK1 EST mining

256 5456147 IMAGE interleukin-1 receptor-associated kinase 1 gene:IRAK1 MIM:300283 from:NT_025965 Hs.522819 Interleukin-1 receptor-associated kinase 1 IRAK1 EST mining

257 878810 IMAGE ESTs, Weakly similar to Y063_HUMAN HYPOTHETICAL PROTEIN KIAA0063 (Homo sapiens), similar to RIKEN cDNA 1110007C05 gene Hs.467151 Josephin domain containing 2 JOSD2 EST mining

258 4431491 IMAGE KIAA0677 gene product Hs.155983 Jumonji domain containing 2A JMJD2A EST mining

259 5457350 IMAGE KIAA0677 gene product Hs.155983 Jumonji domain containing 2A JMJD2A EST mining

260 195313 IMAGE karyopherin alpha 6 (importin alpha 7) Hs.591500 Karyopherin alpha 6 (importin alpha 7) KPNA6 EST mining

261 3610489 IMAGE (epidermolysis bullosa simplex, Dowling-Meara, Koebner) Hs.2785 KRT17 EST mining

262 1509761 IMAGE keratin, hair, basic, 6 (monilethrix) Hs.278658 Keratin, hair, basic, 6 (monilethrix) KRTHB6 EST mining

263 1274711 IMAGE Homo sapiens cDNA FLJ12038 fis, clone HEMBB1001922 Hs.9997 KIAA0256 gene product KIAA0256 EST mining

264 4538193 IMAGE Homo sapiens cDNA FLJ12038 fis, clone HEMBB1001922 Hs.9997 KIAA0256 gene product KIAA0256 EST mining

265 3533857 IMAGE KIAA0258 gene product Hs.493804 KIAA0258 KIAA0258 EST mining

266 4429067 IMAGE KIAA0515 protein Hs.495349 KIAA0515 KIAA0515 EST mining

267 5447972 IMAGE KIAA0515 protein Hs.495349 KIAA0515 KIAA0515 EST mining

268 2109030 IMAGE KIAA0676 protein Hs.155829 KIAA0676 protein KIAA0676 EST mining

269 4428376 IMAGE KIAA0746 protein Hs.479384 KIAA0746 protein KIAA0746 EST mining

270 5453030 IMAGE KIAA0746 protein Hs.479384 KIAA0746 protein KIAA0746 EST mining

271 1609538 IMAGE KIAA1324 protein Hs.262811 KIAA1324 KIAA1324 EST mining

272 1206384 IMAGE Hs.369522 KIAA1838 KIAA1838 EST mining

273 953928 IMAGE ESTs Hs.369522 KIAA1838 KIAA1838 EST mining

274 428272 IMAGE hypothetical protein FLJ21439:3 | ESTs:2 Hs.584976 KIAA1840 KIAA1840 EST mining

275 1284336 IMAGE killer cell lectin-like receptor subfamily G, member 1 Hs.558446 Killer cell lectin-like receptor subfamily G, member 1 KLRG1 EST mining

276 739983 IMAGE KIAA0042 gene product Hs.3104 Kinesin family member 14 KIF14 EST mining

277 2149720 IMAGE kinesin family member 1C Hs.435120 Kinesin family member 1C KIF1C EST mining

278 1526050 IMAGE KIAA1764 protein Hs.193115 Leucine rich repeat and coiled-coil domain containing 1 LRRCC1 EST mining

279 4430750 IMAGE Homo sapiens, clone MGC:20806 IMAGE:4330269, mRNA, complete cds Hs.143774 Leucine rich repeat containing 45 LRRC45 EST mining

280 4477216 IMAGE Homo sapiens, clone MGC:20806 IMAGE:4330269, mRNA, complete cds Hs.143774 Leucine rich repeat containing 45 LRRC45 EST mining

281 4537654 IMAGE KIAA1185 protein Hs.268488 Leucine rich repeat containing 47 LRRC47 EST mining

282 4669760 IMAGE KIAA1185 protein Hs.268488 Leucine rich repeat containing 47 LRRC47 EST mining

283 1692164 IMAGE leucine zipper-EF-hand containing transmembrane protein 1:1 | ESTs:2 Hs.120165 Leucine zipper-EF-hand containing transmembrane protein 1 LETM1 EST mining

284 1869095 IMAGE leucine zipper-EF-hand containing transmembrane protein 1:1 | ESTs:2 Hs.120165 Leucine zipper-EF-hand containing transmembrane protein 1 LETM1 EST mining

285 243186 IMAGE ESTs, Weakly similar to CGHU1S collagen alpha 1(I) chain precursor (Homo sapiens) Hs.513734 LysM, putative peptidoglycan-binding, domain containing 2 LYSMD2 EST mining

286 565319 IMAGE mal, T-cell differentiation protein 2 Hs.201083 Mal, T-cell differentiation protein 2 MAL2 EST mining

287 813730 IMAGE mal, T-cell differentiation protein 2 Hs.201083 Mal, T-cell differentiation protein 2 MAL2 EST mining

288 2958058 IMAGE BENE protein Hs.185055 Mal, T-cell differentiation protein-like MALL EST mining

289 47225 IMAGE KIAA0935 protein Hs.188464 Mannosidase, alpha, class 2B, member 2 MAN2B2 EST mining

290 4760947 IMAGE mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isoenzyme B Hs.567419 Mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isozyme B MGAT4B EST mining

291 4761206 IMAGE mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isoenzyme B Hs.567419 Mannosyl (alpha-1,3-)-glycoprotein beta-1,4-N-acetylglucosaminyltransferase, isozyme B MGAT4B EST mining

292 4121441 IMAGE Homo sapiens hypothetical protein FLJ12122 (FLJ12122), mRNA Hs.170422 MCF.2 cell line derived transforming sequence-like MCF2L EST mining

293 4138009 IMAGE Homo sapiens hypothetical protein FLJ12122 (FLJ12122), mRNA Hs.170422 MCF.2 cell line derived transforming sequence-like MCF2L EST mining

294 812977 IMAGE Homo sapiens mesenchymal stem cell protein DSC96 mRNA, partial cds Hs.596109 Mesenchymal stem cell protein DSC96 EST mining

295 78353 IMAGE metallothionein 1E (functional):6 | metallothionein 1G:5 Hs.513626 Metallothionein 1F (functional) MT1F EST mining

296 4051220 IMAGE metallothionein 2A Hs.534330 Metallothionein 2A MT2A EST mining

297 259374 IMAGE methylcrotonoyl-Coenzyme A carboxylase 2 (beta) Hs.167531 Methylcrotonoyl-Coenzyme A carboxylase 2 (beta) MCCC2 EST mining

298 1842559 IMAGE ESTs Hs.243326 Methyltransferase 5 domain containing 1 METT5D1 EST mining

299 4430477 IMAGE ESTs, Weakly similar to T08785 hypothetical protein DKFZp586A0522.1 (Homo sapiens) Hs.51483 Methyltransferase like 7B METTL7B EST mining

300 4666645 IMAGE ESTs, Weakly similar to T08785 hypothetical protein DKFZp586A0522.1 (Homo sapiens) Hs.51483 Methyltransferase like 7B METTL7B EST mining

301 417801 IMAGE mitochondrial ribosomal protein L27 Hs.7736 Mitochondrial ribosomal protein L27 MRPL27 EST mining

302 1076948 IMAGE *MT-RNR1 mitochondrially encoded 12S RNA MT-RNR1 EST mining

303 3282392 IMAGE ESTs Hs.432898 Mitogen-activated protein kinase kinase kinase 13 MAP3K13 EST mining

304 3702074 IMAGE ESTs Hs.432898 Mitogen-activated protein kinase kinase kinase 13 MAP3K13 EST mining

305 2988489 IMAGE mitogen-activated protein kinase-activated protein kinase 3 Hs.234521 Mitogen-activated protein kinase-activated protein kinase 3 MAPKAPK3 EST mining

306 915979 IMAGE Homo sapiens monoamine oxidase B (MAOB), nuclear gene encoding mitochondrial protein, mRNA Hs.46732 Monoamine oxidase B MAOB EST mining

307 1045515 IMAGE Homo sapiens monoamine oxidase B (MAOB), nuclear gene encoding mitochondrial protein, mRNA Hs.46732 Monoamine oxidase B MAOB EST mining

308 3354369 IMAGE hypothetical protein MGC13170 Hs.256301 Multidrug resistance-related protein MGC13170 EST mining

309 5448491 IMAGE ESTs *MULTIPLE#1 MULTIPLE#1 *MULTIPLE#1 EST mining

310 954072 IMAGE *MULTIPLE#2 MULTIPLE#2 *MULTIPLE#2 EST mining

311 941612 IMAGE ESTs *MULTIPLE#3 MULTIPLE#3 *MULTIPLE#3 EST mining

312 773392 IMAGE cartilage linking protein 1 *MULTIPLE#4 MULTIPLE#4; OA: Cartilage linking protein 1 *MULTIPLE#4 EST mining

313 4805007 IMAGE ESTs, Moderately similar to env protein (Homo sapiens) *MULTIPLE#6 MULTIPLE#6; OA: ESTs, Moderately similar to env protein (Homo sapiens) *MULTIPLE#6 EST mining

314 1010565 IMAGE hypothetical gene supported by M77233; BC002866; NM_001011 gene:LOC86389 InterimID:86389 from:NT_004636 *MULTIPLE#7 MULTIPLE#7; OA: hypothetical gene supported by M77233; BC002866; NM_001011 gene:LOC86389 InterimID:86389 from:NT_004636 *MULTIPLE#7 EST mining

315 3629870 IMAGE hypothetical protein FLJ22865 Hs.302051 Myosin head domain containing 1 MYOHD1 EST mining

316 2901105 IMAGE NAD(P)H dehydrogenase, quinone 1 Hs.406515 NAD(P)H dehydrogenase, quinone 1 NQO1 EST mining

317 3349257 IMAGE NAD(P)H dehydrogenase, quinone 1 Hs.406515 NAD(P)H dehydrogenase, quinone 1 NQO1 EST mining

318 5446023 IMAGE neuroblastoma-amplified protein Hs.467759 Neuroblastoma-amplified protein NAG EST mining

319 5455434 IMAGE neuroblastoma-amplified protein Hs.467759 Neuroblastoma-amplified protein NAG EST mining

320 4430529 IMAGE ESTs Hs.97316 Neuronal guanine nucleotide exchange factor NGEF EST mining

321 289570 IMAGE neutral sphingomyelinase (N-SMase) activation associated factor Hs.372000 Neutral sphingomyelinase (N-SMase) activation associated factor NSMAF EST mining

322 327195 IMAGE Homo sapiens cDNA: FLJ21362 fis, clone COL02886 Hs.309489 NIPA-like domain containing 2 NPAL2 EST mining

323 757435 IMAGE NK (Drosophila), family 3, A Hs.55999 NK3 transcription factor related, locus 1 (Drosophila) NKX3-1 EST mining

324 998273 IMAGE *NOT_FOUND#1 NOT_FOUND#1 *NOT_FOUND#1 EST mining

325 2377151 IMAGE *NOT_FOUND#2 NOT_FOUND#2 *NOT_FOUND#2 EST mining

326 2132391 IMAGE ESTs, Moderately similar to 810024E cytochrome oxidase III (Homo sapiens) *NOT_FOUND#3 NOT_FOUND#3; OA: ESTs, Moderately similar to 810024E cytochrome oxidase III (Homo sapiens) *NOT_FOUND#3 EST mining

327 2616699 IMAGE ESTs, Moderately similar to 810024E cytochrome oxidase III (Homo sapiens) *NOT_FOUND#3 NOT_FOUND#3; OA: ESTs, Moderately similar to 810024E cytochrome oxidase III (Homo sapiens) *NOT_FOUND#3 EST mining

328 4738589 IMAGE hypothetical protein MGC10763 *NOT_FOUND#4 NOT_FOUND#4; OA: hypothetical protein MGC10763 *NOT_FOUND#4 EST mining

329 2013642 IMAGE nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor-like 1 Hs.2764 Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor-like 1 NFKBIL1 EST mining

330 3628374 IMAGE nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor-like 2 Hs.591397 Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor-like 2 NFKBIL2 EST mining

331 954186 IMAGE ESTs Hs.446678 Nuclear receptor coactivator 2 NCOA2 EST mining

332 982101 IMAGE ESTs Hs.446678 Nuclear receptor coactivator 2 NCOA2 EST mining

333 3996377 IMAGE calcium binding protein P22 Hs.406234 Nucleolar and spindle associated protein 1 NUSAP1 EST mining

334 2821577 IMAGE nucleophosmin (nucleolar phosphoprotein B23, numatrin) Hs.557550 Nucleophosmin (nucleolar phosphoprotein B23, numatrin) NPM1 EST mining

335 3932340 IMAGE nucleophosmin (nucleolar phosphoprotein B23, numatrin) Hs.557550 Nucleophosmin (nucleolar phosphoprotein B23, numatrin) NPM1 EST mining

336 413299 IMAGE PRO2463 protein Hs.507537 Nucleoporin like 1 NUPL1 EST mining

337 4520584 IMAGE nucleotide binding protein 2 (E.coli MinD like) Hs.256549 Nucleotide binding protein 2 (MinD homolog, E. coli) NUBP2 EST mining

338 5456058 IMAGE nucleotide binding protein 2 (E.coli MinD like) Hs.256549 Nucleotide binding protein 2 (MinD homolog, E. coli) NUBP2 EST mining

339 1049291 IMAGE olfactory receptor, family 7, subfamily E, member 47 pseudogene Hs.524431 Olfactory receptor, family 7, subfamily E, member 47 pseudogene OR7E47P EST mining

340 265537 IMAGE choroideremia-like ( escort protein 2) Hs.534399 Opsin 3 (encephalopsin, panopsin) OPN3 EST mining

341 4509810 IMAGE methylmalonate-semialdehyde dehydrogenase Hs.523854 Oral cancer overexpressed 1 ORAOV1 EST mining

342 5445639 IMAGE PC2 (positive cofactor 2, multiprotein complex) glutamine/Q-rich-associated protein Hs.517421 PC2 (positive cofactor 2, multiprotein complex) glutamine/Q-rich-associated protein PCQAP EST mining

343 5451877 IMAGE PC2 (positive cofactor 2, multiprotein complex) glutamine/Q-rich-associated protein Hs.517421 PC2 (positive cofactor 2, multiprotein complex) glutamine/Q-rich-associated protein PCQAP EST mining

344 4666679 IMAGE DKFZP727G051 protein Hs.460124 PHD finger protein 19 PHF19 EST mining

345 5455601 IMAGE DKFZP727G051 protein Hs.460124 PHD finger protein 19 PHF19 EST mining

346 2823551 IMAGE phenylalanine-tRNA synthetase-like Hs.23111 Phenylalanine-tRNA synthetase-like, alpha subunit FARSLA EST mining

347 812217 IMAGE phosphodiesterase 8B Hs.584830 Phosphodiesterase 8B PDE8B EST mining

348 4805440 IMAGE ESTs, Moderately similar to env protein (Homo sapiens) Hs.459855 Phosphomannomutase 2 PMM2 EST mining

349 4428936 IMAGE phosphoprotein associated with glycosphingolipid-enriched microdomains Hs.266175 Phosphoprotein associated with glycosphingolipid microdomains 1 PAG1 EST mining

350 4738238 IMAGE phosphoprotein associated with glycosphingolipid-enriched microdomains Hs.266175 Phosphoprotein associated with glycosphingolipid microdomains 1 PAG1 EST mining

351 4477825 IMAGE HUMUPAB Human RNA for urokinase-type plasminogen activator, partial cds Hs.77274 Plasminogen activator, urokinase PLAU EST mining

352 5452039 IMAGE HUMUPAB Human RNA for urokinase-type plasminogen activator, partial cds Hs.77274 Plasminogen activator, urokinase PLAU EST mining

353 4839962 IMAGE -promoting complex 1; meiotic checkpoint regulator Hs.528525 Plasminogen-like B2 PLGLB2 EST mining

354 5448466 IMAGE hypothetical protein FLJ20783 Hs.469944 Pleckstrin homology domain containing, family B (evectins) member 2 PLEKHB2 EST mining

355 5448519 IMAGE hypothetical protein FLJ20783 Hs.469944 Pleckstrin homology domain containing, family B (evectins) member 2 PLEKHB2 EST mining

356 769933 IMAGE protective protein for beta-galactosidase (galactosialidosis) Hs.348609 Poly (ADP-ribose) polymerase family, member 10 PARP10 EST mining

357 810038 IMAGE protective protein for beta-galactosidase (galactosialidosis) Hs.348609 Poly (ADP-ribose) polymerase family, member 10 PARP10 EST mining

358 4805617 IMAGE poly(A)-binding protein, cytoplasmic 1 Hs.387804 Poly(A) binding protein, cytoplasmic 1 PABPC1 EST mining

359 4806620 IMAGE HSU63542 Human putative FAP protein mRNA, partial cds Hs.387804 Poly(A) binding protein, cytoplasmic 1 PABPC1 EST mining

360 4808097 IMAGE HSU63542 Human putative FAP protein mRNA, partial cds Hs.387804 Poly(A) binding protein, cytoplasmic 1 PABPC1 EST mining

361 3139275 IMAGE polymerase (RNA) II (DNA directed) polypeptide I (14.5kD) Hs.47062 Polymerase (RNA) II (DNA directed) polypeptide I, 14.5kDa POLR2I EST mining

362 4670043 IMAGE proliferation-associated 2G4, 38kD Hs.524498 Proliferation-associated 2G4, 38kDa PA2G4 EST mining

363 4737809 IMAGE proliferation-associated 2G4, 38kD Hs.524498 Proliferation-associated 2G4, 38kDa PA2G4 EST mining

364 1115874 IMAGE ESTs Hs.524498 Proliferation-associated 2G4, 38kDa PA2G4 EST mining

365 1115898 IMAGE ESTs Hs.524498 Proliferation-associated 2G4, 38kDa PA2G4 EST mining

366 321271 IMAGE Rho GTPase activating protein 8 Hs.102336 Proline rich 5 (renal) PRR5 EST mining

367 856115 IMAGE prostate stem cell antigen Hs.379010 Prostate stem cell antigen PSCA EST mining

368 1913366 IMAGE protease, serine, 4 ( 4, brain) Hs.128013 Protease, serine, 3 (mesotrypsin) PRSS3 EST mining

369 5432424 IMAGE proteasome (prosome, macropain) 26S subunit, ATPase, 3 Hs.250758 Proteasome (prosome, macropain) 26S subunit, ATPase, 3 PSMC3 EST mining

370 5447774 IMAGE proteasome (prosome, macropain) 26S subunit, ATPase, 3 Hs.250758 Proteasome (prosome, macropain) 26S subunit, ATPase, 3 PSMC3 EST mining

371 3683177 IMAGE protein (peptidyl-prolyl cis/trans isomerase) NIMA-interacting, 4 (parvulin) Hs.118076 Protein (peptidylprolyl cis/trans isomerase) NIMA-interacting, 4 (parvulin) PIN4 EST mining

372 5447915 IMAGE protein kinase, cAMP-dependent, catalytic, alpha Hs.194350 Protein kinase, cAMP-dependent, catalytic, alpha PRKACA EST mining

373 5449798 IMAGE protein kinase, cAMP-dependent, catalytic, alpha Hs.194350 Protein kinase, cAMP-dependent, catalytic, alpha PRKACA EST mining

374 3138971 IMAGE O-linked mannose beta1,2-N-acetylglucosaminyltransferase Hs.525134 Protein O-linked mannose beta1,2-N-acetylglucosaminyltransferase POMGNT1 EST mining

375 4538189 IMAGE protein regulator of cytokinesis 1 Hs.567385 Protein regulator of cytokinesis 1 PRC1 EST mining

376 5447830 IMAGE protein regulator of cytokinesis 1 Hs.567385 Protein regulator of cytokinesis 1 PRC1 EST mining

377 4428986 IMAGE protein tyrosine phosphatase type IVA, member 1 Hs.227777 Protein tyrosine phosphatase type IVA, member 1 PTP4A1 EST mining

378 4518098 IMAGE protein tyrosine phosphatase type IVA, member 1 Hs.227777 Protein tyrosine phosphatase type IVA, member 1 PTP4A1 EST mining

379 182818 IMAGE pre-mRNA processing factor 31 homolog (yeast) Hs.515598 PRP31 pre-mRNA processing factor 31 homolog (yeast) PRPF31 EST mining

380 25360 IMAGE hypothetical protein FLJ21324 Hs.284491 Pyridoxal (pyridoxine, vitamin B6) kinase PDXK EST mining

381 4518764 IMAGE hypothetical protein dJ37E16.5 Hs.445351 Pyridoxal (pyridoxine, vitamin B6) phosphatase PDXP EST mining

382 4806233 IMAGE hypothetical protein dJ37E16.5 Hs.445351 Pyridoxal (pyridoxine, vitamin B6) phosphatase PDXP EST mining

383 3450321 IMAGE RAD1 (S. pombe) homolog Hs.531879 RAD1 homolog (S. pombe) RAD1 EST mining

384 954154 IMAGE Hs.558371 Reelin RELN EST mining

385 4737873 IMAGE reelin Hs.558371 Reelin RELN EST mining

386 3850358 IMAGE regenerating gene type IV Hs.171480 Regenerating islet-derived family, member 4 REG4 EST mining

387 4839673 IMAGE regenerating gene type IV Hs.171480 Regenerating islet-derived family, member 4 REG4 EST mining

388 3356558 IMAGE retinoblastoma-binding protein 4 Hs.16003 Retinoblastoma binding protein 4 RBBP4 EST mining

389 1601546 IMAGE retinoblastoma-binding protein 7 Hs.495755 Retinoblastoma binding protein 7 RBBP7 EST mining

390 4073585 IMAGE uncharacterized bone marrow protein BM046 Hs.171011 Rho GTPase activating protein 15 ARHGAP15 EST mining

391 4519239 IMAGE uncharacterized bone marrow protein BM046 Hs.171011 Rho GTPase activating protein 15 ARHGAP15 EST mining

392 1642382 IMAGE GTPase regulator associated with the focal adhesion kinase pp125 Hs.293593 Rho GTPase activating protein 26 ARHGAP26 EST mining

393 3010245 IMAGE hypothetical protein FLJ10521 Hs.443460 Rho guanine nucleotide exchange factor (GEF) 10-like ARHGEF10L EST mining

394 4669368 IMAGE ribosomal protein L30 Hs.400295 Ribosomal protein L30 RPL30 EST mining

395 4804494 IMAGE ribosomal protein L30 Hs.400295 Ribosomal protein L30 RPL30 EST mining

396 4430590 IMAGE ESTs, Highly similar to YZA1_HUMAN HYPOTHETICAL PROTEIN (Homo sapiens) Hs.282998 Ribosomal protein L41 RPL41 EST mining

397 3930891 IMAGE ribosomal protein L6 Hs.528668 Ribosomal protein L6 RPL6 EST mining

398 2821975 IMAGE ribosomal protein L6 Hs.546283 Ribosomal protein L6 RPL6 EST mining

399 5456795 IMAGE ribosomal protein S17 Hs.433427 Ribosomal protein S17 RPS17 EST mining

400 1045363 IMAGE ribosomal protein S2 Hs.498569 Ribosomal protein S2 RPS2 EST mining

401 783478 IMAGE similar to 40S RIBOSOMAL PROTEIN S24 (S19) gene:LOC94929 InterimID:94929 from:NT_026437 Hs.356794 Ribosomal protein S24 RPS24 EST mining

402 981916 IMAGE ribosomal protein S5 Hs.378103 Ribosomal protein S5 RPS5 EST mining

403 997443 IMAGE ribosomal protein S5 Hs.378103 Ribosomal protein S5 RPS5 EST mining

404 1062361 IMAGE hypothetical gene supported by M77233; BC002866; NM_001011 gene:LOC86389 InterimID:86389 from:NT_004636 Hs.534346 Ribosomal protein S7 RPS7 EST mining

405 4431467 IMAGE hypothetical protein FLJ12565 Hs.553723 Ring finger protein 123 RNF123 EST mining

406 5449759 IMAGE hypothetical protein FLJ12565 Hs.553723 Ring finger protein 123 RNF123 EST mining

407 1045519 IMAGE ESTs Hs.292882 Ring finger protein 19 RNF19 EST mining

408 4806273 IMAGE RNA binding motif protein 3 Hs.301404 RNA binding motif (RNP1, RRM) protein 3 RBM3 EST mining

409 4808235 IMAGE RNA binding motif protein 3 Hs.301404 RNA binding motif (RNP1, RRM) protein 3 RBM3 EST mining

410 4668290 IMAGE Homo sapiens mRNA; cDNA DKFZp566A1124 (from clone DKFZp566A1124) Hs.355643 RNA binding protein S1, serine-rich domain RNPS1 EST mining

411 4431983 IMAGE Homo sapiens mRNA; cDNA DKFZp566A1124 (from clone DKFZp566A1124) Hs.507343 RNA binding protein S1, serine-rich domain RNPS1 EST mining

412 589484 IMAGE runt-related transcription factor 1 (acute myeloid leukemia 1; aml1 oncogene) Hs.149261 Runt-related transcription factor 1 (acute myeloid leukemia 1; aml1 oncogene) RUNX1 EST mining

413 1843642 IMAGE scinderin Hs.326941 Scinderin SCIN EST mining

414 897609 IMAGE hypothetical protein FLJ10074 Hs.506481 SCY1-like 2 (S. cerevisiae) SCYL2 EST mining

415 4538281 IMAGE SEC24 (S. cerevisiae) related gene family, member C Hs.81964 SEC24 related gene family, member C (S. cerevisiae) SEC24C EST mining

416 5452676 IMAGE SEC24 (S. cerevisiae) related gene family, member C Hs.81964 SEC24 related gene family, member C (S. cerevisiae) SEC24C EST mining

417 3410667 IMAGE ESTs Hs.269109 Sema domain, (Ig), short basic domain, secreted, (semaphorin) 3C SEMA3C EST mining

418 998749 IMAGE Homo sapiens kraken-like (BK126B4.1), mRNA Hs.553531 Serine hydrolase-like SERHL EST mining

419 1204758 IMAGE Homo sapiens kraken-like (BK126B4.1), mRNA Hs.553531 Serine hydrolase-like SERHL EST mining

420 942200 IMAGE Homo sapiens cDNA: FLJ22642 fis, clone HSI06970 Hs.288232 Serine incorporator 5 SERINC5 EST mining

421 1010557 IMAGE Homo sapiens cDNA: FLJ22642 fis, clone HSI06970 Hs.288232 Serine incorporator 5 SERINC5 EST mining

422 450533 IMAGE serine (or cysteine) proteinase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 Hs.534293 Serpin peptidase inhibitor, clade A (alpha-1 antiproteinase, antitrypsin), member 3 SERPINA3 EST mining

423 3051381 IMAGE serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 6 Hs.519523 Serpin peptidase inhibitor, clade B (ovalbumin), member 6 SERPINB6 EST mining

424 4429931 IMAGE SET translocation (myeloid leukemia-associated) Hs.436687 SET translocation (myeloid leukemia-associated) SET EST mining

425 4761026 IMAGE SET translocation (myeloid leukemia-associated) Hs.436687 SET translocation (myeloid leukemia-associated) SET EST mining

426 359641 IMAGE hypothetical protein FLJ10539 Hs.528650 SHQ1 homolog (S. cerevisiae) SHQ1 EST mining

427 953691 IMAGE ESTs Hs.528650 SHQ1 homolog (S. cerevisiae) SHQ1 EST mining

428 3352015 IMAGE similar to rat tricarboxylate carrier-like protein Hs.283844 Sideroflexin 3 SFXN3 EST mining

429 4667826 IMAGE Homo sapiens, Similar to RIKEN cDNA 2310038H17 gene, clone MGC:15974 IMAGE:3542748, mRNA, complete cds Hs.516646 Similar to hepatocellular carcinoma-associated antigen HCA557b LOC151194 EST mining

430 5452901 IMAGE Homo sapiens, Similar to RIKEN cDNA 2310038H17 gene, clone MGC:15974 IMAGE:3542748, mRNA, complete cds Hs.516646 Similar to hepatocellular carcinoma-associated antigen HCA557b LOC151194 EST mining

431 244796 IMAGE STEAP 1, prostate cancer associated protein 1 Hs.489051 Six transmembrane epithelial antigen of the prostate 2 STEAP2 EST mining

432 4667864 IMAGE HUMZE01A04 Homo sapiens full length insert cDNA clone ZE01A04 Hs.445045 SLAM family member 9 SLAMF9 EST mining

433 343569 IMAGE KIAA0918 protein Hs.591208 SLIT and NTRK-like family, member 5 SLITRK5 EST mining

434 4737322 IMAGE Homo sapiens small nuclear protein PRAC (PRAC), mRNA Hs.116467 Small nuclear protein PRAC PRAC EST mining

435 4739311 IMAGE Homo sapiens small nuclear protein PRAC (PRAC), mRNA Hs.116467 Small nuclear protein PRAC PRAC EST mining

436 713127 IMAGE SMC4 (structural maintenance of chromosomes 4, yeast)-like 1 Hs.58992 SMC4 structural maintenance of chromosomes 4-like 1 (yeast) SMC4L1 EST mining

437 3505772 IMAGE solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter), member 3 Hs.448979 Solute carrier family 35 (UDP-N-acetylglucosamine (UDP-GlcNAc) transporter), member A3 SLC35A3 EST mining

438 3827889 IMAGE CGI-19 protein Hs.285847 Solute carrier family 35, member B3 SLC35B3 EST mining

439 1169375 IMAGE hypothetical protein Hs.356467 Solute carrier family 35, member E1 SLC35E1 EST mining

440 1203483 IMAGE hypothetical protein Hs.356467 Solute carrier family 35, member E1 SLC35E1 EST mining

441 4476608 IMAGE proliferation-associated 2G4, 38kD Hs.512181 Spermidine/spermine N1-acetyl transferase-like 1 SATL1 EST mining

442 4807845 IMAGE proliferation-associated 2G4, 38kD Hs.512181 Spermidine/spermine N1-acetyl transferase-like 1 SATL1 EST mining

443 809829 IMAGE ESTs, Weakly similar to I38022 hypothetical protein (Homo sapiens) Hs.128676 SPRY domain containing 4 SPRYD4 EST mining

444 957884 IMAGE hypothetical protein FLJ13195 similar to stromal antigen 3 Hs.213392 Stromal antigen 3-like FLJ13195 EST mining

445 982865 IMAGE hypothetical protein FLJ13195 similar to stromal antigen 3 Hs.213392 Stromal antigen 3-like FLJ13195 EST mining

446 957130 IMAGE calcium binding protein Cab45 precursor Hs.42806 Stromal cell derived factor 4 SDF4 EST mining

447 1291649 IMAGE calcium binding protein Cab45 precursor Hs.42806 Stromal cell derived factor 4 SDF4 EST mining

448 3946057 IMAGE stromal cell-derived factor 2-like 1 Hs.303116 Stromal cell-derived factor 2-like 1 SDF2L1 EST mining

449 3357276 IMAGE hypothetical protein MGC3195 Hs.287412 SVH protein SVH EST mining

450 3347793 IMAGE syndecan 1 Hs.224607 Syndecan 1 SDC1 EST mining

451 1568056 IMAGE ESTs, Moderately similar to I78885 serine/-specific protein kinase (Homo sapiens) Hs.530538 TatD DNase domain containing 3 TATDN3 EST mining

452 4429317 IMAGE gene near HD on 4p16.3 with homology to hypothetical S. pombe gene Hs.398178 Tetracycline transporter-like protein TETRAN EST mining

453 5456495 IMAGE gene near HD on 4p16.3 with homology to hypothetical S. pombe gene Hs.398178 Tetracycline transporter-like protein TETRAN EST mining

454 2958221 IMAGE transmembrane 4 superfamily member (tetraspan NET-7) Hs.499941 Tetraspanin 15 TSPAN15 EST mining

455 5451829 IMAGE ribosomal protein S27 (metallopanstimulin 1) Hs.504517 Tetraspanin 9 TSPAN9 EST mining

456 300590 IMAGE ATP binding protein associated with cell differentiation Hs.536122 domain containing 9 TXNDC9 EST mining

457 431284 IMAGE Homo sapiens cDNA: FLJ20993 fis, clone CAE02403 Hs.20000 Three prime histone mRNA 1 THEX1 EST mining

458 953836 IMAGE similar to tight junction protein ZO-2 isoform A gene:LOC92592 InterimID:92592 from:NT_023967 Hs.50382 Tight junction protein 2 (zona occludens 2) TJP2 EST mining

459 1190765 IMAGE similar to tight junction protein ZO-2 isoform A gene:LOC92592 InterimID:92592 from:NT_023967 Hs.50382 Tight junction protein 2 (zona occludens 2) TJP2 EST mining

460 4428491 IMAGE tolloid-like 1 Hs.106513 Tolloid-like 1 TLL1 EST mining

461 4665863 IMAGE tolloid-like 1 Hs.106513 Tolloid-like 1 TLL1 EST mining

462 4518560 IMAGE mitochondrial topoisomerase I Hs.528574 Topoisomerase (DNA) I, mitochondrial TOP1MT EST mining

463 4806121 IMAGE mitochondrial topoisomerase I Hs.528574 Topoisomerase (DNA) I, mitochondrial TOP1MT EST mining

464 2959819 IMAGE ATP-dependant interferon responsive Hs.584957 Torsin family 3, member A TOR3A EST mining

465 1677877 IMAGE ESTs Hs.112582 Transcribed locus EST mining

466 982696 IMAGE ESTs Hs.326315 Transcribed locus EST mining

467 2118422 IMAGE ESTs, Weakly similar to SFR4_HUMAN SPLICING FACTOR, ARGININE/SERINE-RICH 4 (Homo sapiens) Hs.369728 Transcribed locus EST mining

468 | 4054473 | IMAGE | ESTs, Weakly similar to SFR4_HUMAN SPLICING FACTOR, ARGININE/SERINE-RICH 4 (Homo sapiens) | Hs.369728 | Transcribed locus |  | EST mining
469 | 953848 | IMAGE | ESTs | Hs.430095 | Transcribed locus |  | EST mining
470 | 995742 | IMAGE | Homo sapiens cDNA FLJ13581 fis, clone PLACE1009039 | Hs.470038 | Transcribed locus |  | EST mining
471 | 2574944 | IMAGE | Homo sapiens cDNA FLJ13581 fis, clone PLACE1009039 | Hs.470038 | Transcribed locus |  | EST mining
472 | 953643 | IMAGE | ESTs | Hs.552800 | Transcribed locus |  | EST mining
473 | 953979 | IMAGE | ESTs | Hs.561110 | Transcribed locus |  | EST mining
474 | 1252698 | IMAGE | ESTs | Hs.561110 | Transcribed locus |  | EST mining
475 | 381166 | IMAGE | histone | Hs.595500 | Transcribed locus |  | EST mining
476 | 1861762 | IMAGE | ESTs | Hs.598706 | Transcribed locus |  | EST mining
477 | 4474942 | IMAGE | ESTs | Hs.598870 | Transcribed locus |  | EST mining
478 | 955287 | IMAGE | ESTs | Hs.600025 | Transcribed locus |  | EST mining
479 | 3579336 | IMAGE | ESTs | Hs.600025 | Transcribed locus |  | EST mining
480 | 2253060 | IMAGE | ESTs | Hs.606212 | Transcribed locus |  | EST mining
481 | 3931008 | IMAGE | HUMZE01A04 Homo sapiens full length insert cDNA clone ZE01A04 | Hs.612587 | Transcribed locus |  | EST mining
482 | 1604793 | IMAGE | ESTs, Weakly similar to ALU1_HUMAN ALU SUBFAMILY J SEQUENCE CONTAMINATION WARNING ENTRY (Homo sapiens) | Hs.130853 | Transcribed locus, strongly similar to NP_835503.1 histone 1, H2bg (Mus musculus) |  | EST mining

483 | 4738162 | IMAGE | ATP-binding cassette, sub-family E (OABP), member 1 | Hs.571791 | Transcribed locus, strongly similar to XP_517465.1 PREDICTED: similar to ATP-binding cassette sub-family E member 1 (RNase L inhibitor) (Ribonuclease 4 inhibitor) (RNS4I) [Pan troglodytes] |  | EST mining
484 | 4073435 | IMAGE | ribosomal protein S27 (metallopanstimulin 1) | Hs.552313 | Transcribed locus, strongly similar to XP_519204.1 PREDICTED: similar to ribosomal protein S27 [Pan troglodytes] |  | EST mining
485 | 981785 | IMAGE | similar to ubiquinol-cytochrome c reductase binding protein gene:LOC86970 InterimID:86970 from:NT_011630 | Hs.449487 | Transcribed locus, strongly similar to XP_519866.1 PREDICTED: similar to Ubiquinol-cytochrome C reductase complex 14 kDa protein (Complex III subunit VI) (QP-C) [Pan troglodytes] |  | EST mining
486 | 1252258 | IMAGE | similar to ubiquinol-cytochrome c reductase binding protein gene:LOC86970 InterimID:86970 from:NT_011630 | Hs.449487 | Transcribed locus, strongly similar to XP_519866.1 PREDICTED: similar to Ubiquinol-cytochrome C reductase complex 14 kDa protein (Complex III subunit VI) (QP-C) [Pan troglodytes] |  | EST mining
487 | 997701 | IMAGE | poly(A)-binding protein, cytoplasmic 1 | Hs.573588 | Transcribed locus, strongly similar to XP_519889.1 PREDICTED: poly(A) binding protein, cytoplasmic 1 [Pan troglodytes] |  | EST mining

488 | 4429069 | IMAGE | anaphase-promoting complex 1; meiotic checkpoint regulator | Hs.567955 | Transcribed locus, strongly similar to XP_532958.2 PREDICTED: similar to anaphase promoting complex subunit 1 (Canis familiaris) |  | EST mining
489 | 4518815 | IMAGE |  | Hs.86045 | Transcribed locus, weakly similar to NP_055301.1 neuronal thread protein AD7c-NTP (Homo sapiens) |  | EST mining
490 | 4839457 | IMAGE |  | Hs.86045 | Transcribed locus, weakly similar to NP_055301.1 neuronal thread protein AD7c-NTP (Homo sapiens) |  | EST mining
491 | 4737317 | IMAGE | cell cycle progression 2 protein | Hs.231411 | Transforming growth factor beta regulator 4 | TBRG4 | EST mining
492 | 5456746 | IMAGE | cell cycle progression 2 protein | Hs.231411 | Transforming growth factor beta regulator 4 | TBRG4 | EST mining
493 | 898208 | IMAGE | DKFZP727M231 protein | Hs.168073 | Transient receptor potential cation channel, subfamily C, member 4 associated protein | TRPC4AP | EST mining
494 | 915936 | IMAGE | translocase of inner mitochondrial membrane 23 homolog (yeast) | Hs.524308 | Translocase of inner mitochondrial membrane 23 homolog (yeast) | TIMM23 | EST mining
495 | 940901 | IMAGE | translocase of inner mitochondrial membrane 23 homolog (yeast) | Hs.524308 | Translocase of inner mitochondrial membrane 23 homolog (yeast) | TIMM23 | EST mining
496 | 511233 | IMAGE | transmembrane, prostate androgen induced RNA | Hs.517155 | Transmembrane, prostate androgen induced RNA | TMEPAI | EST mining
497 | 2405842 | IMAGE | transporter 2, ATP-binding cassette, sub-family B (MDR/TAP) | Hs.502 | Transporter 2, ATP-binding cassette, sub-family B (MDR/TAP) | TAP2 | EST mining

498 | 4429125 | IMAGE | trichorhinophalangeal syndrome I gene:TRPS1 MIM:604386 from:NT_008110 | Hs.253594 | Trichorhinophalangeal syndrome I | TRPS1 | EST mining
499 | 4475131 | IMAGE | trichorhinophalangeal syndrome I gene:TRPS1 MIM:604386 from:NT_008110 | Hs.253594 | Trichorhinophalangeal syndrome I | TRPS1 | EST mining
500 | 3458466 | IMAGE | tumor necrosis factor receptor superfamily, member 10b | Hs.521456 | Tumor necrosis factor receptor superfamily, member 10b | TNFRSF10B | EST mining
501 | 953501 | IMAGE | hypothetical protein FLJ20030 | Hs.374596 | Tumor protein, translationally-controlled 1 | TPT1 | EST mining
502 | 953668 | IMAGE | hypothetical protein FLJ20030 | Hs.374596 | Tumor protein, translationally-controlled 1 | TPT1 | EST mining
503 | 2823237 | IMAGE | ubiquinol-cytochrome c reductase hinge protein | Hs.481571 | Ubiquinol-cytochrome c reductase hinge protein | UQCRH | EST mining
504 | 3140323 | IMAGE | ubiquinol-cytochrome c reductase hinge protein | Hs.481571 | Ubiquinol-cytochrome c reductase hinge protein | UQCRH | EST mining
505 | 83463 | IMAGE | UDP-glucose glucosyltransferase | Hs.304249 | UDP-glucose ceramide glucosyltransferase | UGCG | EST mining
506 | 328542 | IMAGE | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 3 (GalNAc-T3) | Hs.170986 | UDP-N-acetyl-alpha-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase 3 (GalNAc-T3) | GALNT3 | EST mining
507 | 292515 | IMAGE | UDP-N-acteylglucosamine pyrophosphorylase 1 | Hs.492859 | UDP-N-acteylglucosamine pyrophosphorylase 1 | UAP1 | EST mining
508 | 4245636 | IMAGE | KIAA0185 protein:1 \| hypothetical protein MGC14697:1 | Hs.500921 | Upregulated during skeletal muscle growth 5 homolog (mouse) | USMG5 | EST mining

509 | 2966930 | IMAGE | uridine phosphorylase | Hs.488240 | Uridine phosphorylase 1 | UPP1 | EST mining
510 | 3140458 | IMAGE | uridine phosphorylase | Hs.488240 | Uridine phosphorylase 1 | UPP1 | EST mining
511 | 4520026 | IMAGE | vaccinia related kinase 1 | Hs.422662 | Vaccinia related kinase 1 | VRK1 | EST mining
512 | 4665818 | IMAGE | vaccinia related kinase 1 | Hs.422662 | Vaccinia related kinase 1 | VRK1 | EST mining
513 | 243549 | IMAGE | v-myb avian myeloblastosis viral oncogene homolog | Hs.591337 | V-myb myeloblastosis viral oncogene homolog (avian) | MYB | EST mining
514 | 2985844 | IMAGE | v-myc avian myelocytomatosis viral oncogene homolog | Hs.202453 | V-myc myelocytomatosis viral oncogene homolog (avian) | MYC | EST mining
515 | 3048750 | IMAGE | v-myc avian myelocytomatosis viral oncogene homolog | Hs.202453 | V-myc myelocytomatosis viral oncogene homolog (avian) | MYC | EST mining
516 | 4429626 | IMAGE | KIAA0800 gene product | Hs.118738 | Vpr (HIV-1) binding protein | VPRBP | EST mining
517 | 4670225 | IMAGE | KIAA0800 gene product | Hs.118738 | Vpr (HIV-1) binding protein | VPRBP | EST mining
518 | BG928354 | GENBANK | ESTs | Hs.480116 | WD repeat and FYVE domain containing 3 | WDFY3 | EST mining
519 | 954064 | IMAGE | ESTs | Hs.480116 | WD repeat and FYVE domain containing 3 | WDFY3 | EST mining
520 | 983039 | IMAGE | ESTs | Hs.480116 | WD repeat and FYVE domain containing 3 | WDFY3 | EST mining
521 | 4430630 | IMAGE | Homo sapiens hypothetical protein FLJ10233 (FLJ10233), mRNA | Hs.213690 | WD repeat domain 70 | WDR70 | EST mining
522 | 3449394 | IMAGE | X-box binding protein 1 | Hs.437638 | X-box binding protein 1 | XBP1 | EST mining
523 | 2499829 | IMAGE | zinc finger protein 145 (Kruppel-like, expressed in promyelocytic leukemia) | Hs.591945 | Zinc finger and BTB domain containing 16 | ZBTB16 | EST mining
524 | 1631238 | IMAGE | KIAA1483 protein | Hs.520073 | Zinc finger and BTB domain containing 2 | ZBTB2 | EST mining

525 | 3931261 | IMAGE | KIAA1454 protein gene:KIAA1454 LocusID:57603 from:NT_024654 | Hs.458986 | Zinc finger protein 291 | ZNF291 | EST mining
526 | 4666115 | IMAGE | KIAA1454 protein gene:KIAA1454 LocusID:57603 from:NT_024654 | Hs.458986 | Zinc finger protein 291 | ZNF291 | EST mining
527 | 4539223 | IMAGE | similar to zinc finger protein 113 gene:LOC115528 InterimID:115528 from:NT_007751 | Hs.435302 | Zinc finger protein 3 (A8-51) | ZNF3 | EST mining
528 | 4806072 | IMAGE | similar to zinc finger protein 113 gene:LOC115528 InterimID:115528 from:NT_007751 | Hs.435302 | Zinc finger protein 3 (A8-51) | ZNF3 | EST mining
529 | 773535 | IMAGE | ESTs, Weakly similar to I38022 hypothetical protein (Homo sapiens) | Hs.531262 | Zinc finger protein 573 | ZNF573 | EST mining
530 | 4539403 | IMAGE | zona pellucida glycoprotein 3A (sperm receptor) | Hs.130553 | Zona pellucida glycoprotein 3 (sperm receptor) | ZP3 | EST mining
531 | 5449519 | IMAGE | zona pellucida glycoprotein 3A (sperm receptor) | Hs.130553 | Zona pellucida glycoprotein 3 (sperm receptor) | ZP3 | EST mining
532 | 451907 | IMAGE | ZW10 interactor | Hs.591363 | ZW10 interactor | ZWINT | EST mining
533 | 4430579 | IMAGE | ZW10 (Drosophila) homolog, centromere/kinetochore protein | Hs.503886 | ZW10, kinetochore associated, homolog (Drosophila) | ZW10 | EST mining
534 | 4670073 | IMAGE | ZW10 (Drosophila) homolog, centromere/kinetochore protein | Hs.503886 | ZW10, kinetochore associated, homolog (Drosophila) | ZW10 | EST mining

535 | 3354161 | IMAGE | TCF3 (E2A) fusion partner (in childhood Leukemia) | Hs.590939 | TCF3 (E2A) fusion partner (in childhood Leukemia) | TFPT | EST mining, MusAffy_p53KO**, MusAffy_TRAMP
536 | 4301790 | IMAGE | TCF3 (E2A) fusion partner (in childhood Leukemia) | Hs.590939 | TCF3 (E2A) fusion partner (in childhood Leukemia) | TFPT | EST mining, MusAffy_p53KO, MusAffy_TRAMP
537 | 1662432 | IMAGE | v-yes-1 Yamaguchi sarcoma viral related oncogene homolog | Hs.491767 | V-yes-1 Yamaguchi sarcoma viral related oncogene homolog | LYN | EST mining, MusAffy_p53KO, MusAffy_TRAMP
538 | 3571985 | IMAGE | v-yes-1 Yamaguchi sarcoma viral related oncogene homolog | Hs.491767 | V-yes-1 Yamaguchi sarcoma viral related oncogene homolog | LYN | EST mining, MusAffy_p53KO, MusAffy_TRAMP
539 | 4428738 | IMAGE | v-yes-1 Yamaguchi sarcoma viral related oncogene homolog | Hs.491767 | V-yes-1 Yamaguchi sarcoma viral related oncogene homolog | LYN | EST mining, MusAffy_p53KO, MusAffy_TRAMP
540 | 5446265 | IMAGE | v-yes-1 Yamaguchi sarcoma viral related oncogene homolog | Hs.491767 | V-yes-1 Yamaguchi sarcoma viral related oncogene homolog | LYN | EST mining, MusAffy_p53KO, MusAffy_TRAMP
541 | 773564 | IMAGE | nerve injury gene 283 | Hs.427284 | Zinc and ring finger 1 | ZNRF1 | EST mining, MusAffy_p53KO, MusAffy_TRAMP
542 | 2823677 | IMAGE | nerve injury gene 283 | Hs.427284 | Zinc and ring finger 1 | ZNRF1 | EST mining, MusAffy_p53KO, MusAffy_TRAMP
543 | 4693761 | IMAGE | fatty acid binding protein 5 (psoriasis-associated) | Hs.180711 | Syntaxin 3A | STX3A | EST mining, MusAffy_TRAMP
544 | 703820 | IMAGE | ESTs, Weakly similar to Glucosidase II (Homo sapiens) | Hs.143261 | Calpain 3, (p94) | CAPN3 | MusAffy_p53KO
545 | 564389 | IMAGE | AB058705 Homo sapiens mRNA for KIAA1802 protein, partial cds. | Hs.7542 | Chromosome 13 open reading frame 8 | C13orf8 | MusAffy_p53KO

546 | 4828544 | IMAGE | AB058705 Homo sapiens mRNA for KIAA1802 protein, partial cds. | Hs.7542 | Chromosome 13 open reading frame 8 | C13orf8 | MusAffy_p53KO
547 | 4716501 | IMAGE | Homo sapiens cytochrome P450, subfamily IIE (ethanol-inducible) (CYP2E), mRNA | Hs.12907 | Cytochrome P450, family 2, subfamily E, polypeptide 1 | CYP2E1 | MusAffy_p53KO
548 | 4723454 | IMAGE | Homo sapiens cytochrome P450, subfamily IIE (ethanol-inducible) (CYP2E), mRNA | Hs.12907 | Cytochrome P450, family 2, subfamily E, polypeptide 1 | CYP2E1 | MusAffy_p53KO
549 | 2307518 | IMAGE | extracellular glycoprotein EMILIN-2 precursor | Hs.532815 | Elastin microfibril interfacer 2 | EMILIN2 | MusAffy_p53KO
550 | 3504430 | IMAGE | TERA protein | Hs.505154 | Family with sequence similarity 60, member A | FAM60A | MusAffy_p53KO
551 | 267824 | IMAGE | Homo sapiens non-canonical ubquitin conjugating enzyme 1 (NCUBE1), mRNA | *MULTIPLE#8 | MULTIPLE#8; OA: Non-canonical ubquitin conjugating enzyme 1 (NCUBE1) | *MULTIPLE#8 | MusAffy_p53KO
552 | 1742235 | IMAGE | Homo sapiens proteasome (prosome, macropain) subunit, beta type, 7 (PSMB7), mRNA | Hs.213470 | Proteasome (prosome, macropain) subunit, beta type, 7 | PSMB7 | MusAffy_p53KO
553 | 4761620 | IMAGE | Homo sapiens proteasome (prosome, macropain) subunit, beta type, 7 (PSMB7), mRNA | Hs.213470 | Proteasome (prosome, macropain) subunit, beta type, 7 | PSMB7 | MusAffy_p53KO
554 | 4697991 | IMAGE | Homo sapiens KIAA0923 protein (KIAA0923), mRNA | Hs.22587 | Synovial sarcoma, X breakpoint 2 interacting protein | SSX2IP | MusAffy_p53KO
555 | 5163892 | IMAGE | Homo sapiens KIAA0923 protein (KIAA0923), mRNA | Hs.22587 | Synovial sarcoma, X breakpoint 2 interacting protein | SSX2IP | MusAffy_p53KO
556 | 4869036 | IMAGE | Homo sapiens non-canonical ubquitin conjugating enzyme 1 (NCUBE1), mRNA | Hs.163776 | Ubiquitin-conjugating enzyme E2, J1 (UBC6 homolog, yeast) | UBE2J1 | MusAffy_p53KO

557 | 5178675 | IMAGE | Homo sapiens apolipoprotein D (APOD), mRNA | Hs.522555 | Apolipoprotein D | APOD | MusAffy_p53KO, MusAffy_TRAMP
558 | 5199481 | IMAGE | Homo sapiens apolipoprotein D (APOD), mRNA | Hs.522555 | Apolipoprotein D | APOD | MusAffy_p53KO, MusAffy_TRAMP
559 | 5302334 | IMAGE | Homo sapiens apolipoprotein E (APOE), mRNA | Hs.515465 | Apolipoprotein E | APOE | MusAffy_p53KO, MusAffy_TRAMP
560 | 5420598 | IMAGE | Homo sapiens apolipoprotein E (APOE), mRNA | Hs.515465 | Apolipoprotein E | APOE | MusAffy_p53KO, MusAffy_TRAMP
561 | 4637056 | IMAGE | BC016840 | Hs.533655 | Apoptosis-inducing factor (AIF)-like mitochondrion-associated inducer of death | AMID | MusAffy_p53KO, MusAffy_TRAMP
562 | 5294773 | IMAGE | BC016840 | Hs.533655 | Apoptosis-inducing factor (AIF)-like mitochondrion-associated inducer of death | AMID | MusAffy_p53KO, MusAffy_TRAMP
563 | 4518807 | IMAGE | Homo sapiens dentatorubral-pallidoluysian atrophy (atrophin-1) (DRPLA), mRNA | Hs.143766 | Atrophin 1 | ATN1 | MusAffy_p53KO, MusAffy_TRAMP
564 | 4644115 | IMAGE | Homo sapiens dentatorubral-pallidoluysian atrophy (atrophin-1) (DRPLA), mRNA | Hs.143766 | Atrophin 1 | ATN1 | MusAffy_p53KO, MusAffy_TRAMP
565 | 823718 | IMAGE | carbonic anhydrase VI | Hs.100322 | Carbonic anhydrase VI | CA6 | MusAffy_p53KO, MusAffy_TRAMP
566 | 2953576 | IMAGE | Homo sapiens carnitine palmitoyltransferase I, muscle (CPT1B), mRNA | Hs.439777 | Carnitine palmitoyltransferase 1B (muscle) | CPT1B | MusAffy_p53KO, MusAffy_TRAMP
567 | 5169217 | IMAGE | Homo sapiens carnitine palmitoyltransferase I, muscle (CPT1B), mRNA | Hs.439777 | Carnitine palmitoyltransferase 1B (muscle) | CPT1B | MusAffy_p53KO, MusAffy_TRAMP

568 | 29964 | IMAGE | Cdc42 effector protein 2 | Hs.343380 | CDC42 effector protein (Rho GTPase binding) 2 | CDC42EP2 | MusAffy_p53KO, MusAffy_TRAMP
569 | 824755 | IMAGE | chloride channel 6 | Hs.193043 | Chloride channel 6 | CLCN6 | MusAffy_p53KO, MusAffy_TRAMP
570 | 5171392 | IMAGE | AK024009 Homo sapiens cDNA FLJ13947 fis, clone Y79AA1000985, highly similar to Human centrosomal protein kendrin mRNA. | Hs.236572 | Chromosome 21 open reading frame 58 | C21orf58 | MusAffy_p53KO, MusAffy_TRAMP
571 | 811149 | IMAGE | chromosome 9 open reading frame 3 | Hs.434253 | Chromosome 9 open reading frame 3 |  | MusAffy_p53KO, MusAffy_TRAMP
572 | 4111294 | IMAGE | Homo sapiens hypothetical protein FLJ14675 (FLJ14675), mRNA | Hs.434253 | Chromosome 9 open reading frame 3 | C9orf3 | MusAffy_p53KO, MusAffy_TRAMP
573 | 5226263 | IMAGE | Homo sapiens hypothetical protein FLJ14675 (FLJ14675), mRNA | Hs.434253 | Chromosome 9 open reading frame 3 | C9orf3 | MusAffy_p53KO, MusAffy_TRAMP
574 | 2964685 | IMAGE | cleavage stimulation factor, 3' pre-RNA, subunit 1, 50kD | Hs.172865 | Cleavage stimulation factor, 3' pre-RNA, subunit 1, 50kDa | CSTF1 | MusAffy_p53KO, MusAffy_TRAMP
575 | 2490551 | IMAGE | Homo sapiens D component of complement (adipsin) (DF), mRNA | Hs.155597 | Complement factor D (adipsin) | CFD | MusAffy_p53KO, MusAffy_TRAMP
576 | 5176463 | IMAGE | Homo sapiens D component of complement (adipsin) (DF), mRNA | Hs.155597 | Complement factor D (adipsin) | CFD | MusAffy_p53KO, MusAffy_TRAMP
577 | 4865769 | IMAGE | Homo sapiens COP9 constitutive photomorphogenic homolog subunit 7A (Arabidopsis) (COPS7A), mRNA | Hs.530823 | COP9 constitutive photomorphogenic homolog subunit 7A (Arabidopsis) | COPS7A | MusAffy_p53KO, MusAffy_TRAMP

578 | 5286316 | IMAGE | Homo sapiens COP9 constitutive photomorphogenic homolog subunit 7A (Arabidopsis) (COPS7A), mRNA | Hs.530823 | COP9 constitutive photomorphogenic homolog subunit 7A (Arabidopsis) | COPS7A | MusAffy_p53KO, MusAffy_TRAMP
579 | 487912 | IMAGE | popeye protein 2 | Hs.534383 | COX17 homolog, cytochrome c oxidase assembly protein (yeast) | COX17 | MusAffy_p53KO, MusAffy_TRAMP
580 | 1266745 | IMAGE | Homo sapiens cysteine and glycine-rich protein 3 (cardiac LIM protein) (CSRP3), mRNA | Hs.83577 | Cysteine and glycine-rich protein 3 (cardiac LIM protein) | CSRP3 | MusAffy_p53KO, MusAffy_TRAMP
581 | 4292038 | IMAGE | Homo sapiens cysteine and glycine-rich protein 3 (cardiac LIM protein) (CSRP3), mRNA | Hs.83577 | Cysteine and glycine-rich protein 3 (cardiac LIM protein) | CSRP3 | MusAffy_p53KO, MusAffy_TRAMP
582 | 2114122 | IMAGE | protein kinase NYD-SP15 | Hs.388220 | Cytidine and dCMP deaminase domain containing 1 | CDADC1 | MusAffy_p53KO, MusAffy_TRAMP
583 | 2307780 | IMAGE | BC008335 | Hs.591002 | Dedicator of cytokinesis 6 | DOCK6 | MusAffy_p53KO, MusAffy_TRAMP
584 | 5203063 | IMAGE | BC008335 | Hs.591002 | Dedicator of cytokinesis 6 | DOCK6 | MusAffy_p53KO, MusAffy_TRAMP
585 | 525398 | IMAGE | ESTs | Mm.371601 | DNA segment, Chr 17, ERATO Doi 441, expressed | D17Ertd441e | MusAffy_p53KO, MusAffy_TRAMP
586 | 4333499 | IMAGE | melanoma inhibitory activity | Hs.515417 | Egl nine homolog 2 (C. elegans) | EGLN2 | MusAffy_p53KO, MusAffy_TRAMP
587 | NM_006533 | GENBANK | melanoma inhibitory activity | Hs.515417 | Egl nine homolog 2 (C. elegans) | EGLN2 | MusAffy_p53KO, MusAffy_TRAMP
588 | NM_006533 | GENBANK | melanoma inhibitory activity | Hs.515417 | Egl nine homolog 2 (C. elegans) | EGLN2 | MusAffy_p53KO, MusAffy_TRAMP

589 | 1903036 | IMAGE | ESTs, Moderately similar to envelope polyprotein precursor (Mus musculus) | *GENOMIC#6 | GENOMIC#6_Multiple: OA: ESTs, Moderately similar to envelope polyprotein precursor (Mus musculus) | *GENOMIC#6 | MusAffy_p53KO, MusAffy_TRAMP
590 | 1903045 | IMAGE | ESTs, Moderately similar to envelope polyprotein precursor (Mus musculus) | *GENOMIC#6 | GENOMIC#6_Multiple: OA: ESTs, Moderately similar to envelope polyprotein precursor (Mus musculus) | *GENOMIC#6 | MusAffy_p53KO, MusAffy_TRAMP
591 | 2432792 | IMAGE | ESTs | Hs.57971 | Hairy and enhancer of split 5 (Drosophila) | HES5 | MusAffy_p53KO, MusAffy_TRAMP
592 | 2574667 | IMAGE | ESTs | Hs.57971 | Hairy and enhancer of split 5 (Drosophila) | HES5 | MusAffy_p53KO, MusAffy_TRAMP
593 | 4556509 | IMAGE | Homo sapiens hypothetical protein FLJ10847 (FLJ10847), mRNA | Hs.232054 | Hypothetical protein FLJ10847 | FLJ10847 | MusAffy_p53KO, MusAffy_TRAMP
594 | 5296948 | IMAGE | Homo sapiens hypothetical protein FLJ10847 (FLJ10847), mRNA | Hs.232054 | Hypothetical protein FLJ10847 | FLJ10847 | MusAffy_p53KO, MusAffy_TRAMP
595 | 785365 | IMAGE | hypothetical protein MGC12966 | Hs.7980 | Hypothetical protein LOC84792 | MGC12966 | MusAffy_p53KO, MusAffy_TRAMP
596 | 159127 | IMAGE | AF161370 | Hs.487994 | KIAA1706 protein | KIAA1706 | MusAffy_p53KO, MusAffy_TRAMP
597 | 3343882 | IMAGE | AK027386 Homo sapiens cDNA FLJ14480 fis, clone MAMMA1002215. | Hs.487994 | KIAA1706 protein | KIAA1706 | MusAffy_p53KO, MusAffy_TRAMP
598 | 3918552 | IMAGE | AF161370 | Hs.487994 | KIAA1706 protein | KIAA1706 | MusAffy_p53KO, MusAffy_TRAMP
599 | 4180691 | IMAGE | AK027386 Homo sapiens cDNA FLJ14480 fis, clone MAMMA1002215. | Hs.487994 | KIAA1706 protein | KIAA1706 | MusAffy_p53KO, MusAffy_TRAMP

600 | 302127 | IMAGE | lipocalin 2 (oncogene 24p3) | Hs.204238 | Lipocalin 2 (oncogene 24p3) | LCN2 | MusAffy_p53KO, MusAffy_TRAMP
601 | 2989926 | IMAGE | Homo sapiens hypothetical protein STRAIT11499 (STRAIT11499), mRNA | Hs.522605 | MID1 interacting protein 1 (gastrulation specific G12-like (zebrafish)) | MID1IP1 | MusAffy_p53KO, MusAffy_TRAMP
602 | 5164760 | IMAGE | Homo sapiens hypothetical protein STRAIT11499 (STRAIT11499), mRNA | Hs.522605 | MID1 interacting protein 1 (gastrulation specific G12-like (zebrafish)) | MID1IP1 | MusAffy_p53KO, MusAffy_TRAMP
603 | 796828 | IMAGE | Homo sapiens protein phosphatase (LOC51207), mRNA | *MULTIPLE#10 | MULTIPLE#10; OA: Protein phosphatase (LOC51207) | *MULTIPLE#10 | MusAffy_p53KO, MusAffy_TRAMP
604 | 796826 | IMAGE | Homo sapiens protein phosphatase (LOC51207), mRNA | *MULTIPLE#9 | MULTIPLE#9; OA: Protein phosphatase (LOC51207) | *MULTIPLE#9 | MusAffy_p53KO, MusAffy_TRAMP
605 | 841666 | IMAGE | molecule possessing ankyrin repeats induced by lipopolysaccharide (MAIL), homolog of mouse | Hs.319171 | Nuclear factor of kappa light polypeptide gene enhancer in B-cells inhibitor, zeta | NFKBIZ | MusAffy_p53KO, MusAffy_TRAMP
606 | 1074577 | IMAGE | Homo sapiens occludin (OCLN), mRNA | Hs.422901 | Occludin | OCLN | MusAffy_p53KO, MusAffy_TRAMP
607 | 5210344 | IMAGE | Homo sapiens occludin (OCLN), mRNA | Hs.422901 | Occludin | OCLN | MusAffy_p53KO, MusAffy_TRAMP
608 | 3868481 | IMAGE | Homo sapiens pericentrin 2 (kendrin) (PCNT2), mRNA | Hs.474069 | Pericentrin (kendrin) | PCNT | MusAffy_p53KO, MusAffy_TRAMP
609 | 4291235 | IMAGE | AK024009 Homo sapiens cDNA FLJ13947 fis, clone Y79AA1000985, highly similar to Human centrosomal protein kendrin mRNA. | Hs.474069 | Pericentrin (kendrin) | PCNT | MusAffy_p53KO, MusAffy_TRAMP
610 | 4931241 | IMAGE | Homo sapiens pericentrin 2 (kendrin) (PCNT2), mRNA | Hs.474069 | Pericentrin (kendrin) | PCNT | MusAffy_p53KO, MusAffy_TRAMP
611 | 4294999 | IMAGE | Prostaglandin D2 synthase | Hs.446429 | Prostaglandin D2 synthase 21kDa (brain) | PTGDS | MusAffy_p53KO, MusAffy_TRAMP

612 | 5266736 | IMAGE | Homo sapiens prostaglandin D2 synthase (21kD, brain) (PTGDS), mRNA | Hs.446429 | Prostaglandin D2 synthase 21kDa (brain) | PTGDS | MusAffy_p53KO, MusAffy_TRAMP
613 | 5286169 | IMAGE | Homo sapiens prostaglandin D2 synthase (21kD, brain) (PTGDS), mRNA | Hs.446429 | Prostaglandin D2 synthase 21kDa (brain) | PTGDS | MusAffy_p53KO, MusAffy_TRAMP
614 | 4914731 | IMAGE | Homo sapiens protein disulfide isomerase related protein (calcium-binding protein, intestinal-related) (ERP70), mRNA | Hs.93659 | Protein disulfide isomerase family A, member 4 | PDIA4 | MusAffy_p53KO, MusAffy_TRAMP
615 | 5185316 | IMAGE | Homo sapiens protein disulfide isomerase related protein (calcium-binding protein, intestinal-related) (ERP70), mRNA | Hs.93659 | Protein disulfide isomerase family A, member 4 | PDIA4 | MusAffy_p53KO, MusAffy_TRAMP
616 | 1669685 | IMAGE | Homo sapiens cDNA FLJ30824 fis, clone FEBRA2001698 | Hs.356766 | Similar to RPE-spondin | HSUP1 | MusAffy_p53KO, MusAffy_TRAMP
617 | 2491337 | IMAGE | smoothelin | Hs.149098 | Smoothelin | SMTN | MusAffy_p53KO, MusAffy_TRAMP
618 | 2755769 | IMAGE | expressed sequence AW550622 | Mm.208723 | Spire homolog 1 (Drosophila) | Spire1 | MusAffy_p53KO, MusAffy_TRAMP
619 | 1630022 | IMAGE | sulfortranferase family 4A, member 1 | Hs.189810 | Sulfotransferase family 4A, member 1 | SULT4A1 | MusAffy_p53KO, MusAffy_TRAMP
620 | 815750 | IMAGE | hypothetical protein FLJ22584 | Hs.424788 | Tetratricopeptide repeat domain 13 | TTC13 | MusAffy_p53KO, MusAffy_TRAMP
621 | 1637728 | IMAGE | transcriptional adaptor 2 (ADA2, yeast, homolog)-like | Hs.500066 | Transcriptional adaptor 2 (ADA2 homolog, yeast)-like | TADA2L | MusAffy_p53KO, MusAffy_TRAMP
622 | 3544160 | IMAGE | Homo sapiens active BCR-related gene (ABR), transcript variant 1, mRNA | Hs.159306 | Active BCR-related gene | ABR | MusAffy_TRAMP
623 | 4840373 | IMAGE | Homo sapiens active BCR-related gene (ABR), transcript variant 1, mRNA | Hs.159306 | Active BCR-related gene | ABR | MusAffy_TRAMP

624 | 1031747 | IMAGE | alkylation repair; alkB homolog | Hs.94542 | AlkB, alkylation repair homolog 1 (E. coli) | ALKBH1 | MusAffy_TRAMP
625 | 2019284 | IMAGE | Homo sapiens apolipoprotein C-I (APOC1), mRNA | Hs.110675 | Apolipoprotein C-I | APOC1 | MusAffy_TRAMP
626 | 4716093 | IMAGE | Homo sapiens apolipoprotein C-I (APOC1), mRNA | Hs.110675 | Apolipoprotein C-I | APOC1 | MusAffy_TRAMP
627 | 4134937 | IMAGE | Homo sapiens ceroid-lipofuscinosis, neuronal 8 (epilepsy, progressive with mental retardation) (CLN8), mRNA | Hs.127675 | Ceroid-lipofuscinosis, neuronal 8 (epilepsy, progressive with mental retardation) | CLN8 | MusAffy_TRAMP
628 | 5210946 | IMAGE | Homo sapiens ceroid-lipofuscinosis, neuronal 8 (epilepsy, progressive with mental retardation) (CLN8), mRNA | Hs.127675 | Ceroid-lipofuscinosis, neuronal 8 (epilepsy, progressive with mental retardation) | CLN8 | MusAffy_TRAMP
629 | 1707496 | IMAGE | Homo sapiens NICE-1 protein (NICE-1), mRNA | Hs.110196 | Chromosome 1 open reading frame 42 | C1orf42 | MusAffy_TRAMP
630 | 2735978 | IMAGE | Homo sapiens NICE-1 protein (NICE-1), mRNA | Hs.110196 | Chromosome 1 open reading frame 42 | C1orf42 | MusAffy_TRAMP
631 | 3285542 | IMAGE | Homo sapiens hypothetical protein MGC4022 (R32184_3), mRNA | Hs.515003 | Chromosome 19 open reading frame 6 | C19orf6 | MusAffy_TRAMP
632 | 4375473 | IMAGE | Homo sapiens hypothetical protein MGC4022 (R32184_3), mRNA | Hs.515003 | Chromosome 19 open reading frame 6 | C19orf6 | MusAffy_TRAMP
633 | 812244 | IMAGE | chemokine-like factor 1 | Hs.15159 | CKLF-like MARVEL transmembrane domain containing 1 | CMTM1 | MusAffy_TRAMP
634 | 1883851 | IMAGE | Homo sapiens cDNA FLJ31647 fis, clone NT2RI2003973 | Hs.6917 | Coagulation factor VIII, procoagulant component (hemophilia A) | F8 | MusAffy_TRAMP
635 | 344997 | IMAGE | Cystatin E (cystatin M) | Hs.139389 | Cystatin E/M | CST6 | MusAffy_TRAMP
636 | NM_001323 | GENBANK | Cystatin E (cystatin M) | Hs.139389 | Cystatin E/M | CST6 | MusAffy_TRAMP

637 | NM_001323 | GENBANK | Cystatin E (cystatin M) | Hs.139389 | Cystatin E/M | CST6 | MusAffy_TRAMP
638 | 340826 | IMAGE | Homo sapiens cDNA FLJ12749 fis, clone NT2RP2001149 | Hs.27475 | Cytochrome b5 domain containing 1 | CYB5D1 | MusAffy_TRAMP
639 | 544574 | IMAGE | ESTs | Hs.503453 | Discs, large homolog 2, chapsyn-110 (Drosophila) | DLG2 | MusAffy_TRAMP
640 | 4745507 | IMAGE | Homo sapiens fatty acid desaturase 2 (FADS2), mRNA | Hs.502745 | Fatty acid desaturase 2 | FADS2 | MusAffy_TRAMP
641 | 4745962 | IMAGE | Homo sapiens fatty acid desaturase 2 (FADS2), mRNA | Hs.502745 | Fatty acid desaturase 2 | FADS2 | MusAffy_TRAMP
642 | 2049835 | IMAGE | ficolin (collagen/fibrinogen domain-containing) 1 | Hs.440898 | Ficolin (collagen/fibrinogen domain containing) 1 | FCN1 | MusAffy_TRAMP
643 | 3613648 | IMAGE | HUMPROFILE Human profilaggrin mRNA, 3' end. | Hs.23783 | Filaggrin | FLG | MusAffy_TRAMP
644 | 1019292 | IMAGE | ESTs, Weakly similar to I61746 pheromone receptor VN4 - rat (R.norvegicus) | *GENOMIC#5 | GENOMIC#5_ChrXp11.23: OA: ESTs, Weakly similar to I61746 pheromone receptor VN4 - rat (R.norvegicus) | *GENOMIC#5 | MusAffy_TRAMP
645 | 1019901 | IMAGE | ESTs, Weakly similar to I61746 pheromone receptor VN4 - rat (R.norvegicus) | *GENOMIC#5 | GENOMIC#5_ChrXp11.23: OA: ESTs, Weakly similar to I61746 pheromone receptor VN4 - rat (R.norvegicus) | *GENOMIC#5 | MusAffy_TRAMP
646 | 3162946 | IMAGE | Homo sapiens EH-domain containing 2 (EHD2), mRNA | Hs.421907 | Glioma tumor suppressor candidate region gene 2 | GLTSCR2 | MusAffy_TRAMP
647 | 3895191 | IMAGE | Homo sapiens EH-domain containing 2 (EHD2), mRNA | Hs.421907 | Glioma tumor suppressor candidate region gene 2 | GLTSCR2 | MusAffy_TRAMP
648 | 4050191 | IMAGE | Homo sapiens glutamyl aminopeptidase (aminopeptidase A) (ENPEP), mRNA | Hs.435765 | Glutamyl aminopeptidase (aminopeptidase A) | ENPEP | MusAffy_TRAMP

649 | 4608546 | IMAGE | Homo sapiens glutamyl aminopeptidase (aminopeptidase A) (ENPEP), mRNA | Hs.435765 | Glutamyl aminopeptidase (aminopeptidase A) | ENPEP | MusAffy_TRAMP
650 | 4824756 | IMAGE | Homo sapiens glutathione peroxidase 3 (plasma) (GPX3), mRNA | Hs.386793 | Glutathione peroxidase 3 (plasma) | GPX3 | MusAffy_TRAMP
651 | 2388311 | IMAGE | RIKEN cDNA 2300002A13 gene | Hs.87247 | Harakiri, BCL2 interacting protein (contains only BH3 domain) | HRK | MusAffy_TRAMP
652 | 3535529 | IMAGE | HMBA-inducible | Hs.15299 | Hexamethylene bis-acetamide inducible 1 | HEXIM1 | MusAffy_TRAMP
653 | 3509257 | IMAGE | Homo sapiens hypothetical protein MGC17330 (MGC17330), mRNA | Hs.26670 | HGFL gene | MGC17330 | MusAffy_TRAMP
654 | 4852055 | IMAGE | Homo sapiens hypothetical protein MGC17330 (MGC17330), mRNA | Hs.26670 | HGFL gene | MGC17330 | MusAffy_TRAMP
655 | 505059 | IMAGE | hydroxysteroid (11-beta) dehydrogenase 1 | Hs.195040 | Hydroxysteroid (11-beta) dehydrogenase 1 | HSD11B1 | MusAffy_TRAMP
656 | 417983 | IMAGE | ESTs | Hs.355747 | Hypothetical LOC388610 | LOC388610 | MusAffy_TRAMP
657 | 3832598 | IMAGE | BC002926 | Hs.443636 | Hypothetical protein BC002926 | LOC90379 | MusAffy_TRAMP
658 | 3941154 | IMAGE | BC002926 | Hs.443636 | Hypothetical protein BC002926 | LOC90379 | MusAffy_TRAMP
659 | 5184351 | IMAGE | BC017488 | Hs.460574 | Hypothetical protein BC017488 | LOC124446 | MusAffy_TRAMP
660 | 5208741 | IMAGE | BC017488 | Hs.460574 | Hypothetical protein BC017488 | LOC124446 | MusAffy_TRAMP
661 | 3354430 | IMAGE | Homo sapiens hypothetical protein DKFZp762E1312 (DKFZp762E1312), mRNA | Hs.532968 | Hypothetical protein DKFZp762E1312 | DKFZp762E1312 | MusAffy_TRAMP

662 | 5202244 | IMAGE | Homo sapiens hypothetical protein DKFZp762E1312 (DKFZp762E1312), mRNA | Hs.532968 | Hypothetical protein DKFZp762E1312 | DKFZp762E1312 | MusAffy_TRAMP
663 | 1540236 | IMAGE | ESTs, Highly similar to T47163 hypothetical protein DKFZp762E1312.1 (Homo sapiens) | Hs.532968 | Hypothetical protein DKFZp762E1312 | DKFZp762E1312 | MusAffy_TRAMP
664 | 4300632 | IMAGE | Homo sapiens inhibitor of DNA binding 1, dominant negative helix-loop-helix protein (ID1), mRNA | Hs.504609 | Inhibitor of DNA binding 1, dominant negative helix-loop-helix protein | ID1 | MusAffy_TRAMP
665 | 5175558 | IMAGE | Homo sapiens inhibitor of DNA binding 1, dominant negative helix-loop-helix protein (ID1), mRNA | Hs.504609 | Inhibitor of DNA binding 1, dominant negative helix-loop-helix protein | ID1 | MusAffy_TRAMP
666 | 1019153 | IMAGE | Homo sapiens inositol 1,4,5-triphosphate receptor, type 1 (ITPR1), mRNA | Hs.567295 | Inositol 1,4,5-triphosphate receptor, type 1 | ITPR1 | MusAffy_TRAMP
667 | 5192827 | IMAGE | Homo sapiens inositol 1,4,5-triphosphate receptor, type 1 (ITPR1), mRNA | Hs.567295 | Inositol 1,4,5-triphosphate receptor, type 1 | ITPR1 | MusAffy_TRAMP
668 | 3905672 | IMAGE | Homo sapiens hypothetical gene MGC16733 similar to CG12113 (MGC16733), mRNA | Hs.533723 | Integrator complex subunit 4 | INTS4 | MusAffy_TRAMP
669 | 4129693 | IMAGE | Homo sapiens hypothetical gene MGC16733 similar to CG12113 (MGC16733), mRNA | Hs.533723 | Integrator complex subunit 4 | INTS4 | MusAffy_TRAMP
670 | 810960 | IMAGE | kallikrein 10 | Hs.275464 | Kallikrein 10 | KLK10 | MusAffy_TRAMP
671 | 3632557 | IMAGE | kallikrein 10 | Hs.275464 | Kallikrein 10 | KLK10 | MusAffy_TRAMP
672 | NM_002776 | GENBANK | kallikrein 10 | Hs.275464 | Kallikrein 10 | KLK10 | MusAffy_TRAMP
673 | 4752518 | IMAGE | Homo sapiens keratin 17 (KRT17), mRNA | Hs.2785 | Keratin 17 | KRT17 | MusAffy_TRAMP

674 | 5182302 | IMAGE | Homo sapiens keratin 17 (KRT17), mRNA | Hs.2785 | Keratin 17 | KRT17 | MusAffy_TRAMP
675 | 4745765 | IMAGE | Homo sapiens similar to KERATIN, TYPE II CYTOSKELETAL 6F (CYTOKERATIN 6F) (CK 6F) (K6F KERATIN) (LOC121036), mRNA | Hs.505608 | Keratin 6L | KRT6L | MusAffy_TRAMP
676 | 4746566 | IMAGE | Homo sapiens similar to KERATIN, TYPE II CYTOSKELETAL 6F (CYTOKERATIN 6F) (CK 6F) (K6F KERATIN) (LOC121036), mRNA | Hs.505608 | Keratin 6L | KRT6L | MusAffy_TRAMP
677 | 3055500 | IMAGE | Homo sapiens p30 DBC protein (LOC57805), mRNA | Hs.433722 | KIAA1967 | KIAA1967 | MusAffy_TRAMP
678 | 5206899 | IMAGE | Homo sapiens p30 DBC protein (LOC57805), mRNA | Hs.433722 | KIAA1967 | KIAA1967 | MusAffy_TRAMP
679 | 767202 | IMAGE | latent transforming growth factor beta binding protein 2 | Hs.512776 | Latent transforming growth factor beta binding protein 2 | LTBP2 | MusAffy_TRAMP
680 | 927046 | IMAGE | Homo sapiens hypothetical protein FLJ10751 (FLJ10751), mRNA | Hs.7778 | Leucine rich repeat containing 20 | LRRC20 | MusAffy_TRAMP
681 | 2989869 | IMAGE | Homo sapiens hypothetical protein FLJ10751 (FLJ10751), mRNA | Hs.7778 | Leucine rich repeat containing 20 | LRRC20 | MusAffy_TRAMP
682 | 2722531 | IMAGE | Homo sapiens KIAA0430 gene product (KIAA0430), mRNA | Hs.173524 | Limkain b1 | LKAP | MusAffy_TRAMP
683 | 5206837 | IMAGE | Homo sapiens KIAA0430 gene product (KIAA0430), mRNA | Hs.173524 | Limkain b1 | LKAP | MusAffy_TRAMP
684 | 3690202 | IMAGE | hypothetical protein MGC14151 | Hs.565094 | LSM domain containing 1 | LSMD1 | MusAffy_TRAMP
685 | 122288 | IMAGE | MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C) | Hs.444409 | MADS box transcription enhancer factor 2, polypeptide C (myocyte enhancer factor 2C) | MEF2C | MusAffy_TRAMP

686 | 4750858 | IMAGE | Homo sapiens mannosidase, alpha, class 2B, member 1 (MAN2B1), mRNA | Hs.356769 | Mannosidase, alpha, class 2B, member 1 | MAN2B1 | MusAffy_TRAMP
687 | 5203054 | IMAGE | Homo sapiens mannosidase, alpha, class 2B, member 1 (MAN2B1), mRNA | Hs.356769 | Mannosidase, alpha, class 2B, member 1 | MAN2B1 | MusAffy_TRAMP
688 | 4298673 | IMAGE | Homo sapiens MCM5 minichromosome maintenance deficient 5, cell division cycle 46 (S. cerevisiae) (MCM5), mRNA | Hs.517582 | MCM5 minichromosome maintenance deficient 5, cell division cycle 46 (S. cerevisiae) | MCM5 | MusAffy_TRAMP
689 | 5419899 | IMAGE | Homo sapiens MCM5 minichromosome maintenance deficient 5, cell division cycle 46 (S. cerevisiae) (MCM5), mRNA | Hs.517582 | MCM5 minichromosome maintenance deficient 5, cell division cycle 46 (S. cerevisiae) | MCM5 | MusAffy_TRAMP
690 | 3062487 | IMAGE | ESTs, Highly similar to T47163 hypothetical protein DKFZp762E1312.1 (Homo sapiens) | *MULTIPLE#5 | MULTIPLE#5; OA: ESTs, Highly similar to T47163 hypothetical protein DKFZp762E1312.1 (Homo sapiens) | *MULTIPLE#5 | MusAffy_TRAMP
691 | 949947 | IMAGE | Homo sapiens mRNA; cDNA DKFZp762P1915 (from clone DKFZp762P1915) | Hs.120228 | Myeloid/lymphoid or mixed-lineage leukemia 2 | MLL2 | MusAffy_TRAMP
692 | 758311 | IMAGE | Homo sapiens myosin, light polypeptide 2, regulatory, cardiac, slow (MYL2), mRNA | Hs.75535 | Myosin, light polypeptide 2, regulatory, cardiac, slow | MYL2 | MusAffy_TRAMP
693 | 3304193 | IMAGE | Homo sapiens myosin, light polypeptide 2, regulatory, cardiac, slow (MYL2), mRNA | Hs.75535 | Myosin, light polypeptide 2, regulatory, cardiac, slow | MYL2 | MusAffy_TRAMP
694 | 4749609 | IMAGE | Homo sapiens KIAA0618 gene product (KIAA0618), mRNA | Hs.549079 | NOL1/NOP2/Sun domain family, member 5C | NSUN5C | MusAffy_TRAMP
695 | 666292 | IMAGE | plasmalemma vesicle associated protein | Hs.107125 | Plasmalemma vesicle associated protein | PLVAP | MusAffy_TRAMP

Homo sapiens KIAA0618 gene POM121 membrane 696 4906164 IMAGE product (KIAA0618), mRNA Hs.488624 glycoprotein (rat) POM121 MusAffy_TRAMP pre-B-cell leukemia transcription Pre-B-cell leukemia transcription 697 448386 IMAGE factor 3 Hs.428027 factor 3 PBX3 MusAffy_TRAMP Homo sapiens pregnancy specific beta-1-glycoprotein 1 (PSG1), Pregnancy specific beta-1- 698 4731651 IMAGE mRNA Hs.466843 glycoprotein 1 PSG1 MusAffy_TRAMP Homo sapiens pregnancy specific beta-1-glycoprotein 1 (PSG1), Pregnancy specific beta-1- 699 4733098 IMAGE mRNA Hs.466843 glycoprotein 1 PSG1 MusAffy_TRAMP pregnancy specific beta-1- Pregnancy specific beta-1- 700 2028866 IMAGE glycoprotein 4 Hs.558372 glycoprotein 4 PSG4 MusAffy_TRAMP Homo sapiens proliferating cell 701 4517709 IMAGE nuclear antigen (PCNA), mRNA Hs.147433 Proliferating cell nuclear antigen PCNA MusAffy_TRAMP Homo sapiens proliferating cell 702 5190671 IMAGE nuclear antigen (PCNA), mRNA Hs.147433 Proliferating cell nuclear antigen PCNA MusAffy_TRAMP

Homo sapiens prostaglandin- endoperoxide synthase 1 (prostaglandin G/H synthase and Prostaglandin-endoperoxide ) (PTGS1), synthase 1 (prostaglandin G/H 703 1712870 IMAGE transcript variant 1, mRNA Hs.201978 synthase and cyclooxygenase) PTGS1 MusAffy_TRAMP

Homo sapiens prostaglandin- endoperoxide synthase 1 (prostaglandin G/H synthase and Prostaglandin-endoperoxide cyclooxygenase) (PTGS1), synthase 1 (prostaglandin G/H 704 5174946 IMAGE transcript variant 1, mRNA Hs.201978 synthase and cyclooxygenase) PTGS1 MusAffy_TRAMP protein phosphatase 1, regulatory Protein phosphatase 1, 705 3346573 IMAGE (inhibitor) subunit 2 Hs.535731 regulatory (inhibitor) subunit 2 PPP1R2 MusAffy_TRAMP , protein phosphatase 2, regulatory regulatory subunit B (B56), beta 706 586725 IMAGE subunit B (B56), beta isoform Hs.75199 isoform PPP2R5B MusAffy_TRAMP 1 2 3

Homo sapiens PTEN induced 707 3622656 IMAGE putative kinase 1 (PINK1), mRNA Hs.389171 PTEN induced putative kinase 1 PINK1 MusAffy_TRAMP

Homo sapiens PTEN induced 708 5214483 IMAGE putative kinase 1 (PINK1), mRNA Hs.389171 PTEN induced putative kinase 1 PINK1 MusAffy_TRAMP Homo sapiens Rec8p, a meiotic recombination and sister chromatid cohesion phosphoprotein of the 709 2623653 IMAGE rad21p family (REC8), mRNA Hs.419259 REC8-like 1 (yeast) REC8L1 MusAffy_TRAMP Homo sapiens Rec8p, a meiotic recombination and sister chromatid cohesion phosphoprotein of the 710 5192338 IMAGE rad21p family (REC8), mRNA Hs.419259 REC8-like 1 (yeast) REC8L1 MusAffy_TRAMP 711 3355603 IMAGE ribosomal protein L34 Hs.438227 Ribosomal protein L34 RPL34 MusAffy_TRAMP 712 1926364 IMAGE RNA binding motif protein 9 Hs.282998 Ribosomal protein L41 RPL41 MusAffy_TRAMP Homo sapiens ring finger protein 40 713 4810404 IMAGE (RNF40), mRNA Hs.65238 Ring finger protein 40 RNF40 MusAffy_TRAMP Homo sapiens ring finger protein 40 714 5212755 IMAGE (RNF40), mRNA Hs.65238 Ring finger protein 40 RNF40 MusAffy_TRAMP secretory leukocyte protease Secretory leukocyte peptidase 715 378813 IMAGE inhibitor (antileukoproteinase) Hs.517070 inhibitor SLPI MusAffy_TRAMP Homo sapiens serine protease inhibitor, Kazal type, 5 (SPINK5), Serine peptidase inhibitor, Kazal 716 2407562 IMAGE mRNA Hs.331555 type 5 SPINK5 MusAffy_TRAMP Homo sapiens serine protease inhibitor, Kazal type, 5 (SPINK5), Serine peptidase inhibitor, Kazal 717 2740938 IMAGE mRNA Hs.331555 type 5 SPINK5 MusAffy_TRAMP Homo sapiens, Similar to RIKEN cDNA 5031425D22 gene, clone MGC:21579 IMAGE:4473003, 718 645662 IMAGE mRNA, complete cds Hs.591998 -like 1 SAAL1 MusAffy_TRAMP 1 2 4

Homo sapiens small proline-rich 719 2448791 IMAGE protein 1A (SPRR1A), mRNA Hs.46320 Small proline-rich protein 1A SPRR1A MusAffy_TRAMP Homo sapiens SKI-interacting 720 2662745 IMAGE protein (SNW1), mRNA Hs.546550 SNW domain containing 1 SNW1 MusAffy_TRAMP Homo sapiens SKI-interacting 721 4856640 IMAGE protein (SNW1), mRNA Hs.546550 SNW domain containing 1 SNW1 MusAffy_TRAMP ESTs, Weakly similar to I38022 hypothetical protein (Homo Solute carrier family 35, member 722 489109 IMAGE sapiens) Hs.356467 E1 SLC35E1 MusAffy_TRAMP Solute carrier family 7 (cationic amino acid transporter, y+ 723 1571882 IMAGE KIAA1443 protein Hs.318431 system), member 8 SLC7A8 MusAffy_TRAMP Homo sapiens stearoyl-CoA desaturase (delta-9-desaturase) Stearoyl-CoA desaturase (delta- 724 434891 IMAGE (SCD), mRNA Hs.558396 9-desaturase) SCD MusAffy_TRAMP Homo sapiens stearoyl-CoA desaturase (delta-9-desaturase) Stearoyl-CoA desaturase (delta- 725 3071027 IMAGE (SCD), mRNA Hs.558396 9-desaturase) SCD MusAffy_TRAMP SWI/SNF related, matrix SWI/SNF related, matrix associated, actin dependent associated, actin dependent regulator of , subfamily c, regulator of chromatin, SMARCC 726 2565521 IMAGE member 1 Hs.476179 subfamily c, member 1 1 MusAffy_TRAMP 727 780937 IMAGE KIAA0080 protein Hs.32984 XI SYT11 MusAffy_TRAMP 728 2823228 IMAGE hypothetical protein MGC12943 Hs.488683 Syntaxin 1A (brain) STX1A MusAffy_TRAMP 729 2586821 IMAGE AF047002 Hs.534385 THO complex 4 THOC4 MusAffy_TRAMP 730 3033493 IMAGE AF047002 Hs.534385 THO complex 4 THOC4 MusAffy_TRAMP Homo sapiens glutathione peroxidase 3 (plasma) (GPX3), 731 5179324 IMAGE mRNA Hs.570891 Transcribed locus MusAffy_TRAMP Homo sapiens transcription factor- 72 1

732 4398713 IMAGE like 1 (TCFL1), mRNA Hs.2430 (yeast) VPS72 MusAffy_TRAMP 2 5

Homo sapiens transcription factor- Vacuolar protein sorting 72 733 5263551 IMAGE like 1 (TCFL1), mRNA Hs.2430 (yeast) VPS72 MusAffy_TRAMP zinc finger protein 135 (clone pHZ- Zinc finger protein 135 (clone 734 286378 IMAGE 17) Hs.85863 pHZ-17) ZNF135 MusAffy_TRAMP

Note: *MusAffy_TRAMP stands for TRAMP mouse model (Affymetrix microarray-based analysis).
**MusAffy_p53KO stands for p53 knockout mouse model (Affymetrix microarray-based analysis).


3.2 Microarray printing

The custom Prostate Marker Array was created using QBI/Incyte microarray technology and facilities. For each clone, an insert was PCR-amplified and printed on the glass slide as two nonadjacent spots. Additional technological controls (yeast, rat DNA, Cy3 and Cy5 concentration gradient spots) were added to the microarray layout.

3.3 Cell culture sources for hybridization probe preparation

We used a large set of human cell lines derived from prostate and non-prostate tumors (Table X). Several cell lines derived from gliomas, glioblastomas and melanomas were included because prostate tumors often undergo neuroendocrine differentiation [186]; accordingly, predicted markers of prostate cancer may also be associated with neurological and melanocytic malignancies. Overall, a diverse set of cell lines allows a better assessment of expression profiles and provides more reliable clustering of the data.

The majority of cell lines were obtained from the American Type Culture Collection (Manassas, VA), with the exception of melanoma and nevus-derived cell lines, which were obtained from the Department of Surgical Oncology, University of Illinois at Chicago.

The non-prostate cell lines were cultured in Dulbecco's Modified Eagle's Medium (DMEM) supplemented with 10% fetal bovine serum (FBS); the prostate cancer and renal carcinoma cell lines were cultured in RPMI-1640 medium with 10% FBS.

3.4 Probe preparation and microarray hybridization

For the preparation of the control hybridization sample in custom microarray hybridization, morphologically normal or benign prostate hyperplasia (BPH) samples were obtained from two prostatectomies. We isolated total RNA from cell lines and tissue samples using Trizol reagent (Invitrogen) according to the manufacturer’s instructions. The poly(A) mRNA fraction was prepared from all RNA samples using the Ambion MicroPoly(A)Purist™ kit. A common control sample was composed of equimolar amounts of BPH poly(A) mRNA, Human Brain Poly A+ RNA and Human Placenta Poly A+ RNA (Clontech). Poly(A) samples were labeled with Cy3 or Cy5 (common control sample) using the Incyte Genomics Life Array Probe Labeling Kit.

Hybridization was performed in the QBI/Incyte facility according to the Incyte protocols. Every cell line RNA sample was used in two independent hybridization experiments versus the common control sample. Microsomal versus cytoplasmic RNA hybridizations were done once per cell line.


TABLE X

HUMAN CELL LINES USED IN MICROARRAY AND NORTHERN BLOT HYBRIDIZATIONS

Cell lines | Origin | Reference
LNCaP | Prostate cancer, AR-positive, androgen-dependent | [187, 188]
CWR22R | Prostate cancer, AR-positive, androgen-independent | [189, 190]
C4-2, LN-56* | Prostate cancer, LNCaP-derived | [84, 191]
PC3, DU145 | Prostate cancer, AR-negative, androgen-independent | [192, 193]
CMN, CMNgse56* | Nevus | [194, 195]
UISO-MEL-1, MEL-2, MEL-6, MEL-7, MEL-8/pLPCX, MEL-11, MEL-16/pLNCX, MEL-23/pPS, MEL-29, 435 neo | Melanoma** | [196-198]
A172, D54, T98G, U138, U251, U87 | Glioma | [199]
MCF7, T47D, AU565, MDA-MB-231, MDA-MB-361, MDA-MB-468 | Breast cancer | [200, 201]
HCT116, RKO, LIM1215 | Colon cancer | [202]
ACHN, RCC26, RCC28 | Renal cell carcinoma | [203]
WI-38, WI-38gse56*, DMNgse56* | Fibroblasts | [204]
A431 | Squamous cell carcinoma | [205]
HeLa | | [206]
NCI-H1299 | | [207]
HT1080 | Fibrosarcoma | [208]
SAOS2 | Osteosarcoma | [209]
HEK293 | Embryonic kidney | [210]
NCI-H295R | Adrenal gland carcinoma | [211]

* p53 is inactivated by GSE56.
** All UISO-MEL-… lines are abbreviated as MEL-… here and in the subsequent text.


3.5 Microarray data analysis

Image analysis, per-spot fluorescence intensity calculations and spot geometry QC were performed using Incyte GemTools 2.5 software.

3.5.1 Quality control

Per-chip quality control was performed at the QBI/Incyte branch in Pleasanton, CA. All hybridizations were judged high-quality (Pearson correlation between signal values over replica experiment pairs >0.95). Subsequent quality control and normalization operations were done by our GINSTREAM software. Per-spot quality control was performed according to standard Incyte criteria: all spots that had an area of less than 40% and a signal-to-background ratio of less than 2.5 in both channels were considered invalid. Per-gene quality control was done by removing from the analysis all genes with more than 1/3 invalid spots. In total, 108 clones (90 genes) were removed, mostly due to failed amplification (all spots were invalid). After the removal, 648 clones corresponding to about 440 individual genes remained in the dataset.
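The per-spot and per-gene rules can be illustrated with a minimal Python sketch. It is not the GINSTREAM code: the function names and the (area, signal-to-background Cy3, signal-to-background Cy5) tuple layout are hypothetical, and the per-spot rule is implemented as a literal reading of the sentence above (small area and weak signal in both channels).

```python
def spot_is_valid(area_pct, sb_cy3, sb_cy5, min_area=40.0, min_sb=2.5):
    """Literal reading of the per-spot rule: a spot is invalid when its area
    is below 40% AND its signal-to-background ratio is below 2.5 in both channels."""
    too_small = area_pct < min_area
    weak_in_both = sb_cy3 < min_sb and sb_cy5 < min_sb
    return not (too_small and weak_in_both)


def genes_passing_qc(spots_by_gene):
    """Keep genes for which at most 1/3 of the spots are invalid.

    spots_by_gene maps a gene name to a list of (area_pct, sb_cy3, sb_cy5)
    tuples (an assumed layout, one tuple per printed spot and hybridization).
    """
    kept = []
    for gene, spots in spots_by_gene.items():
        invalid = sum(not spot_is_valid(*spot) for spot in spots)
        # Integer comparison avoids floating-point issues at exactly 1/3.
        if invalid * 3 <= len(spots):
            kept.append(gene)
    return kept
```

For example, a gene with two invalid spots out of four would be dropped, while a gene with one invalid spot out of three would be retained.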

3.5.2 Normalization

Per-chip normalization was accomplished through robust nonlinear fitting of log expression ratios against average log spot intensities. The fitting was done using the lowess() statistical function, separately for each of the four quadrants (pins). This procedure removed signal-dependent nonlinear bias [212].
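The idea of intensity-dependent normalization can be sketched as follows. This is a simplified, assumption-laden illustration: it uses a single-pass locally weighted linear fit with tricube weights and omits the robustness iterations of lowess(); the function names, the `frac` value and the (pin, average log intensity, log ratio) input layout are hypothetical and do not describe our actual software.

```python
from collections import defaultdict


def local_linear_fit(a_values, m_values, frac=0.6):
    """One-pass locally weighted linear fit of M against A (tricube weights)."""
    n = len(a_values)
    k = max(2, min(n, int(round(frac * n))))  # neighbours used per local fit
    fitted = []
    for a0 in a_values:
        # (distance, offset from a0, log ratio) for the k nearest points
        nearest = sorted((abs(a - a0), a - a0, m)
                         for a, m in zip(a_values, m_values))[:k]
        h = nearest[-1][0] or 1.0  # bandwidth; 1.0 if all neighbours coincide
        sw = swu = swm = swuu = swum = 0.0
        for d, u, m in nearest:
            w = (1.0 - (d / h) ** 3) ** 3
            sw += w
            swu += w * u
            swm += w * m
            swuu += w * u * u
            swum += w * u * m
        denom = sw * swuu - swu * swu
        if denom <= 1e-12:
            fitted.append(swm / sw)  # degenerate case: weighted mean
        else:
            slope = (sw * swum - swu * swm) / denom
            fitted.append((swm - slope * swu) / sw)  # fitted value at a0
    return fitted


def normalize_chip(spots, frac=0.6):
    """Subtract the fitted trend from each log ratio, separately per pin.

    spots: list of (pin, average_log_intensity, log_ratio) tuples.
    Returns the normalized log ratios in the input order.
    """
    by_pin = defaultdict(list)
    for idx, (pin, a, m) in enumerate(spots):
        by_pin[pin].append((idx, a, m))
    normalized = [0.0] * len(spots)
    for members in by_pin.values():
        trend = local_linear_fit([a for _, a, _ in members],
                                 [m for _, _, m in members], frac)
        for (idx, _, m), fit in zip(members, trend):
            normalized[idx] = m - fit
    return normalized
```

Fitting each pin separately means that a pin-specific offset or slope in the log ratios is removed together with the intensity-dependent trend.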

Per-gene normalization was performed by centering the median log ratio of gene expression to 0.0. The median rather than the mean was chosen as a more natural and robust metric, considering the sufficient number of unrelated samples present in the experiment. Such normalization was necessary in order to remove distortions introduced by a common control, since comparison to the common control by itself was not informative in our case. The expression ratios obtained after per-chip normalization may be thought of as gene expression levels shown in common control expression units (CCU) rather than in concentration units. Adjustment of the median log ratio to zero (ratio to 1.0) eliminated CCU and used the “usual” median expression of the gene as the base unit.
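Median centering is simple enough to show directly. The sketch below assumes that each gene is represented by a list of log2 ratios, one per hybridization; the function name is hypothetical.

```python
from statistics import median


def center_gene(log_ratios):
    """Shift one gene's log ratios so that their median becomes 0.0."""
    med = median(log_ratios)
    return [value - med for value in log_ratios]
```

For log2 ratios of 1.0, 3.0, 2.0 and 10.0 the median is 2.5, so the centered values are -1.5, 0.5, -0.5 and 7.5; the outlying sample does not shift the baseline as it would with mean centering.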

3.5.3 Cluster analysis

No gene filtering was performed before clustering. Genes and experiments were clustered and the resulting trees drawn using the de facto “industry standard” programs (Cluster and TreeView) by Dr. Michael Eisen (Stanford University, http://rana.lbl.gov/EisenSoftware.htm) [213]. In each case, hierarchical, average linkage clustering was done using centered (Pearson) correlation as the distance metric.
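To make the clustering step concrete, the following sketch shows a naive average linkage procedure with 1 minus the centered Pearson correlation as the distance. It only illustrates the technique under stated assumptions (pairwise-average, UPGMA-style linkage; hypothetical function names); it is not the Cluster program and may differ from it in how cluster-to-cluster distances are computed.

```python
from math import sqrt


def pearson_distance(x, y):
    """Distance = 1 - centered Pearson correlation of two profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 1.0  # flat profile: treated as uncorrelated (an assumption)
    return 1.0 - sxy / sqrt(sxx * syy)


def average_linkage(profiles):
    """Agglomerative clustering; returns merges as (members_a, members_b, distance).

    profiles maps a gene (or sample) name to its list of log ratios.
    """
    names = list(profiles)
    dist = {(a, b): pearson_distance(profiles[a], profiles[b])
            for a in names for b in names if a != b}
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist[(a, b)] for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)  # average of all pairwise distances
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return merges
```

Because the distance depends only on the shape of a profile, two genes with proportional expression patterns are joined early even if their absolute levels differ.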

3.5.4 The selection of candidate genes for validation

At least 1.5x upregulation in at least two human prostate-derived cell lines (in at least two replicate experiments in one cell line and at least one replicate experiment in another cell line) was set as the minimum condition for validation. To be considered as coding for a putative secreted/membrane protein, a pre-candidate gene had to be at least 1.5x upregulated in the microsomal fraction (as compared to the cytoplasmic fraction) in at least two prostate-derived cell lines.
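The minimum condition can be written as a short predicate. The sketch assumes that, for one gene, the linear expression ratios (relative to the gene's median) are grouped by prostate-derived cell line; the function name and input layout are hypothetical.

```python
def passes_validation_threshold(replicate_ratios, fold=1.5):
    """Check the minimum condition for one gene.

    replicate_ratios maps each prostate-derived cell line to the list of
    expression ratios from its replicate hybridizations. The gene passes if
    one line has at least two replicates at or above `fold` and a different
    line has at least one such replicate.
    """
    hits = sorted((sum(ratio >= fold for ratio in ratios)
                   for ratios in replicate_ratios.values()), reverse=True)
    return len(hits) >= 2 and hits[0] >= 2 and hits[1] >= 1
```

A gene upregulated in both LNCaP replicates and in one C4-2 replicate would pass, whereas a gene upregulated in only one replicate of each line would not.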

4. Results

4.1 Cluster analysis of expression across samples

In Figure 6, we present a cluster tree of the selected candidate genes as well as some of the known cancer markers that passed our selection criteria. The arrangement of 26 human cell lines in the tree is largely consistent with their biological origins (despite the small number and non-randomness of genes on the array), and all replicate experiments of identical origin are joined together in the lowest-level clusters (Figure 6). Four major experiment groups are observed: melanoma cell lines (all except MEL-6); glioma/glioblastoma cell lines (with the addition of the MEL-6 line); “AR-positive Prostate Cancer” (LNCaP, C4-2 and CWR22R, with the addition of LIM1215 and the Common Control, a mixture of BPH-derived, brain and placenta mRNA); and “Other Tumors” (all other cell lines, including the AR-negative prostate cancer cell lines DU145 and PC3, but excluding SAOS2 and H1299, which are joined with the melanoma cluster in a higher-level cluster). The probable reasons for the separation of PC3 and DU145 from the other prostate-derived cell lines are the full or partial absence of androgen dependence and androgen response, as well as an overall higher degree of tumor progression of their respective malignancies of origin. In contrast, LNCaP, C4-2, and CWR22R possess at least some degree of androgen sensitivity and the ability to express PSA [190, 214-216].

4.2 Cluster analysis of expression across genes

There are several major gene clusters of interest that contain genes with mostly prostate-specific expression (Figure 6; only genes mentioned in the text are shown). The largest prostate-specific cluster contains known prostate- and prostate cancer-specific genes: PSA, PSMA, kallikrein 2, AMACR and NKX3-1 [149]. More divergent members of this cluster, namely KIAA1181, MGC13170, MAL2, FADS2, and DSC96, possess a similar prostate-specific pattern of expression yet have not been studied in detail as prostate- or cancer-specific genes or markers.


Figure 6. Expression of candidate and known prostate markers in the panel of cell lines.

NOTE: UniGene symbols are shown; “*” denotes unofficial symbols; “+” designates chosen candidate ESTs. Log2 relative expression is shown; average linkage hierarchical clustering was performed using uncentered Pearson correlation.


4.3 Selection of candidates for validation of expression profiles

At least 1.5x upregulation (as compared to median expression) in at least two human prostate-derived cell lines (in at least two replicate experiments in one cell line and at least one replicate experiment in another cell line) was set as the minimal condition for validation. Overall, 90 (20%) potential candidate genes exhibited expression patterns that complied with our selection criteria. Out of those, we selected 14 genes (Table XI) with strong relative expression in 2 or more prostate cancer cell lines and little or no expression in other cell lines. Some candidates are co-regulated with known markers such as PSA and PSMA, while others display novel expression patterns. One of the selected genes, MIA (melanoma inhibitory activity), forms a primarily melanoma-specific cluster of its own. MIA is an established tumor marker with commercial assay kits available, but it has not been proposed for use in prostate cancer diagnostics.

5. Discussion

We applied the microarray hybridization approach to 579 PCa marker pre-candidate genes. We analyzed their expression patterns across a wide spectrum of cell lines and thus were able to significantly narrow down our list, while still being able to retrieve the rest of the pre-candidates for future work. Major advantages of cell lines over tissue specimens are their capacity for almost unlimited expansion and their widespread availability. Microarray hybridization with RNA probes from the cell line collection yielded reliable and reproducible gene expression patterns, which, due to the large number of cell lines involved, resulted in high-quality clustering data. The genes that were largely, but not exactly, co-expressed with known markers such as PSA and PSMA were especially valuable because they automatically possessed many of their benefits (such as prostate specificity), while not necessarily sharing their drawbacks (such as tight dependence of expression on androgens).

On the other hand, cells grown in culture are dramatically perturbed compared to their tissues or organs of origin; expression of many genes and whole pathways is affected [217]. In vitro studies cannot capture interactions between different cell types in the organ, e.g. prostate epithelium and prostate stroma. In studies involving PCa cell lines the stromal and smooth muscle components are missing altogether. By using RNA probes from cell lines we probably lost (temporarily) a number of potential PCa markers.

However, this approach was designed to be imperfect. The Prostate Marker Array selection allowed us to prioritize and narrow the candidate list, and the champions that passed this “unfair” selection are very interesting to us. Also, none of our top 14 candidates had previously been proposed as a potential prostate cancer marker.

All final candidates are a result of in silico profiling. While this is discouraging for the TSG inactivation approach, that approach should not be disregarded as a whole. First of all, the Prostate Marker Array contained a much smaller proportion of genes from this branch of our project (only 22.6%). Second, since the study involved knockouts, we could not use probes with human RNA. Mouse cancer models were our best alternative, and we did extensive work (a small study in itself) finding the closest human paralogs and orthologs for the mouse pre-candidates, which culminated in the design of the BI-HUMINIZER software program (Figure 4). However, this effort was hindered by the fact that the mouse prostate is distinct from the human prostate in several ways. The prostate is evolutionarily a young organ and is not vital for organism survival; it is present in all mammals. Tissue-specific gene regulation might therefore not be as strictly evolutionarily conserved as in other organs (e.g., mice lack PSA/KLK3 [58]).

6. Conclusions

The results of Prostate Marker Array hybridizations with human cell line probes allowed us to validate a large number of our pre-candidate genes as prostate-specific, and to narrow the candidate list to a manageable size (14 genes; see Table XI). The genes that are largely, but not exactly, co-expressed with known markers (such as PSA) are especially valuable, because they automatically possess many of their benefits (such as prostate specificity) and may not share their drawbacks.


TABLE XI

THE CANDIDATE GENES

Gene Symbol | Gene Description (Homo sapiens) | Accession | Origin
ACBD3/GOCAP1 | acyl-Coenzyme A binding domain containing 3 | NM_022735 | EST mining
ALDH6A1 | aldehyde dehydrogenase 6 family, member A1 | NM_005589 | EST mining
ARHGAP15 | Rho GTPase activating protein 15 | NM_018460 | EST mining
CFD/DF | D component of complement (adipsin) | NM_001928 | MusAffy_p53KO, MusAffy_TRAMP
DSC96 | mesenchymal stem cell protein DSC96 | AF242771 | EST mining
KIAA1181/ERGIC32/ERGIC1 | endoplasmic reticulum-golgi intermediate compartment (ERGIC) 1, transcript variant 2 | NM_020462 | EST mining
FADS2 | fatty acid desaturase 2 | NM_004265 | MusAffy_TRAMP
FLJ30428 | similar to murine Mlstd1 (male sterility domain containing 1) | AK054990 | EST mining
MAL2 | mal, T-cell differentiation protein 2 | NM_052886 | EST mining
MGC13170 | multidrug resistance-related protein | NM_199249, NM_199250 | EST mining
MIA | melanoma inhibitory activity | NM_006533 | MusAffy_p53KO, MusAffy_TRAMP
MYST4/MORF | MYST histone acetyltransferase (monocytic leukemia) 4 | NM_012330 | EST mining
TPT1 | tumor protein, translationally-controlled 1, fortilin | NM_003295 | EST mining
UAP1 | UDP-N-acetylglucosamine pyrophosphorylase 1 | NM_003115 | EST mining


VI. VALIDATION OF THE CANDIDATES’ EXPRESSION PATTERNS

1. Rationale

As a result of custom cDNA microarray analysis we generated a much smaller list of candidate marker genes. In order to evaluate their value as potential prostate cancer markers, we investigated their expression patterns.

2. Introduction

Currently, the most common techniques for quantifying mRNA levels are RT-PCR, Northern blotting, and the RNase protection assay. Real-time PCR is a recently developed refinement of RT-PCR that yields more quantitative and precise data. At this step of our investigation we used RT-PCR and Northern blot analysis to validate the gene expression patterns of our candidates in cell lines. To investigate their expression in human tissues, we applied Northern blotting and real-time PCR.

3. Semi-quantitative RT-PCR

3.1 Materials and Methods

Representative mRNA (or the longest sequenced EST) was found for every transcript using the UniGene database. For primer design we developed primer IO/selection software, which adapts the Primer3 program [218] to select the two best-scoring non-overlapping RT-PCR primer pairs that give a product of the desired length (150-300 bp), are free of Alu and L1 repeats, and have reduced dependence on a single priming site, so that primers may be recombined if needed (Table XII). The primers were specifically designed to span exon-exon junctions.

The RT-PCR reactions were performed using Taq polymerase (Invitrogen) according to the manufacturer’s protocol. Template cDNA was synthesized with SuperScript II reverse transcriptase (Invitrogen) using total RNA of the cell line collection (described above). The human G3PDH Control Amplimer Set of primers (BD Biosciences Clontech) was used as a control of cDNA quality. The number of cycles was determined empirically to find the linear range of amplification. Every RT-PCR reaction was repeated at least twice.

3.2 Results

The RT-PCR results allowed us to investigate the expression patterns of candidate genes. The primers, which were specifically designed to span different exon-exon junctions, highlighted potential tissue-specific splicing events, as can be seen in 4 genes out of 13, namely KIAA1181, MIA, GOCAP1, and ARHGAP15 (Figure 7). Other genes might undergo tissue-specific splicing as well, even if not seen here, because only two pairs of primers spanning arbitrarily chosen exon-exon junctions were tested.

TABLE XII

RT-PCR PRIMERS AND NORTHERN PROBES FOR CANDIDATE GENES

Gene Symbol | Accession | Left primer 1 (L1) | Right primer 1 (R1) | Left primer 2 (L2) | Right primer 2 (R2) | Northern Probe
ACBD3/GOCAP1 | NM_022735 | CAAATTCTCATCCgCCAgTT | gCAgCTTCTggTTCCAgTTC | gTCAgTgAgTCCAgCgATgA | ATgTCATCTTCTgCCCAACC | L2R2
ALDH6A1 | NM_005589 | ATggAACTgCCATCTTCACC | TTTCTAACggCCCATggTAg | TCCAAggTgAAATCCAgTCC | TTgATAgCggAgCAAgACCT | L1R1
ARHGAP15 | NM_018460 | ATgACAggCTCAgCCAAAgT | ACAgAgCCTgTTgCTTggAT | TggCTCTCATCTgCACAAAg | ACTggCTgTCgTCCAAATTC | L2R2
CFD/DF | NM_001928 | CTTgATgTgCgCggAgAg | gACCAACCAgATgCAggAgT | gCACgTgCTCTTgCCAgT | CTTgggTgACCCTgACCTT | L1R1
DSC96 | AF242771 | gCACACCTTTCCCTCTgTTC | ggAAAgTCCACACCCAgCTA | - | - | L1R1
FADS2 | NM_004265 | ACCTgCCCTACAATCACCAg | AggTgATgAAgAACCggATg | ggAgCAgTCCTTCTTCAACg | CCATgCTTggCACATAgAgA | L1R1
FLJ30428 | AK054990 | TAACCTCAATCCCTgCAACC | CCTTCCAATgAgCCACAgAT | ATCTgTggCTCATTggAAgg | TgCgTTCCCTCTgATTTCTT | L2R2
KIAA1181/ERGIC32/ERGIC1 | NM_020462 | gggTAgAAATTgggTgCTgA | CTgCCCATCTCATCCTgAAT | ATgACCCAgACAAggACAgC | CgggATCTTCATggAgTTgT | L1R1
MAL2 | NM_052886 | TTTCTgggCATgTTCCTCTC | gCCCggTTATggTTgTATTg | ATCCCTgCATgATTTgCATT | TCgTAAAgCCAgACCCAAAC | L1R1
MGC13170 | NM_032712 | CCTggCTCTCAAggACAgAC | AgCTCCACAgggAgggTAgT | ACATgCCTgTTACCCAggTC | TAggCCTggTCACAATAggg | L2R2
MIA | NM_006533 | CTCTTgCTCACAgTCCACgA | AggTTTCAgggTCTggTCCT | ACCCTATCTCCATggCTgTg | TTTgCACTgggCTgATTgTA | L1R1
MYST4/MORF | NM_012330 | gACCgCAgTACAgggTCAAT | gCCACAATCTgCACAAgAgA | AAAgggCTgTCATCTggTTg | TgCTAATTgCCTTgATgCTg | L1R1
UAP1 | NM_003115 | gAggCATTTggAgCATTCAT | gAAATggTTggCAATgTTCC | AACCAgTTggAgTggTTTgC | ATTCCATTgggTTTgTCTgg | L1R1

[Figure 7 image: RT-PCR panels across the cell line collection. Rows: ALDH6A1 L1R1; DSC96 L1R1; FADS2 L1R1; GOCAP1 L1R1; GOCAP1 L2R2; KIAA1181 L1R1; KIAA1181 L2R2; MIA L1R1; MIA L2R2; MYST4 L1R1; ARHGAP15 L1R1; ARHGAP15 L2R2; CFD L1R1; FLJ30428 L2R2; MAL2 L1R1; MGC13170 L2R2; UAP1 L1R1; PSA; hGAPDH.]

Figure 7. Candidates’ expression profiles in the cell line collection by RT-PCR. L1R1, L2R2: primer pairs flanking different introns of the same gene, shown separately where expression patterns differ; otherwise the typical expression pattern is shown.


4. Northern blot analysis

In order to investigate our candidates’ tumor specificity and tissue specificity, we performed Northern blot analysis for 11 candidates (all except CFD, UAP1, and TPT1) in the cell line collection and in a set of normal tissues.

4.1 Materials and Methods

Northern blot analysis was performed by resolving RNA samples [either total RNA of the cell line collection (described above) or a set of poly-A mRNAs from human normal tissues (obtained from the Cardiology Research Center at the Russian Academy of Medical Sciences)] on a 1% agarose formaldehyde gel and blotting onto Hybond-N nylon membrane as recommended by the manufacturer (Amersham). All hybridizations were repeated at least twice on independently prepared membranes.

For probes we used cDNA from prostate tumors as a template in PCR reactions with selected primer pairs (Table XII), cloned the products into pCRII or pCR4Blunt-TOPO vectors (Invitrogen) and verified them by sequencing. Probes were synthesized in a PCR reaction on sequence-verified clones of candidates and labeled with α-32P-dCTP (3000 Ci/mmol, Perkin Elmer) using the Megaprime DNA labeling system (Amersham Biosciences Corp., Piscataway, NJ) and a random priming protocol.

4.2 Results

The Northern blotting results in cell lines were consistent with RT-PCR, though Northern blots appeared to be less sensitive (Figure 8). Among normal tissues the expression of our candidates was not as strictly prostate tissue-specific as that of PSA (Figure 9). However, at least several displayed good tissue and/or tumor specificity. At this time, we were primarily interested in the candidates that would be expressed mainly in prostate tumor-derived cell lines and would either not be expressed in normal nonprostatic tissues or be expressed at a low level. Also, secreted or membrane-bound proteins were preferred. According to these criteria, the four most promising candidates were selected for advanced validation: MAL2 and MGC13170 (as top candidates), KIAA1181 (for an extremely prostate-specific splice isoform found by RT-PCR), and FLJ30428 (for its high prostate specificity shown by Northern blotting in normal tissues). The remaining candidates (MIA, FADS2, GOCAP1, MYST4, ARHGAP15, TPT1, UAP1 and CFD) possess comparably attractive expression features and will be revisited in the future.

[Figure 8 image: Northern blot panels across the cell line collection. Rows: ALDH6A1; DSC96; FADS2; GOCAP1; KIAA1181; MIA; MYST4; ARHGAP15; FLJ30428; MAL2; MGC13170; PSA; PSMA; hGAPDH.]

Figure 8. Candidates’ expression profiles in the cell line collection, detected by Northern blots.


[Figure 9 image: Northern blot panels across human normal tissues. Rows: ALDH6A1; DSC96; FADS2; GOCAP1; KIAA1181; MIA; MYST4; ARHGAP15; FLJ30428; MAL2; MGC13170; PSA; PSMA; hGAPDH.]

Figure 9. Candidates’ expression profiles in human normal tissues, detected by Northern blots.


5. Conclusions

We have investigated the expression properties of our candidate genes by RT-PCR and Northern blotting in a diverse panel of cell lines and a set of normal tissues. As a result, we narrowed our search to the four most prostate cancer-specific genes, which we selected as our advanced candidates: MAL2, MGC13170, KIAA1181, and FLJ30428. As positive controls we chose the established prostate cancer markers PSA and PSMA, which would be investigated in parallel with our 4 advanced candidates.

VII. EXPRESSION OF ADVANCED CANDIDATES IN CLINICAL SAMPLES

1. Rationale

In order to evaluate the performance of the prostate cancer marker candidates in clinical samples, we investigated their expression properties in the human prostate tissue samples using real-time PCR for quantitative as well as qualitative results.

Androgen-dependence is one of the qualities that underlies both positive (tissue specificity) and negative (disappearance in advanced cancers and after hormonal therapy) features of PSA. Accordingly, we investigated androgen-dependence of advanced candidate gene expression in prostate cancer cell lines.

2. Introduction

Real-time PCR is a recently developed laboratory technique that allows relatively accurate, reliable and reproducible quantitative measurement of specific gene sequences [219]. Its advantage over a regular PCR reaction is that the quantitative measurements of a PCR product are taken every cycle, which eliminates the need for additional processing steps required for visualization of the PCR product (such as running agarose gels) and yields a higher dynamic range of measurements.

3. Materials and Methods

3.1 Human tissue samples

We obtained the frozen prostate tissue samples from the Pathology Department at the Cleveland Clinic Foundation. They included 30 pairs of tumor/normal morphology biopsy cores from prostate cancer patients and 12 benign prostate samples from patients with a negative prostate cancer diagnosis. An additional set of 10 normal nonprostatic tissue samples (liver, thyroid, 2 samples of heart, lymph node, lung, esophagus, 2 samples of colon, and stomach) was obtained from the Cardiology Research Center at the Russian Academy of Medical Sciences.

3.2 Real-time PCR

We performed the reactions in 96-well format on the ABI PRISM® 7700 Sequence Detection System (Applied Biosystems) using the TaqMan EZ RT-PCR core reagents kit (Applied Biosystems) according to the manufacturer’s protocol. We purchased TaqMan Gene Expression Assays (Applied Biosystems) and used them as probes (Table XIII). The probe for FLJ30428 was custom-made (reporter CCTGGAATCACATTTTGC, forward primer CACCGACCTAAGTCAATGTTAGTCT, and reverse primer CACCGACCTAAGTCAATGTTAGTCT). We used beta-2-microglobulin (B2M) as the endogenous control, which proved to be the most suitable after comparison to GAPDH and 18S [220]. In addition, an increase in GAPDH expression was reported to correlate with the pathological stage of prostate cancer, which makes GAPDH unsuitable as a loading control in gene expression analysis [221]. Relative mRNA abundance was calculated using the comparative ∆Ct method. Total RNA isolated from the CWR22R prostate cancer cell line was chosen as a calibrator, because as a cell line it provides a relatively unlimited source of standard template, and it is the only cell line that shows detectable expression of all genes tested. The androgen-response study was performed likewise.
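The comparative Ct calculation with an endogenous control and a calibrator can be sketched as follows. The sketch assumes the commonly used form of the method, in which abundance relative to the calibrator is 2 raised to the negative difference between the sample ∆Ct and the calibrator ∆Ct, and it assumes 100% amplification efficiency; the function name and argument names are hypothetical.

```python
def relative_abundance(ct_target, ct_control, ct_target_cal, ct_control_cal):
    """Relative mRNA abundance of a target gene versus the calibrator sample.

    ct_target, ct_control: threshold cycles of the target gene and of the
    endogenous control (e.g. B2M) in the sample of interest.
    ct_target_cal, ct_control_cal: the same two values in the calibrator
    (e.g. CWR22R total RNA). Assumes doubling of product in every cycle.
    """
    delta_sample = ct_target - ct_control
    delta_calibrator = ct_target_cal - ct_control_cal
    return 2.0 ** -(delta_sample - delta_calibrator)
```

With illustrative Ct values of 24 (target) and 18 (control) in a sample and 26 and 18 in the calibrator, the sample ∆Ct is 6 and the calibrator ∆Ct is 8, which corresponds to a 4-fold higher relative abundance in the sample; the calibrator itself always evaluates to 1.0.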

The study with a large set of normal and malignant tissue samples was performed at the Gene Expression and Genotyping Facility (GEGF) of Case Western Reserve University on an ABI PRISM® 7900HT Sequence Detection System in 384-well format using the same TaqMan Gene Expression Assays. 18S rRNA was used as the endogenous control on the recommendation of the GEGF.

3.3 Androgen-responsiveness of candidate gene expression

As a control, cells were grown in 10% FBS. Androgen withdrawal was simulated by culturing cells in 10% charcoal-stripped serum (CSS) for 7 days, after which we added 0.1 nM dihydrotestosterone (DHT) to the CSS-supplemented medium for 24 hours. Total RNA from treated cells was then used in real-time PCR experiments as described above.


TABLE XIII

TAQMAN GENE EXPRESSION ASSAYS USED IN REAL-TIME PCR STUDY

| Gene Symbol | Assay ID | Designation | Description |
|---|---|---|---|
| B2M | Hs99999907_m1 | endogenous control | beta-2-microglobulin |
| 18S | Hs99999901_s1 | endogenous control | eukaryotic 18S rRNA |
| KLK3/PSA | Hs00426859_g1 | PCa marker | kallikrein 3 (prostate specific antigen) |
| FOLH1/PSMA | Hs00379515_m1 | PCa marker | folate hydrolase (prostate-specific membrane antigen) 1 |
| KIAA1181/ERGIC-1 | Hs00393279_m1 | candidate marker | endoplasmic reticulum-golgi intermediate compartment 32 kDa protein |
| MAL2 | Hs00294541_m1 | candidate marker | mal, T-cell differentiation protein 2 |
| MGC13170 | Hs00364147_m1 | candidate marker | multidrug resistance-related protein |
| FLJ30428 | AK054990 (custom) | candidate marker | hypothetical protein |


4. Results

4.1 Real-time PCR in the cell line collection

When the real-time PCR technique was first introduced in our lab, we used total RNA isolated from the cell line collection (Table X) as a template, which allowed us to refine the technique and at the same time to confirm the gene expression levels of the candidates in cell lines (Figure 10).

One of the challenges was to choose a standard template, which would have relative expression level “1-fold” (“0” on log2 scale). Most of the traditional sources were not suitable, because PSA and other prostate-specific genes would not be expressed. We decided to choose a prostate cancer cell line because it would provide a relatively unlimited source of template RNA; CWR22R was the only cell line that showed detectable expression of all genes tested in this study.

4.2 Real-time PCR in tissue samples

We investigated the expression of our candidates, as well as PSA and PSMA, in a set of 10 normal tissues and in prostate tissue samples obtained from the CCF Pathology Department. The prostate set contained 8 pairs of prostate cancer (HPCa) and matched normal (HPNm) tissue samples (each pair of HPCa# and HPNm# with the same number came from the same patient), as well as 6 frozen tissue samples from cancer-free prostates (designated “benign prostates”, BP01-06, to distinguish them from “normal tissue” in cancerous prostates).

Even on such a relatively small set of samples the results were very encouraging (Figure 11). Compared to PSA and PSMA, our candidates distinguished prostate cancer from benign prostate very well. Contrary to our initial expectations, none of the prostate cancer markers (prospective or established) was capable of distinguishing cancerous (HPCa) from “normal” (HPNm) tissue within the same prostate. This could be explained by the multifocal nature of prostate cancer: there is a high chance that a “normal” (HPNm) sample comes from the close vicinity of a cancer lesion.

[Figure 10 chart: “Candidate genes expression in cell lines: Log2 relative expression”. Series: PSA, PSMA, MAL2, MGC13170, KIAA1181, FLJ30428. X-axis: cell lines.]

Figure 10. Relative expression of candidate genes in cell lines.

NOTE: Log2 relative expression is shown; normalized to the gene expression in CWR22R.

[Figure 11a panels: PSA; PSMA. Y-axis: log2 relative expression. X-axis: tissue samples.]

Figure 11a. Relative expression of candidate genes in prostate tissue samples

NOTE: Log2 relative expression is shown; normalized to gene expression in CWR22R prostate cancer cell line; the error bars represent 95% CI (1.96*standard error).

[Figure 11b panels: MAL2; MGC13170. Y-axis: log2 relative expression. X-axis: tissue samples.]

Figure 11b. Relative expression of candidate genes in prostate tissue samples

NOTE: Log2 relative expression is shown; normalized to gene expression in CWR22R prostate cancer cell line; the error bars represent 95% CI (1.96*standard error).

[Figure 11c panels: KIAA1181; FLJ30428. Y-axis: log2 relative expression. X-axis: tissue samples.]

Figure 11c. Relative expression of candidate genes in prostate tissue samples

NOTE: Log2 relative expression is shown; normalized to gene expression in CWR22R prostate cancer cell line; the error bars represent 95% CI (1.96*standard error).
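The error bars in Figures 11a-c are defined as 1.96 standard errors around the mean. A minimal sketch of that calculation is given below, assuming that each bar summarizes replicate log2 values and that the sample standard deviation (n-1 denominator) is used; the function name and the replicate values are hypothetical.

```python
import math
import statistics


def mean_with_ci95(values):
    """Return (mean, lower, upper) using mean +/- 1.96 * standard error.

    The standard error is the sample standard deviation divided by sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are needed")
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    half_width = 1.96 * standard_error
    return mean, mean - half_width, mean + half_width


# Hypothetical triplicate log2 relative expression values for one sample.
mean, lower, upper = mean_with_ci95([1.0, 2.0, 3.0])
```

The factor 1.96 comes from the normal approximation; with only a few replicates a t-based interval would be wider.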


4.3 Androgen-dependence of the advanced candidates

To evaluate the possible androgen regulation of candidate gene expression, we exposed the androgen-dependent prostate cancer cell line LNCaP to androgen withdrawal and measured the candidates’ expression levels by real-time PCR. We conducted a parallel study in the CWR22R cell line, which grows in an androgen-independent manner due to a mutation in the ligand-binding domain of its androgen receptor. This was done in view of the reported separability of androgen dependence of growth and androgen dependence of prostate marker expression [222].

The standard medium is supplemented with 10% fetal bovine serum (FBS), which contains physiological levels of steroids, including androgens. Charcoal-stripped serum (CSS) is treated with charcoal to remove steroids. We applied CSS-supplemented medium to cells to cause androgen withdrawal; after 7 days of treatment we added dihydrotestosterone (DHT) to the CSS-supplemented medium for 24 hours as a positive control.

The data were normalized to gene expression levels in FBS-supplemented (control) medium, so the graph presents only the changes in expression that result from treatment (Figure 12). In the LNCaP cell line, MGC13170 had a PSA-like, androgen-dependent expression pattern. MAL2 and FLJ30428 showed PSMA-like, inversely androgen-dependent expression, while KIAA1181 did not seem to respond to androgens at all. The androgen-independent cell line CWR22R, which still retains its androgen receptor (though mutated), showed more complicated results. Interestingly, all genes changed their expression in response to androgen withdrawal and subsequent reconstitution, even KIAA1181, which did not respond in LNCaP. MGC13170 had opposite profiles of androgen dependency in the LNCaP and CWR22R cell lines. These findings could be explained by indirect androgen action, i.e. action not mediated by the androgen receptor [223]. The regulation of these genes is clearly complex and warrants further investigation.
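The normalization to the FBS control used for Figure 12 amounts to a subtraction on the log2 scale. The sketch below illustrates it under the assumption that each condition is already expressed as a log2 relative expression value; the function name, the dictionary keys, and the numbers are hypothetical.

```python
def changes_relative_to_control(log2_by_condition, control="FBS"):
    """Subtract the control value so that only treatment effects remain.

    `log2_by_condition` maps a condition name to a log2 relative
    expression value for a single gene in a single cell line.
    """
    baseline = log2_by_condition[control]
    return {
        condition: value - baseline
        for condition, value in log2_by_condition.items()
        if condition != control
    }


# Hypothetical values for one gene: expression falls after androgen
# withdrawal (CSS) and partly recovers after DHT is added back.
example_changes = changes_relative_to_control(
    {"FBS": -1.0, "CSS": -3.0, "DHT": -1.5}
)
```

A negative CSS value that returns toward zero with DHT would correspond to the PSA-like pattern described above, and a positive CSS value to the PSMA-like pattern.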

[Figure 12 panels: A. “Androgen-responsiveness in LNCaP”; B. “Androgen-responsiveness in CWR22R”. Y-axis: log2 relative expression. Series: CSS, DHT. Genes: PSA, PSMA, KIAA1181, FLJ30428, MAL2, MGC13170.]

Figure 12. The changes in relative expression levels of candidate genes upon androgen withdrawal (CSS) and reconstitution (DHT). A. LNCaP prostate cancer cell line. B. CWR22R prostate cancer cell line.

NOTE: Log2 relative expression is shown; the error bars represent 95% CI (1.96*standard error).


4.4 Expanded Real-Time PCR study

Once we obtained additional prostate samples from the CCF Pathology Department to supplement the primary set of 8 matched pairs of cancerous prostate samples and 6 benign (cancer-free) prostates, our study required expansion. The CCF Gene Expression Core Facility lacked an appropriate high-throughput instrument (its maximum capacity was the 96-well format); therefore, we transferred our study to the Gene Expression and Genotyping Facility (GEGF) at the Cancer Center of Case Western Reserve University. Our purpose was to standardize this study as much as possible in order to ease the transition to clinical trials.

The GEGF performed a standard archival cDNA synthesis reaction across the samples. In our CCF-based real-time PCR studies we used beta-2-microglobulin (B2M), because it outperformed GAPDH and 18S when tested in a number of cell lines of various origins. However, before conducting the larger study, we performed a comprehensive evaluation of B2M, 18S, GUSB, and HPRT1 as the best housekeeping gene candidates [220]. The top two candidates were B2M and 18S, with the latter performing slightly better. The GEGF recommended using 18S as the endogenous control in this particular study, with a greater dilution of the archival cDNA template (a factor of 1000, whereas a dilution factor of 10 is routinely used in reactions with target genes).
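The effect of the different template dilutions on Ct values can be estimated with a short calculation. The sketch below assumes perfect doubling per cycle unless another efficiency is supplied, and it assumes that the same dilution scheme is applied to every sample and to the calibrator; the function names and Ct values are hypothetical.

```python
import math


def ct_shift_from_dilution(dilution_control, dilution_target, efficiency=1.0):
    """Extra cycles needed by the more dilute endogenous control reaction.

    efficiency=1.0 means the product doubles every cycle (an assumption).
    """
    if dilution_control <= 0 or dilution_target <= 0:
        raise ValueError("dilution factors must be positive")
    return math.log(dilution_control / dilution_target, 1.0 + efficiency)


def delta_delta_ct(ct_target_sample, ct_control_sample,
                   ct_target_calibrator, ct_control_calibrator):
    """Delta-delta-Ct of a sample relative to the calibrator."""
    return ((ct_target_sample - ct_control_sample)
            - (ct_target_calibrator - ct_control_calibrator))


# 18S template diluted 1000-fold, target gene template diluted 10-fold.
shift = ct_shift_from_dilution(1000, 10)  # log2(100), about 6.64 cycles

# Hypothetical Ct values: the constant shift cancels in delta-delta-Ct.
raw = delta_delta_ct(24.0, 17.0, 26.0, 17.5)
corrected = delta_delta_ct(24.0, 17.0 - shift, 26.0, 17.5 - shift)
```

Because the shift is the same for the sample and the calibrator, it does not change the relative expression as long as the dilutions are kept constant across the whole study.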

The results are shown in Figure 13 as a vertical scatter plot; the statistical analysis of these data is summarized in Table XIV. The gene expression values were normalized to the average expression of the gene in benign (cancer-free) prostates. Just as in the preliminary real-time PCR-based study, none of the marker genes could reliably distinguish between cancerous (HPCa) and “normal” (HPNm) prostate samples. However, the novel candidate markers could distinguish between benign (cancer-free) prostates and cancer-containing prostates.

The cluster analysis based on expression correlations (Figure 14) showed that most of the benign prostates clustered together. Only some HPCa/HPNm pairs actually clustered as pairs. The correlations of expression among genes are summarized in Table XV. It is encouraging that none of the candidate genes had a correlation higher than 85% with the established markers PSA or PSMA, which gives hope that the candidates will successfully supplement PSA and PSMA to improve the diagnosis of prostate cancer.

[Figure 13 panels: PSA; PSMA; KIAA1181; FLJ30428; MAL2; MGC13170. Y-axis: log2 relative expression. X-axis: sample groups.]

Figure 13. Expression levels of candidates, PSA and PSMA in human tissue samples.

NOTE: Vertical scatter plots; log2 relative expression is shown; normalized to average expression in benign prostate; the red marks are median expression values within each set of samples.


TABLE XIV

DIFFERENTIATION AMONG SAMPLE GROUPS BY PSA, PSMA AND FOUR CANDIDATE MARKERS

A. T-test P-values (<=0.05 good; >0.1 non-significant)

| Comparison | PSA | PSMA | MAL2 | KIAA1181 | MGC13170 | FLJ30428 |
|---|---|---|---|---|---|---|
| HPCa/Benign Prostate | 0.091 | 0.001 | 0.003 | 0.042 | 0.071 | 0.384 |
| HPNm/Benign Prostate | 0.071 | 0.003 | 0.003 | 0.032 | 0.157 | 0.177 |
| HPAll/Benign Prostate | 0.077 | 0.002 | 0.003 | 0.035 | 0.103 | 0.253 |
| HPCa/HPNm | 0.615 | 0.252 | 0.921 | 0.672 | 0.278 | 0.222 |
| HPCa/Organs | 0 | 0 | 0.098 | 0.01 | 0.002 | 0 |

B. Folds of ratio of relative expression (geometric mean of relative expression values per group is used)

| Comparison | PSA | PSMA | MAL2 | KIAA1181 | MGC13170 | FLJ30428 |
|---|---|---|---|---|---|---|
| HPCa/Benign Prostate | 4.3 | 9.2 | 7.1 | 4.1 | 3.7 | 1.8 |
| HPNm/Benign Prostate | 4.9 | 6.9 | 7.2 | 4.5 | 2.8 | 2.6 |
| HPAll/Benign Prostate | 4.6 | 8 | 7.2 | 4.3 | 3.2 | 2.1 |
| HPCa/HPNm | 0.9 | 1.3 | 1 | 0.9 | 1.3 | 0.7 |
| HPCa/Organs | 33782.5 | 195.4 | 4.4 | 3.8 | 6 | 32.5 |

| Group | N | Description |
|---|---|---|
| HPCa | 30 | Cancerous morphology sample from prostate cancer-positive patient (paired normal-malignant core set) |
| HPNm | 30 | Normal morphology sample from prostate cancer-positive patient (paired normal-malignant core set) |
| HPAll | 60 | Samples from prostate cancer-positive patients (both normal and malignant cores) |
| Benign Prostate | 12 | Normal morphology sample from prostate cancer-free patient |
| Organs | 10 | Normal non-prostate human organs |
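The fold ratios in part B of Table XIV are ratios of geometric means. On the log2 scale a geometric mean is simply the arithmetic mean of the log2 values, which the following sketch illustrates. It assumes that the inputs are log2 relative expression values; the function names and the example groups are hypothetical.

```python
import statistics


def geometric_mean_from_log2(log2_values):
    """Geometric mean of linear values, computed from their log2 values."""
    return 2.0 ** statistics.mean(log2_values)


def fold_ratio(log2_group_a, log2_group_b):
    """Ratio of the geometric means of two groups (group A over group B)."""
    return (geometric_mean_from_log2(log2_group_a)
            / geometric_mean_from_log2(log2_group_b))


# Hypothetical groups. Linear values 4 and 16 have geometric mean 8,
# linear values 1 and 4 have geometric mean 2, so the fold ratio is 4.
cancer_log2 = [2.0, 4.0]
benign_log2 = [0.0, 2.0]
example_ratio = fold_ratio(cancer_log2, benign_log2)
```

Using the geometric rather than the arithmetic mean keeps a single sample with very high expression from dominating the ratio.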


Figure 14. Expression profiles of four candidate markers, PSA and PSMA in human prostate samples. A. Unsorted. B. Average linkage hierarchical clustering (uncentered Pearson correlation).

NOTE: Log2 relative expression is shown; BP##: normal morphology sample from prostate cancer-free patient; HPCa and HPNm are Cancerous and Normal morphology paired samples from prostate cancer-positive patients.
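The uncentered Pearson correlation named in the caption differs from the ordinary (centered) coefficient in that the means are not subtracted before the profiles are compared. A minimal sketch, assuming the usual definition (equivalent to the cosine of the angle between two profiles), is shown below; the function names and example profiles are hypothetical, and the clustering step itself is not reproduced here.

```python
import math


def uncentered_pearson(x, y):
    """Uncentered Pearson correlation: dot product over product of norms."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a profile of all zeros has no defined correlation")
    return dot / (norm_x * norm_y)


def centered_pearson(x, y):
    """Ordinary Pearson correlation, obtained by subtracting the means first."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return uncentered_pearson([a - mean_x for a in x],
                              [b - mean_y for b in y])


# Two hypothetical profiles with the same shape but a constant offset.
profile_a = [1.0, 2.0, 3.0]
profile_b = [2.0, 3.0, 4.0]
r_centered = centered_pearson(profile_a, profile_b)      # 1 up to rounding
r_uncentered = uncentered_pearson(profile_a, profile_b)  # about 0.9926
```

For clustering, a distance such as 1 minus the correlation is typically used, so the uncentered version treats a constant offset between two profiles as a real difference.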

TABLE XV

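THE CORRELATION BETWEEN GENE EXPRESSION PROFILES BASED ON REAL-TIME PCR DATA; PROSTATE TISSUES

| | PSA | PSMA | KIAA1181 | MAL2 | MGC13170 | FLJ30428 |
|---|---|---|---|---|---|---|
| PSA | - | | | | | |
| PSMA | 0.73 | - | | | | |
| KIAA1181 | 0.85 | 0.74 | - | | | |
| MAL2 | 0.7 | 0.78 | 0.9 | - | | |
| MGC13170 | 0.8 | 0.7 | 0.95 | 0.83 | - | |
| FLJ30428 | 0.73 | 0.61 | 0.77 | 0.7 | 0.76 | - |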


5. Discussion

Currently, PSA is the only prostate cancer marker approved by the FDA for clinical use to aid in distinguishing prostate cancer from benign prostate conditions. The major drawback of PSA testing is its false-positive rate of up to 75% [64]. It is not clear whether the benefits of PSA screening outweigh the risks of follow-up diagnostic tests and cancer treatments. The PSA test also detects small cancers that would never become life-threatening (as many as 90% of all prostate cancer cases). The multiple-needle prostate biopsy (the procedure used to diagnose prostate cancer) is painful for the patient and costly, and it may cause side effects, including bleeding and infection. Overdiagnosis puts men at risk of complications from unnecessary treatment, including surgery or radiation, and creates considerable anxiety for the patient and his family. Prostate cancer treatment may cause incontinence and erectile dysfunction.

Another drawback of the PSA test is that prostate cancer is not rare among men with PSA levels generally thought to be in the normal range. In a study of men with PSA levels of 4.0 ng/mL or less (age range 62 to 91 years), prostate cancer was diagnosed by biopsy in 15.2%, including high-grade cases with a Gleason score of 7 or higher (14.9% of cancer cases) [73]. PSA expression depends tightly on androgen receptor activity [224], which is often lost in advanced prostate cancers. A drop in or loss of PSA expression can also be caused by androgen ablation therapy, which makes PSA unreliable for post-therapeutic prognosis.

The goal of our project was to search for prostate cancer markers that could supplement PSA and be used to develop novel diagnostic assays and therapies for prostate cancer, allowing more specific diagnosis and more effective treatment. We developed a novel, rational approach to cancer marker search. After several selection steps (Figure 15) we narrowed our search to four advanced candidates, whose expression patterns were investigated in human tissue samples. The ultimate value of a potential diagnostic marker is its ability to distinguish tumor from normal tissues. All four candidates were able to distinguish between cancerous and benign (cancer-free) prostates, and two of them performed just as well as the established prostate cancer markers PSA and PSMA (Figures 13 and 14; Tables XIV and XV).

None of the four advanced candidates was previously reported to be a prostate cancer marker or established as a marker for any type of cancer (Table XVI). These genes were mentioned in a few publications related to cancer markers search, but in every case our candidate was mentioned as a part of a much longer list of potential cancer markers based on their expression profiles. Nevertheless, these earlier observations support our findings.

The NCBI GEO microarray experiment repository lists all four of our candidates as present on popular microarray platforms, in particular the Affymetrix HG-U95 series. Multiple experiments involving prostate-derived samples were performed with these arrays. However, our candidate genes appear prominently prostate-specific [225] only in the GeneNote experimental series, possibly because tissues from multiple (10-25) individuals were used and men with precancerous or cancerous prostates were among them.

[Figure 15 flowchart. I. Pre-candidate selection: Affymetrix microarray (p53 wt and deficient, TSG-deficient, PTEN, TRAMP; 36,000 genes; 134 mouse genes; 118 human homologs and orthologs) and in silico expression profiling (4,000,000 ESTs; 216,000 EST cluster profiles; 203,000 mRNA-seeded profiles; 1005 profiles; 352 genes). II. Verification by microarray hybridization: array printing; cell lines cDNA (prostate/non-prostate) microarray, 530 genes; pre-candidates preferentially expressed in prostate cancer cell lines. III. Validation in clinical material: 14 candidates (primary confirmation by RT-PCR and Northern blotting; bioinformatics analysis: any known genes, similarity to known, predicted to be secreted/membrane); 4 advanced candidates (real-time PCR); antibody-based assays.]

Figure 15. A novel integrated approach to identification of prostate cancer markers. Experimental plan.


5.1 KIAA1181/ERGIC-32/ERGIC-1

The protein encoded by the KIAA1181 gene was recently identified through a proteomics approach in the HepG2 human cell line [226] and described under the name ERGIC-32, which stands for ER/Golgi Intermediate Compartment protein of 32 kDa; it was later renamed ERGIC-1. ERGIC-1 cycles between the ER, the intermediate compartment (ERGIC), and the Golgi. It is presumed to play a role in the early secretory pathway by participating in membrane traffic and selective transport of vesicular cargo. Its exact physiological function is not known.

5.2 MAL2

MAL2 is an integral membrane protein of the MAL family. It selectively resides in lipid rafts and is predominantly localized to a subapical compartment. MAL2 was recently shown to be an essential component of the indirect transcytotic route in human hepatoma HepG2 cells [227]. It is indispensable for lipid raft-mediated transport from the basolateral to the apical membrane in polarized epithelial cells [228, 229]. Interestingly, the MAL2 protein was first identified through its interaction with the androgen-responsive tumor protein D52 (TPD52) [230], which is often amplified and overexpressed in prostate and other cancers and acts as a regulator of vesicle trafficking and exocytotic secretion through its phosphorylation in response to secretory stimuli [231].

5.3 MGC13170

MGC13170 is called a multidrug resistance-related protein by association with a phenomenon rather than homology or function. This putative protein was mentioned in the literature twice by independent groups in their studies of gene expression in prostate cancer [232] and androgen-responsive genes in prostate cancer [233].


5.4 FLJ30428

The biological role of FLJ30428 has not been established yet. FLJ30428 protein contains male sterility domain, which is conserved in all (possibly related to fatty acids metabolism) and is preferentially expressed in invasive breast carcinoma [234].


TABLE XVI

THE PROPERTIES OF FOUR ADVANCED CANDIDATE GENES

| Name | Size (AA) | Localization | Function | Domains and conserved motifs (PFAM, SMART) | Known connections to cancer and prostate |
|---|---|---|---|---|---|
| KIAA1181/ERGIC-32/ERGIC-1 | 290 | ER/Golgi | COPII vesicle protein | two transmembrane domains (predicted) | Predictor of bad prognosis in breast cancer [235] |
| FLJ30428 | 102 | Nucleus/Cytoplasm* | unknown | pfam03015: Sterile, Male sterility protein | Potential molecular marker in mammary ductal carcinoma [234] |
| MAL2 | 176 | Golgi/Membrane | Basolateral-to-apical transcytosis | PFAM_MARVEL: conserved domain involved in membrane apposition events; tetraspan | Increased activity in breast and prostate cancers [230]; candidate marker for renal cell carcinoma subtypes [229]; involved in prostasomal secretion in PC-3 prostate cancer cells [236] |
| MGC13170 | 117 | Cytoplasm/Nucleus* | unknown | none | Androgen-responsive gene [233]; upregulated in prostate cancer [232]; located next to KLK2 gene; CDDB-induced drug resistance mRNA expressed in lung adenocarcinoma [127] |


6. Conclusions

We have developed a novel, rational approach to cancer marker search and applied it to prostate cancer. As a result, we selected four advanced candidates that are capable of distinguishing between cancerous and benign (cancer-free) prostates as well as the established prostate cancer markers PSA and PSMA do. These candidates will be used for the development of novel diagnostic assays in prostate cancer. The candidate genes that did not pass the final selection step will be revisited in the future.


VIII. DEVELOPMENT OF ANTIBODY-BASED ASSAYS AND FURTHER VALIDATION OF ADVANCED CANDIDATES

1. Rationale

Most of the traditional diagnostic assays are antibody-based, e.g. ELISA or . Since no antibodies against any of the advanced candidates were commercially available, we requested custom polyclonal antibodies to be generated at

ProteinTech Group, Inc (Chicago, IL).

2. Materials and Methods

2.1 Generation of custom polyclonal antibodies

We requested custom polyclonal antibodies against 3 of our candidate marker genes. For KIAA1181 and MGC13170 we used a “peptide to antibody” service. Our modification to the standard protocol was to immunize with a mixture of two peptides, rather than the commonly used single peptide per injection, in order to create a wider range of polyclonal antibodies and improve the success rate. The antibodies were affinity-purified on separate affinity columns, each carrying one of the peptides in its matrix, which resulted in two antibody fractions designated “A” and “B” (Figure 16).

In order to obtain polyclonal antibodies against MAL2, which is a highly hydrophobic protein, we had to inject rabbits with recombinant protein using “cDNA expression to antibody” option. The antibody generation is in progress.

2.2 Western blot analysis

Samples of lysed cells containing 40 µg of total protein were separated by 4-20% gradient SDS-PAGE and blotted onto a PVDF membrane. Membranes were blocked with PBS containing 5% nonfat dry milk and 0.1% Tween 20 and incubated with the primary rabbit antibodies. The blots were then probed with HRP-conjugated goat anti-rabbit IgG, and immunoreactive bands were visualized by incubating the membrane with enhanced chemiluminescence reagent (Pierce).

2.3 Real-time PCR

Real-time PCR analysis was performed at the CCF gene expression core facility as described above. We added a KIAA1181-specific TaqMan Gene Expression Assay (Hs01063255_m1).

3. Results

3.1 Analysis of antibodies in the cell line collection

Once we obtained the requested antibodies, we performed Western blotting on total lysates from human cancer cell lines. Unfortunately, only one of the four custom polyclonal antibodies, namely anti-KIAA1181b, seemed to have acceptable specificity (Figure 17). The MGC13170 protein has an expected size of 13 kDa, predicted from its open reading frame. Neither antibody against MGC13170 showed bands of any size that were consistent between the “A” and “B” antibody fractions.

The size of the KIAA1181/ERGIC-1 protein is 32 kDa [226], and a band of the corresponding size was found in all cell lines tested. An additional 17 kDa band was specific to prostate cell lines (Figure 17).
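The expected band sizes can be checked roughly against the protein lengths listed in Table XVI (117 amino acids for MGC13170 and 290 for KIAA1181). The sketch below uses the common rule of thumb of about 110 Da per amino acid residue, which is an assumption and not the method used to predict the sizes quoted above; the constant and the function name are hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an approximation; the true mass depends on the sequence.
AVERAGE_RESIDUE_MASS_DA = 110.0


def approximate_mass_kda(n_residues: int) -> float:
    """Approximate molecular mass of a protein from its length."""
    if n_residues <= 0:
        raise ValueError("protein length must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


mgc13170_kda = approximate_mass_kda(117)  # close to the expected 13 kDa
kiaa1181_kda = approximate_mass_kda(290)  # close to the reported 32 kDa
```

Apparent sizes on SDS-PAGE can deviate from such estimates, for example because of post-translational modifications, so this is only a plausibility check.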

[Figure 16 scheme: Peptide A and Peptide B (KLH-conjugated); NZW Rabbit 1 and NZW Rabbit 2; immune serum; affinity purification on Column A and Column B; Antibody A and Antibody B.]

Figure 16. The scheme of custom polyclonal antibody generation by ProteinTech Group Inc.

[Figure 17 blot panels. A. Anti-MGC13170a and anti-MGC13170b, with β-actin; marked band: 62 kDa. B. Anti-KIAA1181a and anti-KIAA1181b, with β-actin; marked bands: 32 kDa and 17 kDa. C. Anti-KIAA1181b (32 kDa and 17 kDa), PSMA (100 kDa), and β-actin (47 kDa).]

Figure 17: The Western Blot with custom polyclonal antibodies on the cell lines collection (described above; prostate cell lines are marked bold). A. Two antibodies against MGC13170. B. Two antibodies against KIAA1181. C. Anti-KIAA1181b.


3.2 KIAA1181 has two transcripts

The NCBI database is constantly updated as more information on new and existing genes becomes available. During the selection steps, including the selection of advanced candidates, the NCBI entry for KIAA1181 had only one sequence (accession NM_020462). At the moment, NM_020462 is described as “Homo sapiens endoplasmic reticulum-golgi intermediate compartment (ERGIC) 1 (ERGIC1), transcript variant 2, mRNA” (http://www.ncbi.nlm.nih.gov). Transcript variant 1 has accession NM_001031711.

The sequence alignment showed one segment in common (Figure 18a). From the bioinformatics analysis of the gene locus we found that these two transcript variants shared two exons, but had separate (alternative) promoters (Figure 18b). Our current working hypothesis is that the smaller band on the Western blot is indeed the 2nd isoform of the KIAA1181 gene, but this hypothesis requires further validation. Due to low levels of expression, isolating the required quantity of purified protein is technically challenging, especially since we do not yet have an antibody specific to the 2nd isoform. The data supporting this hypothesis include the coinciding pattern of expression and the correspondence between the size of our protein and the predicted size of the 2nd isoform (17.8 kDa). Successful knockdown of the 2nd isoform by shRNA would also confirm the partial sequence of the protein in question (work in progress).

In order to avoid confusion, here and in subsequent text, the variants of the KIAA1181 gene at the mRNA level will be referred to as transcripts 1 and 2 (the corresponding sequences are reported under NM_001031711 and NM_020462), and the protein bands on the Western blot, which are hypothesized to be KIAA1181-encoded proteins, will be referred to as isoforms 1 and 2.

[Figure 18 panels. A. mRNA alignment of NM_001031711 (transcript variant 1) and NM_020462 (transcript variant 2). B. Gene structure with segments F2-1 to F2-3 and F1-1 to F1-10.]

Figure 18. Sequence-based analysis of KIAA1181 (source of sequences is NCBI database). A. Sequence alignment of two transcripts. B. The structure of the gene.


3.3 Comparison of expression of KIAA1181 transcripts (both transcripts vs. 2nd transcript)

Sequence analysis showed that the TaqMan Gene Expression Assay for KIAA1181 used in the real-time PCR studies described above detects both KIAA1181 transcripts. We therefore selected an additional TaqMan Gene Expression Assay that is specific to the 2nd transcript sequence (Hs01063255_m1).

Among cancer cell lines, the KIAA1181 2nd transcript was expressed at much higher levels in prostate cancer cell lines than in cell lines of other origins (Figure 19a). Among human tissue samples, the KIAA1181 2nd transcript had an expression profile different from the one obtained for both transcripts of KIAA1181 (Figures 19b and 20). Of note, upon clustering, the KIAA1181 2nd transcript is closest to MAL2, whereas KIAA1181 (both transcripts) is closest to MGC13170. The statistical analysis of KIAA1181 is summarized in Table XVII. Even though the KIAA1181 2nd transcript was found to be less prostate-specific than KIAA1181 (both transcripts), the former distinguished between cancerous and benign (cancer-free) prostates slightly better than the latter. It is intriguing that their patterns of expression had only 86% correlation, which is less than the correlation of KIAA1181 with MAL2 or MGC13170 (Table XVIII).

[Figure 19 panels. A. “KIAA1181 expression in cell lines”; series: KIAA1181 (both transcripts), KIAA1181-2nd transcript; y-axis: log2 relative expression; x-axis: cell lines. B. Scatter plots for KIAA1181 and KIAA1181-2nd transcript; y-axis: log2 relative expression; x-axis: sample groups.]

Figure 19: Expression of KIAA1181 transcripts as measured by real-time PCR. A. Panel of cell lines (normalized to CWR22R expression). B. Expression profiles of the candidate markers, PSA and PSMA in human prostate samples.

NOTE: log2 relative expression is shown; BP##: normal morphology sample from prostate cancer-free patient; HPCa and HPNm are Cancerous and Normal morphology paired samples from prostate cancer-positive patients; average linkage hierarchical clustering (uncentered Pearson correlation).


[Figure 20. Column labels: PSA, PSMA, KIAA1181, MGC13170, MAL2, KIAA1181-2nd, FLJ30428]

Figure 20: Expression profiles of the candidate markers (with the addition of 2nd transcript of KIAA1181), PSA and PSMA in human prostate samples.

NOTE: log2 relative expression is shown; BP##: normal morphology sample from prostate cancer-free patient; HPCa and HPNm are Cancerous and Normal morphology paired samples from prostate cancer-positive patients; average linkage hierarchical clustering (uncentered Pearson correlation).


TABLE XVII

DIFFERENTIATION AMONG SAMPLE GROUPS BY KIAA1181 TRANSCRIPTS

A. T-test P-values: <=0.05 good, >0.1 non-significant

Comparison               KIAA1181 (both transcripts)   KIAA1181-2nd transcript
HPCa/Benign Prostate     0.042                         0.029
HPNm/Benign Prostate     0.032                         0.054
HPAll/Benign Prostate    0.035                         0.037
HPCa/HPNm                0.672                         0.473
HPCa/Organs              0.01                          0.59

B. Folds of ratio of relative expression (geometric mean of relative expression values per group is used)

Comparison               KIAA1181 (both transcripts)   KIAA1181-2nd transcript
HPCa/Benign Prostate     4.1                           6.3
HPNm/Benign Prostate     4.5                           4.9
HPAll/Benign Prostate    4.3                           5.5
HPCa/HPNm                0.9                           1.3

HPCa (N=30): Cancerous morphology sample from prostate cancer-positive patient (paired normal-malignant core set)
HPNm (N=30): Normal morphology sample from prostate cancer-positive patient (paired normal-malignant core set)
HPAll (N=60): Samples from prostate cancer-positive patients (both normal and malignant cores)
Benign Prostate (N=12): Normal morphology sample from prostate cancer-free patient
Organs (N=10): Normal non-prostate human organs
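Part B of Table XVII describes the fold values as ratios of per-group geometric means of relative expression. A minimal Python sketch of that calculation is given here for illustration; the function names and the sample values are hypothetical and are not taken from the study data. Because the figures report log2 relative expression, the sketch also shows the equivalent calculation on log2 values, where the ratio of geometric means equals 2 raised to the difference of the arithmetic means.

```python
import math


def geometric_mean(values):
    """Return the geometric mean of strictly positive relative expression values."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("values must be non-empty and strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


def fold_ratio(group_a, group_b):
    """Ratio of the geometric mean of group_a to that of group_b."""
    return geometric_mean(group_a) / geometric_mean(group_b)


def fold_ratio_from_log2(log2_a, log2_b):
    """Same ratio, computed from log2 relative expression values."""
    if not log2_a or not log2_b:
        raise ValueError("both groups must be non-empty")
    mean_a = sum(log2_a) / len(log2_a)
    mean_b = sum(log2_b) / len(log2_b)
    return 2.0 ** (mean_a - mean_b)


# Hypothetical relative expression values (illustration only, not study data).
cancer = [8.0, 2.0, 4.0]
benign = [1.0, 2.0, 0.5]

print(round(fold_ratio(cancer, benign), 3))
print(round(fold_ratio_from_log2([3.0, 1.0, 2.0], [0.0, 1.0, -1.0]), 3))
```

The geometric mean is less sensitive than the arithmetic mean to a single sample with very high expression, which is one plausible reason for preferring it with real-time PCR data that span several orders of magnitude.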

TABLE XVIII

THE CORRELATION BETWEEN GENE EXPRESSION PROFILES BASED ON REAL-TIME PCR DATA (KIAA1181 2ND TRANSCRIPT DATA ADDED); PROSTATE TISSUES

            PSA    PSMA   KIAA1181  MAL2   MGC13170  KIAA1181-2  FLJ30428
PSA         -
PSMA        0.73   -
KIAA1181    0.85   0.74   -
MAL2        0.70   0.78   0.90      -
MGC13170    0.80   0.70   0.95      0.83   -
KIAA1181-2  0.62   0.72   0.86      0.91   0.83      -
FLJ30428    0.73   0.61   0.77      0.70   0.76      0.68        -
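The notes to Figures 19 and 20 name uncentered Pearson correlation as the clustering metric; whether the same variant underlies Table XVIII is not stated. As a sketch under the usual definition (the dot product of two profiles divided by the product of their Euclidean norms, with no mean subtraction), the following Python example contrasts the uncentered and the ordinary centered coefficient. The function names and the profiles are hypothetical and are not study data.

```python
import math


def uncentered_correlation(x, y):
    """Uncentered Pearson correlation: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    if len(x) != len(y) or not x:
        raise ValueError("profiles must be non-empty and of equal length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a profile of all zeros has no defined correlation")
    return dot / (norm_x * norm_y)


def centered_correlation(x, y):
    """Ordinary Pearson correlation: subtract each mean, then correlate."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    return uncentered_correlation([a - mean_x for a in x],
                                  [b - mean_y for b in y])


# Hypothetical log2 expression profiles (illustration only).
profile_a = [1.0, 2.0, 3.0]
profile_b = [11.0, 12.0, 13.0]  # same shape as profile_a, shifted upward

print(centered_correlation(profile_a, profile_b))
print(uncentered_correlation(profile_a, profile_b))
```

In this example the centered coefficient treats the two profiles as perfectly correlated, whereas the uncentered coefficient is lower because it also reflects the offset from zero. With log2 relative expression, zero corresponds to the reference level, so the uncentered form distinguishes genes that share a pattern but differ in overall level.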


4. Discussion

KIAA1181 has at least two alternatively spliced transcripts, one of which seems to be specifically expressed in prostate cancer. KIAA1181 was identified as a potential PCa marker by the in silico expression profiling approach, which took into account major splice variants by applying an mRNA-seeded clustering protocol.

Alternative splicing is the process by which identical pre-mRNA molecules are spliced in different ways [237]. Upon translation, differentially spliced mRNAs yield either no viable protein product or proteins that may differ in structure, enzymatic function, affinity, subcellular localization, sites, susceptibility to proteasomal degradation, etc. Alternative splicing is very important in normal embryonic development, as well as in diseases such as growth hormone deficiency, Frasier syndrome, Parkinson's disease, cystic fibrosis, spinal muscular atrophy, myotonic dystrophy and cancer [reviewed in 237-239]. Studies have shown that tumor suppressors are often inactivated by alternative splicing in cancer, and oncogenes are inactivated by alternative splicing in normal differentiation [240].

The idea of an alternatively spliced tumor marker is not novel. The NRSF transcription-silencing factor has an alternative 50-base exon inserted to produce a truncated protein in small cell lung cancer, but not in normal tissue. This extra exon is a potential clinical marker for small cell lung cancer [241]. Other examples of cancer-specific alternative splicing are compiled in Table XIX [237, 240]. Several of our PCa marker candidates displayed differential RT-PCR patterns in the cell line collection, implying potential tissue-specific alternative splicing (see Figure 7). Rapidly accumulating data on alternative splicing in cancer allow us to presume that an alternatively spliced product of a gene is more likely to be a suitable marker of a disease or a process (such as tumorigenesis) than a whole gene.


TABLE XIX

EXAMPLES OF CANCER PROGRESSION-RELATED GENES WITH CANCER-SPECIFIC ALTERNATIVE SPLICING

Gene                Cancer tissue       Function
-4                  Lung                Adhesion, metastasis
AIB1                Breast              Hormone signalling
Androgen receptor   Breast, prostate    Transcription factor
ASIP                Breast, liver       Adhesion, migration
ATM                 Many                Tumor suppressor
Bcl-x               Many                Apoptosis
BRCA1               Breast              Tumor suppressor
Caspase 3, 8        Many                Apoptosis
C-CAM               Lung                Adhesion
CD44                Many                Proliferation, angiogenesis
Crk                 Brain               Migration, invasion
DNMT3b4             Liver               Chromatin modelling
                    Breast              Transcription factor
Fas                                     Apoptosis
FGFR1               Breast, brain       Growth signalling
FGFR2               Prostate            Growth signalling
FGFR3               Many                Tumor suppressor
                    Many                Angiogenesis
Fyn                 Leukemia            Tyrosine kinase
Gastrin receptor    Pancreas            Proliferation
hSNF5                                   suppressor
IIP45               Brain               Invasion
                    Thyroid, colon      Tyrosine kinase
Integrin beta1C     Endometrium         Adhesion
Interleukin 10      Many                Apoptosis
Interleukin 18      Ovary               Caspase target
KAI1/CD82           Gastric             Metastasis
Kinectin            Liver               Caspase target
KLF6                Prostate            Tumor suppressor
, MDM4              Many
MLH1                Colon               DNA mismatch repair
MUC1                Thyroid             Adhesion, metastasis
NF1                 Brain               Signalling GTPase
NF2                 Brain               Tumor suppressor
NRSF                Lung                Transcription factor
NTRK1               Many                Tumor suppressor
PASG                Leukemia            Chromatin modelling
Rac1                Colorectal          Signalling GTPase
RB1                 Many                Tumor suppressor
Secretin receptor   Pancreas            Growth inhibitor
SHBG                Endometrium         Hormone signalling
SLP65               Lymphoma            B-cell differentiation
                    Many                Apoptosis
SVH                 Liver               Unknown
Syk                 Breast, lymphoma    Metastasis
Tenascin-C          Many                Adhesion inhibitor
TERT                Many                Telomerase
TP53                Many                Tumor suppressor
TP73                Many                Tumor suppressor
TSG101              Many                Proteolysis
uPAR                Breast              Adhesion, proteolysis
VEGF                Many                Angiogenesis
WISP1               Gastric             Invasion


5. Conclusions

We generated custom polyclonal antibodies against KIAA1181 and MGC13170 (MAL2 antibodies are in progress). Both antibodies against KIAA1181 detected a prostate-specific 17 kDa band on a Western blot, which we hypothesized to be encoded by the KIAA1181 2nd transcript. The expression profile of this transcript was investigated using real-time PCR. Compared with the expression profile of KIAA1181 (both transcripts), the KIAA1181 2nd transcript was more prostate-specific among cell lines, less prostate-specific among normal tissues, but more prostate cancer-specific among prostate tissue samples. These data suggest that the KIAA1181 2nd transcript might be considered a fifth advanced candidate prostate cancer marker.


CITED LITERATURE

1. Twombly, R., Cancer surpasses heart disease as leading cause of death for all but the very elderly. J Natl Cancer Inst, 2005. 97(5): p. 330-1.
2. Hoefner, D.M., Serum tumor markers. Part I: Clinical utility. MLO Med Lab Obs, 2005. 37(12): p. 20, 22-4.
3. Molina, R., et al., Tumor markers in breast cancer - European Group on Tumor Markers recommendations. Tumour Biol, 2005. 26(6): p. 281-93.
4. Perkins, G.L., et al., Serum tumor markers. Am Fam Physician, 2003. 68(6): p. 1075-82.
5. Sherwood, E.R., et al., Differential cytokeratin expression in normal, hyperplastic and malignant epithelial cells from human prostate. J Urol, 1990. 143(1): p. 167-71.
6. Liu, A.Y., et al., Cell-cell interaction in prostate gene regulation and cytodifferentiation. Proc Natl Acad Sci U S A, 1997. 94(20): p. 10705-10.
7. Bostwick, D.G., Prospective origins of prostate carcinoma. Prostatic intraepithelial neoplasia and atypical adenomatous hyperplasia. Cancer, 1996. 78(2): p. 330-6.
8. Fitzpatrick, J.M., The natural history of benign prostatic hyperplasia. BJU Int, 2006. 97 Suppl 2: p. 3-6; discussion 21-2.
9. Sakr, W.A., et al., The frequency of carcinoma and intraepithelial neoplasia of the prostate in young male patients. J Urol, 1993. 150(2 Pt 1): p. 379-385.
10. Bostwick, D.G. and J. Qian, High-grade prostatic intraepithelial neoplasia. Mod Pathol, 2004. 17(3): p. 360-79.
11. Sakr, W.A., et al., Allelic loss in locally metastatic, multisampled prostate cancer. Cancer Res, 1994. 54(12): p. 3273-3277.
12. Qian, J., et al., Chromosomal anomalies in prostatic intraepithelial neoplasia and carcinoma detected by fluorescence in situ hybridization. Cancer Res, 1995. 55(22): p. 5408-5414.
13. Vocke, C.D., et al., Analysis of 99 microdissected prostate carcinomas reveals a high frequency of allelic loss on chromosome 8p12-21. Cancer Res, 1996. 56(10): p. 2411-2416.
14. Joniau, S., et al., Prostatic intraepithelial neoplasia (PIN): importance and clinical management. Eur Urol, 2005. 48(3): p. 379-85.
15. , D.M., F.I. Bray, and S.S. Devesa, Cancer burden in the year 2000. The global picture. Eur J Cancer, 2001. 37 Suppl 8: p. S4-66.
16. Jemal, A., et al., Cancer statistics, 2006. CA Cancer J Clin, 2006. 56(2): p. 106-30.
17. Sakr, W.A., et al., High grade prostatic intraepithelial neoplasia (HGPIN) and prostatic adenocarcinoma between the ages of 20-69: an autopsy study of 249 cases. In Vivo, 1994. 8(3): p. 439-43.
18. Chodak, G.W., et al., Results of conservative management of clinically localized prostate cancer. N Engl J Med, 1994. 330(4): p. 242-8.
19. Johansson, J.E., et al., Fifteen-year survival in prostate cancer. A prospective, population-based study in Sweden. JAMA, 1997. 277(6): p. 467-71.


20. Schroder, F.H. and M.F. Wildhagen, Screening for prostate cancer: evidence and perspectives. BJU Int, 2001. 88(8): p. 811-7.
21. Frankel, S., et al., Screening for prostate cancer. Lancet, 2003. 361(9363): p. 1122-8.
22. Konishi, N., et al., Molecular pathology of prostate cancer. Pathol Int, 2005. 55(9): p. 531-9.
23. Konishi, N., et al., Different patterns of DNA alterations detected by restriction landmark genomic scanning in heterogeneous prostate carcinomas. Am J Pathol, 1997. 150(1): p. 305-14.
24. Konishi, N., et al., Intratumor cellular heterogeneity and alterations in ras oncogene and p53 tumor suppressor gene in human prostate carcinoma. Am J Pathol, 1995. 147(4): p. 1112-22.
25. Konishi, N., et al., Heterogeneous and patterns of the INK4a/ARF locus within prostate carcinomas. Am J Pathol, 2002. 160(4): p. 1207-14.
26. McNeal, J.E., The zonal anatomy of the prostate. Prostate, 1981. 2(1): p. 35-49.
27. Walsh, P.C., Prostate cancer kills: strategy to reduce deaths. Urology, 1994. 44(4): p. 463-466.
28. Reissigl, A., et al., PSA-based screening for prostate cancer in asymptomatic younger males: pilot study in blood donors. Prostate, 1997. 30(1): p. 20-5.
29. Vis, A.N., et al., Serendipity in detecting disease in low prostate-specific antigen ranges. BJU Int, 2002. 89(4): p. 384-9.
30. Hoefner, D.M., Serum tumor markers: part II--practical considerations and limitations of testing. MLO Med Lab Obs, 2006. 38(2): p. 10-1, 14-6; quiz 18-9.
31. Luciani, L.G., et al., Role of transperineal six-core prostate biopsy in patients with prostate-specific antigen level greater than 10 ng/mL and abnormal digital rectal examination findings. Urology, 2006. 67(3): p. 555-8.
32. Ornstein, D.K. and J. Kang, How to improve prostate biopsy detection of prostate cancer. Curr Urol Rep, 2001. 2(3): p. 218-23.
33. Stamey, T.A., et al., Biological determinants of cancer progression in men with prostate cancer. JAMA, 1999. 281(15): p. 1395-400.
34. Nelson, P.S., Predicting prostate cancer behavior using transcript profiles. J Urol, 2004. 172(5 Pt 2): p. S28-32; discussion S33.
35. Grayhack, J.T., T.C. Keeler, and J.M. Kozlowski, Carcinoma of the prostate. Hormonal therapy. Cancer, 1987. 60(3 Suppl): p. 589-601.
36. Huggins, C. and C.V. Hodges, Studies on prostatic cancer: I. The effect of castration, of estrogen and of androgen injection on serum in metastatic carcinoma of the prostate. 1941. J Urol, 2002. 168(1): p. 9-12.
37. Tammela, T., Endocrine treatment of prostate cancer. J Steroid Biochem Mol Biol, 2004. 92(4): p. 287-95.
38. English, H.F., N. Kyprianou, and J.T. Isaacs, Relationship between DNA fragmentation and apoptosis in the in the rat prostate following castration. Prostate, 1989. 15(3): p. 233-250.
39. Kyprianou, N. and J.T. Isaacs, Quantal relationship between prostatic dihydrotestosterone and prostatic cell content: critical threshold concept. Prostate, 1987. 11(1): p. 41-50.


40. Huggins, C. and C.V. Hodges, The effect of castration of estrogen and of androgen injection on serum phosphatases in metastases carcinoma of the prostate. Cancer Res, 1941. 1: p. 293-297.
41. Palmberg, C., et al., PSA decline is an independent prognostic marker in hormonally treated prostate cancer. Eur Urol, 1999. 36(3): p. 191-6.
42. Damber, J.E., Endocrine therapy for prostate cancer. Acta Oncol, 2005. 44(6): p. 605-9.
43. Ward, J.F. and J.W. Moul, Rising prostate-specific antigen after primary prostate cancer therapy. Nat Clin Pract Urol, 2005. 2(4): p. 174-82.
44. Talback, M., et al., Cancer survival in Sweden 1960-1998--developments across four decades. Acta Oncol, 2003. 42(7): p. 637-59.
45. Lam, J.S., et al., Secondary hormonal therapy for advanced prostate cancer. J Urol, 2006. 175(1): p. 27-34.
46. Messing, E.M., et al., Immediate hormonal therapy compared with observation after radical prostatectomy and pelvic lymphadenectomy in men with node-positive prostate cancer. N Engl J Med, 1999. 341(24): p. 1781-8.
47. Kutscher, W. and H. Wolbergs, Prostata phosphatase. Hoppe-Seyler's Z Physiol Chem, 1935. 236: p. 237.
48. Tricoli, J.V., M. Schoenfeldt, and B.A. Conley, Detection of prostate cancer and predicting progression: current and future diagnostic markers. Clin Cancer Res, 2004. 10(12 Pt 1): p. 3943-53.
49. Vihko, P., et al., Serum prostate-specific acid phosphatase: development and validation of a specific radioimmunoassay. Clin Chem, 1978. 24(11): p. 1915-9.
50. Schacht, M.J., J.E. Garnett, and J.T. Grayhack, Biochemical markers in prostatic cancer. Urol Clin North Am, 1984. 11(2): p. 253-67.
51. Nesbit, R.M. and W.C. Baum, Serum phosphatase determinations in diagnosis of prostatic cancer; a review of 1,150 cases. J Am Med Assoc, 1951. 145(17): p. 1321-4.
52. King, E.J. and K.A. Jegatheesan, A method for the determination of tartrate-labile, prostatic acid phosphatase in serum. J Clin Pathol, 1959. 12(1): p. 85-9.
53. Foti, A.G., et al., Detection of prostatic cancer by solid-phase radioimmunoassay of serum prostatic acid phosphatase. N Engl J Med, 1977. 297(25): p. 1357-61.
54. Bruce, A.W., et al., The significance of prostatic acid phosphatase in adenocarcinoma of the prostate. J Urol, 1981. 125(3): p. 357-60.
55. Hara, M., et al., Physico-chemical characteristics of "y-seminoprotein", an antigenic component specific for human seminal plasma. Nippon Hoigaku Zasshi, 1971. 25(4): p. 322-4.
56. Wang, M.C., et al., Purification of a human prostate specific antigen. Invest Urol, 1979. 17(2): p. 159-63.
57. Evans, B.A., C.C. Drinkwater, and R.I. Richards, Mouse glandular kallikrein genes. Structure and partial sequence analysis of the kallikrein gene locus. J Biol Chem, 1987. 262(17): p. 8027-34.
58. Olsson, A.Y., et al., The evolution of the glandular kallikrein locus: identification of orthologs and pseudogenes in the cotton-top tamarin. Gene, 2004. 343(2): p. 347-55.


59. Kuriyama, M., et al., Use of human prostate-specific antigen in monitoring prostate cancer. Cancer Res, 1981. 41(10): p. 3874-6.
60. Catalona, W.J., et al., Measurement of prostate-specific antigen in serum as a screening test for prostate cancer. N Engl J Med, 1991. 324(17): p. 1156-1161.
61. Gann, P.H., C.H. Hennekens, and M.J. Stampfer, A prospective evaluation of plasma prostate-specific antigen for detection of prostatic cancer. JAMA, 1995. 273(4): p. 289-294.
62. Stenmann, U.H., et al., Serum concentrations of prostate-specific antigen as screening test for prostate cancer. Br Med J, 1995. 311: p. 1340-1343.
63. Chodak, G.W. and K.S. Warren, Watchful waiting for prostate cancer: a review article. Prostate Cancer Prostatic Dis, 2006. 9(1): p. 25-9.
64. Keetch, D.W., W.J. Catalona, and D.S. Smith, Serial prostatic biopsies in men with persistently elevated serum prostate specific antigen values. J Urol, 1994. 151(6): p. 1571-4.
65. Oesterling, J.E., et al., Serum prostate-specific antigen in a community-based population of healthy men. Establishment of age-specific reference ranges. JAMA, 1993. 270(7): p. 860-4.
66. Henderson, R.J., et al., Prostate-specific antigen (PSA) and PSA density: racial differences in men without prostate cancer. J Natl Cancer Inst, 1997. 89(2): p. 134-8.
67. Saw, S. and T.C. Aw, Age-related reference intervals for free and total prostate-specific antigen in a Singaporean population. Pathology, 2000. 32(4): p. 245-9.
68. Partin, A.W., et al., Prostate specific antigen in the staging of localized prostate cancer: influence of tumor differentiation, tumor volume and benign hyperplasia. J Urol, 1990. 143(4): p. 747-52.
69. McNeal, J.E., et al., Immunohistochemical evidence for impaired cell differentiation in the premalignant phase of prostate . Am J Clin Pathol, 1988. 90(1): p. 23-32.
70. Kranse, R., et al., Predictors for biopsy outcome in the European Randomized Study of Screening for Prostate Cancer (Rotterdam region). Prostate, 1999. 39(4): p. 316-22.
71. Schroder, F.H., et al., Prostate cancer detection at low prostate specific antigen. J Urol, 2000. 163(3): p. 806-12.
72. Schroder, F.H. and M.F. Wildhagen, Low levels of PSA predict long-term risk of prostate cancer: results from the Baltimore longitudinal study of aging. Urology, 2002. 59(3): p. 462.
73. Thompson, I.M., et al., Prevalence of prostate cancer among men with a prostate-specific antigen level < or =4.0 ng per milliliter. N Engl J Med, 2004. 350(22): p. 2239-46.
74. Young, C.Y., et al., Hormonal regulation of prostate-specific antigen messenger RNA in human prostatic adenocarcinoma cell line LNCaP. Cancer Res, 1991. 51(14): p. 3748-52.
75. Montgomery, B.T., et al., Hormonal regulation of prostate-specific antigen (PSA) glycoprotein in the human prostatic adenocarcinoma cell line, LNCaP. Prostate, 1992. 21(1): p. 63-73.


76. Trapman, J. and K.B. Cleutjens, Androgen-regulated gene expression in prostate cancer. Semin Cancer Biol, 1997. 8(1): p. 29-36.
77. Cleutjens, K.B., et al., An androgen response element in a far upstream enhancer region is essential for high, androgen-regulated activity of the prostate-specific antigen promoter. Mol Endocrinol, 1997. 11(2): p. 148-61.
78. Schuur, E.R., et al., Prostate-specific antigen expression is regulated by an upstream enhancer. J Biol Chem, 1996. 271(12): p. 7043-51.
79. Zhang, S., P.E. Murtha, and C.Y. Young, Defining a functional androgen responsive element in the 5' far upstream flanking region of the prostate-specific antigen gene. Biochem Biophys Res Commun, 1997. 231(3): p. 784-8.
80. Wang, Q., J.S. Carroll, and M. Brown, Spatial and temporal recruitment of androgen receptor and its coactivators involves chromosomal looping and polymerase tracking. Mol Cell, 2005. 19(5): p. 631-42.
81. Yeung, F., et al., Regions of prostate-specific antigen (PSA) promoter confer androgen-independent expression of PSA in prostate cancer cells. J Biol Chem, 2000. 275(52): p. 40846-55.
82. Gurova, K.V., et al., Expression of prostate specific antigen (PSA) is negatively regulated by p53. Oncogene, 2002. 21(1): p. 153-7.
83. Ossovskaya, V.S., et al., Use of genetic suppressor elements to dissect distinct biological effects of separate p53 domains. Proc Natl Acad Sci U S A, 1996. 93(19): p. 10309-14.
84. Rokhlin, O.W., et al., p53 is involved in tumor necrosis factor-alpha-induced apoptosis in the human prostatic carcinoma cell line LNCaP. Oncogene, 2000. 19(15): p. 1959-68.
85. Israeli, R.S., et al., Molecular of a complementary DNA encoding a prostate-specific membrane antigen. Cancer Res, 1993. 53(2): p. 227-30.
86. O'Keefe, D.S., et al., Mapping, genomic organization and promoter analysis of the human prostate-specific membrane antigen gene. Biochim Biophys Acta, 1998. 1443(1-2): p. 113-27.
87. Rajasekaran, A.K., G. Anilkumar, and J.J. Christiansen, Is prostate-specific membrane antigen a multifunctional protein? Am J Physiol Cell Physiol, 2005. 288(5): p. C975-81.
88. Pinto, J.T., et al., Prostate-specific membrane antigen: a novel folate hydrolase in human prostatic carcinoma cells. Clin Cancer Res, 1996. 2(9): p. 1445-51.
89. Carter, R.E., A.R. Feldman, and J.T. Coyle, Prostate-specific membrane antigen is a hydrolase with substrate and pharmacologic characteristics of a neuropeptidase. Proc Natl Acad Sci U S A, 1996. 93(2): p. 749-53.
90. Horoszewicz, J.S., E. Kawinski, and G.P. Murphy, Monoclonal antibodies to a new antigenic marker in epithelial prostatic cells and serum of prostatic cancer patients. Anticancer Res, 1987. 7(5B): p. 927-35.
91. Silver, D.A., et al., Prostate-specific membrane antigen expression in normal and malignant human tissues. Clin Cancer Res, 1997. 3(1): p. 81-5.
92. Sokoloff, R.L., et al., A dual-monoclonal sandwich assay for prostate-specific membrane antigen: levels in tissues, seminal fluid and urine. Prostate, 2000. 43(2): p. 150-7.


93. Liu, H., et al., Monoclonal antibodies to the extracellular domain of prostate-specific membrane antigen also react with tumor vascular . Cancer Res, 1997. 57(17): p. 3629-34.
94. Israeli, R.S., et al., Expression of the prostate-specific membrane antigen. Cancer Res, 1994. 54(7): p. 1807-11.
95. Wright, G.L., Jr., et al., Upregulation of prostate-specific membrane antigen after androgen-deprivation therapy. Urology, 1996. 48(2): p. 326-34.
96. Chang, S.S., et al., Comparison of anti-prostate-specific membrane antigen antibodies and other immunomarkers in metastatic prostate carcinoma. Urology, 2001. 57(6): p. 1179-83.
97. Kawakami, M. and J. Nakayama, Enhanced expression of prostate-specific membrane antigen gene in prostate cancer as revealed by in situ hybridization. Cancer Res, 1997. 57(12): p. 2321-4.
98. Ross, J.S., et al., Correlation of primary tumor prostate-specific membrane antigen expression with disease recurrence in prostate cancer. Clin Cancer Res, 2003. 9(17): p. 6357-62.
99. Ghosh, A. and W.D. Heston, Tumor target prostate specific membrane antigen (PSMA) and its regulation in prostate cancer. J Cell Biochem, 2004. 91(3): p. 528-39.
100. Beckett, M.L., et al., Prostate-specific membrane antigen levels in sera from healthy men and patients with benign prostate hyperplasia or prostate cancer. Clin Cancer Res, 1999. 5(12): p. 4034-40.
101. Uria, J.A., et al., Prostate-specific membrane antigen in breast carcinoma. Lancet, 1997. 349(9065): p. 1601.
102. Lai, L.C., et al., Clinical usefulness of tumour markers. Malays J Pathol, 2003. 25(2): p. 83-105.
103. Elgamal, A.A., et al., Detection of prostate specific antigen in pancreas and salivary glands: a potential impact on prostate cancer overestimation. J Urol, 1996. 156(2 Pt 1): p. 464-8.
104. Black, M.H., et al., Serum total and free prostate-specific antigen for breast cancer diagnosis in women. Clin Cancer Res, 2000. 6(2): p. 467-73.
105. Lopez, L.A., et al., Prevalence of abnormal levels of serum tumour markers in elderly people. Age Ageing, 1996. 25(1): p. 45-50.
106. Gelmann, E.P. and O.J. Semmes, Expression of genes and proteins specific for prostate cancer. J Urol, 2004. 172(5 Pt 2): p. S23-6; discussion S26-7.
107. Bull, J.H., et al., Identification of potential diagnostic markers of prostate cancer and prostatic intraepithelial neoplasia using cDNA microarray. Br J Cancer, 2001. 84(11): p. 1512-9.
108. Chaib, H., et al., Profiling and verification of gene expression patterns in normal and malignant human prostate tissues by cDNA microarray analysis. Neoplasia, 2001. 3(1): p. 43-52.
109. Dhanasekaran, S.M., et al., Delineation of prognostic biomarkers in prostate cancer. Nature, 2001. 412(6849): p. 822-6.
110. Luo, J., et al., Human prostate cancer and benign prostatic hyperplasia: molecular dissection by gene expression profiling. Cancer Res, 2001. 61(12): p. 4683-8.


111. Magee, J.A., et al., Expression profiling reveals hepsin overexpression in prostate cancer. Cancer Res, 2001. 61(15): p. 5692-6.
112. Schena, M., et al., Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science, 1995. 270(5235): p. 467-70.
113. Vaarala, M.H., et al., Differentially expressed genes in two LNCaP prostate cancer cell lines reflecting changes during prostate cancer progression. Lab Invest, 2000. 80(8): p. 1259-68.
114. Asmann, Y.W., et al., Identification of differentially expressed genes in normal and malignant prostate by electronic profiling of expressed sequence tags. Cancer Res, 2002. 62(11): p. 3308-14.
115. Baranova, A.V., et al., In silico screening for tumour-specific expressed sequences in human genome. FEBS Lett, 2001. 508(1): p. 143-8.
116. Ghadersohi, A. and A.K. Sood, Prostate epithelium-derived Ets transcription factor mRNA is overexpressed in human breast tumors and is a candidate breast tumor marker and a breast tumor antigen. Clin Cancer Res, 2001. 7(9): p. 2731-8.
117. Nelson, P.S., et al., The human (PEDB) and mouse (mPEDB) Prostate Expression Databases. Nucleic Acids Res, 2002. 30(1): p. 218-20.
118. Scheurle, D., et al., Cancer gene discovery using digital differential display. Cancer Res, 2000. 60(15): p. 4037-43.
119. Schmitt, A.O., et al., Exhaustive mining of EST libraries for genes differentially expressed in normal and tumour tissues. Nucleic Acids Res, 1999. 27(21): p. 4251-60.
120. Vasmatzis, G., et al., Discovery of three genes specifically expressed in human prostate by expressed sequence tag database analysis. Proc Natl Acad Sci U S A, 1998. 95(1): p. 300-4.
121. Vinals, C., S. Gaulis, and T. Coche, Using in silico transcriptomics to search for tumor-associated antigens for . Vaccine, 2001. 19(17-19): p. 2607-14.
122. Velculescu, V.E., et al., Serial analysis of gene expression. Science, 1995. 270(5235): p. 484-7.
123. Waghray, A., et al., Identification of androgen-regulated genes in the prostate cancer cell line LNCaP by serial analysis of gene expression and proteomic analysis. Proteomics, 2001. 1(10): p. 1327-38.
124. Waghray, A., et al., Identification of differentially expressed genes by serial analysis of gene expression in human prostate cancer. Cancer Res, 2001. 61(10): p. 4283-6.
125. Xu, J., et al., Identification of differentially expressed genes in human prostate cancer using subtraction and microarray. Cancer Res, 2000. 60(6): p. 1677-82.
126. Xu, L.L., et al., A novel androgen-regulated gene, PMEPA1, located on chromosome 20q13 exhibits high level expression in prostate. Genomics, 2000. 66(3): p. 257-63.
127. Ahram, M., et al., Proteomic analysis of human prostate cancer. Mol Carcinog, 2002. 33(1): p. 9-15.
128. Ornstein, D.K., et al., Proteomic analysis of laser capture microdissected human prostate cancer and in vitro prostate cell lines. Electrophoresis, 2000. 21(11): p. 2235-42.


129. Deftos, L.J., et al., Immunoassay and immunohistology studies of chromogranin A as a neuroendocrine marker in patients with carcinoma of the prostate. Urology, 1996. 48(1): p. 58-62.
130. Berruti, A., et al., Potential clinical value of circulating chromogranin A in patients with prostate carcinoma. Ann Oncol, 2001. 12 Suppl 2: p. S153-7.
131. Berruti, A., et al., Independent prognostic role of circulating chromogranin A in prostate cancer patients with hormone-refractory disease. Endocr Relat Cancer, 2005. 12(1): p. 109-17.
132. Lee, W.H., et al., Cytidine methylation of regulatory sequences near the pi-class glutathione S-transferase gene accompanies human prostatic carcinogenesis. Proc Natl Acad Sci U S A, 1994. 91(24): p. 11733-7.
133. Jeronimo, C., et al., Quantitation of GSTP1 methylation in non-neoplastic prostatic tissue and organ-confined prostate adenocarcinoma. J Natl Cancer Inst, 2001. 93(22): p. 1747-52.
134. Jeronimo, C., et al., Quantitative GSTP1 hypermethylation in bodily fluids of patients with prostate cancer. Urology, 2002. 60(6): p. 1131-5.
135. Reiter, R.E., et al., Prostate stem cell antigen: a cell surface marker overexpressed in prostate cancer. Proc Natl Acad Sci U S A, 1998. 95(4): p. 1735-40.
136. Gu, Z., et al., Prostate stem cell antigen (PSCA) expression increases with high gleason score, advanced stage and bone metastasis in prostate cancer. Oncogene, 2000. 19(10): p. 1288-96.
137. Sommerfeld, H.J., et al., Telomerase activity: a prevalent marker of malignant human prostate tissue. Cancer Res, 1996. 56(1): p. 218-22.
138. Iczkowski, K.A., et al., Telomerase reverse transcriptase subunit immunoreactivity: a marker for high-grade prostate carcinoma. Cancer, 2002. 95(12): p. 2487-93.
139. Audic, S. and J.M. Claverie, The significance of digital gene expression profiles. Genome Res, 1997. 7(10): p. 986-95.
140. Walker, M.G., et al., Prediction of gene function by genome-scale expression analysis: prostate cancer-associated genes. Genome Res, 1999. 9(12): p. 1198-203.
141. Burge, C. and S. Karlin, Prediction of complete gene structures in human genomic DNA. J Mol Biol, 1997. 268(1): p. 78-94.
142. Murphy, G.P., et al., Evaluation and comparison of two new prostate carcinoma markers. Free-prostate specific antigen and prostate specific membrane antigen. Cancer, 1996. 78(4): p. 809-18.
143. Corey, E., et al., Detection of circulating prostate cells by reverse transcriptase-polymerase chain reaction of human glandular kallikrein (hK2) and prostate-specific antigen (PSA) messages. Urology, 1997. 50(2): p. 184-8.
144. Teni, T.R., et al., Serum and urinary prostatic inhibin-like peptide in benign prostatic hyperplasia and carcinoma of prostate. Cancer Lett, 1988. 43(1-2): p. 9-14.
145. Ernst, T., et al., Decrease and gain of gene expression are equally discriminatory markers for prostate carcinoma: a gene expression analysis on total and microdissected prostate tissue. Am J Pathol, 2002. 160(6): p. 2169-80.


146. Liu, X.F., et al., PRAC: A novel small nuclear protein that is specifically expressed in human prostate and colon. Prostate, 2001. 47(2): p. 125-31.
147. Fleming, W.H., et al., Expression of the c-myc protooncogene in human prostatic carcinoma and benign prostatic hyperplasia. Cancer Res, 1986. 46(3): p. 1535-8.
148. Subong, E.N., et al., Monoclonal antibody to prostate cancer nuclear matrix protein (PRO:4-216) recognizes nucleophosmin/B23. Prostate, 1999. 39(4): p. 298-304.
149. He, W.W., et al., A novel human prostate-specific, androgen-regulated homeobox gene (NKX3.1) that maps to 8p21, a region frequently deleted in prostate cancer. Genomics, 1997. 43(1): p. 69-77.
150. Visakorpi, T., et al., In vivo amplification of the androgen receptor gene and progression of human prostate cancer. Nat Genet, 1995. 9(4): p. 401-6.
151. Welsh, J.B., et al., Analysis of gene expression identifies candidate markers and pharmacological targets in prostate cancer. Cancer Res, 2001. 61(16): p. 5974-8.
152. Di Cristofano, A. and P.P. Pandolfi, The multiple roles of PTEN in tumor suppression. Cell, 2000. 100(4): p. 387-90.
153. Dinjens, W.N., et al., Frequency and characterization of p53 mutations in primary and metastatic human prostate cancer. Int J Cancer, 1994. 56(5): p. 630-3.
154. Ittmann, M., Allelic loss on in prostate adenocarcinoma. Cancer Res, 1996. 56(9): p. 2143-7.
155. Bergerheim, U.S., et al., Deletion mapping of chromosomes 8, 10, and 16 in human prostatic carcinoma. Genes Chromosomes Cancer, 1991. 3(3): p. 215-20.
156. Carter, B.S., et al., Allelic loss of chromosomes 16q and 10q in human prostate cancer. Proc Natl Acad Sci U S A, 1990. 87(22): p. 8751-5.
157. Macoska, J.A., et al., Extensive genetic alterations in prostate cancer revealed by dual PCR and FISH analysis. Genes Chromosomes Cancer, 1993. 8(2): p. 88-97.
158. Sakr, W.A., et al., Allelic loss in locally metastatic, multisampled prostate cancer. Cancer Res, 1994. 54(12): p. 3273-7.
159. Saric, T., et al., Genetic pattern of prostate cancer progression. Int J Cancer, 1999. 81(2): p. 219-24.
160. Podsypanina, K., et al., Mutation of Pten/Mmac1 in mice causes neoplasia in multiple organ systems. Proc Natl Acad Sci U S A, 1999. 96(4): p. 1563-8.
161. Li, J., et al., PTEN, a putative protein tyrosine phosphatase gene mutated in human brain, breast, and prostate cancer. Science, 1997. 275(5308): p. 1943-7.
162. Steck, P.A., et al., Identification of a candidate tumour suppressor gene, MMAC1, at chromosome 10q23.3 that is mutated in multiple advanced cancers. Nat Genet, 1997. 15(4): p. 356-62.
163. Kim, M.J., et al., of Nkx3.1 and Pten loss of function in a mouse model of prostate carcinogenesis. Proc Natl Acad Sci U S A, 2002. 99(5): p. 2884-9.
164. Cooney, K.A., et al., Distinct regions of allelic loss on 13q in prostate cancer. Cancer Res, 1996. 56(5): p. 1142-5.
165. Li, C., et al., Identification of two distinct deleted regions on chromosome 13 in prostate cancer. Oncogene, 1998. 16(4): p. 481-7.

166. Melamed, J., J.M. Einhorn, and M.M. Ittmann, Allelic loss on chromosome 13q in human prostate carcinoma. Clin Cancer Res, 1997. 3(10): p. 1867-72.
167. Jacks, T., et al., Effects of an Rb mutation in the mouse. Nature, 1992. 359(6393): p. 295-300.
168. DeCaprio, J.A., et al., SV40 large tumor antigen forms a specific complex with the product of the retinoblastoma susceptibility gene. Cell, 1988. 54(2): p. 275-83.
169. Reich, N.C. and A.J. Levine, Specific interaction of the SV40 T antigen-cellular p53 protein complex with SV40 DNA. Virology, 1982. 117(1): p. 286-90.
170. Greenberg, N.M., et al., Prostate cancer in a transgenic mouse. Proc Natl Acad Sci U S A, 1995. 92(8): p. 3439-43.
171. Kasper, S., et al., Development, progression, and androgen-dependence of prostate tumors in probasin-large T antigen transgenic mice: a model for prostate cancer. Lab Invest, 1998. 78(3): p. 319-33.
172. MacGrogan, D. and R. Bookstein, Tumour suppressor genes in prostate cancer. Semin Cancer Biol, 1997. 8(1): p. 11-9.
173. Burchardt, M., et al., Reduction of wild type p53 function confers a hormone resistant phenotype on LNCaP prostate cancer cells. Prostate, 2001. 48(4): p. 225-30.
174. Downing, S.R., P. Jackson, and P.J. Russell, Mutations within the tumour suppressor gene p53 are not confined to a late event in prostate cancer progression. a review of the evidence. Urol Oncol, 2001. 6(3): p. 103-110.
175. Schlechte, H., et al., p53 tumour suppressor gene mutations in benign prostatic hyperplasia and prostate cancer. Eur Urol, 1998. 34(5): p. 433-40.
176. Donehower, L.A., et al., Mice deficient for p53 are developmentally normal but susceptible to spontaneous tumours. Nature, 1992. 356(6366): p. 215-21.
177. Jacks, T., et al., Tumor spectrum analysis in p53-mutant mice. Curr Biol, 1994. 4(1): p. 1-7.
178. Greenberg, N.M., et al., Prostate cancer in a transgenic mouse. Proc Natl Acad Sci U S A, 1995. 92(8): p. 3439-43.
179. Altschul, S.F., et al., Basic local alignment search tool. J Mol Biol, 1990. 215(3): p. 403-10.
180. Schmitz, C., et al., Comparative study on the clinical use of protein S-100B and MIA (melanoma inhibitory activity) in melanoma patients. Anticancer Res, 2000. 20(6D): p. 5059-63.
181. Zhang, S.X., et al., Immunolocalization of apolipoprotein D, androgen receptor and prostate specific antigen in early stage prostate cancers. J Urol, 1998. 159(2): p. 548-54.
182. Hough, C.D., et al., Coordinately up-regulated genes in ovarian cancer. Cancer Res, 2001. 61(10): p. 3869-76.
183. Kawashima, M., et al., Prostaglandin D synthase (beta-trace) in meningeal hemangiopericytoma. Mod Pathol, 2001. 14(3): p. 197-201.
184. Fujimura, H., et al., Aminopeptidase A expression in cervical neoplasia and its relationship to neoplastic transformation and progression. Oncology, 2000. 58(4): p. 342-52.
185. Supuran, C.T., et al., Carbonic anhydrase inhibitors: sulfonamides as antitumor agents? Bioorg Med Chem, 2001. 9(3): p. 703-14.

186. Helpap, B., J. Kollermann, and U. Oehler, Neuroendocrine differentiation in prostatic carcinomas: histogenesis, biology, clinical relevance, and future therapeutical perspectives. Urol Int, 1999. 62(3): p. 133-138.
187. van Steenbrugge, G.J., et al., The human prostatic cancer cell line LNCaP and its derived sublines: an in vitro model for the study of androgen sensitivity. J Steroid Biochem Mol Biol, 1991. 40(1-3): p. 207-14.
188. Horoszewicz, J.S., et al., The LNCaP cell line--a new model for studies on human prostatic carcinoma. Prog Clin Biol Res, 1980. 37: p. 115-32.
189. Chen, C.T., et al., Androgen-dependent and -independent human prostate xenograft tumors as models for drug activity evaluation. Cancer Res, 1998. 58(13): p. 2777-83.
190. Wainstein, M.A., et al., CWR22: androgen-dependent xenograft model derived from a primary human prostatic carcinoma. Cancer Res, 1994. 54(23): p. 6049-52.
191. Wu, H.C., et al., Derivation of androgen-independent human LNCaP prostatic cancer cell sublines: role of bone stromal cells. Int J Cancer, 1994. 57(3): p. 406-12.
192. Kaighn, M.E., et al., Establishment and characterization of a human prostatic carcinoma cell line (PC-3). Invest Urol, 1979. 17(1): p. 16-23.
193. Stone, K.R., et al., Isolation of a human prostate carcinoma cell line (DU 145). Int J Cancer, 1978. 21(3): p. 274-81.
194. Salti, G.I., et al., Betulinic acid reduces ultraviolet-C-induced DNA breakage in congenital melanocytic naeval cells: evidence for a potential role as a chemopreventive agent. Melanoma Res, 2001. 11(2): p. 99-104.
195. Mehta, R.R., et al., In vitro transformation of human congenital naevus to malignant melanoma. Melanoma Res, 2002. 12(1): p. 27-33.
196. Feucht, K.A., et al., Effect of 17 beta-estradiol on the growth of estrogen receptor-positive human melanoma in vitro and in athymic mice. Cancer Res, 1988. 48(24 Pt 1): p. 7093-101.
197. Rauth, S., et al., Chromosome abnormalities in metastatic melanoma. In Vitro Cell Dev Biol Anim, 1994. 30A(2): p. 79-84.
198. Rauth, S., et al., Establishment of a human melanoma cell line lacking p53 expression and spontaneously metastasizing in nude mice. Anticancer Res, 1994. 14(6B): p. 2457-63.
199. Bigner, D.D., et al., Heterogeneity of Genotypic and phenotypic characteristics of fifteen permanent cell lines derived from human gliomas. J Neuropathol Exp Neurol, 1981. 40(3): p. 201-29.
200. Brooks, S.C., E.R. Locke, and H.D. Soule, Estrogen receptor in a human cell line (MCF-7) from breast carcinoma. J Biol Chem, 1973. 248(17): p. 6251-3.
201. Keydar, I., et al., Establishment and characterization of a cell line of human breast carcinoma origin. Eur J Cancer, 1979. 15(5): p. 659-70.
202. Whitehead, R.H., et al., A colon cancer cell line (LIM1215) derived from a patient with inherited nonpolyposis . J Natl Cancer Inst, 1985. 74(4): p. 759-65.

203. Kochevar, J., A renal cell carcinoma neoplastic antigen detectable by immunohistochemistry is defined by a murine monoclonal antibody. Cancer, 1987. 59(12): p. 2031-6.
204. Liebhaber, H., et al., Recovery of Cytopathic Agents from Patients with Infectious Hepatitis: Isolation and Propagation in Cultures of Human Diploid Lung Fibroblasts (Wi-38). Virology, 1964. 24: p. 109-13.
205. Linsley, P.S. and C.F. Fox, Direct linkage of EGF to its receptor: characterization and biological relevance. J Supramol Struct, 1980. 14(4): p. 441-59.
206. Hsu, T.C., Cytological studies on HeLa, a strain of human cervical carcinoma, I. Observations on and chromosomes. Tex Rep Biol Med, 1954. 12(4): p. 833-46.
207. Lee, M., et al., Epidermal growth factor receptor monoclonal antibodies inhibit the growth of lung cancer cell lines. J Natl Cancer Inst Monogr, 1992(13): p. 117-23.
208. Jones, P.A., W.E. Laug, and W.F. Benedict, Fibrinolytic activity in a human fibrosarcoma cell line and evidence for the induction of plasminogen activator secretion during tumor formation. Cell, 1975. 6(2): p. 245-52.
209. Zhan, X., et al., Human oncogenes detected by a defined medium culture assay. Oncogene, 1987. 1(4): p. 369-76.
210. Aiello, L., et al., Adenovirus 5 DNA sequences present and RNA sequences transcribed in transformed human embryo kidney cells (HEK-Ad-5 or 293). Virology, 1979. 94(2): p. 460-9.
211. Cobb, V.J., et al., Forskolin treatment directs steroid production towards the androgen pathway in the NCI-H295R adrenocortical tumour cell line. Endocr Res, 1996. 22(4): p. 545-50.
212. Yang, Y.H., et al., Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res, 2002. 30(4): p. e15.
213. Eisen, M.B., et al., Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci U S A, 1998. 95(25): p. 14863-8.
214. Horoszewicz, J.S., et al., LNCaP model of human prostatic carcinoma. Cancer Res, 1983. 43(4): p. 1809-18.
215. Hasenson, M., et al., PAP and PSA in prostatic carcinoma cell lines and aspiration biopsies: relation to hormone sensitivity and to cytological grading. Prostate, 1989. 14(2): p. 83-90.
216. Wu, H.C., et al., Derivation of androgen-independent human LNCaP prostatic cancer cell sublines: role of bone stromal cells. Int J Cancer, 1994. 57(3): p. 406-12.
217. Birgersdotter, A., R. Sandberg, and I. Ernberg, Gene expression perturbation in vitro--a growing case for three-dimensional (3D) culture systems. Semin Cancer Biol, 2005. 15(5): p. 405-12.
218. Rozen, S. and H. Skaletsky, Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol, 2000. 132: p. 365-86.
219. Gibson, U.E., C.A. Heid, and P.M. Williams, A novel method for real time quantitative RT-PCR. Genome Res, 1996. 6(10): p. 995-1001.

220. de Kok, J.B., et al., Normalization of gene expression measurements in tumor tissues: comparison of 13 endogenous control genes. Lab Invest, 2005. 85(1): p. 154-9.
221. Rondinelli, R.H., D.E. Epner, and J.V. Tricoli, Increased glyceraldehyde-3-phosphate dehydrogenase gene expression in late pathological stage human prostate cancer. Prostate Cancer Prostatic Dis, 1997. 1(2): p. 66-72.
222. Denmeade, S.R., et al., Dissociation between androgen responsiveness for malignant growth vs. expression of prostate specific differentiation markers PSA, hK2, and PSMA in human prostate cancer models. Prostate, 2003. 54(4): p. 249-57.
223. Heinlein, C.A. and C. Chang, The roles of androgen receptors and androgen-binding proteins in nongenomic androgen actions. Mol Endocrinol, 2002. 16(10): p. 2181-7.
224. Riegman, P.H., et al., The promoter of the prostate-specific antigen gene contains a functional androgen responsive element. Mol Endocrinol, 1991. 5(12): p. 1921-30.
225. Yanai, I., et al., Genome-wide midrange transcription profiles reveal expression level relationships in human tissue specification. Bioinformatics, 2005. 21(5): p. 650-9.
226. Breuza, L., et al., Proteomics of endoplasmic reticulum-Golgi intermediate compartment (ERGIC) membranes from brefeldin A-treated HepG2 cells identifies ERGIC-32, a new cycling protein that interacts with human Erv46. J Biol Chem, 2004. 279(45): p. 47242-53.
227. de Marco, M.C., et al., MAL2, a novel raft protein of the MAL family, is an essential component of the machinery for transcytosis in hepatoma HepG2 cells. J Cell Biol, 2002. 159(1): p. 37-44.
228. Marazuela, M., et al., Expression and distribution of MAL2, an essential element of the machinery for basolateral-to-apical transcytosis, in human thyroid epithelial cells. Endocrinology, 2004. 145(2): p. 1011-6.
229. Marazuela, M., et al., Expression of MAL2, an integral protein component of the machinery for basolateral-to-apical transcytosis, in human epithelia. J Histochem Cytochem, 2004. 52(2): p. 243-52.
230. Wilson, S.H., et al., Identification of MAL2, a novel member of the mal proteolipid family, though interactions with TPD52-like proteins in the yeast two-hybrid system. Genomics, 2001. 76(1-3): p. 81-8.
231. Boutros, R., et al., The tumor protein D52 family: many pieces, many puzzles. Biochem Biophys Res Commun, 2004. 325(4): p. 1115-21.
232. Kristiansen, G., et al., Expression profiling of microdissected matched prostate cancer samples reveals CD166/MEMD and CD24 as new prognostic markers for patient survival. J Pathol, 2005. 205(3): p. 359-76.
233. Nelson, P.S., et al., The program of androgen-responsive genes in neoplastic prostate epithelium. Proc Natl Acad Sci U S A, 2002. 99(18): p. 11890-5.
234. Porter, D., et al., Molecular markers in ductal carcinoma in situ of the breast. Mol Cancer Res, 2003. 1(5): p. 362-75.
235. van 't Veer, L.J., et al., Gene expression profiling predicts clinical outcome of breast cancer. Nature, 2002. 415(6871): p. 530-6.

236. Llorente, A., M.C. de Marco, and M.A. Alonso, -1 and MAL are located on prostasomes secreted by the prostate cancer PC-3 cell line. J Cell Sci, 2004. 117(Pt 22): p. 5343-51.
237. Venables, J.P., Aberrant and alternative splicing in cancer. Cancer Res, 2004. 64(21): p. 7647-54.
238. Faustino, N.A. and T.A. Cooper, Pre-mRNA splicing and human disease. Genes Dev, 2003. 17(4): p. 419-37.
239. Garcia-Blanco, M.A., A.P. Baraniak, and E.L. Lasda, Alternative splicing in disease and therapy. Nat Biotechnol, 2004. 22(5): p. 535-46.
240. Venables, J.P., Unbalanced alternative splicing and its significance in cancer. Bioessays, 2006. 28(4): p. 378-86.
241. Coulson, J.M., et al., A splice variant of the neuron-restrictive silencer factor repressor is expressed in small cell lung cancer: a potential role in derepression of neuroendocrine genes and a useful clinical marker. Cancer Res, 2000. 60(7): p. 1840-4.