US 2017.0191133A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2017/0191133 A1 Davicioni (43) Pub. Date: Jul. 6, 2017

(54) SYSTEMIS AND METHODS FOR Publication Classification EXPRESSION-BASED DISCRIMINATION OF (51) Int. C. DISTINCT CLINICAL DISEASE STATES IN CI2O I/68 (2006.01) PROSTATE CANCER (52) U.S. C. CPC ...... CI2O 1/6886 (2013.01); C12O 2600/112 (71) Applicant: GENOMEDX BIOSCIENCES, INC., (2013.01); C12O 2600/118 (2013.01); C12O Vancouver (CA) 2600/158 (2013.01) (72) Inventor: Elai Davicioni, La Jolla, CA (US) (57) ABSTRACT A system for expression-based discrimination of distinct clinical disease states in prostate cancer is provided that is (21) Appl. No.: 15/092,468 based on the identification of sets of transcripts, which are characterized in that changes in expression of each gene (22) Filed: Apr. 6, 2016 transcript within a set of gene transcripts can be correlated with recurrent or non-recurrent prostate cancer. The Prostate Cancer Prognostic system provides for sets of “prostate Related U.S. Application Data cancer prognostic' target sequences and further provides for combinations of polynucleotide probes and primers derived (63) Continuation of application No. 12/994.408, filed on there from. These combinations of polynucleotide probes Feb. 17, 2011, now abandoned, filed as application can be provided in solution or as an array. The combination No. PCT/CA2009/000694 on May 28, 2009. of probes and the arrays can be used for diagnosis. The (60) Provisional application No. 61/056,827, filed on May invention further provides further methods of classifying 28, 2008. prostate cancer tissue. Patent Application Publication Jul. 6, 2017. Sheet 1 of 9 US 2017/O191133 A1

FG. A Patent Application Publication Jul. 6, 2017. Sheet 2 of 9 US 2017/O191133 A1

& Patent Application Publication Jul. 6, 2017. Sheet 3 of 9 US 2017/O191133 A1

Patent Application Publication Jul. 6, 2017. Sheet 4 of 9 US 2017/O191133 A1

¿?*8). ?0000'0×d

8% ågå

Patent Application Publication Jul. 6, 2017. Sheet 5 of 9 US 2017/O191133 A1

: i. 3. SYS

F.G. 3 Patent Application Publication Jul. 6, 2017. Sheet 6 of 9 US 2017/O191133 A1

&

& wer x : s . 8 SE x 8.

Yes

83888 x 8.3: Patent Application Publication Jul. 6, 2017. Sheet 7 of 9 US 2017/O191133 A1

------:

i s

x & x x : x : 838 kix: Patent Application Publication Jul. 6, 2017. Sheet 8 of 9 US 2017/O191133 A1

:

i

s Patent Application Publication Jul. 6, 2017. Sheet 9 of 9 US 2017/O191133 A1

:

: s

''''''''''''''''''''''''''''''''''''''''''''''''''''1''''''''''''''''''''''''''''''''''''''''''''''''''''''''

838 :x: US 2017/O 191133 A1 Jul. 6, 2017

SYSTEMIS AND METHODS FOR cancer and this has resulted in an increased number of EXPRESSION-BASED DISCRIMINATION OF patients being treated but not significantly decreased the DISTINCT CLINICAL DISEASE STATES IN mortality rate. PROSTATE CANCER 0007. Despite the controversies surrounding PSA testing as a screening tool, most physicians confidently rely on PSA CROSS-REFERENCE TO RELATED testing to assess pre-treatment prognosis and to monitor APPLICATIONS disease progression after initial therapy. Successive increases in PSA levels above a defined threshold value or 0001. This application is a continuation of U.S. applica variations thereof (i.e. Rising-PSA), also known as bio tion Ser. No. 12/994,408 filed Feb. 17, 2011, now aban chemical recurrence has been shown to be correlated to doned, which claimed the benefit of 5 USC S371 National disease progression after first-line therapy (e.g., prostatec Stage application of International Application No. PCT/ tomy, radiation and brachytherapy). However, less than a /3 CA2009/000694 filed May 28, 2009, which claims priority of patients with rising-PSA will eventually be diagnosed benefit of U.S. 61/056,827 filed May 28, 2008, now expired. with systemic or metastatic disease and several studies have The disclosure of each of the prior applications is considered shown that after long-term follow up, the majority will never part of and is incorporated by reference in the disclosure of show any symptoms of disease progression aside from this application. increases in PSA measurement. The limitations of using the PSA biomarker and the absence of additional biomarkers for INCORPORATION OF SEQUENCE LISTING predicting disease recurrence have led to the development of 0002 The material in the accompanying sequence listing statistical models combining several clinical and pathologi is hereby incorporated by reference into this application. The cal features including PSA results. Several of these nomo accompanying sequence listing text file, name GBX1140 2 grams have been shown to improve the predictive power for Sequence Listing..txt, was created on Jun. 22, 2016, and is disease recurrence in individual patients over any single independent variable. These models (see Citation #14) are 744 kb. The file can be assessed using Microsoft Word on a used routinely in the clinic and are currently the best computer that uses Windows OS. available tools for prediction of outcomes, although they do not provide high levels of accuracy for groups of patients FIELD OF THE INVENTION with highly similar histological/pathological features or 0003. This invention relates to the field of diagnostics and those at intermediate risk of disease recurrence after pros in particular to systems and methods for classifying prostate tatectomy. cancer into distinct clinical disease states. 0008. The use of quantitative molecular analyses has the potential to increase the sensitivity, specificity and/or overall BACKGROUND accuracy of disease prognosis and provide a more objective basis for determination of risk stratification as compared to 0004 Prostate cancer is the most common malignancy conventional clinical-pathological risk models (see Citation affecting U.S. men, with approximately 240,000 new cases #13). The PSA test demonstrates the deficiencies of relying diagnosed each year. The incidence of prostate cancer is on the measurement of any single biomarker in clinically increasing, in part due to increased Surveillance efforts from heterogeneous and complex prostate cancer genomes. the application of routine molecular testing Such as prostate Therefore, genomic-based approaches measuring combina specific antigen (PSA). For most men, prostate cancer is a tions of biomarkers or a signature of disease recurrence are slow-growing, organ-confined or localized malignancy that currently being investigated as better Surrogates for predict poses little risk of death. The most common treatments for ing disease outcome (see Citations # 1-13). For prostate prostate cancer in the U.S. are Surgical procedures such as cancer patients these efforts are aimed at reducing the radical prostatectomy, where the entire prostate is removed number of unnecessary Surgeries for patients without pro from the patient. This procedure on its own is highly curative gressive disease and avoid inadvertent under-treatment for for most but not all men. higher risk patients. To date, genomic profiling efforts to 0005. The vast majority of deaths from prostate cancer identify DNA-based (e.g., copy-number alterations, meth occur in patients with metastasis, believed to be present ylation changes), RNA-based (e.g., gene or non-coding already at the time of diagnosis in the form of clinically RNA expression) or -based (e.g., protein expression undetectable micro-metastases. In these patients, it is clear or modification) signatures, useful for disease prognosis that prostatectomy alone is not curative and additional have not however resulted in widespread clinical use. therapies such as anti-androgen or radiation therapy are 0009. There are several key reasons explaining why prior required to control the spread of disease and extend the life genomic profiling methods for prostate cancer have not yet of the patient. been incorporated in the clinic. These include the small 0006 Most prostatectomy patients however face uncer sample sizes typical of individual studies, coupled with tainty with respect to their prognosis after Surgery: whether variations due to differences in study protocols, clinical or not the initial surgery will be curative several years from heterogeneity of patients and lack of external validation the initial treatment because the current methods for assess data, which combined have made identifying a robust and ment of the clinical risk Such as the various pathological reproducible disease signature elusive. Specifically for gene (e.g., tumor stage), histological (e.g., Gleasons), clinical or RNA expression based prognostic models; the mitigating (e.g., Kattan nomogram) and molecular biomarkers (e.g., technological limitations include the quality and quantity of PSA) are not reliable predictors of prognosis, specifically RNA that can be isolated from routine clinical samples. disease progression. Routine PSA testing has certainly Routine clinical samples of prostate cancer include needle increased Surveillance and early-detection rates of prostate biopsies and Surgical resections that have been fixed in US 2017/O 191133 A1 Jul. 6, 2017

formalin and embedded in paraffin wax (FFPE). FFPE expression level of said one or more nucleic acid sequences derived RNA is typically degraded and fragmented to with the disease state of prostate cancer tissue. between 100-300 by in size and without poly-A tails making 0016. In accordance with another aspect of the present it of little use for traditional 3'-biased invention, there is provided an array of probe nucleic acids profiling, which requires larger microgram quantities of certified for use in expression-based assessment of prostate RNA with intact poly-A tails to prime cDNA synthesis. cancer recurrence risk, wherein said array comprises at least 0010 Furthermore, as <2% of the genome encodes for two different probe nucleic acids that specifically hybridize protein, traditional gene expression profiling in fact captures to corresponding different target nucleic acids depicted in only a small fraction of the transcriptome and variation in one of SEQ ID NOs: 1-2114, an RNA form thereof, or a expression as most RNA molecules that are transcribed are complement to either thereof. not translated into protein but serve other functional roles 0017. In accordance with another aspect of the present and non-coding RNAS are the most abundant transcript invention, there is provided a device for classifying a species in the genome. biological sample from a prostate cancer as recurrent or 0011. This background information is provided for the non-recurrent, the device comprising means for measuring purpose of making known information believed by the the expression level of one or more transcripts of one or applicant to be of possible relevance to the present inven more selected from the group of genes set forth in tion. No admission is necessarily intended, nor should be Table 3 and/or 6; means for correlating the expression level construed, that any of the preceding information constitutes with a classification of prostate cancer status; and means for prior art against the present invention. outputting the prostate cancer status. 0018. In accordance with another aspect of the present SUMMARY OF THE INVENTION invention, there is provided a computer-readable medium comprising one or more digitally-encoded expression pat 0012. An object of the present invention is to provide systems and methods for expression-based discrimination of tern profiles representative of the level of expression of one distinct clinical disease states in prostate cancer. In accor or more transcripts of one or more genes selected from the dance with one aspect of the present invention, there is group of genes set forth in Table 3 and/or 6, each of said one provided a system for expression-based assessment of risk or more expression pattern profiles being associated with a of prostate cancer recurrence after prostatectomy, said sys value wherein each of said values is correlated with the tem comprising one or more polynucleotides, each of said presence of recurrent or non-recurrent prostate cancer. polynucleotides capable of specifically hybridizing to a RNA transcript of a gene selected from the group of genes BRIEF DESCRIPTION OF THE DRAWINGS set forth in Table 3 and/or 6. 0019. These and other features of the invention will 0013. In accordance with another aspect of the present become more apparent in the following detailed description invention, there is provided a nucleic acid array for expres in which reference is made to the appended drawings. Sion-based assessment of prostate cancer recurrence risk, 0020 FIGS. 1A-C. A.) Principle components analysis said array comprising at least ten probes immobilized on a (PCA) of 2,114 RNAs identified to be differentially solid support, each of said probes being between about 15 expressed between tumors from patients with differing clini and about 500 nucleotides in length, each of said probes cal outcome (see Table 2 for 20 comparisons evaluated), being derived from a sequence corresponding to, or comple PCA plot of 22 prostate cancer tumors shows tight clustering mentary to, a transcript of a gene selected from the group of of samples by clinical outcome of patients (circles, NED: genes set forth in Table 3 and/or 6, or a portion of said diamonds, PSA; squares, SYS). B) Two-way hierarchical transcript. clustering dendrogram and expression matrix of 526 target 0014. In accordance with another aspect of the present sequences (Table 4) RNAS filtered using linear regression invention, there is provided a method for expression-based (p<0.01) to identify RNAs that followed either assessment of prostate cancer recurrence, said method com SYS)PSA>NED or NEDDPSA>SYS trend in differential prising: (a) determining the expression level of one or more expression. C) Two-way hierarchical clustering dendrogram transcripts of one or more genes in a test sample obtained and expression matrix of 148 target sequences (Table 5), a from said subject to provide an expression pattern profile, subset of the most differentially expressed transcripts said one or more genes selected from the group of genes set between patients with clinically significant recurrent (i.e., forth in Table 3 and/or 6, and (c) comparing said expression SYS) and non-recurrent (i.e., PSA and NED) disease pattern profile with a reference expression pattern profile. as filtered using a t-test (p<0.001). For B) and C), sample 0015. In accordance with another aspect of the present and RNAs were optimally ordered using Pearson’s correla invention, there is provided a kit for characterizing the tion distance metric with complete-linkage cluster distances expression of one or more nucleic acid sequences depicted and the expression of each RNA in each sample was in SEQ ID NOs: 1-2114 comprising one or more nucleic normalized in the heatmap by the number of standard acids selected from (a) a nucleic acid depicted in any of SEQ deviations above (blacker) and below (whiter) the median ID NOs: 1-2114; (b) an RNA form of any of the nucleic expression value (grey) across all samples. acids depicted in SEQID NOs: 1-2114; (c) a peptide nucleic 0021 FIG. 2. Histograms showing distribution patients acid form of any of the nucleic acids depicted in SEQ ID tumor expression levels of a metagene generated from a NOs: 1-2114; (d) a nucleic acid comprising at least 20 linear combination of the 526 RNAs for each clinical group. consecutive bases of any of (a-c); (e) a nucleic acid com The histograms bin Samples with similar metagene expres prising at least 25 consecutive bases having at least 90% sion values and significantly separate three modes of patient sequence identity to any of (a-c); or (f) a complement to any metagene scores (ANOVA, p<0.000001) corresponding to of (a-e); and optionally instructions for correlating the the three clinical status groups evaluated. US 2017/O 191133 A1 Jul. 6, 2017

0022 FIG. 3. Scatter plots summarizing the mean (tstan biochemical relapse (two successive increases in prostate dard deviation) of metagene expression values for tumor specific antigen levels; (PSA) and systemic prostate cancer samples from patients in the three clinical status groups systemic metastases (SYS). (NED; PSA; SYS). Metagenes were generated from a linear 0029. In accordance with one embodiment of the inven combinations of 6 (A), 18 (0) or 20 () RNAs and tion, the target sequences comprised by the prostate cancer demonstrate highly significant differential expression prognostic set are sequences based on or derived from the between clinical groups (ANOVA, p<0.000001). gene transcripts from the library, or a subset thereof. Such 0023 FIG. 4. Box plots showing interquartile range and sequences are occasionally referred to herein as “probe distribution of POP scores for each clinical group using an selection regions' or “PSRs.” In another embodiment of the 18-target sequence metagene (Table 7) to derive patient invention, the target sequences comprised by the prostate outcome predictor scores scaled and normalized on a data classification set are sequences based on the gene transcripts range of 0-100 points. T-tests were used to evaluate the from the library, or a subset thereof, and include both coding statistical significance of differences in POP scores between and non-coding sequences. NED and PSA (*) as well as between PSA and SYS (**) 0030. In one embodiment, the systems and methods clinical groups (p<7x107 and p<1-10, respectively). provide for the molecular analysis of the expression levels of 0024 FIG. 5. Box plots showing interquartile range and one or more of the target sequences as set forth in SEQ ID distribution of POP scores for each clinical group using a NOs: 1-2114 (Table 4). Increased relative expression of one 10-target sequence metagene (Table 9) to derive patient or more target sequences in a NED Group corresponding outcome predictor scores scaled and normalized on a data to the sequences as set forth in SEQ ID NOs: 1-913 is range of 0-100 points. T-tests were used to evaluate the indicative of or predictive of a non-recurrent form of pros statistical significance of differences in POP scores between tate cancer and can be correlated with an increased likeli recurrent (i.e., SYS) and non-recurrent (i.e., PSA and hood of a long-term NED prognosis or low risk of prostate NED) patient groups (**, p<4x10'). cancer recurrence. Increased relative expression of one or 0025 FIG. 6. Box plots showing interquartile range and more target sequences in a SYS Group corresponding to distribution of POP scores for each clinical group using a the sequences as set forth in SEQ ID NOs: 914-2114 is 41-target sequence metagene (Table 10) to derive patient indicative of or predictive of an aggressive form of prostate outcome predictor scores scaled and normalized on a data cancer and can be correlated with an increased likelihood of range of 0-100 points. T-tests were used to evaluate the a long-term SYS prognosis or high risk of prostate cancer statistical significance of differences in POP scores between recurrence. Optionally, intermediate relative levels of one or recurrent (i.e., SYS) and non-recurrent (i.e., PSA and more target sequences in a PSA Group corresponding to NED) patient groups (**, p<2x10'). target sequences set forth in Table 7 is indicative of or 0026 FIG. 7. Box plots showing interquartile range and predictive of biochemical recurrence. Subsets and combina distribution of POP scores for each clinical group using a tions of these target sequences or probes complementary 148-target sequence metagene to derive patient outcome thereto may be used as described herein. predictor scores scaled and normalized on a data range of 0031. Before the present invention is described in further 0-100 points. T-tests were used to evaluate the statistical detail, it is to be understood that this invention is not limited significance of differences in POP scores between recur to the particular methodology, compositions, articles or rent (i.e., SYS) and non-recurrent (i.e., PSA and NED) machines described, as Such methods, compositions, articles patient groups (**, p<9x10'). or machines can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of DETAILED DESCRIPTION OF THE describing particular embodiments only, and is not intended INVENTION to limit the scope of the present invention. 0027. The present invention provides a system and method for assessing prostate cancer recurrence risk by Definitions distinguishing clinically distinct disease states in men with 0032 Unless defined otherwise or the context clearly prostate cancer at the time of initial diagnosis or Surgery. The dictates otherwise, all technical and Scientific terms used system and methods are based on the identification of gene herein have the same meaning as commonly understood by transcripts following a retrospective analysis of tumor one of ordinary skill in the art to which this invention samples that are differentially expressed in prostate cancer in belongs. In describing the present invention, the following a manner dependent on prostate cancer aggressiveness as terms will be employed, and are intended to be defined as indicated by long-term post-prostatectomy clinical outcome. indicated below. These gene transcripts can be considered as a library which 0033. The term “polynucleotide' as used herein refers to can be used as a resource for the identification of sets of a polymer of greater than one nucleotide in length of specific target sequences (“prostate cancer prognostic sets’), ribonucleic acid (RNA), deoxyribonucleic acid (DNA), which may represent the entire library of gene transcripts or hybrid RNA/DNA, modified RNA or DNA, or RNA or DNA a subset of the library and the detection of which is indica mimetics, including peptide nucleic acids (PNAs). The tive of prostate cancer recurrence risk. The invention further polynucleotides may be single- or double-stranded. The provides for probes capable of detecting these target term includes polynucleotides composed of naturally-occur sequences and primers that are capable of amplifying the ring nucleobases, Sugars and covalent internucleoside (back target Sequences. bone) linkages as well as polynucleotides having non 0028. In accordance with one embodiment of the inven naturally-occurring portions which function similarly. Such tion, the system and method for assessing prostate cancer modified or substituted polynucleotides are well-known in recurrence risk are prognostic for a post Surgery clinical the art and for the purposes of the present invention, are outcome selected from no evidence of disease (NED), referred to as “analogues.” US 2017/O 191133 A1 Jul. 6, 2017

0034 “Complementary' or “substantially complemen 0040 “Having is an open ended phrase like “compris tary” refers to the ability to hybridize or between ing” and “including,” and includes circumstances where nucleotides or nucleic acids, such as, for instance, between additional elements are included and circumstances where a sensor peptide nucleic acid or polynucleotide and a target they are not. polynucleotide. Complementary nucleotides are, generally, 0041) “Optional or “optionally’ means that the subse A and T (or A and U), or C and G. Two single-stranded quently described event or circumstance may or may not polynucleotides or PNAS are said to be substantially occur, and that the description includes instances where the complementary when the bases of one strand, optimally event or circumstance occurs and instances in which it does aligned and compared and with appropriate insertions or not deletions, pair with at least about 80% of the bases of the 0042. As used herein NED describes a clinically distinct other strand, usually at least about 90% to 95%, and more disease state in which patients show no evidence of disease preferably from about 98 to 100%. (NED) at least 5 years after surgery, TSA describes a 0035 Alternatively, substantial complementarity exists clinically distinct disease state in which patients show when a polynucleotide will hybridize under selective hybrid biochemical relapse only (two Successive increases in pros ization conditions to its complement. Typically, selective tate-specific antigen levels but no other symptoms of disease hybridization will occur when there is at least about 65% with at least 5 years follow up after surgery; PSA) and complementarity over a stretch of at least 14 to 25 bases, for SYS describes a clinically distinct disease state in which example at least about 75%, or at least about 90% comple patients develop biochemical relapse and present with sys mentarity. See, M. Kanehisa Nucleic Acids Res. 12:203 temic prostate cancer disease or metastases (SYS) within (1984). five years after the initial treatment with radical prostatec 0036 “Preferential binding” or “preferential hybridiza tomy. tion” refers to the increased propensity of one polynucle 0043. As used herein, the term “about” refers to approxi otide to bind to its complement in a sample as compared to mately a +/-10% variation from a given value. It is to be a noncomplementary polymer in the sample. understood that such a variation is always included in any 0037 Hybridization conditions will typically include salt given value provided herein, whether or not it is specifically concentrations of less than about 1M, more usually less than referred to. about 500 mM, for example less than about 200 mM. In the 0044) Use of the singular forms “a,” “an,” and “the case of hybridization between a peptide nucleic acid and a include plural references unless the context clearly dictates polynucleotide, the hybridization can be done in solutions otherwise. Thus, for example, reference to “a polynucle containing little or no salt. Hybridization temperatures can otide' includes a plurality of polynucleotides, reference to “a be as low as 5°C., but are typically greater than 22°C., and target includes a plurality of Such targets, reference to “a more typically greater than about 30° C., for example in normalization method’ includes a plurality of such methods, excess of about 37° C. Longer fragments may require higher and the like. Additionally, use of specific plural references, hybridization temperatures for specific hybridization as is such as “two.” “three,” etc., read on larger numbers of the known in the art. Other factors may affect the stringency of same Subject, unless the context clearly dictates otherwise. hybridization, including base composition and length of the 0045 Terms such as “connected,” “attached,” “linked complementary Strands, presence of organic solvents and and "conjugated are used interchangeably herein and extent of base mismatching, and the combination of param encompass direct as well as indirect connection, attachment, eters used is more important than the absolute measure of linkage or conjugation unless the context clearly dictates any one alone. Other hybridization conditions which may be otherwise. controlled include buffer type and concentration, solution 0046 Where a range of values is recited, it is to be pH, presence and concentration of blocking reagents to understood that each intervening integer value, and each decrease background binding Such as repeat sequences or fraction thereof, between the recited upper and lower limits blocking protein Solutions, detergent type(s) and concentra of that range is also specifically disclosed, along with each tions, molecules such as polymers which increase the rela Subrange between such values. The upper and lower limits tive concentration of the polynucleotides, metal ion(s) and of any range can independently be included in or excluded their concentration(s), chelator(s) and their concentrations, from the range, and each range where either, neither or both and other conditions known in the art. limits are included is also encompassed within the invention. 0038 “Multiplexing herein refers to an assay or other Where a value being discussed has inherent limits, for analytical method in which multiple analytes can be assayed example where a component can be present at a concentra simultaneously. tion of from 0 to 100%, or where the pH of an aqueous 0039. A “target sequence' as used herein (also occasion Solution can range from 1 to 14, those inherent limits are ally referred to as a “PSR' or “probe selection region') specifically disclosed. Where a value is explicitly recited, it refers to a region of the genome against which one or more is to be understood that values which are about the same probes can be designed. As used herein, a probe is any quantity or amount as the recited value are also within the polynucleotide capable of selectively hybridizing to a target Scope of the invention, as are ranges based thereon. Where sequence or its complement, or to an RNA version of either. a combination is disclosed, each Subcombination of the A probe may comprise ribonucleotides, deoxyribonucle elements of that combination is also specifically disclosed otides, peptide nucleic acids, and combinations thereof. A and is within the scope of the invention. Conversely, where probe may optionally comprise one or more labels. In some different elements or groups of elements are disclosed, embodiments, a probe may be used to amplify one or both combinations thereof are also disclosed. Where any element Strands of a target sequence or an RNA form thereof, acting of an invention is disclosed as having a plurality of alter as a sole primer in an amplification reaction or as a member natives, examples of that invention in which each alternative of a set of primers. is excluded singly or in any combination with the other US 2017/O 191133 A1 Jul. 6, 2017

alternatives are also hereby disclosed; more than one ele 001013671); NM_001033517; NM 183049; NM ment of an invention can have Such exclusions, and all 212559: 5'-3' exoribonuclease 1: A kinase (PRKA) anchor combinations of elements having such exclusions are hereby protein (yotiao) 9; AarF domain containing kinase 4: Abhy disclosed. drolase domain containing 3; 0047 Prostate Cancer Prognostic System 0054 Aconitase 1, soluble; Actinin, alpha 1: ADAM 0048. The system of the present invention is based on the metallopeptidase domain 19 (meltrin beta); Adaptor-related identification of a library of gene and RNA transcripts that protein complex 1, gamma 2 subunit, Adenosine deaminase, are differentially expressed in prostate cancer in a manner RNA-specific, B2 (RED2 homolog rat); Adenylate cyclase dependent on prostate cancer aggressiveness as indicated by 3; ADP-ribosylation factor GTPase activating protein 3: the post-prostatectomy clinical outcome of the patient. For ADP-ribosylation factor guanine nucleotide-exchange fac example, relative over expression of one or more of the gene tor 2 (brefeldin A-inhibited); ADP-ribosylation factor-like transcripts in a prostate cancer Sample compared to a refer 4D: Adrenergic, beta, receptor kinase 2: AF4/FMR2 family, ence sample or expression profile or signature there from member 3; Amyloid beta (A4) precursor protein-binding, may be prognostic of a clinically distinct disease outcome family B, member 1 (Fe(55): Anaphase promoting complex post-prostatectomy selected from no evidence of disease subunit 1: Ankyrin 3, node of Ranvier (ankyrin G); Ankyrin (NED), biochemical relapse (PSA) and prostate cancer repeat domain 15: Ankyrin repeat domain 28; Annexin Al; disease systemic recurrence or metastases (SYS). The Annexin A2; Anterior pharynx defective 1 homolog B (C. reference sample can be, for example, from prostate cancer elegans):Anthrax toxin receptor 1; Antizyme inhibitor 1; sample(s) of one or more references Subject(s) with a known Arachidonate 12-lipoxygenase, 12R type; Arginine vaso post-prostatectomy clinical outcomes. The reference expres pressin receptor 1A: Arginine-glutamic acid dipeptide (RE) sion profile or signature may optionally be normalized to repeats; ARP3 actin-related protein 3 homolog (yeast); one or more appropriate reference gene transcripts. Alter Arrestin 3, retinal (X-arrestin); Arrestin domain containing natively or in addition to, expression of one or more of the 1; Aryl hydrocarbon receptor interacting protein-like 1; Aryl gene transcripts in a prostate cancer sample may be com hydrocarbon receptor nuclear translocator; Ataxin 1: ATM/ pared to an expression profile or signature from normal ATR-Substrate Chk2-Interacting Zn2+-finger protein; prostate tissue. ATPase, Class I, type 8B, member 1: ATPase, Na+/K+ 0049 Expression profiles or signatures from prostate transporting, alpha 1 polypeptide; ATP-binding cassette, cancer samples may be normalized to one or more house sub-family F (GCN20), member 1: Autism susceptibility keeping gene transcripts such that normalized over and/or candidate 2: Baculoviral IAP repeat-containing 6 (apollon); under expression of one or more of the gene transcripts in a Basonuclin 2: Brain-specific angiogenesis inhibitor 3: Bro sample may be indicative of a clinically distinct disease state modomain containing 7: Bromodomain containing 8: Bro or prognosis. modomain PHD finger transcription factor; BTB (POZ) 0050. Prostate Prognostic Library domain containing 16: -BTB (POZ) domain containing 7; 0051. The Prostate Prognostic Library in accordance with Calcium activated nucleotidase 1, Calcium binding protein the present invention comprises one or more gene or RNA P22; Calcium channel, Voltage-dependent, beta 4 subunit; transcripts whose relative and/or normalized expression is Calcium channel, Voltage-dependent, L type, alpha 1 C Sub indicative of prostate cancer recurrence and which may be unit, Calcium channel, Voltage-dependent, L type, alpha 1D prognostic for post-prostatectomy clinical outcome of a Subunit; Calcyclin binding protein; Calmodulin 1 (phospho patient. Exemplary RNA transcripts that showed differential rylase kinase, delta); Calsyntenin 1; Carbonyl reductase 3: expression in prostate cancer samples from patients with Cardiolipin synthase 1: Carnitine palmitoyltransferase 1A clinically distinct disease outcomes after initial treatment (liver); Casein kinase 1, delta; Casein kinase 1, gamma 1: with radical prostatectomy are shown in Table 3. In one Casein kinase 1, gamma 3: Caspase 1, apoptosis-related embodiment of the invention, the library comprises one or cysteine peptidase (interleukin 1, beta, convertase); CD109 more of the gene transcripts of the genes listed in Table 3. molecule; CD99 molecule-like 2: CDKS regulatory subunit 0052. In one embodiment, the library comprises at least associated protein 2: CDP-diacylglycerol synthase (phos one transcript from at least one gene selected from those phatidate cytidylyltransferase) 2: Cell adhesion molecule 1: listed in Table 3. In one embodiment, the library comprises Cell division cycle and apoptosis regulator 1: Centrosomal at least one transcript from each of at least 5 genes selected protein 70 kDa; Chloride channel 3: Chromodomain heli from those listed in Table 3. In another embodiment, the case DNA binding protein 2: Chromodomain helicase DNA library comprises at least one transcript from each of at least binding protein 6: Chromodomain protein, Y-like 2: Chro 10 genes selected from those listed in Table 3. In a further mosome 1 ORF 116; 1 ORF52; Chromosome embodiment, the library comprises at least one transcript 10 ORF 118: ORF 30; Chromosome 13 from each of at least 15 genes selected from those listed in ORF23; Chromosome 16 ORF 45: Chromosome 18 ORF 1: Table 1. In other embodiments, the library comprises at least Chromosome 18 ORF 1: Chromosome 18 ORF 1: Chromo one transcript from each of at least 20, at least 25, at least 30, some 18 ORF 1: Chromosome 18 ORF 17: Chromosome 2 at least 35, at least 40, at least 45, at least 50, at least 55, at ORF 3: Chromosome 20 ORF 133; Chromosome 21 ORF least 60 and at least 65 genes selected from those listed in 25; Chromosome 21 ORF 34; Chromosome 22 ORF 13: Table 3. In a further embodiment, the library comprises at Chromosome 3 ORF 26; ORF3; Chromo least one transcript from all of the genes listed in Table 3. In some 5 ORF 33: Chromosome 5 ORF 35; Chromosome 5 a further embodiment, the library comprises at all transcripts ORF 39; ORF 13; Chromosome 7 ORF 42: from all of the genes listed in Table 3. Chromosome 9 ORF 3: Chromosome 9 ORF 94; Chromo 0053. In one embodiment, the library comprises at least some Y ORF 15B; Chymase 1, mast cell; Citrate lyase beta one transcript from at least one gene selected from the group like; Class II, major histocompatibility complex, transacti consisting of NM_001004722); NM 001005522); NM vator; C-Maf-inducing protein; Coatomer protein complex, US 2017/O 191133 A1 Jul. 6, 2017

subunit alpha; Cofilin 2 (muscle); Coiled-coil domain con C1 (VP16-accessory protein); Hyaluronan binding protein taining 50: Coiled-coil domain containing 7: Coiled-coil 4; Hyperpolarization activated cyclic nucleotide-gated helix-coiled-coil-helix domain containing 4: Cold shock potassium channel 3: Hypothetical gene Supported by domain containing El, RNA-binding; Collagen, type XII. AK128346; Hypothetical LOC51149; Hypothetical protein alpha 1: Complement component 1, r Subcomponent-like; FLJ12949; Hypothetical protein FLJ20035: Hypothetical Core-binding factor, runt domain, alpha Subunit 2; translo protein FLJ20309: Hypothetical protein FLJ38482: Hypo cated to. 2: CREB binding protein (Rubinstein-Taybi syn thetical protein HSPC148: Hypothetical protein drome); CTD (carboxy-terminal domain, RNA polymerase LOC130576; Hypothetical protein LOC285908: Hypotheti II, polypeptide A) small phosphatase 2: CTD (carboxy cal protein LOC643 155: Iduronidase, alpha-L-IKAROS terminal domain, RNA polymerase II, polypeptide A) small family Zinc finger 1 (Ikaros); IlvE (bacterial acetolactate phosphatase-like: CUG triplet repeat, RNA binding protein synthase)-like; Inhibitor of kappa light polypeptide gene 2: Cullin 3: Cut-like 2: Cyclin F: Cyclin Y: Cysteine-rich enhancer in B-cells, kinase beta; Inositol polyphosphate-4- with EGF-like domains 1: Cytochrome P450, family 4, phosphatase, type II; Integrin, alpha 4 (antigen CD49D. subfamily F, polypeptide 11; Cytoplasmic FMR1 interacting alpha 4 subunit of VLA-4 receptor); Integrin, alpha 6; protein 2: DAZ interacting protein 1-like; DCP2 decapping Integrin, alpha 9; Integrin, alpha V (vitronectin receptor, enzyme homolog (S. cerevisiae); DEAD box polypeptide alpha polypeptide, antigen CD51); Integrin, beta-like 1 (with 47: DEAD box polypeptide 5: DEAD box polypeptide 52: EGF-like repeat domains); Inter-alpha (globulin) inhibitor DEAD box polypeptide 56: Death inducer-obliterator 1: H3; Interleukin enhancer binding factor 3, 90kDa. Intestine Dedicator of cytokinesis 2: DEP domain containing 1B: specific homeobox DEP domain containing 2: DEP domain containing 6: Development and differentiation enhancing factor 1; Dia 0057 Intraflagellar transport 172 homolog; Janus kinase cylglycerol lipase, alpha; Diaphanous homolog 2 (Droso 1; Jumonji domain containing 1B: Jumonji domain contain phila); Dickkopf homolog 3: Dihydropyrimidine dehydro ing 2B: Jumonji domain containing 2C; Kalirin, RhoGEF genase; Dipeptidyl peptidase 10; Discs, large homolog 2. kinase; Kallikrein-related peptidase 2: Karyopherin alpha 3 chapsyn-110; Dishevelled, dsh homolog 2: DnaJ (Hsp40) (importin alpha 4); Keratinocyte associated protein 2: homolog, subfamily C, member 6; Dpy-19-like 3: Dual KIAA0152; KIAA0241; KIAA0319-like; KIAA0495; specificity phosphatase 5; Ectodysplasin A receptor; Ecto KIAA0562; KIAA0564 protein; KIAA1217; KIAA1244: nucleoside triphosphate diphosphohydrolase 7: EGFR-co KIAA 1244: La ribonucleoprotein domain family, member 1: amplified and overexpressed protein; ELL associated factor Lamin A/C, LATS, large tumor Suppressor, homolog 2; 1: Emopamil binding protein (sterol isomerase); Enabled Leiomodin 3 (fetal); Leptin receptor overlapping transcript homolog: Ephrin-A5; ER lipid raft associated 1; Erythro like 1; Leucine rich repeat containing 16; Leucine-rich blast membrane-associated protein (Scianna blood group); repeat kinase 1: Leucine-rich repeat-containing G protein Erythrocyte membrane protein band 4.1 like 4A; Etoposide coupled receptor 4: LIM domain 7: Major histocompatibility induced 2.4 mRNA; Eukaryotic translation initiation factor complex, class II, DR beta 1: Malignant fibrous histiocy 4E family member 3: FAD1 flavin adenine dinucleotide toma amplified sequence 1; Maltase-glucoamylase (alpha synthetase homolog; Family with sequence similarity 110, glucosidase); Mannosidase, alpha, class 2A, member 1, member A. Family with sequence similarity 114, member Mannosyl (alpha-1,6-)-glycoprotein beta-1,6-N-acetyl-glu A1: Family with sequence similarity 135, member A. Fam cosaminyltransferase: MBD2-interacting zinc finger; Mela ily with sequence similarity 40, member A. Family with nin-concentrating hormone receptor 1; Methionyl-tRNA sequence similarity 80, member B; F-box and leucine-rich synthetase; Methyl CpG binding protein 2: Methyl-CpG repeat protein 11; F-box and leucine-rich repeat protein 7: binding domain protein 5: Microcephaly, primary autosomal F-box protein 2; Ferritin, heavy polypeptide 1: Fibronectin recessive 1; Microseminoprotein, beta-; Microtubule-asso type III domain containing 3A: Fibronectin type III domain ciated protein 1B: Microtubule-associated protein 2: containing 3B; Fibulin 1: FLJ25476 protein: FLJ41603 Minichromosome maintenance complex component 3 asso protein; Forkhead box J3; Forkhead box J3; Forkhead box ciated protein; Mitochondrial ribosomal protein S15; K1; Forkhead box P1: Frizzled homolog 3; Frizzled Mohawk homeobox; Monooxygenase, DBH-like 1; MORN homolog 5: G protein-coupled receptor kinase interactor 2; repeat containing 1; Muscle RAS oncogene homolog; Mus GABAA receptor, delta; GATA binding protein 2: GDNF cleblind-like; Myelin protein zero-like 1; Myeloid/lymphoid family receptor alpha 2: Gelsolin (amyloidosis, Finnish or mixed-lineage leukemia (trithorax homolog, Drosophila); translocated to. 4; Myocyte enhancer factor 2B: Myosin IF: type); Genethonin 1: Glucose phosphate isomerase; Glu Myosin, heavy chain 3, skeletal muscle, embryonic; cose-fructose oxidoreductase domain containing 1: Glucosi N-acetylgalactosaminidase, alpha-, N-acetylglucosamine-1- dase, beta (bile acid) 2: Glutamate dehydrogenase 1: Glu phosphate transferase, alpha and beta Subunits; Nascent taminase; Glutamyl aminopeptidase (aminopeptidase A); polypeptide-associated complex alpha subunit; NECAP Glutathione reductase endocytosis associated 2; Necdin homolog; Neural precur 0055 Glycogen synthase kinase 3 beta; Grainyhead-like sor cell expressed, developmentally down-regulated 9; Neu 2: Gremlin 1, cysteine knot superfamily, homolog; GTPase regulin 1: Neuron navigator 1: Nibrin; Nicotinamide activating protein (SH3 domain) binding protein 1; Hairy/ N-methyltransferase: NIMA (never in mitosis gene a)-re enhancer-of-split related with YRPW motif 2: Heparan lated kinase 6: NLR family, CARD domain containing 5: sulfate 6-O-sulfotransferase 3: Hermansky-Pudlak syn NOL1/NOP2/Sun domain family, member 3: NOL1/NOP2/ drome 5: Heterogeneous nuclear ribonucleoprotein C (C1/ Sun domain family, member 3: NOL1/NOP2/Sun domain C2) family, member 6: Nuclear receptor coactivator 2; Nuclear 0056 Hippocalcin-like 1; Histocompatibility (minor) 13: receptor coactivator 6:Nuclear receptor Subfamily 2, group Histone cluster 1, H3d; Histone deacetylase 6: Homeobox F, member 2; Nuclear receptor subfamily 3, group C, A1; Homeobox and leucine Zipper encoding; Host cell factor member 2; Nuclear receptor Subfamily 4, group A, member US 2017/O 191133 A1 Jul. 6, 2017

2: Nuclear transcription factor, X-box binding-like 1: threonine kinase 32A; Serine/threonine kinase 32C. SGT1, Nucleolar and coiled-body phosphoprotein 1: Overex suppressor of G2 allele of SKP1; SH3 and PX domains 2A; pressed in colon carcinoma-1: PANS polyA specific ribo Signal peptide peptidase 3: Signal transducer and activator nuclease subunit homolog, PAP associated domain contain of transcription 1, 91 kDa. Single-stranded DNA binding ing 1; Paraoxonase 2; Paraspeckle component 1; PCTAIRE protein 2: Small nuclear ribonucleoprotein polypeptide N: protein kinase 2: Peptidase D; Pericentrin (kendrin); Per SNF8, ESCRT-II complex subunit, homolog: Sodium chan oxisomal biogenesis factor 19; PHD finger protein 8: Phos nel, Voltage-gated, type III, alpha Subunit; Solute carrier phatidic acid phosphatase type 2 domain containing 3; family 1 (neutral amino acid transporter), member 5; Solute Phosphatidylinositol 4-kinase, catalytic, alpha polypeptide; carrier family 16, member 7 (monocarboxylic acid trans Phosphatidylinositol glycan anchor biosynthesis, class O: porter 2); Solute carrier family 2 (facilitated glucose trans Phosphatidylinositol transfer protein, beta; Phosphodies porter), member 11; Solute carrier family 2 (facilitated terase 4D. cAMP-specific: Phosphoglucomutase 5: Phos glucose transporter), member 11, Solute carrier family 24 phoglycerate mutase family member 5: Phosphoinositide-3- (sodium/potassium/calcium exchanger), member 3: Solute kinase, class 2, beta polypeptide; Phospholipase A2, group carrier family 3 (activators of dibasic and neutral amino acid IVB (cytosolic); Phospholipase C, beta 1 (phosphoinositide transport), member 2: Solute carrier family 30 (zinc trans specific); Phospholipase C, gamma 2 (phosphatidylinositol porter), member 6: Solute carrier family 39 (zinc trans specific):Phosphorylase kinase, beta; Plasminogen activator, porter), member 10; Solute carrier family 43, member 1: tissue, Platelet-activating factor acetylhydrolase, isoform Solute carrier family 9 (sodium/hydrogen exchanger), mem 1b, alpha subunit 45kDa. Pleckstrin homology domain con ber 3 regulator 2; SON DNA binding protein; Sortilin taining, family A (phosphoinositide binding specific) mem related VPS10 domain containing receptor 3: Sparcfos ber 3: Pleckstrin homology domain containing, family A teonectin, cwcV and kazal-like domains proteoglycan member 7: Pleckstrin homology domain containing, family (testican) 2; Spectrin repeat containing, nuclear envelope 1; G (with RhoGef domain) member 3: Pleckstrin homology Sperm associated antigen 9: Splicing factor 3a, Subunit 2. domain containing, family H (with MyTH4 domain) mem 66kDa; Splicing factor 3b, subunit 1, 155kDa; Staphylococ ber 1: Poly (ADP-ribose) polymerase family, member 16; cal nuclease and tudor domain containing 1; Staufen, RNA Poly (ADP-ribose) polymerase family, member 2: Poly(A) binding protein, homolog 1: Suppression of tumorigenicity polymerase alpha; Poly(A)-specific ribonuclease (dead 7; Suppressor of variegation 4-20 homolog 1; Synapsin III; enylation nuclease); Polymerase (DNA directed) nu: Poly Syntaxin 3: Syntaxin 5; Tachykinin receptor 1; TAO kinase merase (DNA directed), gamma 2, accessory subunit; Poly 3: TBC1 domain family, member 16; TBC1 domain family, merasc (RNA) II (DNA directed) polypeptide L; Polymerase member 19: Testis specific, 10; Tetraspanin 6: Tetratrico (RNA) III (DNA directed) polypeptide E; Polymerase I and peptide repeat domain 23; Thioredoxin-like 2: THUMP transcript release factor; Potassium channel tetramerisation domain containing 3: TIMELESS interacting protein; TOX domain containing 1; Potassium channel tetramerisation high mobility group box family member 4, Trafficking domain containing 2; Potassium channel tetramerisation protein, kinesin binding 1; Transcription factor 7-like 1 domain containing 7; Potassium channel, Subfamily K, (T-cell specific, HMG-box); Transcription factor 7-like 2 member 1:Presenilin 1: PRKR interacting protein 1: Procol (T-cell specific, HMG-box); Translocase of inner mitochon lagen-lysine, 2-oxoglutarate 5-dioxygenase 2: ProSAPiP1 drial membrane 13 homolog; Translocated promoter region protein; Prostaglandin E synthase 3 (cytosolic); Protease, (to activated MET oncogene); Translocation associated serine, 2 (trypsin. 2); Protein kinase, Y-linked; Protein phos membrane protein 2: Transmembrane 9 Superfamily mem phatase 1, regulatory (inhibitor) subunit 9A; Protein phos ber 2; Transmembrane emp24 protein transport domain phatase 3 (formerly 2B), catalytic subunit, alpha isoform: containing 8: Transmembrane emp24-like trafficking protein Protein phosphatase 4, regulatory subunit 1-like: Protein 10: Transmembrane protein 134: Transmembrane protein tyrosine phosphatase, non-receptor type 18 (brain-derived); 18; Transmembrane protein 18; Transmembrane protein 29: Protein tyrosine phosphatase, non-receptor type 3: Protein Triadin; Tribbles homolog 1; Trinucleotide repeat containing tyrosine phosphatase, receptor type, D: Protein-O-manno 6C; Tripartite motif-containing 36; Tripartite motif-contain syltransferase 1: Proteolipid protein 2 (colonic epithelium ing 61; TRNA methyltransferase 11 homolog: TruB enriched); Protocadherin 7: Protocadheringamma subfamily pseudouridine (psi) synthase homolog 2; Tubby like protein A, 1; PRP6 pre-mRNA processing factor 6 homolog: Puta 4; Tuftelin 1; Tumor necrosis factor receptor superfamily, tive homeodomain transcription factor 1; RAB GTPase member 11b (osteoprotegerin); Tumor necrosis factor recep activating protein 1-like; RAB10. RAB30; Rabaptin, RAB tor Superfamily, member 25; Tumor necrosis factor, alpha GTPase binding effector protein 1: RAD51-like 1; RALBP1 induced protein 8: Tyrosine kinase 2: Ubiquinol-cytochrome associated Eps domain containing 2: Rap guanine nucleotide c reductase core protein I. Ubiquitin specific peptidase 47; exchange factor (GEF) 1; Rapamycin-insensitive compan Ubiquitin specific peptidase 5 (isopeptidase T); Ubiquitin ion of mTOR; Ras and Rab interactor 2: Receptor accessory specific peptidase 8: Ubiquitin-like 7 (bone marrow stromal protein 3: Reelin; Replication factor C (activator 1) 3: cell-derived); UBX domain containing 6: UDP-glucose cer Replication protein A3; Rho GTPase activating protein 18; amide glucosyltransferase-like 2: UDP-N-acetyl-alpha-D- Rho guanine nucleotide exchange factor (GEF) 10-like: galactosamine:polypeptide N-acetylgalactosaminyltrans Rhophilin, Rho GTPase binding protein 1: Ribonuclease ferase 2 (GalNAc-T2); Unc-93 homolog Bl:UTP6, small H2, subunit B: Ribonuclease P 1.4kDa subunit; Ring finger Subunit (SSU) processome component, homolog; Vacuolar protein 10; Ring finger protein 144, Ring finger protein 44; protein sorting 8 homolog; V-akt murine thymoma viral RNA binding motif protein 16; Roundabout, axon guidance oncogene homolog 1; V-ets erythroblastosis virus E26 onco receptor, homolog 1; Roundabout, axon guidance receptor, gene homolog; Viral DNA polymerase-transactivated pro homolog 2: RUN domain containing 2A; Scinderin; SEC23 tein 6: WD repeat domain 33: WD repeat domain 90; interacting protein; Sec61 alpha 2 subunit; Septin 11; Serine/ Wingless-type MMTV integration site family, member 2B: US 2017/O 191133 A1 Jul. 6, 2017

WW and C2 domain containing 1; Yipl domain family, modomain helicase DNA binding protein 2: Chromosome 9 member 3: YTH domain family, member 3; Zinc finger and open reading frame 94; Chromosome 21 open reading frame BTB domain containing 10; Zinc finger and BTB domain 34; and Dual specificity phosphatase 5. containing 20. Zinc finger and SCAN domain containing 18; 0063. In one embodiment, the library comprises at least Zinc finger homeodbmain 4; Zinc finger protein 14 one transcript from at least one gene selected from the group homolog; Zinc finger protein 335; Zinc finger protein 394: consisting of Citrate lyase beta like: Phosphodiesterase 4D, Zinc finger protein 407; Zinc finger protein 608; Zinc finger cAMP-specific; Ectodysplasin A receptor; DEP domain con protein 667; Zinc finger protein 692; Zinc finger protein 718; taining 6: Basonuclin 2: Chromosome 2 open reading frame Zinc finger protein 721; Zinc finger, CCHC domain con 3: FLJ25476 protein; Staphylococcal nuclease and tudor taining 9; Zinc finger, matrin type 1; Zinc finger, MYND domain containing 1; Hermansky-Pudlak syndrome 5 and domain containing 11; Zinc finger, ZZ-type containing 3; Chromosome 12 open reading frame 30. Zinc fingers and homeoboxes 2; and Zwilch, kinetochore 0064. In one embodiment, the library comprises at least associated, homolog. one transcript from at least one gene selected from the group 0058. In one embodiment, the library comprises at least consisting of Replication factor C (activator 1) 3: Tripartite one transcript from at least one gene selected from the group motif-containing 61: Citrate lyase beta like; Adaptor-related consisting of Replication factor C (activator 1) 3: Tripartite protein complex 1, gamma 2 subunit; Kallikrein-related motif-containing 61; Citrate lyase beta like; Ankyrin repeat peptidase 2: Phosphodiesterase 4D, cAMP-specific; domain 15: UDP-glucose ceramide glucosyltransferase-like Cytochrome P450, family 4, subfamily F, polypeptide 11: 2: Hypothetical protein FLJ12949; Chromosome 22 open Ectodysplasin A receptor reading frame 13; Phosphatidylinositol glycan anchor bio 0065. Phospholipase C, beta 1; KIAA 1244: Paraoxonase synthesis, class 0; Solute carrier family 43, member 1: 2: Arachidonate 12-lipoxygenase, 12R type; Cut-like 2: Rabaptin, RAB GTPase binding effector protein 1; Zinc Chemokine (C-X-C motif) ligand 12: Rho guanine nucleo finger protein 14 homolog: Hypothetical gene Supported by tide exchange factor (GEF) 5: Olfactory receptor, family 2, AK128346: Adenylate cyclase 3: Phosphatidylinositol trans Subfamily A, member 4, Chromosome 19 open reading fer protein, beta; Zinc finger protein 667: Gremlin 1, cys frame 42; Hypothetical gene supported by AK128346; Phos teine knot Superfamily, homolog: Ankyrin 3, node of Ran phoglucomutase 5: Hyaluronan binding protein 4: NECAP vier (ankyrin G) and Maltase-glucoamylase (alpha endocytosis associated 2 glucosidase). 0.066 Myeloid/lymphoid or mixed-lineage leukemia; 0059. In one embodiment, the library comprises at least translocated to. 4: Signal transducer and activator of tran one transcript from at least one gene selected from the group scription 1: Chromosome 2 open reading frame 3; FLJ25476 consisting of Replication factor C (activator 1) 3; Ankyrin protein; Staphylococcal nuclease and tudor domain contain repeat domain 15: Hypothetical protein FLJ12949; Solute ing 1; Transmembrane protein 18; Hermansky-Pudlak Syn carrier family 43, member 1: Thioredoxin-like 2: Poly drome 5: Chromosome 12 open reading frame 30: Splicing merase (RNA) II (DNA directed) polypeptide L: Syntaxin 5: factor 3b, subunit 1; Cofilin 2: Heparan sulfate 6-0-Sul Leucine rich repeat containing 16; Calcium channel, Volt fotransferase 3: Enabled homolog: Mannosyl (alpha-1,6-)- age-dependent, beta 4 subunit; NM 001005522; G pro glycoprotein beta-1,6-N-acetyl-glucosaminyltransferase; tein-coupled receptor kinase interactor 2: Ankyrin 3, node of Solute carrier family 24 (Sodium/potassium/calcium Ranvier (ankyrin G); Gremlin 1, cysteine knot Superfamily, exchanger), member 3; Inositol 1,4,5-triphosphate receptor, homolog; Zinc finger protein 667; Hypothetical gene Sup type 1: CAP-GLY domain containing linker protein; Trans ported by AK128346; Transmembrane 9 superfamily mem glutaminase 4: MOCO sulphurase C-terminal domain con ber 2: Potassium channel, subfamily K, member 1: Chro taining 2: 4-hydroxyphenylpyruvate dioxygenase-like; and modomain helicase DNA binding protein 2: Microcephaly, R3H domain containing 1. primary autosomal recessive 1: Chromosome 21 open read 0067. The invention also contemplates that alternative ing frame 34 and Dual specificity phosphatase 5. libraries may be designed that include transcripts of one or 0060. In one embodiment, the library comprises at least more of the genes in Table 3, together with additional gene one transcript from at least one gene selected from the group transcripts that have previously been shown to be associated consisting of Replication factor C (activator 1) 3: Tripartite with prostate cancer systemic progression. As is known in motif-containing 61; Citrate lyase beta like; Ankyrin repeat the art, the publication and sequence databases can be mined domain 15: Ankyrin 3, node of Ranvier (ankyrin G) and using a variety of search Strategies to identify appropriate Maltase-glucoamylase (alpha-glucosidase). additional candidates for inclusion in the library. For 0061. In one embodiment, the library comprises at least example, currently available scientific and medical publica one transcript from at least one gene selected from the group tion databases such as Medline, Current consisting of Replication factor C (activator 1) 3; Ankyrin 0068 Contents, OMIM (online Mendelian inheritance in repeat domain 15; man), various Biological and Chemical Abstracts, Journal 0062 Hypothetical protein FLJ12949; Solute carrier indexes, and the like can be searched using term or key-word family 43, member 1: Thioredoxin-like 2: Polymerase searches, or by author, title, or other relevant search param (RNA) II (DNA directed) polypeptide L: Syntaxin 5: Leu eters. Many such databases are publicly available, and cine rich repeat containing 16; Calcium channel, Voltage strategies and procedures for identifying publications and dependent, beta 4 subunit; NM 001005522]: G protein their contents, for example, genes, other nucleotide coupled receptor kinase interactor 2: Ankyrin 3, node of sequences, descriptions, indications, expression pattern, etc. Ranvier (ankyrin G); Gremlin 1, cysteine knot Superfamily, are well known to those skilled in the art. Numerous homolog; Zinc finger protein 667; Hypothetical gene Sup databases are available through the internet for free or by ported by AK128346; Transmembrane 9 superfamily mem subscription, see, for example, the National Center Biotech ber 2: Potassium channel, subfamily K, member 1: Chro nology Information (NCBI), Infotrieve, Thomson ISI, and US 2017/O 191133 A1 Jul. 6, 2017

Science Magazine (published by the AAAS) websites. Addi Prognostic Set comprises target sequences derived from the tional or alternative publication or citation databases are also transcripts of at least 5 genes. In another embodiment, the available that provide identical or similar types of informa Prostate Cancer Prognostic set comprises target sequences tion, any of which can be employed in the context of the derived from the transcripts of at least 10 genes. In a further invention. These databases can be searched for publications embodiment, the Prostate Cancer Prognostic set comprises describing altered gene expression between recurrent and target sequences derived from the transcripts of at least 15 non-recurrent prostate cancer. Additional potential candidate genes. In other embodiments, the Prostate Cancer Prognostic genes may be identified by searching the above described set comprises target sequences derived from the transcripts databases for differentially expressed and by iden of at least 20, at least 25, at least 30, at least 35, at least 40, tifying the nucleotide sequence encoding the differentially at least 45, at least 50, at least 55, at least 60 and at least 65 expressed proteins. A list of genes whose altered expression genes. is between patients with recurrent disease and non-recurrent 0073. Following the identification of candidate gene tran prostate cancer is presented in Table 6. Scripts, appropriate target sequences can be identified by 0069 Prostate Cancer Prognostic Sets screening for target sequences that have been annotated to be 0070 A Prostate Prognostic Set comprises one or more associated with each specific gene from a number of target sequences identified within the gene transcripts in the annotation sources including GenBank, RefSeq, Ensembl. prostate prognostic library, or a Subset of these gene tran dbEST, GENSCAN, TWINSCAN, Exoniphy, Vega, micro Scripts. The target sequences may be within the coding RNAs registry and others (see Affymetrix Exon Array and/or non-coding regions of the gene transcripts. The set design note). can comprise one or a plurality of target sequences from 0074 As part of the target sequence selection process, each gene transcript in the library, or subset thereof. The target sequences can be further evaluated for potential relative and/or normalized level of these target sequences in cross-hybridization against other putative transcribed a sample is indicative of the level of expression of the sequences in the design (but not the entire genome) to particular gene transcript and thus of prostate cancer recur identify only those target sequences that are predicted to rence risk. For example, the relative and/or normalized uniquely hybridize to a single target. expression level of one or more of the target sequences may 0075 Optionally, the set of target sequences that are be indicative of an recurrent form of prostate cancer and predicted to uniquely hybridize to a single target can be therefore prognostic for prostate cancer Systemic progres further filtered using a variety of criteria including, for sion while the relative and/or normalized expression level of example, Sequence length, for their mean expression levels one or more other target sequences may be indicative of a across a wide selection of human tissues, as being repre non-recurrent form of prostate cancer and therefore prog sentive of transcripts expressed either as novel alternative nostic for a NED clinical outcome. (i.e., non-consensus) exons, alternative retained introns, 0071. Accordingly, one embodiment of the present inven novel exons 5' or 3' of the gene's transcriptional start site or tion provides for a library or catalog of candidate target representing transcripts expressed in a manner antisense to sequences derived from the transcripts (both coding and the gene, amongst others. non-coding regions) of at least one gene Suitable for clas 0076. In one embodiment, the Prostate Classification Set Sifying prostate cancer recurrence risk. In a further embodi comprises target sequences derived from 382.253 base pair ment, the present invention provides for a library or catalog 3' of Replication factor C (activator 1)3,38kDa. 58,123 base of candidate target sequences derived from the non-coding pair 3' of Tripartite motif-containing 61; in intron #3 of regions of transcripts of at least one gene Suitable for Citrate lyase beta like; in intron #2 of Ankyrin repeat domain classifying prostate cancer recurrence risk. In still a further 15; in exon #1 of UDP-glucose ceramide glucosyltrans embodiment, the library or catalog of candidate target ferase-like 2; in exon of #19 of Hypothetical protein sequences comprises target sequences derived from the FLJ12949; in intron #4 of Chromosome 22 open reading transcripts of one or more of the genes set forth in Table 3 frame 13; in exon #2 of phatidylinositol glycan anchor and/or Table 6. The library or catalog in affect provides a biosynthesis, class 0; in exon #15 of Solute carrier family 43, resource list of transcripts from which target sequences member 1; in exon #1 of Rabaptin, RAB GTPase binding appropriate for inclusion in a Prostate Cancer Prognostic set effector protein 1; in intron #38 of Maltase-glucoamylase can be derived. In one embodiment, an individual Prostate (alpha-glucosidase); in intron #23 of Ankyrin 3, node of Cancer Prognostic set may comprise target sequences Ranvier (ankyrin G); 71.333 base pair 3' of Gremlin 1, derived from the transcripts of one or more genes exhibiting cysteine knot superfamily, homolog: 1,561 base pair of 3' a positive correlation with recurrent prostate cancer. In one Zinc finger protein 667; in exon #4 of Phosphatidylinositol embodiment, an individual Prostate Cancer Prognostic Set transfer protein, beta; in intron #18 of Adenylate cyclase 3: may comprise target sequences derived from the transcripts and in exon #2 of Hypothetical gene Supported by of one or more genes exhibiting a negative correlation with AK128346. recurrent prostate cancer. In one embodiment, an individual 0077. In one embodiment, the Prostate Classification Set Prostate Cancer Prognostic Set may comprise target comprises target sequences derived from 382.253 base pair sequences derived from the transcripts of two or more genes, 3' of Replication factor C (activator 1) 3; in intron #2 of wherein at least one gene has a transcript that exhibits a Ankyrin repeat domain 15; in exon #19 of Hypothetical positive correlation with recurrent prostate cancer and at protein FLJ12949; in exon #15 of Solute carrier family 43, least one gene has a transcript that exhibits negative corre member 1; 313,721 base pair 3' of Thioredoxin-like 2; in lation with recurrent prostate cancer. exon #2 of Polymerase (RNA) II (DNA directed) polypep 0072. In one embodiment, the Prostate Cancer Prognostic tide L, 7.6kDa; in intron #10 of Syntaxin 5; 141,389 base Set comprises target sequences derived from the transcripts pair 5' of Leucine rich repeat containing 16; in intron #2 of of at least one gene. In one embodiment, the Prostate Cancer Calcium channel, Voltage-dependent, beta 4 subunit; 5,474 US 2017/O 191133 A1 Jul. 6, 2017

base pair 5 of NM_001005522; in intron #14 of G poxygenase, 12R type; in exon #22 of Cut-like 2: 143,098 protein-coupled receptor kinase interactor 2; in intron #23 of base pair 5' of Chemokine (C-X-C motif) ligand 12; in intron Ankyrin 3, node of Ranvier (ankyrin G); 71.333 base pair 3' #6 of Rho guanine nucleotide exchange factor (GEF) 5: of Gremlin 1, cysteine knot Superfamily, homolog; 1,561 15,057 base pair 5' of Olfactory receptor, family 2, subfam base pair of 3' of Zinc finger protein 667; in exon #2 of ily A, member 4, 6,025 base pair 3' of Chromosome 19 open Hypothetical gene supported by AK128346; in intron #11 of reading frame 42; in exon #2 of Hypothetical gene Supported Transmembrane 9 superfamily member 2; in intron #1 of by AK128346; in exon #11 of Phosphoglucomutase 5: in Potassium channel, subfamily K, member 1; in intron #2 of exon #9 of Hyaluronan binding protein 4; in exon #8 of Chromodomain helicase DNA binding protein 2: 22, 184 NECAP endocytosis associated 2; in intron #20 of Myeloid/ base pair 5' of Microcephaly, primary autosomal recessive 1: lymphoid or mixed-lineage leukemia; 1,558 base pair 3' of in intron #4 of Chromosome 21 open reading frame 34; and Signal transducer and activator of transcription; in intron #1 in intron #2 of Dual specificity phosphatase 5. of Chromosome 2 open reading frame 3; in intron #1 of 0078. In one embodiment, the Prostate Classification Set F1125476 protein; in intron #10 of Staphylococcal nuclease comprises target sequences derived from 382.253 base pair and tudor domain containing 1; 84,468 base pair 3' of 3' of Replication factor C (activator 1)3,38kDa. 58,123 base Transmembrane protein 18; in exon #22 of Hermansky pair 3' of Tripartite motif-containing 61; in intron #3 of Pudlak syndrome 5; in exon #24 of Chromosome 12 open Citrate lyase beta like; in intron #2 of Ankyrin repeat domain reading frame 30; 95,745 base pair of 3' Splicing factor 3b, 15; in intron #38 of Maltase-glucoamylase (alpha-glucosi subunit 1; in exon #4 of Cofilin 2; in intron #1 of Heparan dase); and in intron #23 of Ankyrin 3, node of Ranvier sulfate 6-O-sulfotransferase 3: in intron #1 of Enabled (ankyrin G). homolog; in intron #2 of Mannosyl (alpha-1,6-)-glycopro 0079. In one embodiment, the Prostate Classification Set tein beta-1,6-N-acetyl-glucosaminyltransferase; in intron #8 comprises target sequences derived from 382.253 base pair of Solute carrier family 24, member 3; 32.382 base pair 3' of 3' of Replication factor C (activator 1) 3, 38kDa; in intron #2 Inositol 1,4,5-triphosphate receptor, type 1; in intron #9 of of Ankyrin repeat domain 15; in exon #19 of Hypothetical CAP-GLY domain containing linker protein 1; in exon #14 protein FLJ12949; in exon #15 of Solute carrier family 43, of Transglutaminase 4; in intron #4 of MOCO sulphurase member 1; 313,721 base pair 3' of Thioredoxin-like 2; in C-terminal domain containing 2: 21,555 base pair 5' of 4 exon #2 of Polymerase (RNA) II (DNA directed) polypep hydroxyphenylpyruvate dioxygenase-like; and in exon #26 tide L, 7.6kDa; in intron #10 of Syntaxin 5; 141,389 base of R3H domain containing 1. pair 5' of Leucine rich repeat containing 16; in intron #2 of I0082 In one embodiment, the potential set of target Calcium channel, Voltage-dependent, beta 4 subunit; 5,474 sequences can be filtered for their expression levels using base pair 5 of NM_001005522; in intron #14 of G the multi-tissue expression data made publicly available by protein-coupled receptor kinase interactor 2; in intron #2 of Affymetrix at (http://www.affymetrix.com/support/techni Ankyrin 3, node of Ranvier (ankyrin G); 71.333 base pair of cal/sample data/exon array data.affix) Such that probes 3' Gremlin 1, cysteine knot superfamily; 1.561 base pair 3' with, for example, elevated expression across numerous of Zinc finger protein 667; in exon #2 of Hypothetical gene tissues (non-specific) or no expression in prostate tissue can supported by AK128346; in intron #11 of Transmembrane 9 be excluded. superfamily member 2; in intron #1 of Potassium channel, I0083 Validation of Target Sequences subfamily K, member 1; in intron #2 of Chromodomain I0084. Following in silico selection of target sequences, helicase DNA binding protein 2; in exon #8 of Chromosome each target sequence Suitable for use in the Prostate Cancer 9 open reading frame 94; in intron #4 of Chromosome 21 Prognostic Set may be validated to confirm differential open reading frame 34; and in intron #2 of Dual specificity relative or normalized expression in recurrent prostate can phosphatase 5. cer or non-recurrent prostate cancer. Validation methods are 0080. In one embodiment, the Prostate Classification Set known in the art and include hybridization techniques such comprises target sequences derived from in intron #3 of as microarray analysis or Northern blotting using appropri Citrate lyase beta like; 210,560 base pair 5' of Phosphodi ate controls, and may include one or more additional steps, esterase 4D: 189,722 base pair 5' of Ectodysplasin A recep such as reverse transcription, transcription, PCR, RT-PCR tor; 3.510 base pair 3' of DEP domain containing 6; in exon and the like. The validation of the target sequences using #6 of Basonuclin 2; in intron #1 of Chromosome 2 open these methods is well within the abilities of a worker skilled reading frame 3; in intron #1 of FLJ25476 protein; in intron in the art. #10 of Staphylococcal nuclease and tudor domain contain I0085 Minimal Expression Signature ing 1; in exon #22 of Hermansky-Pudlak syndrome 5; and in 0086. In one embodiment, individual Prostate Cancer exon #24 of Chromosome 12 open reading frame 30. Prognostic Sets provide for at least a determination of a 0081. In one embodiment, the Prostate Classification Set minimal expression signature, capable of distinguishing comprises target sequences derived from 382.253 base pair recurrent from non-recurrent forms of prostate cancer. 3' of Replication factor C (activator 1)3,38kDa. 58,123 base Means for determining the appropriate number of target pair 3' of Tripartite motif-containing 61; in intron #3 of sequences necessary to obtain a minimal expression signa Citrate lyase beta like; in intron #1 of Adaptor-related ture are known in the art and include the Nearest Shrunken protein complex 1, gamma 2 subunit; in intron #2 of Centroids (NSC) method. Kallikrein-related peptidase 2; 210,560 base pair 5' of Phos I0087. In this method (see US 2007003 1873), a standard phodiesterase 4D: 3,508 base pair 3' of Cytochrome P450, ized centroid is computed for each class. This is the average family 4, subfamily F, polypeptide 11, 189,722 base pair 5 gene expression for each gene in each class divided by the of Ectodysplasin A receptor; in intron #2 of Phospholipase within-class standard deviation for that gene. Nearest cen C, beta 1; in intron #10 of KIAA1244; in intron #2 of troid classification takes the gene expression profile of a new Paraoxonase 2; 11.235 base pair 3' of Arachidonate 12-li sample, and compares it to each of these class centroids. The US 2017/O 191133 A1 Jul. 6, 2017

class whose centroid that it is closest to, in squared distance, 0090. Other heuristic rules can be applied that are not is the predicted class for that new sample. Nearest shrunken necessarily related to the biology in question. For example, centroid classification “shrinks’ each of the class centroids one can apply a rule that only a prescribed percentage of the toward the overall centroid for all classes by an amount portfolio can be represented by a particular gene or group of called the threshold. This shrinkage consists of moving the genes. Commercially available Software Such as the Wagner centroid towards Zero by threshold, setting it equal to Zero Software readily accommodates these types of heuristics. if it hits zero. For example if threshold was 2.0, a centroid This can be useful, for example, when factors other than of 3.2 would be shrunk to 1.2, a centroid of -3.4 would be accuracy and precision (e.g., anticipated licensing fees) have shrunk to -1.4, and a centroid of 1.2 would be shrunk to an impact on the desirability of including one or more genes. Zero. After shrinking the centroids, the new sample is classified by the usual nearest centroid rule, but using the 0091. In one embodiment, the Prostate Cancer Prognostic shrunken class centroids. This shrinkage can make the Set for obtaining a minimal expression signature comprises classifier more accurate by reducing the effect of noisy genes at least one, two, three, four, five, six, eight, 10, 15, 20, 25 and provides an automatic gene selection. In particular, if a or more of target sequences shown to have a positive gene is shrunk to Zero for all classes, then it is eliminated correlation with non-recurrent prostate disease, for example from the prediction rule. Alternatively, it may be set to zero those depicted in SEQID NOs: 1-913 or a subset thereof. In for all classes except one, and it can be learned that the high another embodiment, the Prostate Cancer Prognostic Set for or low expression for that gene characterizes that class. The obtaining a minimal expression signature comprises at least user decides on the value to use for threshold. Typically one one, two, three, four, five, six, eight, 10, 15, 20, 25 or more examines a number of different choices. To guide in this of those target sequences shown to have a positive correla choice, PAM does K-fold cross-validation for a range of tion with recurrent prostate cancer, for example those threshold values. The samples are divided up at random into depicted in of SEQID NOs: 914-2114, or a subset therof. In K roughly equally sized parts. For each part in turn, the yet another embodiment, the Prostate Cancer Prognostic Set classifier is built on the other K-1 parts then tested on the for obtaining a minimal expression signature comprises at remaining part. This is done for a range of threshold values, least one, two, three, four, five, six, eight, 10, 15, 20, 25 or and the cross-validated misclassification error rate is more of target sequences shown to have a correlation with reported for each threshold value. Typically, the user would non-recurrent or recurrent prostate cancer, for example those choose the threshold value giving the minimum cross depicted in SEQ ID NOS:1-2114 or a subset thereof. validated misclassification error rate. 0092. In some embodiments, the Prostate Cancer Prog 0088 Alternatively, minimal expression signatures can nostic Set comprises target sequences for detecting expres be established through the use of optimization algorithms sion products of SEQIDs:1-2114. In some embodiments, the Such as the mean variance algorithm widely used in estab Prostate Cancer Prognostic Set comprises probes for detect lishing stockportfolios. This method is described in detail in ing expression levels of sequences exhibiting positive and US patent publication number 20030194734. Essentially, negative correlation with a disease status of interest are the method calls for the establishment of a set of inputs employed. For example, a combination target sequences (stocks in financial applications, expression as measured by useful in these methods were found to include those encod intensity here) that will optimize the return (e.g., signal that ing RNAs corresponding to SEQ ID NOs: 1-913 (found at is generated) one receives for using it while minimizing the increased expression in prostate cancer Samples from NED variability of the return. In other words, the method calls for patients) and/or corresponding to SEQ ID NOs: 914-2114 the establishment of a set of inputs (e.g., expression as (found at increased expression levels in prostate cancer measured by intensity) that will optimize the signal while samples from SYS patients), where intermediate levels of minimizing variability. Many commercial Software pro certain target sequences (Table 7) are observed in prostate grams are available to conduct such operations. “Wagner cancer samples from PSA patients with biochemical recur Associates Mean-Variance Optimization Application.” rence, where the RNA expression levels are indicative of a referred to as “Wagner Software' throughout this specifica disease state or outcome. Subgroups of these target tion, is preferred. This software uses functions from the sequences, as well as individual target sequences, have been “Wagner Associates Mean-Variance Optimization Library’ found useful in such methods. to determine an efficient frontier and optimal portfolios in 0093. In some embodiments, an RNA signature corre the Markowitz sense is preferred. Use of this type of sponding to SEQ ID NOs: 1, 4, 6, 9, 14-16, 18-21915-917, Software requires that microarray data be transformed so that 920, 922,928,929,931, 935 and 936 (the 21-RNA signa it can be treated as an input in the way stock return and risk ture) and/or SEQ ID NOs: 1-11, 914-920 (the 18-RNA measurements are used when the Software is used for its signature) and/or SEQID NOs: 1-4,914,915) (the 6-RNA intended financial analysis purposes. signature) and/or SEQ ID NOs: 1, 4, 6, 9, 14-16, 18-21, 0089. The process of selecting a minimal expression 915-917, 920, 922, 928,929, 931, 935 and 936 (the 20 signature can also include the application of heuristic rules. RNA signature) and/or SEQID NOS 3.36, 60, 63,926,971, Preferably, such rules are formulated based on biology and 978, 999, 1014 and 1022 (the 10-RNA signature) and/or an understanding of the technology used to produce clinical SEQID NOs 1-3, 32,33, 36,46, 60, 63, 66, 69,88, 100, 241, results. More preferably, they are applied to output from the 265,334,437, 920, 925,934, 945, 947,954,971,978,999, optimization method. For example, the mean variance 1004, 1014, 1022, 1023, 1032, 1080, 1093, 1101, 1164, method of portfolio selection can be applied to microarray 1248, 1304, 1311, 1330, 1402 and 1425 (the 41-RNA data for a number of genes differentially expressed in signature) are formulated into a linear combination of their subjects with cancer. Output from the method would be an respective expression values for each patient generating a optimized set of genes that could include Some genes that are patient outcome predictor (POP) score and indicative of expressed in peripheral blood as well as in diseased tissue. the disease status of the patient after prostatectomy. US 2017/O 191133 A1 Jul. 6, 2017

0094 Exemplary subsets and combinations of interest these sequence listings, or of the target sequences described also include at least five, six, 10, 15, 18, 20, 23, 25, 27, 30, in the examples, may be utilized in the methods of the 35, 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200, invention. 225, 250, 275, 300, 350, 400, 450, 500, 750, 1000, 1200, 0097. The Prostate Cancer Prognostic Set can optionally 1400, 1600, 1800, 2000, or all 2114 target sequences in include one or more target sequences specifically derived Table 4; at least five, six, 10, 15, 18, 20, 23, 25, 27, 30, 35, from the transcripts of one or more housekeeping genes 40, 45, 50, 55, 60, 70, 80, 90, 100, 125, 150, 175, 200, 225, and/or one or more internal control target sequences and/or 250, 275, 300, 350, 400, 450, 500, or all 526 target one or more negative control target sequences. In one sequences in Table 7: SEQID NOS:1, 4,915, 6,916, 9,917, embodiment, these target sequences can, for example, be 920, 922, 14, 15, 16,928,929, 18, 19,931, 20, 21,935,936, used to normalize expression data. Housekeeping genes or combinations thereof: SEQ ID NOs: 1, 2, 3, 4, 5, 6, 7, 8, from which target sequences for inclusion in a Prostate 9, 10, 11,914,915,916,917,918, 919,920, or combinations Cancer Prognostic Set can be derived from are known in the thereof: SEQID NOs: 1, 4, 6, 9, 14-16, 18-21,915-917,920, art and include those genes in which are expressed at a 922,928,929,931, 935,936 or combinations thereof: SEQ constant level in normal and prostate cancer tissue. ID NOS 3, 36, 60, 63,926, 971, 978, 999, 1014, 1022 or 0098. The target sequences described herein may be used combinations thereof: SEQID NOs 1-3, 32, 33, 36, 46, 60, alone or in combination with each other or with other known 63, 66, 69, 88, 100, 241, 265, 334, 437, 920, 925,934, 945, or later identified disease markers. 947, 954, 971, 978, 999, 1004, 1014, 1022, 1023, 1032, 0099 Prostate Cancer Prognostic Probes/Primers 1080, 1093, 1101, 1164, 1248, 1304, 1311, 1330, 1402, 1425 0100. The system of the present invention provides for or combinations thereof, at least one, two, three, four, five or combinations of polynucleotide probes that are capable of six of SEQID NOS:1, 4, 6, 9, 14, 15, 16, 18, 19, 20, and 21 detecting the target sequences of the Prostate Cancer Prog and at least one, two, three, four, five or six of SEQ ID nostic Sets. Individual polynucleotide probes comprise a NOs:915,916, 917, 920, 922,928,929,931, 935, and 936; nucleotide sequence derived from the nucleotide sequence and at least one, two, three, four, five or six of SEQ ID of the target sequences or complementary sequences thereof. NOs: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, and 11 at least one, two, three, The nucleotide sequence of the polynucleotide probe is four, five or six of at least one, two, three, four, five or six designed such that it corresponds to, or is complementary to of SEQID NOs:914, 915,916, 917,918, 919, and 920 and the target sequences. The polynucleotide probe can specifi at least one, two, three, four, five or six of SEQ ID NOs: 1. cally hybridize under either stringent or lowered stringency 4, 6, 9, 14-16, 18-21,915-917,920,922,928,929,931,935, hybridization conditions to a region of the target sequences, 936; and at least one, two, three, four, five or six of SEQID to the complement thereof, or to a nucleic acid sequence NOs 3, 36, 60, 63,926, 971,978, 999, 1014, 1022; and at least one, two, three, four, five or six of SEQ ID NOs 1-3, (such as a cDNA) derived therefrom. 32, 33, 36, 46, 60, 63, 66, 69, 88, 100, 241, 265, 334, 437, 0101 The selection of the polynucleotide probe 920, 925, 934, 945, 947, 954,971, 978,999, 1004, 1014, sequences and determination of their uniqueness may be 1022, 1023, 1032, 1080, 1093, 1101, 1164, 1248, 1304, carried out in silico using techniques known in the art, for example, based on a BLASTN search of the polynucleotide 1311, 1330, 1402, 1425. sequence in question against gene sequence databases. Such 0095 Exemplary subsets of interest include those as the Sequence, UniGene, dbPST or the described herein, including in the examples. Exemplary non-redundant database at NCBI. In one embodiment of the combinations of interest include those utilizing one or more invention, the polynucleotide probe is complementary to a of the sequences listed in Tables 5, 7, 8, 9 or 10. Of particular region of a target mRNA derived from a target sequence in interest are those combinations utilizing at least one the Prostate Cancer Prognostic Set. Computer programs can sequence exhibiting positive correlation with the trait of also be employed to select probe sequences that will not interest, as well as those combinations utilizing at least one cross hybridize or will not hybridize non-specifically. sequence exhibiting negative correlation with the trait of 0102 One skilled in the art will understand that the interest. Also of interest are those combinations utilizing at nucleotide sequence of the polynucleotide probe need not be least two, at least three, at least four, at least five or at least identical to its target sequence in order to specifically six of those sequences exhibiting Such a positive correlation, hybridize thereto. The polynucleotide probes of the present in combination with at least two, at least three, at least four, invention, therefore, comprise a nucleotide sequence that is at least five, or at least six of those sequences exhibiting Such at least about 75% identical to a region of the target gene or a negative correlation. Exemplary combinations include mRNA. In another embodiment, the nucleotide sequence of those utilizing at least one, two, three, four, five or six of the the polynucleotide probe is at least about 90% identical a target sequences depicted in Tables 5 and 6. region of the target gene or mRNA. In a further embodiment, 0096. In some embodiments, increased relative expres the nucleotide sequence of the polynucleotide probe is at sion of one or more of SEQ IDs: 1-913, decreased relative least about 95% identical to a region of the target gene or expression of one or more of SEQ ID NOs:914-2114 or a mRNA. Methods of determining sequence identity are combination of any thereof is indicative/predictive of the known in the art and can be determined, for example, by patient exhibiting no evidence of disease for at least seven using 5 the BLASTN program of the University of Wiscon years or more after Surgery. In some embodiments, increased sin Computer Group (GCG) software or provided on the relative expression of SEQIDs:914-2114, decreased relative NCBI website. The nucleotide sequence of the polynucle expression of one or more of SEQ ID NOs: 1-913 or a otide probes of the present invention may exhibit variability combination of any thereof is indicative/predictive of the by differing (e.g. by nucleotide Substitution, including tran patient exhibiting systemic prostate cancer. Increased or sition or transversion) at one, two, three, four or more decreased expression of target sequences represented in nucleotides from the sequence of the target gene. US 2017/O 191133 A1 Jul. 6, 2017

0103). Other criteria known in the art may be employed in standard quantitative or qualitative PCR-based assays to the design of the polynucleotide probes of the present assess transcript expression levels of RNAs defined by the invention. For example, the probes can be designed to have Prostate Cancer Prognostic Set. Alternatively, these primers <50% G content and/or between about 25% and about 70% may be used in combination with probes, such as molecular G+C content. Strategies to optimize probe hybridization to beacons in amplifications using real-time PCR. the target nucleic acid sequence can also be included in the 0108. In one embodiment, the primers or primer pairs, process of probe selection. Hybridization under particular when used in an amplification reaction, specifically amplify pH, salt, and temperature conditions can be optimized by at least a portion of a nucleic acid depicted in one of SEQ taking into account melting temperatures and by using ID NOs: 1-2114 (or subgroups thereof as set forth herein), empirical rules that correlate with desired hybridization an RNA form thereof, or a complement to either thereof. behaviours. Computer models may be used for predicting 0109 As is known in the art, a nucleoside is a base-sugar the intensity and concentration-dependence of probe hybrid combination and a nucleotide is a nucleoside that further ization. includes a phosphate group covalently linked to the Sugar 0104. As is known in the art, in order to represent a portion of the nucleoside. In forming oligonucleotides, the unique sequence in the human genome, a probe should be at phosphate groups covalently link adjacent nucleosides to least 15 nucleotides in length. Accordingly, the polynucle one another to form a linear polymeric compound, with the otide probes of the present invention range in length from normal linkage or backbone of RNA and DNA being a 3' to about 15 nucleotides to the full length of the target sequence 5' phosphodiester linkage. Specific examples of polynucle or target mRNA. In one embodiment of the invention, the otide probes or primers useful in this invention include polynucleotide probes are at least about 15 nucleotides in oligonucleotides containing modified backbones or non length. In another embodiment, the polynucleotide probes natural internucleoside linkages. As defined in this specifi are at least about 20 nucleotides in length. In a further cation, oligonucleotides having modified backbones include embodiment, the polynucleotide probes are at least about 25 both those that retain a phosphorus atom in the backbone and nucleotides in length. In another embodiment, the poly those that lack a phosphorus atom in the backbone. For the nucleotide probes are between about 15 nucleotides and purposes of the present invention, and as sometimes refer about 500 nucleotides in length. In other embodiments, the enced in the art, modified oligonucleotides that do not have polynucleotide probes are between about 15 nucleotides and a phosphorus atom in their internucleoside backbone can about 450 nucleotides, about 15 nucleotides and about 400 also be considered to be oligonucleotides. nucleotides, about 15 nucleotides and about 350 nucleotides, 0110 Exemplary polynucleotide probes or primers hav about 15 nucleotides and about 300 nucleotides in length. ing modified oligonucleotide backbones include, for 0105. The polynucleotide probes of a Prostate Cancer example, those with one or more modified internucleotide Prognostic Set can comprise RNA, DNA, RNA or DNA linkages that are phosphorothioates, chiral phosphorothio mimetics, or combinations thereof, and can be single ates, phosphorodithioates, phosphotriesters, aminoalkyl stranded or double-stranded. Thus the polynucleotide probes phosphotriesters, methyl and other alkyl phosphonates can be composed of naturally-occurring nucleobases, Sugars including 3'-alkylene phosphonates and chiral phospho and covalent internucleoside (backbone) linkages as well as nates, phosphinates, phosphoramidates including 3' amino polynucleotide probes having non-naturally-occurring por phosphoramidate and aminoalkylphosphoramidates, thiono tions which function similarly. Such modified or substituted phosphoramidates, thionoalkyl-phosphonates, thionoalcyl polynucleotide probes may provide desirable properties phosphotriesters, and boranophosphates having normal 3'-5' Such as, for example, enhanced affinity for a target gene and linkages. 2'-5' linked analogs of these, and those having increased stability. inverted polarity wherein the adjacent pairs of nucleoside 0106 The system of the present invention further pro units are linked 3'-5' to 5'-3' or 2'-5' to 5'-2'. Various salts, vides for primers and primer pairs capable of amplifying mixed salts and free acid forms are also included. target sequences defined by the Prostate Cancer Prognostic 0111 Exemplary modified oligonucleotide backbones Set, or fragments or Subsequences or complements thereof. that do not include a phosphorus atom are formed by short The nucleotide sequences of the Prostate Cancer Prognostic chain alkyl or cycloalkyl internucleoside linkages, mixed set may be provided in computer-readable media for in silico heteroatom and alkyl or cycloalkyl internucleoside linkages, applications and as a basis for the design of appropriate or one or more short chain heteroatomic or heterocyclic primers for amplification of one or more target sequences of internucleoside linkages. Such backbones include mor the Prostate Cancer Prognostic Set. pholino linkages (formed in part from the Sugar portion of a 0107 Primers based on the nucleotide sequences of target nucleoside); siloxane backbones; Sulfide, Sulfoxide and Sul sequences can be designed for use in amplification of the phone backbones; formacetyl and thioformacetyl back target sequences. For use in amplification reactions such as bones; methylene formacetyl and thioformacetyl backbones: PCR, a pair of primers will be used. The exact composition alkene containing backbones; Sulphamate backbones; meth of the primer sequences is not critical to the invention, but yleneimino and methylenehydrazino backbones; Sulphonate for most applications the primers will hybridize to specific and Sulfonamide backbones; amide backbones; and others sequences of the Prostate Cancer Prognostic Set under having mixed N, O, S and CH2 component parts. stringent conditions, particularly under conditions of high 0112 The present invention also contemplates oligo stringency, as known in the art. The pairs of primers are nucleotide mimetics in which both the sugar and the inter usually chosen so as to generate an amplification product of nucleoside linkage of the nucleotide units are replaced with at least about 50 nucleotides, more usually at least about 100 novel groups. The base units are maintained for hybridiza nucleotides. Algorithms for the selection of primer tion with an appropriate nucleic acid target compound. An sequences are generally known, and are available in com example of Such an oligonucleotide mimetic, which has mercial software packages. These primers may be used in been shown to have excellent hybridization properties, is a US 2017/O 191133 A1 Jul. 6, 2017

peptide nucleic acid (PNA) Nielsen et al., Science, 254: methoxyethyl) or 2-MOE) Martin et al., Hely. Chim. Acta, 1497-1500 (1991). In PNA compounds, the sugar-backbone 78:486-504(1995), 2'-dimethylaminooxyethoxy (O(CH), of an oligonucleotide is replaced with an amide containing ON(CH) group, also known as 2'-DMAOE), 2-methoxy backbone, in particular an aminoethylglycine backbone. The (2 O—CH), 2-aminopropoxy (2 OCH, CH, CH, nucleobases are retained and are bound directly or indirectly NH) and 2'-fluoro (23-F). to aza-nitrogen atoms of the amide portion of the backbone. 0117 Similar modifications may also be made at other 0113. The present invention also contemplates polynucle positions on the polynucleotide probes or primers, particu otide probes or primers comprising “locked nucleic acids' larly the 3' position of the sugar on the 3' terminal nucleotide (LNAs), which are novel conformationally restricted oligo or in 2'-5' linked oligonucleotides and the 5' position of 5' nucleotide analogues containing a methylene bridge that terminal nucleotide. Polynucleotide probes or primers may connects the 2'-O of ribose with the 4'-C (see, Singh et al., also have Sugar mimetics such as cyclobutyl moieties in Chem. Commun., 1998, 4:455-456). LNA and LNA ana place of the pentofuranosyl Sugar. logues display very high duplex thermal stabilities with 0118 Polynucleotide probes or primers may also include complementary DNA and RNA, stability towards 3'-exonu modifications or Substitutions to the nucleobase. As used clease degradation, and good solubility properties. Synthesis herein, “unmodified’ or “natural nucleobases include the of the LNA analogues of adenine, cytosine, guanine, purine bases adenine (A) and guanine (G), and the pyrimi 5-methylcytosine, thymine and uracil, their oligomerization, dine bases thymine (T), cytosine (C) and uracil (U). Modi and nucleic acid recognition properties have been described fied nucleobases include other synthetic and natural nucle (see Koshkin et al., Tetrahedron, 1998, 54:3607-3630). obases such as 5-methylcytosine (5-me-C), 5 Studies of mis-matched sequences show that LNA obey the hydroxymethyl cytosine, Xanthine, hypoxanthine, 2-amino Watson-Crick base pairing rules with generally improved adenine, 6-methyl and other alkyl derivatives of adenine and selectivity compared to the corresponding unmodified ref guanine, 2-propyl and other alkyl derivatives of adenine and erence Strands. guanine, 2-thiouracil, 2-thiothymine and 2-thiocytosine, 0114 LNAs form duplexes with complementary DNA or 5-halouracil and cytosine, 5-propynyl uracil and cytosine, RNA or with complementary LNA, with high thermal affini 6-aZO uracil, cytosine and thymine, 5-uracil (pseudouracil), ties. The universality of LNA-mediated hybridization has 4-thiouracil, 8-halo, 8-amino, 8-thiol, 8-thioalkyl, 8-hy been emphasized by the formation of exceedingly stable droxyl and other 8-Substituted adenines and guanines, 5-halo LNA:LNA duplexes (Koshkin et al., J. Am. Chem. Soc., particularly 5-bromo, 5-trifluoromethyl and other 5-substi 1998, 120:13252-13253). LNA:LNA hybridization was tuted uracils and cytosines, 7-methylguanine and 7-methyl shown to be the most thermally stable nucleic acid type adenine, 8-azaguanine and 8-azaadenine, 7-deazaguanine duplex system, and the RNA-mimicking character of LNA and 7-deazaadenine and 3-deazaguanine and 3-deazaade was established at the duplex level. Introduction of three nine. Further nucleobases include those disclosed in U.S. LNA monomers (T or A) resulted in significantly increased Pat. No. 3,687,808: The Concise Encyclopedia Of Polymer melting points toward DNA complements. Science And Engineering, (1990) pp 858-859, Kroschwitz, 0115 Synthesis of 2'-amino-LNA (Singh et al., J. Org. J. I., ed. John Wiley & Sons: Englisch et al., Angewandte Chem., 1998, 63, 1.0035-10039) and 2'-methylamino-LNA Chemie, Int. Ed., 30:613 (1991); and Sanghvi, Y. S., (1993) has been described and thermal stability of their duplexes Antisense Research and Applications, pp 289-302, Crooke, with complementary RNA and DNA strands reported. S. T. and Lebleu, B., ed., CRC Press. Certain of these Preparation of phosphorothioate-LNA and 2'-thio-LNA have nucleobases are particularly useful for increasing the bind also been described (Kumar et al., Bioorg. Med. Chem. ing affinity of the polynucleotide probes of the invention. Lett., 1998, 8:2219 2222). These include 5-substituted pyrimidines, 6-azapyrimidines 0116 Modified polynucleotide probes or primers may and N-2, N-6 and 0-6 substituted purines, including also contain one or more Substituted Sugar moieties. For 2-aminopropyladenine, 5 propynyluracil and 5-propynylcy example, oligonucleotides may comprise Sugars with one of tosine. 5-methylcytosine substitutions have been shown to the following substituents at the 2' position: OH: F: O-, S increase nucleic acid duplex stability by 0.6-1.2° C. or N-alkyl; O-, S-, or N-alkenyl; O-, S- or N-alkynyl; or Sanghvi, Y. S., (1993) Antisense Research and Applica O-alkyl-O-alkyl, wherein the alkyl, alkenyl and alkynyl may tions, pp. 276-278, Crooke, S. T. and Lebleu, B., ed., CRC be substituted or unsubstituted C1 to C10 alkyl or C2 to C10 Press, Boca Raton. alkenyl and alkynyl. Examples of Such groups are: O(CH2) 0119. One skilled in the art will recognize that it is not OCH, O(CH), OCH, O(CH), NH, O(CH2)CH, necessary for all positions in a given polynucleotide probe or O(CH), ONH, and O(CH), ON(CH2)CH), where n primer to be uniformly modified. The present invention, and mare from 1 to about 10. Alternatively, the oligonucle therefore, contemplates the incorporation of more than one otides may comprise one of the following Substituents at the of the aforementioned modifications into a single polynucle 2 position: C to Co lower alkyl, substituted lower alkyl, otide probe or even at a single nucleoside within the probe alkaryl, aralkyl, O-alkaryl or O-aralkyl, SH, SCH, OCN, or primer. CI, Br, CN, CF. OCF, SOCH, SO, CH, ONO., NO, N, 0.120. One skilled in the art will also appreciate that the NH, heterocycloalkyl, heterocycloalkaryl, aminoalky nucleotide sequence of the entire length of the polynucle lamino, polyalkylamino, Substituted silyl, an RNA cleaving otide probe or primer does not need to be derived from the group, a reporter group, an intercalator, a group for improv target sequence. Thus, for example, the polynucleotide ing the pharmacokinetic properties of an oligonucleotide, or probe may comprise nucleotide sequences at the 5' and/or 3' a group for improving the pharmacodynamic properties of termini that are not derived from the target sequences. an oligonucleotide, and other Substituents having similar Nucleotide sequences which are not derived from the properties. Specific examples include 2'-methoxyethoxy nucleotide sequence of the target sequence may provide (2' O CH, CH, OCH also known as 2 O—(2- additional functionality to the polynucleotide probe. For US 2017/O 191133 A1 Jul. 6, 2017 example, they may provide a restriction enzyme recognition linkages. For example, a polynucleotide may comprise a sequence or a “tag” that facilitates detection, isolation, biotin-binding species, and an optically detectable label may purification or immobilisation onto a solid Support. be conjugated to biotin and then bound to the labeled 0121 Alternatively, the additional nucleotides may pro polynucleotide. Similarly, a polynucleotide sensor may vide a self-complementary sequence that allows the primer/ comprise an immunological species such as an antibody or probe to adopt a hairpin configuration. Such configurations fragment, and a secondary antibody containing an optically are necessary for certain probes, for example, molecular detectable label may be added. beacon and Scorpion probes, which can be used. in Solution I0127 Chromophores useful in the methods described hybridization techniques. herein include any Substance which can absorb energy and 0122) The polynucleotide probes or primers can incorpo emit light. For multiplexed assays, a plurality of different rate moieties useful in detection, isolation, purification, or signaling chromophores can be used with detectably differ immobilisation, if desired. Such moieties are well-known in ent emission spectra. The chromophore can be a lumophore the art (see, for example, Ausubel et al., (1997 & updates) or a fluorophore. Typical fluorophores include fluorescent Current Protocols in Molecular Biology, Wiley & Sons, dyes, semiconductor nanocrystals, lanthanide chelates, poly New York) and are chosen such that the ability of the probe nucleotide-specific dyes and green fluorescent protein. to hybridize with its target sequence is not affected. 0128 Coding schemes may optionally be used, compris 0123 Examples of suitable moieties are detectable labels, ing encoded particles and/or encoded tags associated with Such as radioisotopes, fluorophores, chemiluminophores, different polynucleotides of the invention. A variety of enzymes, colloidal particles, and fluorescent microparticles, different coding schemes are known in the art, including as well as antigens, antibodies, haptens, avidin/streptavidin, fluorophores, including SCNCs, deposited metals, and RF biotin, haptens, enzyme cofactors/substrates, enzymes, and tags. the like. I0129 Polynucleotides from the described target 0.124. A label can optionally be attached to or incorpo sequences may be employed as probes for detecting target rated into a probe or primer polynucleotide to allow detec sequences expression, for ligation amplification schemes, or tion and/or quantitation of a target polynucleotide represent may be used as primers for amplification schemes of all or ing the target sequence of interest. The target polynucleotide a portion of a target sequences. When amplified, either may be the expressed target sequence RNA itself, a cDNA Strand produced by amplification may be provided in puri copy thereof, or an amplification product derived therefrom, fied and/or isolated form. and may be the positive or negative Strand, so long as it can 0.130. In one embodiment, polynucleotides of the inven be specifically detected in the assay being used. Similarly, an tion include a nucleic acid depicted in(a) any one of SEQID antibody may be labeled. NOs: 1-2114, or a subgroup thereof as set forth herein; (b) 0.125. In certain multiplex formats, labels used for detect an RNA form of any one of the nucleic acids depicted in ing different targets may be distinguishable. The label can be SEQ ID NOs: 1-2114, or a subgroup thereof as set forth attached directly (e.g., via covalent linkage) or indirectly, herein; (c) a peptide nucleic acid form of any of the nucleic e.g., via a bridging molecule or series of molecules (e.g., a acids depicted in SEQ ID NOs: 1-2114, or a subgroup molecule or complex that can bind to an assay component, thereof as set forth herein; (d) a nucleic acid comprising at or via members of a binding pair that can be incorporated least 20 consecutive bases of any of (a-c); (e) a nucleic acid into assay components, e.g. biotin-avidin or streptavidin). comprising at least 25 bases having at least 90% sequenced Many labels are commercially available in activated forms identity to any of (a-c); and (f) a complement to any of (a-e). which can readily be used for Such conjugation (for example I0131 Complements may take any polymeric form through amine acylation), or labels may be attached through capable of base pairing to the species recited in (a)-(e), known or determinable conjugation schemes, many of including nucleic acid such as RNA or DNA, or may be a which are known in the art. neutral polymer Such as a peptide nucleic acid. Polynucle 0126 Labels useful in the invention described herein otides of the invention can be selected from the subsets of include any substance which can be detected when bound to the recited nucleic acids described herein, as well as their or incorporated into the biomoleculd of interest. Any effec complements. tive detection method can be used, including optical, spec 0.132. In some embodiments, polynucleotides of the troscopic, electrical, piezoelectrical, magnetic, Raman scat invention comprise at least 20 consecutive bases as depicted tering, Surface plasmon resonance, colorimetric, in SEQ ID NOS:1-2114, or a complement thereto. The calorimetric, etc. A label is typically selected from a chro polynucleotides may comprise at least 21, 22, 23, 24, 25, 27. mophore, a lumiphore, a fluorophore, one member of a 30, 32, 35 or more consecutive bases as depicted in SEQID quenching System, a chromogen, a hapten, an antigen, a NOs:1-2114, as applicable. magnetic particle, a material exhibiting nonlinear optics, a I0133. In some embodiments, the nucleic acid in (a) can semiconductor nanocrystal, a metal nanoparticle, an be selected from those in Table 3, and from SEQID NOS:1, enzyme, an antibody or binding portion or equivalent 4,915, 6,916, 9,917, 920, 922, 14, 15, 16,928,929, 18, 19, thereof, an aptamer, and one member of a binding pair, and 931, 20, 21,935, and 936; or from SEQ 111 NOs: 1, 2, 3, 4, combinations thereof. Quenching schemes may be used, 5, 6, 7, 8, 9, 10, 11,914, 915, 916,917,918, 919, and 920; wherein a quencher and a fluorophore as members of a or from SEQID NOs: 1, 4, 6, 9, 14-16, 18-21,915-917,920, quenching pair may be used on a probe, such that a change 922, 928,929,931,935 and 936; or from SEQID NOs 3, in optical parameters occurs upon binding to the target 36,60, 63,926,971,978,999, 1014 and 1022; or from SEQ introduce or quench the signal from the fluorophore. One ID NOs 1-3, 32,33, 36,46, 60, 63, 66, 69, 88, 100, 241,265, example of Such a system is a molecular beacon. Suitable 334,437,920, 925,934,945, 947,954,971,978,999, 1004, quencher/fluorophore systems are known in the art. The 1014, 1022, 1023, 1032, 1080, 1093, 1101, 1164, 1248, label may be bound through a variety of intermediate 1304, 1311, 1330, 1402 and 1425. US 2017/O 191133 A1 Jul. 6, 2017

0134. The polynucleotides may be provided in a variety 0141 Surfaces on the substrate can be composed of the of formats, including as Solids, in Solution, or in an array. same material as the Substrate or can be made from a The polynucleotides may optionally comprise one or more different material, and can be coupled to the substrate by labels, which may be chemically and/or enzymatically chemical or physical means. Such coupled Surfaces may be incorporated into the polynucleotide. composed of any of a wide variety of materials, for example, 0135) In one embodiment, solutions comprising poly polymers, plastics, resins, polysaccharides, silica or silica nucleotide and a solvent are also provided. In some embodi based materials, carbon, metals, inorganic glasses, mem ments, the Solvent may be water or may be predominantly branes, or any of the above-listed substrate materials. The aqueous. In some embodiments, the solution may comprise Surface can be optically transparent and can have surface at least two, three, four, five, six, seven, eight, nine, ten, Si-OH functionalities, such as those found on silica surfaces. twelve, fifteen, seventeen, twenty or more different poly 0142. The substrate and/or its optional surface can be nucleotides, including primers and primer pairs, of the chosen to provide appropriate characteristics for the Syn invention. Additional Substances may be included in the thetic and/or detection methods used. The substrate and/or Solution, alone or in combination, including one or more Surface can be transparent to allow the exposure of the labels, additional solvents, buffers, biomolecules, polynucle substrate by light applied from multiple directions. The otides, and one or more enzymes useful for performing substrate and/or surface may be provided with reflective methods described herein, including polymerases and “mirror structures to increase the recovery of light. ligases. The solution may further comprise a primer or 0143. The substrate and/or its surface is generally resis primer pair capable of amplifying a polynucleotide of the tant to, or is treated to resist, the conditions to which it is to invention present in the solution. be exposed in use, and can be optionally treated to remove 0136. In some embodiments, one or more polynucle any resistant material after exposure to Such conditions. otides provided herein can be provided on a substrate. The 0144. The substrate or a region thereof may be encoded Substrate can comprise a wide range of material, either so that the identity of the sensor located in the substrate or biological, nonbiological, organic, inorganic, or a combina region being queried may be determined. Any Suitable tion of any of these. For example, the substrate may be a coding scheme can be used, for example optical codes, polymerized Langmuir Blodgett film, functionalized glass, RFID tags, magnetic codes, physical codes, fluorescent Si, Ge, GaAs, GaP, SiO, SiNa, modified silicon, or any one codes, and combinations of codes. of a wide variety of gels or polymers such as (poly) (0145 Preparation of Probes and Primers tetrafluoroethylene, (poly) vinylidenedifluoride, polystyrene, 0146 The polynucleotide probes or primers of the pres cross-linked polystyrene, polyacrylic, polylactic acid, ent invention can be prepared by conventional techniques polyglycolic acid, poly(lactide coglycolide), polyanhy well-known to those skilled in the art. For example, the drides, poly(methyl methacrylate), poly(ethylene-co-vinyl polynucleotide probes can be prepared using Solid-phase acetate), polysiloxanes, polymeric silica, latexes, dextran synthesis using commercially available equipment. As is polymers, epoxies, polycarbonates, or combinations thereof. well-known in the art, modified oligonucleotides can also be Conducting polymers and photoconductive materials can be readily prepared by similar methods. The polynucleotide used. probes can also be synthesized directly on a solid Support 0.137 Substrates can be planar crystalline substrates such according to methods standard in the art. This method of as silica based substrates (e.g. glass, quartz, or the like), or synthesizing polynucleotides is particularly useful when the crystalline Substrates used in, e.g., the semiconductor and polynucleotide probes are part of a nucleic acid array. microprocessor industries, such as silicon, gallium arsenide, 0147 Polynucleotide probes or primers can be fabricated indium doped GaN and the like, and includes semiconductor on or attached to the substrate by any suitable method, for nanocrystals. example the methods described in U.S. Pat. No. 5,143,854, 0.138. The substrate can take the form of an array, a PCT Publ. No. WO92/10092, U.S. patent application Ser. photodiode, an optoelectronic sensor Such as an optoelec No. 07/624,120, filed Dec. 6, 1990 (now abandoned), Fodor tronic semiconductor chip or optoelectronic thin-film semi et al., Science, 251: 767-777 (1991), and PCT Publ. No. WO conductor, or a biochip. The location(s) of probe(s) on the 90/15070). Techniques for the synthesis of these arrays substrate can be addressable; this can be done in highly using mechanical synthesis strategies are described in, e.g., dense formats, and the location(s) can be microaddressable PCT Publication No. WO 93/09668 and U.S. Pat. No. or nanoaddressable. 5,384.261. Still further techniques include bead based tech 0139 Silica aerogels can also be used as substrates, and niques such as those described in PCT Appl. No. PCT/US93/ can be prepared by methods known in the art. Aerogel 04145 and pin based methods such as those described in Substrates may be used as free standing Substrates or as a U.S. Pat. No. 5.288,514. Additional flow channel or spotting Surface coating for another Substrate material. methods applicable to attachment of sensor polynucleotides 0140. The substrate can take any form and typically is a to a substrate are described in U.S. patent application Ser. plate, slide, bead, pellet, disk, particle, microparticle, nano No. 07/980,523, filed Nov. 20, 1992, and U.S. Pat. No. particle, Strand, precipitate, optionally porous gel, sheets, 5,384,261. tube, sphere, container, capillary, pad, slice, film, chip, 0148 Alternatively, the polynucleotide probes of the multiwell plate or dish, optical fiber, etc. The substrate can present invention can be prepared by enzymatic digestion of be any form that is rigid or semi-rigid. The Substrate may the naturally occurring target gene, or mRNA or cDNA contain raised or depressed regions on which an assay derived therefrom, by methods known in the art. component is located. The surface of the substrate can be 0149 Prostate Cancer Prognostic Methods etched using known techniques to provide for desired Sur 0150. The present invention further provides methods for face features, for example trenches, V-grooves, mesa struc characterizing prostate cancer sample for recurrence risk. tures, or the like. The methods use the Prostate Cancer Prognostic Sets, US 2017/O 191133 A1 Jul. 6, 2017 probes and primers described herein to provide expression TaqMan(R) assay (PE Biosystems, Foster City, Calif., e.g. as signatures or profiles from a test sample derived from a described in U.S. Pat. Nos. 5,962,233 and 5,538,848), and Subject having or Suspected of having prostate cancer. In may be quantitative or semi-quantitative, and may vary Some embodiments, such methods involve contacting a test depending on the origin, amount and condition of the sample with Prostate Cancer Prognostic probes (either in available biological sample. Combinations of these methods solution or immobilized) under conditions that permit may also be used. For example, nucleic acids may be hybridization of the probe(s) to any target nucleic acid(s) amplified, labeled and Subjected to microarray analysis. present in the test sample and then detecting any probe: Single-molecule sequencing (e.g., Illumina, Helicos, target duplexes formed as an indication of the presence of PacBio, ABI SOLID), in situ hybridization, bead-array the target nucleic acid in the sample. Expression patterns technologies (e.g., LumineX XMAP, Illumina BeadChips), thus determined are then compared to one or more reference branched DNA technology (e.g., Panomics, Genisphere). profiles or signatures. Optionally, the expression pattern can 0158. The expressed target sequences can be directly be normalized. The methods use the Prostate Cancer Prog detected and/or quantitated, or may be copied and/or ampli nostic Sets, probes and primers described herein to provide fied to allow detection of amplified copies of the expressed expression signatures or profiles from a test sample derived target sequences or its complement. In some embodiments, from a Subject to classify the prostate cancer as recurrent or degraded and/or fragmented RNA can be usefully analyzed non-recurrent. for expression levels of target sequences, for example RNA 0151. In some embodiments, such methods involve the having an RNA integrity number of less than 8. specific amplification of target sequences nucleic acid(s) 0159. In some embodiments, quantitative RT-PCR assays present in the test sample using methods known in the art to are used to measure the expression level of target sequences generate an expression profile or signature which is then depicted in SEQ IDs: 1-2114. In other embodiments, a compared to a reference profile or signature. GeneChip or microarray can be used to measure the expres 0152. In some embodiments, the invention further pro sion of one or more of the target sequences. vides for prognosing patient outcome, predicting likelihood 0160 Molecular assays measure the relative expression of recurrence after prostatectomy and/or for designating levels of the target sequences, which can be normalized to treatment modalities. the expression levels of one or more control sequences, for 0153. In one embodiment, the methods generate expres example array control sequences and/or one or more house sion profiles or signatures detailing the expression of the keeping genes, for example GAPDFI. Increased (or 2114 target sequences having altered relative expression decreased) relative expression of the target sequences as with different prostate cancer outcomes. In one embodiment, described herein, including any of SEQ ID NOS:1-2114, the methods generate expression profiles or signatures may thus be used alone or in any combination with each detailing the expression of the Subsets of these target other in the methods described herein. In addition, negative sequences having 526 or 18 target sequences as described in control probes may be included. the examples. 0.161 Diagnostic Samples 0154) In some embodiments, increased relative expres 0162 Diagnostic samples for use with the systems and in sion of one or more of SEQ IDs: 1-913, decreased relative the methods of the present invention comprise nucleic acids expression of one or more of SEQ ID NOs:914-2114 or a suitable for providing RNAs expression information. In combination of any thereof is indicative of a non-recurrent principle, the biological sample from which the expressed form of prostate cancer and may be predictive a NED RNA is obtained and analyzed for target sequence expres clinical outcome after Surgery. In some embodiments, sion can be any material Suspected of comprising prostate increased relative expression of SEQ IDs:914-2114, cancer tissue or cells. The diagnostic sample can be a decreased relative expression of one or more of SEQ ID biological sample used directly in a method of the invention. NOs: 1-913 or a combination of any thereof is indicative of Alternatively, the diagnostic sample can be a sample pre a recurrent form of prostate cancer and may be predictive of pared from a biological sample. a SYS clinical outcome after surgery. Increased or decreased 0163. In one embodiments, the sample or portion of the expression of target sequences represented in these sequence sample comprising or Suspected of comprising prostate listings, or of the target sequences described in the examples, cancer tissue or cells can be any source of biological may be utilized in the methods of the invention. material, including cells, tissue or fluid, including bodily 0155. In one embodiment, intermediate levels of expres fluids. Non-limiting examples of the source of the sample sion of one or more target sequences depicted in Table 7 include an aspirate, a needle biopsy, a cytology pellet, a bulk indicate a probability of future biochemical recurrence. tissue preparation or a section thereof obtained for example 0156. In some embodiments, the methods detect combi by Surgery or autopsy, lymph fluid, blood, plasma, serum, nations of expression levels of sequences exhibiting positive tumors, and organs. and negative correlation with a disease status. In one 0164. The samples may be archival samples, having a embodiment, the methods detect a minimal expression sig known and documented medical outcome, or may be nature. samples from current patients whose ultimate medical out 0157 Any method of detecting and/or quantitating the come is not yet known. expression of the encoded target sequences can in principle 0.165. In some embodiments, the sample may be dis be used in the invention. Such methods can include Northern sected prior to molecular analysis. The sample may be blotting, array or microarray hybridization, by enzymatic prepared via macrodis section of a bulk tumor specimen or cleavage of specific structures (e.g., an Invader(R) assay, portion thereof, or may be treated via macrodissection, Third Wave Technologies, e.g. as described in U.S. Pat. Nos. forexample via Laser Capture Microdissection (LCM). 5,846,717, 6,090,543; 6,001,567; 5,985,557; and 5,994,069) 0166 The sample may initially be provided in a variety and amplification methods, e.g. RT-PCR, including in a of states, as fresh tissue, fresh frozen tissue, fine needle US 2017/O 191133 A1 Jul. 6, 2017 aspirates, and may be fixed or unfixed. Frequently, medical RNA Micro KitTM, Roche Applied Science, Indianapolis, laboratories routinely prepare medical samples in a fixed Ind.). RNA can be extracted from frozen tissue sections state, which facilitates tissue storage. A variety of fixatives using TRIZol (Invitrogen, Carlsbad, Calif.) and purified can be used to fix tissue to stabilize the morphology of cells, using RNeasy Protect kit (Qiagen, Valencia, Calif.). RNA and may be used alone or in combination with other agents. can be further purified using DNAse I treatment (Ambion, Exemplary fixatives include crosslinking agents, alcohols, Austin, Tex.) to eliminate any contaminating DNA. RNA acetone, Bouin’s solution, Zenker Solution, Hely solution, concentrations can be made using a Nanodrop ND-1000 osmic acid solution and Carnoy Solution. spectrophotometer (Nanodrop Technologies, Rockland, 0167 Crosslinking fixatives can comprise any agent Suit Del.). RNA integrity can be evaluated by running electro able for forming two or more covalent bonds, for example pherograms, and RNA integrity number (RIN, a correlative an aldehyde. Sources of aldehydes typically used for fixation measure that indicates intactness of mRNA) can be deter include formaldehyde, paraformaldehyde, glutaraldehyde or mined using the RNA 6000 Pico Assay for the Bioanalyzer formalin. Preferably, the crosslinking agent comprises form 2100 (Agilent Technologies, Santa Clara, Calif.). aldehyde, which may be included in its native form or in the (0175 Reverse Transcription For QRT-PCR Analysis form of paraformaldehyde or formalin. One of skill in the art 0176 Reverse transcription can be performed using the would appreciate that for samples in which crosslinking Omniscript kit (Qiagen, Valencia, Calif.), Superscript III kit fixatives have been used special preparatory steps may be (Invitrogen, Carlsbad, Calif.), for RT-PCR. Target-specific necessary including for example heating steps and protein priming can be performed in order to increase the sensitivity ase-k digestion; see methods of detection of target sequences and generate target-specific 0168 One or more alcohols may be used to fix tissue, cDNA alone or in combination with other fixatives. Exemplary 0177 Taqman(R) Gene Expression Analysis alcohols used for fixation include methanol, ethanol and isopropanol. (0178 25 Taq Man R. RT-PCR can be performed using 0169. Formalin fixation is frequently used in medical Applied Biosystems Prism (ABI) 7900 HT instruments in a laboratories. Formalin comprises both an alcohol, typically 5ul Volume with target sequence-specific cDNA equivalent methanol, and formaldehyde, both of which can act to fix a to 1 ng total RNA. biological sample. 0179 Primers and probes concentrations for TaqMan 0170 Whether fixed or unfixed, the biological sample analysis are added to amplify fluorescent amplicons using may optionally be embedded in an embedding medium. PCR cycling conditions such as 95°C. for 10 minutes for Exemplary embedding media used in histology including one cycle, 95°C. for 20 seconds, and 60° C. for 45 seconds paraffin, Tissue-Tek R. V.I.PTM, Paramat, Paramat Extra, for 40 cycles. A reference sample can be assayed to ensure Paraplast, Paraplast X-tra, Paraplast Plus, Peel Away Paraf reagent and process stability. Negative controls (i.e., no fin Embedding Wax, Polyester Wax, Carbowax Polyethylene template) should be assayed to monitor any exogenous Glycol, PolyfinTM, Tissue Freezing Medium TFMTM, Cryo nucleic acid contamination. GelTM, and OCT Compound (Electron Microscopy Sciences, 0180 Amplification and Hybridization Hatfield, Pa.). Prior to molecular analysis, the embedding 0181. Following sample collection and nucleic acid material may be removed via any suitable techniques, as extraction, the nucleic acid portion of the sample comprising known in the art. For example, where the sample is embed RNA that is or can be used to prepare the target polynucle ded in wax, the embedding material may be removed by otide(s) of interest can be subjected to one or more prepara extraction with organic solvent(s), for example Xylenes. Kits tive reactions. These preparative reactions can include in are commercially available for removing embedding media vitro transcription (IVT), labeling, fragmentation, amplifi from tissues. Samples or sections thereof may be subjected cation and other reactions. mRNA can first be treated with to further processing steps as needed, for example serial reverse transcriptase and a primer to create cDNA prior to hydration or dehydration steps. detection, quantitation and/or amplification; this can be done 0171 In some embodiments, the sample is a fixed, wax in vitro with purified mRNA or in situ, e.g., in cells or tissues embedded biological sample. Frequently, Samples from affixed to a slide. medical laboratories are provided as fixed, wax-embedded 0182. By “amplification' is meant any process of pro samples, most commonly as formalin-fixed, paraffin embed ducing at least one copy of a nucleic acid, in this case an ded (FFPE) tissues. expressed RNA, and in many cases produces multiple cop 0172 Whatever the source of the biological sample, the ies. An amplification product can be RNA or DNA, and may target polynucleotide that is ultimately assayed can be include a complementary strand to the expressed target prepared synthetically (in the case of control sequences), but sequence. DNA amplification products can be produced typically is purified from the biological source and Subjected initially through reverse translation and then optionally from to one or more preparative steps. The RNA may be purified further amplification reactions. The amplification product to remove or diminish one or more undesired components may include all or a portion of a target sequence, and may from the biological sample or to concentrate it. Conversely, optionally be labeled. A variety of amplification methods are where the RNA is too concentrated for the particular assay, Suitable for use, including polymerase-based methods and it may be diluted. ligation-based methods. (0173 RNA Extraction 0183 Exemplary amplification techniques include the 0.174 RNA can be extracted and purified from biological polymerase chain reaction method (PCR), the ligase chain samples using any Suitable technique. A number of tech reaction (LCR), ribozyme-based methods, self sustained niques are known in the art, and several are commercially sequence replication (3SR), nucleic acid sequence-based available (e.g., FormaPureTM nucleic acid extraction kit, amplification (NASBA), the use of Q Beta replicase, reverse Agencourt Biosciences, Beverly M A. High Pure FFPE transcription, nick translation, and the like. US 2017/O 191133 A1 Jul. 6, 2017

0184 Asymmetric amplification reactions may be used to “touchdown PCR, hot-start techniques, use of nested prim preferentially amplify one strand representing the target ers, or designing PCR primers so that they form stem-loop sequence that is used for detection as the target polynucle structures in the event of primer-dimer formation and thus otide. In some cases, the presence and/or amount of the are not amplified. Techniques to accelerate PCR can be used, amplification product itself may be used to determine the for example centrifugal PCR, which allows for greater expression level of a given target sequence. In other convection within the sample, and comprising infrared heat instances, the amplification product may be used to hybrid ing steps for rapid heating and cooling of the sample. One ize to an array or other Substrate comprising sensor poly or more cycles of amplification can be performed. An excess nucleotides which are used to detect and/or quantitate target of one primer can be used to produce an excess of one primer sequence expression. extension product during PCR; preferably, the primer exten 0185. The first cycle of amplification in polymerase sion product produced in excess is the amplification product based methods typically forms a primer extension product to be detected. A plurality of different primers may be used complementary to the template strand. If the template is to amplify different target polynucleotides or different single-stranded RNA, a polymerase with reverse tran regions of a particular target polynucleotide within the Scriptase activity is used in the first amplification to reverse sample. transcribe the RNA to DNA, and additional amplification 0188 An amplification reaction can be performed under cycles can be performed to copy the primer extension conditions which allow an optionally labeled sensor poly products. The primers for a PCR must, of course, be nucleotide to hybridize to the amplification product during at designed to hybridize to regions in their corresponding least part of an amplification cycle. When the assay is template that will produce an amplifiable segment; thus, performed in this manner, real-time detection of this hybrid each primer must hybridize so that its 3' nucleotide is paired ization event can take place by monitoring for light emission to a nucleotide in its complementary template Strand that is or fluorescence during amplification, as known in the art. located 3' from the 3' nucleotide of the primer used to 0189 Where the amplification product is to be used for replicate that complementary template strand in the PCR. hybridization to an array or microarray, a number of Suitable 0186 The target polynucleotide can be amplified by commercially available amplification products are available. contacting one or more strands of the target polynucleotide These include amplification kits available from NuGEN, with a primer and a polymerase having Suitable activity to Inc. (San Carlos, Calif.), including the WT OvationTM Sys extend the primer and copy the target polynucleotide to tem, WTOvationTM System v2, WTOvationTM Pico Sys produce a full-length complementary polynucleotide or a tem, WTOvationTM FFPE Exon Module, WTOvationTM Smaller portion thereof. Any enzyme having a polymerase FITE Exon Module RiboAmp and Ribo Amp'." RNA activity that can copy the target polynucleotide can be used, Amplification Kits (MDS Analytical Technologies (formerly including DNA polymerases, RNA polymerases, reverse Arcturus) (Mountain View, Calif.), Genisphere, Inc. (Hat transcriptases, enzymes having more than one type of poly field, Pa.), including the RampUp PlusTM and Sense.AmpTM merase or enzyme activity. The enzyme can be thermolabile RNA Amplification kits, alone or in combination. Amplified or thermostable. Mixtures of enzymes can also be used. nucleic acids may be subjected to one or more purification Exemplary enzymes include: DNA polymerases such as reactions after amplification and labeling, for example using DNA Polymerase I (“Pol I), the Klenow fragment of Pol I, magnetic beads (e.g., RNAClean magnetic beads, Agencourt T4, T7, Sequenase R. T7, Sequenase R. Version 2.0 T7, Tub, Biosciences). Taq, Tth, Pfx, Pfu, Tsp, Tfl. 15Tli and Pyrococcus sp G13-D 0190. Multiple RNA biomarkers can be analyzed using DNA polymerases; RNA polymerases such as E. coli, SP6, real-time quantitative multiplex RT-PCR platforms and T3 and T7 RNA polymerases; and reverse transcriptases other multiplexing technologies such as GenomeLab GeXP such as AMV. M-MuDV. MMLV RNAse MMLV (Super Genetic Analysis System (Beckman Coulter, Foster City, Script(R), SuperScript(R) II, ThermoScript(R), HIV-1, and Calif.), SmartCyclerR 9600 or GeneXpert(R) Systems RAV2 reverse transcriptases. All of these enzymes are (Cepheid, Sunnyvale, Calif.), ABI 7900 HT Fast Real Time commercially available. Exemplary polymerases with mul PCR system (Applied Biosystems, Foster City, Calif.), tiple specificities include RAV2 and Tli (exo-) polymerases. LightCyclerR 480 System (Roche Molecular Systems, Exemplary thermostable polymerases include Tub, Taq. Tth, Pleasanton, Calif.), XMAP 100 System (Luminex, Austin, Pfx, Pfu, Tsp, Tfl, Tli and Pyrococcus sp. GB-D DNA Tex.) Solexa Genome Analysis System (Illumina, Hayward, polymerases. Calif.), OpenArray Real Time qPCR (BioTrove, Woburn, 0187 Suitable reaction conditions are chosen to permit Mass.) and BeadXpress System (Illumina, Hayward, Calif.). amplification of the target polynucleotide, including pH, 0191 Prostate Classification Arrays buffer, ionic strength, presence and concentration of one or 0.192 The present invention contemplates that a Prostate more salts, presence and concentration of reactants and Cancer Prognostic Set or probes derived therefrom may be cofactors such as nucleotides and magnesium and/or other provided in an array format. In the context of the present metal ions (e.g., manganese), optional cosolvents, tempera invention, an "array' is a spatially or logically organized ture, thermal cycling profile for amplification schemes com collection of polynucleotide probes. Any array comprising prising a polymerase chain reaction, and may depend in part sensor probes specific for two or more of SEQ ID NOs: on the polymerase being used as well as the nature of the 1-2114 or a product derived therefrom can be used. Desir sample. Cosolvents include formamide (typically at from ably, an array will be specific for 5, 10, 15, 20, 25, 30, 50, about 2 to about 10%), glycerol (typically at from about 5 to 75, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, about 10%), and DMSO (typically at from about 0.9 to about 1000, 1200, 1400, 1600, 1800, 2000 or more of SEQ ID 10%). Techniques may be used in the amplification scheme NOs: 1-2114. Expression of these sequences may be in order to minimize the production of false positives or detected alone or in combination with other transcripts. In artifacts produced during amplification. These include Some embodiments, an array is used which comprises a wide US 2017/O 191133 A1 Jul. 6, 2017 20 range of sensor probes for prostate-specific expression prod 0199 Factors known in the art for diagnosing and/or ucts, along with appropriate control sequences. An array of Suggesting, selecting, designating, recommending or other interest is the Human Exon 1.0 ST Array (HuEx 1.0 ST, wise determining a course of treatment for a patient or class Affymetrix, Inc., Santa Clara, Calif.). of patients Suspected of having prostate cancer can be 0193 Typically the polynucleotide probes are attached to employed in combination with measurements of the target a solid substrate and are ordered so that the location (on the sequence expression. These techniques include cytology, substrate) and the identity of each are known. The poly histology, ultrasound analysis, MRI results, CT scan results, nucleotide probes can be attached to one of a variety of solid and measurements of PSA levels. Substrates capable of withstanding the reagents and condi 0200 Certified tests for classifying prostate disease status tions necessary for use of the array. Examples include, but and/or designating treatment modalities are also provided. A are not limited to, polymers, such as (poly)tetrafluoroethyl certified test comprises a means for characterizing the ene, (poly)Vinylidenedifluoride, polystyrene, polycarbonate, expression levels of one or more of the target sequences of polypropylene and polystyrene; ceramic; silicon; silicon interest, and a certification from a government regulatory dioxide; modified silicon; (fused) silica, quartz or glass; agency endorsing use of the test for classifying the prostate functionalized glass; paper, such as filter paper, diazotized disease status of a biological sample. cellulose; nitrocellulose filter, nylon membrane; and poly 0201 In some embodiments, the certified test may com acrylamide gel pad. Substrates that are transparent to light prise reagents for amplification reactions used to detect are useful for arrays that will be used in an assay that and/or quantitate expression of the target sequences to be involves optical detection. characterized in the test. An array of probe nucleic acids can 0194 Examples of array formats include membrane or be used, with or without prior target amplification, for use in filter arrays (for example, nitrocellulose, nylon arrays), plate measuring target sequence expression. arrays (for example, multiwell, such as a 24-, 96-, 256 384- 864- or 1536-well, microtitre plate arrays), pin arrays, 0202 The test is submitted to an agency having authority and bead arrays (for example, in a liquid “slurry). Arrays on to certify the test for use in distinguishing prostate disease Substrates Such as glass or ceramic slides are often referred status and/or outcome. Results of detection of expression to as chip arrays or “chips. Such arrays are well known in levels of the target sequences used in the test and correlation the art. In one embodiment of the present invention, the with disease status and/or outcome are submitted to the Prostate Cancer Prognosticarray is a chip. agency. A certification authorizing the diagnostic and/or (0195 Data Analysis prognostic use of the test is obtained. 0196. Array data can be managed and analyzed using 0203 Also provided are portfolios of expression levels techniques known in the art. The Genetrix suite of tools can comprising a plurality of normalized expression levels of the be used for microarray analysis (Epicenter Software, Pasa target sequences described herein, including SEQ ID NOs: dena, CA). Probe set modeling and data pre-processing can 1-2114. Such portfolios may be provided by performing the be derived using the Robust Multi-Array (RMA) algorithm methods described herein to obtain expression levels from or variant GC-RMA, Probe Logarithmic Intensity Error an individual patient or from a group of patients. The (PLIER) algorithm or variant iterPLIER. Variance or inten expression levels can be normalized by any method known sity filters can be applied to pre-process data using the RMA in the art; exemplary normalization methods that can be used algorithm, for example by removing target sequences with a in various embodiments include Robust Multichip Average standard deviation of <10 or a mean intensity of <100 (RMA), probe logarithmic intensity error estimation intensity units of a normalized data range, respectively. (PLIER), non-linear fit (NLFIT) quantile-based and nonlin 0197) In some embodiments, one or more pattern recog ear normalization, and combinations thereof. Background nition methods can be used in analyzing the expression level correction can also be performed on the expression data; of target sequences. The pattern recognition method can exemplary techniques useful for background correction comprise a linear combination of expression levels, or a include mode of intensities, normalized using median polish nonlinear combination of expression levels. In some probe modeling and sketch-normalization. embodiments, expression measurements for RNA tran 0204. In some embodiments, portfolios are established scripts or combinations of RNA transcript levels are formu such that the combination of genes in the portfolio exhibit lated into linear or non-linear models or algorithms (i.e., an improved sensitivity and specificity relative to known meth expression signature) and converted into a likelihood ods. In considering a group of genes for inclusion in a score. This likelihood score indicates the probability that a portfolio, a small standard deviation in expression measure biological sample is from a patient who will exhibit no ments correlates with greater specificity. Other measure evidence of disease, who will exhibit systemic cancer, or ments of variation Such as correlation coefficients can also who will exhibit biochemical recurrence. The likelihood be used in this capacity. The invention also encompasses the score can be used to distinguish these disease states. The above methods where the specificity is at least about 50% or models and/or algorithms can be provided in machine read at least about 60%. The invention also encompasses the able format, and may be used to correlate expression levels above methods where the sensitivity is at least about 90%. or an expression profile with a disease state, and/or to 0205 The gene expression profiles of each of the target designate a treatment modality for a patient or class of sequences comprising the portfolio can fixed in a medium patients. Such as a computer readable medium. This can take a 0198 Thus, results of the expression level analysis can be number of forms. For example, a table can be established used to correlate increased expression of RNAs correspond into which the range of signals (e.g., intensity measure ing to SEQ ID NOs: 1-2114, or subgroups thereof as ments) indicative of disease or outcome is input. Actual described herein, with prostate cancer outcome, and to patient data can then be compared to the values in the table designate a treatment modality. to determine the patient samples diagnosis or prognosis. In US 2017/O 191133 A1 Jul. 6, 2017 a more Sophisticated embodiment, patterns of the expression expression profiles in Such media. For example, the articles signals (e.g., fluorescent intensity) are recorded digitally or may comprise a readable storage form having computer graphically. instructions for comparing gene expression profiles of the 0206. The expression profiles of the samples can be portfolios of genes described above. The articles may also compared to a control portfolio. If the sample expression have gene expression profiles digitally recorded therein So patterns are consistent with the expression pattern for a that they may be compared with gene expression data from known disease or disease outcome, the expression patterns patient samples. Alternatively, the profiles can be recorded can be used to designate one or more treatment modalities. in different representational format. A graphical recordation For patients with test scores consistent with systemic disease is one such format. Clustering algorithms can assist in the outcome after prostatectomy, additional treatment modali visualization of Such data. ties such as adjuvant chemotherapy (e.g., docetaxel, mitox 0210 Kits antrone and prednisone), Systemic radiation therapy (e.g., 0211 Kits for performing the desired method(s) are also Samarium or strontium) and/or anti-androgen therapy (e.g., provided, and comprise a container or housing for holding Surgical castration, finasteride, dutasteride) can be desig the components of the kit, one or more vessels containing nated. Such patients would likely be treated immediately one or more nucleic acid(s), and optionally one or more with anti-androgen therapy alone or in combination with vessels containing one or more reagents. The reagents radiation therapy in order to eliminate presumed micro include those described in the composition of matter section metastatic disease, which cannot be detected clinically but above, and those reagents useful for performing the methods can be revealed by the target sequence expression signature. described, including amplification reagents, and may Such patients can also be more closely monitored for signs include one or more probes, primers or primer pairs, of disease progression. For patients with test scores consis enzymes (including polymerases and ligases), intercalating tent with PSA or NED, adjuvant therapy would not likely be dyes, labeled probes, and labels that can be incorporated into recommended by their physicians in order to avoid treat amplification products. ment-related side effects such as metabolic syndrome (e.g., 0212. In some embodiments, the kit comprises primers or hypertension, diabetes and/or weight gain) or osteoporosis. primer pairs specific for those Subsets and combinations of Patients with samples consistent with NED could be desig target sequences described herein. At least two, three, four or nated for watchful waiting, or for no treatment. Patients with five primers or pairs of primers suitable for selectively test scores that do not correlate with systemic disease but amplifying the same number of target sequence-specific who have successive PSA increases could be designated for polynucleotides can be provided in kit form. In some watchful waiting, increased monitoring, or lower dose or embodiments, the kit comprises from five to fifty primers or shorter duration anti-androgen therapy. pairs of primers suitable for amplifying the same number of 0207 Target sequences can be grouped so that informa target sequence-representative polynucleotides of interest. tion obtained about the set of target sequences in the group 0213. The primers or primer pairs of the kit, when used can be used to make or assist in making a clinically relevant in an amplification reaction, specifically amplify at least a judgment Such as a diagnosis, prognosis, or treatment portion of a nucleic acid depicted in one of SEQ ID NOs: choice. 1-2114 (or subgroups thereof as set forth herein), an RNA 0208. A patient report is also provided comprising a form thereof, or a complement to either thereof. The kit may representation of measured expression levels of a plurality include a plurality of Such primers or primer pairs which can of target sequences in a biological sample from the patient, specifically amplify a corresponding plurality of different wherein the representation comprises expression levels of nucleic acids depicted in one of SEQ ID NOs: 1-2114 (or target sequences corresponding to any one, two, three, four, subgroups thereof as set forth herein), RNA forms thereof, five, six, eight, ten, twenty, thirty, fifty or more of the target or complements thereto. At least two, three, four or five sequences depicted in SEQ ID NOs: 1-2114, or of the primers or pairs of primers suitable for selectively amplify subsets described herein, or of a combination thereof. In ing the same number of target sequence-specific polynucle Some embodiments, the representation of the measured otides can be provided in kit form. In some embodiments, expression level(s) may take the form of a linear or nonlinear the kit comprises from five to fifty primers or pairs of combination of expression levels of the target sequences of primers Suitable for amplifying the same number of target interest. The patient report may be provided in a machine sequence-representative polynucleotides of interest. (e.g., a computer) readable format and/or in a hard (paper) 0214. The reagents may independently be in liquid or copy. The report can also include Standard measurements of Solid form. The reagents may be provided in mixtures. expression levels of said plurality of target sequences from Control samples and/or nucleic acids may optionally be one or more sets of patients with known disease status and/or provided in the kit. Control samples may include tissue outcome. The report can be used to inform the patient and/or and/or nucleic acids obtained from or representative of treating physician of the expression levels of the expressed prostate tumor samples from patients showing no evidence target sequences, the likely medical diagnosis and/or impli of disease, as well as tissue and/or nucleic acids obtained cations, and optionally may recommend a treatment modal from or representative of prostate tumor samples from ity for the patient. patients that develop systemic prostate cancer. 0209. Also provided are representations of the gene 0215. The nucleic acids may be provided in an array expression profiles useful for treating, diagnosing, prognos format, and thus an array or microarray may be included in ticating, and otherwise assessing disease. In some embodi the kit. The kit optionally may be certified by a government ments, these profile representations are reduced to a medium agency for use in prognosing the disease outcome of prostate that can be automatically read by a machine Such as com cancer patients and/or for designating a treatment modality. puter readable media (magnetic, optical, and the like). The 0216) Instructions for using the kit to perform one or articles can also include instructions for assessing the gene more methods of the invention can be provided with the US 2017/O 191133 A1 Jul. 6, 2017 22 container, and can be provided in any fixed medium. The 0223) The device also comprises output means for out instructions may be located inside or outside the container or putting the disease status, prognosis and/or a treatment housing, and/or may be printed on the interior or exterior of modality. Such output means can take any form which any surface thereof. A kit may be in multiplex form for transmits the results to a patient and/or a healthcare provider, concurrently detecting and/or quantitating one or more dif and may include a monitor, a printed format, or both. The ferent target polynucleotides representing the expressed device may use a computer system for performing one or target Sequences. more of the steps provided. 0217 Devices 0218 Devices useful for performing methods of the CITATIONS invention are also provided. The devices can comprise means for characterizing the expression level of a target Patents and Published Applications sequence of the invention, for example components for 0224) 1. US 2003/0224399 A1 patent application Meth performing one or more methods of nucleic acid extraction, ods for determining the prognosis for patients with a prostate amplification, and/or detection. Such components may neoplastic condition 2003-12-04 include one or more of an amplification chamber (for 0225 2. US 2007/0048738 A1 patent application Meth example a thermal cycler), a plate reader, a spectrophotom ods and compositions for diagnosis, staging and prognosis eter, capillary electrophoresis apparatus, a chip reader, and of prostate cancer 2007-03-01 or robotic sample handling components. These components 0226) 3. US 2007/0099197 A1 patent application Meth ultimately can obtain data that reflects the expression level ods of prognosis of prostate cancer 2007 May 03 of the target sequences used in the assay being employed. 0227 4. US 2007/0259352 A1 patent application Prostate 0219. The devices may include an excitation and/or a cancer-related nucleic acids 2007 Nov. 08 detection means. Any instrument that provides a wavelength 0228 5. US 2008/0009001 A1 patent application Method that can excite a species of interest and is shorter than the for Identification of Neoplastic Transformation with Particu emission wavelength(s) to be detected can be used for lar Reference to Prostate Cancer 2008 Jan. 10 excitation. Commercially available devices can provide Suit 0229. Publications able excitation wavelengths as well as suitable detection 0230. 1: Cooper CS, Campbell C, Jhavar S. Mechanisms components. of Disease: biomarkers and molecular targets from microar 0220 Exemplary excitation sources include a broadband ray gene expression studies in prostate cancer. Nat Clin Pract UV light source Such as a deuterium lamp with an appro Urol. 2007 Dec;4(12):677-87. Review. priate filter, the output of a white light source Such as a 0231. 2: Reddy G. K. BalkSP. Clinical utility of microar Xenon lamp or a deuterium lamp after passing through a ray-derived genetic signatures in predicting outcomes in monochromator to extract out the desired wavelength(s), a prostate cancer. Clin Genitourin Cancer. 2006 Dec:5(3): 187 continuous wave (cw) gas laser, a solid state diode laser, or 9. Review. any of the pulsed lasers. Emitted light can be detected 0232 3: Nelson PS. Predicting prostate cancer behavior through any Suitable device or technique; many Suitable using transcript profiles. J Urol. 2004 Nov;172(5 Pt 2):528 approaches are known in the art. For example, a fluorimeter 32; discussion S33. Review. or spectrophotometer may be used to detect whether the test 0233 4: Bibikova M. Chudin E, Arsanjani A, Zhou L, sample emits light of a wavelength characteristic of a label Garcia E. W. Modder J, Kostelec M, Barker D, Downs T. Fan used in an assay. J. B. Wang-Rodriguez J. Expression signatures that corre 0221) The devices typically comprise a means for iden lated with Gleason score and relapse in prostate cancer. tifying a given sample, and of linking the results obtained to Genomics. 2007 Jun;89(6):666-72. Epub 2007 Apr 24. that sample. Such means can include manual labels, bar 0234 5: Schlomm T. Erbersdobler A, Mirlacher M, Sau codes, and other indicators which can be linked to a sample ter G. Molecular staging of prostate cancer in the year 2007. vessel, and/or may optionally be included in the sample World J Urol. 2007 Mar:25(1):19-30. Epub 2007 Mar 2. itself, for example where an encoded particle is added to the Review. sample. The results may be linked to the sample, for 0235 6: Mendiratta P. Febbo P G. Genomic signatures example in a computer memory that contains a sample associated with the development, progression, and outcome designation and a record of expression levels obtained from of prostate cancer. Mol Diagn Then 2007:11 (6):345-54. the sample. Linkage of the results to the sample can also 0236 7: Reddy G. K. BalkSP. Clinical utility of microar include a linkage to a particular sample receptacle in the ray-derived genetic signatures in predicting outcomes in device, which is also linked to the sample identity. prostate cancer. Clin Genitourin Cancer. 2006 Dec:5(3): 187 0222. The devices also comprise a means for correlating 9. Review. the expression levels of the target sequences being studied 0237 8: True L, Coleman I, Hawley S. Huang C. Y. with a prognosis of disease outcome. Such means may Gifford D, Coleman R, Beer T M, Gelmann E. Datta M, comprise one or more of a variety of correlative techniques, Mostaghel E. Knudsen B. Lange P. Vessella R, Lin D. Hood including lookup tables, algorithms, multivariate models, L. Nelson PS. A molecular correlate to the Gleason grading and linear or nonlinear combinations of expression models system for prostate adenocarcinoma. Proc Natl Acad Sci U or algorithms. The expression levels may be converted to S.A. 2006 Jul 18:103(29): 10991-6. Epub 2006 Jul 7. one or more likelihood scores, reflecting a likelihood that the 0238 9: Stephenson A.J. Smith A, Kattan M. W. Satago patient providing the sample will exhibit a particular disease pan J, Reuter V. E. Scardino PT, Gerald W L. Integration of outcome. The models and/or algorithms can be provided in gene expression profiling and clinical variables to predict machine readable format, and can optionally further desig prostate carcinoma recurrence after radical prostatectomy. nate a treatment modality for a patient or class of patients. Cancer. 2005 Jul 15:104(2):290-8. US 2017/O 191133 A1 Jul. 6, 2017

0239) 10: Bueno R, Loughlin K R, Powell M. H. Gordon commercially available High Pure FFPERNAMicro nucleic G. J. A diagnostic test for prostate cancer from gene expres acid extraction kit (Roche Applied Sciences, Indianapolis, sion profiling data.J Urol. 2004 Feb;171(2 Pt 1):903-6. Ind.). RNA concentrations were calculated using a Nano 0240) 11: Yu Y P. Landsittel D, Jing L, Nelson J, Ren B, drop ND-1000 spectrophotometer (Nanodrop Technologies, Liu L, McDonald C, Thomas R, Dhir R, Finkelstein S, Rockland, Del). Michalopoulos G. Becich M, Luo J. H. Gene expression 0248 RNA Amplification and GeneChip Hybridization. alterations in prostate cancer predicting tumor aggression Purified RNA was subjected to whole-transcriptome ampli and preceding development of malignancy. J Clin Oncol. fication using the WTOvation FFPE system including the 0241 2004 Jul 15:22(14):2790-9. WTOvation Exon and FL-Ovation Biotin V2 labeling mod 0242) 12: Feroze-Merzoug F. Schober M. S., Chen Y Q. ules, with the following modifications. Fifty (50) nanograms Molecular profiling in prostate cancer. Cancer Metastasis of RNA extracted from FFPE sections was used to generate Rev. 2001:20(3–4): 165-71. Review. amplified Ribo-SPIA product. For the WTOvation Exon 0243) 13: Nakagawa T, Kollmeyer T M. Morlan B W, sense-target strand conversion kit 4 ug of Ribo-SPIA prod Anderson SK, Bergstralh E. J. Davis BJ, Asmann Y.W. Klee uct were used. All clean-up steps were performed with G. G. Ballman K. V. Jenkins R B. A tissue biomarker panel RNAC lean magnetic beads (Agencourt BioSciences). predicting systemic progression after PSA recurrence post Between 2.5 and 5 micrograms of WTOvation Exon prod definitive prostate cancer therapy. PLoS ONE. 2008: 10 uct were used to fragment and label using the FL-Ovation 3(5):e2318. Biotin V2 labeling module and labeled product was hybrid 0244) 14: Shariat S F, Karakiewicz, P I, Roehrborn C G, ized to Affymetrix Human Exon 1.0 ST GeneChips follow Kattan MW. An updated catalog of prostate cancer predic ing manufacturer's recommendations (Affymetrix, Santa tive tools. Cancer 2008; 113(11): 3062-6. Clara, Calif.). Of the 30 samples processed, 22 had sufficient amplified material (i.e. >2.5ug of WTOvation Exon prod EXAMPLES uct) for GeneChip hybridization. 0249 Microarray Analysis. All data management and 0245. To gain a better understanding of the invention analysis was conducted using the Genetrix Suite of tools for described herein, the following examples are set forth. It will microarray analysis (Epicenter Software, Pasadena, Calif.). be understood that these examples are intended to describe Probe set modeling and data pre-processing were derived illustrative embodiments of the invention and are not using the Robust Multi-Array (RMA) algorithm. The mode intended to limit the scope of the invention in any way. of intensity values was used for background correction and Efforts have been made to ensure accuracy with respect to RMA-sketch was used for normalization and probe model numbers used (e.g., amounts, temperature, etc.) but some ing used a median polish routine. A variance filter was experimental error and deviation should be accounted for. applied to data pre-processed using the RMA algorithm, by Unless otherwise indicated, parts are parts by weight, tem removing target sequences with a mean intensity of <10 perature is degree centigrade and pressure is at or near intensity units of a normalized data range. Target sequences atmospheric, and all materials are commercially available. typically comprise four individual probes that interrogate the expression of RNA transcripts or portions thereof. Target Example 1 sequence annotations and the sequences (RNAS) that they interrogate were downloaded from the Affymetrix website Identification of Target Sequences Differentially (www.netaffx.com). Supervised analysis of differentially Expressed in Prostate Disease States expressed RNA transcripts was determined based on the fold 0246 Tissue Samples. Formalin-fixed paraffin embedded change in the average expression (at least 2 fold change) and (FFPE) samples of human prostate adenocarcinoma pros the associated t-test, with a p-value cut-off of p-0.001 tatectomies were collected from patients at the Mayo Clinic between different prostate cancer patient disease states. Comprehensive Cancer Center according to an institutional Linear regression was also used to screen differentially review board-approved protocol and stored in the Depart expressed transcripts that displayed an expression pattern of ment of Pathology for up to 20 years. For each patient NED>PSA>SYS or SYS>PSA>NED and genes were sample four 4 micron sections were cut from formalin-fixed selected with a p-value cut-off of p-0.01 for two-way paraffin embedded blocks. Pathological review of FFPE hierarchical clustering using Pearson’s correlation distance tissue sections was used to guide macrodissection of tumor metric with complete-linkage cluster distances. and Surrounding normal tissue. Patients were classified into (0250 Archived FFPE blocks of tumors were selected one of three clinical disease states; no evidence of disease from 30 patients that had undergone a prostatectomy at the (NED, n=10) for those patients with no biochemical or other Mayo Clinic Comprehensive Cancer Center between the clinical signs of disease progression (at least 10 years years 1987-1997, providing for at least 10 years follow-up follow-up); prostate-specific antigen biochemical recurrence on each patient. Twenty-two patient samples had RNA of (PSA, n=10) for those patients with two successive increases Sufficient quantity and quality for RNA amplification and in PSA measurements above an established cut-point of >4 Subsequent GeneChip hybridization. Three clinical catego ng/mL (rising PSA); and systemic disease (SYS, n=10) for ries of patients were evaluated; patients alive with no those patients that had rising PSA and developed metasta evidence of disease (NED, n=6), patients with rising PSA ses or clinically detectable disease progression within five or biochemical recurrence (defined as two Successive years after initial prostatectomy. Clinical disease was con increases in PSA measurements) (PSA, n=7) and patients firmed using bone or CT scans for prostate cancer metasta with rising PSA and clinical evidence of systemic or recur SS rent disease (e.g., determined by bone scan, CT) (SYS, 0247 RNA Extraction. RNA was extracted and purified n=9) after prostatectomy. No statistically significant differ from FFPE tissue sections using a modified protocol for the ences between these three clinical groups were apparent US 2017/O 191133 A1 Jul. 6, 2017 24 when considering pathological factors such as Gleason score 0255. Using the Nearest Shrunken Centroids (NSC) algo or tumor stage (Table 1). As samples from older archived rithm with leave-1-out cross-validation, smaller subsets of FFPE blocks are typically more degraded and fragmented RNA transcripts were identified that distinguish recurrent than younger blocks, the distribution of block ages was (i.e., SYS) and non-recurrent (i.e., PSA and NED) similar in the three clinical groups so as not to skew or bias disease (Tables 9 and 10). NSC algorithm identified 10- and the results due to a block age effect. Fifty nanograms of 41-target sequence metagenes used to derive patient out RNA extracted from FFPE sections was amplified and come predictor scores scaled and normalized on a data range hybridized to whole-transcriptome microarrays, interrogat of 0-100 points. FIGS. 5 and 6 depict box plots showing ing >1.4 million probe target sequences measuring RNA interquartile range and distribution of POP scores for each levels for RefSeq, dbPST and predicted transcripts (collec clinical group. A 148-target sequence metagene (Table 5) tively, RNAs). was similarly used to derive POP scores depicted in FIG. 0251 Table 3 displays the number of target sequences 7. T-tests were used to evaluate the statistical significance of identified in two-way comparisons between different clinical differences in POP scores between recurrent (i.e., SYS) states using the appropriate t-tests and a p-value cut-off of and non-recurrent (i.e., PSA and NED) patient groups p-0.001. At total of 2,114 target sequences (Table 3) were (indicated in the figures) and show that increasing the identified as differentially expressed in these comparisons number of target sequences in the metagene combination and a principle components analysis demonstrates that these target sequences discriminate the distinct clinical states into increases the significance level of the differences in POP three clusters (FIG. 1A). SCOS. 0252) A linear regression filter was next employed to 0256 The data generated from such methods can be used statistically rank target sequences that followed a trend of to determine a prognosis for disease outcome, and/or to either increased expression with poor prognosis patients recommend or designate one or more treatment modalities (i.e., SYS>PSA>NED) or increased expression in good for patients, to produce patient reports, and to prepare prognosis patients (NED-PSA>SYS, alternatively expression profiles. decreased expression in poor prognosis patients) (Table 4). FIG. 1B depicts a two-way hierarchical clustering dendro gram and expression matrix of top-ranked 526 target sequences and 22 tumor samples. Patients in the PSA Lengthy table referenced here clinical status category displayed intermediate expression levels for genes expressed at increased levels in SYS (n=313) and NED (n=213), respectively (Table 4). FIG. 1C US2O170191133A1-20170706-TOOOO1 depicts a two-way hierarchical clustering dendrogram and expression matrix of 148 target sequences and 22 tumor Please refer to the end of the specification for access instructions. samples. These target sequences were a Subset of the dif ferentially expressed transcripts (Table 3) filtered using a t-test to query recurrent (i.e., SYS) and non-recurrent (i.e., PSA and NED) patient samples (Table 5). 0253) The expression levels of these genes were summa rized for each patient into a metagene using a simple linear Lengthy table referenced here combination by taking the expression level and multiplying it by a weighting factor for each target sequence in the US2O170191133A1-20170706-TOOOO2 metagene signature and combining these values into a single Please refer to the end of the specification for access instructions. variable. Weighting factors were derived from the coeffi cients of the linear regression fit analysis (Table 4). FIG. 2 shows a histogram plot of the metagene expression values for the summarized 526 target sequences in each of the three clinical groups. This 526-metagene achieved maximal sepa ration between clinical groups and low variance within each Lengthy table referenced here clinical group. Metagenes comprised of Smaller Subsets of 21, 18 and 6 target sequences were also generated (FIG. 3, US2O170191133A1-20170706-TOOOO3 Tables 7 and 8). The distinctions between clinical groups Please refer to the end of the specification for access instructions. with respect to the metagene scores were preserved, although increased within-group variance was observed when using fewer target sequences (FIG. 3). 0254 Next, Patient outcome predictor (POP) scores were generated from the metagene values for each patient. For the 18-target sequences metagene, this entailed scaling Lengthy table referenced here and normalizing the metagene scores within a range of 0 to US2O170191133A1-20170706-TOOOO4 100, where a value of between 0-20 points indicates a patient with NED, 40-60 points a patient with PSA recurrence and Please refer to the end of the specification for access instructions. 80-100 points a patient with SYS metastatic disease (FIG. 4). In contrast, Gleason scores for patients could not be used on their own to distinguish the clinical groups (Table 1). US 2017/O 191133 A1 Jul. 6, 2017 25

Lengthy table referenced here Lengthy table referenced here US2O170191133A1-20170706-TOOOO8 US2O 170191133A1-20170706-TOOOO5 Please refer to the end of the specification for access instructions.

Please refer to the end of the specification for access instructions.

Lengthy table referenced here US2O170191133A1-20170706-TOOOO9 Please refer to the end of the specification for access instructions. Lengthy table referenced here

US2O 170191133A1-20170706-TOOOO6

Please refer to the end of the specification for access instructions. Lengthy table referenced here US2O170191133A1-20170706-TOOO 10 Please refer to the end of the specification for access instructions.

L engthyhv table referencedref dh here 0257 Although the invention has been described with US2O 170191133A1-20170706-TOOOO7 reference to certain specific embodiments, various modifi M cations thereof will be apparent to those skilled in the art Please refer to the end of the specification for access instructions. without departing from the spirit and scope of the invention. All such modifications as would be apparent to one skilled in the art are intended to be included within the scope of the following claims.

LENGTHY TABLES The patent application contains a lengthy table section. A copy of the table is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20170191133A1). An electronic copy of the table will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).

SEQUENCE LISTING The patent application contains a lengthy “Sequence Listing section. A copy of the “Sequence Listing is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20170191133A1). An electronic copy of the “Sequence Listing will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).

1-42. (canceled) 45. The method of claim 44, wherein the prostate cancer 43. A method comprising: is recurrent. (a) determining the expression level of at least one coding target sequence and at least one non-coding target 46. The method of claim 44, wherein the prostate cancer sequence in a sample from a subject having cancer; and is non-recurrent. (b) administering a treatment to the Subject. 47. The method of claim 43, wherein the sample is 44. The method of claim 43, wherein the cancer is prostate selected from the group consisting of an aspirate, a needle CaCC. biopsy, a cytology pellet, a bulk tissue preparation or a US 2017/O 191133 A1 Jul. 6, 2017 26 section thereof obtained for example by Surgery or autopsy, 52. The method of claim 43, wherein determining the lymph fluid, blood, plasma, serum, tumors, and organs. expression level of at least one coding target sequence and at least one non-coding target sequence comprises amplify 48. The method of claim 43, wherein the sample is a ing the target sequences. needle-biopsy sample or Surgical resection sample that has 53. The method of claim 43, wherein determining the been fixed in formalin and embedded in paraffin wax. expression level of at least one coding target sequence and at least one non-coding target sequence comprises sequenc 49. The method of claim 43, wherein the expression level ing the target sequences. of one or more target sequences is determined by a method 54. The method of claim 43, wherein determining the selected from the group consisting of RT-PCR, Northern expression level of at least one coding target sequence and blotting, ligase chain reaction, array hybridization, and a at least one non-coding target sequence comprises extracting combination thereof. the target sequences from the sample. 55. The method of claim 43, wherein determining the 50. The method of claim 43, wherein the treatment is expression level of at least one coding target sequence and selected from the group consisting of adjuvant chemo at least one non-coding target sequence comprises a com therapy, systemic radiation therapy, anti-androgen therapy or puter model or algorithm. a combination thereof. 56. The method of claim 57, wherein the computer model 51. The method of claim 43, wherein at least one target or algorithm is linear or non-linear. sequence is selected from SEQ ID NOs: 1-2114. k k k k k