US 20110136690A1 (19) United States (12) Patent Application Publication (10) Pub. No.: US 2011/0136690 A1 HOOD et al. (43) Pub. Date: Jun. 9, 2011

(54) METHODS FOR IDENTIFYING AND Publication Classification MONITORING DRUGSIDE EFFECTS (51) Int. Cl. (76) Inventors: LEROY HOOD, SEATTLE, WA GOIN 33/68 (2006.01) (US); BLAOYANG LIN GOIN 33/53 (2006.01) BOfHELE WA (US) s C40B 30/04 (2006.01) (21) Appl. No.: 13/023,366 (52) U.S. Cl...... 506/9; 436/86; 435/7.92

(22) Filed: Feb. 8, 2011 (57) ABSTRACT Related U.S. Application Data The present invention relates generally to methods for iden (63) Continuation of application No. 12/468.834, filed on tifying drug side effects by detecting perturbations in organ May 19, 2009 E. Pat. No 7 8838 58 which is a specific molecular blood fingerprints. The invention further continuation of application No 1 73 42 3 67 filed on relates to methods for identifying drug-specific organ-spe Jan. 27, 2006, now abandoned s- Y s cific molecular blood fingerprints. As such, the present inven • 1- s s tion provides compositions comprising organ-specific pro (60) Provisional application No. 60/683,004, filed on May teins, detection reagents for detecting Such , and 20, 2005, provisional application No. 60/647,792, panels and arrays for determining organ-specific molecular filed on Jan. 27, 2005. blood fingerprints. US 2011/O 136690 A1 Jun. 9, 2011

METHODS FOR DENTIFYING AND one embodiment, the level of one or more organ-specific MONITORING DRUGSIDE EFFECTS proteins is measured. In yet an additional embodiment, the plurality of organ-specific proteins comprises from at least STATEMENT OF GOVERNMENT INTEREST about 2 organ-specific proteins to about 100, 150, 160, 170, 0001. This invention was made with government support 180, 190, 200 or more organ-specific proteins. In this regard, under Grant Nos. P50 CAO971.86 and PO1 CA085857 the plurality of organ-specific proteins may comprise at least awarded by the National Cancer Institute. The government 2, 3, 4, 5, 6,7,8,9, 10, or more organ-specific proteins. In one may have certain rights in this invention. embodiment, the plurality of organ-specific proteins com prises about 5, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 STATEMENT REGARDING SEQUENCE organ-specific proteins. In another embodiment, the organ LISTING SUBMITTED ON CD-ROM specific proteins comprise proteins from any of a number of organs, such as, but not limited to the liver or kidney. In a 0002 The Sequence Listing associated with this applica further embodiment the organ-specific proteins comprise car tion is provided on CD-ROM in lieu of a paper copy, and is diac-specific proteins. In yet another embodiment, the organ hereby incorporated by reference into the specification. Three specific proteins are from an organ other than the expected CD-ROMs are provided, containing identical copies of the therapeutic target of the drug. sequence listing: CD-ROM No. 1 is labeled COPY 1, con 0009. Another aspect of the present invention provides a tains the file 402.app.txt which is 3.31 MB and created on Jan. method for detecting a drug side effect comprising measuring 27, 2006; CD-ROM No. 2 is labeled COPY 2, contains the file in the blood of a subject taking the drug the level of one or 402.app.txt which is 3.31 MB and created on Jan. 27, 2006; more organ-specific proteins secreted from an organ wherein CD-ROM No. 3 is labeled CRF (Computer Readable Form), the level of the one or more organ-specific proteins together contains the file 402.app.txt which is 3.31 MB and created on provide an organ-specific molecular blood fingerprint that Jan. 27, 2006. indicates a drug side effect on the organ in the Subject. BACKGROUND OF THE INVENTION 0010. Another aspect of the invention provides a method for determining the presence or absence of a drug side effect 0003 1. Field of the Invention in a Subject taking the drug comprising, detecting a level of 0004. The present invention relates to organ-specific each of a plurality of organ-specific proteins in a blood molecular blood fingerprints and methods for using the same sample from the Subject, wherein the plurality of organ-spe in identifying and/or monitoring drug side effects. cific proteins are secreted from the same organ; comparing 0005 2. Description of the Related Art said level of each of the plurality of organ-specific proteins in 0006 Side effects, particularly adverse side effects, are the blood sample from the subject to a level of each of the monitored and tested throughout the developmental path of plurality of organ-specific proteins in a control sample of all drugs. A variety of in vitro and in vivo toxicity assays are drug-free blood; wherein a statistically significant altered available in the art for testing on and off-target effects of drugs level of one or more of the plurality of organ-specific proteins during development including metabolic effects on liver P450 in the blood is indicative of the presence or absence of a drug enzymes. Clearly, however, as is evidenced by the recent side effect. As would be readily appreciated by the skilled controversy surrounding COX-2 inhibitors, off-target side artisan, an altered level can mean an increase in the level or a effects can be subtle and difficult to detect. Therefore, moni decrease in the level. In this regard, the skilled artisan would toring and identifying drug side effects remains an important readily appreciate that a variety of statistical tests can be used issue with regard to drug safety of developing drugs and to determine if an altered level is significant. The Z-test (Man, approved drugs already on the market. The COX-2 story M. Z. et al., Bioinformatics, 16: 953-959, 2000) or other highlights the need in the art for improved methods to detect appropriate statistical tests can be used to calculate P values sometimes subtle adverse off-target effects of drugs. for comparison of expression levels. In certain 0007. The present invention provides methods that satisfy embodiments, the level of each of the plurality of organ this and other needs. specific proteins in the blood sample from the subject is compared to a previously determined normal control level of BRIEF SUMMARY OF THE INVENTION each of the plurality of organ-specific proteins taking into 0008. One aspect of the present invention provides a account standard deviation (see e.g., U.S. Patent Application method for detecting a drug side effect comprising measuring No. 20020095259). In one embodiment, the level of each of in the blood of a subject taking the drug the level of a plurality the plurality of organ-specific proteins is detected using any of organ-specific proteins secreted from an organ wherein the one or more methods, such as, but not limited to mass spec levels of the plurality of organ-specific proteins together pro trometry (e.g., tandem mass spectrometry or other spectrom vide an organ-specific molecular blood fingerprint that indi etry-based techniques), and immunoassays (e.g., ELISA, cates a drug side effect on the organ in the Subject. In one Western blot, or other immunoaffinity-based assays). In an embodiment, the level of the plurality of organ-specific pro additional embodiment, the method level of each of the plu teins is measured with any of a variety of methods, including rality of organ-specific proteins is measured using an anti but not limited to mass spectrometry, such as tandem mass body array. In yet an additional embodiment, the method spectrometry, an immunoassay, such as an ELISA, Western provides for determining the presence or absence of a drug blot, microfluidics/nanotechnology sensors, and aptamer side effect wherein the organ-specific proteins comprise liver capture assay. In this regard, an aptamer may be used in a specific proteins or kidney-specific proteins. In certain similar manner to an antibody in a variety of appropriate embodiments, the organ-specific proteins are from an organ binding assays known to the skilled artisan and described other than the expected therapeutic target of the drug. herein. In a further embodiment, the plurality of organ-spe 0011. A further aspect of the present invention provides a cific proteins is measured using tandem mass spectrometry. In method for detecting perturbation of a normal biological State US 2011/O 136690 A1 Jun. 9, 2011 induced by a drug, contacting a blood sample from a subject (0016 SEQID NOs:73-593 are MPSS signature sequences taking the drug with a plurality of detection reagents each that correspond to differentially expressed in prostate specific for an organ-specific protein secreted into blood, cancer cell lines LNCaP and CL1 that encode secreted pro wherein each organ-specific protein is secreted from the same teins (see Table 3). organ; measuring the amount of the organ-specific protein 0017 SEQ ID NOs:594-1511 are the GENBANK detected in the blood sample by each detection reagent, com sequences of differentially expressed genes that encode pre paring the amount of the organ-specific protein detected in the dicted secreted proteins as referred to in Table 3. Both poly blood sample by each detection reagent to a predetermined nucleotide and amino acid sequences are provided for each control amount for each organ-specific protein; wherein an GENBANK accession number. altered level in one or more of the organ-specific proteins (0018 SEQ ID NOs: 1512-1573 are the amino acid indicates a perturbation in the normal biological state induced sequences from GENBANK of prostate-specific proteins by the drug. In this regard, the plurality of detection reagents potentially secreted into blood as described in Table 4. may comprise from at least about 2 detection reagents to (0.019 SEQ ID NOs: 1574-1687 are the GENBANK about 100, 150, 160, 170, 180, 190, 200 or more detection sequences of examples of differentially expressed genes as reagents. In on embodiment, the plurality of detection described in Table 1. Both polynucleotide and amino acid reagents comprises about 5, 10, or 20 detection reagents. In sequences are provided where available for each GENBANK yet another embodiment, the organ-specific proteins com accession number. prise kidney-specific proteins, liver-specific proteins, or car 0020 SEQ ID NOs: 1688-1796 are MPSS signature diac-specific proteins. As would be recognized by the skilled sequences that correspond to prostate-specific/enriched artisan upon reading the present disclosure, the organ-specific genes as described in Table 5. proteins can be derived from any organ in the body as 0021 SEQ ID NOs: 1797-1947 are the GENBANK described further herein. sequences of prostate-specific genes as described in Table 5. 0012. In yet a further aspect, the invention provides a Both polynucleotide and amino acid sequences are provided diagnostic panel for determining a drug side effectina Subject where available for each GENBANK accession number. taking the drug comprising, a plurality of detection reagents each specific for detecting one of a plurality of organ-specific DETAILED DESCRIPTION OF THE INVENTION proteins present in a blood/serum/plasma sample; wherein 0022. A powerful new systems approach to disease is the organ-specific proteins are secreted from the same organ revealing powerful new blood diagnostics/monitoring and wherein detection of the plurality of organ-specific pro approaches. Particularly, inspecific cells there are protein and teins with the plurality of detection reagents results in an regulatory networks that mediate the normal functions organ-specific molecular blood fingerprint indicative of the of the cell. The disease process causes one or more of these drug side effect in the subject. In certain embodiments, the networks to be perturbed, either genetically or environmen fingerprint (e.g., the pattern of interaction of the detection tally (e.g. infections). The disease-altered networks result in reagents with each of the plurality of organ-specific proteins) altered patterns of protein expression—and some of the tran is the combination of, a snapshot of sorts, of the different Scripts with altered expression levels are organ (cell)-specific quantitative levels of the organ-specific proteins detected. and Some of these organ-specific transcripts encode secreted Thus, in other words, the fingerprint is a set of numbers, each proteins. Hence disease leads to altered expression patterns of number corresponding to a level of a particular organ-specific organ-specific, Secreted proteins in the blood. Drugs also protein. This set of numbers and the specific organ-specific cause altered expression of organ-specific Secreted proteins in proteins that they correspond to together make up the unique the blood. In particular, the liver and kidney are organs that fingerprint that defines a biological condition. In this regard, often reflect the side effects of drugs. the detection reagents may comprise antibodies or antigen 0023 Hence the blood may be viewed as a window into the binding fragments thereof or monoclonal antibodies, or anti health and disease of an individual. The levels of organ gen-binding fragments thereof The panels of the present specific secreted proteins present in the blood taken together invention may comprise from at least about 2 detection represent molecular fingerprints in the blood that reflect the reagents to about 100, 150, 160, 170, 180, 190, 200 or more operation of normal organs. Each organ has a specific quan detection reagents. In one embodiment, the panel comprises titative molecular fingerprint. When a drug has a side effect on about 5, 10, or 20 detection reagents. In a further embodi a particular organ, that blood fingerprint changes, for ment, the plurality of detection reagents are specific for kid example, in the levels of these proteins expressed in the blood ney-specific proteins, liver-specific proteins, cardiac-specific and the change in the fingerprint correlates with the specific proteins, or indeed specific for proteins derived from any effect the drug has on the organ. The changes in the finger organ as described herein. prints occur as a consequence of virtually any disease or organ perturbation (e.g., drug effect) with each disease or BRIEF DESCRIPTION OF THE SEQUENCE drug effect resulting in a unique fingerprint. The changes in IDENTIFIERS the fingerprints are sufficiently informative to visualize side effects of drugs, be they adverse or positive. Thus, as used 0013 SEQ ID NO:1 is the cDNA sequence that encodes herein, side effect refers to any unintended effect of a drug, the WDR19 prostate specific secreted protein. either positive (e.g., a previously unrecognized positive indi 0014 SEQ ID NO:2. is the amino acid sequence of the cation for a drug) or negative (e.g., toxicity or other adverse WDR19 prostate specific secreted protein. effect). The drug-altered fingerprints are determined by com 0015 SEQ ID NOS:3-72 are MPSS signature sequences paring the blood from normal individuals against that from that correspond to differentially expressed genes in LNCaP patients on a particular drug regimen. Not only will the abso cells (early prostate cancer phenotype) to androgen-indepen lute levels of the changes in the proteins constituting indi dent CL1 cells (late prostate cancer phenotype) (see Table 1). vidual fingerprints be determined, but all the protein changes US 2011/O 136690 A1 Jun. 9, 2011

(e.g. N changed proteins) will be compared against one invention provides the ability to determine those individuals another to generate an N-dimensional shape space that will who may have adverse reactions to drugs. correlate even more powerfully with the stratifications of 0027. A drug, as used herein, refers to any substance (syn drug-induced alterations as described herein (see e.g., U.S. thetic or natural) which when administered to or otherwise Patent Application No. 20020095259). absorbed into a living organism or system derived therefrom, 0024. The studies described herein use prostate cancer as may modify one or more of its functions. a model for studying perturbation of organ-specific molecular Methods for Identifying Organ-Specific Proteins Secreted blood fingerprints. The same principles apply in the setting of into the Blood. determining perturbations that result from drugs. In the stud 0028. The invention provides methods for identifying ies described herein, the transcriptomes of two prostate can organ-specific secreted proteins. In this regard, as used cer cell lines were analyzed: LNCaP, an androgen sensitive herein, the term “organ” is defined as would be understood in cell line, and hence a model for early stage of prostate cancer, the art. Thus, the term, “organ-specific' as used herein refers and a variant of this cell, CL1, an androgen unresponsive cell to proteins (or transcripts) that are primarily expressed in a line, thus, a model for late stage of prostate cancer. Analyses single organ. It should be noted that the skilled artisan would of the transcriptomes of these two cell lines revealed changes readily appreciate upon reading the instant specification that in cellular states that occur with the progression of prostate cell-specific transcripts and proteins and tissue-specific tran cancer. These transcriptomes were also compared to normal Scripts and proteins are also contemplated in the present prostate tissue, prostate cancer tissues and prostate cancer invention. As such, and as discussed further herein, in certain metastases. These prostate transcriptomes were compared embodiments, organ-specific protein is defined as a protein against their counterparts from 29 other tissues to identify encoded by a transcript that is expressed at a level of at least those transcripts that are primarily expressed in the prostate. 3 copies/million (as measured, for example, by massively Computational approaches were used to predict which of parallel signature sequencing (MPSS) in the cell/tissue/organ these transcripts encode secreted proteins. Further, a prostate of interest but is expressed at less than 3 copies/million in protein, referred to as WDR19, that was previously shown by other cells/tissues/organs. In a further embodiment, an organ microarray and northern analysis to be prostate-specific, was specific protein is one that is encoded by a transcript that is used in a multiparameter analysis of prostate cancer samples. expressed 95% in one organ and the remaining 5% in one or 0025 Thus, the present invention is generally directed to more other organs. (In this context, total expression across all methods for identifying organ-specific secreted proteins organs examined is taken as 100%). present in the blood. The present invention is also directed to 0029. In certain embodiments, an organ-specific protein is methods for defining organ-specific molecular blood finger one that is encoded by a transcript that is expressed at about prints and further provides defined examples of predicted 50%, 55%, 60%, 65%, 70%, 75%, 80% to about 90% in one organ-specific molecular blood fingerprints. Additionally, the organ and wherein the remaining 10%-50% As would be present invention is directed to panels of reagents or pro readily recognized by the skilled artisan upon reading the teomic techniques employing mass spectrometry that detect present disclosure, in certain embodiments, an organ-specific organ-specific secreted proteins in the blood for use in iden molecular blood fingerprint can readily be discerned even if tifying side effects of drugs, evaluating drug toxicity, and Some expression of an “organ-specific' protein from a par other related applications. ticular organ is detected at Some level in another organ, or 0026. By predefining the components of a given molecular even more than one organ. For example, the organ-specific blood fingerprint using the methods described herein, the molecular blood fingerprint from prostate can conclusively present invention alleviates the need to blindly search for identify a particular prostate disease (and stage of disease) protein patterns using blood proteomics. Thus, the present despite expression of one or more protein members of the invention enables the skilled artisan to 1) identify blood pro fingerprint in one or more other organs. Thus, an organ teins which collectively constitute unique organ-specific specific protein as described herein may be predominantly or molecular blood fingerprints for healthy, diseased individuals differentially expressed in an organ of interest rather than and individuals affected (either adversely or positively) by uniquely or specifically expressed in the organ. In this regard, one or more drugs; 2) identify unique organ-specific molecu in certain embodiments, differentially expressed means at lar blood fingerprints associated with the direct (e.g., least 1.5 fold expression in the organ of interest as compared intended) effects of drugs or the side effects of different to other organs. In another embodiment, differentially drugs; 3) identify fingerprints that can uniquely distinguish expressed means at least 2 fold expression in the organ of the different types of side effects. Importantly, the organ interest as compared to expression in other organs. In yet a specific, Secreted blood fingerprints can be predicted from a further embodiment, differentially expressed means at least combination of quantitative comparative transcriptome stud 2.5, 3, 3.5, 4, 4.5, 5 fold or higher expression in the organ of ies and computational methods to predict which transcripts interest as compared to expression of the protein in other encode secreted proteins. The methods for determining the organs. As described elsewhere herein, “protein’ expression organ-specific, blood fingerprints for all organs described can be determined by analysis of transcript expression using herein allow drug effects (either adverse or positive) on any a variety of methods. organ to be easily identified. Further, the present invention 0030. In one embodiment, the organ-specific proteins are can be used to determine distinct, normal organ-specific identified by preparing a cDNA library from an organ of molecular blood fingerprints, such as in different populations interest. Any organ of a mammalian body is contemplated of people. In this regard, there may be differences in normal herein. Illustrative organs include, but are not limited to, organ-specific molecular blood fingerprints between popula heart, kidney, ureter, bladder, urethra, liver, prostate, heart, tions of individuals that permit the stratification of patients blood vessels, bone marrow, skeletal muscle, Smooth muscle, into classes of individuals who would respond positively to a brain (amygdala, caudate nucleus, cerebellum, corpuscallo particular drug and those who would not. Thus, the present Sum, fetal, hypothalamus, thalamus), spinal cord, peripheral US 2011/O 136690 A1 Jun. 9, 2011

nerves, retina, nose, trachea, lungs, mouth, salivary gland, ing to the methods described in U.S. Pat. Nos. 6,013,445; esophagus, stomach, Small intestines, large intestines, hypo 6,172,218: 6,172,214; 6,140,489 and Brenner, P., et al., Nat thalamus, pituitary, thyroid, pancreas, adrenal glands, ova Biotechnol, 18:630-634 2000. Briefly, polynucleotide tem ries, Oviducts, uterus, placenta, Vagina, mammary glands, plates from a cDNA library of interest are cloned into a vector testes, seminal vesicles, penis, lymph nodes, PBMC, thymus, system that contains a vast set of minimally cross-hybridizing and spleen. As noted above, upon reading the present disclo oligonucleotide tags (see U.S. Pat. No. 5,863,722). The num Sure, the skilled artisan would recognize that cell-specific and ber of tags is usually at least 100 times greater than the tissue-specific proteins are contemplated herein and thus, number of cDNA templates (see e.g., U.S. Pat. No. 6,013.445 proteins specifically expressed in cells or tissues that make up and Brenner, P. et al., Supra). Thus, the set of tags is such that Such organs are also contemplated herein. In certain embodi a 1% sample taken of template-tag conjugates ensures that ments, in each of these organs, transcriptomes are obtained essentially every template in the sample is conjugated to a for the cell types in which the disease of interest arises. For unique tag and that at least one of each of the different tem example, in the prostate there are two dominant types of plate cDNAs is represented in the sample with >99% prob cells—epithelial cells and stromal cells. About 98% of pros ability (U.S. Pat. No. 6,013,445 and Brenner, P. et al., supra). tate cancers arise in epithelial cells. As such, in certain The conjugates are then amplified and hybridized under Strin embodiments, “organ-specific’ means the transcripts that are gent conditions to microbeads each of which has attached expressed in particular cell types of the organ of interest (e.g., thereto a unique complementary, minimally cross-hybridiz prostate epithelial cells). In this regard, any cell type that ing oligonucleotide tag. The transcripts are then directly makes up any of the organs described herein is contemplated sequenced simultaneously in a flow cell using a ligation herein. Illustrative cell types include, but are not limited to, based sequencing method (see e.g., U.S. Pat. No. 6,013.445). epithelial cells, stromal cells, endothelial cells, endodermal A short signature sequence of about 17-20 base pairs is gen cells, ectodermal cells, mesodermal cells, lymphocytes (e.g., erated simultaneously from each of the hundreds of thou B cells and T cells including CD4+ T helper 1 or T helper 2 sands of beads (or more) in the flow cell, each having attached type cells, CD8+ cytotoxic T cells), erythrocytes, kerati thereto copies of a unique transcript from the sample. This nocytes, and fibroblasts. Particular cell types within organs or technique is termed massively parallel signature sequencing tissues may be obtained by histological dissection, by the use (MPSS). of specific cell lines (e.g., prostate epithelial cell lines), by cell 0034. In certain embodiments, other techniques may be sorting or by a variety of other techniques known in the art. used to evaluate the transcripts from a particular cDNA 0031. It should be noted that in certain embodiments, fin library, including microarray analysis (Han, M., et al., Nat gerprints can be determined from “organ-specific' proteins Biotechnol, 19:631-635, 2001; Bao, P., et al., Anal Chem, 74: from multiple organs, such as from organs that share a com 1792-1797, 2002; Schena et al., Proc. Natl. Acad. Sci. USA mon function or make up a system (e.g., digestive system, 93: 10614-19, 1996; and Heller et al., Proc. Natl. Acad. Sci. circulatory system, respiratory system, the immune system USA 94:2150-55, 1997) and SAGE (serial analysis of gene (including the different cells of the immune system, such as, expression). Like MPSS, SAGE is digital and can generate a but not limited to, B cells, T cells including CD4+ T helper 1 large number of signature sequences. (see e.g., Velculescu, V. or T helper 2 type cells, regulatory T cells, CD8+ cytotoxic T E., et al., Trends Genet, 16: 423-425., 2000; Tuteja R. and cells, NK cells, dendritic cells, macrophages, monocytes, Tuteja N. Bioessays. 2004 August; 26(8):916-22) although neutrophils, granulocytes, mast cells, etc.), cardiovascular the coverage is not nearly as deep as with MPSS. system, the sensory system, the skin, brain and the nervous 0035. The resulting sequences, (e.g., MPSS signature system, and the like). sequences), are generally about 20 bases in length. However, 0032 Complementary DNA (cDNA) libraries can be gen in certain embodiments, the sequences can be about 10, 15. erated using techniques known in the art, such as those 20, 25, 30,35, 40, 45, 50,55, 60, 65,70, 75, 80, 85,90, 95, or described in Ausubel et al. (2001 Current Protocols in 100 or more bases in length. The sequences are annotated Molecular Biology, Greene Publ. Assoc. Inc. & John Wiley & using annotated sequence (Such as human Sons, Inc., NY, N.Y.); Sambrooketal. (1989 Molecular Clon genome release hg16, released in November, 2003, or other ing, Second Ed., Cold Spring Harbor Laboratory, Plainview, public databases) and the human Unigene (Unigene build N.Y.); Maniatis et al. (1982 Molecular Cloning, Cold Spring #184) using methods known in the art, such as the method Harbor Laboratory, Plainview, N.Y.) and elsewhere. Further, described by Meyers, B. C., et al., Genome Res, 14: 1641 a variety of commercially available kits for constructing 1653, 2004. Other databases useful in this regard include cDNA libraries are useful for making the cDNA libraries of Genbank, EMBL, or other publicly available databases. In the present invention. Libraries are constructed from organs/ certain embodiments, transcripts are considered only for tissues/cells procured from normal Subjects. those with 100% matches between an MPSS or other type of 0033 All or substantially all of the transcripts of the signature and a genome signature. As would be readily appre cDNA library, e.g., representing virtually or substantially all ciated by the skilled artisan upon reading the present disclo genes functioning in the organ of interest, are cloned and Sure, this is a stringent match criterion and in certain embodi sequenced using any of a variety of techniques known in the ments, it may be desirable to use less Stringent match criteria. art. In this regard, in certain embodiments, Substantially all Indeed, polymorphisms could lead to variations in transcripts refers to a sample representing at least 80% of all genes that would be missed if only exact matches were used. For functioning in the organ of interest. In a further embodiment, example, it may be desirable to consider signature sequences substantially all refers to a sample representing at least 85%, that match a genome signature with 80%, 85%, 90%. 95%, 90%. 95%, 96%.97%, 98% 99% or higher of all genes func 96%, 97%, 98%, or 99% identity. In one embodiment, signa tioning in the organ of interest. In one embodiment, Substan tures that are expressed at less than 3 transcripts per million in tially all the transcripts from a cDNA library are amplified, libraries of interest are disregarded, as they might not be Sorted and signature sequences generated therefrom accord reliably detected since this, in effect, represents less than one US 2011/O 136690 A1 Jun. 9, 2011

transcript per cell (see for example, Jongeneel, C. V., et al., 65%, 70%, 75%, 80% to about 90% in one organ and wherein Proc Natl AcadSci USA, 2003). cDNA signatures are classi the remaining 10%-50% is expressed in one or more other fied by their positions relative to polyadenylation signals and Organs. poly (A) tails and by their orientation relative to the 5->3' 0039. In another embodiment, organ-specific transcripts orientation of source mRNA. Full-length sequences corre are identified by determining the ratio of expression of a sponding to the signature sequences can be thus identified. transcript in the organ of interest as compared to other organs. 0036. In order to identify organ-specific transcripts, the In this regard, expression levels in the organ of interest of at resulting annotated transcripts are compared against public least 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0 fold or and/or private sequence databases, such as a variety of anno higher as compared to expression in all other organs is con tated human genome sequence databases (e.g., Genebank, the sidered to be organ-specific expression. EMBL and Japanese databases and databases generated and 0040. As would be readily recognized by the skilled arti compiled from other normal tissues, to identify those tran San upon reading the present disclosure, in certain embodi Scripts that are expressed primarily in the organ of interest but ments, an organ-specific molecular blood fingerprint can are not expressed in other organs. As noted elsewhere herein, readily be discerned even if some expression of an “organ Some expression in organs other than the organ of interest specific' protein from a particular organ is detected at Some does not necessarily preclude the use of a particular transcript level in another organ, or even more than one organ. This is in a blood molecular signature panel of the present invention. because the fingerprint (e.g., the combination of the levels of 0037 Comparisons of the transcripts between databases multiple proteins; the pattern of the expression levels of mul can be made using a variety of computer analysis algorithms tiple markers) itself is unique despite that the expression known in the art. As such, alignment of sequences for com levels of one or more individual members of the fingerprint parison may be conducted by the local identity algorithm of may not be unique to a particular organ. For example, the Smith and Waterman (1981) Add. APL. Math 2:482, by the organ-specific molecular blood fingerprint from prostate can identity alignment algorithm of Needleman and Wunsch conclusively identify a particular prostate disease (and stage (1970).J. Mol. Biol. 48:443, by the search for similarity meth of disease) despite Some expression of one or more members ods of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA of the fingerprint in one or more other organs. Thus the 85: 2444, by computerized implementations of these algo present invention relates to determining the presence or rithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the absence of a disease or condition or stage of disease based on Wisconsin Genetics Software Package, Genetics Computer a pattern (e.g., fingerprint) of markers measured concurrently Group (GCG),575 Science Dr., Madison, Wis.), or by inspec using any one or more of a variety of methods described tion. As would be understood by the skilled artisan, many herein (e.g., antibody binding, mass spectrometry, and the algorithms are available and are continually being developed. like), rather than the measure of individual markers. Appropriate algorithms can be chosen based on the specific 0041. In further embodiments, specificity can be con needs for the comparisons being made (See also, e.g., J. A. firmed at the protein level using immunohistochemistry Cuff, et al., Bioinformatics, 16(2):111-116, 2000; S. F Alts (IHC) and/or other protein measurement techniques known in chul and B. W. Erickson. Bulletin of Mathematical Biology, the art (e.g., isotope-coded affinity tags and mass spectrom 48(5/6):603-616, 1986; S. F. Altschul and B. W. Erickson. etry, such as described by Han, D. K., et al., Nat Biotechnol, Bulletin of Mathematical Biology, 48(5/6):633-660, 1986; S. 19:946-951, 2001). The Z-test (Man, M. Z. et al., Bioinfor F. Altschul, et al., J. Mol. Bio., 215:403-410, 1990; K. Bucka matics, 16: 953-959, 2000) or other appropriate statistical Lassen, et al., BIOINFORMATICS, 15(2):122-130, 1999; tests can be used to calculate P values for comparison of gene K.-M. Chao, et al., Bulletin of Mathematical Biology, 55(3): and protein expression levels between libraries from organs 503-524, 1993: W. M. Fitch and T. F. Smith. Proceedings of of interest. the National Academy of Sciences, 80: 1382-1386, 1983: A. 0042. Organ-specific sequences identified as described D. Gordon. Biometrika, 60:197-200, 1973; O. Gotoh. J Mol herein are further analyzed to determine which of the Biol, 162:705-708, 1982; O. Gotoh. Bulletin of Mathematical sequences encode secreted proteins. Proteins with signal pep Biology, 52(3):359-373, 1990; X. Huang, et al., CABIOS, tides (classical secretory proteins) can be predicted using 6:373-381, 1990; X. Huang and W. Miller. Advances in computation analysis known in the art. Illustrative methods Applied Mathematics, 12:337-357, 1991; J. D. Thompson, et include, but are not limited to the criteria described by Chen al., Nucleic Acids Research, 27(13):2682-2690, 1999). et al., Mamm Genome, 14: 859-865, 2003. In certain embodi 0038. In certain embodiments, a particular transcript is ments, such analyses are carried out using prediction servers, considered to be organ-specific when the number of tran for example SignalP3.0 server developed by The Center for scripts/million as determined by MPSS is 3 or greater in the Biological Sequence Analysis, Lyngby, Denmark (http colon organ of interest but is less than 3 in all other organs. In double slash www dot cbs dot dtu dot dk/services/SignalP-3. another embodiment, a transcript is considered organ-spe 0; see also, J. D. Bendtsen, et al., J. Mol. Biol. 340:783-795, cific if it is expressed in the organ of interest at a detectable 2004.) and the TMHMM2.0 server (see for example A. level using a standard measurement (e.g., microarray analy Krogh, et al., Journal of Molecular Biology, 305(3):567-580, sis, quantitative real-time RT-PCR, MPSS, etc.) in the organ January 2001: E. L. L. Sonnhammer, et al. In J. Glasgow, T. of interest but is not detectably expressed in other organs, Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen, using appropriate negative and positive controls as would be editors, Proceedings of the Sixth International Conference on familiar to the skilled artisan. In a further embodiment, an Intelligent Systems for Molecular Biology, pages 175-182, organ-specific transcript is one that is expressed 95% in one Menlo Park, Calif., 1998. AAAI Press). Other prediction organ and the remaining 5% in one or more other organs. (In methods that can be used in the context of the present inven this context, total expression across all organs examined is tion include those described for example, in S. Moller, M.D. taken as 100%). In certain embodiments, an organ-specific R. et al., Bioinformatics, 17(7):646-653, July 2001. Nonclas transcript is one that is expressed at about 50%, 55%, 60%, sical Secretory secreted proteins (without signal peptides) can US 2011/O 136690 A1 Jun. 9, 2011 be predicted using, for example, the SecretomeP 1.0 server, analyses of the corresponding proteins in the blood, and the (http colon double slash www dot cbs dot dtu dot dk/services/ like). As such, in certain embodiments, the biological func SecretomeP-1.0/) with an odds ratio scored 3.0. Updated ver tion of a variant protein is not relevant for utility in the sions of these analysis programs are also contemplated for methods for detection and/or diagnostics described herein. use in the present methods as are other methods known in the Polypeptide variants generally encompassed by the present art (e.g., PSORT (http colon double slash psort dot nibb dotac invention will typically exhibit at least about 70%, 75%,80%, dot jp/) and Sigfind (httpcolon double slash 139.91.72.10/ 85%. 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, sigfind/sigfind dot html). 95%, 96%, 97%, 98%, or 99% or more identity along its 0043 Confirmation that the identified secreted proteins length, to a polypeptide sequence set forth herein. Within a are present in blood can be carried out using a variety of polypeptide variant, amino acid Substitutions are usually methods known in the art. For example, the proteins can be made at no more than 50% of the amino acid residues in the expressed, purified, and specific antibodies can be made native polypeptide, and in certain embodiments, at no more against them. The specific antibodies can then be used to test than 25% of the amino acid residues. In certain embodiments, the presence of the protein in blood/serum/plasma by a vari Such substitutions are conservative. A conservative substitu ety of immunoaffinity based techniques (e.g., immunoblot, tion is one in which an amino acid is Substituted for another Western analysis, immunoprecipitation, ELISA, etc.). Anti amino acid that has similar properties, such that one skilled in bodies specific for the organ-specific protein identified herein the art of peptide chemistry would expect the secondary struc can also be used to study expression patterns of the identified ture and hydropathic nature of the polypeptide to be substan proteins. It should be noted that in certain circumstances, the tially unchanged. In general, the following amino acids rep secreted protein may not be detectable in normal blood resent conservative changes: (1) ala, pro, gly, glu, asp. gln, samples but will be detected in the blood as a result of per asn, ser, thr; (2) cys, ser, tyr, thr; (3) Val, ile, leu, met, ala, phe; turbation due to disease or other environmental factors. (4) lys, arg, his; and (5) phe, tyr, trp, his. Thus, a variant may Accordingly, both normal and disease samples are tested for comprise only a portion of a native polypeptide sequence as the presence of the secreted protein and particularly for provided herein. In addition, or alternatively, variants may changes in levels of expression in the two states. As an alter contain additional amino acid sequences (such as, for native, aptamers (short DNA or RNA fragments with binding example, linkers, tags and/or ligands), usually at the amino complementarity to the proteins of interest) may be used in and/or carboxy termini. Such sequences may be used, for assays similar to those described for antibodies (see for example, to facilitate purification, detection or cellular uptake example, Biotechniques. 2001 February;30(2):290-2,294-5: of the polypeptide. Clinical Chemistry. 1999:45:1628–1650). In addition, anti 0045. When comparing polypeptide sequences, two bodies oraptamers may be used in connection with nanowires sequences are said to be “identical” if the sequence of amino to create highly sensitive detections systems (see e.g., J. acids in the two sequences is the same when aligned for Heath et al., Science. Dec. 17, 2004; 306(5704):2055-6). In maximum correspondence, as described below. Comparisons further embodiments, mass spectrometry-based methods can between two sequences are typically performed by compar be used to confirm the presence of a particular protein in the ing the sequences over a comparison window to identify and blood. compare local regions of sequence similarity. A "comparison 0044 As would be recognized by the skilled artisan, while window' as used herein, refers to a segment of at least about the organ-specific secreted proteins, the levels of which make 20 contiguous positions, usually 30 to about 75, 40 to about up a given fingerprint, need not be isolated, in certain embodi 50, in which a sequence may be compared to a reference ments, it may be desirable to isolate such proteins (e.g., for sequence of the same number of contiguous positions after antibody production). As such, the present invention provides the two sequences are optimally aligned. for isolated organ-specific secreted proteins or fragments or 0046. Optimal alignment of sequences for comparison portions thereof and polynucleotides that encode such pro may be conducted using the Megalign program in the Laser teins. As used herein, the terms protein and polypeptide are gene suite of bioinformatics software (DNASTAR, Inc., used interchangeably. The terms “polypeptide' and “protein’ Madison, Wis.), using default parameters. This program encompass amino acid chains of any length, including full embodies several alignment schemes described in the follow length endogenous (i.e., native) proteins and variants of ing references: Dayhoff, M.O. (1978) A model of evolution endogenous polypeptides described herein. Illustrative ary change in proteins—Matrices for detecting distant rela polypeptides of the present invention are described in Table 1 tionships. In Dayhoff, M.O. (ed.) Atlas of Protein Sequence and Tables 3-5, the section entitled “Brief Description of the and Structure, National Biomedical Research Foundation, Sequence Identifiers' and are set forth in the sequence listing. Washington D.C. Vol. 5, Suppl. 3, pp. 345-358; Hein J. (1990) “Variants' are polypeptides that differ in sequence from the Unified Approach to Alignment and Phylogenes pp. 626-645 polypeptides of the present invention only in Substitutions, Methods in Enzymology vol. 183, Academic Press, Inc., San deletions and/or other modifications, such that either the vari Diego, Calif.; Higgins, D. G. and Sharp, P. M. (1989) ants disease-specific expression patterns are not significantly CABIOS 5:151-153; Myers, E. W. and Muller W. (1988) altered or the polypeptides remain useful for diagnostics/ CABIOS 4:11-17: Robinson, E. D. (1971) Comb. Theor detection of organ-specific blood fingerprints as described 11:105; Saitou, N. Nei, M. (1987) Mol. Biol. Evol. 4:406-425; herein. For example, modifications to the polypeptides of the Sneath, P. H. A. and Sokal, R. R. (1973) Numerical Tax present invention may be made in the laboratory to facilitate Onomy—the Principles and Practice of Numerical Taxonomy, expression and/or purification and/or to improve immunoge Freeman Press, San Francisco, Calif.; Wilbur, W.J. and Lip nicity for the generation of appropriate antibodies and other man, D.J. (1983) Proc. Natl. Acad, Sci. USA 80:726-730. binding agents, etc. Modified variants (e.g., chemically modi 0047 Alternatively, optimal alignment of sequences for fied) of the polypeptides of organ-specific, Secreted proteins comparison may be conducted by the local identity algorithm may be useful herein, (e.g., as standards in mass spectrometry of Smith and Waterman (1981) Add. APL. Math 2:482, by the US 2011/O 136690 A1 Jun. 9, 2011 identity alignment algorithm of Needleman and Wunsch fication may be readily designed based on the polynucleotide (1970).J. Mol. Biol. 48:443, by the search for similarity meth sequences encoding organ-specific polypeptides as provided ods of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA herein, for example, using programs such as the PRIMER3 85: 2444, by computerized implementations of these algo program (httpcolon double slash www-genome dot widot mit rithms (GAP, BESTFIT, BLAST, FASTA, and TFASTA in the dot edu/cgi-bin/primer/primer3 www dot cgi). Wisconsin Genetics Software Package, Genetics Computer 0053 Polynucleotides encoding the organ-specific Group (GCG), 575 Science Dr. Madison, Wis.), or by inspec secreted polypeptides as described herein are also provided tion. by the present invention. A polynucleotide as used herein may 0048 Illustrative examples of algorithms that are suitable be single-stranded (coding or antisense) or double-stranded, for determining percent sequence identity and sequence simi and may be DNA (genomic, cDNA or synthetic) or RNA larity include the BLAST and BLAST 2.0 algorithms, which molecules. Thus, within the context of the present invention, are described in Altschul et al. (1977) Nucl. Acids Res. a polynucleotide encoding a polypeptide may also be a gene. 25:3389-3402 and Altschul et al. (1990).J. Mol. Biol. 215: A gene is a segment of DNA involved in producing a polypep 403-410, respectively. BLAST and BLAST 2.0 can be used, tide chain; it includes regions preceding and following the for example, to determine percent sequence identity for the coding region (leader and trailer) as well as intervening polynucleotides and polypeptides of the invention. Software sequences (introns) between individual coding segments (ex for performing BLAST analyses is publicly available through ons). Additional coding or non-coding sequences may, but the National Center for Biotechnology Information. need not, be present within a polynucleotide of the present 0049. An isolated polypeptide is one that is removed from invention, and a polynucleotide may, but need not, be linked its original environment. For example, a naturally occurring to other molecules and/or Support materials. An isolated poly protein or polypeptide is isolated if it is separated from some nucleotide, as used herein, means that a polynucleotide is or all of the coexisting materials in the natural system. In Substantially away from other coding sequences, and that the certain embodiments, such polypeptides are also purified, DNA molecule does not contain large portions of unrelated e.g., are at least about 90% pure, in some embodiments, at coding DNA. Such as large chromosomal fragments or other least about 95% pure and in further embodiments, at least functional genes or polypeptide coding regions. Of course, about 99% pure. this refers to the DNA molecule as originally isolated, and 0050. In one embodiment of the present invention, a does not exclude genes or coding regions later added to the polypeptide comprises a fusion protein comprising an organ segment by the hand of man. specific secreted polypeptide. The present invention further 0054 Polynucleotides of the present invention may com provides, in other aspects, fusion proteins that comprise at prise a native sequence (i.e., an endogenous polynucleotide, least one polypeptide as described herein, as well as poly for instance, a native or non-artificially engineered or natu nucleotides encoding such fusion proteins. The fusion pro rally occurring gene as provided herein) encoding an organ teins may comprise multiple polypeptides or portions/vari specific secreted protein, an alternate form of Such a ants thereof, as described herein, and may further comprise sequence, or a portion or splice variant thereof or may com one or more polypeptide segments for facilitating the expres prise a variant of Such a sequence. Polynucleotide variants Sion, purification, detection, and/or activity of the polypep may contain one or more Substitutions, additions, deletions tide(s). and/or insertions such that the polynucleotide encodes a 0051. In certain embodiments, the proteins and/or poly polypeptide useful in the methods described herein, such as nucleotides, and/or fusion proteins are provided in the form of for the detection of organ-specific proteins (e.g., wherein said compositions, e.g., pharmaceutical compositions, vaccine polynucleotide variants encode polypeptides that can be used compositions, compositions comprising a physiologically to generate detection reagents as described herein that are acceptable carrier or excipient. Such compositions may com specific for an organ-specific secreted protein). In certain prise buffers such as neutral buffered saline, phosphate buff embodiments, variants exhibit at least about 70% identity, ered saline and the like; carbohydrates such as glucose, man and in other embodiments, exhibit at least about 80%, 85%, nose. Sucrose or dextrans, mannitol; proteins; polypeptides or 86%, 87%. 88%, 89%, identity and in yet further embodi amino acids Such as glycine; antioxidants; chelating agents ments, at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%, Such as EDTA or glutathione; adjuvants (e.g., aluminum 97%, 98%, or 99% identity to a polynucleotide sequence that hydroxide); and preservatives. encodes a native organ-specific secreted polypeptide or an 0.052 In general, organ-specific secreted polypeptides and alternate form or a portion thereof. Illustrative polynucle polynucleotides encoding Such polypeptides as described otides of the present invention are described in Table 1 and herein, may be prepared using any of a variety of techniques Tables 3-5, the section entitled “Brief Description of the that are well known in the art. For example, a DNA sequence Sequence Identifiers' and are set forth in the sequence listing. encoding an organ-specific secreted protein may be prepared The percent identity may be readily determined by comparing by amplification from a suitable cDNA or genomic library sequences using computer algorithms well known to those using, for example, polymerase chain reaction (PCR) or having ordinary skill in the art and described herein. hybridization techniques. Libraries may generally be pre 0055 Polynucleotides that are complementary to the pared and screened using methods well known to those of polynucleotides described herein, or that have substantial ordinary skill in the art, such as those described in Sambrook identity to a sequence complementary to a polynucleotide as et al., Molecular Cloning. A Laboratory Manual, Cold Spring described herein are also within the scope of the present Harbor Laboratories, Cold Spring Harbor, N.Y., 1989. cDNA invention. “Substantial identity”, as used herein refers to libraries may be prepared from any of a variety of organs, polynucleotides that exhibit at least about 70% identity, and tissues, cells, as described herein. Other libraries that may be in certain embodiments, at least about 80%, 85%, 86%, 87%, employed will be apparent to those of ordinary skill in the art 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, upon reading the present disclosure. Primers for use in ampli 98%, or 99% identity to a polynucleotide sequence that US 2011/O 136690 A1 Jun. 9, 2011 encodes a native organ-specific secreted polypeptide as each drug or combination of drugs will create a unique organ described herein. Substantial identity can also refer to poly specific molecular blood fingerprint for each organ that it nucleotides that are capable of hybridizing under stringent affects. As would be readily appreciated by the skilled artisan, conditions to a polynucleotide complementary to a poly each disease or stage of a disease or drug or combination of nucleotide encoding an organ-specific secreted protein. Suit drugs can affect multiple organs. For example, in kidney able hybridization conditions include prewashing in a solu cancer, a primary perturbation in the kidney-specific molecu tion of 5xSSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); lar blood fingerprint would occur. However, a secondary or hybridizing at 50-65° C., 5xSSC, overnight; followed by indirect effect may also be observed in the bladder-specific washing twice at 65° C. for 20 minutes with each of 2x, 0.5x molecular blood fingerprint. As another example, in liver and 0.2xSSC containing 0.1% SDS. Nucleotide sequences cancer, perturbation of a liver-specific blood fingerprint as a that, because of code degeneracy, encode a polypeptide primary indicator of disease would occur. However, second encoded by any of the above sequences are also encompassed ary or indirect effects at other sites, for example in a lympho by the present invention. cyte-specific blood fingerprint, would also be observed. As 0056 Oligonucleotide primers for amplification of the described elsewhere herein, each disease type and stage polynucleotides encoding organ-specific secreted proteins results in a unique, identifiable fingerprint for each organ that are also within the scope of the present invention. Many it affects, for primary and secondary organs affected. Like amplification methods are known in the art such as PCR, wise, each drug or combination of drugs results in a unique, RT-PCR, quantitative real-time PCR, and the like. The PCR identifiable fingerprint for each organ that it effects, both conditions used can be optimized in terms of temperature, primary and secondary organs. Thus, multiple organ-specific annealing times, extension times and number of cycles molecular blood fingerprints can be used in combination to depending on the oligonucleotide and the polynucleotide to determine a particular drug side effect and the fingerprints be amplified. Such techniques are well known in the art and may include those for the primary organ affected and/or for a are described in, for example, Mullis et al., Cold Spring secondary or indirect organ that is affected by a particular Harbor Symp. Ouant. Biol., 51:263, 1987: Erlich ed., PCR drug or combination of drugs. Technology, Stockton Press, NY, 1989. Oligonucleotide prim 0059 Most common diseases such as prostate cancer actu ers can be anywhere from 8, 9, 10, 11, 12, 13, 14, 15, 16, 17. ally represent multiple distinct diseases that initially appear 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, or 30 nucleotides similar (e.g., benign and very slowly growing prostate cancer, in length. In certain embodiments, the oligonucleotide prim slowly invasive prostate cancer and rapidly metastatic pros ers of the present invention are typically 35, 40, 45, 50, 55,60, tate cancer represent three different types of prostate can or more nucleotides in length. cer—the process of dividing individual prostate cancers into one of these three types is called stratification). The blood Organ-Specific Molecular Blood Fingerprints molecular fingerprints will be distinct for each of these dis 0057 The present invention also provides methods for ease types, thus allowing for the stratification of similar dis defining organ-specific molecular blood fingerprints. Addi eases and rapid intervention where necessary. The blood fin tionally, the present invention provides defined examples of gerprints will also be perturbed in unique ways as each type of organ-specific molecular blood fingerprints as described fur disease progresses—hence the blood fingerprints will also ther herein. permit the progression of disease to be followed. The blood 0058. Each normal organ controls the expression of a vari fingerprints also change with therapy, and hence will permit ety of genes, some of which are expressed at major levels at the effectiveness of therapy to be followed, thereby allowing other organs or tissues in the body and some of which are a physician to alter treatment accordingly. Importantly, the expressed only in the organ of interest or at significantly blood fingerprints change with exposure to a variety of envi increased levels in the organ of interest as compared to ronmental factors, such as drugs, and can be used to assess expression in other organs/tissues (e.g., at least 2 fold, at least toxic or off target damage by the drug and will even permit 2.5 fold, at least 3.0 fold, at least 3.5 fold, at least 4.0 fold, at following the Subsequent recovery from Such adverse drug least 4.5 fold, or higher fold expression in the organ of interest exposure. as compared to other tissues. Some of the organ-specific 0060. One of the advantages of the organ-specific, transcripts encode proteins which can be secreted into the secreted blood fingerprints is the possibility that very subtle blood. Hence these secreted proteins constitute an organ side effects of drugs can be detected, either adverse effects, or specific molecular fingerprint for that organ in the blood. previously unrecognized positive effects. Analysis of levels of these proteins in the blood provides 0061 Thus, an organ-specific molecular blood fingerprint organ-specific molecular blood fingerprints that are indica for a given setting (e.g., one or more particular drug side tive of biological states. A biological State may be a normal, effects) is defined by the levels in the blood of the organ healthy state or a disease state (e.g., perturbation from nor specific proteins that make up the fingerprint. As such, an mal). Thus, there are molecular fingerprints in the blood that organ-specific molecular blood fingerprint for a given organ reflect the operation of normal organs and each organ has a at any given time and in any given setting (e.g., drug-induced specific molecular fingerprint. These organ-specific blood perturbation) is determined by measuring the levels of each of fingerprints are perturbed when disease, or other agents such a plurality of organ-specific proteins in the blood. It is the as drugs, affects the organ. Different diseases will alter the combination of the different levels in the blood of the organ organ-specific blood fingerprints in different ways (e.g. alter specific proteins that reveals a unique pattern that defines the the expression levels of the corresponding secreted proteins). fingerprint. Equally important, each of the levels of the pro Likewise, different drugs will alter the organ-specific blood teins can be compared against one another to create an N-di fingerprints in different ways. Thus, a unique perturbed blood mensional measure of the fingerprint space, a very powerful molecular fingerprint is associated with each type of distinct correlate to health, disease, and drug-induced changes (see disease and with each drug or combination of drugs. In effect, e.g., U.S. Patent Application No 20020095259). It should be US 2011/O 136690 A1 Jun. 9, 2011

noted that, in certain embodiments, an organ-specific viduals can be defined and these differences permit the strati molecular blood fingerprint may be comprised of the deter fication of patients into classes of individuals who would mined level in the blood of one or more organ-specific respond positively to a particular drug and those who would secreted proteins. In one embodiment, an organ-specific not. Thus, the present invention provides the ability to deter molecular blood fingerprint may comprise the determined mine those individuals who may have adverse reactions to level in the blood of anywhere from about 2 to more than drugs. about 100, 200 or more organ-specific secreted proteins from 0065 Organ-specific molecular blood fingerprints can be a particular organ of interest. In one embodiment, the organ determined using any of a variety of detection reagents in the specific molecular blood fingerprint comprises the quantita context of a variety of methods for measuring protein levels. tively measured level in the blood of at least 2, 3, 4, 5, 6, 7, 8, Any detection reagent that can specifically bind to or other 9, or 10 organ-specific secreted proteins. In another embodi wise detect an organ-specific secreted protein as described ment, the organ-specific molecular blood fingerprint com herein is contemplated as a Suitable detection reagent. Illus prises the determined level in the blood of at least, 11, 12, 13, trative detection reagents include, but are not limited to anti 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 28, 29, or 30 bodies, or antigen-binding fragments thereof, yeast ScFv, organ-specific secreted proteins. Inafurther embodiment, the DNA or RNA aptamers, isotope labeled peptides, microflu organ-specific molecular blood fingerprint comprises the idic/nanotechnology measurement devices and the like. determined level in the blood of at least, 31, 32,33,34,35,36, 0066. In one illustrative embodiment, a detection reagent 37, 38, 39, or 40 organ-specific secreted proteins. In yet a is an antibody or an antigen-binding fragment thereof. Anti further embodiment, the organ-specific molecular blood fin bodies may be prepared by any of a variety of techniques gerprint comprises the determined level in the blood of at known to those of ordinary skill in the art. See, e.g., Harlow least, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 organ-specific and Lane, Antibodies. A Laboratory Manual, Cold Spring secreted proteins. In an additional embodiment, the organ Harbor Laboratory, 1988. In general, antibodies can be pro specific molecular blood fingerprint comprises the deter duced by cell culture techniques, including the generation of mined level in the blood of 51, 52, 53, 54, 55, 56, 57, 58, 59, monoclonal antibodies as described herein, or via transfec or 60 organ-specific secreted proteins. In another embodi tion of antibody genes into Suitable bacterial or mammalian ment, the organ-specific molecular blood fingerprint com cell hosts, in order to allow for the production of recombinant prises the determined level in the blood of 61, 62. 63, 64, 65, antibodies. In one technique, an immunogen comprising the 66, 67, 68, 69, or 70 organ-specific secreted proteins. In polypeptide is initially injected into any of a wide variety of further embodiments, the organ-specific molecular blood fin mammals (e.g., mice, rats, rabbits, sheep or goats). In this gerprint comprises the determined level in the blood of 75,80, step, the polypeptides of this invention may serve as the 85, 90, 100, or more organ-specific secreted proteins. immunogen without modification. Alternatively, particularly 0062. It should be noted that in certain circumstances, an for relatively short polypeptides, a Superior immune response organ-specific molecular blood fingerprint can be defined (in may be elicited if the polypeptide is joined to a carrier protein, part or entirely) merely by the presence or absence of one or Such as bovine serum albuminor keyhole limpet hemocyanin. a plurality of organ-specific proteins, and determining the The immunogen is injected into the animal host, usually exact level of each of a plurality of organ-specific proteins in according to a predetermined schedule incorporating one or the blood may not necessary. more booster immunizations, and the animals are bled peri 0063. In a further embodiment, the fingerprints associated odically. Polyclonal antibodies specific for the polypeptide with a particular drug and side effects thereof are determined may then be purified from Such antisera by, for example, by comparing the blood from normal individuals against that affinity chromatography using the polypeptide coupled to a from Subjects taking a particular drug of interest. As such, a Suitable Solid Support. statistically significant change in the levels (e.g., an increase 0067. In one embodiment, multiple target proteins or pep or a decrease) of one or more of the organ-specific proteins tides are used in a single immune response to generate mul that comprise the fingerprint as compared to normal is indica tiple useful detection reagents simultaneously. In one tive of a perturbation of the fingerprint and is useful in iden embodiment, the individual specificities are later separated tifying direct effects or side effects of the drug of interest. The Out skilled artisan would readily appreciate that a variety of sta 0068. In certain embodiments, antibody can be generated tistical tests can be used to determine if an altered level of a by phage display methods (such as described by Vaughan, T. given protein is significant. The Z-test (Man, M. Z. et al., J., et al., Nat Biotechnol, 14:309-314, 1996; and Knappik, A., Bioinformatics, 16:953-959, 2000) or other appropriate sta et al., Mol Biol, 296:57-86, 2000): ribosomal display (such as tistical tests can be used to calculate P values for comparison described in Hanes, J., et al., Nat Biotechnol, 18: 1287-1292, of protein expression levels. In certain embodiments, the level 2000), or periplasmic expression in E. coli (see e.g., Chen, G., of each of the plurality of organ-specific proteins in the blood et al., Nat Biotechnol, 19:537-542, 2001). In furtherembodi sample from the Subject is compared to a previously deter ments, antibodies can be isolated using a yeast Surface display mined normal control level of each of the plurality of organ library. See e.g., nonimmune library of 10 human antibody specific proteins taking into account standard deviation. schv fragments as constructed by Feldhaus, M.J., et al., Nat Thus, the present invention provides determined normal con Biotechnol, 21: 163-170, 2003. There are several advantages trol levels of each of a plurality of organ-specific proteins that of this yeast Surface display compared to more traditional make up a particular molecular blood fingerprint. large nonimmune human antibody repertoires such as phage 0064. In an additional embodiment, the present invention display, ribosomal display, and periplasmic expression in E. can be used to determine distinct, normal organ-specific coli 1). The yeast library can be amplified 10'-fold without molecular blood fingerprints, such as in different populations measurable loss of clonal diversity and repertoire bias as the of people. In this regard, differences in normal organ-specific expression is under control of the tightly GAL1/10 promoter molecular blood fingerprints between populations of indi and expansion can be done under non induction conditions; 2) US 2011/O 136690 A1 Jun. 9, 2011

nanomolar-affinity SchvS can be routinely obtained by mag 2662; Hochman et al. (1976) Biochem 15:2706-2710; and netic beadscreening and flow-cytometric sorting, thus greatly Ehrlich et al. (1980) Biochem 19:4091-4096. simplified the protocol and capacity of antibody Screening; 3) 0072 A single chain Fv (“shv') polypeptide is a with equilibrium screening, a minimal affinity threshold of covalently linked V::V, heterodimer which is expressed the antibodies desired can be set; 4) the binding properties of from a gene fusion including V- and V-encoding genes the antibodies can be quantified directly on the yeast Surface; linked by a peptide-encoding linker. Huston et al. (1988) 5) multiplex library Screening against multiple antigens Proc. Nat. Acad. Sci. USA 85(16):5879-5883. A number of simultaneously is possible; and 6) for applications demand methods have been described to discern chemical structures ing picomolar affinity (e.g. in early diagnosis), Subsequent for converting the naturally aggregated—but chemically rapid affinity maturation (Kieke, M.C., et al., J Mol Biol, 307: separated—light and heavy polypeptide chains from an anti 1305-1315, 2001.) can be carried out directly on yeast clones body V region into an sEv molecule which will fold into a without further re-cloning and manipulations. three dimensional structure substantially similar to the struc 0069 Monoclonal antibodies specific for an organ-spe ture of an antigen-binding site. See, e.g., U.S. Pat. Nos. 5,091, cific secreted polypeptide of interest may be prepared, for 513 and 5,132,405, to Huston et al.; and U.S. Pat. No. 4,946, example, using the technique of Kohler and Milstein, Eur: J. 778, to Ladner et al. Immunol. 6:51 1-519, 1976, and improvements thereto. 0073. Each of the above-described molecules includes a Briefly, these methods involve the preparation of immortal heavy chain and a light chain CDR set, respectively inter cell lines capable of producing antibodies having the desired posed between a heavy chain and a light chain FR set which specificity (i.e., reactivity with the polypeptide of interest). provide support to the CDRS and define the spatial relation Such cell lines may be produced, for example, from spleen ship of the CDRs relative to each other. As used herein, the cells obtained from an animal immunized as described above. term “CDR set' refers to the three hypervariable regions of a The spleen cells are then immortalized by, for example, heavy or light chain V region. Proceeding from the N-termi fusion with a myeloma cell fusion partner, in certain embodi nus of a heavy or light chain, these regions are denoted as ments, one that is Syngeneic with the immunized animal. A “CDR1,” “CDR2, and “CDR3 respectively. An antigen variety of fusion techniques may be employed. For example, binding site, therefore, includes six CDRs, comprising the the spleen cells and myeloma cells may be combined with a CDR set from each of a heavy and a light chain V region. A nonionic detergent for a few minutes and then plated at low polypeptide comprising a single CDR. (e.g., a CDR1, CDR2 density on a selective medium that Supports the growth of or CDR3) is referred to herein as a “molecular recognition hybrid cells, but not myeloma cells. An illustrative selection unit.” Crystallographic analysis of a number of antigen-anti technique uses HAT (hypoxanthine, aminopterin, thymidine) body complexes has demonstrated that the amino acid resi selection. After a sufficient time, usually about 1 to 2 weeks, dues of CDRs form extensive contact with bound antigen, colonies of hybrids are observed. Single colonies are selected wherein the most extensive antigen contact is with the heavy and their culture Supernatants tested for binding activity chain CDR3. Thus, the molecular recognition units are pri against the polypeptide. Hybridomas having high reactivity marily responsible for the specificity of an antigen-binding and specificity are preferred. site. 0070 Monoclonal antibodies may be isolated from the 0074 As used herein, the term “FR set refers to the four Supernatants of growing hybridoma colonies. In addition, flanking amino acid sequences which frame the CDRS of a various techniques may be employed to enhance the yield, CDR set of a heavy or light chain V region. Some FR residues such as injection of the hybridoma cell line into the peritoneal may contact bound antigen; however, FRS are primarily cavity of a suitable vertebrate host, such as a mouse. Mono responsible for folding the V region into the antigen-binding clonal antibodies may then be harvested from the ascites fluid site, particularly the FR residues directly adjacent to the or the blood. Contaminants may be removed from the anti CDRS. Within FRS, certain amino residues and certainstruc bodies by conventional techniques. Such as chromatography, tural features are very highly conserved. In this regard, all V gel filtration, precipitation, and extraction. The polypeptides region sequences contain an internal disulfide loop of around of this invention may be used in the purification process in, for 90 amino acid residues. When the V regions fold into a bind example, an affinity chromatography step. ing-site, the CDRS are displayed as projecting loop motifs 0071. A number of therapeutically useful molecules are which form an antigen-binding Surface. It is generally recog known in the art which comprise antigen-binding sites that nized that there are conserved structural regions of FRs which are capable of exhibiting immunological binding properties influence the folded shape of the CDR loops into certain ofan antibody molecule. The proteolytic enzyme papain pref “canonical structures—regardless of the precise CDR amino erentially cleaves IgG molecules to yield several fragments, acid sequence. Further, certain FR residues are known to two of which (the “F(ab) fragments) each comprise a cova participate in non-covalent interdomain contacts which sta lent heterodimer that includes an intact antigen-binding site. bilize the interaction of the antibody heavy and light chains. The enzyme pepsin is able to cleave IgG molecules to provide 0075. The detection reagents of the present invention may several fragments, including the “F(ab')2 fragment which comprise any of a variety of detectable labels. The invention comprises both antigen-binding sites. An "Fv’ fragment can contemplates the use of any type of detectable label, includ be produced by preferential proteolytic cleavage of an IgM, ing, e.g., visually detectable labels, fluorophores, and radio and on rare occasions IgG or IgA immunoglobulin molecule. active labels. The detectable label may be incorporated within FV fragments are, however, more commonly derived using or attached, either covalently or non-covalently, to the detec recombinant techniques known in the art. The Fv fragment tion reagent. includes a non-covalent V::V heterodimer including an 0076 Methods for measuring organ-specific protein lev antigen-binding site which retains much of the antigen rec els from blood/serum/plasma include, but are not limited to, ognition and binding capabilities of the native antibody mol immunoaffinity based assays such as ELISAs, Western blots, ecule. Inbaret al. (1972) Proc. Nat. Acad. Sci. USA 69:2659 and radioimmunoassays, and mass spectrometry based meth US 2011/O 136690 A1 Jun. 9, 2011

ods (matrix-assisted laser desorption ionization (MALDI), ment, the panel/array comprises a plurality of detection MALDI-Time-of-Flight (TOF), Tandem MS (MS/MS), elec reagents, wherein, the plurality of detection reagents may be trospray ionization (ESI), Surface Enhanced Laser Desorp anywhere from about 2 to about 100, 150, 160, 170, 180, 190, tion Ionization (SELDI)-TOF MS, liquid chromatography 200, or more detection reagents each specific for an organ (LC)-MS/MS, etc). Other methods useful in this context specific secreted protein. In one embodiment, the panel/array include isotope-coded affinity tag (ICAT) followed by multi comprises at least 2, 3, 4, 5, 6,7,8,9, or 10 detection reagents dimensional chromatography and MS/MS. The procedures each specific for one of the plurality of organ-specific described herein for analysis of blood organ-specific protein secreted proteins that make up a given fingerprint. In another fingerprints can be modified and adapted to make use of embodiment, the panel/array comprises at least 11, 12, 13, 14. microfluidics and nanotechnology in order to miniaturize, 15, 16, 17, 18, 19, or 20 detection reagents each specific for parallelize, integrate and automate diagnostic procedures one of the plurality of organ-specific secreted proteins that (see e.g., L. Hood, et al., Science 306:640-643; R. H. Carlson, make up a given fingerprint. In a further embodiment, the et al., Phys. Rev. Lett. 79:2149 (1997); A. Y. Fu, et al., Anal. panel/array comprises at least 21, 22, 23, 24, 25, 26, 27, 28. Chem. 74:2451 (2002); J. W. Hong, et al., Nature Biotechnol. 29, or 30 detection reagents each specific for one of the 22:435 (2004); A. G. Hadd, et al., Anal. Chem. 69:3407 plurality of organ-specific secreted proteins that make up a (1997): I. Karube, et al., Ann. N.Y. Acad. Sci. 750:101 (1995); given fingerprint. In an additional embodiment, the panel/ L. C. Waters et al., Anal. Chem. 70:158 (1998); J. Fritz et al., array comprises at least 31, 32,33,34,35,36, 37,38,39, or 40 Science 288,316 (2000)). detection reagents each specific for one of the plurality of 0077. It should be noted that when the term “blood is used organ-specific secreted proteins that make up a given finger herein, any part of the blood is intended. Accordingly, for print. In yet a further embodiment, the panel/array comprises determining molecular blood fingerprints, whole blood may at least 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 detection be used directly where appropriate, or plasma or serum may reagents each specific for one of the plurality of organ-spe be used. cific secreted proteins that make up a given fingerprint. In an additional embodiment, the panel/array comprises at least 51, Panels/Arrays for Detecting Organ-Specific Molecular Blood 52, 53, 54, 55, 56, 57, 58, 59, or 60 detection reagents each Fingerprints specific for one of the plurality of organ-specific secreted proteins that make up a given fingerprint. In one embodiment, 0078. The present invention also provides panels for the panel/array comprises at least 61, 62, 63, 64, 65, 66, 67. detecting the organ-specific blood fingerprints at any given 68, 69, or 70 detection reagents each specific for one of the time in a subject. The term "subject” is intended to include plurality of organ-specific secreted proteins that make up a any mammal or indeed any vertebrate that may be used as a given fingerprint. In one embodiment, the panel/array com model system for human disease. Examples of Subjects prises at least 75, 80, 85,90, 100, 150, 160, 170, 180, 190, include humans, monkeys, apes, dogs, cats, mice, rats, fish, 200, or more, detection reagents each specific for one of the Zebra fish, birds, horses, pigs, cows, sheep, goats, chickens, plurality of organ-specific secreted proteins that make up a ducks, donkeys, turkeys, peacocks, chinchillas, ferrets, ger given fingerprint. bils, rabbits, guinea pigs, hamsters and transgenic species thereof. Further subjects contemplated herein include, but are 0080 Further in this regard, the solid phase surface may be not limited to, reptiles and amphibians, e.g., lizards, Snakes, of any material, including, but not limited to, plastic, poly turtles, frogs, toads, Salamanders, and newts. In one embodi carbonate, polystyrene, polypropylene, polyethlene, glass, ment, the panel/array of the present invention comprises one nitrocellulose, dextran, nylon, metal, silicon and carbon detection reagent that specifically detects an organ-specific nanowires, nanoparticles that can be made of a variety of secreted protein. In another embodiment, the panel/arrays are materials and photolithographic materials. In certain embodi comprised of a plurality of detection reagents that each spe ments, the solid phase Surface is a chip. In another embodi cifically detects an organ-specific secreted protein, wherein ment, the Solid phase Surface may comprise microtiter plates, the levels of organ-specific secreted proteins taken together beads, membranes, microparticles, the interior Surface of a form a unique pattern that defines the fingerprint. In certain reaction vessel Such as a test tube or other reaction vessel. In embodiments, detection reagents can be bispecific Such that other embodiments the peptides will be fractionated by one or the panel/array is comprised of a plurality of bispecific detec more one-dimensional columns using size separations, ion tion reagents that may specifically detect more than one exchange or hydrophobicity properties and, for example, organ-specific secreted protein. The term "specifically' is a deposited in a MALDI 96 or 384 well plate and then injected term of art that would be readily understood by the skilled into an appropriate mass spectrometer. artisan to mean, in this context, that the protein of interest is I0081. In one embodiment, the panel/array is an address detected by the particular detection reagent but other proteins able array. As such, the addressable array may comprise a are not detected in a statistically significant manner under the plurality of distinct detection reagents. Such as antibodies or same conditions. Specificity can be determined using appro aptamers, attached to precise locations on a Solid phase Sur priate positive and negative controls and by routinely opti face, such as a plastic chip. The position of each distinct mizing conditions. detection reagent on the Surface is known and therefore 007.9 The panels/arrays may be comprised of a solid “addressable'. In one embodiment, the detection reagents are phase surface having attached thereto a plurality of detection distinct antibodies that each have specific affinity for one of a reagents each attached at a distinct location. As would be plurality of organ-specific polypeptides. recognized by the skilled artisan, the number of detection I0082 In one embodiment, the detection reagents, such as reagents on a given panel would be determined from the antibodies, are covalently linked to the Solid Surface. Such as number of organ-specific secreted proteins in the fingerprint a plastic chip, for example, through the Fc domains of anti to be measured. In one embodiment, the panel/array com bodies. In another embodiment, antibodies are adsorbed onto prises one or more detection reagents. In a further embodi the solid surface. In a further embodiment, the detection US 2011/O 136690 A1 Jun. 9, 2011 reagent, such as an antibody, is chemically conjugated to the application Ser. No. entitled Methods for Identifying Solid Surface. In a further embodiment, the detection reagents and Using Organ-Specific Proteins in Blood, filed concur are attached to the solid surface via a linker. In certain rently on Jan. 27, 2006. embodiments, detection with multiple specific detection I0087. The present invention can be used as a standard reagents is carried out in solution. screening test. In this regard, one or more of the detection 0083 Methods of constructing protein arrays, including panels described herein can be run on an individual taking a particular drug and any statistically significant deviation from antibody arrays, are known in the art (see, e.g., U.S. Pat. No. a normal organ-specific molecular blood fingerprint would 5,489,678; U.S. Pat. No. 5,252,743; Blawas and Reichert, indicate that drug-related perturbation was present. Thus, the 1998, Biomaterials 19:595-609; Firestone et al., 1996, J. present invention provides a standard or “normal blood fin Amer: Chem. Soc. 18,9033-9041; Mooney et al., 1996, Proc. gerprint for any given organ. In certain embodiments, a nor Natl. Acad. Sci. 93,12287-12291; Pirrungetal, 1996, Biocon mal blood fingerprint is determined by measuring the normal jugate Chem. 7, 317-321; Gao et al., 1995, Biosensors Bio range of levels of the individual protein members of a finger electron 10, 317-328; Schena et al., 1995, Science 270, 467 print. Any deviation therefrom or perturbation of the normal 470; Lomet al., 1993, J. Neurosci. Methods, 385-397; Pope et fingerprint that is outside the standard deviation (normal al., 1993, Bioconjugate Chem. 4, 116-171; Schramm et al., range) has utility in determining drug side effects (see also 1992, Anal. Biochem. 205, 47-56; Gombotz et al., 1991, J. U.S. Patent Application No. 0020095259). As would be rec Biomed. Mater. Res. 25, 1547-1562; Alarie et al., 1990, Analy. ognized by the skilled artisan, the significance of any devia Chim. Acta 229, 169-176: Owaku et al., 1993, Sensors Actua tion in the levels of (e.g., a significantly altered level of one or tors B, 13-14, 723-724; Bhatia et al., 1989, Analy: Biochem. more of) the individual protein members of a fingerprint can 178, 408-413: Lin et al., 1988, IEEE Trans. Biomed. Enging., be determined using statistical methods known in the art and 35(6), 466-471). described herein. In certain embodiments, a normal standard 0084. In one embodiment, the detection reagents, such as can be generated from a blood sample taken from the indi antibodies, are arrayed on a chip comprised of electronically vidual prior to administration of the drug such that compari activated copolymers of a conductive polymer and the detec Sons can be made thereto. tion reagent. Such arrays are known in the art (see e.g., U.S. I0088. Further, the present invention provides methods for Pat. No. 5,837,859 issued Nov. 17, 1998: PCT publication determining and evaluating not only the absolute levels of the WO94/22889 dated Oct. 13, 1994). The arrayed pattern may changes in the proteins constituting individual fingerprints, be computer generated and stored. The chips may be prepared but also for evaluating all the protein changes (e.g. Nchanged in advance and stored appropriately. The antibody array chips proteins) and comparing them against one another to generate can be regenerated and used repeatedly. an N-dimensional shape space that provides more powerful 0085. Using the methods described herein, a vast array of correlation with the stratifications of drug-induced alterations organ-specific molecular blood fingerprints can be defined described above (see e.g., U.S. Patent Application No. for any of a variety of drugs as described further herein. As 20020095259). Such, the present invention further provides information data I0089. In a further embodiment, the present invention can bases comprising data that make up molecular blood finger be used to determine the risk of having one or more side prints as described herein. As such, the databases may com effects from a drug or combination thereof. A statistically prise the defined differential expression levels as determined significant alteration (e.g., increase or decrease) in the levels using any of a variety of methods such as those described of one or more members of a particular molecular blood herein, of each of the plurality of organ-specific Secreted fingerprint may signify a risk of developing a one or more side proteins that make up a given fingerprint in any of a variety of effects from a drug or combination thereof. settings (e.g., normal or drug-associated fingerprints). 0090 The organ-specific molecular blood fingerprints of the present invention can be used to detect side effects, either positive or negative (or lack thereof) from any of a variety of Methods of Use drugs. As would be recognized by the skilled artisan, the I0086. The present invention provides methods for identi present invention can be used to detect the side effects (or lack fying organ-specific secreted proteins and methods for iden thereof) from virtually any drug. In this regard, any drug tifying organ-specific molecular blood fingerprints. The either currently under development or already approved and present invention further provides panels/arrays of detection on the market is contemplated in the context of this invention. reagents for detecting Such fingerprints. The present inven In particular, the present invention provides methods for tion also provides defined organ-specific molecular blood monitoring the organ-specific molecular blood fingerprint of fingerprints for normal and disease settings and for finger organs/tissues/cells that fall outside the expected therapeutic prints associated with/resulting from a particular drug or targets (e.g., monitoring off-target effects of drugs). For combination of drugs. As such, the present invention provides example, the liver and kidney are organs that often reflect the for methods for identifying and monitoring drug effects in side effects of drugs. Further, as demonstrated by the side any of a variety of settings. Further, the present invention effects of COX-2 inhibitors, drugs can have off-target effects provides methods for following responses to therapy in a on the cardiovascular system, or any other cell/organ/tissue? variety of disease settings such that any adverse or other drug system as described herein. Thus, the present invention also side effects can be monitored. The present invention also provides methods for monitoring non-target organs for any provides methods of detecting disease, stratifying disease, drug, including drugs under development and drugs currently monitoring the progression of disease, and monitoring on the market for which subtle side-effects may not have been responses to therapy such as described in U.S. Provisional detected. Application Nos. 60/647685 and 60/683071 filed Jan. 27, 0091. In a further embodiment, the present invention can 2005 and May 20, 2005, respectively, and Copending U.S. be used to determine side effects of combinations of drugs on US 2011/O 136690 A1 Jun. 9, 2011

any organ. In a further embodiment, the organ-specific context, include, for example, mice, rats, rabbits, pigs, mon molecular blood fingerprints can be monitored in Subjects keys, apes, Zebra fish, etc., and transgenic species thereof. taking very low (non-toxic) doses of drugs to determine whether subtle side effects are occurring. Business Methods 0092. Thus, the organ-specific molecular blood finger 0097. A further embodiment of the present invention com prints of the present invention can be used to detect direct prises a business method of identifying a particular drug side effects or side effects of any drug on the heart, kidney, ureter, effect in a subject taking a drug that comprises detecting an bladder, urethra, liver, prostate, heart, blood vessels, bone organ-specific molecular blood fingerprint as described marrow, skeletal muscle, Smooth muscle, various specific herein. regions of the brain (including, but not limited to the 0098. Thus, the present invention contemplates methods amygdala, caudate nucleus, cerebellum, corpuscallosum, for (a) manufacturing one or more of the detection reagents, fetal, hypothalamus, thalamus), spinal cord, peripheral panels, arrays, (b) providing diagnostic services for determin nerves, retina, nose, trachea, lungs, mouth, salivary gland, ing organ-specific blood fingerprints, and identifying particu lar drug side effects (c) providing manufacturers of genomics esophagus, stomach, Small intestines, large intestines, hypo devices the use of the detection reagents, panels, arrays, blood thalamus, pituitary, thyroid, pancreas, adrenal glands, ova fingerprints or transcriptomes described herein to develop ries, Oviducts, uterus, placenta, Vagina, mammary glands, diagnostic devices, where the genomics device includes any testes, seminal vesicles, penis, lymph nodes, thymus, and device that may be used to define differences in a blood spleen. The present invention can be used to detect drug side sample between the normal and disturbed state resulting from effects on the cardiovascular system, neurological system, one or more drug side effects (d) providing manufacturers of metabolic system, respiratory system, the immune system, proteomics devices the use of the detection reagents, panels, etc. As would be recognized by the skilled artisan, the present arrays, blood fingerprints or transcriptomes described herein invention can be used to detect any side effects wherein the to develop diagnostic devices, where the proteomics device side effects cause perturbation in organ-specific secreted pro includes any device that may be used to define differences in teins. In this regard, a side effect may be an adverse effect or a blood sample between the normal and disturbed state result may be a positive effect. Accordingly, the present invention ing from a drug side effect and (e) providing manufacturers of can be used to identify potential new indications for a par imaging devices the use of the detection reagents, panels, ticular drug. arrays, blood fingerprints or transcriptomes described herein 0093. In an additional embodiment, the present invention to develop diagnostic devices, where the proteomics device can be used to determine distinct, normal organ-specific includes any device that may be used to define differences in a blood sample between the normal and disturbed state result molecular blood fingerprints, such as in different populations ing from one or more drug side effects (f) providing manu of people. In this regard, there may be differences in normal facturers of molecular imaging devices the use of the detec organ-specific molecular blood fingerprints between popula tion reagents, panels, arrays, blood fingerprints or tions of individuals that permit the stratification of patients transcriptomes described herein to develop diagnostic into classes of individuals who would respond positively to a devices, where the proteomics device includes any device that particular drug and those who would not. Thus, the present may be used to define differences in a blood sample between invention provides the ability to determine those individuals the normal and disturbed State resulting from one or more who may have adverse reactions to drugs. Additionally, in drug side effects and g) marketing to healthcare providers the certain embodiments, the nature of the drug-induced changes benefits of using the detection reagents, panels, arrays, and in one or more organ-specific molecular blood fingerprints diagnostic services of the present invention to enhance diag can be used to predict which patients might effectively nostic capabilities and thus, to better treat patients. respond to the drug. Thus, the organ-specific molecular blood 0099. Another aspect of the invention relates to a method fingerprints of the present invention provide the ability to for conducting a business, which includes: (a) manufacturing stratify patients with regard to drug response and the ability to one or more of the detection reagents, panels, arrays, (b) assess the toxicity of drugs. providing services for determining organ-specific molecular 0094. Once a side effect is detected by identifying a per blood fingerprints and (c) marketing to healthcare providers turbation in an organ-specific molecular blood fingerprint, the the benefits of using the detection reagents, panels, arrays, effect can be further mapped using systems approaches by and services of the present invention to enhance capabilities mapping the gene and protein networks perturbed (see to identify drug side effects and thus, to better treat patients. Example 1). In certain embodiments, a single organ-specific 0100 Another aspect of the invention relates to a method secreted protein may be perturbed (as indicated by detection for conducting a business, comprising: (a) providing a distri of an increase or decrease in the level of the protein in the bution network for selling the detection reagents, panels, blood). In further embodiments, more than one organ-specific arrays, diagnostic services, and access to organ-specific secreted protein may be perturbed. molecular blood fingerprint databases (b) providing instruc 0095 To monitor the monitor responses to therapy or tion material to physicians or other skilled artisans for using responses to any drug, one or more organ-specific molecular the detection reagents, panels, arrays, and organ-specific blood fingerprints are detected/measured as described herein molecular blood fingerprint databases to improve the ability using any of the methods as described herein at one time point to identify drug side effects for patients. and detected/measured again at Subsequent time points, 0101. Yet another aspect of the invention relates to a thereby monitoring responses to therapy or to any drug. method for conducting abusiness, comprising: (a)identifying 0096 Organ-specific molecular blood fingerprints can organ-specific secreted proteins in the blood Sera, etc. (b) also be defined and/or perturbations thereof tested in any of a determining the organ-specific molecular fingerprints as variety of animal models. Animals that can be used in this described herein and (c) providing a distribution network for US 2011/O 136690 A1 Jun. 9, 2011 selling access to the database of organ-specific molecular 0108. The processor performs computation and control fingerprints identified in step (b). functions of the computer system, and comprises a Suitable 0102 For instance, the subject business method can central processing unit (CPU). The processor may comprise a include an additional step of providing a sales group for single integrated circuit, such as a microprocessor, or may marketing the database, or panels, or arrays, to healthcare comprise any suitable number of integrated circuit devices providers. and/or circuit boards working in cooperation to accomplish 0103) Another aspect of the invention relates to a method the functions of a processor. for conducting a business, comprising: (a) determining one or 0109. In a preferred embodiment, the auxiliary storage more organ-specific molecular blood fingerprints and (b) licensing, to a third party, the rights for further development interface allows the computer system to store and retrieve and sale of panels, arrays, and information databases related information from auxiliary storage devices, such as magnetic to the organ-specific molecular blood fingerprints of (a). disk (e.g., hard disks or floppy diskettes) or optical storage 0104. The business methods of the present application devices (e.g., CD-ROM). One suitable storage device is a relate to the commercial and other uses, of the methodologies, direct access storage device (DASD). A DASD may be a panels, arrays, organ-specific secreted proteins, organ-spe floppy disk drive that may read programs and data from a cific molecular blood fingerprints, and databases comprising floppy disk. It is important to note that while the present identified fingerprints of the present invention. In one aspect, invention has been (and will continue to be) described in the the business method includes the marketing, sale, or licensing context of a fully functional computer system, those skilled in of the present invention in the context of providing consum the art will appreciate that the mechanisms of the present ers, i.e., patients, medical practitioners, medical service pro invention are capable of being distributed as a program prod viders, and pharmaceutical distributors and manufacturers, uct in a variety of forms, and that the present invention applies with all aspects of the invention described herein, (e.g., the equally regardless of the particular type of signal bearing methods for identifying organ-specific secreted proteins, media to actually carry out the distribution. Examples of detection reagents for Such proteins, molecular blood finger signal bearing media include: recordable type media Such as prints, etc., as provided by the present invention). floppy disks and CD ROMS, and transmission type media 0105. In a particular embodiment of the present invention, Such as digital and analog communication links, including a business method relating to providing information related to wireless communication links. molecular blood fingerprints (e.g., levels of the plurality of 0110. The computer systems of the present invention may organ-specific Secreted proteins that make up a given finger also comprise a memory controller, through use of a separate print), method for determining fingerprints and sale of panels processor, which is responsible for moving requested infor for determining such molecular blood fingerprints. In a spe mation from the main memory and/or through the auxiliary cific embodiment, that method may be implemented through storage interface to the main processor. While for the pur the computer systems of the present invention. For example, poses of explanation, the memory controller is described as a a user (e.g. a health practitioner Such as a physician or a separate entity, those skilled in the art understand that, in diagnostic laboratory technician) may access the computer practice, portions of the function provided by the memory systems of the present invention via a computer terminal and controller may actually reside in the circuitry associated with through the Internet or other means. The connection between the main processor, main memory, and/or the auxiliary Stor the user and the computer system is preferably secure. age interface. 0106. In practice, the user may input, for example, infor 0111. Furthermore, the computer systems of the present mation relating to a patient Such as the patient's disease state invention may comprise a terminal interface that allows sys and/or drugs that the patient is taking, e.g., levels determined tem administrators and computer programmers to communi for the proteins that make up a given molecular blood finger cate with the computer system, normally through program print using a panel or array of the present invention. The mable workstations. It should be understood that the present computer system may then, through the use of the resident invention applies equally to computer systems having mul computer programs, provide a diagnosis or determination of tiple processors and multiple system buses. Similarly, drug side effects that fits with the input information by match although the system bus of the preferred embodiment is a ing the fingerprint parameters (e.g., levels of the proteins typical hardwired, multidrop bus, any connection means that present in the blood as detected using a particular panel or Supports bidirectional communication in a computer-related array of the present invention) with a database offingerprints. environment could be used. 0107. A computer system in accordance with a preferred 0112 The main memory of the computer systems of the embodiment of the present invention may be, for example, an present invention Suitably contains one or more computer enhanced IBM AS/400 mid-range computer system. How programs relating to the organ-specific molecular blood fin ever, those skilled in the art will appreciate that the methods gerprints and an operating system. Computer program is used and apparatus of the present invention apply equally to any in its broadest sense, and includes any and all forms of com computer system, regardless of whether the computer system puter programs, including source code, intermediate code, is a complicated multi-user computing apparatus or a single machine code, and any other representation of a computer user device Such as a personal computer or workstation. Com program. The term “memory” as used herein refers to any puter systems Suitably comprise a processor, main memory, a storage location in the virtual memory space of the system. It memory controller, an auxiliary storage interface, and a ter should be understood that portions of the computer program minal interface, all of which are interconnected via a system and operating system may be loaded into an instruction cache bus. Note that various modifications, additions, or deletions for the main processor to execute, while other files may well may be made to the computer system within the scope of the be stored on magnetic or optical disk storage devices. In present invention Such as the addition of cache memory or addition, it is to be understood that the main memory may other peripheral devices. comprise disparate memory locations. US 2011/O 136690 A1 Jun. 9, 2011

0113 All of the U.S. patents, U.S. patent application pub transcripts must be measured. Certain low-abundance tran lications, U.S. patent applications, foreign patents, foreign Scripts, such as those encoding transcription factors and sig patent applications and non-patent publications referred to in nal transducers, wield significant regulatory influences in this specification and/or listed in the Application Data Sheet, spite of the fact they may be present in the cell at very low are incorporated herein by reference, in their entirety. More copy numbers. Differential display (Bussemakers, M. J., et over, all numerical ranges utilized herein explicitly include all al., Cancer Res, 59: 5975-5979, 1999) or cDNA microarrays integer values within the range and selection of specific (Vaarala, M. H., et al., Lab Invest, 80: 1259-1268, 2000; numerical values within the range is contemplated depending Chang, G. T., et al., Cancer Res, 57: 4075-4081, 1997) have on the particular use. Further, the following examples are been used to profile changes in gene expression during the AD offered by way of illustration, and not by way of limitation. to Al transition; however, those technologies can identify only a limited number of more abundant mRNAs, and they EXAMPLES miss many low-abundance mRNAs due to their low detection sensitivities. Massively parallel signature sequencing Example 1 (MPSS), allows 20-nucleotide signature sequences to be determined in parallel for more than 1,000,000 DNA Evidence for the Presence of Disease-Perturbed Net sequences (Brenner, et al., 2000, supra). MPSS technology works in Prostate Cancer Cells by Genomic and Pro allows identification and cataloging of almost all mRNAS that teomic Analyses: A Systems Approach to Disease are changed between two cell states, even those with one or a few transcripts percell, or between different organs or tissues. 0114. The following example demonstrates the presence Differentially expressed genes thus identified can be mapped of disease-perturbed networks in prostate. This provides a onto cellular networks to provide a systemic understanding of model for studying perturbation of organ-specific molecular changes in cellular state. blood fingerprints. The same principles apply in the setting of 0118. Although transcriptome (mRNA levels) differences determining perturbations that result from drugs. are easier to study than proteome (protein levels) differences 0115 Prostate cancer is the most common nondermato and provide extremely valuable information, cellular func logical cancer in the United States (Greenlee, R.T., et al., CA tions are usually performed by proteins. RNA expression Cancer J Clin, 50: 7-33., 2000). Initially, its growth is andro profiling studies do not address how the encoded proteins gen-dependent (AD); early-stage therapies, including chemi function biologically, and transcript abundance levels do not cal and Surgical castration, kill cancerous cells by androgen always correlate with protein abundance levels (Chen, G., et deprivation. Although Such therapies produce tumor regres al., Mol Cell Proteomics, 1: 304-313, 2002). Therefore, the Sion, they eventually fail because most prostate carcinomas mRNA expression profiling described herein was comple become androgen-independent (Al) (Isaacs, J. T. Urol Clin mented with a more limited protein profiling by using iso North Am, 26: 263-273., 1999). To improve the efficacy of tope-coded affinity tags (ICAT) coupled with tandem mass prostate cancer therapy, it is necessary to understand the spectrometry (MS/MS) (Gygi, S. P. et al., Nat Biotechnol, 17: molecular mechanisms underlying the transition from andro 994-999, 1999). gen dependence to androgen independence. 0119 The LNCaP cell line is a widely used androgen 0116. The transition from AD to AI status likely results sensitive model for early-stage prostate cancer from which from multiple processes, including activation of oncogenes, androgen-independent Sublines have been generated inactivation of tumor Suppressor genes, and changes in key (Vaarala, M. H., et al., 2000, supra; Chang, G. T., et al., 1997, components of signal transduction pathways and gene regu supra; Patel, B.J., et al., J Urol, 164: 1420-1425, 2000). The latory networks. Systems approaches to biology and disease cells of one such variant, CL-1, in contrast to their LNCaP are predicated on the identification of the elements of the progenitors, are highly tumorigenic, and exhibit invasive and systems, the delineation of their interactions and their metastatic characteristics in intact and castrated mice (Patel, changes in distinct disease states. Biological information is of G.J., et al., 2000, supra; Tso, C. L., et al., Cancer J Sci Am, 6: two types: the digital information of the genome (e.g. genes 220-233., 2000). Thus CL-1 cells model late-stage prostate and cis-control elements) and environmental cues. Proteins cancer. MPSS and ICAT data extracted from these model cell rarely act in isolation; rather, they form parts of molecular lines can be validated by real-time RT-PCR or western blot machines or participate in network interactions mediating analysis in more relevant biological models (tumor cellular functions such as signal transduction and develop Xenografts) and in tumor biopsies. mental or physiological response patterns. Gene regulatory 0.120. An MPSS analysis of about 5 million signatures was networks, whose architecture and linkages are established by conducted for the androgen-dependent LNCaP cell line and cis-control elements, integrate information from signal trans its androgen-independent derivative CL1. The resulting data duction networks and output it to developmental or physi base offers the first comprehensive view of the digital tran ological batteries or networks of effector proteins. Normal Scriptomes of prostate cancer cells and allows exploration of protein and gene regulatory networks may be perturbed by the cellular pathways perturbed during the transition from AD disease—through genetic and/or environmental perturba to AI growth. Additionally, protein expression profiles tions and understanding these differences lies at the heart of between LNCaP and CL1 cells were compared using ICAT/ systems approaches to disease. Disease-perturbed networks MS/MS technology. Further, computational analysis was initiate altered responses that bring about pathologic pheno used to identify those proteins that are secreted. Once such types such as the invasiveness of cancer cells. protein was further investigated and shown to be a diagnostic 0117 To map network perturbations in cancer initiation marker for prostate cancer used either alone, or in combina and progression, changes in expression levels of virtually all tion with the known PSA prostate cancer marker. US 2011/O 136690 A1 Jun. 9, 2011

012.1 MPSS analysis: LNCaP and CL1 cells were grown (0123 Real-time RT-PCR: All primers were designed with using methods known in the art, for example, as described by the PRIMER3 program (http colon double slash www-ge Tso et al. 2000, supra). RNAs were isolated using Trizol (Life nome dot widot mit dot edu/cgi-bin/primer/primer3 www Technologies) according to the manufacturer's protocols dot cgi) and BLAST-searched against the human cINA and (see, e.g., as described by Nelson et al. Proc Natl Acad Sci EST database for uniqueness. Real-time PCR was performed USA, 99: 11890-11895, 2002). MPSS cDNA libraries were on an ABI 7700 machine (PE Biosystems) and the SYBR constructed, individual cDNA sequences were amplified and Green dye (Molecular Probe Inc.) was used as a reporter. PCR attached to individual beads and sequenced as described by conditions were designed to give bands of the expected size Brenner, et al., 2000, Supra. The resulting signatures, gener with minimal primer dimer bands. ally 20 bases in length, were annotated using the then most 0.124 Identification of perturbed networks: Genes in the recently annotated human genome sequence (human genome 328 Biocarta and Kyoto Encyclopedia of Genes and Genomes release hg16, released in November, 2003) and the human (KEGG) pathways or networks (http colon double slash cgap Unigene (Unigene build it 184) according to a previously pub dot inci dot nih dot gov/Pathways/) were downloaded and lished method (Meyers, B.C., et al., Genome Res, 14: 1641 compared with the MPSS data, using Unigene IDs as identi 1653, 2004). Only 100% matches between an MPSS signa fiers. If a Unigene ID or an E.C. number corresponded to ture and a genome signature were considered. Those multiple signatures, potentially due to multiple alternatively signatures that expressed at less than 3 tpim in both LNCaP terminated isoforms, the tpm counts of the isoforms were and CL 1 libraries were also excluded, as they might not be combined and then subjected to the Z-test (Man, M. Z. et al., reliably detected (this represents less than one transcript per 2000, supra). Genes with P values of 0.001 or less were cell) (Jongeneel, C.V., et al., Proc Natl AcadSci USA, 2003). considered to be significantly differentially expressed. The Additionally, cDNA signatures were classified by their posi following criteria were used to identify perturbed networks: a tions relative topolyadenylation signals and poly(A)tails and perturbed network must have more than 3 genes represented by their orientation relative to the 5'-->3' orientation of source our differentially expressed gene list (rO.001) and at least 50% mRNA. The Z-test (Man, M. Z. et al., Bioinformatics, 16: of those genes must be up regulated, it was considered an 953-959, 2000) was used to calculate P values for comparison up-regulated pathway (vice versa for the down-regulated of gene expression levels between the cell lines. pathways). 0122) Isotope-Coded Affinity Tag (ICAT) analysis: ICAT 0.125 Display of KEGG networks by Cytoscape: reagents were purchased from Applied Biosystems Inc. Frac Cytoscape software was used (www dot cytoscape dot org) tionation of cells into cytosolic, microsomal and nuclear frac (Shannon, P., et al., Genome Res, 13: 2498-2504, 2003), to tions, as well as ICAT labeling, MS/MS, and data analyses map the data onto the web of intracellular molecular interac were performed as described by Hanetal. Nat Biotechnol, 19: tions. We imported metabolic network maps and related 946-951, 2001. In addition, probability score analysis (Keller, information Such as enzymes, Substrates, and reactions from A., et al., Anal Chem, 74:5383-5392, 2002) and ASAPRatio the recently developed KEGG (http colon double slash www (Automated Statistical Analysis on Protein Ratio) (Li, X.J., et al., Anal Chem, 75: 6648-6657, 2003) were used to assess the dot genome dot ad dot jp/) API 2.0 web server into the quality of MS spectra and to calculate protein ratios from Cytoscape program. Expression data were thus automatically multiple peptide ratios. (Briefly, and as described at http:// mapped to the KEGG and Biocarta pathways/networks and regis.systemsbiology.net/software, Automated Statistical visualized by Cytoscape. Analysis on Protein Ratio (ASAPRatio) accurately calculates 0.126 MPSS analyses of the androgen-dependent LNCaP the relative abundances of proteins and the corresponding cell line and its androgen-independent variant CL 1: Using confidence intervals from ICAT-type ESI-LC/MS data. The MPSS technology, 2.22 million signature sequences were software first uses a Savitzky-Golay smoothing filter to sequenced for LNCaP cells and 2.96 million for CL1 cells. reconstruct LC spectra of a peptide and its partner in a single I0127. A total of 19,595 unique transcript signatures charge state, Subtracts background noise from each spectrum, expressed at levels >3 tpim in at least one of the samples were and calculates light:heavy ratio of the peptide in that charge identified. The signatures were classified into three major state. The ratios of the same peptide in different charge states are averaged and weighted by the corresponding spectrum categories: 1093 signatures matched repeat sequences; intensity to obtain the peptide light:heavy ratio and its error. 15,541 signatures matched unique cDNAs or ESTs, and 2961 Subsequently, all unique peptides identified for a given pro signatures had no matches to any cDNA or EST sequences tein are collected, their ratios and errors calculated, outliers (but did match genomic sequences). The last category are checked for using Dixon's tests, and the relative abun included sequences falling into one of three different catego dance and confidence interval for the protein are calculated by ries: signatures representing new transcripts yet to be defined, applying statistics for weighed samples. The Software quickly signatures representing polymorphisms in cDNA sequences generates a list of interesting proteins based on their relative (a match of an MPSS sequence to cDNA or EST sequences abundance. A byproduct of the software is to identify outlier requires 100% sequence identity), or errors in the MPSS peptides which may be misidentified or, more interestingly, reads. Transcript tags with matches to a cDNA or EST post-translationally modified.) To compare protein and sequence were further classified based on the signatures mRNA expression levels, the Unigene numbers of the differ relative orientation to transcription direction and their posi entially expressed proteins were used to find MPSS signa tion relative to a polyadenylation site and/or poly(A)tail. A tures and their expression levels in transcripts per million searchable MySQL database (www dot mysql dot coin) was (tpm). If one Unigene had more than one MPSS signature, also built containing the expression levels (tpm), the genomic likely due to alternative terminations, the average tpm of all locations of the MPSS sequences, the cDNAs or EST signatures was taken. matches, and the classification of each signature. US 2011/O 136690 A1 Jun. 9, 2011

0128. The first analysis was restricted to those MPSS sig I0131) To compare the sensitivity of the MPSS and cDNA natures corresponding to cDNAs with poly(A)tails and/or microarray procedures, cDNA microarrays containing polyadenylation sites, so that corresponding genes could be 40,000 human cDNAs were hybridized to the same LNCaP conclusively identified. The Z-test was used to compare dif and CL1 RNAs that were used for MPSS. Three replicate ferential gene expression between LNCaP cells and CL1 cells array hybridizations were performed. MPSS signatures and (Mann, et al., 2000, supra). Using very stringent P values (less array clone IDs were mapped to Unigene IDs for data extrac than 0.001), 2088 MPSS signatures were identified (corre tion and comparisons. The results showed that only those sponding to 1987 unique genes, as some genes have two or genes expressed at >40 tpm by MPSS could be reliably more MPSS signatures, due to alternative usages of polyade detected as changing levels by cDNA microarray hybridiza nylation sites) with significant differential expression. Of tions judged by an expression level twice the standard devia these, 1011 signatures (965 genes) were overexpressed in tion of the background, a standard cutoff value for microarray data analysis. This observation is consistent with the 33-60 CL1 cells, and 1077 signatures (1022 genes) were overex tpim sensitivity of microarrays estimated from the experiment pressed in LNCaP cells. The significance score of Z-test was performed by Hill et al. Science, 290: 809-812, 2000, in dependent on the expression level. If a cut off P value of less which known concentrations of synthetic transcripts were than 0.001 was taken in the dataset, the expression level in added. In LNCaP and CL1 cells, about 68.75% (13.471 of tpim changed from 0 to 26 tpm for the most lowly expressed 19,595) of MPSS signatures (>3 tpm) were expressed at a transcript (>26 fold); and changed from 7591 and 11206 tpm level below 40 tpm; changes in the levels of these genes will for the most highly expressed transcript (1.48 fold). be missed by microarray methods. Many attempts have been 0129. The expression levels of nine randomly chosen made to increase the sensitivity of DNA array technology genes were identified using the MPSS and quantitative real (Han, M., et al., Nat Biotechnol, 19:631-635, 2001; Bao, P., time RT-PCR techniques and showed that both RNA data sets et al., Anal Chem, 74: 1792-1797, 2002.), however, the were concordant. The MPSS expression profiling data were present study has not compared these new improvements consistent with the available published data. For example, against MPSS but it is clear that there will still be significant using RT-PCR, Patel et al. (Patel, B. J., et al., J Urol, 164: differences in the levels of change that can be detected. 1420-1425, 2000) showed that CL1 tumors express barely I0132 SAGE (serial analysis of gene expression) (Vel detectable prostate-specific antigen (PSA) and androgen culescu, V. E., et al., Trends Genet, 16: 423-425., 2000) is receptor (AR) mRNAs as compared with LNCaP cells. The another technology for gene expression profiling; like MPSS, present MPSS results indicated that LNCaP cells expressed it is digital and can generate a large number of signature 584 tpmofandrogen receptor (AR) and 841 tpm of PSA; CL1 sequences. However, MPSS, which can sequence ~1 million cells did not express either AR or PSA (0 tpm in both cases). signatures per sample, can achieve a much deeper coverage Freedland et al. found that CD10 expression was lost in CL1 than SAGE (typical ~10,000-100,000 signatures sequenced/ cells compared with LNCaP cells (Freedland, S.J., et al., sample) at reasonable cost. The MPSS data on LNCaP cells Prostate, 55: 71-80, 2003); the present study found that CD10 was compared against publicly available SAGE data on was expressed at 0 tpm in CL1 cells but at 56 tpm in LNCaP LNCaP cells (NCBI SAGE database) through common Uni cells. Using cDNA microarrays, Vaarala et al. (Vaarala, M.H., gene IDs. The SAGE library GSM724 (total SAGE tags et al., Lab Invest, 80: 1259-1268, 2000) compared LNCaP sequenced: 22,721) (Lal, A., et al., Cancer Res, 59: 5403 cells and another androgen-independent variant, non-PSA 5407, 1999) was derived from LNCaP cells with an inacti producing LNCaP line, which is similar to CL1, and identi vated PTEN gene; it is the SAGE library most similar to the fied a total of 56 differentially expressed genes. We found LNCaP cells. Only 400 (about 20%) of the 1987 significantly completely concordant expression changes in these 56 genes differentially expressed genes (P<0.001) had any SAGE tag between LNCaP and CL1 (in contrast to 1987 found by entry in GSM724. These data illustrate the importance of MPSS), and between LNCaP and non-PSA-producing deep sequence coverage in identifying state changes in tran LNCaP cells. This underscores the striking differences in Scripts expressed at low abundance levels. sensitivity between the MPSS and cDNA microarray tech 0.133 Functional classifications of genes differentially niques. expressed between LNCaP and CL1 cells: Examination of the 0130 CL1 cells do not express AR and thus lack the AR GO () classification of the 1987 genes mediated response program. To distinguish androgen revealed that multiple cellular processes change during the response from other programs contributing to prostate cancer transition from LNCaP cells to CL1 cells. The most interest progression, the list of genes differentially expressed between ing groups, categorized by function, are shown in Table 1. LNCaP and CL1 cells were compared with a complementary 0.134 Nineteen differentially expressed proteins are list derived from MPSS analysis of LNCaP cells grown in the related to apoptosis. Twelve of these are up regulated in CL1 presence or absence of androgens (LNCaPR+/R-). From the cells, including the apoptosis inhibitors Taxl (human T-cell 1987 differentially expressed gene between LNCaP and CL1, leukemia virus type I) binding protein 1 (TAX1 BP1) and 525 genes were identified that were also differentially CASP8 and FADD-like apoptosis regulator. Seven are down expressed in the LNCaPR+/R- dataset. Differential expres regulated in CL1, including programmed cell death 8 and 5 sion of these genes between LNCaP and CL1 cells probably (apoptosis-inducing factors), and BCL2-like 13 (an apoptosis reflects the fact that LNCaP cells express AR but CL1 does facilitator). Since CL1 cells have increased expression of not, and the fact that normal medium contains some andro apoptosis inhibitors and decreased expression of apoptosis gen. The remaining 1462 differentially expressed genes were inducers, net inhibition of apoptosis may contribute to their not directly related to cellular AR status. greater tumorigenicity. US 2011/O 136690 A1 Jun. 9, 2011 18

TABLE 1.

EXAMPLES OF DIFFERENTIALLY EXPRESSED GENES AND THEIR FUNCTIONAL CLASSIFICATIONS

LNCaP CL1 SEQ ID Signatures (tpm) (tpm) Description GenBank ID NOS : Apoptosis related

GATCAAATGTGTGGCCT O 3609 lectin, galactoside BCOO1693 574 sfs (SEQ ID NO : 3) binding, soluble, 1 (galectin 1), GATCATAATGTTAACTA 14 pleiomorphic NM 002656 576 577 (SEQ ID NO : 4) adenoma gene-like 1 (PLAGL1)

GATCATCCAGAGGAGCT 16 caspase 7, 578 sfe (SEO ID NO : 5 apoptosis-related cysteine protease

GATCGCGGTATTAAATC 15 tumor necrosis U75380 581 (SEQ ID NO : 6) factor receptor superfamily, member 12

GATCTCCTGTCCATCAG 24 interleukin 1, beta M1533 O 582 583 (SEO ID NO : 7 GATCCCCTTCAAGGACA 19 nudix (nucleoside NM_006024 584 - 585 (SEQ ID NO: 8 diphosphate linked moiety X) - type motif 1

GATCATTGCCATCACCA 51 278 EST, Highly AL832 733 1586 (SEO ID NO: 9) similar to CUL2 HUMAN CULLIN HOMOLOG 2

GATCTGAAAATTCTTGG 16 56 CASP8 and U97 Ofs 1587-1588 (SEQ ID NO: 10) FADD-like apoptosis regulator

GATCCACCTTGGCCTCC 49 149 tumor necrosis NM_003842 1589-1590 (SEQ ID NO: 11) factor receptor superfamily, member 1. Oo

GATCATGAATGACTGAC 118 257 cytochrome c BCOO9582 1591-1592 (SEQ ID NO: 12)

GATCAAGTCCTTTGTGA 299 102 programmed cell H2O713 1593 (SEQ ID NO: 13) death 8 (apoptosis inducing factor)

GATCACCAAAACCTGAT 72 24 BCL2-like 13 1594 (SEQ ID NO: 14) (apoptosis facilitator)

GATCAATCTGAACTATC 563 146 apoptosis related NM 016085 1595-1596 (SEQ ID NO: 15) protein APR-3 (APR-3)

GATCCCTCTGTACAGGC 83 13 unc-13-like (C. NM_006377 1597-1598 (SEQ ID NO: 16) elegans) (UNC13), mRNA

GATCTGGTTGAAAATTG 1 OO6 49 CED-6 protein NM 016315 1599 - 16 OO (SEO ID NO : 17) (CED-6), mRNA.

GATCTCCCATGTTGGCT 86 4 CASP2 and RIPK1 BCO 17042 16O1 - 16 O2 (SEQ ID NO: 18) domain containing adaptor with death domain US 2011/O 136690 A1 Jun. 9, 2011 19

TABLE 1 - continued

EXAMPLES OF DIFFERENTIALLY EXPRESSED GENES AND THEIR FUNCTIONAL CLASSIFICATIONS

LNCaP CL1 SEQ ID Signatures (tpm) (tpm) Description GenBank ID NOS :

GATCAGAAAATCCCTCT 27 DEAD/H (Asp BCO 11556 1603-1604 (SEQ ID NO: 19) Glu-Ala-Asp / His) box polypeptide 20, 103kDa

GATCAAGGATGAAAGCT SO programmed cell 1605 (SEQ ID NO: 2O) death 2 GATCTGATTATTTACTT 1227 321 programmed cell NM_004708 1606-16O7 (SEQ ID NO: 21) death. 5 GATCAAGTCCTTTGTGA 299 1 O2 programmed cell NM_004208 1608 - 1609 (SEQ ID NO: 22) death 8 (apoptosis inducing factor) Cyclins

GATCCTGTCAAAATAGT 47 MCT-1 protein NM 014060 1610-1611 (SEQ ID NO: 23) (MCT-1) mRNA. GATCATTATATCATTGG 39 cyclin-dependent NM 078487 1612-1613 (SEQ ID NO: 24) kinase inhibitor 2 (CDKN2)

GATCATCAGTCACCGAA 38 396 cyclin-dependent BMO54921 1614 (SEQ ID NO: 25) kinase inhibitor 2A (p16)

GATCGGGGGCGTAGCAT 43 cyclin D1 NM O53056 1615-1616 (SEQ ID NO: 26)

GATCTACTCTGTATGGG 4 O 144 cyclin fold protein BG119256 1617 (SEO ID NO: 27)

GATCAGCACTCTACCAC 53 O 258 cyclin B1 BM973. 693 1618 (SEQ ID NO: 28)

GATCTGGTGTAGTATAT 21 O 77 cyclin G2 BM98455.1 1619 (SEQ ID NO: 29)

GATCAGTACACAATGAA 642 224 cyclin G1, BCOOO1.96 1620-1621 (SEQ ID NO: 30)

GATCTCAGTTCTGCGTT 918 CDK2-associated NM 004642 1622-1623 (SEQ ID NO: 31) protein 1 (CDK2AP1), mRNA

GATCCTGAGCTCCCTTT 249 O 650 cyclin. 1, BCOOO42O 1624 - 1625 (SEQ ID NO: 32)

GATCATGCAGTGACATA 15 KIAA1028 protein AL122 O55 1626-1627 (SEQ ID NO: 33)

GATCTGTATGTGATTGG 28 cyclin M3 AA489 Off 1628 (SEQ ID NO: 34)

Kallikreins

GATCCACACTGAGAGAG 841 KLK3 AA523902 1629 (SEO ID NO : 35)

GATCCAGAAATAAAGTC 385 KLK4 AA595 489 163 O (SEQ ID NO: 36)

GATCCTCC TATGTTGTT 314 KLK2 S39329 1631-1633 (SEO ID NO : 37) US 2011/O 136690 A1 Jun. 9, 2011 20

TABLE 1 - continued

EXAMPLES OF DIFFERENTIALLY EXPRESSED GENES AND THEIR FUNCTIONAL CLASSIFICATIONS

LNCaP CL1 SEQ ID Signatures (tpm) (tpm) Description GenBank ID NOS :

CD markers

GATCAGAGAAGATGATA O 810 CD213 a2, interleukin UFO981 1634 - 1635 (SEQ ID NO: 38) 13 receptor, alpha 2

GATCCCTAGGTCTTGGG 23 16.1 CD213al, interleukin AW874 O23 1636 (SEO ID NO. 39) 13 receptor, alpha 1.

GATCCACATCCTCTACA O 63 CD33, CD33 BCO281.52 1637-1638 (SEQ ID NO: 40) antigen (gp67)

GATCAATAATAATGAGG O 151 CD44, CD44 AI 8326.42 1639-164 O (SEQ ID NO: 41) antigen

GATCCTTCAGCCTTCAG O 35 CD 73, 5'- AI831.695 1641 (SEQ ID NO: 42) nucleotidase, ecto (CD73)

GATCTGGAACCTCAGCC 1. 50 CD49e, integrin, BCOO8786 1642 - 1643 (SEQ ID NO : 43) alpha 5

GATCAGAGATGCACCAC 8 122 CD138, syndecan 1 BM974 O52 1644 (SEQ ID NO: 44)

GATCAAAGGTTTAAAGT 38 189 CD166, activated AL8337 O2 1645 (SEQ ID NO: 45) leukocyte cell adhesion molecule

GATCAGCTGTTTGTCAT 53 295 CD71, transferrin BCOO1.188 1646-1647 (SEQ ID NO: 46) receptor (pe O, CD71)

GATCGGTGCGTTCTCCT 287 509 CD107a, AI521424 1648 (SEO ID NO: 47) lysosomal associated membrane protein 1.

GATCTACAAAGGCCATG 161 681 CD29, integrin, NM_002211 1649-1650 (SEQ ID NO: 48) beta 1

GATCATTTATTTTAAGC 56 O CD10 (neutral BQ013520 1651 (SEQ ID NO: 49) endopeptidase, enkephalinase)

GATCAGTCTTTATTAAT 15 O 50 CD107b, AI459107 1652 (SEO ID NO : 50) lysosomal associated membrane protein 2

GATCTTGGCTGTATTTA 84 1014 CD59 antigen p18 NM 000611 1653-1654 (SEQ ID NO: 51) 2O GATCTTGTGCTGTGCTA 408 234 CD9 antigen (p.24) NM_001769 1655-1656 (SEQ ID NO: 52) Transcription factors

GATCAAATAACAAGTCT O 62 transcription factor BM854 818 1657 (SEO ID NO : 53) BMAL2

GATCTCTATGTTTACTT O 27 transcription factor BG163364 1658 (SEQ ID NO: 54) BMAL2

GATCCTGACACATAAGA 12 74 transcription factor BFO55294 1659 (SEO ID NO : 55) BMAL2 US 2011/O 136690 A1 Jun. 9, 2011 21

TABLE 1 - continued

EXAMPLES OF DIFFERENTIALLY EXPRESSED GENES AND THEIR FUNCTIONAL CLASSIFICATIONS

LNCaP CL1 SEQ ID Signatures (tpm) (tpm) Description GenBank ID NOS :

GATCATTTTGTATTAAT 10 61 transcription factor BCO 47878 1660-1661 (SEO ID NO. 56) NRF GATCGTCTCATATTTGC 52 O transcriptional NM O25085 1662-1663 (SEO ID NO : 57) coactivator tubedown-1 OO

GATCCCCCTCTTCAATG O 31 transcriptional co AJ299 431 1664 - 1665 (SEO ID NO. 58) activator with PDZ-binding motif

GATCAAATGCTATTGCA 1. 55 transcriptional AI1265OO 1666 (SEO ID NO. 59) regulator interacting with the PHS-bromodomain 2

GATCTGTGACAGCAGCA 14 O 35 transducer of BCO314 O6 1667-1668 (SEQ ID NO: 6O) ERBB2, 1

GATCAAATCTGTACAGT 239 23 transducer of AA69424 O 1669 (SEQ ID NO : 61) ERBB2, 2

Annexins and their ligands

GATCCTGTGCAACAAGA O 69 annexin A10 BCOO732O 1670- 1671 (SEQ ID NO: 62)

GATCTGTGGTGGCAATG 41 630 annexin A11 AL576782 1672 (SEQ ID NO: 63)

GATCAGAATCATGGTCT O 1079 annexin A2 BCOO1388 1673 - 1674 (SEQ ID NO: 64)

GATCTCTTTGACTGCTG 21 O 860 annexin A5 BCOO1429 1675-1676 (SEO ID NO : 65)

GATCCAAAAACATCCTG 83 241 annexin A6 AIS 66871 1677 (SEQ ID NO: 66)

GATCAGAAGACTTTAAT O 695 annexin A1 BCOO1275 1678- 1679 (SEO ID NO : 67)

GATCAGGACACTTAGCA O 2949 S100 calcium BCO 15973 1680-1681 (SEQ ID NO: 68) binding protein A10 (annexin II ligand)

Matrix metalloproteinase

GATCATCACAGTTTGAG O 38 matrix BCOO2591 1682-1683 (SEQ ID NO: 69) metalloproteinase 1O (stromelys in 2)

GATCCCAGAGAGCAGCT O 108 matrix BCO13118 1684 - 1685 (SEO ID NO : 7O) metalloproteinase 1 (interstitial collagenase)

GATCGGCCATCAAGGGA O 25 matrix AI370581 1686 (SEO ID NO : 71.) metalloproteinase 13 (collagenase 3)

GATCTGGACCAGAGACA O 10 matrix BG33215 O 1687 (SEO ID NO : 72) metalloproteinase 2 (gelatinase A) US 2011/O 136690 A1 Jun. 9, 2011 22

0135 Matrix metalloproteinases (MMPs), which degrade rather generally operate in networks of interactions. Identi extracellular matrix components that physically impede cell fying key nodes (proteins) in the disease-perturbed networks migration, are implicated in tumor cell growth, invasion, and may provide insights into effective drug targets. Comparing metastasis. MMP1, 2, 10 and 13 were found to be signifi the genes (proteins) currently available in the 314 BioCarta cantly overexpressed in CL1 cells (Table 1), which may par and 155 KEGG pathway or network (httpcolon double slash tially explain these cells aggressive and metastatic behavior. cgap dot inci dot nih dot gov/Pathways/) databases with the 0.136 CD (cluster designation of monoclonal antibodies) MPSS data through Unigene IDs, we identified 37 BioCarta markers are generally localized at the cell Surface; some may and 14 KEGG pathways that are up regulated and 23 BioCarta be associated with prostate cancer (Liu, A.Y., et al., Prostate, and 22 KEGG pathways down regulated in LNCaP cells 40: 192-199, 1999). All currently identified CD markers versus CL1 cells (Table 2). The number of genes whose (CD1 to CD247) from the PROW CD index database (http expression patterns changed in each pathway is listed in Table colon double slash www dot ncbi Dot nlm dot nih dot gov/ 2. Each gene along with its expression level in LNCaP and prow/guide/45277084 dot htm) were converted to UniGene CL1 cells is listed pathway by pathway in our database (fip numbers and the Unigene numbers used to identify their colon double slash ftp dot systemsbiology dot net/blin/mpss). signatures and their expression levels. Fifteen CD markers Changes in these pathways reveal the underlying phenotypic were identified that were differentially expressed between differences between LNCaP and CL1 cells. For example, LNCaP and CL1 cells (Z score<0.001) (Table 1). Eleven CD multiple networks involved in modulating cell mobility, markers, including CD213a2 and CD213a1, which encode adhesion and spreading are up regulated in CL1 cells, which IL-13 receptors alpha 1 and 2, are up regulated in CL1 cells; are more metastatic and invasive than LNCaP cells (Table 2). three CD markers, CD9, CD10, and CD107, WERE down In the uCalpain and Friends in Cell Spread pathway, calpains regulated in these cells (Table 1). Six CD markers went from are calcium-dependent thiol proteases implicated in cytosk 0 or 1 tipm to >35tpm (Table 1), making them good digital or eletal rearrangements and cell migration. During cell migra absolute markers or therapeutic targets. These data Suggest tion, calpain cleaves target proteins such as talin, eZrin, and that carefully selected CD markers may be useful in following paxillin at the leading edge of the membrane, while at the the progression of prostate cancer, and indeed could serve as same time cleaving the cytoplasmic tails of the integrins B1 (a) potential targets for antibody-mediated therapies (Liu, A.Y., and B3(b) to release adhesion attachments at the trailing et al., Prostate, 40: 192-199, 1999). membrane edge. Increased activity of calpains increases 0.137 Delineation of disease-perturbed networks in pros migration rates and facilitates cell invasiveness (Liu, A. et al., tate cancer cells. Genes and proteins rarely act alone but Prostate, 40: 192-199, 1999).

TABLE 2

PATHWAYS THAT ARE UP OR DOWN REGULATED COMPARING LNCAP TO CL1 CELLS. # Genes hits # p < 0.001 & # p < 0.001 & # no Pathways in a pathway LNCA > CL1 LNCA < CL1 change Up-regulated Pathways in LNCAP cells BioCarta Pathways Mechanism of Gene Regulation 35 9 2 24 by Peroxisome Proliferators via PPARa alpha T Cell Receptor Signaling 21 6 2 13 Pathway ATM Signaling Pathway 5 5 2 8 CARM1 and Regulation of the 8 5 2 11 Estrogen Receptor HIV-I Nef negative effector of 33 5 2 26 Fas and TNF EGF Signaling Pathway 7 5 11 Role of BRCA1 BRCA2 and 6 5 10 ATR in Cancer Susceptibility TNFR1 Signaling Pathway 7 5 11 Toll-Like Receptor Pathway 7 5 11 FAS signaling pathway CD95 7 4 12 VEGF Hypoxia and 6 4 11 Angiogenesis Bone Remodelling 9 3 5 ER associated degradation 1 3 7 ERAD Pathway Estrogen-responsive protein 1 3 7 Efp controls cell cycle and breast tumors growth influence of Ras and Rho 6 3 12 proteins on G1 to S Transition inhibition of Cellular 3 3 9 Proliferation by Gleevec US 2011/O 136690 A1 Jun. 9, 2011 23

TABLE 2-continued

PATHWAYS THAT ARE UP OR DOWN REGULATED COMPARING LNCAP TO CL1 CELLS. # Genes hits # p < 0.001 & # no Pathways in a pathway LNCA > CL1 change Map Kinase Inactivation of 9 3 1 5 SMRT Corepressor NFkB activation by 6 3 1 12 Nontypeable Hemophilus influenzae RB Tumor Suppressor O 3 Checkpoint Signaling in response to DNA damage Transcription Regulation by O 3 Methyltransferase of CARM1 Ceramide Signaling Pathway 3 4 O Cystic fibrosis transmembrane 7 4 conductance regulator and beta 2 adrenergic receptor pathway Nerve growth factor pathway 1 4 NGF PDGF Signaling Pathway 6 4 12 TNF Stress Related Signaling 4 4 O 10 Activation of Csk by cAMP 9 3 dependent Protein Kinase Inhibits Signaling through the T Cell Receptor AKAP95 role in mitosis and 1 3 dynamics Attenuation of GPCR Signaling 3 Chaperones modulate 1 3 8 interferon Signaling Pathway ChREBP regulation by 2 3 O carbohydrates and cAMP IGF-1 Signaling Pathway 1 3 Insulin Signaling Pathway 1 3 NF-kB Signaling Pathway 1 3 Protein Kinase A at the 2 3 Centrosome Regulation ofck1 colk5 by type O 3 O 1 glutamate receptors Role of Mitochondria in O 3 Apoptotic Signaling Signal transduction through 4 3 11 IL1R KEGG Pathways Aminosugars metabolism 24 9 11 Androgen and estrogen 37 13 5 19 metabolism Benzoate degradation via 5 3 hydroxylation C21-Steroid hormone 4 1 metabolism C5-Branched dibasic acid 2 2 metabolism Carbazole degradation 1 1 Terpenoid biosynthesis 6 Chondroitin heparan Sulfate 14 8 biosynthesis Fatty acid biosynthesis (path 1) 3 2 Fluorene degradation 3 2 Pentose and glucuronate 19 9 interconversions Phenylalanine, tyrosine and 10 5 tryptophan biosynthesis Porphyrin and chlorophyll 28 13 12 metabolism Streptomycin biosynthesis 6 4 Up-regulated Pathways in CL1 cells BioCarta Pathways Rho cell motility signaling 18 2 10 pathway US 2011/O 136690 A1 Jun. 9, 2011 24

TABLE 2-continued

PATHWAYS THAT ARE UP OR DOWN REGULATED COMPARING LNCAP TO CL1 CELLS. # Genes hits # p < 0.001 & # no Pathways in a pathway LNCA > CL1 change Trefoil Factors Initiate 14 6 7 Mucosal Healing Integrin Signaling Pathway 14 5 8 Ca Calmodulin-dependent 7 4 2 Protein Kinase Activation Effects of calcineurin in 9 Keratinocyte Differentiation Angiotensin II mediated 12 activation of JNK Pathway via Pyk2 dependent signaling Bioactive Peptide Induced 16 12 Signaling Pathway CBL mediated ligand-induced 6 downregulation of EGF receptors Control of skeletal myogenesis 12 by HDAC calcium calmodulin-dependent kinase CaMK How does salmonella hijack a 8 cell Melanocyte Development and 4 Pigmentation Pathway Overview of telomerase protein 7 component gene hTert Transcriptional Regulation Regulation of PGC-1a 9 O ADP-Ribosylation Factor 9 O Downregulated of MTA-3 in 7 O ER-negative Breast Tumors Endocytotic role of NDK 7 O Phosphins and Dynamin Mechanism of Protein Import 7 O into the Nucleus Nuclear Receptors in Lipid 7 O Metabolism and Toxicity Pertussis toxin-insensitive 9 O CCR5 Signaling in Macrophage Platelet Amyloid Precursor 5 O Protein Pathway Role of Ran in mitotic spindle 8 O regulation Sumoylation by RanBP2 8 O Regulates Transcriptional Repression uCalpain and friends in Cell 5 O spread KEGG Pathways Arginine and proline 45 7 metabolism ATP synthesis 31 7 Biotin metabolism 5 1 Blood group glycolipid 12 1 biosynthesis -lactoseries yanoamino acid metabolism 5 O hylbenzene degradation 9 1 anglioside biosynthesis 16 2 oboside metabolism 17 3 utathione metabolism 26 4 ycine, serine and threonine 32 6 etabolism ycosphingolipid metabolism 35 6 ycosylphosphatidylinositol(GPI)- 26 5 anchor biosynthesis Glyoxylate and dicarboxylate 9 1 metabolism Huntington's disease 25 4 Methane metabolism 9 1 US 2011/O 136690 A1 Jun. 9, 2011 25

TABLE 2-continued

PATHWAYS THAT ARE UP OR DOWN REGULATED COMPARING LNCAP TO CL1 CELLS. # Genes hits # p < 0.001 & # p < 0.001 & # no Pathways in a pathway LNCA > CL1 LNCA < CL1 change O-Glycans biosynthesis 19 3 8 8 One carbon pool by folate 12 2 8 2 Oxidative phosphorylation 93 21 45 27 Parkinson's disease 30 5 14 11 Phospholipid degradation 21 4 12 5 Synthesis and degradation of 7 1 3 3 ketone bodies Urea cycle and metabolism of 18 2 8 8 amino groups

0.138. Many pathways we identified as perturbed in the Commun, 322:1166-1170, 2004), seven are involved in fatty LNCaP and CL1 comparison are interconnected to form net acids and lipid metabolism that are involved in the carcino works (in fact there are probably no discrete pathways, only genesis and progression of prostate cancer (Pandian, S.S., et networks). For example, the insulin signaling pathway, the al., J R Coll Surg Edinb, 44; 352-361, 1999), five are related signal transduction through IL1R pathway, NF-kB signaling to apoptosis, 11 are cancer related, and five proteins are pathway are interconnected through c-Jun, IL1R and NF-kB. putative transcription factors. As we only identified a limited The mapping of genes onto networks/pathways will be an number of proteins that are significantly differentially ongoing objective as more networks/pathways become avail expressed due to low sensitivity of ICAT technology, we were able. Our transcriptome data will be an invaluable resource in only able to identify a few pathways that are perturbed based delineating these relationships. on ICAT data alone (using the stringent criteria discussed 0.139. As gene regulatory networks controlled by tran above). This also illustrated the importance of MPSS analysis scription factors form the top layer of the hierarchy that described earlier. controls the physiological network, we sought to identify 0141 103 of 190 (54%) differentially expressed proteins differentially expressed transcription factors. Of 554 tran identified have enzymatic activity and hence many are scription factors expressed in LNCaP and CL1 cells, 112 involved in metabolism. Notably, many of the proteins iden showed significantly different levels between the cell lines tified are involved in fatty acid and lipid metabolism, includ (P<0.001) This clearly demonstrated significant difference in ing fatty acid synthase, carnitine palmitoyltransferase II and the functioning of the corresponding gene regulatory net propionyl Coenzyme A carboxylase alpha polypeptide. Fatty works during the progression of prostate cancer from the acid and lipid metabolism is known to be perturbed in prostate early to late stages. cancer (Fleshner, N., et al., J Urol, 171: S 19-24, 2004). 0140. Quantitative Proteomics Analysis of Prostate Can Additionally, many genes involved in lipid transport were cer Cells. We quantitatively profiled the protein expression altered, including the annexins, prosaposin, and fatty acid changes between LNCaP and CL1 cells using the ICAT binding protein 5. Annexin A1 has previously been shown to MS/MS protocol described by Han et al. Nat Biotechnol, 19: be overexpressed in non-PSA-producing LNCaP cells as 946-951, 2001. To increase proteome coverage, cells were compared with PSA-producing LNCaP cells (Vaarala, M. H., separated into nuclear, cytosolic and microsomal fractions et al., 2000, supra). Annexin A7 is postulated to be a prostate prior to ICAT analysis as described in Han et al., 2001, Supra. tumor Suppressor gene (Cardo-Vila, M., et al., Pharmacoge We generated a total of 142,849 tandem mass spectra, 7282 of nomics J. 1:92-94, 2001). Annexin A2 expression is reduced which corresponded to peptides with a mass spectrum quality or lost in prostate cancer cells, and its re-expression inhibits score P value (Keller, A., et al., Anal Chem. 2002 Oct 15:74 prostate cancer cell migration (Liu, J. W., et al. Oncogene, (20):5383-92) greater than 0.9 (allowing unambiguous iden 22: 1475-1485, 2003). tification of peptides). These 7282 peptides represented 971 0142. Other genes identified here have been implicated in proteins (Keller, A., et al., 2002, supra). We obtained quanti carcinogenesis, including tumor Suppressor p16 and insulin tative peptide ratios for 4583 peptides corresponding to 941 like growth factor 2 receptor (Chi, S. G., et al., Clin Cancer proteins. The number of peptides is greater than the number of Res, 3: 1889-1897, 1997; Kiess, W., et al., Horm Res, 41 proteins because 1) mass spectrometry identified multiple Suppl 2: 66-73, 1994). Some genes have previously been peptides from the same protein and 2) the ionization step of implicated in prostate cancer, such as prostate cancer over mass spectrometry created different charge states for the expressed gene 1 POV1, which is over expressed in prostate same peptide. The protein ratios were calculated from mul cancer (Cole, K.A., et al., Genomics, 51:282-287, 1998), and tiple peptide ratios using an algorithm for the automated delta 1 and alpha 1 catenin (cadherin-associated protein) and statistical analysis of protein abundance ratios (ASAPRatio) junction plakoglobin, which are down regulated in prostate (Li, X.J., et al., Anal Chem, 75: 6648-6657, 2003). In the end, cancer cells (Kallakury, B.V., et al., Cancer, 92: 2786-2795, we identified 82 proteins that are down regulated and 108 2001). However, the potential relationships of most of the proteins that are up regulated by at least 1.8-fold in LNCaP proteins identified here to prostate cancer require further elu cells compared with CL1 cells. For example, five proteins cidation. For example, transmembrane protein 4 (TMEM4), a belong to annexins that were markers for prostate and other gene predicted to encode a 182-amino acid type II transmem cancers (Hayes, M.J. and Moss, S. E. Biochem Biophy's Res brane protein, is downregulated about twofold in CL1 cells US 2011/O 136690 A1 Jun. 9, 2011 26 compared with LNCaP cells. MPSS data also indicated that the protein IDs and MPSS signatures to Unigene IDs to com TMEM4 is down regulated about twofold in CL1 cells. Many pare the MPSS data with the ICAT MS/MS data. Welimited type II transmembrane proteins, such as TMPRSS2, are over this comparison to those with common Unigene IDs and with expressed in prostate cancer patients (Vaarala, M. H., et al., reliable ICAT ratios (standard deviation less than 0.5) and ended up with a subset of 79 proteins. Of these, 66 genes IntJ Cancer, 94: 705-710, 2001). It will be interesting to see (83.5%) were concordant in their changes in mRNA and whether TMEM4 overexpression plays a primary role in protein levels of expression and 13 genes (16.5%) were dis prostate carcinogenesis. We also identified 12 proteins that cordant, i.e. having higher protein expression but lower have not been annotated or functionally characterized. mRNA expression or vice versa. There are no functional 0143. The mRNA expression level of eight proteins similarities among the discordant genes. As these mRNAS change from 0 tpm in LNCaP cells to greater than 50 tpm (we and proteins are expressed at relatively high levels, discor called them digital changes because they go from Zero to dance due to measurement errors is unlikely. Clearly post Some expression) in CL1 cells, and that of one protein transcriptional mechanism(s) of protein expression are func changed from 0 tpm in CL1 cells to greater than 50 in LNCaP tioning, although the elucidation of the specific mechanism cells. These genes can be used as digital diagnostic signals. (s) awaits further studies. Twenty-two of the differentially expressed proteins were pre 0145 This system provides a model for studying pertur dicted to be secreted proteins (See Table 3) and can be further bation of organ-specific molecular blood fingerprints. These evaluated as serum marker (see also Example 2 below). results, and those described in the Examples below, indicate a 0144. Additionally, we sought to compare the expression systems approach will offer powerful tools for disease diag at the protein level with that at the mRNA level. We converted nostics, drug side effects diagnostics, and therapeutics.

TABLE 3

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCAGCATGGGCCACG 73 NM_001928 594-595 D component of complement (adipsin)

GATCTACTACTTGGCCT 74 NM_006280 596-597 signal sequence receptor, delta (translocon associated protein delta)

GATCCTGTTGGGAAAGA 7s NM 203329 598-599 CD59 antigen p18-20 (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCCTGTTGGGAAAGA 76 NM 203331 6OO - 601 CD59 antigen p18-20 (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCCCTGAAGTTGCCC 77 NM 203331 6OO - 601 CD59 antigen p18-20 (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCTTGGCTGTATTTA 78 NM 203331 6OO - 601 CD59 antigen p18-20 (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCCCTGAAGTTGCCC 79 NM 203330 6O2- 603 CD59 antigen p18-20 (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344) US 2011/O 136690 A1 Jun. 9, 2011 27

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCCTGTTGGGAAAGA 8O NM 203330 6O2- 603 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCTTGGCTGTATTTA 81 NM 203330 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCCCTGAAGTTGCCC 82 NM 203329 598-599 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCTTGGCTGTATTTA 83 NM 000611 604 - 605 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344) GATCCCTGAAGTTGCCC 84 NM 000611 604 - 605 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCCTGTTGGGAAAGA 85 NM 000611 604 - 605 CD59 antigen p18-2O (antigen identified by monoclonal antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCTTGGCTGTATTTA 86 NM 203329 598-599 CD59 antigen p18-2O (antigen identified by antibodies 16.3A5, EJ16 EJ3 O, EL32 and G344)

GATCTGTGCTGACCCCA 87 NM_002982 6O 6- 607 chemokine (C-C motif) ligand 2 GATCTCTTGGAATGACA 88 NM 012242 608 - 609 dickkopf homolog 1 (Xenopus laevis GATCACCATCAAGCCAG 89 NM 012242 608 - 609 dickkopf homolog 1 (Xenopus laevis

GATCAAACAGCTCTAGT 90 NM 016308 610 - 611 UMP - CMP kinase GATCCCCTGTTACGACA 91 NM_014 155 61.2 - 613 HSPCO 63 protein GATCTCTGATTACCAGC 92 NM_025205 614 - 615 mediator of RNA polymerase II transcription, subunit 28 homolog (yeast)

GATCATTGAACGAGACA 93 NM 03.1903 616 - 617 mitochondrial ribosomal protein L32 US 2011/O 136690 A1 Jun. 9, 2011 28

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GA CACAGACCACGAGT 94 NM 178507 618 - 619 NSSATP13TP2 protein

GA CTGCATCAGTTGTA 95 NM 148170 620 - 6.21 cathepsin C

GA CTCTTGCTAGATTT 96 NM_005059 622 - 623 relaxin 2

GA CACAAGGCTGCCTG 97 NM 000405 624 - 625 GM2 ganglioside activator

GA CGTTTCTCATCTCT 98 NM_006.432 626 - 627 Niemann-Pick disease, type C2

GA CCCCGCGATACTTC 99 NM_015921 628 - 629 chromosome 6 open reading frame 82

GA CTTTTTTTGGATAT OO NM 181777 630 - 631 ubiquitin- conjugating enzyme E2A (RAD6 homolog)

GA CCGAGAGTAAGGAA O1 NM 032488 632 - 633 cornifelin

GA CATGTGTTTCCATG O2 NM_014.435 634 - 635 N-acyl sphingosine amidohydrolase (acid ceramidase) - like

GA CTCAGAACAACCTT O3 NM 016029 636 - 637 dehydrogenase/ reductase (SDR family) member 7

GA CTTAC CTCCTGATA O4 NM_020467 638 - 639 hypothetical protein from clone 643

GA CCCAGACTGGTTCT O5 NM_003 782 64 O - 6 41 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 4

GA CAAGTGCATTTGAC O6 NM 173631 642 - 643 zinc finger protein 547

GA CAGTGCGTCATGGA O7 NM_005423 644 - 645 trefoil factor 2 (spasmolytic protein 1)

GA CCAAGAGGAAGAAT O8 NM_014402 64 6- 6 47 low molecular mass ubiquinone-binding protein (9.5kD)

GA CCAGCAAACAGGTT O9 NM_003851 648 - 649 cellular repressor of E1A-stimulated genes 1

GA CATAGAAGGCTATT 10 NM 181834 650 - 651 neurofibromin 2 (bilateral acoustic neuroma)

GA CCCCCTTCATTTGA 11 NM_004862 652 - 653 lipopolysaccharide induced TNF factor

GA CCCAAATTTGAAGT 12 NM_001685 654 - 655 ATP synthase, H+ transporting, mitochondrial FO complex, subunit F6 US 2011/O 136690 A1 Jun. 9, 2011 29

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCTGCTTTCTGTAAT 113 NM_002406 65. 6- 657 mannosyl (alpha-1,3- ) - glycoprotein beta 1, 2-N- acetylglucosaminyl transferase

GATCACTCCTTATTTGC 114 NM_019021 658 - 659 hypothetical protein FLJ2OO1 O

GATCACCTTCGACGACT 115 NM_003130 660 - 661 sorcin GATCTCTATTGTAATCT 116 NM_002489 662 - 663 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 4, 9kDa GATCTCCTGGCTGCAAA 17 NM 138429 664 - 665 claudin 15 GATCCCAGTCTCTGCCA 18 NM 201397 666 - 667 glutathione peroxidase 1

GATCTTCT, TTATAATTC 19 NM_004.048 668 - 669 beta-2-microglobulin GATCTGTTCAAACAGCA NM 024060 670- 671 hypothetical protein MGC 5395

GATCGTGCTCACAGGCA 21 NM_033280 672 - 673 SEC11-like 3 (S. cerevisiae

GATCAATATGTAAATAT 22 NM_020199 674 - 675 open reading frame 15 GATCAGCTTTGCTCCTG 23 NM 207495 676- 677 hypothetical protein DKFZp686115217

GATCTCTATGGCTGTAA 24 NM_033211 678-679 hypothetical gene supported by AFO38182; BCOO92O3

GATCTCAGAACCTCTGT 125 NM_001001436 680 - 681 Similar to RIKEN cDNA 4.92.1524 J17

GATCCAGCCATTACTAA 126 NM_016205 682 - 683 platelet derived growth factor C GATCTTTCCCAAGATTG 127 NM_001001434 684 - 685 syntaxin 16 GATCGATTCTGTGACAC 128 NM 181726 686 - 687 low density lipoprotein receptor related protein binding protein

GATCTATTTTTTCTAAA 129 NM_004125 688 - 689 guanine nucleotide binding protein (G protein), gamma 10 GATCAAGAATCCTGCTC 13 O NM_006332 690 - 691 interferon, gamma inducible protein 3 O GATCGGTGGAGAACCTC 131 NM 175742 692 - 693 melanoma antigen, family A, 2 GATCGGTGGAGAACCTC 132 NM 175743 694 - 695 melanoma antigen, family A, 2 GATCGGTGGAGAACCTC 133 NM 153488 69. 6- 697 melanoma antigen, family A, 2B US 2011/O 136690 A1 Jun. 9, 2011 30

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCATGGGTGAGGGGT 134 NM_001483 698-699 glioblastoma amplified sequence

GATCCCCCT CACCATGA 135 NM 032621 700-701 brain expressed X linked 2

GATCAACTAATAGCTCT 136 NM 181892 7 O2- 703 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GA CAAATAAAGTTATA 37 NM 181892 7 O2- 703 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GA CAAGGAGACCCGGA 38 NM_024540 704 - 705 mitochondrial ribosomal protein L24

GA CAAGGAGACCCGGA 39 NM 145729 7O6 - 7 Of mitochondrial ribosomal protein L24

GA CCTAAGCCATAGAC 4 O NM_025075 708 - 709 Ngg1 interacting factor 3 like 1 binding protein 1 GA CCATTGAGCCCAGC 41 NM 181725 hypothetical protein FLJ1276 O

GA CTGAGGGCGTCTTC 42 NM 012153 ets homologous factor

GA CTCGGTAGTTACGT 43 NM 012153 ets homologous factor

GA CCCAAGATGATTAA 44 NM 014177 chromosome 18 open reading frame 55

GA CTCAAACTTGTCTT 45 NM_003350 ubiquit in conjugating enzyme E2 variant 2

GA CATAGTTATTATAC 46 NM 032466 aspartate beta hydroxylase

GA CCCAACTGCTCCTG 47 NM_005947 72 O-721 metallothionein 1B (functional)

GA CAAAATGCTAAAAC 48 NM 016311 722-723 ATPase inhibitory factor 1

GA CTGTTTGTTCCCTG 49 NM 013411 724 - 725 adenylate kinase 2 GA CAACAGTGGCAATG SO NM_001001392 726-727 CD44 antigen (homing function and Indian blood group system)

GATCAATAATAATGAGG 151 NM_001001392 726-727 CD44 antigen (homing function and Indian blood group system)

GATCAACTAATAGCTCT 152 NM 181890 728-729 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast) US 2011/O 136690 A1 Jun. 9, 2011 31

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCAAATAAAGTTATA 153 NM 181891 73 O-731 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 154 NM 181890 728-729 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 155 NM 181889 732-733 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 156 NM_003340 734 - 735 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 157 NM 181888 736-737 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 158 NM 181888 736-737 ubiquitin-conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 1.59 NM 181891 73 O-731 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 16 O NM 181887 738-739 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 161 NM 181887 738-739 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 162 NM 181886 74. O-741 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 163 NM 181886 74. O-741 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 164 NM_003340 734 - 735 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAACTAATAGCTCT 1.65 NM 181889 732-733 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCTGATTTTTTCCCC 166 NM 145751 742-743 TNF receptor associated factor 4 US 2011/O 136690 A1 Jun. 9, 2011 32

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCAGAAATGACTGTG 67 NM_018509 744 - 745 hypothetical protein PRO1855

GATCACTGAGAAAAAAT 68 NM 1524O7 746 - 747 GrpE-like 2, mitochondrial (E. coli) GATCCAAGAGTTTAGTG 69 NM_006.807 748-749 chromobox homolog 1 (HP1 beta homolog Drosophila)

GATCTTTGCTGGCAAGC 70 NM_002954 750-751 ribosomal protein S27a.

GATCCACACTGAGAGAG 71. NM 145864 752 - 75.3 kallikrein 3, (prostate specific antigen)

GATCTGTATTATTAAAT 72 NM 032549 754-755 IMP2 inner mitochondrial membrane protease like (S. cerevisiae GATCTGTTTGTTCCCTG 173 NM 172199 76-77 adenylate kinase 2 GATCCCCTGCCTGGTGC 174 NM_001312 78-79 cysteine-rich protein 2

GATCAACTAATAGCTCT 17s NM 181893 760-761 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCAAATAAAGTTATA 176 NM 181893 760-761 ubiquitin- conjugating enzyme E2D 3 (UBC4/5 homolog, yeast)

GATCTTTTTCAAGTCTT 77 NM 012071 762-763 COMM domain containing 3

GATCATGTATGAGATAG 78 NM 012460 764-765 translocase of inner mitochondrial membrane 9 homolog (yeast)

GATCCTTCAGGCAGTAA 79 NM 176805 766 - 767 mitochondrial ribosomal protein S11

GATCTTTTTTTGGATAT 8O NM_003336 768-769 ubiquitin- conjugating enzyme E2A (RAD6 homolog)

GATCCCAGTCTCTGCCA 81 NM 000581 glutathione peroxidase 1

GATCAAGACGAGCCTGC 82 NM_004864 772-773 growth differentiation factor 15

GATCCCAGCTGATGTAG 83 NM_001885 774-775 crystallin, alpha B GATCATGAAGACCTGCT 84 NM_003754 776-77 7 eukaryotic translation initiation factor 3, subunit 5 epsilon, 47kDa

GATCTCAAGGTTGATAG 185 NM_003864 778-779 sin3-associated polypeptide, 3 OkDa US 2011/O 136690 A1 Jun. 9, 2011 33

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GA CACCAGGCTGCCCA 86 NM 148571 78 O-781 mitochondrial ribosomal protein L27

GA CAAAATGCTAAAAC 87 NM 178190 782-783 ATPase inhibitory factor 1

GA CAAGATGACACTGA 88 NM_004.483 784-785 glycine cleavage system protein H (aminomethyl carrier) GA CGGGAACTCCTGCT 89 NM_005952 786-787 metallothionein 1X GA CTTGTCTTTAAAAC 9 O NM_015646 788-789 RAP1B, member of RAS oncogene family

GA CCACACACGTTGGT 91 NM_003 255 79 0-791 tissue inhibitor of metalloproteinase 2 GA CATCAGTCACCGAA 92 NM 000077 792- 793 cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) GATCCAGTATTCACGTCA 193 NM 002166 794-795 inhibitor of DNA binding 2, dominant negative helix-loop helix protein GA CCTTGCAGGGAGCT 94 NM_015343 796 - 797 dullard homolog (Xenopus laevis GA CTCCTTGCCCCAGC 95 NM_015343 796 - 797 dullard homolog (Xenopus laevis GA CGCCTAGTATGTTC 96 NM_003897 798-799 immediate early response 3 GA CAGACTGTATTAAA 97 NM 032052 zinc finger protein 278

GA CGGCCCTACTAGAT 98 NM 032052 zinc finger protein 278

GA CTCCCACTGCGGGG 99 NM 032052 zinc finger protein 278

GA CTGTGATGGTCAGC 2OO NM 000232 sarcoglycan, beta (43kDa dystrophin associated glycoprotein)

GATCACTGTGGTATCTA NM 052822 804 - 805 secretory carrier membrane protein 1 GATCATCAGTCACCGAA NM_058197 806-807 cyclin-dependent kinase inhibitor 2A (melanoma, p16, inhibits CDK4) GATCATTTGTTTATTAA NM_022334 808 - 809 integrin beta 1 binding protein 1

GATCAAATATGTAAAAT 2O4. NM_004.842 810-811 A kinase (PRKA) anchor protein 7 GATCTCTTGCTAGATTT 2O5 NM 134441 812-813 relaxin 2 GATCACCTTCGACGACT NM 1989.01 814 - 815 sorcin US 2011/O 136690 A1 Jun. 9, 2011 34

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GA CGGA TGATTAAAA NM_020353 816-817 phospholipid scramblase 4

GA CTAG TGGGAGATA NM 153367 818 - 819 chromosome 10 open reading frame 56

GA CTTT TTGGCTACT 209 NM_018424 820 - 821 erythrocyte membrane protein band 4.1 like 4B

GA CACA TTTCTGTTG NM 201436 822 - 823 H2A histone family, member W

GA (CACC GGGTTTCTT NM_021999 824 - 825 integral membrane protein 2B

GA (CTAT AGATTCAAA. NM 021105 826-827 phospholipid scramblase 1

GA CTCT ATTTTACAA NM 000546 828-829 tumor protein p53 (Li-Fraumeni syndrome)

GA CATAGAAGGCTATT NM 181835 83 O-831 neurofibromin 2 (bilateral acoustic neuroma)

GA CTTCCTGGACAGGA NM 152992 832-833 POM (POM121 homolog, rat) and ZP3 fusion

GA CAAGGACCGGCCCA NM 032391 834 - 835 small nuclear protein PRAC

GA CGCATTTTTGTAAA NM_058171 836-837 inhibitor of growth family, member 2

GA CCATCCTCATCTCC NM_020188 838-839 DC13 protein GA CGATGGTGGCGCTT NM 138992 beta- Site APP cleaving enzyme 2

GA CTTATAAAAAGAAA 22 O NM_017998 84 O - 841 chromosome 9 open reading frame 40

GA CTGAACGATGCCGT 221 NM_024579 842 - 84.3 hypothetical protein FLJ23221

GA CTCCCCGCCGCAGC 222 NM_015973 844 - 845 galanin GA CGTCGTCCAGGCCA 223 NM 032920 846 - 847 chromosome 21 open reading frame 124

GA CGTTGGGGAACCCC 224 NM 199483 848 - 849 chromosome 2 O open reading frame 24

GA CCTATATGTCC, TCT 225 NM 152344 850 - 851 hypothetical protein FLJ3 O656

GA CGATGGTTGACAAT 226 NM_004.552 852-853 NADH dehydrogenase (ubiquinone) Fe-S protein 5, 15kDa (NADH-coenzyme Q reductase) US 2011/O 136690 A1 Jun. 9, 2011 35

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCTTGTACTAACTTA 227 NM_019059 854-85.5 translocase of outer mitochondrial membrane 7 homolog (yeast)

GATCCCGATGTTCTTAA 228 NM_001806 856-857 CCAATAenhancer binding protein (C/EBP), gamma GATCCTGTTTAACAAAG 229 NM 015469 858 - 859 nip snap homolog 3A (C. elegans) GATCACGCACACACAAT 23 O NM 198337 860 - 8611 insulin induced gene 1.

GATCCAGCCAGACTTGC 231 NM 144772 862-863 apolipoprotein A-I binding protein

GATCCACACTGGAGAGA 232 NM_003450 864 - 865 zinc finger protein 174

GATCTCAGTTCTGCGTT 233 NM 004642 866 - 867 CDK2-associated protein 1

GATCTACACCTCTTGCC 234 NM 052845 868 - 869 methylmalonic aciduria (cobalamin deficiency) type B

GATCCAGCTGGAAAGCT 235 NM_006.406 870 - 871 peroxiredoxin 4 GATCCTTCAGGCAGTAA 236 NM_022839 872-873 mitochondrial ribosomal protein S11

GATCCACACTGAGAGAG 237 NM_001648 874-875 kallikrein 3, (prostate specific antigen)

GATCACCTTATGGATGT 238 NM_003932 876-877 suppression of tumorigenicity 13 (colon carcinoma) (Hsp70 interacting protein)

GATCTAGTTATTTTAAT 239 NM 172178 878-879 mitochondrial ribosomal protein L42

GATCATTGAGAATGCAG 24 O NM 206966 880-881 Similar to AWLW472 GATCATGCCAAGTGGTG 241 NM 058248 882-883 deoxyribonuclease II beta

GATCACATTTTCTGTTG 242 NM 201516 884 - 885 H2A histone family, member W

GATCAGAAAGAAACCTT 243 NM_006.744 886-887 retinol binding protein 4, plasma

GATCCGTGGCAGGGCTG 244 NM 03.1901 888 - 889 mitochondrial ribosomal protein S21

GATCCGTGGCAGGGCTG 245 NM_018997 890-891 mitochondrial ribosomal protein S21

GATCTATCACCCAAACA 246 NM 198157 892 - 89.3 ubiquitin- conjugating enzyme E2L 3 US 2011/O 136690 A1 Jun. 9, 2011 36

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GA CAAGCGTGCTTTCC 247 NM_000.995 894 - 895 ribosomal protein L34

GA CAAGCGTGCTTTCC 248 NM_033625 896-897 ribosomal protein L34

GA CCCTCATCCCTGAA 249 NM 014098 898-899 peroxiredoxin 3 GA CCACCTTGGCC. TCC 25 O NM 147187 9 OO-901 tumor necrosis factor receptor superfamily, member 1 Oo

GA CTTAGGGAGACAAA. 251 NM 182529 9 O2- 903 THAP domain containing 5 GA CAAGATACGGAAGA 252 NM 177924 904 - 905 N-acyl sphingosine amidohydrolase (acid ceramidase) 1 GA CTGTTTGTTCCCTG 253 NM_001625 906 - 9 Of adenylate kinase 2 GA CAGCAAAAGCCAAA. 254 NM 201263 908-909 tryptophanyl tRNA synthetase 2 (mitochondria-) GA CGGGGGAGGGTAAA 255 NM_004544 91 O-911 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 10, 42kDa

GATCGTGGAGGAGGGAC 256 NM 016310 912-913 polymerase (RNA) III (DNA directed) polypeptide K, 12.3 kDa

GA CACTTTTGAAAGCA 257 NM_018465 914 - 91.5 chromosome 9 open reading frame 46 GA CTGATTTGCTAGTT 258 NM 015147 916 - 917 KIAAO 582 GA CCTAGGGGGTTTTG 259 NM 015147 916 - 917 KIAAO 582 GA (CTAAGTTGCCTACC 26 O NM 014176 918-919 HSPC150 protein similar to ubiquitin conjugating enzyme

GA CTTTGTTCTTGACC 261 NM_020531 92 O-921 chromosome 2 O open reading frame 3 GA CTCTTAGCCAGAGG 262 NM 153333 922-923 transcription elongation factor A (SII) -like 8 GA CTCTCTCACCTACA 263 NM_003287 924 - 925 tumor protein D52 like 1

GA CAGAGGTGAAGGGA 264 NM_007021 926-927 chromosome 10 open reading frame 10 GA CTCATTGATGTACA 265 NM_032947 928-929 putative small membrane protein NID67

GA CTGTGCCGGCTTCC 266 NM_005656 930 - 931 transmembrane protease, serine 2 GA CCGTCTGTGCACAT 267 NM_005656 930 - 931 transmembrane protease, serine 2 US 2011/O 136690 A1 Jun. 9, 2011 37

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCGGCTCTGGGAGAC 268 NM_006.315 932-933 ring finger protein 3 GATCGATTAATGAAGTG 269 NM 016326 934 - 935 chemokine-like factor GATCCTGGACTGGGTAC 27 O NM_006830 936-937 ubiquinol cytochrome c reductase (6.4kD) subunit

GATCTTGGAGAATGTGA 271 NM_001216 938-939 carbonic anhydrase IX

GATCTTTTTTTGGATAT 272 NM 181762 94 O-941 ubiquitin- conjugating enzyme E2A (RAD6 homolog)

GATCTAGTTATTTTAAT 273 NM 014050 942-943 mitochondrial ribosomal protein L42

GATCTAGTTATTTTAAT 274 NM 172177 944 - 94.5 mitochondrial ribosomal protein L42

GATCAAGGGACGGCTGA NM_000.978 94 6-947 ribosomal protein L23

GATCAGAAGGCTCTGGT 276 NM_018442 948-949 IQ motif and WD repeats 1

GATCAATGTTGAAGAAT 277 NM_018442 948-949 IQ motif and WD repeats 1

GATCCTGCACTCTAACA 278 NM 203339 950-951 clusterin (complement lysis inhibitor, SP-40, 40, sulfated glycoprotein 2 testosterone repressed prostate message 2, apolipoprotein J) GATCTGATTATTTACTT 279 NM_004708 952-95.3 programmed cell death. 5

GATCCTTGAAGGCAGCT 28O NM 197958 954-955 acheron GATCCCTTTTCTTACTA 281 NM 153713 956-957 hypothetical protein MGC 46719

GATCTGTCCACTTCTGG 282 NM 153713 956-957 hypothetical protein MGC 46719

GATCAGATACCACCAAG 283 NM_001001503 958-959 NADH dehydrogenase (ubiquinone) flavoprotein 3, 10kDa GATCCTTTGGATTAATC 284 NM 016138 96.O-961 coenzyme Q7 homolog, ubiquinone (yeast)

GATCAT TATTTCTGTCT 285 NM_018184 962-963 ADP-ribosylation factor-like 10C

GATCAGCCCT CAAAGAA 286 NM_018184 962-963 ADP-ribosylation factor-like 10C

GATCAGCAAAAATAAAG 287 NM 016096 964 - 965 HSPCO38 protein US 2011/O 136690 A1 Jun. 9, 2011 38

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GA CTCAGCGGCATTAA 288 NM 052951 966-967 deoxynucleotidyl transferase, terminal, interacting protein 1 GA CCCTGGAGTGCCTT 289 NM_003 226 968-969 trefoil factor 3 (intestinal) GA CTGTTTCTACCAAT 29 O NM 183045 97 O-971 ring finger protein (C3H2C3 type) 6 GA CCTGCTGTGAAAGG 291 NM 153750 972-973 chromosome 21 open reading frame 81 GA CTTGAAAGTGCCTG 292 NM_022130 974 - 975 golgi phosphoprotein 3 (coat-protein) GA CAATACAATAACAA 293 NM_003479 976-977 protein tyrosine phosphatase type WA, member 2 GA CTCCTATGAGAACA 294 NM_003479 976-977 protein tyrosine phosphatase type WA, member 2 GA CAATACAATAACAA 295 NM 080391 978-979 protein tyrosine phosphatase type WA, member 2 GA CTCCTATGAGAACA 296 NM 080391 978-979 protein tyrosine phosphatase type WA, member 2 GA CCAACCCTGTACTG 297 NM 177969 980-981 protein phosphatase B (formerly 2C), magnesium dependent, beta isoform

GA CTCTACCATTTAAT 298 NM_001017 982-983 ribosomal protein S13

GA CCAGAAATACTTAA 299 NM_005410 984-985 selenoprotein P, plasma, 1

GA CCAATGCTAAACTC NM_005410 984-985 selenoprotein P, plasma, 1 GA CAAATGAGAATAAA 3O1 NM 182620 986-987 family with sequence similarity 33, member A.

GA CCTTGCCACAAGAA NM_004034 988-989 annexin A7 GA CAGACTGTATTAAA NM 032051 990-991 zinc finger protein 278

GA CTCCCACTGCGGGG NM 032051 990-991 zinc finger protein 278

GA CGGCCCTACTAGAT 305 NM 032051 990-991 zinc finger protein 278

GA CAAAAAGCAAGCAG 3 O 6 NM_015972 992-993 polymerase (RNA) 1 polypeptide D, 16kDa

GA (CACTTCAGCTGCCT 3. Of NM_019007 994 - 995 armadillo repeat containing, X-linked 6 US 2011/O 136690 A1 Jun. 9, 2011 39

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCACCGACTGAAAAT NM 002165 996-997 inhibitor of DNA binding 1, dominant negative helix-loop helix protein

GA CAATGAAGTGAGAA 309 NM_003 094 998-999 Small nuclear ribonucleoprotein polypeptide E

GA CATCTCAGAAGTCT NM_018683 OOO-1OO1 zinc finger protein 313

GA CAGGAAGGACTTGT NM_018683 OOO-1OO1 zinc finger protein 313

GA CAT TCCCATTTCAT NM_002583 OO2-10O3 PRKC, apoptosis, WT1, regulator

GA CGCTTTCTACACTG NM_006926 OO4 - 1005 surfactant, pulmonary-associated protein A2

GA CAGTTAGCTTTTAT NM_014335 OO 6-10 Of CREBBP/EP3OO inhibitor 1

GATCAGTAGTTCAACAG NM 175061 O08-1009 juxtaposed with another zinc finger gene 1

GATCCGATAAGTTATTG NM_004707 O1O-1011 APG12 autophagy 1.2-like (S. cerevisiae)

GATCAGTGGGCACAGTT NM_006818 O12-1013 ALL1-fused gene from chromosome 1g

GATCAGTGCCAGAAGTC NM 016303 O14 - 1015 WW domain binding protein 5

GATCAGAGAAGTAAGTT NM_004871 O16-1017 golgi SNAP receptor complex member 1

GATC. TCACTTTCCCCTT NM_015373 O18-101.9 PKD2 interactor, golgi and endoplasmic reticulum associated 1.

GA CAGGCAGTTCCTGG 321 NM 213.720 O2 O-1021 chromosome 22 open reading frame 16

GA CCTTGCCACAAGAA 322 NM_001156 O22 - 1023 annexin A7 GA CAAGAAAAATAAGG 323 NM_000999 O24 - 1025 ribosomal protein L38

GA CGATTTCTTTCCTC. 324 NM 021102 O2 6-1027 serine protease inhibitor, Kunitz type, 2

GA CATAGAAGGCTATT 3.25 NM 181826 O28-1029 neurofibromin 2 (bilateral acoustic neuroma)

GA CCGGTGCGCCATGT 326 NM 002638 O3 O1 O31 protease inhibitor 3, skin-derived (SKALP) US 2011/O 136690 A1 Jun. 9, 2011 40

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCGCAGTTTGGAAAC 327 NM_005461 1O32- 1033 w-malf musculoaponeurotic fibrosarcoma oncogene homolog B (avian) GATCAATTTCAAACCCT 328 NM_005461 1O32- 1033 w-malf musculoaponeurotic fibrosarcoma oncogene homolog B (avian) GA CTCCTATGAGAACA 329 NM 080392 O34 - 1035 protein tyrosine phosphatase type IWA member 2 GA CAATACAATAACAA 33 O NM 080392 O34 - 1035 protein tyrosine phosphatase type IWA member 2 GA CCTACCACC TACTG 331 NM_018281 O3 6-1037 hypothetical protein FLJ10948

GA CATTTGTTTATTAA 332 NM_004763 O38-1039 integrin beta 1 binding protein 1

GA CAAAATGCTAAAAC 333 NM 178191 O4 O-1041 ATPase inhibitory factor 1

GA CTGGGGTGGGAGTA 334 NM_002773 O42-1043 protease, serine, 8 (prostasin)

GA CATGCTTGTGTGAG 335 NM_018648 O44- 1045 nucleolar protein family A, member 3 (H/ACA small nucleolar RNPs)

GA CAAATATGTAAAAT 336 NM 1386.33 O4 6-104.7 A kinase (PRKA) anchor protein 7

GA CAGACTTCTCAGCT 337 NM_006856 O48-1049 activating transcription factor 7

GA CATAGAAGGCTATT 338 NM 181827 OsO-1051 neurofibromin 2 (bilateral acoustic neuroma)

GA CCACCTTGGCC. TCC 339 NM_003.842 O52-1053 tumor necrosis factor receptor superfamily, member 1 Oo

GA CTCTGGCCCCTCAG 34 O NM 198527 O54 - 1055 Similar to RIKEN cDNA 111OO33OO9 gene

GA CCTCATTGAGCCAC 341 NM_024866 O56-105.7 adrenomedull in 2 GA CCAGTGGGGTCCGG 342 NM_002475 O58 - 1059 myosin light chain 1 slow a

GA CATTTTGTATTAAT 343 NM 017544 O6 O-1061 NF-kappa B repressing factor

GA CAGAAAAAGAAAGA 344 NM_000982 O62-1063 ribosomal protein L21

GA CCTGTTCCTGT CAC 345 NM 203413 O64 - 1065 S-phase 2 protein US 2011/O 136690 A1 Jun. 9, 2011 41

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GA CATGGTTCTCTTTG 346 NM 000202 O66-1067 iduronate 2-sulfatase (Hunter syndrome) GA CCTCTGACCGCTGG 347 NM_022365 O68 - 1069 DnaJ (Hsp40) homolog, subfamily C, member 1 GA CTGCTATTGCCAGC 348 NM 016399 Of O-1071 hypothetical protein HSPC132

GA CCTGGAAATTGCAG 349 NM_001233 O72- 1073 caveolin 2 GA CAGTCTCAAGTGTC 350 NM_003702 O74 - 1075 regulator of G protein signalling 2 O GA CAGGTTAGCAAATG 351 NM 004331 O76- 1077 BCL2/adenovirus EB 19kDa interacting protein 3 like

GATCAGTATGCTGTTTT 352 NM_004968 1078 - 1079 islet cell autoantigen 1, 69kDa GATCTGGTTTCTAGCAA 353 NM 024096 1080-1081 XTP3-transactivated protein. A

GATCTAATTAAATAAAT 3.54 NM_000903 1082-1083 NAD(P)H dehydrogenase, quinone 1 GATCCTGGGTTTTTGTG 355 NM_017830 1084-1085 OCIA domain containing 1 GATCACCGACTGAAAAT 356 NM 181353 1086 - 1087 inhibitor of DNA binding 1, dominant negative helix-loop helix protein GATCAGGTAACCAGAGC 357 NM_002488 1088 - 1089 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 2, 8kDa GA CAGTGAACACTAAC 358 NM 016645 O9 O-1091 mesenchymal stem cell protein DSC92 GA CTCAGATGCTAGAA 359 NM_016567 O92-1093 BRCA2 and CDKN1A interacting protein

GA CGCTCTGCCCATGT 360 NM_016567 O92-1093 BRCA2 and CDKN1A interacting protein

GA CAGCTCCGTGGGGC 361 NM 152398 O94 - 1095 OCIA domain containing 2 GA CATTGCCCAAAGTT 362 NM 152398 O94 - 1095 OCIA domain containing 2 GA CTGGCACTGTGGTT 363 NM_000998 O96-1097 ribosomal protein L37a

GA CTGGCACTGTGGGT 364 NM_000998 O96-1097 ribosomal protein L37a

GA CTCAGATGCTAGAA 365 NM_078468 O98-1099 BRCA2 and CDKN1A interacting protein US 2011/O 136690 A1 Jun. 9, 2011 42

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCGCTCTGCCCATGT 366 NM_078468 1098-1099 BRCA2 and CDKN1A interacting protein

GATCTGCTGTGGAATTG 367 NM 172316 11 OO-1101 Meis1, myeloid ecotropic viral integration site 1 homolog 2 (mouse) GA CGTTCTTGATTTTG 368 NM 032476 O2 - 11 O3 mitochondrial ribosomal protein S6 GA CTTGGTTTCATGTG 369 NM 032476 O2 - 11 O3 mitochondrial ribosomal protein S6 GA CATTCTTGATTTTG 37 O NM 032476 O2 - 11 O3 mitochondrial ribosomal protein S6 GA CCATATGGAAAGAA 3.71 NM 014171 O4 - 11 O5 postsynaptic protein CRIPT

GA CTGCCCCCACTGTC 372 NM 138929 O6-11. Of diablo homolog (Drosophila)

GA CGCCTAGTATGTTC 373 NM 052815 O8-1109 immediate early response 3 GA CAATGCTAATATGA 374 NM_005805 1O-1111 proteasome (prosome, macropain) 26S subunit, non-ATPase, 14

GATCAGCATCAGGCTGT 375 NM 012459 1112-1113 translocase of inner mitochondrial membrane 8 homolog B (yeast) GA CTGGAAGTGAAACA 376 NM 134265 14 - 1115 WD repeat and SOCS box- containing 1 GA CCACGTGTGAGGGA 377 NM 182640 16 - 1117 mitochondrial ribosomal protein S9 GA CACAGAAAAATTAA 378 NM 182640 16 - 1117 mitochondrial ribosomal protein S9 GA CTCTCTGCGTTTGA 379 NM_012.445 18 - 1119 spondin 2, extracellular matrix protein

GA CTCAGAAGTTTTGA NM 138459 2 O-1121 chromosome 6 open reading frame 68 GA CCGGACTTTTTAAA 381 NM_006339 22-1123 high-mobility group 2 OB

GA CATAGTTATTATAC 382 NM 032467 24 - 1125 aspartate beta hydroxylase

GA CCTGCCCTGCTCTC 383 NM_003145 26-1127 signal sequence receptor, beta (translocon associated protein beta) GATCGATTGAGAAGTTA 384 NM 012110 1128-1129 cysteine-rich hydrophobic domain 2 US 2011/O 136690 A1 Jun. 9, 2011 43

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCCAAGTACTCTCTC 385 NM 175081 3 O-1131. purinergic receptor P2X, ligand-gated ion channel 5 GATCATACACCTGCTCA 386 NM_001009 32-1133 ribosomal protein S5 GATCCTGGATGCCACGA 387 NM 174889 34 - 1135 hypothetical protein LOC91.942

GATCCCTGCCACAAGTT 388 NM_006923 36 - 1137 Stromal cell-derived factor 2

GATCAGACGAGGCCATG 389 NM_006 107 38-1139 cisplatin resistance associated overexpressed protein

GATCTTTCAGGAAAGAC 390 NM_033011 4 O-1141 plasminogen activator, tissue GATCTTTTAAAAATATA 391 NM_001914 42-1143 cytochrome b-5 GATCGTTTTGTTTTGTT 392 NM 021149 44 - 1145 coactosin-like 1 (Dictyostelium)

GATCTATGGCCTCTGGT 393 NM 021643 4 6-1147 tribbles homolog 2 (Drosophila) GATCCTAAATCATTTTG 394 NM_022.783 48 - 1149 DEP domain containing 6 GATCTAAGAAGAAACTA. 395 NM_005765 5 O-1151 ATPase, H+ transporting, lysosomal accessory protein 2

GATCTTGGTGTTCAAAA 396 NM_001497 1152-1153 UDP Gal: betaClcNAc beta 1, 4 galactosyltransferase, polypeptide 1 GATCCCTCATCCCTGAA 397 NM_006.793 1154 - 1155 peroxiredoxin 3 GATCTGCAGTGCTTCAC 398 NM 178181. 1156-1157 CUB domain containing protein 1 GATCTATGCCCTTGTTA 399 NM_0331.67 1158-1159 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3

GATCTATGCCCTTGTTA 4 OO NM_033169 116 O-1161. UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3

GATCAGTTTATTATTGA NM_033169 116 O-1161. UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3 GATCTATGCCCTTGTTA NM_0331.68 1162-1163 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3 US 2011/O 136690 A1 Jun. 9, 2011 44

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCAGTTTATTATTGA NM_0331.67 1158-1159 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3 GATCTATGCCCTTGTTA NM_003 781 1164 - 1165 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3

GATCAGTTTATTATTGA 405 NM_003 781 1164 - 1165 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3 GATCAGTTTATTATTGA NM_0331.68 1162-1163 UDP Gal: betaClcNAc beta 1,3- galactosyltransferase, polypeptide 3

GA CGAGTCAAGATGAG 4. Of NM 013442 66 - 1167 stomatin (EPB72) - like 2

GA CACCATGATGCAGA 408 NM 03.1905 68-1169 SVH protein GA CCCGTGTGTGTGTG NM 03.1905 68-1169 SVH protein GA CATGGTTCTCTTTG NM_006123 7 O-1171 iduronate 2-sulfatase (Hunter syndrome) GA CCGCAGGCAGAAGC NM_002775 72- 1173 protease, serine, 11 (IGF binding)

GA CGATGGTGGCGCTT NM 138991 74 - 11.75 beta- Site APP cleaving enzyme 2

GA CTGCATCAGTTGTA NM_001814 76-1177 cathepsin C GA CTCTACTACCACAA NM_001908 78-1179 cathepsin B GA CTCTACTACCACAA NM 147780 8 O-1181 cathepsin B GA CTCTACTACCACAA NM 147781 82-1183 cathepsin B GA CTCTACTACCACAA NM 147782 84 - 11.85 cathepsin B GA CTCTACTACCACAA NM 147783 86 - 1187 cathepsin B GA CGATGGTGGCGCTT NM 012105 88 - 11.89 beta- Site APP cleaving enzyme 2

GA CTTTCAGGAAAGAC NM_000931 90- 1.191 plasminogen activator, tissue

GA CAAATTGCAAAATA 421 NM 153705 92-1193 KDEL (Lys-Asp Glu-Leu) containing 2

GA CTTATTTTCTGAGA 422 NM 014584 94 - 11.95 ERO1-like (S. cerevisiae

GA CCACAAGGCCTGAG 423 NM_001185 96-1197 alpha-2-glycoprotein 1, Zinc US 2011/O 136690 A1 Jun. 9, 2011 45

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCTAGGCCTCATCTT 424 NM 016352 1198-11.99 carboxypeptidase A4 GATCCCTTTGAAATTTT 425 NM_0012.19 12 OO-12 O1 calumenin GATCTACAACATATAAA 426 NM_020648 12 O2- 12 O3 twisted gastrulation homolog 1 (Drosophila)

GATCAGTTTTTTCACCT 427 NM_001901 1204 - 1205 connective tissue growth factor GATCACAGTGTCAGAGA 428 NM_007224 neurexophilin 4 GATCGTTACTATGTGTC 429 NM_004541 1208 - 1209 NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 1, 7.5kDa.

GATCATTGACCTCTGTG 43 O NM_006.459 1210-1211 SPFH domain family, member 1

GATCTGAAGCCCAGGTT 431 NM_024514 1212-1213 cytochrome P450, family 2, subfamily R, polypeptide 1

GATCTGTTAAAAAAAAA 432 NM 147159 1214-1215 opioid receptor, sigma 1.

GATCTTTCAGGAAAGAC 433 NM_000930 1216-1217 plasminogen activator, tissue GATCATAAGACAATGGA 434 NM_001657 1218-1219 amphiregulin (schwannoma derived growth factor) GATCAGTCTTTATTAAT 435 NM_013995 122 O-1221 lysosomal-associated membrane protein 2 GATCCAGGCTCACTGTG 436 NM_005 250 1222 - 1223 forkhead box L1 GATCAAATAATGCGACG 437 NM_018.064 1224-1225 chromosome 6 open reading frame 166 GATCTTGGTTTTCCATG 438 NM_003000 1226-1227 succinate dehydrogenase complex, subunit B, iron sulfur (Ip) GATCTGTTAGTCAAGTG 439 NM_005313 1228-1229 glucose regulated protein, 58kDa GATCATTTCTGGTAAAT 44 O NM_005313 1228-1229 glucose regulated protein, 58kDa GATCAAAGCACTCTTCC 441 NM_005313 1228-1229 glucose regulated protein, 58kDa GATCATGCCAAGTGGTG 442 NM 021233 123 O-1231 deoxyribonuclease II beta

GATCATCGCCTCCCTGG 443 NM_006216 1232 - 1233 serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2 US 2011/O 136690 A1 Jun. 9, 2011 46

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GA CACCAGGCTGCCCA 444 NM_016504 234 - 1235 mitochondrial ribosomal protein L27

GA CGGATGGGCAAGTC 445 NM 002178 236 - 1237 insulin-like growth factor binding protein 6

GA CTCAAGACCAAAGA 446 NM 030810 238 - 1239 thioredoxin domain containing 5 GA CTCACATTGTGCCC 447 NM_014254 24 O-1241 transmembrane protein 5

GA CAGTCTTTATTAAT 4 48 NM_002294 242 - 1243 lysosomal-associated membrane protein 2 GA CAGAGAAGATGATA 449 NM 000640 244 - 1245 interleukin 13 receptor, alpha 2 GA CAGGTAACCAGAGC 450 NM 000591 246 - 1247 CD14 antigen GA CATCAGTAAATTTG 451 NM 031284 248 - 1249 ADP-dependent glucokinase

GA CAATAAAATGTGAT 452 NM 002658 25 O-1251 plasminogen activator urokinase GA CCC. TCGGGTTTTGT 453 NM_006.350 252 - 1253 folli statin GA CTTGCAACTCCATT 454 NM_006.350 252 - 1253 folli statin GA CCAGCATGGAGGCC 45.5 NM_018664 254 - 1255 Jun dimerization protein p21 SNFT GA CATTGTGAAGGCAG 456 NM_001511 25 6-1257 chemokine (C-X- C motif) ligand 1 (melanoma growth stimulating activity, alpha)

GATCTGCCAGCAGTGTT 457 NM_002004 1258-1259 fairnesyl diphosphate synthase (farnesyl pyrophosphate synthetase, dimethylallyl transtransferase, geranyltrans trans ferase) GATCAGAGGTTACTAGG 458 NM_0064.08 126 O-1261 anterior gradient 2 homolog (Xenopus laevis) GATCCACAGGGGTGGTG 459 NM 000602 1262-1263 serine (or cysteine) proteinase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 1 GATCACAAGGGGGGGAT 460 NM_016588 1264 - 1265 neuritin 1 GATCTCTGTTTTGACTA 461 NM_004 109 1266-1267 ferredoxin 1 GATCTAACCTGGCTTGT 462 NM_004 109 1266-1267 ferredoxin 1 GATCAGCAAGTGTCCTT 463 NM_000935 1268- 1269 pro collagen-lysine, 2 oxoglutarate 5 dioxygenase 2 US 2011/O 136690 A1 Jun. 9, 2011 47

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCTAGTGGTTCACAC 464 0.03236 27 O - 1271 transforming growth factor, alpha

GATCAAACAGTTTCTGG 465 016139 272 - 1273 coiled-coil-helix coiled-coil-helix domain containing 2

GATCATCAAGAAAAAAG 466 O18464 274 - 1275 chromosome 10 open reading frame 70

GATCCCAGAGAGCAGCT 467 OO2421 276 - 1277 matrix metalloproteinase 1 (interstitial collagenase)

GATCTTGTGTATTTTTG 468 O2O440 278 - 1279 prostaglandin F2 receptor negative regulator

GATCTATGTTCTCTCAG 469 0133.63 280 - 1281 pro collagen C endopeptidase enhancer 2

GATCAGCAAGTGTCCTT 47 O 182943 282 - 1283 pro collagen-lysine, 2 oxoglutarate 5 dioxygenase 2

GATCATGTGCTACTGGT 471 0.03172 284 - 1285 surfeit 1

GATCTGTAAATAAAATC 472 130781 286 - 1287 RAB24 member RAS oncogene family

GATCAGGGCTGAGGGTA 473 000157 288 - 1289 glucosidase, beta; acid (includes glucosylceramidase)

GATCCTCC TATGTTGTT 474 0.05551 29 O - 1291 kallikrein 2, prostatic

GATCAGAGATGCACCAC 47s 002.997 292 - 1293 syndecan 1

GATCTGTCTGTTGCTTG 476 OO5570 294 - 1295 lectin, mannose binding, 1.

GATCACCATGAAAGAAG 477 0.03873 296 - 1297 neuropilin 1

GATCTGTTAAAAAAAAA 478 0.05866 298 - 1299 opioid receptor, sigma 1.

GATCAATTCCCTTGAAT 479 138322 3OO - 13 O1 proprotein convertase Subtilisin/kexin type 6

GATCCCAGACCAACCCT 48O O24642 3O2-13 O3 UDP-N-acetyl-alpha D galactosamine: polypeptide N acetylgalactosaminyl transferase 12 (GalNAc-T12)

GATCATCACAGTTTGAG 481 NM_002.425 1304 - 1305 matrix metalloproteinase 10 (stromelys in 2)

GATCGGAACAGCTCCTT 482 NM 178154 13 O6-13. Of fucosyltransferase 8 (alpha (1, 6) fucosyltransferase) US 2011/O 136690 A1 Jun. 9, 2011 48

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCGGAACAGCTCCTT 483 NM 178155 3O8 - 13 O9 fucosyltransferase 8 (alpha (1, 6) fucosyltransferase)

GATCGGAACAGCTCCTT 484 NM 178156 31 O - 1311 fucosyltransferase 8 (alpha (1, 6) fucosyltransferase)

GATCTGTGGGCCCAGTC 485 NM_004077 312-1313 citrate synthase GATCAACCTTAAAGGAA 486 NM_000143 314 - 1315 fumarate hydratase GATCTTCTACTTGCCTG 487 NM_000302 316 - 1317 pro collagen-lysine 1, 2-oxoglutarate 5 dioxygenase 1

GATCACCAGCCATGTGC 488 NM 004390 318-1319 cathepsin H GATCACCGGAGGTCAGT 489 NM 016026 320-1321 retinol dehydrogenase 11 (all-trans and 9-cis)

GATCTATTTTATGCATG 490 NM_020 792 322 - 1323 KIAA1363 protein GATCTGTTAAAAAAAAA 491 NM 147157 324 - 1325 opioid receptor, sigma 1

GATCATTTTGGTTCGTG 492 NM 016417 326-1327 chromosome 14 open reading frame 87

GATCACTTGTGTACGAA 493 NM_024641 328-1329 mannosidase, endo alpha

GATCCCTCCACCCCCAT 494 NM_001441 330-1331 fatty acid amide hydrolase

GATCCAAAGTCATGTGT 495 NM_058172 332 - 1333 anthrax toxin receptor 2

GATCCATAAATATTTAT 496 NM_058172 332 - 1333 anthrax toxin receptor 2

GATCTGCCTGCATCCTG 497 NM_003225 334-1335 trefoil factor 1 (breast cancer, estrogen inducible sequence expressed in)

GATCCAGTGTCCATGGA 498 NM_007085 1336 -1337 folli statin-like 1 GATCAATTCCCTTGAAT 499 NM 138324 1338-1339 proprotein convertase Subtilisin/kexin type 6

GATCCGTGTGCTTGGGC SOO NM_018143 134 O-1341 kelch-like 11 (Drosophila)

GATCCAGGGTCCCCCAG 5O1 NM_004911 1342-1343 protein disulfide isomerase related protein (calcium binding protein, intestinal-related)

GATCATGGGACCCTCTC 5O2 NM_003 032 1344 - 1345 sialyltransferase 1 (beta galactoside alpha-2, 6– sialyltransferase) US 2011/O 136690 A1 Jun. 9, 2011 49

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCATGGGACCCTCTC SO3 NM 173216 1346-1347 sialyltransferase 1 (beta-galactoside alpha-2, 6– sialyltransferase)

GATCTCACTGTTATTAT SO4 NM_007 115 348- 1349 tumor necrosis factor, alpha-induced protein 6

GATCCTGTATCCAAATC 505 NM_007 115 348- 1349 tumor necrosis factor, alpha-induced protein 6

GATCAGTTTTCTCTTAA SO 6 NM_024769 350 - 1351 adipocyte-specific adhesion molecule

GATCTACCAGATAACCT NM 000522 352 - 1353 homeo box Al3 GATCCTAGTAATTGCCT 508 NM 054034 354 - 1355 fibronectin 1 GATCAATGCAACGACGT 509 NM_006833 356-1357 COP9 constitutive photomorphogenic homolog subunit 6 (Arabidopsis) GATCAATTCCCTTGAAT NM 138325 358 - 1359 proprotein convertase Subtilisin/kexin type 6

GATCAATTCCCTTGAAT NM 138323 360 - 13 61 proprotein convertase Subtilisin/kexin type 6

GATCCCAGAGGGATGCA NM 024040 362 - 1363 CUE domain containing 2

GATCATCAAAAATGCTA NM_017898 364 - 1365 hypothetical protein FLJ2O605

GATCCCTCGGGTTTTGT NM_013 409 366 - 1367 folli statin GATCTTGCAACTCCATT NM_013 409 366 - 1367 folli statin GATCTTGTTAATGCATT NM_001873 368 - 1369 carboxypeptidase E GATCAAAGGTTTAAAGT NM_001627 370- 1371 activated leukocyte cell adhesion molecule

GATCACCAAGATGCTTC NM_018371 372 - 1373 chondroit in beta1, 4 N acetylgalactosaminyl transferase

GATCAAATGTGCCTTAA 519 NM_014918 1374 - 1375 carbohydrate (chondroitin) synthase 1

GATCTTCGGCCTCATTC NM_017860 1376- 1377 hypothetical protein FLJ2O519

GATCCCTTCTGCCCTGG 521 NM_022367 1378- 1379 Sema domain, immunoglobulin domain (Ig) , transmembrane domain (TM) and short cytoplasmic domain, (semaphorin) 4A US 2011/O 136690 A1 Jun. 9, 2011 50

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCCAACCGACTGAAT 522 NM_006670 1380 - 1381 trophoblast glycoprotein

GATCTCTGCAGATGCCA 523 NM_004750 1382 - 1383 cytokine receptor-like factor 1

GATCACAAAATGTTGCC 524 NM_001077 1384 - 1385 UDP glycosyltransferase 2 family, polypeptide B17

GATCTCTCTTTCTCTCT 525 NM 031882 386 - 1387 protocadherin alpha subfamily C, 1 GATCTCTCTTTCTCTCT 526 NM 031860 388-1389 protocadherin alpha 1 O

GATCTCTCTTTCTCTCT 27 NM_018906 390 - 1391 protocadherin alpha 3 GATCTCTCTTTCTCTCT 528 NM 031411 392 - 1393 protocadherin alpha 1 GATCACAGGCGTGAGCT 529 NM 032620 394 - 1395 GTP binding protein 3 (mitochondrial)

GATCAACATCTTTTCTT 53 O NM 004343 396-1397 calreticulin GATCTCTGATTTAACCG 531 NM 002185 398-1399 interleukin 7 receptor GATCTCTCTTTCTCTCT 532 NM 031497 4 OO-14 O1 protocadherin alpha 3 GATCCATTTTTAATGGT 533 NM 198278 4 O2- 14 O3 hypothetical protein LOC255743

GATCTTTTCTAAATGTT 534 NM_005699 4O4 - 14 O5 interleukin 18 binding protein

GATCTCTCTTTCTCTCT 535 NM 031410 4O6 - 14 Of protocadherin alpha 1 GATCGGTGCGTTCTCCT 536 NM_005561 408 - 14 O9 lysosomal-associated membrane protein 1

GATCTTTTCTAAATGTT s37 NM 173042 41 O-1411 interleukin 18 binding protein

GATCTTTTCTAAATGTT 538 NM 173043 412-1413 interleukin 18 binding protein

GATCTCTCTTTCTCTCT 539 NM 031496 414 - 1415 protocadherin alpha 2 GATCCTGTTGGATGTGA 54 O NM 080927 416 - 1417 discoidin, CUB and LCCL domain containing 2

GATCTCTCTTTCTCTCT 541 NM 031864 418 - 1419 protocadherin alpha 12

GATCTCTCTTTCTCTCT 542 NM 031849 42 O-1421 protocadherin alpha 6 GATCCTGTGCTTCTGCA 543 NM_006.464 422-1423 trans-golgi network protein 2

GATCTCTCTTTCTCTCT 544 NM 031865 424 - 1425 protocadherin alpha 13

GATCTGATGAAGTATAT 545 NM_022746 426 - 1427 hypothetical protein FLJ2239 O. US 2011/O 136690 A1 Jun. 9, 2011 51

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCACTTGTCTTGTGG 546 NM_006.988 1428-1429 a disintegrin-like and metalloprotease (reprolysin type) with thrombospondin type 1 motif, 1 GATCTTTTCTAAATGTT 547 NM 173044 43 O-1431 interleukin 18 binding protein

GATCTCTCTTTCTCTCT 548 NM 031856 432-1433 protocadherin alpha 8 GATCTCTCTTTCTCTCT 549 NM 031500 434 - 1435 protocadherin alpha 4 GATCAGCACTGCCAGTG 550 NM_016592 436-1437 GNAS complex locus GATCCGGAAAGATGAAT 551 NM 144640 438 - 1439 interleukin 17 receptor E GATCTCTCTTTCTCTCT 552 NM 031501 44 O-1441 protocadherin alpha 5 GATCTCTCTTTCTCTCT 553 NM 031495 442-1443 protocadherin alpha 2 GATCTAATGTAAAATCC 554 NM 002354 444 - 1445 tumor-associated calcium signal transducer 1

GATCTTCTTTTGTAATG 555 NM_032780 446 - 1447 transmembrane protein 25 GATCAATAATAATGAGG 556 NM_001001390 448- 1449 CD44 antigen (homing function and Indian blood group system)

GATCAACAGTGGCAATG sist NM_001001390 1448- 1449 CD44 antigen (homing function and Indian blood group system)

GATCAACAGTGGCAATG 558 NM_001001391 145 O-1451 CD44 antigen (homing function and Indian blood group system)

GATCAATAATAATGAGG 559 NM_001001391 145 O-1451 CD44 antigen (homing function and Indian blood group system)

GATCATTGCTCCTTCTC 560 NM_004872 452-1453 chromosome 1 open reading frame 8 GATCTCTGCATTTTATA 561 NM_020198 454 - 1455 GKOO1 protein GATCTATGAAATCTGTG 562 NM_020198 454 - 1455 GKOO1 protein GATCTCTCTTTCTCTCT 563 NM_018901 45. 6- 1457 protocadherin alpha 1 O

GATCACTGGAGCTGTGG 564 NM 002116 458 - 1459 major his to compatibility complex, class I, A GATCATCCAGTTTGCTT 565 NM_004540 46 O-1461 neural cell adhesion molecule 2

GATCAAAATTGTTACCC 566 NM_004540 46 O-1461 neural cell adhesion molecule 2 US 2011/O 136690 A1 Jun. 9, 2011 52

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description GATCAACAGTGGCAATG 567 NM_001001389 1462 - 1463 CD44 antigen (homing function and Indian blood group system)

GATCAATAATAATGAGG 568 NM_001001389 1462 - 1463 CD44 antigen (homing function and Indian blood group system)

GATCAACAGTGGCAATG 569 NM 000610 1464 - 1465 CD44 antigen (homing function and Indian blood group system)

GATCAATAATAATGAGG st O NM 000610 1464 - 1465 CD44 antigen (homing function and Indian blood group system)

GATCCATACTGTTTGGA sf1 NM_001792 1466 - 14 67 cadherin 2, type 1, N cadherin (neuronal) GATCTGCATTTTCAGAA NM_015544 1468- 1469 DKFZP564K1964 protein GATCCCATTTTTTGGTA NM_000574 1470 - 1471 decay accelerating factor for complement (CD55, Cromer blood group system)

GA CTGCAGTGCTTCAC NM_022842 472 473 CUB domain containing protein 1 GA CTGTTAAAAAAAAA sts NM 147160 474 - opioid receptor, sigma 1. GA CATAGGTCTGGACA 576 NM 014045 476 477 low density lipoprotein receptor related protein 10 GA (CTAATACTACTGTC 577 NM_001110 478 479 a disintegrin and metalloproteinase domain 10

GA CTCTTGAGGCTGGG NM 016371 481 hydroxysteroid (17 beta) dehydrogenase 7

GA CGTTCATTGCCTTT st 9 NM_001746 482 483 calnexin GA CTCTCTTTCTCTCT 58 O NM_018900 484 - 485 protocadherin alpha 1 GA CTGACCTGGTGAGA 581 NM 004393 486 487 dystroglycan 1 (dystrophin associated glycoprotein 1)

GATCATCTTTCCTGTTC 582 NM 002117 1488-1489 major his to compatibility complex, class I, C GATCGTAAAATTTTAAG 583 NM_003816 1490 - 1491 a disintegrin and metalloproteinase domain 9 (meltrin gamma)

GATCTCTCTTTCTCTCT 584 NM_018904 1492 - 1493 protocadherin alpha 13 US 2011/O 136690 A1 Jun. 9, 2011 53

TABLE 3 - continued

DIFFERENTIALLY EXPRESSED GENES THAT ENCODE PREDICTED SECRETED PROTEINS.

SEQ ID Accession SEQ ID Signature NO : Number NOS : Description

GATCTCTCTTTCTCTCT 585 NM_018911 494-1495 protocadherin alpha 8

GATCTCTCTTTCTCTCT 586 NM_018.905 496 - 1497 protocadherin alpha 2

GATCTCTCTTTCTCTCT 587 NM_018903 498 - 1499 protocadherin alpha 12

GATCTCTCTTTCTCTCT 588 NM_018907 500 - 15 O1. protocadherin alpha 4

GATCTCTCTTTCTCTCT 589 NM_018908 5O2-1503 protocadherin alpha 5

GATCCGGAAAGATGAAT 590 NM 153480 SO4 - 15 OS interleukin 17 receptor E

GATCCGGAAAGATGAAT 591 NM 153483 506-15. Of interleukin 17 receptor E

GATCTCTGTAATTTTAT 592 NM_021923 508-1509 fibroblast growth factor receptor-like 1

GATCTAAGAGATTAATA 593 NM 004362 510-1511 calmegin

Example 2 tein in combination with PSA. The WDR19 prostate-specific Identification of Secreted Proteins by Computational protein is diagnostically superior to PSA when used alone and Analysis of MPSS Signature Sequences further improved prostate cancer detection when used incom 0146 Secreted proteins can readily be exploited for blood bination with PSA. cancer diagnosis and prognosis. As such, the differentially 0149 WDR19 was previously identified as relatively tis expressed genes identified in Example 1 were further ana sue-specific by cDNA array studies and Northern blot analy lyzed to determine how many of the differentially expressed sis (see e.g., U.S. Patent Application Publication No. genes encode secreted proteins. Proteins with signal peptides 20020150893). This protein was selected, expressed as pro (classical secretory proteins) were predicted using the same tein, purified and antibodies were made against it, all using criteria described by Chen et al., Mamm Genome, 14: 859 865, 2003, with the SignalP 3.0 server developed by The standard techniques known in the art (the cDNA encoding the Center for Biological Sequence Analysis, Lyngby, Denmark WDR19 protein is provided in SEQID NO:1, the amino acid (httpcolon double slash www dot cbs dot dtu dot dk/services/ sequence is provided in SEQID NO:2). The WDR19-specific SignalP-3.0; see also, J. D. Bendtsen, et at, J. Mol. Biol., antibody was shown to be an excellent tissue-specific marker 340:783-795, 2004.) and the TMHMM2.0 server (see for of prostate cancer with staining of the specific epithelial cells example A. Krogh, et al., Journal of Molecular Biology, 305 being directly proportional to the progression of the cancer. In (3):567-580, January 2001; E. L. L. Sonnhammer, et al. In J. this regard it is very different from the well-established PSA Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and marker which is not a good prostate tissue cancer marker. C. Sensen, editors, Proceedings of the Sixth International 0150. The WDR19 antibodies and those for the well-es Conference on Intelligent Systems for Molecular Biology, tablished PSA prostate cancer blood marker were used to pages 175-182, Menlo Park, Calif., 1998. AAAI Press). Puta analyze 10 blood samples from normal individuals, 10 blood tively nonclassical secretory secreted proteins (without signal samples from early prostate cancer patients and 10 blood peptides) were predicted based on the SecretomeP 1.0 server, samples from late prostate cancer patients. The results (httpcolon double slash www dot cbs dot dtu dot dk/services/ showed that WDR19 reacted against no normals, against 5/10 SecretomeP-1.0/) and required an odds ratio score >3. early cancers, and against 5/10 late cancers, whereas PSA 0147 Five hundred and twenty one signatures belonging reacted against no normals, no early cancers and 7/10 late to 460 genes potentially encoding secreted proteins (Table 3) cancers. The two markers together detected all the late can were identified. Among these, 287 (259 genes) and 234 (201 genes) signatures were overexpressed or underexpressed in cers. Thus the multiparameter analysis of blood markers (e.g. CL1 cells compared with LNCaP cells. Thus these proteins the analyses of multiple markers) for prostate cancer was far can be used in blood diagnostics to follow prostate cancer more powerful than using each marker alone. progression. Additionally, these proteins can be used in other 0151. Accordingly, the results show a molecular blood settings, such as for identifying drug side effects. fingerprint that comprises the WDR19 and PSA proteins. This fingerprint allows superior diagnostic power to PSA Example 3 alone and further improves prostate cancer detection. Prostate Cancer Diagnostics Using Multiparameter 0152 WDR19 was also shown to be an effective his Analysis tochemical marker for prostate cancer. Two hundred and sev 0148. This example describes a multiparameter diagnostic enty-five tissue cores that contain both stromal and epithelial fingerprint using the WDR19 prostate-specific secreted pro cells from cancer patients, 17 from benign prostatic hyper US 2011/O 136690 A1 Jun. 9, 2011 54 plasia (BPH) and 12 from normal individuals were examined. Example 4 The mean WDR19 protein staining intensities were 2.52 standard error (S.E.), 0.05: 95% confidence interval (CI), Identification of Organ-Specific Secreted Proteins 2.41-2.61 for prostate cancer, 1.03 BPH (S.E. 0.03: 95% CI, Using MPSS and Computational Analysis 0.96-1.09); and 1.0 (S.E., 0, 95% CI 1.0-1.0) for normal (O153 MPSS as described in Example 1 and in the detailed individuals. Pair-wise comparisons (using independent t-test) description, was used to identify more than 2 million tran demonstrated that WDR19 staining intensity is significantly scripts from each of the prostate cell lines (see Example 1) different between prostate cancer and BPH (mean difference and in normal prostate tissue. The MPSS signature sequences 1.49; PK0.0001) and between prostate cancers and normal from normal prostate were compared against 29 other tissues (mean difference 1.52; PK0.0001). These data suggested that each with about 1 million or more mRNA transcripts. This WDR19, in addition to being a prostate-specific blood biom comparison revealed that about 300 of these transcripts are arker, is a quantitative cancer-specific marker for prostate organ-specific and about 60 of these organ-specific tran tissues. scripts are potentially secreted into the blood. (See Table 4).

TABLE 4

PROSTATE-SPECIFIC PROTEINSPOTENTIALLY SECRETED INTO BLOOD

Accession No. SEQID NO: Annotations/Description NP 001176 512 alpha-2-glycoprotein 1, zinc.; Alpha-2-glycoprotein, Zinc Homo sapiens NP OO1719 513 basigin isoform 1: OK blood group; collagenase stimulatory factor; M6 antigen; extracellular matrix metalloproteinase inducer Homo sapiens S14 basigin isoform 2: OK blood group; collagenase stimulatory factor; M6 antigen; extracellular matrix metalloproteinase inducer Homo sapiens P OO4039 515 beta-2-microglobulin precursor Homo sapiens P 002434 S16 beta-microseminoprotein isoform a precursor; seminal plasma beta-inhibin; prostate secreted seminal plasma protein; immunoglobulin binding factor; prostatic secretory protein 94 Homo sapiens NP 619540 517 beta-microseminoprotein isoform b precursor; Seminal plasma beta-inhibin; prostate secreted Seminal plasma protein; immunoglobulin binding factor; prostatic secretory protein 94 Homo sapiens P 817089 S18 cadherin-like 26 isoform a cadherin-like protein VR20 Homo sapiens P 068582 519 cadherin-like 26 isoform b. cadherin-like protein VR20 Homo sapiens P OO1864 52O carboxypeptidase E precursor Homo sapiens P OO4807 521 chromosome 9 open reading frame 61; Friedreich ataxia region gene X123 Homo sapiens P OO1271 522 cold inducible RNA binding protein; Cold-inducible RNA-binding protein; cold inducible RNA-binding protein; glycine-rich RNA binding protein Homo sapiens NP OO8977 523 elastin microfibril interfacer 1: TNF2 elastin microfibril interface located protein; elastin microfibril interface ocated protein Homo sapiens NP 004104 524 fibroblast growth factor 12 isoform 2: fibroblast growth actor 12B: fibroblast growth factor homologous factor 1; myocyte-activating factor; fibroblast growth factor FGF 2b Homo sapiens NP 005962 525 FXYD domain containing ion transport regulator 3 isoform 1 precursor; phospholemman-like protein; FXYD domain-containing ion transport regulator 3 Homo sapiens P 068710 526 FXYD domain containing ion transport regulator 3 isoform 2 precursor; phospholemman-like protein; FXYD domain-containing ion transport regulator 3 Homo sapiens P OO6352 527 homeo box B13; homeobox protein HOX-B13 Homo sapiens P 002139 528 homeo box D10; homeobox protein Hox-D10; homeo box 4D: Hox-4 P 000513 529 homeobox protein A13; homeobox protein HOXA13; homeo box 1J; transcription factor HOXA13 Homo sapiens C 060819 530 hypothetical protein FLJ11175 Homo sapiens C O78985 531 hypothetical protein FLJ14146 Homo sapiens C O61894 532 hypothetical protein FLJ20010 Homo sapiens C 115617 533 hypothetical protein FLJ23544: QM gene; DNA segment on chromosome X (unique) 648 expressed sequence; 60S ribosomal protein L10; tumor suppressor QM; Wilms tumor-related protein; laminin receptor homolog Homo sapiens P 057582 534 hypothetical protein HSPC242 Homo sapiens P 116285 535 hypothetical protein MGC14388 Homo sapiens US 2011/O 136690 A1 Jun. 9, 2011 55

TABLE 4-continued

PROSTATE-SPECIFIC PROTEINSPOTENTIALLY SECRETED INTO BLOOD

Accession No. SEQID NO: Annotations/Description NP 116293 1536 hypothetical protein MGC14433 Homo sapiens NP 077020 1537 hypothetical protein MGC4309 Homo sapiens NP O61074 1538 hypothetical protein PRO1741 Homo sapiens NP 563614 1539 hypothetical protein similar to KIAA0187 gene product Homo sapiens NP 95.1038 1540 -mfa domain-containing protein isoform p40 Homo sapiens NP 005542 1541 kallikrein 2, prostatic isoform 1: glandular kallikrein 2 Homo sapiens NP 004908 1542 kallikrein 4 preproprotein; protease, serine, 17; enamel matrix serine protease 1: kallikrein-like protein 1; protase; androgen-regulated message 1 Homo Sapiens NP 002328 1543 ow density lipoprotein receptor-related protein associated protein 1; lipoprotein receptor associated protein; alpha-2- MRAP; alpha-2-macroglobulin receptor-associated protein 1: low density lipoprotein-related protein associated protein 1: low density li NP 85907 7 544 ow density lipoprotein receptor-related protein binding protein Homo sapiens N P OOO897 545 natriuretic peptide receptor Aguanylate cyclase A (atrionatriuretic peptide receptor A); Natriuretic peptide receptor Aguanylate cyclase A Homo sapiens P 085048 S46 Nedd4 family interacting protein 1: Nedd4WW domain binding protein 5 Fono sapiens P OOO896 547 neuropeptide Y Homo sapiens P O39227 S48 olfactory receptor, family 10, Subfamily H, member 2 Homo sapiens P 000599 549 orosomucoid 2: alpha-1-acid glycoprotein, type 2 Homo sapiens P 002643 550 prolactin-induced protein; prolactin-inducible protein Homo sapiens P 057674 551 prostate androgen-regulated transcript 1 protein; prostate specific and androgen-regulated cDNA 14D7 protein Homo sapiens NP OO1639 552 prostate specific antigen isoform 1 preproprotein; gamma seminoprotein; semenogelase; seminin; P-30 antigen Homo sapiens NP 665863 553 prostate specific antigen isoform 2; gamma seminoprotein; semenogelase; seminin; P-30 antigen Homo sapiens P 001090 554 prostatic acid phosphatase precursor Homo Sapiens P OO1OOO 555 ribosomal protein S5; 40S ribosomal protein S5 Homo sapiens S P O05658 556 ring finger protein 103; Zinc finger protein expressed in cerebellum; Zinc finger protein 103 homolog (mouse) Homo sapiens C 937761 557 ring finger protein 138 isoform 2 Homo sapiens C OO2998 558 semenogelin I isoform a preproprotein Homo sapiens C 937782 559 semenogelin I isoform b preproprotein Homo sapiens C 353669 S60 similar to HIC protein isoform p32 Homo sapiens C O03855 S61 sin3 associated polypeptide p30 Homo sapiens C 036581 S62 six transmembrane epithelial antigen of the prostate; six transmembrane epithelial antigen of the prostate (NOTE: non-standard symbol and name) Homo sapiens N P OO8868 563 SMT3 Suppressor of miftwo 3 homolog 2: SMT3 (Suppressor of miftwo 3 yeast) homolog 2 Homo Sapiens NP 066568 S64 solute carrier family 15 (H+ peptide transporter), member 2 Homo sapiens NP O55394 565 solute carrier family 39 (zinc transporter), member 2 Homo sapiens NP O03209 566 telomeric repeat binding factor 1 isoform 2; Telomeric repeat binding factor 1; telomeric repeat binding protein 1 Homo sapiens NP 110437 567 thioredoxin domain containing 5 isoform 1; thioredoxin related protein; endothelial protein disulphide isomerase Homo sapiens NP 004.863 568 thymic dendritic cell-derived factor 1; liver membrane bound protein Homo sapiens NP 665694 569 TNF receptor-associated factor 4 isoform 2; tumor necrosis receptor-associated factor 4A: malignant 62; cysteine-rich domain associated with ring and TRAF domain Homo sapiens US 2011/O 136690 A1 Jun. 9, 2011 56

TABLE 4-continued

PROSTATE-SPECIFIC PROTEINSPOTENTIALLY SECRETED INTO BLOOD

Accession No. SEQID NO: Annotations/Description NP 005647 1570 transmembrane protease, serine 2: epitheliasin Homo sapiens NP OO8931 1571 uroplakin 1A Homo sapiens NP 036609 1572 WW domain binding protein 1 Homo sapiens NP OO9062 1573 Zinc finger protein 75 Homo sapiens

Example 5 encoded proteins that were potentially secreted (as deter mined by computational analysis from the sequence of the Comparison of Localized Prostate Cancer and Pros gene). These genes are as follows: (listed by gene name as tate Cancer Metastases in the Liver available through the public yeast genome database at http 0154) In an additional experiment, the transcriptome from colon double slash www dot yeastgenome dot org/. The normal prostate tissue was compared to the transcriptome of genomic DNA, cDNA and amino acid sequences correspond each of the LNCaP and CL-1 prostate cancer cell lines. The ing to each of the listed genes are publicly available, for comparison showed that the transcriptomes were distinct for example, through the yeast genome database.) YGL 102C, the normal tissue, the early prostate cancer and the late pros YGL069C, YLL044W, YMR321C, YKL153W, YMR195W, tate cancer. An additional comparison was carried out YHL015 W, YNL096C, YGR030C, YDR123C, YKL186C, between localized prostate cancer and metastases in the liver. YOR234C, YKL001C, YJL188C, YDL023C, YPL143W, About 6,000 genes were identified that were significantly YEL039C, YKL006W, YGR280C, YBR285W, YKR091W, changed between the localized prostate cancer and the metas YDR064W, YBR047W, YGR243W, YOR309C, YDR461W, tasized cancer and again, many of the changed genes encoded YHR053C, YHR055C, YGR148C, YGL187C, YIL018W, secreted proteins that can be part of the blood fingerprints YFR003C, YPL107W, YBR185C, YNR014W, YJL067W, indicative of the more advanced disease status of metastases. YDR451C, YGL031C, YHR141C, YNL162W, YBR046C, The metastases-altered blood fingerprints may indicate the YNL036W, YDL136W, YDL191W, YLR257W, YNL057W, site of metastases. YGL068W, YKRO57W, YLR201C, YHL001 W, YDRO10C, 0155 These experiments demonstrate that there are con YPL138C, YOR312C, YPL276W, YML114C, YLR327C, tinuous changes in the two types of networks as prostate YBR191W, YOR257W, YOR096W, YPL223C, YJL136C, cancer progresses—from localized to androgen indepen YAL044C, YER079W, YMR107W, YPL079W, YDR175C, dence to metastases. These graded network transitions Sug YGRO35C, YDR153C, YDR337W, YOR167C, YMR194W, gest that one will be able to detect the very earliest stages of YOR194C, YHR090C, YGR110W, YMR242C, YHR198C, prostate cancer and, accordingly, that the organ-specific, YPL177C, YLR164W, YMR143W, YDL083C, YLR325C, molecular blood fingerprints approach described herein will YOR203W, YMR193W, YLR062C, YOR383C, YLR300W, permit a very early diagnosis of prostate and other types of YJL079C, YJL158C, YHR139C, YGL032C, YER150W, cancers. These experiments further Support the notion that the YNL160W, YDR382W, YMR305C,YKL096W,YKR013W, drug-related side effects that impact the prostate can be iden YCL043C, YLR042C, YDR055W, YPL163C, YEL040W, tified and monitored using the organ-specific, molecular YJL171C, YLR121C, YDR382W, YLR250W, YGR189C, blood fingerprints. YJL159W, YMR215 W, YDR519W, YIL162W, YKL163W, YDR518W, YDR534C, YPR157W, YML130C, YML128C, Example 6 YBR092C, YDR032C, YLR120C, YBR093C, YHR215W, YAR071 W, YDL130W, YDR144C, YPR123C, YGR174C, MPSS Analysis in a Yeast Model System YOR327C, YNL058C, YGR265W, YGR160W, YIL117C, 0156 This experiment demonstrates perturbation-specific YOLO53W, YGR236C, YGR06OW, YKL120W, YDL046W, fingerprints of patterns of gene expression for nuclear, cyto YHR132C, YMR058W, YLR332W, YKRO61W, YEL001C, plasmic, membrane-bound and secreted proteins in the yeast YKL154W, YKL073W, YMR238W, YJR020W, YIL136W, metabolic system that converts the Sugar galactose into glu YHL028W, YDL010W, YLR339C, YNL217W, YHR063C. cose-6-phosphate (the gal system). 0158. The different knockout strains can be thought of as 0157. The gal systems includes 9 proteins. In the course of analogous to genetic disease mutants. Accordingly, these data studying how this systems works, 9 new strains of yeast were further Support the notion that each disease has a unique created, each with a different one of the 9 relevant genes expression fingerprint and that each disease generates unique destroyed (gene knockouts). Yeast is a single celled eukaryote collections of secreted proteins that constitute molecular fin organism with about 6,000 genes. The expression patterns of gerprints capable of identifying the corresponding disease. each of the 6,000 genes was studied in the wildtype yeast and each of the 9 knockout strains. The data from these experi Example 7 ments showed: 1) the wild type and each of knock out strains Identification of Prostate-Specific/Enriched Genes exhibited Statistically significant changes in patterns of gene Using a 2.5 Fold Overexpression Cut-Off expression from the wild type strain ranging from 89 to 465 altered patterns of gene expression; 2) each of these patterns 0159. Organ specific/enriched expression can be deter of changed gene expression were unique; and 3) on average mined by the ratio of the expression (e.g., measured in tran about 15% of the genes with changed expression patterns Scripts per million (tpm)) in a particular organ as compared to US 2011/O 136690 A1 Jun. 9, 2011 57 other organs. In this example, prostate enriched/specific cytes, peripheral blood lymphocytes, pituitary gland, pla expression was analyzed by comparing the expression level centa, pancreas, prostate, retina, spinal cord, salivary gland, (tpm counts) of MPSS signature sequences identified from normal prostate tissue to their corresponding expression lev Small intestine, stomach, spleen, testis, thymus, trachea, thy els in 33 normal tissues. A particular gene that demonstrated roid, and uterus. This analysis identified 109 unique genes at least a 2.5-fold increase in expression in prostate as com (with mpSS signature sequence belonging to class 1-4, i.e. pared to all tissues examined (each tissue evaluated individu with confirmed match to cDNAs) whose expression was at ally) was considered to be prostate-specific/enriched. The least 2.5 fold that observed in other normal tissues. The list of tissues examined were adrenal gland, bladder, bone marrow, prostate-specific/enriched genes is provided in Tables 5A-5D brain (amygdala, caudate nucleus, cerebellum, corpus callo with the expression level in tpm in prostate shown. This list Sum, hypothalamus, and thalamus), whole fetal brain, heart, includes KLK2, KLK3, KLK4, TMPRSS2, which are genes kidney, liver (new cloning), lung, mammary gland, mono previously shown to be prostate-specific.

TABL E 5A

PROSTATE ENRICHED GENES IDENTIFIED BY RATIO. SCHEMA (RATIO >2.5) k MPSS Sig. SEQ ID Genbank Genbank SEO ID Tissue Names MPSS Signature NO : Name Accession No. NOS: Description

GATCTCAGAACAACCTT 688 DHRS7 BCOOO637 797-1798 Dehydrogenase/reductase (SDR family) member 7

GATCCAGCCCAGAGACA 689 NPY BCO29497 799-18OO Neuropeptide Y

GATCACTCCTTATTTGC 690 FLJ2OO10 AW.172826 8O1 Hypothetical protein FLJ2O010

GATCCCTCTCCTCTCTG 691 C9drf 61 B1771919 8 O2 Chromosome 9 open reading frame 61

GATCTGACTTTTTACTT 692 Lirp2bp BU85.33 O6 8O3 Ankyrin repeat domain 37

GATCGTTAGCCTCATAT 693 HOXB13 BCOOf O92 804 - 1805 Homeo box B13

GATCACAAGGAATCCTG 694 CREB3L4 BCO38962 806-1807 CAMP responsive element binding protein 3-like 4

GATCTCATGGATGATTA 695 LEPREL1 BCOOSO29 808-1809 Leprecan-like 1

GATCCAGAAATAAAGTC 696 KLK4 CBO51271 810 Kallikrein 4 (prostase, enamel matrix, prostate)

GATCTCACAGAAGATGT 697 MGC355.58 NM 145013 811-1812 Chromosome 11 open reading frame 45

GATCCAAAATCACCAAG 698 HAX1 BU157155 813 HCLS1 associated protein X-1

GATCCTGGGCTGGAAGG 699 O AW2O72O6 814 Hypothetical gene supported by AY338954

GATCCAGATGCAGGACT 7 OO O BCO13389 815 LOC44 O156

GATCTGTGCTCATCTGT 7 O1 TMEM16G BCO28162 816-1817 Transmembrane protein 16G

GATCATTTTATATCAAT 7 O2 MGC31963 BXO9916O 818 Chromosome 1 open reading frame 85

GATCCACACTGAGAGAG 7 O3 KLK3 BCOO53. Of 819-182O Kallikrein 3, (prostate specific antigen)

GATCCGTCTGTGCACAT 704 TMPRSS2 NMOO5656 821-1822 Transmembrane protease, serine 2

GATCATTGTAGGGTAAC 7 Os LOC221442 BCO2 6923 823 Hypothetical protein LOC221442

GATCAGCCCTCAAAAAA 706 ARL1OC BU1598 OO 824 ADP-ribosylation factor-like 8B

GATCTGGATTCAGGACC 7 O7 MGC13102 NMO32323 825 - 1826 Hypothetical protein US 2011/O 136690 A1 Jun. 9, 2011 58

TABLE 5A- Continued

PROSTATE ENRICHED GENES IDENTIFIED BY RATIO. SCHEMA (RATIO >2.5) * MPSS Sig. SEO ID Genbank Genbank SEO ID Tissue Names MPSS Signature NO: Name Accession No. NOS: Description

GATCAAAAATAAAATGT 1708 AIS 54.252 1827 Hypothetical gene supported by AKO22914; AKO95211; BCO16035; B.CO41856; BX248778

GA CCGCTCTGGTCAAC SEPX1 BO941313 828 Selenoprotein X, 1

GA CCCTCAAGACTGGT ACPP BCOOf 460 829-1830 Acid phosphatase, prostate

GA CCACAAAGACGAGG IN3 BI911790 831 Bridging integrator 3

GA CTCTCTGCGTTTGA SPON2 BCOO27 Of 832-1833 Spondin 2, extracellular matrix protein

GA CTCAACCTCGCTTG AKO2 6938 834 Hypothetical gene supported by AL713796

GA CAAGTTCCCGCTGG RPL18A BG818587 835 Ribosomal protein L18 a

GA CATAATGAGGTTTG ABCC4 NMOO584.5 836 - 1837 ATP-binding cassette, sub family C (CFTR/MRP), member 4.

GA CGGTGACATCGTAA PS11 AA888242 838 Ribosomal protein S11

GA CCACCAGCTGATAA INSEP1 CN3 53139 839 Y box binding protein 1

GA CAACACACTTTATT J22955 AA256381 84 O Hypothetical protein FLJ22955

GA CCCTTCCTTCCTCT HOXD11 AA513505 841 Homeo box D11

GA CAGGACACAGACTT O RM1 BG564.253 842 Orosomucoid 1

GA CCTGCAATCTTGTA 721 HTPAP 843 Phosphatidic acid phosphatase type 2 domain containing 1B

GA CCTCC TATGTTGTT 722 AA259,243 84.4 Kallikrein 2, prostatic

GA CTGTACCTTGGCTA 723 AI675 682 845 Solute carrier family 2 (facilitated glucose transporter), member 12

GA CGGGGCAAGAGAGG 724 DRG1 NMOO6096 846 - 1847 N-myc downstream regulated gene 1

GA CCCCTCCCC. TCCCC 72 PR1 NMOOO906 848-1849 Natriuretic peptide receptor A/guanylate cyclase A (atrionatriuretic peptide receptor A)

GA CCTACAAAGAAGGA 726 FLJ21511 NMO25087 850 - 1851 Hypothetical protein FLJ21511

GA CATTTGCAGTTAAG 727 FOXA1 NMOO4496 852-1853 Forkhead box A1

GA CTGTCTCCTGCTCT 728 ENPP3 AIS35.878 854 Ectonucleotide pyrophosphatase/phosphodiesterase 3

GA CCTTCCCAAGGTAC 729 GATA2 NMO32 638 855 - 1856 GATA binding protein 2

GA CTTGTTGAAGTCAA 73 O ARG2 BX331427 857 Arginase, type II

GA CGCACCACTGTACA 731 XPO1 AIs 69484 858 Exportin 1 (CRM1 homolog, yeast)

GA CATTTTCTGCTTTA 732 ASB3 BCOO9569 859-1860 Ankyrin repeat and SOCS box containing 3

GA CCCCACACTTGTCC 733 AKOOOO28 861 Hypothetical LOC90 024 US 2011/O 136690 A1 Jun. 9, 2011 59

TABLE 5A Continued

PROSTATE ENRICHED GENES IDENTIFIED BY RATIO. SCHEMA (RATIO >2.5) k MPSS Sig. SEO ID Genbank Genbank SEO ID Tissue Names MPSS Signature NO: Name Accession No. NOS: Description

GA CTGGAATTGTCATA 734 KLF3 BX1 OO 634 862 Kruppel-like factor 3 (basic)

GA CAATAAGCTTTAAA 73 TGM4 BCOOf OO3 863 - 1864 Transglutaminase 4 (prostate)

GA (CAATGTTTGTAGAT 736 FLJ16231 NMOO1 OO 84 O1 865-1866 FLJ16231 protein

GA CTACATGTCTATCA 737 BLNK BX113323 867 B-cell linker

GA CTGTTTTAAATGAG 738 SLC14A1 NMO15865 868 - 1869 Solute carrier family 14 (urea transporter), member 1 (Kidd blood group)

GA CAAAAAATGCTGCA 739 PTPLB AIO17286 87 O Protein tyrosine phosphatase like (proline instead of catalytic arginine), member b

GA CATGTCTTCATTTT 740 871-1872 Olfactory receptor, family 51, subfamily E, member 2

GA CCCTCCACCCCCAT 74.1 FAAH NMOO1441 873- 1874 Fatty acid amide hydrolase

GA CCTAAGCCATAAAT 742 STAT6 ALO44554 Signal transducer and activator of transcription 6, interleukin- 4 induced

GA CATCGTCCTCATCG 743 CBO 494.66 876 Ankylosis, progressive homolog (mouse)

GA CATCATTTGTCATT 744 AWsts f47 Down syndrome critical region gene 1-like 2

GA CTAATTTGAAAAAC 74. TRPM8 878 - 1879 Transient receptor potential cation channel, subfamily M, member 8

GA CTTCCTTGTATCAT 746 TMC4 AWF24.505 Transmembrane channel-like 4

GA CTCCCCCATGCCTG 747 ZNF589 BCOO5859 881 - 1882 Zinc finger protein 589

GA CAAATTTAGTATTT 748 LRRK1 BCOO5408 883 - 1884 Leucine-rich repeat kinase 1 GA CTGCCTTATAAACA 749 STEAP2 885 Six transmembrane epithelial antigen of the prostate 2

GA CAGAAAATGAGCTC SAFB2 BCOO1216 886 Scaffold attachment factor B2

GA CACCGTGGAGGTTA 751. CPE BG7 Of 154 887 Carboxypeptidase E

GA CCC. TCTGTGCTTCT AAO24878 888 Guanline nucleotide binding protein (G protein), beta polypeptide 2-like 1

GA CTCATTTTTAGAGC 75.3 LOC92689 BU688574 889 Hypothetical protein BCOO 1096

GA CATCACATTTCGTG 754 DLG1 BCO 42118 890 Discs, large homolog 1 (Drosophila)

GA CATTTTCTGCTTCA SEMG1 891 - 1892 Semenogelin I

GA CAATGAAGGAGAGA 756 SPATA13 BM875598 893 Spermatogenesis associated 13

GA CCCAACTACTCGGG 757 LOC157657 NM 177965 894 - 1895 Chromosome 8 open reading frame 37

GA CAGTTTTTCTGTAA 758 KIAA1411 896 KIAA1411

GA CAAAATTTTAAAAA 759 BM984 931 897 5'-nucleotidase, cytosolic III-like

GA (CACCCTTCTCTTCC 760 LOC2551.89 BCO3S335 898-1899 Phospholipase A2, group IVF US 2011/O 136690 A1 Jun. 9, 2011 60

TABLE 5A- Continued

PROSTATE ENRICHED GENES IDENTIFIED BY RATIO. SCHEMA (RATIO >2.5) * MPSS Sig. SEO ID Genbank Genbank SEO ID Tissue Names MPSS Signature NO: Name Accession No. NOS: Description

GATCCTGGGTACTGAAA 1761 ERBB2 BCO 801.93 1900 V-erb-b2 erythroblastic leukemia viral oncogene homolog 2, neuro/glioblastoma derived oncogene homolog (avian)

GA CGTTCTAAGAGTGT 762 NM 1994.27 901 - 19 O2 Zinc finger protein 64 homolog (mouse)

GA CATCATCAAGGGCT 763 BCO 4237 O 903 Suppressor of hairy wing homolog 2 (Drosophila)

GA CAAAATGATTTTCA 764 ELOWL7 AL1375O 6 904 - 1905 ELOVL family member 7, elongation of long chain fatty acids (yeast)

GA CTGATTTTTTTCCC 765 TRAF4 AI888 175 906 TNF receptor-associated factor 4

GA CCCATTTCT CACCC 766 AI66.9751 9. Of Solute carrier family 39 (zinc transporter), member 2

GA CCTCCCGCCTTGCC 767 AIO 88739 908 Hepatocyte nuclear factor 4, gamma

GA CTTTC TTTTTTTGT 768 BCOf O3 OO 909 Solute carrier family 22 (extraneuronal monoamine transporter), member 3

GA CTTAACTGTCTCCT 769 HIST2H2BE BCOO5827 Histone 2, H2be

GA CAGTTTGATTCTGT 770 AMD1 BCO 41345 911 - 1912 Adenosylmethionine decarboxylase 1

GA CATGATGTAGAGGG 771 TYMS Thymidylate synthetase

GA CGCACCACTACAGT 772 PHC3 AKO22 455 Polyhomeotic like 3 (Drosophila)

GA CTCAAAGTGCCTTC 773 SARG AL83.294 O 915 - 1916 Chromosome 1 open reading frame 116

GA (CAATGTCAAACTTC 774 MTERF BCOOO965 917-1918 Mitochondrial transcription termination factor

GA CTCCCAGAGTCTAA 775 NMOO7253 919 - 1920 Cytochrome P450, family 4, subfamily F, polypeptide 8

GA CCTGATGGCTGTGT 776 PPAP2A AK1244. O1 921 Phosphatidic acid phosphatase type 2A

GA (CACTTCCCGCAGTC 777 KIAAOO56 BCO11408 922-1923 KIAAOO56 protein

GA CTCAAAGGAACCAA 778 MSMB AA4 69293 924 Microseminoprotein, beta

GA CTGTGCCAGGGTTA 779 WEGF AKO 56914 925 Vascular endothelial growth factor

GA CTCTTTTTATTTAA CDH1 NMOO436 O 926 - 1927 Cadherin. 1, type 1, E-cadherin (epithelial)

GA CTCCAGCACCAATC 781 TARP BCO 62761 928-1929 TCR gamma alternate reading frame protein

GA CTGGCGCTTGGGGG 782 RFP2 NMOO1OO7278 930 - 1931 Ret finger protein 2

GA CCCGACGGGGGCAT 783 MESP1 NMO1867 O 932 - 1933 Mesoderm posterior 1 homolog (mouse) US 2011/O 136690 A1 Jun. 9, 2011 61

TABLE 5A- continued

PROSTATE ENRICHED GENES IDENTIFIED BY RATIO. SCHEMA (RATIO >2.5) k MPSS Sig. SEO ID Genbank Genbank SEO ID Tissue Names MPSS Signature NO: Name Accession No. NOS: Description

GATCCCGGGCCGTTATC 784 TRPM4 AAO26974 934 Transient receptor potential cation channel, subfamily M, member 4

GATCTTTCTCAAAATAT PAK1IP1 AI4 68032 935 PAK1 interacting protein 1

GATCGTGACGCTTAATA 786 HNRPA1 CF122.297 936 Heterogeneous nuclear ribonucleoprotein A1

GATCGCATAATTTTTAA CBO53869 937 Zinc finger protein 207

GATCCCAACACTGAAGG 788 NMO32387 938-1939 WNK lysine deficient protein kinase 4

GATCTTAAAAACTGCAG 789 APXL2 94 O Apical protein 2

GATCATTTTTTCTATCA 79 O MED28 AIS 54 477 941 Mediator of RNA polymerase II transcription, subunit 28 homolog (yeast)

GATCCCATTGTGTGTAT 791 LOC2853 OO AKO 95.655 942 Hypothetical protein LOC2853 OO

GATCTCAAAGGAAAAAA 792 AW291.753 943 Transcribed locus

GATCTTCTGTTATATTT 1. 793 BMO23121 1944 Full length insert cDNA clone ZD79H10

GATCCACAACATACAGC 1. 794 AY33895.3 1945 Prostate-specific P712P mRNA sequence

GATCTGTGCAGTTGTAA 1. 79. AYS33562 1946 KLK1.6 mRNA, partial sequence (clone HGT25) T cell receptor

GATCTACTATGCCAAAT 1. 796 BCO30554 1947 gamma-chain mRNA, V region

* ratio of prostate expression in tp.m. to other organs greater than 2.5

TABLE 5B

PROSTATEENRICHED GENESIDENTIFIED BYRATIO. SCHEMA (RATIO >2.S)* SignalP3.0 prediction Genbank Genbank SignalP3.0 prediction Signal peptide Accession No. SEQI D NOS: Name Prediction probability BCOOO637 797 798 DHRS7 Signal peptide O.999 BCO29497 799 800 NPY Signal peptide O.998 AW172826 8O1 FL2O010 Non-secretory protein O.OO1 BIT 71919 802 C9orf61 Signal peptide O994 BU853.306 803 Lrp2bp Non-secretory protein O BCOO7092 804 805 HOXB13 Non-Secretory protein O BCO38962 806 807 CREB3L4 Non-Secretory protein O BCOOSO29 808 809 LEPREL1 Signa peptide O.995 CBOS1271 810 KLK4 Signa peptide O.988 NM 145013 811 812 MGC35558 Signa peptide O.935 BU157155 813 HAX1 Non-S ecretory protein O.OO1 AW2O72O6 814 O Non-Secretory protein O.OO1 BCO1338.9 815 O Non-Secretory protein BCO28162 816 817 TMEM16G Non-Secretory protein O.OO1 BXO9916O 818 MGC31963 Signa peptide O994 BCOO5307 819 820 KLK3 Signa peptide O.992 NM OO5656 821 822 TMPRSS2 Non-Secretory protein BCO26923 823 LOC221442 Signa 8CO BU1598OO 824 ARL10C Non-S ecretory protein NM 032323 825 826 MGC13102 Non-Secretory protein US 2011/O 136690 A1 Jun. 9, 2011 62

TABLE 5B-continued PROSTATE ENRICHED GENESIDENTIFIED BY RATIOSCHEMA (RATIO -2.5) SignalP3.0 prediction Genbank Genbank SignalP3.0 prediction Signal peptide Accession No. SEQID NOs: Name Prediction probability AI9S4252 827 O Non-secretory protein O.128 BQ941313 828 SEPX1 Non-secretory protein O BCOO7460 829-1830 ACPP Signal peptide 1 BI911790 831 BIN3 Non-secretory protein O BCOO2707 832-1833 SPON2 Signal peptide O.998 AKO26938 834 O Signal peptide 0.587 BG818587 835 RPL18A Non-secretory protein O NM 005845 836-1837 ABCC4 Non-secretory protein O AA888242 838 RPS11 Non-secretory protein O CN353139 839 NSEP1 Non-secretory protein O.OO1 AA256381 840 FLJ22955 Non-secretory protein O.O6 AAS13 SOS 841 HOXD11 Non-secretory protein O BGS642S3 842 ORM1 Signal peptide 1 AI572O87 843 HTPAP Non-secretory protein O.O21 AA259243 844 KLK2 Signal peptide O.98S AI675682 845 SLC2A12 Non-secretory protein O NM OO6096 846-1847 NDRG1 Non-secretory protein O NM OOO906 848-1849 NPR1 Signal peptide O.997 NM O25087 850-1851 FL21511 Non-secretory protein O.OOS NM 004496 852-1853 FOXA1 Non-secretory protein O AIS35878 854 ENPP3 Non-secretory protein O.069 NM 032638 855-1856 GATA2 Non-secretory protein O BX331427 857 ARG2 Non-secretory protein O.O14 AIS69484 858 XPO1 Non-secretory protein O BCOO9569 859-1860 ASB3 Non-secretory protein O AKOOOO28 861 O Non-secretory protein O.OO1 BX1 OO634 862 KLF3 Non-secretory protein O BCOO7003 863-1864 TGM4 Non-secretory protein O NM 001008401 865-1866 FLJ16231 Non-secretory protein O BX113323 867 BLNK Non-secretory protein O NM O15865 868-1869 SLC14A1 Non-secretory protein O AIO17286 870 PTPLB Non-secretory protein O.O6 NM 030774 871-1872 ORS1E2 Non-secretory protein O.OO8 NM OO1441 873-1874 FAAH Signal peptide O.805 ALO44554 875 STAT6 Non-secretory protein O CBO494.66 876 ANKH Non-secretory protein O.OO1 AW575747 877 DSCR1L2 Non-secretory protein O NM 024080 878-1879 TRPM8 Non-secretory protein O AV7245OS 88O TMC4 Non-secretory protein O BCOOS859 881-1882 ZNF589 Non-secretory protein O BCOOS408 883-1884 LRRK1 Non-secretory protein O AA177004 885 STEAP2 Non-secretory protein O BCOO1216 886 SAFEB2 Non-secretory protein O BG707154 887 CPE Signal peptide 1 AAO24878 888 GNB2L1 Non-secretory protein O BU688574 889 LOC92689 Non-secretory protein O BCO42118 890 DLG1 Non-secretory protein O NM OO3007 891-1892 SEMG1 Signal peptide O.922 BM875598 893 SPATA13 Non-secretory protein O NM 177965 894-1895 LOC157657 Non-secretory protein O CA4332O8 896 KIAA1411 Non-secretory protein O BM984931 897 MGC20781. Non-secretory protein O BCO35335 898-1899 LOC255.189 Non-secretory protein O BCO8O193 900 ERBB2 Non-secretory protein O NM 199427 901-1902 ZFP64 Non-secretory protein O BCO42370 903 SUHW2 Non-secretory protein O AL137SO6 904-1905 ELOVL7 Non-secretory protein O AI888175 906 TRAF4 Non-secretory protein O AI669751 907 SLC39A2 Signal peptide O.982 AIO88739 908 HNF4G Non-secretory protein O.OO1 BCO7O3OO 909 SLC22A3 Signal anchor O.097 BCOOS827 910 HIST2H2BE Non-secretory protein O BCO41345 911-1912 AMD1 Non-secretory protein O BX390O36 913 TYMS Non-secretory protein O AKO224.55 914 PHC3 Non-secretory protein O AL83.2940 915-1916 SARG Non-secretory protein O BCOOO96S 917-1918 MTERF Non-secretory protein O NM 007253 919-1920 CYP4F8 Signal peptide 1 AK1244O1 921 PPAP2A Non-secretory protein O.348 US 2011/O 136690 A1 Jun. 9, 2011 63

TABLE 5B-continued PROSTATE ENRICHED GENESIDENTIFIED BY RATIOSCHEMA (RATIO -2.5) SignalP3.0 prediction Genbank Genbank SignalP3.0 prediction Signal peptide Accession No. SEQID NOs: Name Prediction probability BCO11408 922-1923 KIAA0056 Non-secretory protein O AA469293 924 MSMB Signal peptide O.997 AKO56914 925 VEGF Non-secretory protein O NM 004360 926-1927 CDH1 Signal peptide O896 BCO62761 928-1929 TARP Non-secretory protein O NM OO1007278 1930-1931 RFP2 Non-secretory protein O NM 018670 932-1933 MESP1 Signal anchor O.OO)4 AAO26974 934 TRPM4 Non-secretory protein O AI468032 935 PAK1IP1 Non-secretory protein O.OO1 CF122297 936 HNRPA1 Non-secretory protein O CBOS3869 937 ZNF2O7 Non-secretory protein O NM 032387 938-1939 WNK4 Non-secretory protein O BQ448015 940 APXL2 Non-secretory protein O AISS4477 941 MED28 AKO956SS 942 LOC2853OO AW291753 943 O BMO23121 944 O AY33895.3 945 O AYS33562 946 O BCO30554 947 O

ratio of prostate expression in tpm to other organs greater than 2.5

TABLE 5C

PROSTATEENRICHED GENESIDENTIFIED BYRATIO. SCHEMA (RATIO > 2.5)

TMHMM 2.0 SignalP3.0 SecretomeP2.O prediction Genbank prediction prediction Pred trans Genbank SEQID Max cleavage Secreted membrane Accession No. NOS: l8le site probability potential (Odds) domains

BCOOO637 797-1798 DHRS7 O599 between 6.3 1 bos. 28 and 29 BCO29497 799-1800 NPY O-520 between 6.09 1 bos. 28 and 29 AW172826 8O1 FL2O010 OOOO between 6.06 O bos. 46 and 47 BIT 71919 802 C9orf61 O534 between 5.9 2 bos. 29 and 30 BU853.306 803 Lrp2bp OOOO between S.62 O bos. 55 and 56 BCOO7092 804-18OS HOXB13 OOOO between 5.14 O pos. -1 and O BCO38962 806-1807 CREB3L4 OOOO between 4.72 O pos. -1 and O BCOOSO29 808-1809 LEPREL1 O.991 between 4.59 O bos. 24 and 25 CBOS1271 810 KLK4 0.401 between 4.57 1 bos. 29 and 30 NM 145013 811-1812, MGC35558 O.901 between 4.47 O bos. 22 and 23 BU157155 813 HAX1 O.OO1 between 4.41 O bos. 18 and 19 AW2O72O6 814 O O.OO1 between 4.39 O bos. 20 and 21 BCO13389 815 O OOOO between 4.3 O bos. 27 and 28 BCO28162 816-1817 TMEM16G. O.OO1 between 4.29 7 bos. 22 and 23 BXO9916O 818 MGC31963 O.855 between 4.22 2 bos. 35 and 36 BCOO5307 819-1820 KLK3 OS25 between 3.938 O bos. 23 and 24 NM OO5656 821-1822 TMPRSS2 O.OOO between 3.86 1 pos. -1 and O US 2011/O 136690 A1 Jun. 9, 2011 64

TABLE 5C-continued PROSTATE ENRICHED GENESIDENTIFIED BY RATIOSCHEMA (RATIO - 2.5) TMHMM 2.0 SignalP3.0 SecretomeP2.O prediction Genbank prediction prediction Pred trans Genbank SEQID Max cleavage Secreted membrane Accession No. NOS: l8le site probability potential (Odds) domains

BCO26923 823 LOC221442 0.004 between 3.81 O bos. 50 and 51 BU1598OO 824 ARL10C OOOO between 3.76 O bos. 35 and 36 NM 032323 825-1826 MGC13102 O.OOO between 3.69 5 pos. -1 and O AI9S4252 827 O 0.121 between 3.58 O bos. 42 and 43 BQ941313 828 SEPX1 OOOO between 3.49 O bos. 13 and 14 BCOO7460 829-1830 ACPP O.975 between 3.49 1 bos. 32 and 33 BI911790 831 BIN3 OOOO between 3.41 O pos. -1 and O BCOO2707 832-1833 SPON2 O.829 between 3.06 O bos. 26 and 27 AKO26938 834 O O568 between 3.02 O bos. 27 and 28 BG818587 835 RPL18A OOOO between 2.8 O bos. 24 and 25 NM 005845 836-1837 ABCC4 OOOO between 2.67 11 pos. -1 and O AA888242 838 RPS11 OOOO between 2.64 O pos. -1 and O CN353139 839 NSEP1 OOOO between 2.35 O bos. 25 and 26 AA256381 840 FLJ22955 O.O38 between 2.19 1 bos. 15 and 16 AAS13 SOS 841 HOXD11 OOOO between 2.14 O bos. 20 and 21 BGS642S3 842 ORM1 O.923 between 2.03 O bos. 18 and 19 AI572O87 843 HTPAP O.009 between 2.01 4 bos. 63 and 64 AA259243 844 KLK2 O455 between 81 O bos. 17 and 18 AI675682 845 SLC2A12 OOOO between .79 12 bos. 51 and 52 NM OO6096 846-1847 NDRG1 OOOO between .76 O pos. -1 and O NM OOO906 848-1849 NPR1 O960 between 75 O bos. 32 and 33 NM O25087 850-1851 FL21511 O.005 between 75 10 bos. 20 and 21 NM 004496 852-1853 FOXA1 OOOO between 71 O pos. -1 and O AIS35878 854 ENPP3 O.O36 between 69 1 bos. 42 and 43 NM 032638 855-1856 GATA2 OOOO between 6S O bos. 22 and 23 BX331427 857 ARG2 O.O13 between S6 O bos. 36 and 37 AIS69484 858 XPO1 OOOO between 54 O pos. -1 and O BCOO9569 859-1860 ASB3 OOOO between 53 O pos. -1 and O AKOOOO28 861 O OOOO between .46 O bos. 22 and 23 BX1 OO634 862 KLF3 OOOO between .4 O pos. -1 and O BCOO7003 863-1864 TGM4 OOOO between 36 O pos. -1 and O NM OO10O8401 1865-1866 FLJ16231 OOOO between .21 O pos. -1 and O BX113323 867 BLNK OOOO between .21 O pos. -1 and O NM O15865 868-1869 SLC14A1 OOOO between .2 8 pos. -1 and O US 2011/O 136690 A1 Jun. 9, 2011 65

TABLE 5C-continued

PROSTATEENRICHED GENESIDENTIFIED BYRATIO. SCHEMA (RATIO > 2.5)

TMHMM 2.0 SignalP3.0 SecretomeP2.O prediction Genbank prediction prediction Pred trans Genbank SEQID Max cleavage Secreted membrane Accession No. NOS: l8le site probability potential (Odds) domains

AIO17286 870 PTPLB O.028 between .2 4 bos. 63 and 64 NM 030774 871-1872 ORS1E2 O.003 between .2 7 bos. 22 and 23 NM OO1441 873-1874 FAAH O549 between .2 1 bos. 28 and 29 ALO44554 875 STAT6 OOOO between 17 O pos. -1 and O CBO494.66 876 ANKH OOOO between 1S 8 bos. 26 and 27 AW575747 877 DSCR1L2 OOOO between .12 O pos. -1 and O NM 024080 878-1879 TRPM8 OOOO between O7 8 pos. -1 and O AV7245OS 88O TMC4 OOOO between O6 8 pos. -1 and O BCOOS859 881-1882 ZNFS89 OOOO between O.99 1 pos. -1 and O BCOOS408 883-1884 LRRK1 OOOO between O.99 O pos. -1 and O AA177004 885 STEAP2 OOOO between O.9S 6 pos. -1 and O BCOO1216 886 SAFEB2 OOOO between O.9S O pos. -1 and O BG707154 887 CPE O859 between O.93 O bos. 27 and 28 AAO24878 888 GNB2L1 OOOO between O.92 O bos. 33 and 34 BU688574 889 LOC92689 OOOO between O.91 O pos. -1 and O BCO42118 890 DLG1 OOOO between O.87 O pos. -1 and O NM OO3007 891-1892 SEMG1 O515 between O.85 O bos. 23 and 24 BM875598 893 SPATA13 OOOO between O.81 O pos. -1 and O NM 177965 894-1895 LOC157657 O.OOO between O.81 O pos. -1 and O CA4332O8 896 KIAA1411 O.OOO between O.8 O pos. -1 and O BM984931 897 MGC2O781. O.OOO between 0.79 O bos. 25 and 26 BCO35335 898-1899 LOC2551.89 OOOO between O.78 O bos. 23 and 24 BCO8O193 900 ERBB2 OOOO between O.74 2 pos. -1 and O NM 199427 901-1902 ZFP64 OOOO between O.68 O pos. -1 and O BCO42370 903 SUHW2 OOOO between 0.67 O pos. -1 and O AL137SO6 904-1905 ELOVLA OOOO between 0.67 7 pos. -1 and O AI888175 906 TRAF4 OOOO between O.63 O pos. -1 and O AI669751 907 SLC39A2 O.297 between O.62 8 bos. 23 and 24 AIO88739 908 HNF4G O.OO1 between O.S9 O bos. 21 and 22 BCO7O3OO 909 SLC22A3 O.048 between O.S8 12 bos. 33 and 34 BCOOS827 910 HIST2H2BE O.OOO between O.S8 O pos. -1 and O BCO41345 911-1912 AMD1 OOOO between O.S8 O pos. -1 and O BX390O36 913 TYMS OOOO between 0.57 O pos. -1 and O AKO224.55 914 PHC3 OOOO between 0.57 O pos. -1 and O US 2011/O 136690 A1 Jun. 9, 2011 66

TABLE 5C-continued PROSTATE ENRICHED GENESIDENTIFIED BY RATIOSCHEMA (RATIO - 2.5) TMHMM 2.0 SignalP3.0 SecretomeP2.O prediction Genbank prediction prediction Pred trans Genbank SEQID Max cleavage Secreted membrane Accession No. NOS: l8le site probability potential (Odds) domains

AL83.2940 915-1916 SARG OOOO between O.S6 O bos. 21 and 22 BCOOO96S 917-1918. MTERF OOOO between O.S6 O bos. 14 and 15 NM 007253 919-1920 CYP4F8 O.781 between O.S6 1 bos. 36 and 37 AK1244O1 921 PPAP2A O.226 between O.S3 5 bos. 30 and 31 BCO11408 922-1923 KIAAOOS6 OOOO between O.S2 O pos. -1 and O AA469293 924 MSMB O928 between O.S1 1 bos. 20 and 21 AKO56914 925 VEGF OOOO between O485 O pos. -1 and O NM 004360 926-1927 CDH1 O487 between O.36 1 bos. 22 and 23 BCO62761 928-1929 TARP OOOO between O3S 1 bos. 20 and 21 NM OO100.7278 1930-1931 RFP2 OOOO between O.32 1 bos. 24 and 25 NM 018670 932-1933 MESP1 O.OO2 between O.31 O bos. 20 and 21 AAO26974 934 TRPM4 OOOO between O.3 5 pos. -1 and O AI468032 935 PAK1IP1 OOOO between 0.27 O bos. 25 and 26 CF122297 936 HNRPA1 OOOO between O.22 O bos. 32 and 33 CBOS3869 937 ZNF2O7 OOOO between O.21 O pos. -1 and O NM 032387 938-1939 WNK4 OOOO between O.2 O pos. -1 and O BQ448015 940 APXL2 OOOO between O.19 O pos. 41 and 42 AISS4477 941 MED28 #NA #NA AKO956SS 942 LOC285300 #NA #NA AW291753 943 O #NA #NA BMO23121 944 O #NA #NA AY33895.3 945 O #NA #NA AYS33562 946 O #NA #NA BCO30554 947 O #NA #NA

ratio of prostate expression in tpm to other organs greater than 2.5

TABLE 5D

PROSTATEENRICHED GENESIDENTIFIED BY RATIO. SCHEMA (RATIO >2.S)*

Genbank Genbank Prostate Accession No. SEQID NOs: name NN-score Odds Expression (tmp) BCOOO637 1797-1798 DHRS7 O.92 6.302 754 BCO29497 1799-1800 NPY O.911 6.099 642 AW172826 1801 FL2O010 O.911 6.061 92 BIT 71919 1802 C9orf61 O.906 S.902 91 BU853.306 1803 Lrp2bp O.895 S. 626 95 BCOO7092 1804-1805 HOXB13 0.875 S.145 344 BCO38962 1806-1807 CREB3L4 O.866 4.721 334 BCOOSO29 1808-1809 LEPREL1 O.857 4.594 118 CBOS1271 1810 KLK4 O.856 4.575 360 NM 145013 1811-1812 MGC35558 O.86 4477 53 BU157155 1813 HAX1 O.854 4.412 67 AW2O72O6 1814 O O.854 4.391 279 BCO13389 1815 O O.85 4.304 64 BCO28162 1816-1817 TMEM16G O.843 4.293 281

US 2011/O 136690 A1 Jun. 9, 2011 68

TABLE 5D-continued

PROSTATEENRICHED GENESIDENTIFIED BY RATIO. SCHEMA (RATIO >2.S)*

Genbank Genbank Prostate Accession No. SEQID NOs: name NN-score Odds Expression (timp) AKO224.55 914 PHC3 0.287 0.57 105 AL83.2940 915-1916 SARG O3O2 O.S63 158 BCOOO96S 917-1918 MTERF O.3 O.S6 190 NM 007253 919-1920 CYP4F8 O.28 O.S66 S4 AK1244O1 921 PPAP2A O.211 O.S33 75 BCO11408 922-1923 KIAAO056 O.281 0.527 287 AA469293 924 MSMB 0.27 0.517 275 AKO56914 925 VEGF O.256 0.485 2O2 NM 004360 926-1927 CDH1 O.179 0.362 192 BCO62761 928-1929 TARP O.174 0.353 S64 NM OO1007278 1930-1931 RFP2 O.162 0.322 192 NM 018670 932-1933 MESP1 O.154 0.315 133 AAO26974 934 TRPM4 O.147 O.305 290 AI468032 935 PAK1IP1 O.13 O.271 74 CF122297 936 HNRPA1 O.106 0.228 104 CBOS3869 937 ZNF2O7 O.099 0.212 72 NM 032387 938-1939 WNK4 O.O89 0.2O1 100 BQ448015 940 APXL2 O.083 O.19 244 AISS4477 941 MED28 700 AKO956SS 942 LOC2853OO 84 AW291753 943 O 310 BMO23121 944 O 178 AY33895.3 945 O 166 AYS33562 946 O 67 BCO30554 947 O 66 ratio of prostate expression in tpm to other organs greater than 2.5

0160 Additional analysis was carried out to determine the Example 8 secretion potential of the prostate-specific genes identified. The analysis programs used included SignalP3.0, Secretome Prostate Cancer Diagnostics Using Multiparameter 2.0 and TMHMM 2.0 (see http colon double slash www dot Analysis cbs dot dtu dot dk/services/). The SignalPanalysis identifies classical secreted proteins and was conducted using the clas 0163 This example describes a multiparameter diagnostic sical Secretion pathway prediction as described at http colon fingerprint using the NDRG1 prostate-specific protein in double slash www dot cbs dot dtu dot dk/services/SignalP/ combination with PSA. The NDRG1 prostate-specific pro (see Jannick Dyrlov Bendtsen, et al. J. Mol. Biol. 340:783 tein further improved prostate cancer detection when used in 795, 2004; Henrik Nielsen et al., Protein Engineering, 10:1-6, combination with PSA. 1997; Henrik Nielsen and Anders Krogh. Proceedings of the 0164 Commercially available antibodies specific for Sixth International Conference on Intelligent Systems for numerous proteins encoded by prostate-specific genes as Molecular Biology (ISMB 6), AAAI Press, Menlo Park, described in Table 5 were used to determine which proteins Calif., pp. 122-130, 1998). The Secretome2.0 analysis iden would be useful in a multiparameter diagnostic assay for tifies nonclassical secreted proteins (see J. Dyrlev Bendtsen, prostate cancer. Most of the commercially available antibod et al., Protein Eng. Des. Sel., 17(4):349-356, 2004). ies were not suitable (e.g., were not sensitive enough or TMHMM uses hidden Markov model for three-state (TM showed non-specific binding). However, the antibody avail helix, inside, outside) topology prediction of transmembrane able for NDRG1 (anti-NDRG1(C terminal) poly IgY: Cat proteins (see Erik L. L. Sonnhammer, et al., Proc. of Sixth Int. #A22272B; GenWay Inc) was shown to specifically bind to Conf. on Intelligent Systems for Molecular Biology, p. 175 NDRG1 from serum. NDRG1 is a member of the N-myc 182 Ed. J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. downregulated gene (NDRG) family that belongs to the Sankoff, and C. Sensen, Menlo Park, Calif: AAAI Press, alpha/beta hydrolase Superfamily. It is classified as a tumor 1998). Suppressor and heavy metal-response protein. Its expression 0161 According to the SignalPanalysis method, proteins is modulated by diverse physiological and pathological con with an odds scoring 3 or higher have a high confidence of ditions including hypoxia, cellular differentiation, heavy being secreted. However, it should be noted that several pro metal, N-myc and neoplasia (Lachat P. et al.: Histochem Cell teins scoring well below 3 by this method are known to be Biol. 2002 November: 118(5):399-408). secreted proteins detected in the blood (see e.g., Table 5, 0.165 NDRG1 protein expression was analyzed in serum KLK2). Further, these analyses do not take into account pro samples from 18 advanced prostate cancer patients, 21 pros teins that may be shed. tate cancer patients with localized cancer, and 22 normal 0162. In Summary, this example identifies prostate-spe controls. Western blot analysis was used to measure serum cific and potentially secreted prostate-specific proteins that protein expression as follows: Serum was diluted (1:10) with can be used in diagnostic panels for the detection of diseases lysis buffer (50 mM Hepes, pH 7.4, 4 mM EDTA, 2 mM of the prostate. EGTA, 2 uM PMSF, 20 g/ml, leupeptine (or 1x protease US 2011/O 136690 A1 Jun. 9, 2011 69 inhibitor cocktail), 1 mM NaVO, 10 mM NaF, 2 mM Na (1:500) overnight at 4°C. The membranes were washed 3 pyrophosphate, 1% Triton X-100). Protein concentration was times with TBS, and then incubated with horseradish peroxi determined using the Bio-Rad protein assay kit. Serum pro dase conjugated anti-rabbit IgY (1:16,000) for 1 h. The teins (50 g) were subjected to SDS-PAGE electrophoresis immunoblot was thenwashed five times with TBS and devel and transferred to a PVDF membrane (Hybond-P, Amersham oped using an ECL (Amersham). The intensities of the single Pharmacia Biotech, Piscataway, N.J.). The membrane was band corresponding to the NDRG1 protein were then scored. blocked with 4% non-fat milk in TBS (25 mM Tris, pH 7.4, The results are summarized in Table 6 together with serum 125 mM NaCl) for 1 h at room temperature, followed by PSA measurements performed using a commercial ELISA incubation with primary antibodies against NDRG1 IgY kit.

TABLE 6

COMBINED ANALYSIS OF NDRG1 AND PSA SERUMEXPRESSION INCREASES PROSTATE CANCERDLAGNOSIS CONFIDENCE.

NDRG-1 PSA intensity values serum diagnosis cancer Status (scores*) (ng/ml) by PSA serum diagnosis by NDRG1 Advance 3 70.48 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 127.3 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 422.1 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 1223 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 71.28 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 2 133.2 identified as missed by NDRG1 assay cancer by PSA assay Advance 4 353.7 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 1 73.95 identified as missed by NDRG1 assay cancer by PSA assay Advance 3 454.8 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 474 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 6 150.1 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance O 1375 identified as missed by NDRG1 assay cancer by PSA assay Advance 6 71.28 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 6 4.066 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 4 1199 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 1 38.14 identified as missed by NDRG1 assay cancer by PSA assay Advance 6 552.6 identified as identified as cancer by NDRG1 cancer by PSA assay assay Advance 5 321 identified as identified as cancer by NDRG1 cancer by PSA assay assay Primary -1 14.2 possibly cancer Primary 6.27 Grey Zone of diagnosis by Psa US 2011/O 136690 A1 Jun. 9, 2011 70

TABLE 6-continued

COMBINED ANALYSIS OF NDRG1 AND PSA SERUMEXPRESSION INCREASES PROSTATE CANCERDLAGNOSIS CONFIDENCE.

NDRG-1 PSA intensity values serum diagnosis cancer Status (scores*) (ng/ml) by PSA serum diagnosis by NDRG1 Prim ary 2 9.2 Grey Zone of diagnosis by Psa Prim ary 1 8.57 Grey Zone of diagnosis by Psa Prim ary 5.67 Grey Zone of diagnosis by Psa Primary 2 11.3 possibly cancer Primary 4.58 Grey Zone of diagnosis by Psa Primary 5.67 Grey Zone of diagnosis by Psa Prim ary 6.48 Grey Zone of diagnosis by Psa Primary 12.71 possibly cancer strong NDRG-1 expression reinforces the diagnosis of this patient as cancer Primary 4.93 Grey Zone o strong NDRG-1 expression iagnosis by Psa reinforces the diagnosis of this patient as cancer Primary 3.16 Grey Zone o iagnosis by Psa Prim ary 4.87 Grey Zone o iagnosis by Psa Prim ary 4.66 Grey Zone o iagnosis by Psa Prim ary 6.87 Grey Zone o iagnosis by Psa Prim ary 3.91 Grey Zone o iagnosis by Psa Prim ary 6.48 Grey Zone o iagnosis by Psa Primary 2 13.1 possibly cancer Primary 4.58 Grey Zone o iagnosis by Psa Primary 4.72 Grey Zone o iagnosis by Psa Primary 12.71 possibly cancer strong NDRG-1 expression reinforces the diagnosis of this patient as cancer Norma O.8 Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma Normal O8. Norma O Normal O8. Norma O Normal O8. Norma O Normal O8. scores: no expression, -1; no expression to very faint, 0; expression levels then scored from 1 to 6 by intensities

(0166 PSA was detected in 100% of the advanced prostate ng/ml (often referred to as the grey Zone in the PSA assay) cancers. NDRG1 was detected in 14 out of 18 advanced cannot reliably detect prostate cancer as PSA levels in this cancers (78%) (see Table 6, scores greater than 3). Serum range may be the result of other factors such as infection PSA levels below 15 ng/ml, particularly, levels between 4-10 (prostatitis) or benign prostatic hyperplasia (BPH), a com US 2011/O 136690 A1 Jun. 9, 2011 mon condition in older men. Additionally, the normal range of NDRG1 and PSA can improve prostate cancer diagnosis to PSA values increases with patient age. NDRG1 detection in reduce false positive and false negative rates. serum reinforced the diagnosis of three prostate cancer patients with PSA levels between 4.9 ng/ml and 15 ng/ml. In 0168 From the foregoing it will be appreciated that, these three patients, the NDRG1 scores were 3 or 4, signifi although specific embodiments of the invention have been cantly higher than the NDRG1 scores in a cohort of 22 normal described herein for purposes of illustration, various modifi individuals (average 0.09, range -1 to 2). cations may be made without deviating from the spirit and 0167 Thus, this example illustrates that the use of two or Scope of the invention. Accordingly, the invention is not lim more prostate specific/enriched cancer markers such as ited except as by the appended claims.

SEQUENCE LISTING The patent application contains a lengthy “Sequence Listing section. A copy of the “Sequence Listing is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US20110136690A1). An electronic copy of the “Sequence Listing will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR1.19(b)(3).

What is claimed is: 11. A method for determining the presence or absence of a 1. A method for detecting a drug side effect comprising drug side effect in a Subject taking the drug comprising, measuring in the blood of a subject taking the drug the level of a. detecting a level of each of a plurality of organ-specific a plurality of organ-specific proteins secreted from an organ proteins in a blood sample from the subject, wherein the wherein the levels of the plurality of organ-specific proteins plurality of organ-specific proteins are secreted from the together provide an organ-specific molecular blood finger Same organ; print that indicates a drug side effect on the organ in the b. comparing said level of each of the plurality of organ Subject. specific proteins in the blood sample from the subject to 2. The method of claim 1 wherein the level of the plurality a level of each of the plurality of organ-specific proteins of organ-specific proteins is measured using a method in a control sample of drug-free blood; selected from the group consisting of tandem mass spectrom wherein a statistically significant altered level of one or etry, ELISA, Western blot, microfluidics/nanotechnology more of the plurality of organ-specific proteins in the sensors, and aptamer capture assay. blood is indicative of the presence or absence of a drug 3. The method of claim 2 wherein the level of the plurality side effect. of organ-specific proteins is measured using tandem mass 12. The method of claim 11 wherein the level of each of the spectrometry. plurality of organ-specific proteins is detected using a method 4. The method of claim 2 wherein the level of the plurality selected from the group consisting of mass spectrometry, and of organ-specific proteins is measured using ELISA. an immunoassay. 5. A method for detecting a drug side effect comprising measuring in the blood of a subject taking the drug the level of 13. The method of claim 12 wherein the level of each of the one or more organ-specific proteins secreted from an organ plurality of organ-specific proteins is measured using tandem wherein the level of the one or more organ-specific proteins mass spectrometry. together provide an organ-specific molecular blood finger 14. The method of claim 12 wherein the level of each of the print that indicates a drug side effect on the organ in the plurality of organ-specific proteins is measured using ELISA. Subject. 15. The method of claim 12 wherein the level of each of the 6. The method of claim 1 wherein the plurality of organ plurality of organ-specific proteins is measured using an anti specific proteins comprises about 5 organ-specific proteins. body array. 7. The method of claim 1 wherein the plurality of organ 16. The method of claim 11 wherein the organ-specific specific proteins comprises about 20 organ-specific proteins. proteins comprise liver-specific proteins. 8. The method of claim 1 wherein the organ-specific pro 17. The method of claim 11 wherein the organ-specific teins comprise liver-specific proteins. proteins comprise kidney-specific proteins. 9. The method of claim 1 wherein the organ-specific pro 18. The method of claim 11 wherein the organ-specific teins comprise kidney-specific proteins. proteins are from an organ other than the expected therapeutic 10. The method of claim 1 wherein the organ-specific target of the drug. proteins are from an organ other than the expected therapeutic target of the drug.