US 20030194725A1

(19) United States
(12) Patent Application Publication, Greener et al.
(10) Pub. No.: US 2003/0194725 A1
(43) Pub. Date: Oct. 16, 2003

(54) METHODS FOR IDENTIFYING AND VALIDATING POTENTIAL DRUG TARGETS

(76) Inventors: Tsvika Greener, Ness-Ziona (IL); Avishai Levy, Rishon-Le-Zion (IL); Yuval Reiss, Kiriat-Ono (IL); Danny Ben-Avraham, Tel-Aviv (IL); Iris Alroy, Ness-Ziona (IL)

Correspondence Address: ROPES & GRAY LLP, ONE INTERNATIONAL PLACE, BOSTON, MA 02110-2624 (US)

(21) Appl. No.: 10/299,991
(22) Filed: Nov. 19, 2002

Related U.S. Application Data
(60) Provisional application No. 60/331,701, filed on Nov. 19, 2001.

Publication Classification
(51) Int. Cl.: C12Q 1/68; G01N 33/53; G06F 19/00; G01N 33/48; G01N 33/50
(52) U.S. Cl.: 435/6; 435/7.1; 702/19; 702/20

(57) ABSTRACT
This application provides methods for identifying and validating potential drug targets. In one aspect, the application provides a systematic method of creating a database of related protein or nucleic acid sequences with annotations of the potential disease associations of the sequences, and a method for testing the potential disease associations by means of a biological assay and validating the disease association by either decreasing expression of the sequence of interest or increasing expression of the sequence of interest.

[FIG. 1 (Sheet 1 of 5; also reproduced on the cover page): flow chart 100. Download sequence data (102); clean sequence data (104); identify E3 proteins from cleaned sequence data (106); associate disease or biological activity with E3 proteins based on their characteristic(s) (108).]

[FIG. 2 (Sheet 2 of 5): flow diagram. Recoverable labels: sequence data (202); sequence analysis (e.g., domains and motifs); E3 characteristic(s); E3 protein, characteristic(s), biological activity; characteristic, biological activity. Recoverable reference numerals: 202, 204, 208, 216, 218, 220.]

[FIG. 3 (Sheet 3 of 5): labels not recoverable.]

[FIG. 4 (Sheet 4 of 5): lane labels p6 wild-type and p6ATAAP.]

[FIG. 5 (Sheet 5 of 5): lane labels (WT) and (POSH+WT); band markers P55 and P24.]

US 2003/0194725 A1, Oct. 16, 2003
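The four steps of FIG. 1 (102 download, 104 clean, 106 identify, 108 associate) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy records are made up, the download step is a stub, the annotation table is a hypothetical placeholder, and the regular expression is one possible reading of the HECT consensus given later in the definitions (SEQ ID NO. 1), with the subscripts taken as repeat counts. It is not the applicants' implementation.

```python
import re

# Regex reading of the HECT consensus (SEQ ID NO. 1), assuming the
# subscripts are repeat counts: P, any 3, T, C, any 2-4, L, any, L, P, any, Y.
HECT_CONSENSUS = re.compile(r"P[A-Z]{3}TC[A-Z]{2,4}L[A-Z]LP[A-Z]Y")
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")

# Hypothetical annotation table; real entries would come from curation.
ANNOTATIONS = {"HECT": "placeholder annotation for HECT-containing E3s"}


def download_sequence_data():
    """Step 102 (stub): made-up records stand in for a database download."""
    return [
        ("toy_1", "mkpaaatcgglalpwyqq"),
        ("toy_2", "MKPAAATCGGLALPWYQQ"),  # same sequence as toy_1 once cleaned
        ("toy_3", "MSTNPKPQRK"),          # valid residues, no HECT consensus
        ("toy_4", "MKX*ZZ"),              # non-standard symbols
    ]


def clean_sequence_data(records):
    """Step 104: normalize case, then drop invalid and duplicate sequences."""
    seen, cleaned = set(), []
    for name, seq in records:
        seq = seq.strip().upper()
        if not seq or set(seq) - STANDARD_RESIDUES or seq in seen:
            continue
        seen.add(seq)
        cleaned.append((name, seq))
    return cleaned


def identify_e3(records):
    """Step 106: keep records whose sequence contains the consensus."""
    return [(name, seq) for name, seq in records if HECT_CONSENSUS.search(seq)]


def associate(records, annotations):
    """Step 108: attach the annotation for the detected characteristic."""
    return {name: annotations["HECT"] for name, _ in records}


cleaned = clean_sequence_data(download_sequence_data())
candidates = identify_e3(cleaned)
database = associate(candidates, ANNOTATIONS)
print(database)
```

In this toy run only `toy_1` survives all four steps. A real pipeline would replace the stub with a database download and would likely use profile-based domain detection rather than a single regular expression.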

METHODS FOR IDENTIFYING AND VALIDATING POTENTIAL DRUG TARGETS

RELATED APPLICATIONS

[0001] This application claims the benefit of the filing date of U.S. Provisional Application No. 60/331,701, filed Nov. 19, 2001, the specification of which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

[0002] Potential drug target validation involves determining whether a DNA, RNA or protein molecule is implicated in a disease process and is therefore a suitable target for development of new therapeutic drugs. Drug discovery, the process by which bioactive compounds are identified and characterized, is a critical step in the development of new treatments for human diseases. The landscape of drug discovery has changed dramatically due to the genomics revolution. DNA and protein sequences are yielding a host of new drug targets and an enormous amount of associated information.

[0003] The task of deciphering which of these targets are implicated in diseases and should be used for subsequent drug development requires not only systematic procedures but also high-throughput approaches for determining which targets are a part of disease-relevant pathways; both are critical to the drug discovery process.

[0004] The levels of proteins are determined by the balance between their rates of synthesis and degradation. Ubiquitin-mediated proteolysis is the major pathway for the selective degradation of intracellular proteins. Consequently, selective ubiquitination of a variety of intracellular targets regulates essential cellular functions such as gene expression, cell cycle, signal transduction, biogenesis of ribosomes and DNA repair. Another major function of ubiquitin ligation is to regulate intracellular protein sorting. Whereas poly-ubiquitination targets proteins to proteasome-mediated degradation, attachment of a single ubiquitin molecule (mono-ubiquitination) to proteins regulates endocytosis of cell surface receptors and sorting into lysosomes. It was also demonstrated that ubiquitination controls sorting of proteins in the trans-Golgi network (TGN).

[0005] The linkage of ubiquitin to a substrate protein is generally carried out by three classes of accessory enzymes in a sequential reaction. Ubiquitin activating enzymes (E1) activate ubiquitin by forming a high energy thiol ester intermediate. Activation of the C-terminal Gly of ubiquitin by E1 is followed by the activity of a ubiquitin conjugating enzyme E2, which serves as a carrier of the activated thiol ester form of ubiquitin during the transfer of ubiquitin directly to the third enzyme, E3 ubiquitin protein ligase. E3 ubiquitin protein ligase is responsible for the final step in the conjugation process, which results in the formation of an isopeptide bond between the activated Gly residue of ubiquitin and an α-NH group of a Lys residue in the substrate or a previously conjugated ubiquitin moiety. See, e.g., Hochstrasser, M., Ubiquitin-Dependent Protein Degradation, Annu. Rev. Genet., 30:405 (1996).

[0006] E3 ubiquitin protein ligase, as the final player in the ubiquitination process, is responsible for target specificity of ubiquitin-dependent proteolysis. A number of E3 ubiquitin protein ligases have previously been identified. See, e.g., D'Andrea, A. D., et al., Nature Genetics, 18:97 (1998); Gonen, H., et al., Isolation, Characterization, and Purification of a Novel Ubiquitin-Protein Ligase, E3-Targeting of Protein Substrates via Multiple and Distinct Recognition Signals and Conjugating Enzymes, J. Biol. Chem., 271:302 (1996). Accordingly, E3 enzymes are potential drug targets and this application provides a systematic method for identifying and validating potential E3 drug targets.

SUMMARY

[0007] In one aspect, the application provides a systematic method of creating a database of related protein or nucleic acid sequences with annotations of the potential disease associations of the sequences, and a method for testing the potential disease associations by means of a biological assay and validating the disease association by either decreasing expression of the sequence of interest or increasing expression of the sequence of interest.

[0008] In one aspect, the application provides a method of testing and validating potential drug targets. In one aspect the application provides a method of creating a comprehensive database of related protein and/or nucleic acid sequences, i.e., the protein and nucleic acid sequences are included in the database based upon certain sequence information, structural and/or functional information. In one aspect, the application provides sequences that are sorted based upon sequence, structural, functional, and biological activity. The sequences may be further clustered based upon potential disease association; for example, the presence or absence of certain domains may be indicative of potential disease correlations of that protein or nucleic acid sequence. The database further comprises annotations indicating the relevant disease correlations.

[0009] The sequences so clustered may be tested for the potential associated disease correlations by means of biological assays. For example, if the associated disease is viral infection, a biological assay may be assaying for the release of virus like particles; if the disease is a proliferative disease, the biological assay may be determining the rate of proliferation of the diseased cells. In another aspect, the associated disease may be a ubiquitin-mediated disorder and the assay may determine an aspect of protein degradation, protein trafficking, or cellular localization of proteins. In other embodiments, the assay may be determining any disease characteristic of the associated disease by means of the biological assay.

[0010] In another aspect, the application provides methods of validating the disease associations by decreasing the expression of the sequence of interest and determining the effect of such a decrease by means of a biological assay. In one embodiment, if the associated disease is a viral infection, the effect of decreasing expression of the sequence of interest on the release of the virus like particles is determined. Thus, if decreasing the expression of the sequence of interest results in a decrease in the release of the virus like particles, the sequence may be a potential drug target for viral infection. Similarly, if decreasing the expression of the sequence of interest results in a decrease in the rate of proliferation of a diseased cell such as a tumor cell, the sequence may be a potential drug target for proliferative disorders. Thus, if decreasing the expression alters any
disease characteristic of the associated disease, the sequence may be a potential drug target for the associated disease.

[0011] In another embodiment, the application provides methods for validating the disease associations by increasing the expression of the sequence of interest. For example, if the sequence of interest is a tumor suppressor, increasing expression of the sequence may alter a disease characteristic of an associated disease. In other embodiments, the application provides additional drug targets such as the substrates of various enzymes such as the E3 proteins, wherein either increasing expression of the ligase or decreasing expression of its substrate may alter a disease characteristic of the associated disease. For example, the tumor suppressor Von Hippel-Lindau is associated with certain E3-associated diseases; increasing expression of the Von Hippel-Lindau gene or decreasing expression of its substrate would alter at least one disease characteristic of the E3-associated disease. Accordingly, in one aspect, the substrate may be a potential drug target for the E3-associated disease.

[0012] In one aspect, this invention provides a method of identifying a potential human E3 drug target comprising providing a database comprising human E3 nucleic acid or protein sequences. These sequences are sorted based on their structural and functional attributes, providing an E3-associated disease specific database. The potential involvement of E3's in disease is assessed by criteria which include the following:

[0013] 1. An E3 that might interact with proteins whose modification by ubiquitin and/or abnormal degradation are the cause for a disease/pathological condition.

[0014] 2. Potential E3's will be selected from E3's that contain specific structural domains and/or motifs that are likely to interact with specific domains/motifs on the interacting protein.

[0015] 3. An E3, the cellular localization of which suggests possible interaction with an interacting protein.

[0016] 4. Abnormal expression of an individual E3 that correlates with a disease/pathological condition.

[0017] 5. Abnormal activity (due to a mutation or abnormal regulation) of an E3 that is associated with a disease or a pathological condition.

[0018] Once the E3 sequences are sorted based upon either their structural attributes or their E3 disease-associations, this invention provides assays for measuring a disease characteristic of said E3-associated disease; for example, such disease characteristics include determining the release of viral like particles from infected cells or cells transfected with plasmids containing a nucleic acid sequence encoding for non-infectious viral DNA (e.g., HIV-VLP, VP40, etc.), determining the differential expression of said E3s in a normal cell in comparison to a cell exhibiting at least one symptom of an E3-associated disease, etc. Upon identifying a potential E3 target that is implicated in an E3-associated disease, the expression of said E3 is altered, i.e., either increased or decreased, to determine whether the change in expression results in a change in the output of the assay.

[0019] In another aspect, this invention provides a database comprising human E3 nucleic acid or protein sequences and determining the differential expression of said human E3 in a cell exhibiting disease characteristics of an E3-associated disease and a corresponding normal cell. The expression of said E3 is then altered to determine the effect of decreased E3 expression on said cell exhibiting disease characteristics of an E3-associated disease, wherein a change in said disease characteristics is indicative that said human E3 is a potential drug target for said E3-associated disease.

[0020] Identification of potential E3 drug targets provides a means of assaying for effective therapeutics.

BRIEF DESCRIPTION OF THE DRAWINGS

[0021] FIG. 1 is a flow-chart of a process for identifying human E3 proteins that may be involved in diseases or other biological processes of interest.

[0022] FIG. 2 is a flow-diagram illustrating creation of a database of human E3 proteins.

[0023] FIG. 3 provides an exemplary schematic representation of some of the E3-domains present in the E3 proteins.

[0024] FIG. 4 shows results from a screen to identify E3 proteins that are drug targets for the treatment of HIV and related viruses. A Virus-Like Particle (VLP) Assay was used. The figure shows viral proteins in the cellular fraction (top panel) and in released VLPs (bottom panel). The VLP assay was performed with a wild-type viral p6 protein and a mutant p6 protein as positive and negative controls, respectively. siRNA knockdowns of various mRNAs were tested for effects on VLP production. Knockdown of POSH resulted in complete or near-complete inhibition of VLP production.

[0025] FIG. 5 shows a pulse-chase VLP experiment comparing the kinetics of VLP production in normal (WT) VLP assay conditions and in a POSH knockdown (POSH+WT). siRNA knockdown of POSH results in complete or near-complete inhibition of VLP production.

DETAILED DESCRIPTION

[0026] Definitions

[0027] As used herein, the following terms and phrases shall have the meanings set forth below. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood to one of ordinary skill in the art to which this invention belongs.

[0028] The singular forms “a,” “an,” and “the” include plural reference unless the context clearly dictates otherwise.

[0029] The phrase “a corresponding normal cell of” or “normal cell corresponding to” or “normal counterpart cell of” a diseased cell refers to a normal cell of the same type as that of the diseased cell. For example, a corresponding normal cell of a B lymphoma cell is a B cell.

[0030] An “address” on an array, e.g., a microarray, refers to a location at which an element, e.g., an oligonucleotide, is attached to the solid surface of the array.

[0031] The term “antibody” as used herein is intended to include whole antibodies, e.g., of any isotype (IgG, IgA, IgM, IgE, etc.), and includes fragments thereof which are also specifically reactive with a vertebrate, e.g., mammalian,
protein. Antibodies can be fragmented using conventional techniques and the fragments screened for utility in the same manner as described above for whole antibodies. Thus, the term includes segments of proteolytically-cleaved or recombinantly-prepared portions of an antibody molecule that are capable of selectively reacting with a certain protein. Non-limiting examples of such proteolytic and/or recombinant fragments include Fab, F(ab')2, Fab', Fv, and single chain antibodies (scFv) containing a VL and/or VH domain joined by a peptide linker. The scFv's may be covalently or non-covalently linked to form antibodies having two or more binding sites. The subject invention includes polyclonal, monoclonal, or other purified preparations of antibodies and recombinant antibodies.

[0032] By “array” or “matrix” is meant an arrangement of addressable locations or “addresses” on a device. The locations can be arranged in two dimensional arrays, three dimensional arrays, or other matrix formats. The number of locations can range from several to at least hundreds of thousands. Most importantly, each location represents a totally independent reaction site. A “nucleic acid array” refers to an array containing nucleic acid probes, such as oligonucleotides or larger portions of genes. The nucleic acid on the array is preferably single stranded. Arrays wherein the probes are oligonucleotides are referred to as “oligonucleotide arrays” or “oligonucleotide chips.” A “microarray,” also referred to herein as a “biochip” or “biological chip,” is an array of regions having a density of discrete regions of at least about 100/cm², and preferably at least about 1000/cm². The regions in a microarray have typical dimensions, e.g., diameters, in the range of between about 10-250 µm, and are separated from other regions in the array by about the same distance.

[0033] The term “associated disease” as used herein refers to a disease that is correlated to a certain nucleic acid or protein sequence because of the presence or absence of certain sequence information, structural or functional information, and/or biological activity of that nucleic acid or protein sequence.

[0034] The term “biological sample”, as used herein, refers to a sample obtained from an organism or from components (e.g., cells) of an organism. The sample may be of any biological tissue or fluid. Frequently the sample will be a “clinical sample” which is a sample derived from a patient. Such samples include, but are not limited to, sputum, blood, blood cells (e.g., white cells), tissue or fine needle biopsy samples, urine, peritoneal fluid, and pleural fluid, or cells therefrom. Biological samples may also include sections of tissues such as frozen sections taken for histological purposes.

[0035] The term “biomarker” of a disease refers to a gene which is up- or down-regulated in a diseased cell of a subject having the disease relative to a counterpart normal cell, which gene is sufficiently specific to the diseased cell that it can be used, optionally with other genes, to identify or detect the disease. Generally, a biomarker is a gene that is characteristic of the disease.

[0036] A nucleotide sequence is “complementary” to another nucleotide sequence if each of the bases of the two sequences match, i.e., are capable of forming Watson-Crick base pairs. The term “complementary strand” is used herein interchangeably with the term “complement.” The complement of a nucleic acid strand can be the complement of a coding strand or the complement of a non-coding strand.

[0037] The phrases “conserved residue” or “conservative amino acid substitution” refer to grouping of amino acids on the basis of certain common properties. A functional way to define common properties between individual amino acids is to analyze the normalized frequencies of amino acid changes between corresponding proteins of homologous organisms (Schulz, G. E. and R. H. Schirmer, Principles of Protein Structure, Springer-Verlag). According to such analyses, groups of amino acids may be defined where amino acids within a group exchange preferentially with each other, and therefore resemble each other most in their impact on the overall protein structure (Schulz, G. E. and R. H. Schirmer, Principles of Protein Structure, Springer-Verlag). Examples of amino acid groups defined in this manner include:

[0038] (i) a charged group, consisting of Glu and Asp, Lys, Arg and His,

[0039] (ii) a positively-charged group, consisting of Lys, Arg and His,

[0040] (iii) a negatively-charged group, consisting of Glu and Asp,

[0041] (iv) an aromatic group, consisting of Phe, Tyr and Trp,

[0042] (v) a nitrogen ring group, consisting of His and Trp,

[0043] (vi) a large aliphatic nonpolar group, consisting of Val, Leu and Ile,

[0044] (vii) a slightly-polar group, consisting of Met and Cys,

[0045] (viii) a small-residue group, consisting of Ser, Thr, Asp, Asn, Gly, Ala, Glu, Gln and Pro,

[0046] (ix) an aliphatic group consisting of Val, Leu, Ile, Met and Cys, and

[0047] (x) a small hydroxyl group consisting of Ser and Thr.

[0048] In addition to the groups presented above, each amino acid residue may form its own group, and the group formed by an individual amino acid may be referred to simply by the one and/or three letter abbreviation for that amino acid commonly used in the art.

[0049] The term “derivative” refers to the chemical modification of a polypeptide sequence, or a polynucleotide sequence. Chemical modifications of a polynucleotide sequence can include, for example, replacement of hydrogen by an alkyl, acyl, or amino group. A derivative polynucleotide encodes a polypeptide which retains at least one biological or immunological function of the natural molecule. A derivative polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least one biological or immunological function of the polypeptide from which it was derived.

[0050] “Differential gene expression pattern” between cell A and cell B refers to a pattern reflecting the differences in gene expression between cell A and cell B. A differential gene expression pattern can also be obtained between a cell at one time point and a cell at another time point, or between
a cell incubated or contacted with a compound and a cell that was not incubated with or contacted with the compound.

[0051] The term “domain” as used herein refers to a region within a protein that comprises a particular structure or function different from that of other sections of the molecule.

[0052] A “HECT domain” or “HECT” is a protein domain, also known as the “HECTC” domain, involved in E3 ubiquitin ligase activity. Certain HECT domains are 100-400 amino acids in length and comprise an amino acid sequence as set forth in the following consensus sequence (amino acid nomenclature is as set forth in Table 1):

[0053] Pro Xaa3 Thr Cys Xaa2-4 Leu Xaa Leu Pro Xaa Tyr (SEQ ID NO. 1).

[0054] E3 as used herein refers to a nucleic acid or encoded protein that is involved with substrate recognition in ubiquitin-mediated proteolysis, in membrane trafficking and protein sorting. Ubiquitin-mediated proteolysis is the major pathway for the selective, controlled degradation of intracellular proteins in eukaryotic cells. E3 proteins include one or more of the following exemplary domains and/or motifs:

[0055] HECT, RING, F-BOX, U-BOX, PHD, etc.

[0056] “E3-associated Disease” refers to any disease wherein: (1) an E3 interacts with interacting proteins whose modification by ubiquitin and/or abnormal degradation are the cause for a disease/pathological condition; (2) an E3 protein is implicated in interacting with specific domains/motifs, such as a domain of an interacting protein such as the late domain of a viral protein, thereby resulting in viral infectivity; (3) an E3, the cellular localization of which suggests possible interaction with an Interacting protein that may cause a disease or pathological condition; (4) differential expression of an E3 gene and/or protein correlates with a disease/pathological condition; and (5) aberrant activity (due to a mutation or abnormal regulation) of an E3 that is associated with a disease or a pathological condition. Exemplary E3-associated diseases include but are not limited to viral infections, preferably retroviral infections such as HIV, Ebola, CMV, etc., various cancers such as breast, lung, renal carcinoma, etc., cystic fibrosis, and certain diseases of the CNS such as autosomal recessive juvenile parkinsonism.

[0057] A “disease characteristic” as used herein refers to any one or more of the following: any phenotype that is distinctive of a disease state or any artificial phenotype that is a proxy for a phenotype that is distinctive of a disease state, or that distinguishes a diseased cell from a normal cell.

[0058] “A diseased cell of an associated disease” refers to a cell present in subjects having an associated disease D, which cell is a modified form of a normal cell and is not present in a subject not having disease D, or which cell is present in significantly higher or lower numbers in subjects having disease D relative to subjects not having disease D. For example, a diseased cell may be a cancerous cell.

[0059] “A diseased cell of an E3-associated disease” refers to a cell present in subjects having an E3-associated disease D', which cell is a modified form of a normal cell and is not present in a subject not having disease D', or which cell is present in significantly higher or lower numbers in subjects having disease D' relative to subjects not having disease D'. For example, a diseased cell may be a cell infected with a virus or a cancerous cell.

[0060] The term “drug target” refers to any gene or gene product (e.g., RNA or polypeptide) with implications in an associated disease or disorder. Examples include various proteins such as enzymes, oncogenes and their polypeptide products, and cell cycle regulatory genes and their polypeptide products. In one aspect, the drug target may be an E3.

[0061] The term “expression profile,” which is used interchangeably herein with “gene expression profile” and “finger print” of a cell, refers to a set of values representing mRNA levels of 20 or more genes in a cell. An expression profile preferably comprises values representing expression levels of at least about 30 genes, preferably at least about 50, 100, 200 or more genes. Expression profiles preferably comprise an mRNA level of a gene which is expressed at similar levels in multiple cells and conditions, e.g., GAPDH. For example, an expression profile of a diseased cell of an E3-associated disease D' refers to a set of values representing mRNA levels of 20 or more genes in a diseased cell.

[0062] The term “heterozygote,” as used herein, refers to an individual with different alleles at corresponding loci on homologous chromosomes. Accordingly, the term “heterozygous,” as used herein, describes an individual or strain having different allelic genes at one or more paired loci on homologous chromosomes.

[0063] The term “homozygote,” as used herein, refers to an individual with the same allele at corresponding loci on homologous chromosomes. Accordingly, the term “homozygous,” as used herein, describes an individual or a strain having identical allelic genes at one or more paired loci on homologous chromosomes.

[0064] “Hybridization” refers to any process by which a strand of nucleic acid binds with a complementary strand through base pairing. Two single-stranded nucleic acids “hybridize” when they form a double-stranded duplex. The region of double-strandedness can include the full-length of one or both of the single-stranded nucleic acids, or all of one single stranded nucleic acid and a subsequence of the other single stranded nucleic acid, or the region of double-strandedness can include a subsequence of each nucleic acid. Hybridization also includes the formation of duplexes which contain certain mismatches, provided that the two strands are still forming a double stranded helix. “Stringent hybridization conditions” refers to hybridization conditions resulting in essentially specific hybridization.

[0065] The term “interact” as used herein is meant to include detectable relationships or association (e.g., biochemical interactions) between molecules, such as interaction between protein-protein, protein-nucleic acid, nucleic acid-nucleic acid, and protein-small molecule or nucleic acid-small molecule in nature.

[0066] The term “Interacting Protein” refers to protein capable of interacting, binding, and/or otherwise associating to a protein of interest, such as for example a human E3 protein. Examples of these proteins include for example the “Late domain” or “L domain”, which is a small portion of a Gag protein that promotes efficient release of virion particles from the membrane of the host cell. L domains typically comprise one or more short motifs (L motifs). Exemplary sequences include: PTAPPEE, PTAPPEY, P(T/S)AP, PxxL, PPxY (e.g., PPPY), YxxL (e.g., YPDL), PxxP.

[0067] The term “isolated” as used herein with respect to nucleic acids, such as DNA or RNA, refers to molecules separated from other DNAs, or RNAs, respectively, that are present in the natural source of the macromolecule. The term isolated as used herein also refers to a nucleic acid or peptide that is substantially free of cellular material, viral material, or culture medium when produced by recombinant DNA techniques, or chemical precursors or other chemicals when chemically synthesized. Moreover, an “isolated nucleic acid” is meant to include nucleic acid fragments which are not naturally occurring as fragments and would not be found in the natural state. The term “isolated” is also used herein to refer to polypeptides which are isolated from other cellular proteins and is meant to encompass both purified and recombinant polypeptides.

[0068] As used herein, the terms “label” and “detectable label” refer to a molecule capable of detection, including, but not limited to, radioactive isotopes, fluorophores, chemiluminescent moieties, enzymes, enzyme substrates, enzyme cofactors, enzyme inhibitors, dyes, metal ions, ligands (e.g., biotin or haptens) and the like. The term “fluorescer” refers to a substance or a portion thereof which is capable of exhibiting fluorescence in the detectable range. Particular examples of labels which may be used under the invention include fluorescein, rhodamine, dansyl, umbelliferone, Texas red, luminol, NADPH, alpha-beta-galactosidase and horseradish peroxidase.

[0069] The “level of expression of a gene in a cell” refers to the level of mRNA, as well as pre-mRNA nascent transcript(s), transcript processing intermediates, mature mRNA(s) and degradation products, encoded by the gene in the cell.

[0070] The phrase “normalizing expression of a gene” in a diseased cell refers to a means for compensating for the altered expression of the gene in the diseased cell, so that it is essentially expressed at the same level as in the corresponding non-diseased cell. For example, where the gene is over-expressed in the diseased cell, normalization of its expression in the diseased cell refers to treating the diseased cell in such a way that its expression becomes essentially the same as the expression in the counterpart normal cell. “Normalization” preferably brings the level of expression to within approximately a 50% difference in expression, more preferably to within approximately a 25%, and even more preferably 10% difference in expression. The required level of closeness in expression will depend on the particular gene, and can be determined as described herein.

[0071] The phrase “normalizing gene expression in a diseased cell” refers to a means for normalizing the expression of essentially all genes in the diseased cell.

[0072] As used herein, the term “nucleic acid” refers to polynucleotides such as deoxyribonucleic acid (DNA), and, where appropriate, ribonucleic acid (RNA). The term should also be understood to include, as equivalents, analogs of either RNA or DNA made from nucleotide analogs, and, as applicable to the embodiment being described, single (sense or antisense) and double-stranded polynucleotides. ESTs, chromosomes, cDNAs, mRNAs, and rRNAs are representative examples of molecules that may be referred to as nucleic acids.

[0073] The term “percent identical” refers to sequence identity between two amino acid sequences or between two nucleotide sequences. Identity can each be determined by comparing a position in each sequence which may be aligned for purposes of comparison. When an equivalent position in the compared sequences is occupied by the same base or amino acid, then the molecules are identical at that position; when the equivalent site is occupied by the same or a similar amino acid residue (e.g., similar in steric and/or electronic nature), then the molecules can be referred to as homologous (similar) at that position. Expression as a percentage of homology, similarity, or identity refers to a function of the number of identical or similar amino acids at positions shared by the compared sequences. Various alignment algorithms and/or programs may be used, including Hidden Markov Model (HMM), FASTA and BLAST. HMM, FASTA and BLAST are available through the National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Md. and the European Bioinformatic Institute EBI. In one embodiment, the percent identity of two sequences can be determined by these GCG programs with a gap weight of 1, e.g., each amino acid gap is weighted as if it were a single amino acid or nucleotide mismatch between the two sequences. Other techniques for alignment are described in Methods in Enzymology, vol. 266: Computer Methods for Macromolecular Sequence Analysis (1996), ed. Doolittle, Academic Press, Inc., a division of Harcourt Brace & Co., San Diego, Calif., USA. Preferably, an alignment program that permits gaps in the sequence is utilized to align the sequences. The Smith-Waterman is one type of algorithm that permits gaps in sequence alignments. See Meth. Mol. Biol. 70: 173-187 (1997). Also, the GAP program using the Needleman and Wunsch alignment method can be utilized to align sequences. More techniques and algorithms including use of the HMM are described in Sequence, Structure, and Databanks: A Practical Approach (2000), ed. Oxford University Press, Incorporated, and in Bioinformatics: Databases and Systems (1999), ed. Kluwer Academic Publishers. An alternative search strategy uses MPSRCH software, which runs on a MASPAR computer. MPSRCH uses a Smith-Waterman algorithm to score sequences on a massively parallel computer. This approach improves ability to pick up distantly related matches, and is especially tolerant of small gaps and nucleotide sequence errors. Nucleic acid-encoded amino acid sequences can be used to search both protein and DNA databases. Databases with individual sequences are described in Methods in Enzymology, ed. Doolittle, supra. Databases include Genbank, EMBL, and DNA Database of Japan (DDBJ).

[0074] “Perfectly matched” in reference to a duplex means that the poly- or oligonucleotide strands making up the duplex form a double stranded structure with one another such that every nucleotide in each strand undergoes Watson-Crick basepairing with a nucleotide in the other strand. The term also comprehends the pairing of nucleoside analogs, such as deoxyinosine, nucleosides with 2-aminopurine bases, and the like, that may be employed. A mismatch in a duplex between a target polynucleotide and an oligonucleotide or polynucleotide means that a pair of nucleotides in the duplex fails to undergo Watson-Crick bonding. In reference to a triplex, the term means that the triplex consists of a perfectly matched duplex and a third strand in which every nucleotide undergoes Hoogsteen or reverse Hoogsteen association with a basepair of the perfectly matched duplex.

[0075] As used herein, a nucleic acid or other molecule attached to an array is referred to as a “probe” or “capture probe.” When an array contains several probes corresponding to one gene, these probes are referred to as a “gene-probe set.” A gene-probe set can consist of, e.g., 2 to 10 probes, preferably from 2 to 5 probes and most preferably about 5 probes.

[0076] The “profile” of a cell's biological state refers to the levels of various constituents of a cell that are known to change in response to drug treatments and other perturbations of the cell's biological state. Constituents of a cell ...

... form a complex that has ubiquitin ligase activity. RING domains preferably interact with at least one of the following protein types: F box proteins, E2 ubiquitin conjugating enzymes and cullins.

[0082] The terms “RNA interference”, “RNAi” and “siRNA” all refer to any method by which expression of a gene or gene product is decreased by introducing into a target cell one or more double-stranded RNAs which are homologous to the gene of interest (particularly to the messenger RNA of the gene of interest).

[0083] As used herein, the term “transfection” means the introduction of a nucleic acid, e.g., via an expression vector, into a recipient cell by nucleic acid-mediated gene transfer.
include levels of RNA, levels of protein abundances, or “Transformation', as used herein, refers to a proceSS in protein activity levels. which a cell's genotype is changed as a result of the cellular uptake of exogenous DNA or RNA, and, for example, the 0077. The term “protein' is used interchangeably herein transformed cell expresses a recombinant form of a polypep with the terms “peptide” and “polypeptide.” tide or, in the case of anti-Sense expression from the trans 0078. An expression profile in one cell is “similar to an ferred gene, the expression of a naturally-occurring form of expression profile in another cell when the level of expres the polypeptide is disrupted. Sion of the genes in the two profiles are Sufficiently similar 0084. As used herein, the term “transgene” means a that the Similarity is indicative of a common characteristic, nucleic acid Sequence (encoding, e.g., one of the target e.g., being one and the same type of cell. Accordingly, the nucleic acids, or an antisense transcript thereto) which has expression profiles of a first cell and a Second cell are similar been introduced into a cell. A transgene could be partly or when at least 75% of the genes that are expressed in the first entirely heterologous, i.e., foreign, to the transgenic animal cell are expressed in the Second cellat a level that is within or cell into which it is introduced, or, is homologous to an a factor of two relative to the first cell. endogenous gene of the transgenic animal or cell into which 0079 An “RCC1 domain” is a domain that interacts with it is introduced, but which is designed to be inserted, or is Small GTPases to promote loss of GDP and binding of GTP. inserted, into the animal's genome in Such a way as to alter Certain RCC1 domains are about 50-60 amino acids in the genome of the cell into which it is inserted (e.g., it is length. Often RCC1 domains are found in a series of repeats. 
inserted at a location which differs from that of the natural The first RCC1 domain was identified in a protein called gene or its insertion results in a knockout). A transgene can “Regulator of Condensation” (RCC1), which also be present in a cell in the form of an episome. A interacts with the Small GTPase Ran. In the RCC1 protein, transgene can include one or more transcriptional regulatory a series of seven tandem repeats of a domain of about 50-60 Sequences and any other nucleic acid, Such as introns, that amino acids fold to form a beta-propeller structure (Renault may be necessary for optimal expression of a Selected et al. Nature 1998392:9-101). RCC1 domains are known to nucleic acid. interact with other types of Small GTPases including mem 0085. The term “treating” a disease in a subject or bers of the Arf, Rab, Rac and Rho families. “treating a Subject having a disease refers to Subjecting the 0080. The term “recombinant protein” refers to a protein Subject to a pharmaceutical treatment, e.g., the administra of the present invention which is produced by recombinant tion of a drug, Such that at least one Symptom of the disease DNA techniques, wherein generally DNA encoding the is decreased. expressed protein is inserted into a Suitable expression 0086 The term “Ubiquitin-mediated disorder” as used vector which is in turn used to transform a host cell to herein refers to a disorder resulting from an abnormal produce the heterologous protein. Moreover, the phrase Ubiquitin-mediated cellular proceSS Such as for example "derived from', with respect to a recombinant gene encod ubiquitin-mediated degradation, protein trafficking, and or ing the recombinant protein is meant to include within the protein Sorting. meaning of “recombinant protein’ those proteins having an 0087. 
The term “Unigene” or “unigene cluster” refers to amino acid Sequence of a native protein, or an amino acid an experimental System for automatically partitioning Gen Sequence Similar thereto which is generated by mutations bank Sequences into a non-redundant Set of Unigene clus including Substitutions and deletions of a naturally occurring ters. Each Unigene cluster contains Sequences that represent protein. a unique gene, as well as related information Such as the 0081. A “RING domain”, “Ring Finger” or “RING” is a tissue types in which the gene has been expressed and map zinc-binding domain also known as “ZF-C2HC4' with a location. In addition, to well characterized genes, EST defined Octet of cysteine and histidine residues. Certain Sequences are also included in these clusters. Such clusters RING domains comprise the consensus Sequences as Set may be downloaded from ftp://ncbi.nlm.nih.gov/repository/ forth below (amino acid nomenclature is as set forth in Table Unigene/. 1): CyS Xaa Xaa CyS Xaa-. CyS Xaa His Xaas CyS Xaa 0088. The phrase “value representing the level of expres Xaa Cys Xaalso Cys Xaa Xaa Cys (SEQ ID NO: 2) or Cys sion of a gene” refers to a raw number which reflects the Xaa Xaa CyS Xaaoo CyS Xaa His Xaas His Xaa Xaa CyS mRNA level of a particular gene in a cell or biological Xaas Cys XaaXaa Cys (SEQID NO:3). Preferred RING Sample, e.g., obtained from experiments for measuring RNA domains of the invention bind to various protein partners to levels. US 2003/O194725 A1 Oct. 16, 2003
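The numeric criterion for "similar" expression profiles given in the definitions above (at least 75% of the genes expressed in the first cell are expressed in the second cell at a level within a factor of two) can be illustrated with a short Python sketch. The function name, the dictionary layout (gene name mapped to an expression value), and the convention that a missing entry or a value of zero means "not expressed" are assumptions made for illustration only; none of them comes from the application.

```python
def profiles_similar(first, second, min_fraction=0.75, max_factor=2.0):
    """Return True when two profiles meet the 75% / factor-of-two criterion.

    `first` and `second` map gene names to non-negative expression values.
    Assumption: a gene counts as "expressed" in a cell when its value is > 0.
    """
    expressed = [gene for gene, level in first.items() if level > 0]
    if not expressed:
        return False  # no expressed genes in the first cell to compare against
    within = 0
    for gene in expressed:
        other = second.get(gene, 0.0)
        if other <= 0:
            continue  # not expressed in the second cell
        ratio = other / first[gene]
        if 1.0 / max_factor <= ratio <= max_factor:
            within += 1
    return within / len(expressed) >= min_fraction
```

The thresholds are exposed as parameters because the definition states them as a minimum ("at least 75%"); a stricter comparison only needs different arguments.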

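As a worked illustration of the percent-identity definition given earlier in this section, the following Python sketch scores two sequences that have already been aligned, counting each gap position as a single mismatch (one reading of the "gap weight of 1"). It is not the GCG implementation. The function name, the use of "-" as the gap character, and the choice of the number of compared alignment columns as the denominator are all assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumptions: `gap` marks a gap, every gap position counts as one
    mismatch, and columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared
```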
[0089] A "variant" of polypeptide X refers to a polypeptide having the amino acid sequence of peptide X that is altered in one or more amino acid residues. The variant may have "conservative" changes, wherein a substituted amino acid has similar structural or chemical properties (e.g., replacement of leucine with isoleucine). More rarely, a variant may have "nonconservative" changes (e.g., replacement of glycine with tryptophan). Analogous minor variations may also include amino acid deletions or insertions, or both. Guidance in determining which amino acid residues may be substituted, inserted, or deleted without abolishing biological or immunological activity may be found using computer programs well known in the art, for example, LASERGENE software (DNASTAR).

[0090] The term "variant," when used in the context of a polynucleotide sequence, may encompass a polynucleotide sequence related to that of gene X or the coding sequence thereof. This definition may also include, for example, "allelic," "splice," "species," or "polymorphic" variants. A splice variant may have significant identity to a reference molecule, but will generally have a greater or lesser number of polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding polypeptide may possess additional functional domains or an absence of domains. Species variants are polynucleotide sequences that vary from one species to another. The resulting polypeptides generally will have significant amino acid identity relative to each other. A polymorphic variant is a variation in the polynucleotide sequence of a particular gene between individuals of a given species. Polymorphic variants also may encompass "single nucleotide polymorphisms" (SNPs) in which the polynucleotide sequence varies by one base. The presence of SNPs may be indicative of, for example, a certain population, a disease state, or a propensity for a disease state.

[0091] A "WW Domain" is a small functional domain found in a large number of proteins from a variety of species including humans, nematodes, and yeast. WW domains are approximately 30 to 40 amino acids in length. Certain WW domains may be defined by the following consensus sequence (Andre and Springael, 1994, Biochem. Biophys. Res. Comm. 205:1201-1205) (amino acid nomenclature is as set forth in Table 1): Trp Xaa(…) Gly Xaa(…) X4 X4 Xaa(…) X1 X8 Trp Xaa(…) Pro (SEQ ID NO: 4). In certain instances a WW domain will be flanked by stretches of amino acids rich in histidine or cysteine. In some cases, the amino acids in the center of WW domains are quite hydrophobic. Preferred WW domains bind to the L domains of retroviral Gag proteins. Particularly preferred WW domains bind to an amino acid sequence of ProProXaaTyr (SEQ ID NO: 5).

TABLE 1
Abbreviations for classes of amino acids

Symbol   Category               Amino Acids Represented
X1       Alcohol                Ser, Thr
X2       Aliphatic              Ile, Leu, Val
Xaa      Any                    Ala, Cys, Asp, Glu, Phe, Gly, His, Ile, Lys, Leu, Met, Asn, Pro, Gln, Arg, Ser, Thr, Val, Trp, Tyr
X4       Aromatic               Phe, His, Trp, Tyr
X5       Charged                Asp, Glu, His, Lys, Arg
X6       Hydrophobic            Ala, Cys, Phe, Gly, His, Ile, Lys, Leu, Met, Thr, Val, Trp, Tyr
X7       Negative               Asp, Glu
X8       Polar                  Cys, Asp, Glu, His, Lys, Asn, Gln, Arg, Ser, Thr
X9       Positive               His, Lys, Arg
X10      Small                  Ala, Cys, Asp, Gly, Asn, Pro, Ser, Thr, Val
X11      Tiny                   Ala, Gly, Ser
X12      Turnlike               Ala, Cys, Asp, Glu, Gly, His, Lys, Asn, Gln, Arg, Ser, Thr
X13      Asparagine-Aspartate   Asn, Asp

*Abbreviations as adopted from http://smart.embl-heidelberg.de/SMART_DATA/alignments/consensus/grouping.html.

[0092] Creating a Database

[0093] In one aspect the application provides a method of creating a comprehensive database of related protein and/or nucleic acids, i.e., the protein and nucleic acid sequences are included in the database based upon certain sequence information, structural and/or functional information. In one aspect, the application provides sequences that are sorted based upon sequence, structural, functional, and biological activity. The sequences may be further clustered based upon potential disease association; for example, the presence or absence of certain domains may be indicative of potential disease correlations of that protein or nucleic acid sequence. The database further comprises annotations indicating the relevant disease correlations. In an illustrative example, the application provides a method for creating an E3 database.

[0094] FIG. 1 illustrates a process 100 that identifies human E3 proteins and/or nucleic acid sequences that may be involved in diseases or other biological processes of interest. As shown, the process operates on data describing human protein or nucleic acid sequences. Such data may be downloaded 102 from a variety of sources such as the publicly available NCBI (National Center for Biotechnology Information) or Swiss Prot databases or from proprietary databases such as, for example, the databases owned by Incyte Inc. or Celera Inc. Publicly available databases include, for example, the NCBI database of human protein sequences on the World Wide Web at http://www.ncbi.nlm.nih.gov/Entrez/batch.html and the EBI.

[0095] As shown, the process 100 may clean 104 the sequences to identify human protein sequences. For example, the process 100 may eliminate redundant sequence information. The process 100 may also eliminate sequence portions based on the polypeptide length. For instance, the process 100 may eliminate polypeptides less than some specified length of amino acids (e.g., 10 or 20) or between a range of lengths (e.g., 25-30).

[0096] The process 100 then identifies 106 which sequences correspond to human E3 protein sequences. For example, the process 100 may determine whether a particular sequence exhibits one or more domains associated with E3 proteins. A domain is a recurring sequence pattern or motif. Generally, these domains have a distinct evolutionary origin and function. In particular, the human E3 proteins can include HECT, Ubox, RING, PHD, and/or fbox domains. Based on either the domains present or other characteristics, the process 100 can associate 108 a disease or other biological activity with the E3 proteins. The E3 proteins are identified as having at least a HECT, RING, Ubox, Fbox, ZN3 or PHD domain. In certain embodiments the E3 proteins are identified as having at least a HECT or RING domain.

[0097] FIG. 2 illustrates a sample implementation 200 of this process in greater detail. As shown, the implementation 200 includes a database 202 of sequence data. Again, the database 202 may be assembled or downloaded from a variety of sources such as the National Institute of Health's (NIH) databases or the EBI human genome databases. Instead of, or in addition to, protein sequences, the database 202 may also include nucleotide and/or gene sequences associated with particular proteins. The database 202 may also include sequence annotations.

[0098] Sequence analysis software 204 can identify E3 characteristics 206 indicated by the sequences. Such characteristics 206 can include domains and motifs such as RING, HECT, Ubox, Fbox, PHD domains or the PTA/SP motif. For example, the software can search for consensus sequences of particular domains/motifs. The consensus sequences for some of these exemplary motifs are set forth in the definition section provided above.

[0099] The sequence analysis software 204 discussed above may include a number of different tools. One example is the CD-Search service provided by NCBI. This service provides a useful method of identifying conserved domains that might be present in a protein sequence. The CDD (conserved domain database) contains domains derived from two collections, Smart and Pfam. In particular, Smart (Simple Modular Architecture Research Tool) is a web based tool for studying such domains (http://SMART.embl-heidelberg.de). It includes more than 400 domain families found in signaling, extracellular, and chromatin-associated proteins. These domains are extensively annotated with respect to phyletic distributions, functional class, tertiary structures, and functionally important residues. Similarly, Pfam (http://pfam.wustl.edu) is a large collection of multiple sequence alignments and hidden Markov models covering common protein domains. As of August 2001, Pfam contains alignments and models for 3071 protein families.

[0100] The sequence analysis software 204 may be independently developed. Alternatively, public software may be used. For example, the process may use the Reverse Position-Specific (RPS) Blast (Basic Local Alignment Search Tool) tool. In this algorithm, a query sequence is compared to a position-specific score matrix prepared from the underlying conserved domain alignment. Hits are displayed as a pair-wise alignment of the query sequence with a representative domain sequence, or as a multiple alignment.

[0101] The characteristics 206 may also include unigene clusters. Each human E3 protein is then compared to the downloaded clusters to determine the particular cluster that it belongs to. Once the E3 protein has been matched to a cluster we determine what other proteins belong to this cluster and introduce these into the E3 database.

[0102] As shown, analysis 204 of the sequence data 202 yields a comprehensive list of E3 proteins and other related proteins 210. Such information may be organized in a database 208 such as a relational database. The database 208 may also store characteristics 212 of the different proteins such as the presence or absence of domains such as WW, RCC1, C2, Cue, SH3, SH2, and even Ubox, fbox, RING, HECT and PHD themselves. Based on these characteristics 212, software can associate the protein 210 with a disorder, disease, or other biological activity. For example, the software may access a database 216 associating different protein characteristics 218 with different biological activities 220. Needless to say, the database 208 may be constantly updated to include either new proteins 210, or other associated characteristics 212 and biological activity 220.

[0103] As can be seen from this discussion, databases comprising related sequences may be created by sorting the protein and nucleic acid sequences based on structural, functional and biological activity. As such, the related sequences may be examined for particular domains or motifs and then further clustered based on potential correlations with various associated diseases.

[0104] Biological Assays

[0105] In one aspect, the application provides methods for determining or testing whether a particular sequence may be correlated to an associated disease. In one embodiment, this application provides a means for determining whether a particular gene or encoded protein, such as an E3 gene or the encoded human E3 protein, is involved in a disease or other biological process of interest. In one aspect, the application provides functional biological assays for correlating protein and nucleic acid sequences with associated diseases or pathological conditions.

[0106] The potential involvement of a protein such as a human E3 protein in a disease or biological process of interest may be assessed using a number of methods that are known to the skilled artisan. Some exemplary methods for assessing disease correlations or the involvement of proteins in a biological process of interest include:

[0107] I. Interaction of the proteins such as the human E3 proteins with specific domains or motifs of an Interacting Protein. It is believed that in the course of normal activities the E3 proteins will be free in the cytoplasm or associated with an intracellular organelle, such as the nucleus, the Golgi network, etc. However, during a viral infection, it is possible that certain host proteins, such as certain E3 proteins, may be recruited to the cell membrane to participate in viral maturation, including ubiquitination and membrane fusion. For example, the human E3 proteins containing a HECT domain, a RING domain, and a WW or SH3 domain interact with the viral proteins such as the gag protein. In one aspect, the WW domain of the E3 proteins interacts with the late domain of the gag protein having the consensus sequence PXXY. Therefore, E3 proteins having such domains may mediate the ubiquitination of gag to facilitate viral maturation, and as such may be potential drug targets for treating viral infections, such as retroviral infections.

[0108] In a further aspect the application provides diagnostic assays for determining whether a cell is infected with a virus and for characterizing the nature, progression and/or infectivity of the infection. As a result, the detection of an E3 protein associated with the plasma membrane fraction may be indicative of a viral infection. Additionally, the presence of E3 proteins at the plasma membrane may also suggest that the infective virus is in the process of reproducing and is therefore actively engaged in infective or lytic activity (versus a lysogenic or otherwise dormant activity).

[0109] A number of assays may be useful in studying the potential interaction of human host proteins with viral interacting proteins. For example, such an assay could involve the detection of virus like particles from cells transfected with a virus or cells infected with a virus, such as a retrovirus.

[0110] Association of the proteins of the invention, such as the E3 proteins, with the plasma membrane may be detected using a variety of techniques known in the art. For example, membrane preparations may be prepared by breaking open the cells (via sonication or detergent lysis) and then separating the membrane components from the cytosolic fraction via centrifugation. Segregation of proteins into the membrane fraction can be detected with antibodies specific for the protein of interest using western blot analysis or ELISA techniques. Plasma membranes may be separated from intracellular membranes on the basis of density using density gradient centrifugation. Alternatively, plasma membranes may be obtained by chemically or enzymatically modifying the surface of the cell and affinity purifying the plasma membrane by selectively binding the modifications. An exemplary modification includes non-specific biotinylation of proteins at the cell surface. Plasma membranes may also be selected for by affinity purifying for abundant plasma membrane proteins.

[0111] Transmembrane proteins, such as the E3 proteins containing an extracellular domain, can be detected using FACS analysis. For FACS analysis, whole cells are incubated with a fluorescently labeled antibody (e.g., an FITC labelled antibody) capable of recognizing the extracellular domain of the protein of interest. The level of fluorescent staining of the cells may then be determined by FACS analyses (see e.g., Weiss and Stobo, (1984) J. Exp. Med., 160:1284-1299). Such proteins are expected to reside on intracellular membranes in uninfected cells and the plasma membrane in infected cells. FACS analysis would fail to detect an extracellular domain unless the protein is present at the plasma membrane.

[0112] Localization of the proteins of interest, such as for example the E3 proteins of the invention, may also be determined using histochemical techniques. For example, cells may be fixed and stained with a fluorescently labeled antibody specific for the protein of interest. The stained cells may then be examined under the microscope to determine the subcellular localization of the antibody bound proteins.

[0113] II. Potential drug target proteins may also be identified on the basis of an interaction with an interacting protein that may be modified by ubiquitin or may undergo abnormal degradation in disease cells, in comparison with normal cells. For example, it is expected that a number of diseases are related to abnormal protein folding and/or protein aggregate formation. In these cases, the abnormally processed protein may be identified, and a drug target such as an E3 drug target may be identified on the basis of an interaction therewith. Interactions may be identified bioinformatically, using, for example, proteome interaction databases that are generated in a variety of ways (high throughput immunoprecipitations, high throughput two-hybrid analysis, etc.). Various databases include information culled from the literature relating to protein function, and such information may also be used to identify drug target E3s that interact with an abnormally processed protein. Interactions may also be determined de novo, using techniques such as those mentioned above. Once a potential drug target such as an E3 is identified, a number of assays may be used for testing its biological effects.

[0114] In one example, the abnormally ubiquitinated, degraded or aggregated protein is monitored for ubiquitination, degradation or aggregation in response to a manipulation in activity of the candidate drug target. For example, ubiquitination has been implicated in the turnover of the tumor suppressor protein, p53, and other cell cycle regulators such as cyclin A and cyclin B, the kinase c-mos, and various transcription factors such as c-jun, c-fos, and IκB/NF-κB. Altering the half-lives of these cellular proteins is expected to have great therapeutic potential, particularly in the areas of autoimmune disease, inflammation, cancer, as well as other proliferative disorders. Rolfe, M., et al., The Ubiquitin-Mediated Proteolytic Pathway as a Therapeutic Area, J. Mol. Med., 75:5 (1997). Many assays described herein and, in view of this application, known to one of skill in the art may be used to test the biological effects of the potential drug target such as the E3s.

[0115] III. Potential drug target proteins such as the E3 proteins may be selected on the basis of cellular localization. In a variety of disease states, a cellular dysfunction can be traced to one or more cellular compartments. A protein such as an E3 that localizes to that compartment may be implicated in the disease, particularly where a dysfunctional protein appears to interact with the ubiquitination system. For example, Cystic Fibrosis is an inherited disorder that is linked to reduced surface expression of the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR). Nearly 70% of the affected patients are homozygous for the CFTR ΔF508 mutation. Mutant CFTR is rapidly degraded in the endoplasmic reticulum (ER) via the ubiquitin proteolytic system, resulting in reduced surface expression. It is known that modulation of ER-associated protein degradation triggers the Unfolded Protein Response (UPR), which results in the production of a number of proteins that mediate protein folding. The combination of decreased ubiquitination and increased protein folding is expected to cause a greater proportion of proteins to successfully mature (Travers et al. (2000) Cell 101:249-258). Accordingly, human E3 proteins that are either known as being localized to the ER or that are integral membrane E3 proteins may mediate the degradation of the mutant CFTR and as such may be potential drug targets for treating cystic fibrosis.

[0116] Protein localization such as localization of the E3 may be determined or predicted by bioinformatic analysis, e.g. through examination of protein localization signals present in the amino acid sequences of the E3s present in a database. Exemplary localization signals include signal peptides (indicating that the protein is routed into the ER mediated secretion pathway), retention sequences, indicating retention at one or more positions in the secretory pathway, such as the ER, a part of the Golgi, etc., nuclear localization signals, membrane domains, lipid modification sequences, etc. In view of this specification, one of skill in the art will be able to identify numerous types of sequence information that are indicative of protein localization. In another variant, localization may be determined directly by expression of E3s in a cell line, preferably a mammalian cell line. The protein may be expressed as a native protein, wherein localization would typically be determined by immunofluorescence microscopy. Alternatively, the protein may be expressed with a detectable tag, such as a fluorescent protein (e.g. GFP, BFP, RFP, etc.), and the localization may be determined by direct immunofluorescence microscopy. Localization may also be determined by cellular fractionation followed by high-throughput protein identification, such as by coupled two-dimensional electrophoresis and mass spectroscopy. This would permit rapid identification of proteins present in various cellular compartments.

[0117] Having identified one or more drug target E3 proteins, a number of different assays are available to test the role of the E3 in the disease state. For example, in numerous diseases, a membrane protein is not properly processed and partitioned to the plasma membrane. Accordingly, E3 function may be manipulated (see below) and the level of membrane protein arriving at the membrane measured. Increased delivery of protein to the membrane in response to manipulation of E3 function indicates that the E3 is a valid target for disease therapeutics. As noted above, CFTR maturation is perturbed in cystic fibrosis. In one example, E3s are validated by manipulating the subject E3 and determining the level of mutant CFTR ΔF508 accumulated at the plasma membrane. Likewise, 98% of the erythropoietin receptor fails to mature and is degraded in the secretory pathway. An increased yield of erythropoietin receptor may mimic the effects of erythropoietin itself, which is a clinically important stimulator of hematopoiesis. Accordingly, an E3 may be validated by assessing the effect of increasing or decreasing its activity on the amount of erythropoietin at the cell surface.

[0118] In further examples, a variety of E3 enzymes may interact with viral proteins that affect the degradation of host proteins passing through the ER. Many viruses co-opt the ER-associated protein degradation pathway to destabilize host proteins that are unfavorable to viral infection. For example, human cytomegalovirus (HCMV) evades the immune system in part by causing the destruction of MHC class I heavy chains. Two HCMV proteins, US11 and US2, cause rapid retrograde transport of the MHC class I heavy chains from the ER to the cytosol, where they are degraded by the proteasome. This process is ubiquitin-dependent. In addition, the HIV virus targets the host CD4 protein for destruction through an ER-associated, ubiquitin-dependent protein degradation pathway. Destruction of CD4 is important because CD4 in the ER associates with and inhibits the maturation of the HIV glycoprotein gp160. Therefore, E3s may be validated, for example, by assessing effects on the processing or localization of MHC class I heavy chains (or other MHC class I complexes) or CD4.

[0119] IV. Potential drug targets may also be identified by the differential expression of certain nucleic acids or proteins in disease cells in comparison to normal cells.

[0120] In one aspect, differential expression of a protein in a normal cell in comparison with diseased cells, such as a cell manifesting an associated disease, is indicative that the differentially expressed gene may be involved in the associated disease or other biological process. For example, differential expression of an E3 protein in a tumor tissue in comparison with normal tissue may be indicative that the E3 may be involved in tumorigenesis.

[0121] In one embodiment, the invention is based on the gene expression profile of cells from an E3-associated disease. Diseased cells may have genes that are expressed at higher levels (i.e., which are up-regulated) and/or genes that are expressed at lower levels (i.e., which are down-regulated) relative to normal cells that do not have any symptoms of the E3-associated disease. In particular, certain E3 genes may be up-regulated by at least about 1 fold, preferably 2 fold, more preferably 5 fold, in the diseased cell as compared to the normal cell. Alternatively, certain E3 genes may be down-regulated by at least about 1 fold, preferably 2 fold, more preferably 5 fold in the diseased cells relative to the corresponding normal cells.

[0122] Preferred methods comprise determining the level of expression of one or more E3 genes in diseased cells in comparison to the corresponding normal cells. Methods for determining the expression of tens, hundreds or thousands of genes in diseased cells relative to the corresponding normal cells include, e.g., using microarray technology. The expression levels of the E3 genes are then compared to the expression levels of the same E3 genes in one or more other cells, e.g., a normal cell.

[0123] Comparison of the expression levels can be performed visually. In a preferred embodiment, the comparison is performed by a computer.

[0124] In another embodiment, values representing expression levels of genes characteristic of an E3-associated disease are entered into a computer system, comprising one or more databases with reference expression levels obtained from more than one cell. For example, the computer comprises expression data of diseased and normal cells. Instructions are provided to the computer, and the computer is capable of comparing the data entered with the data in the computer to determine whether the data entered is more similar to that of a normal cell or of a diseased cell.

[0125] In one embodiment, the invention provides a method for determining the level of expression of one or more E3 genes which are up- or down-regulated in a particular E3-associated diseased cell and comparing these levels of expression with the levels of expression of the E3 genes in a diseased cell from a subject known to have the disease, such that a similar level of expression of the genes is indicative that the E3 gene may be implicated in the disease.

[0126] Comparison of the expression levels of one or more E3 genes involved with an E3-associated disease with reference expression levels, e.g., expression levels in diseased cells or in normal counterpart cells, is preferably conducted using computer systems. In one embodiment, expression levels are obtained in two cells and these two sets of expression levels are introduced into a computer system for comparison. In a preferred embodiment, one set of expression levels is entered into a computer system for comparison with values that are already present in the computer system, or in computer-readable form that is then entered into the computer system.

[0127] In one embodiment, the invention provides a system that comprises a means for receiving gene expression data for one or a plurality of genes; a means for comparing the gene expression data from each of said one or plurality of genes to a common reference frame; and a means for presenting the results of the comparison. This system may further comprise a means for clustering the data.

[0128] In one embodiment, the invention provides a computer readable form of the E3 gene expression profile data of the invention, or of values corresponding to the level of expression of at least one E3 gene implicated in an E3-associated disease in a diseased cell. The values can be mRNA expression levels obtained from experiments, e.g., microarray analysis.

... magnitude determined (i.e., the abundance is different in the two sources of mRNA tested), or as not perturbed (i.e., the relative abundance is the same). In various embodiments, a difference between the two sources of RNA of at least a factor of about 25% (RNA from one source is 25% more abundant in one source than the other source), more usually about 50%, even more often by a factor of about 2 (twice as abundant), 3 (three times as abundant) or 5 (five times as abundant) is scored as a perturbation. Perturbations can be used by a computer for calculating and expression comparisons.
The values can also be mRNA levels normal 0.134 Preferably, in addition to identifying a perturbation ized relative to a reference gene whose expression is con as positive or negative, it is advantageous to determine the Stant in numerous cells under numerous conditions, e.g., magnitude of the perturbation. This can be carried out, as GAPDH. In other embodiments, the values in the computer noted above, by calculating the ratio of the emission of the are ratioS of, or differences between, normalized or non two fluorophores used for differential labeling, or by analo normalized mRNA levels in different samples. gous methods that will be readily apparent to those of Skill 0129. The gene expression profile data can be in the form in the art. of a table, Such as an Excel table. The data can be alone, or 0135) In operation, the means for receiving gene expres it can be part of a larger database, e.g., comprising other Sion data, the means for comparing the gene expression data, expression profiles. For example, the expression profile data the means for presenting, the means for normalizing, and the of the invention can be part of a public database. The means for clustering within the context of the Systems of the computer readable form can be in a computer. In another present invention can involve a programmed computer with embodiment, the invention provides a computer displaying the respective functionalities described herein, implemented the gene expression profile data. in hardware or hardware and Software; a logic circuit or 0130. 
In one embodiment, the invention provides a other component of a programmed computer that performs method for determining the similarity between the level of the operations Specifically identified herein, dictated by a expression of one or more E3 genes characteristic of an E3 computer program; or a computer memory encoded with asSociated disease in a first cell, e.g., a cell of a Subject, and executable instructions representing a computer program that in a Second cell, comprising obtaining the level of that can cause a computer to function in the particular expression of one or more genes characteristic of E3 asso fashion described herein. ciated disease in a first cell and entering these values into a 0.136 Those skilled in the art will understand that the computer comprising a database including records compris Systems and methods described herein may be Supported by ing values corresponding to levels of expression of one or and executed on any Suitable platform, including commer more genes characteristic of Said E3 associated disease in a cially available hardware systems, such as IBM-compatible Second cell, and processor instructions, e.g., a user interface, personal computers executing a variety of the UNIX oper capable of receiving a Selection of one or more values for ating Systems, Such as Linux or BSD, or any Suitable comparison purposes with data that is Stored in the com operating system such as MS-DOS or Microsoft Windows. puter. The computer may further comprise a means for In one embodiment, the data processor may be a MIPS converting the comparison data into a diagram or chart or R10000, based mullet-processor Silicon-Graphic Challenge other type of output. server, running IRJX 6.2. Alternatively and optionally, the Systems and methods described herein may be realized as 0131. 
In another embodiment, the invention provides a embedded programmable data processing Systems that computer program for analyzing gene expression data com implement the processes of the invention. For example, the prising (i) a computer code that receives as input gene data processing System can comprise a single board com expression data for a plurality of genes and (ii) a computer puter System that has been integrated into a piece of labo code that compares Said gene expression data from each of ratory equipment for performing the data analysis described Said plurality of genes to a common reference frame. above. The single board computer (SBC) system can be any 0132) The invention also provides a machine-readable or suitable SBC, including the SBCs sold by the Micro/Sys computer-readable medium including program instructions Company, which include microprocessors, data memory and for performing the following Steps: (i) comparing a plurality program memory, as well as expandable bus configurations of values corresponding to expression levels of one or more and an on-board operating System. genes characteristic of an E3-associated disease D in a query cell with a database including records comprising reference 0.137 Optionally, the data processing systems may com expression or expression profile data of one or more refer prise an Intel Pentium(R)-based processor or AMD processor ence cells and an annotation of the type of cell; and (ii) or their equals of adequate clock rate and with adequate indicating to which cell the query cell is most similar based main memory, as known to those skilled in the art. Optional on Similarities of expression profiles. The reference cells can external components may include a mass Storage System, be cells from Subjects at different Stages of the E3-associated which can be one or more hard disks (which are typically disease. 
packaged together with the processor and memory), tape drives, CDROMS devices, storage area networks, or other 0133. The relative abundance of an mRNA in two bio devices. Other external components include a user interface logical Samples can be Scored as a perturbation and its device, which can be a monitor, together with an input US 2003/O194725 A1 Oct. 16, 2003

device, which can be a “mouse”, or other graphic input devices, and/or a keyboard. A printing device can also be attached to the computer.

[0138] Typically, the computer system is also linked to a network link, which can be part of an Ethernet link to other local computer systems, remote computer systems, or wide area communication networks, such as the Internet. This network link allows the computer system to share data and processing tasks with other computer systems. The network can be, for example, an NFS network with a PostgreSQL relational database engine and a web server, such as the Apache web server engine. However, the server may be any suitable server process including any HTTP server process including the Apache server. Suitable servers are known in the art and are described in Jamsa, Internet Programming, Jamsa Press (1995), the teachings of which are herein incorporated by reference. Accordingly, it shall be understood that in certain embodiments, the systems and methods described herein may be implemented as web-based systems and services that allow for network access and remote access. To this end, the server may communicate with client stations. Each of the client stations can be a conventional personal computer system, such as a PC compatible computer system that is equipped with a client process that can operate as a browser, such as the Netscape Navigator browser process, the Microsoft Explorer browser process, or any other conventional or proprietary browser process that allows the client station to download computer files, such as web pages, from the server.

[0139] In certain embodiments the systems and methods described herein are realized as software systems that comprise one or more software components that can load into memory during operation. These software components collectively cause the computer system to function according to the methods of this invention. In such embodiments, the systems may be implemented as a C language computer program, or a computer program written in any high level language including C++, Fortran, Java or BASIC. Additionally, in an embodiment where SBCs are employed, the systems and methods may be realized as a computer program written in microcode or written in a high level language and compiled down to microcode that can be executed on the platform employed. The development of such systems is known to those of skill in the art, and such techniques are set forth in Digital Signal Processing Applications with the TMS320 Family, Volumes I, II, and III, Texas Instruments (1990). Additionally, general techniques for high level programming are known, and set forth in, for example, Stephen G. Kochan, Programming in C, Hayden Publishing (1983).

[0140] Additionally, in certain embodiments, these software components may be programmed in mathematical software packages which allow symbolic entry of equations and high-level specification of processing, including algorithms to be used, thereby freeing a user of the need to procedurally program individual equations or algorithms. Such packages include Matlab from Mathworks (Natick, Mass.), Mathematica from Wolfram Research (Champaign, Ill.), or S-Plus from MathSoft (Cambridge, Mass.). Accordingly, a software component represents the analytic methods of this invention as programmed in a procedural language or symbolic package. In a preferred embodiment, the computer system also contains a database comprising values representing levels of expression of one or more genes characteristic of an E3-associated disease. The database may contain one or more expression profiles of genes characteristic of the E3-associated disease in different cells.

[0141] The database employed may be any suitable database system, including the commercially available Microsoft Access database, PostgreSQL database system, MySQL database systems, and optionally can be a local or distributed database system. The design and development of suitable database systems are described in McGovern et al., A Guide To Sybase and SQL Server, Addison-Wesley (1993). The database can be supported by any suitable persistent data memory, such as a hard disk drive, RAID system, tape drive system, floppy diskette, or any other suitable system. The system 200 depicted in FIG. 2 depicts several separate database devices. However, it will be understood by those of ordinary skill in the art that in other embodiments the database device can be integrated into a single system.

[0142] In an exemplary implementation, to practice the methods of the present invention, a user first loads expression profile data into the computer system. These data can be directly entered by the user from a monitor and keyboard, or from other computer systems linked by a network connection, or on removable storage media such as a CD-ROM or floppy disk or through the network. Next the user causes execution of expression profile analysis software which performs the steps of comparing and, e.g., clustering co-varying genes into groups of genes.

[0143] In an exemplary implementation, to practice the methods of the present invention, a user first loads expression profile data into the computer system. These data can be directly entered by the user from a monitor and keyboard, or from other computer systems linked by a network connection, or on removable storage media such as a CD-ROM or floppy disk or through the network. Next the user causes execution of expression profile analysis software which performs the steps of comparing and, e.g., clustering co-varying genes into groups of genes.

[0144] In another exemplary implementation, expression profiles are compared using a method described in U.S. Pat. No. 6,203,987. A user first loads expression profile data into the computer system. Geneset profile definitions are loaded into the memory from the storage media or from a remote computer, preferably from a dynamic geneset database system, through the network. Next the user causes execution of projection software which performs the steps of converting expression profile to projected expression profiles. The projected expression profiles are then displayed.

[0145] In yet another exemplary implementation, a user first loads a projected profile into the memory. The user then causes the loading of a reference profile into the memory. Next, the user causes the execution of comparison software which performs the steps of objectively comparing the profiles.

[0146] Once again, having identified one or more drug target proteins that are differentially expressed in disease cells, a number of different assays are available to test the role of the drug target protein in the disease state.

[0147] For instance, if an E3 protein is identified as being over-expressed in a particular tumor-type, the skilled artisan can readily test for the role of the E3 by conducting a number of assays, for example one could use techniques such as

antisense constructs, RNAi constructs, DNA enzymes etc. to decrease the expression of the E3 in a tumor cell line to determine whether inhibition of the E3 results in decreased proliferation. In other embodiments the activity of the E3 may be decreased by using techniques such as dominant negative mutants, small molecules, antibodies etc. Other techniques include proliferation assays such as determining thymidine incorporation.

[0148] V. Aberrant activity of certain human drug target proteins may also be associated with a disease state or pathological condition.

[0149] For example, the association of the E3 proteins with certain diseases or disorders provides a disease-specific database containing human E3 proteins that may be implicated in the disease or disorder.

[0150] Validating Potential Drug Targets

[0151] In another aspect, this application provides methods for validating the selected proteins, such as the E3 proteins, as viable drug targets. In one embodiment, the methods provide for decreasing the expression of the potential drug targets and determining the effects of the reduction of such expression. The expression of the drug targets may be reduced by a number of methods that are known in the art, such as the use of antisense methods, dominant negative mutants, DNA enzymes, RNAi, ribozymes, to name but a few of such methods.

[0152] In another embodiment, the methods provide for increasing the expression of the potential drug targets and determining the effects of the increase of such expression.

[0153] One aspect of the invention relates to the use of the isolated “antisense” nucleic acids to inhibit expression, e.g., by inhibiting transcription and/or translation of the potential drug target. The antisense nucleic acids may bind to the potential drug target by conventional complementarity, or, for example, in the case of binding to DNA duplexes, through specific interactions in the major groove of the double helix. In general, these methods refer to the range of techniques generally employed in the art, and include any methods that rely on specific binding to oligonucleotide sequences.

[0154] An antisense construct of the present invention can be delivered, for example, as an expression plasmid which, when transcribed in the cell, produces RNA which is complementary to at least a unique portion of the cellular mRNA which encodes the potential drug target. Alternatively, the antisense construct is an oligonucleotide probe, which is generated ex vivo and which, when introduced into the cell, causes inhibition of expression by hybridizing with the mRNA and/or genomic sequences of the potential drug target. Such oligonucleotide probes are preferably modified oligonucleotides, which are resistant to endogenous nucleases, e.g., exonucleases and/or endonucleases, and are therefore stable in vivo. Exemplary nucleic acid molecules for use as antisense oligonucleotides are phosphoramidate, phosphothioate and methylphosphonate analogs of DNA (see also U.S. Pat. Nos. 5,176,996; 5,264,564; and 5,256,775). Additionally, general approaches to constructing oligomers useful in antisense therapy have been reviewed, for example, by Van der Krol et al. (1988) BioTechniques 6:958-976; and Stein et al. (1988) Cancer Res 48:2659-2668.

[0155] With respect to antisense DNA, oligodeoxyribonucleotides derived from the translation initiation site, e.g., between the -10 and +10 regions of the potential drug target, are preferred. Antisense approaches involve the design of oligonucleotides (either DNA or RNA) that are complementary to mRNA encoding the potential drug target. The antisense oligonucleotides will bind to the mRNA transcripts and prevent translation. Absolute complementarity, although preferred, is not required. In the case of double-stranded antisense nucleic acids, a single strand of the duplex DNA may thus be tested, or triplex formation may be assayed. The ability to hybridize will depend on both the degree of complementarity and the length of the antisense nucleic acid. Generally, the longer the hybridizing nucleic acid, the more base mismatches with an RNA it may contain and still form a stable duplex (or triplex, as the case may be). One skilled in the art can ascertain a tolerable degree of mismatch by use of standard procedures to determine the melting point of the hybridized complex.

[0156] Oligonucleotides that are complementary to the 5' end of the mRNA, e.g., the 5' untranslated sequence up to and including the AUG initiation codon, should work most efficiently at inhibiting translation. However, sequences complementary to the 3' untranslated sequences of mRNAs have recently been shown to be effective at inhibiting translation of mRNAs as well (Wagner, R. 1994. Nature 372:333). Therefore, oligonucleotides complementary to either the 5' or 3' untranslated, non-coding regions of a gene could be used in an antisense approach to inhibit translation of that mRNA. Oligonucleotides complementary to the 5' untranslated region of the mRNA should include the complement of the AUG start codon. Antisense oligonucleotides complementary to mRNA coding regions are less efficient inhibitors of translation but could also be used in accordance with the invention. Whether designed to hybridize to the 5', 3' or coding region of mRNA, antisense nucleic acids should be at least six nucleotides in length, and are preferably less than about 100 and more preferably less than about 50, 25, 17 or 10 nucleotides in length.

[0157] Regardless of the choice of target sequence, it is preferred that in vitro studies are first performed to quantitate the ability of the antisense oligonucleotide to inhibit gene expression. It is preferred that these studies utilize controls that distinguish between antisense gene inhibition and nonspecific biological effects of oligonucleotides. It is also preferred that these studies compare levels of the target RNA or protein with that of an internal control RNA or protein. Additionally, it is envisioned that results obtained using the antisense oligonucleotide are compared with those obtained using a control oligonucleotide. It is preferred that the control oligonucleotide is of approximately the same length as the test oligonucleotide and that the nucleotide sequence of the oligonucleotide differs from the antisense sequence no more than is necessary to prevent specific hybridization to the target sequence.

[0158] The oligonucleotides can be DNA or RNA or chimeric mixtures or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone, for example, to improve stability of the molecule, hybridization, etc. The oligonucleotide may include other appended groups such as peptides (e.g., for targeting host cell receptors), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci. U.S.A. 86:6553-6556; Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No. WO88/09810, published Dec. 15, 1988) or the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134, published Apr. 25, 1988), hybridization-triggered cleavage agents (see, e.g., Krol et al., 1988, BioTechniques 6:958-976) or intercalating agents (see, e.g., Zon, 1988, Pharm. Res. 5:539-549). To this end, the oligonucleotide may be conjugated to another molecule, e.g., a peptide, hybridization-triggered cross-linking agent, transport agent, hybridization-triggered cleavage agent, etc.

[0159] The antisense oligonucleotide may comprise at least one modified base moiety which is selected from the group including but not limited to 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxymethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.

[0160] The antisense oligonucleotide may also comprise at least one modified sugar moiety selected from the group including but not limited to arabinose, 2-fluoroarabinose, xylulose, and hexose.

[0161] The antisense oligonucleotide can also contain a neutral peptide-like backbone. Such molecules are termed peptide nucleic acid (PNA)-oligomers and are described, e.g., in Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. U.S.A. 93:14670 and in Egholm et al. (1993) Nature 365:566. One advantage of PNA oligomers is their capability to bind to complementary DNA essentially independently from the ionic strength of the medium due to the neutral backbone of the PNA. In yet another embodiment, the antisense oligonucleotide comprises at least one modified phosphate backbone selected from the group consisting of a phosphorothioate, a phosphorodithioate, a phosphoramidothioate, a phosphoramidate, a phosphordiamidate, a methylphosphonate, an alkyl phosphotriester, and a formacetal or analog thereof.

[0162] In yet a further embodiment, the antisense oligonucleotide is an α-anomeric oligonucleotide. An α-anomeric oligonucleotide forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gautier et al., 1987, Nucl. Acids Res. 15:6625-6641). The oligonucleotide is a 2'-O-methylribonucleotide (Inoue et al., 1987, Nucl. Acids Res. 15:6131-6148), or a chimeric RNA-DNA analogue (Inoue et al., 1987, FEBS Lett. 215:327-330).

[0163] Oligonucleotides of the invention may be synthesized by standard methods known in the art, e.g., by use of an automated DNA synthesizer (such as are commercially available from Biosearch, Applied Biosystems, etc.). As examples, phosphorothioate oligonucleotides may be synthesized by the method of Stein et al. (1988, Nucl. Acids Res. 16:3209), methylphosphonate oligonucleotides can be prepared by use of controlled pore glass polymer supports (Sarin et al., 1988, Proc. Natl. Acad. Sci. U.S.A. 85:7448-7451), etc.

[0164] While antisense nucleotides complementary to the coding region of an mRNA sequence can be used, those complementary to the transcribed untranslated region and to the region

[0165] In certain instances, it may be difficult to achieve intracellular concentrations of the antisense sufficient to suppress translation on endogenous mRNAs. Therefore a preferred approach utilizes a recombinant DNA construct in which the antisense oligonucleotide is placed under the control of a strong pol III or pol II promoter. The use of such a construct to transfect target cells will result in the transcription of sufficient amounts of single-stranded RNAs that will form complementary base pairs with the endogenous potential drug target transcripts and thereby prevent translation. For example, a vector can be introduced such that it is taken up by a cell and directs the transcription of an antisense RNA. Such a vector can remain episomal or become chromosomally integrated, as long as it can be transcribed to produce the desired antisense RNA. Such vectors can be constructed by recombinant DNA technology methods standard in the art. Vectors can be plasmid, viral, or others known in the art, used for replication and expression in mammalian cells. Expression of the sequence encoding the antisense RNA can be by any promoter known in the art to act in mammalian, preferably human, cells. Such promoters can be inducible or constitutive. Such promoters include but are not limited to: the SV40 early promoter region (Benoist and Chambon, 1981, Nature 290:304-310), the promoter contained in the 3' long terminal repeat of Rous sarcoma virus (Yamamoto et al., 1980, Cell 22:787-797), the herpes thymidine kinase promoter (Wagner et al., 1981, Proc. Natl. Acad. Sci. U.S.A. 78:1441-1445), the regulatory sequences of the metallothionein gene (Brinster et al., 1982, Nature 296:39-42), etc. Any type of plasmid, cosmid, YAC or viral vector can be used to prepare the recombinant DNA construct, which can be introduced directly into the tissue site.

[0166] Alternatively, the potential drug target gene expression can be reduced by targeting deoxyribonucleotide sequences complementary to the regulatory region of the gene (i.e., the promoter and/or enhancers) to form triple helical structures that prevent transcription of the gene in target cells in the body. (See generally, Helene, C. 1991, Anticancer Drug Des., 6(6):569-84; Helene, C., et al., 1992, Ann. N.Y. Acad. Sci., 660:27-36; and Maher, L. J., 1992, Bioassays 14(12):807-15).

[0167] Nucleic acid molecules to be used in triple helix formation for the inhibition of transcription are preferably single-stranded and composed of deoxyribonucleotides. The base composition of these oligonucleotides should promote triple helix formation via Hoogsteen base pairing rules, which generally require sizable stretches of either purines or pyrimidines to be present on one strand of a duplex. Nucleotide sequences may be pyrimidine-based, which will result
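As a rough illustration of the base-composition requirement stated in paragraph [0167], the following Python sketch scans one strand of a regulatory region for uninterrupted purine-only or pyrimidine-only stretches. The function name `find_homopolymer_tracts`, the default `min_len` of 15, and the example sequence are assumptions made for illustration only and do not come from this application; tract length alone does not establish that a site will support triple helix formation.

```python
import re


def find_homopolymer_tracts(seq, min_len=15):
    """Return (start, end, kind, tract) for every purine-only or
    pyrimidine-only run of at least min_len bases on the given DNA strand.

    start/end are 0-based, end-exclusive. min_len is an illustrative
    threshold (an assumption), not a value taken from the application.
    """
    seq = seq.upper()
    tracts = []
    for kind, base_class in (("purine", "[AG]"), ("pyrimidine", "[CT]")):
        # Builds a pattern such as "[AG]{15,}": a run of min_len or more bases.
        pattern = f"{base_class}{{{min_len},}}"
        for match in re.finditer(pattern, seq):
            tracts.append((match.start(), match.end(), kind, match.group()))
    return sorted(tracts)


if __name__ == "__main__":
    # Hypothetical promoter fragment used only to exercise the function.
    promoter = "ACGT" + "AGGAGGAAGGAGGAGA" + "CATG" + "CTTCCTCTTCCTCTCT" + "GACA"
    for start, end, kind, tract in find_homopolymer_tracts(promoter, min_len=12):
        print(start, end, kind, tract)
```

Candidate tracts found this way would still need to be checked experimentally, for example with the hybridization and control studies described above for antisense oligonucleotides.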

known in the art (e.g. Expedite RNA phosphoramidites and thymidine phosphoramidite (Proligo, Germany)). Synthetic oligonucleotides are preferably deprotected and gel-purified using methods known in the art (see e.g. Elbashir et al. (2001) Genes Dev. 15:188-200). Longer RNAs may be transcribed from promoters, such as T7 RNA polymerase promoters, known in the art. A single RNA target, placed in both possible orientations downstream of an in vitro promoter, will transcribe both strands of the target to create a dsRNA oligonucleotide of the desired target sequence.

[0174] The specific sequence utilized in design of the oligonucleotides may be any contiguous sequence of nucleotides contained within the expressed gene message of the target. Programs and algorithms, known in the art, may be used to select appropriate target sequences. In addition, optimal sequences may be selected utilizing programs designed to predict the secondary structure of a specified single-stranded nucleic acid sequence and allow selection of those sequences likely to occur in exposed single-stranded regions of a folded mRNA. Methods and compositions for designing appropriate oligonucleotides may be found, for example, in U.S. Pat. No. 6,251,588, the contents of which are incorporated herein by reference. Messenger RNA (mRNA) is generally thought of as a linear molecule which contains the information for directing protein synthesis within the sequence of ribonucleotides; however, studies have revealed a number of secondary and tertiary structures exist in most mRNAs. Secondary structure elements in RNA are formed largely by Watson-Crick type interactions between different regions of the same RNA molecule. Important secondary structural elements include intramolecular double-stranded regions, hairpin loops, bulges in duplex RNA and internal loops. Tertiary structural elements are formed when secondary structural elements come in contact with each other or with single-stranded regions to produce a more complex three-dimensional structure. A number of researchers have measured the binding energies of a large number of RNA duplex structures and have derived a set of rules which can be used to predict the secondary structure of RNA (see e.g. Jaeger et al. (1989) Proc. Natl. Acad. Sci. USA 86:7706; and Turner et al. (1988) Annu. Rev. Biophys. Biophys. Chem. 17:167). The rules are useful in identification of RNA structural elements and, in particular, for identifying single-stranded RNA regions which may represent preferred segments of the mRNA to target for silencing RNAi, ribozyme or antisense technologies. Accordingly, preferred segments of the mRNA target can be identified for design of the RNAi mediating dsRNA oligonucleotides as well as for design of appropriate ribozyme and hammerhead ribozyme compositions of the invention.

[0175] The dsRNA oligonucleotides may be introduced into the cell by transfection with a heterologous target gene using carrier compositions such as liposomes, which are known in the art, e.g. Lipofectamine 2000 (Life Technologies) as described by the manufacturer for adherent cell lines. Transfection of dsRNA oligonucleotides for targeting endogenous genes may be carried out using Oligofectamine (Life Technologies). Transfection efficiency may be checked using fluorescence microscopy for mammalian cell lines after co-transfection of hGFP-encoding pAD3 (Kehlenbach et al. (1998) J Cell Biol 141:863-74). The effectiveness of the RNAi may be assessed by any of a number of assays following introduction of the dsRNAs. These include Western blot analysis using antibodies which recognize the targeted gene product following sufficient time for turnover of the endogenous pool after new protein synthesis is repressed, and Northern blot analysis to determine the level of existing target mRNA.

[0176] Further compositions, methods and applications of RNAi technology are provided in U.S. patent application Nos. 6,278,039, 5,723,750 and 5,244,805, which are incorporated herein by reference.

[0177] Ribozyme molecules designed to catalytically cleave the potential drug target mRNA transcripts can also be used to prevent translation of mRNA (see, e.g., PCT International Publication WO90/11364, published Oct. 4, 1990; Sarver et al., 1990, Science 247:1222-1225 and U.S. Pat. No. 5,093,246). While ribozymes that cleave mRNA at site-specific recognition sequences can be used to destroy particular mRNAs, the use of hammerhead ribozymes is preferred. Hammerhead ribozymes cleave mRNAs at locations dictated by flanking regions that form complementary base pairs with the target mRNA. The sole requirement is that the target mRNA have the following sequence of two bases: 5'-UG-3'. The construction and production of hammerhead ribozymes is well known in the art and is described more fully in Haseloff and Gerlach, 1988, Nature, 334:585-591.

[0178] The ribozymes of the present invention also include RNA endoribonucleases (hereinafter “Cech-type ribozymes”) such as the one which occurs naturally in Tetrahymena thermophila (known as the IVS, or L-19 IVS RNA) and which has been extensively described by Thomas Cech and collaborators (Zaug, et al., 1984, Science, 224:574-578; Zaug and Cech, 1986, Science, 231:470-475; Zaug, et al., 1986, Nature, 324:429-433; published International patent application No. WO88/04300 by University Patents Inc.; Been and Cech, 1986, Cell, 47:207-216). The Cech-type ribozymes have an eight base pair active site which hybridizes to a target RNA sequence whereafter cleavage of the target RNA takes place. The invention encompasses those Cech-type ribozymes which target eight base-pair active site sequences.

[0179] As in the antisense approach, the ribozymes can be composed of modified oligonucleotides (e.g., for improved stability, targeting, etc.) and should be delivered to cells expressing the potential drug target. A preferred method of delivery involves using a DNA construct “encoding” the ribozyme under the control of a strong constitutive pol III or pol II promoter, so that transfected cells will produce sufficient quantities of the ribozyme to destroy targeted messages and inhibit translation. Because ribozymes, unlike antisense molecules, are catalytic, a lower intracellular concentration is required for efficiency.

[0180] A further aspect of the invention relates to the use of DNA enzymes to decrease expression of the potential drug targets. DNA enzymes incorporate some of the mecha

nistic features of both antisense and ribozyme technologies. nology, Volumes I-IV (D. M. Weir and C. C. Blackwell, eds., DNA enzymes are designed So that they recognize a par 1986) (Cold Spring Harbor Laboratory Press, Cold Spring ticular target nucleic acid Sequence, much like an antisense Harbor, N.Y., 1986). oligonucleotide, however much like a ribozyme they are catalytic and Specifically cleave the target nucleic acid. EXAMPLES 0181. There are currently two basic types of DNA Examples enzymes, and both of these were identified by Santoro and Joyce (see, for example, U.S. Pat. No. 6,110,462). The 10-23 Example 1 DNA enzyme (shown schematically in FIG. 1) comprises a loop Structure which connect two arms. The two arms 0186 Method of Creating the Database provide Specificity by recognizing the particular target 0187. The following procedure illustrates one embodi nucleic acid Sequence while the loop Structure provides ment of creating a database. catalytic function under physiological conditions. 0188 1. NCBI protein database is downloaded from 0182 Briefly, to design an ideal DNA enzyme that spe NCBI ftp site: ftp.ncbi.nlm.nih.gov cifically recognizes and cleaves a target nucleic acid, one of 0189 2. Retrieve hum nr. Retrieve all the human skill in the art must first identify the unique target Sequence. Sequence in an automatic way from the following url: This can be done using the same approach as outlined for http://www.ncbi.nlm.nih.vov/Entrez/batch.html. In the antisense oligonucleotides. Preferably, the unique or Sub HTML form one can specify that all the protein Stantially Sequence is a G/C rich of approximately 18 to 22 Sequences, from Homo Sapiens are to be retrieved. nucleotides. High G/C content helps insure a stronger inter action between the DNA enzyme and the target Sequence. 0.190 3. 
Whether the protein is a human protein is determined by downloading the full nr file from ncbi 0183) When synthesizing the DNA enzyme, the specific ftp Site, in a fasta format. All the Sequences that have antisense recognition Sequence that will target the enzyme to the pattern Homo Sapiens at the end of the description the message is divided So that it comprises the two arms of Sentence (i.e. from the first line) are parsed out. the DNA enzyme, and the DNA enzyme loop is placed between the two Specific arms. 0191) 4. Clean sequences: These sequences are then cleaned. Two Scripts are run in order to clean the 0184 Methods of making and administering DNA Human nr fasta file. The first script eliminates all the enzymes can be found, for example, in U.S. Pat. No.6,110, redundant Sequences, and leaves all the unique 462. Similarly, methods of delivery DNA ribozymes in vitro Sequences. The Second Script removes all the short or in vivo include methods of delivery RNA ribozyme, as Sequences (less then 30 aa). outlined in detail above. Additionally, one of skill in the art 0192 5. Run RPS-Blast: RPS-Blast is run locally will recognize that, like antisense oligonucleotide, DNA against the CDD database (which contains the Pfam, enzymes can be optionally modified to improve Stability and SMART and LOAD domains). In addition we look for improve resistance to degradation. domains in the prosite database. We also look for 0185. The present invention is further illustrated by the different features in the Sequences: Transmembrane following examples which should not be construed as lim regions (alom2., tmap), signal peptide and other internal iting in any way. The contents of all cited references domains/features. including literature references, issued patents, published or 0193 6. Find E3 proteins: this search is done auto non published patent applications as cited throughout this matically. 
We look for all the proteins that have one or application are hereby expressly incorporated by reference. more of the following domains (Hect, Ring, Ubox, The practice of the present invention will employ, unless Fbox, PHD). These five domains appear in the different otherwise indicated, conventional techniques of cell biology, databases (pfam, Smart and prosite) in different names. cell culture, molecular biology, transgenic biology, micro In Our Search we look for these domains in all the biology, recombinant DNA, and immunology, which are different names, in all the databases. within the skill of the art. Such techniques are explained 0194 7. Unigene clusters data: We download the clus fully in the literature. (See, for example, Molecular Cloning ters (Hs.data file) from the following ur;: ftp://ncbi.n- A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch lm.nih.gov/repiositor/UniGene/. and Maniatis (Cold Spring Harbor Laboratory Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); 0195 E3 Vs. Unigene: We look at each E3 protein Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et from the E3 table; to see in which Unigene Cluster it al. U.S. Pat. No. 4,683,195; Nucleic Acid Hybridization (B. belongs. D. Hames & S. J. Higgins eds. 1984); Transcription And 0196) (9) We check which other proteins are in the E3 Translation (B. D. Hames & S. J. Higgins eds. 1984); (R.I. clusters, which are not E3 proteins, and introduce them Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And in the E3 database. Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods. In Enzy 0197). In addition, multiple sequence alignment may be mology (Academic Press, Inc., N.Y.); Gene Transfer Vectors performed between all the cluster members against the For Mammalian Cells (J. H. Miller and M. P. Calos eds., relative genomic piece. In this way we can See the alterna 1987, Cold Spring Harbor Laboratory); , Vols. 
154 and 155 tive transcripts of the gene. (Wu et al. eds.), Immunochemical Methods. In Cell And 0198 In particular, RPS-Blast may be run at least twice. Molecular Biology (Mayer and Walker, eds., Academic In the first run, an E value of 0.01 may be used, and then all Press, London, 1987); Handbook Of Experimental Immu the domains may be run against the human nr. In the Second US 2003/O194725 A1 Oct. 16, 2003

run, an E value of 10 may be used, and only the E3 domains (hect, ring, ubox, fbox, phd) are run against the human nr. In this manner the database will have a lower number of false positives, but have a higher sensitivity to the E3 domains.

[0199] Further, the E3 database can integrate links to articles, links to patents, annotations of the proteins and other biological information that may be available for the particular protein.

[0200] Examples of E3 polypeptides and nucleic acids that may be incorporated into one or more databases are presented in Table 2, appended at the end of the text. Applicants incorporate by reference herein the nucleic acid and amino acid sequences corresponding to the accession numbers provided in Table 2.

Example 2

[0201] Domains and/or Motifs of Interest

[0202] A. Protein Domains That May Play a Role in Virus Biogenesis, Maturation and Release

[0203] E3 - Domain of E3 Ubiquitin-Protein Ligase

[0204] RING

[0205] SMART SM0184; RING=RNF. E3 ubiquitin-protein ligase activity is intrinsic to the RING domain of c-Cbl and is likely to be a general function of this domain; various RING fingers exhibit binding activity towards E2's, i.e., the ubiquitin-conjugating enzymes (UBC's).

[0206] HECTc

[0207] SMART SM00119; Pfam PF00632; HECTc=HECT, E3 ubiquitin-protein ligases. Can bind to E2 enzymes. The name HECT comes from Homologous to the E6-AP Carboxyl Terminus. Proteins containing this domain at the C-terminus include ubiquitin-protein ligase activity, which regulates ubiquitination of CDC25. Ubiquitin-protein ligase accepts ubiquitin from an E2 ubiquitin-conjugating enzyme in the form of a thioester, and then directly transfers the ubiquitin to targeted substrates. A cysteine residue is required for ubiquitin-thiolester formation. Human thyroid receptor interacting protein 12, which also contains this domain, is a component of an ATP-dependent multi-subunit protein that interacts with the ligand binding domain of the thyroid hormone receptor. It could be an E3 ubiquitin-protein ligase. Human ubiquitin-protein ligase E3A interacts with the E6 protein of the cancer-associated human papillomavirus types 16 and 18. The E6/E6-AP complex binds to and targets the P53 tumor-suppressor protein for ubiquitin mediated proteolysis.

[0208] F-BOX

[0209] SMART SM0256; Pfam PF00646; F-BOX=FBOX=F-box=Fbox. The F-box domain was first described as a sequence domain found in cyclin-F that interacts with the protein SKP1. This domain is present in numerous proteins and serves as a link between a target protein and a ubiquitin-conjugating enzyme. The SCF complex (e.g., Skp1-Cullin-F-box) plays a similar role as an E3 ligase in the ubiquitin protein degradation pathway.

[0210] U-BOX

[0211] SMART SM0504. The U-box domain is a modified RING finger domain that is without the full complement of Zn2+-binding ligands. It is found in pre-mRNA splicing factor, several hypothetical proteins, and ubiquitin fusion degradation protein 2, where it may be involved in E2-dependent ubiquitination.

[0212] PHD

[0213] SMART SM0249. The PHD domain is a C4HC3 zinc-finger-like motif found in nuclear proteins that are thought to be involved in chromatin-mediated transcriptional regulation. The PHD finger motif is reminiscent of, but distinct from, the C3HC4 type RING finger. Like the RING finger and the LIM domain, the PHD finger is expected to bind two zinc ions.

[0214] B. Protein Domains That May Play a Role in Virus Biogenesis, Maturation and Release in Combination with E3 Ubiquitin-Protein Ligase

[0215] RCC1 - Domain That Interacts With Small GTPases Such as ARF1 That Activates AP1 to Polymerize Clathrin

[0216] Pfam PF00415; The regulator of chromosome condensation (RCC1) [MEDLINE: 93242659] is a eukaryotic protein which binds to chromatin and interacts with ran, a nuclear GTP-binding protein (IPR002041), to promote the loss of bound GDP and the uptake of fresh GTP, thus acting as a guanine-nucleotide dissociation stimulator (GDS). The interaction of RCC1 with ran probably plays an important role in the regulation of gene expression. RCC1, known as PRP20 or SRM1 in yeast, pim1 in fission yeast and BJ1 in Drosophila, is a protein that contains seven tandem repeats of a domain of about 50 to 60 amino acids. As shown in the following schematic representation, the repeats make up the major part of the length of the protein. Outside the repeat region, there is just a small N-terminal domain of about 40 to 50 residues and, in the Drosophila protein only, a C-terminal domain of about 130 residues.

[0217] WW - Domain That Interacts With PxxPP Seq. on Gag L-Domain of HIV

[0218] SMART SM0456; Pfam PF00397; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides. The WW domain (also known as rsp5 or WWP) is a short conserved region in a number of unrelated proteins, among them dystrophin, responsible for Duchenne muscular dystrophy. This short domain may be repeated up to four times in some proteins. The WW domain binds to proteins with particular proline-motifs, [AP]-P-P-[AP]-Y, and has four conserved aromatic positions that are generally Trp. The name WW or WWP derives from the presence of these Trp as well as that of a conserved Pro. It is frequently associated with other domains typical for proteins in signal transduction processes. A large variety of proteins containing the WW domain are known. These include: dystrophin, a multidomain cytoskeletal protein; utrophin, a dystrophin-like protein of unknown function; vertebrate YAP protein, substrate of an unknown serine kinase; mouse NEDD-4, involved in the embryonic development and differentiation of the central nervous system; yeast RSP5, similar to NEDD-4 in its molecular organization; rat FE65, a transcription-factor activator expressed preferentially in liver; tobacco DB10 protein and others.

[0219] C2 - Domain That Interacts With Phospholipids, Inositol Polyphosphates, and Intracellular Proteins

[0220] SMART SM0239; Pfam PF00168; Ca2+-binding domain present in phospholipases, protein kinases C, and synaptotagmins (among others). Some do not appear to contain Ca2+-binding sites. Particular C2s appear to bind phospholipids, inositol polyphosphates, and intracellular proteins. Unusual occurrence in perforin. Synaptotagmin and PLC C2s are permuted in sequence with respect to N- and C-terminal beta strands. SMART detects C2 domains using one or both of two profiles.

[0221] Interpro abstract (IPR000008): In some isozymes of protein kinase C (PKC), the C2 domain is located between the two copies of the C1 domain (that bind phorbol esters and diacylglycerol) and the protein kinase catalytic domain. Regions with significant homology to the C2-domain have been found in many proteins. The C2 domain is thought to be involved in calcium-dependent phospholipid binding. Since domains related to the C2 domain are also found in proteins that do not bind calcium, other putative functions for the C2 domain, e.g. binding to inositol-1,3,4,5-tetraphosphate, have been suggested. The 3D structure of the C2 domain of synaptotagmin has been reported; the domain forms an eight stranded beta sandwich constructed around a conserved 4-stranded domain, designated a C2 key. Calcium binds in a cup-shaped depression formed by the N- and C-terminal loops of the C2-key domain.

[0222] CUE - Domain That Recruits E2 to ER-Membrane Proximity

[0223] SMART SM0546; Pfam PF02845; Domain that may be involved in binding ubiquitin-conjugating enzymes (UBCs). CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2.

[0224] SH3 & SH2

[0225] SMART SM0252; Pfam PF00017; Src homology 2 domains bind phosphotyrosine-containing polypeptides via 2 surface pockets. Specificity is provided via interaction with residues that are distinct from the phosphotyrosine. Only a single occurrence of a SH2 domain has been found in S. cerevisiae. The Src homology 2 (SH2) domain is a protein domain of about 100 amino-acid residues first identified as a conserved sequence region between the oncoproteins Src and Fps. Similar sequences were later found in many other intracellular signal-transducing proteins. SH2 domains function as regulatory modules of intracellular signalling cascades by interacting with high affinity with phosphotyrosine-containing target peptides in a sequence specific and strictly phosphorylation-dependent manner. They are found in a wide variety of protein contexts, e.g., in association with catalytic domains of phospholipase Cγ (PLCγ) and the nonreceptor protein tyrosine kinases; within structural proteins such as fodrin and tensin; and in a group of small adaptor molecules, i.e. Crk and Nck. In many cases, when an SH2 domain is present so too is an SH3 domain, suggesting that their functions are inter-related. The domains are frequently found as repeats in a single protein sequence. The structure of the SH2 domain belongs to the alpha+beta class, its overall shape forming a compact flattened hemisphere. The core structural elements comprise a central hydrophobic anti-parallel beta-sheet, flanked by 2 short alpha-helices. In the v-Src oncogene product SH2 domain, the loop between strands 2 and 3 provides many of the binding interactions with the phosphate group of its phosphopeptide ligand, and is hence designated the phosphate binding loop.

[0226] The SH3 domain (SMART SM0326) shares 3D similarity with the WW domain, and may bind to the PxxPP sequence of the viral gag protein. Src homology 3 (SH3) domains bind to target proteins through sequences containing proline and hydrophobic amino acids. Pro-containing polypeptides may bind to SH3 domains in 2 different binding orientations. The SH3 domain has a characteristic fold which consists of five or six beta-strands arranged as two tightly packed anti-parallel beta sheets. The linker regions may contain short helices.

[0227] Protein domain information may be obtained from any of the following websites: SMART (http://smart.embl-heidelberg.de/), Pfam (http://smart.embl-heidelberg.de/), InterPro (http://www.ebi.ac.uk/interpro/scan.html).

Example 3

[0228] Methods for Screening the Biological Activity of the E3 Proteins and Validating the Role of E3's as Potential Drug Targets

[0229] A functional biological assay for a disease or a pathological condition is developed in each instance. RNA interference (RNAi) technology or dominant negative forms of candidate E3s or any of the other techniques that are used in the art to inhibit expression of relevant target proteins may be used. The ability of these methods to remedy the abnormality that causes a disease/pathological condition validates the role of the specific E3 and its relevance as a potential drug target.

[0230] Identification of an E3 Involved in the Ubiquitin Mediated Viral Release

[0231] Experimental evidence supports a model wherein the release of viral like particles (VLP) from infected cells is dependent on ubiquitination of a viral protein such as gag. Ubiquitination of gag indicates that a human E3 protein is involved. The gag proteins, such as the late domain, are known to interact with the HECT domain and a WW or SH3 domain of the E3 proteins. Therefore, human E3 proteins that have either a HECT or a WW or SH3 domain may mediate the ubiquitination of gag to facilitate viral release.

[0232] The detection and/or measurement of the release of VLP from cells infected with retroviruses provides a convenient biological assay.

[0233] The inhibition of VLP release by decreasing the expression of the potential drug target validates the potential drug target.

[0234] Identification of an E3 Involved in the Ubiquitin Mediated Degradation of an Interacting Protein

[0235] A ubiquitin-protein ligase that mediates the ubiquitination of CFTR is identified. Cystic fibrosis (CF) is an inherited disorder caused by the malfunction or reduced surface expression of the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR). Approximately 70% of the affected individuals are homozygous for the CFTR ΔF508 mutation. Mutant CFTR is rapidly degraded in the endoplasmic reticulum (ER) via the ubiquitin proteolytic system, resulting in inhibition of surface expression. An ER-associated E3 is likely to mediate the ubiquitination of CFTR. Accordingly, preferred E3 candidates are those localized to the ER or those that have the CUE domain. Cell surface expression of CFTR is used as the functional biological assay. Finally, the target is validated by detecting increased surface expression of CFTR ΔF508 in cells co-expressing a dominant negative form of a candidate E3 or transfected with a specific RNAi derived from a candidate E3.
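The domain-based selection used in Examples 1 and 3 (the two-pass E value rule for RPS-Blast hits, the search for the five E3 domains under their different names, and the choice of candidates by accompanying domain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' scripts: the input is assumed to be already-parsed hits of the form (protein, domain, E value), the alias spellings in `E3_DOMAIN_ALIASES` are hypothetical, and all function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical alias table: the text only says that the five E3 domains appear
# under different names in Pfam, SMART and PROSITE; these spellings are assumed.
E3_DOMAIN_ALIASES = {
    "HECT": {"hect", "hectc"},
    "RING": {"ring", "rnf"},
    "UBOX": {"ubox", "u-box"},
    "FBOX": {"fbox", "f-box"},
    "PHD": {"phd"},
}


def canonical_e3_name(domain):
    """Return the canonical E3 domain label for a hit name, or None."""
    key = domain.strip().lower()
    for canonical, aliases in E3_DOMAIN_ALIASES.items():
        if key in aliases:
            return canonical
    return None


def keep_hit(domain, evalue, strict=0.01, relaxed=10.0):
    """Two-pass rule: any domain at E <= 0.01, E3 domains also at E <= 10."""
    if evalue <= strict:
        return True
    return canonical_e3_name(domain) is not None and evalue <= relaxed


def build_domain_table(hits):
    """Map each protein to the set of domain labels that pass the E value rule."""
    table = defaultdict(set)
    for protein, domain, evalue in hits:
        if keep_hit(domain, evalue):
            table[protein].add(canonical_e3_name(domain) or domain.strip().upper())
    return dict(table)


def e3_proteins(table):
    """Proteins carrying at least one of the five E3 domains."""
    return {p for p, domains in table.items() if domains & set(E3_DOMAIN_ALIASES)}


def candidates(table, required_any):
    """E3 proteins that also carry at least one domain from required_any."""
    wanted = set(required_any)
    return sorted(p for p in e3_proteins(table) if table[p] & wanted)
```

With such a table, the viral-release candidates of Example 3 would correspond to `candidates(table, {"HECT", "WW", "SH3"})` and the CFTR candidates to `candidates(table, {"CUE"})`; ER localization, the other CFTR criterion, is not modeled here.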

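The sequence-cleaning steps of Example 1 (keeping only human entries from the nr fasta file, eliminating redundant sequences, and removing sequences shorter than 30 aa) can be sketched as below. This is an illustrative sketch only: the applicants' two scripts are not disclosed, the function names are invented, and the test for "Homo sapiens at the end of the description" assumes a header that ends with the organism name, optionally in square brackets.

```python
def parse_fasta(text):
    """Yield (description, sequence) pairs from FASTA-formatted text."""
    description, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if description is not None:
                yield description, "".join(chunks)
            description, chunks = line[1:], []
        elif description is not None:
            chunks.append(line)
    if description is not None:
        yield description, "".join(chunks)


def is_human(description):
    """Assumed convention: the description line ends with the organism name."""
    return description.rstrip().rstrip("]").endswith("Homo sapiens")


def clean_human_records(text, min_length=30):
    """Keep unique human sequences of at least min_length residues."""
    seen = set()
    kept = []
    for description, sequence in parse_fasta(text):
        sequence = sequence.upper()
        if not is_human(description):
            continue  # not a human entry
        if len(sequence) < min_length:
            continue  # short sequence (less than 30 aa by default)
        if sequence in seen:
            continue  # redundant sequence, first occurrence already kept
        seen.add(sequence)
        kept.append((description, sequence))
    return kept
```

Redundancy is judged here by exact sequence identity, which is the simplest reading of "eliminates all the redundant sequences"; a looser criterion would need a different comparison.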
Example 4

[0236] Identification and Validation of POSH as a Drug Target for Antiviral Agents

[0237] An example of the systems disclosed herein was used to successfully identify a drug target for antiviral agents, and especially agents that are effective against HIV and related viruses.

[0238] A database of greater than 500 E3 proteins was assembled. The database contained many of the proteins presented in Table 2. A subset of proteins was selected based on various characteristics, such as the presence of RING and SH3 domains or HECT and RCC domains. The proteins of this subset are shown in Table 3. Proteins of the subset were tested for their effects on the lifecycle of HIV using the Virus-Like Particle (VLP) assay system. A knockdown for each protein was created by contacting the assay cells with an siRNA construct specific for an mRNA sequence corresponding to each of the proteins of Table 3. Results for POSH and proteins 1-6 are shown in FIG. 5. Decrease in POSH production by siRNA led to a complete or near-complete disruption of VLP production. A few of the other E3s tested gave partial effects on VLP production, and most E3s had no effect. Tsg101 is used as a positive control.

TABLE 3
E3 Subset selected for VLP Assays

      Gene        Accession
  1.  CEB1        AB027289
  2.  HERC1       U50078
  3.  HERC2       AF071172
  4.  HERC3       D25215
  5.  ITCH        AF095745
  6.  KIAA1301    AB037722
  7.  KIAA1593    AB046813
  8.  Nedd4       D42055
  9.  NeddL1      AB048365
 10.  Nedd4L      AB007899
 11.  PAM         AF07558
 12.  POSH        protlog1
 13.  SMURF1      AC004893
 14.  SMURF2      NM_022739
 15.  WWP1        AL136739
 16.  WWP2        U96114

[0239] FIG. 6 shows a pulse-chase VLP assay confirming that a decrease in POSH function leads to a complete or near-complete inhibition of VLP production. Accordingly, systems disclosed herein are effective for rapidly generating drug targets.

[0240] Detailed protocols for performing VLP assays and siRNA knockdown experiments are as follows.

[0241] Steady-State VLP Assay:

[0242] 1. Objective:

[0243] Use RNAi to inhibit POSH gene expression and compare the efficiency of viral budding and GAG expression and processing in treated and untreated cells.

[0244] 2. Study Plan:

[0245] HeLa SS-6 cells are transfected with mRNA specific RNAi in order to knock down the target proteins. Since maximal reduction of target protein by RNAi is achieved after 48 hours, cells are transfected twice: first to reduce target mRNAs, and subsequently to express the viral Gag protein. The second transfection is performed with pNLenv (plasmid that encodes HIV) and with low amounts of RNAi to maintain the knockdown of target protein during the time of gag expression and budding of VLPs. Reduction in mRNA levels due to the RNAi effect is verified by RT-PCR amplification of target mRNA.

[0246] 3. Methods, Materials, Solutions

[0247] a. Methods

[0248] i. Transfections according to manufacturer's protocol and as described in procedure.

[0249] ii. Protein determined by Bradford assay.

[0250] iii. SDS-PAGE in Hoefer miniVE electrophoresis system. Transfer in Bio-Rad mini-protean II wet transfer system. Blots visualized using Typhoon system and ImageQuant software (APBiotech).

[0251] b. Materials

Material                                   Manufacturer          Catalog #        Batch #
Lipofectamine 2000 (LF2000)                Life Technologies     11668-019        1112496
OptiMEM                                    Life Technologies     31985-047        3063119
RNAi Lamin A/C                             Self                                   3
RNAi TSG-101 688                           Self                                   65
RNAi Posh 524                              Self                                   81
plenv11 PTAP                               Self                                   48
plenv11 ATAP                               Self                                   49
Anti-p24 polyclonal antibody               Seramun               A-0236/5-10-01
Anti-Rabbit Cy5 conjugated antibody        Jackson               44-175-115       48715
10% acrylamide Tris-Glycine SDS-PAGE gel   Life Technologies     NP0321           1081371
Nitrocellulose membrane                    Schleicher & Schuell  401353           BA-83
NuPAGE 20X transfer buffer                 Life Technologies     NP0006-1         224365
0.45 µm filter                             Schleicher & Schuell  0462100          CS1018-1

[0252] c. Solutions

                    Compound                              Concentration
Lysis Buffer        Tris-HCl pH 7.6                       50 mM
                    MgCl2                                 15 mM
                    NaCl                                  150 mM
                    Glycerol                              10%
                    EDTA                                  1 mM
                    EGTA                                  1 mM
                    ASB-14 (add immediately before use)   1%
6X Sample Buffer    Tris-HCl, pH = 6.8                    1 M
                    Glycerol                              30%
                    SDS                                   10%
                    DTT                                   9.3%
                    Bromophenol Blue                      0.012%
TBS-T               Tris pH = 7.6                         20 mM
                    NaCl                                  137 mM
                    Tween-20                              0.1%

[0253] 4. Procedure

[0254] a. Schedule

Day 1         Day 2            Day 3           Day 4                      Day 5
Plate cells   Transfection I   Passage cells   Transfection II            Extract RNA for RT-PCR
              (RNAi only)      (1:3)           (RNAi and pNlenv)          (post transfection)
                                               (12:00, PM)
                                               Extract RNA for RT-PCR     Harvest VLPs and cells
                                               (pre-transfection)

[0255] b. Day 1

[0256] Plate HeLa SS-6 cells in 6-well plates (35 mm wells) at a concentration of 5×10^5 cells/well.

[0257] c. Day 2

[0258] 2 hours before transfection replace growth medium with 2 ml growth medium without antibiotics.

[0259] Transfections:

[0260] Prepare LF2000 mix: 250 µl OptiMEM + 5 µl LF2000 for each reaction. Mix by inversion, 5 times. Incubate 5 minutes at room temperature.

[0261] Prepare RNA dilution in OptiMEM (Table 1, column A). Add LF2000 mix dropwise to diluted RNA (Table 1, column B). Mix by gentle vortex. Incubate at room temperature 25 minutes, covered with aluminum foil.

[0262] Add 500 µl transfection mixture to cells dropwise and mix by rocking side to side.

[0263] Incubate overnight.

[0264] d. Day 3

[0265] Split 1:3 after 24 hours. (Plate 4 wells for each reaction, except reaction 2 which is plated into 3 wells.)

[0266] e. Day 4

[0267] 2 hours pre-transfection replace medium with DMEM growth medium without antibiotics.

Transfection I:

Reaction  RNAi name    TAGDA #  Reactions  RNAi (nM)  RNAi 20 µM (µl)  A: OptiMEM (µl)  B: LF2000 mix (µl)
1         Lamin A/C    13       2          50         12.5             500              500
2         Lamin A/C    13       1          50         6.25             250              250
3         TSG101 688   65       2          20         5                500              500
5         Posh 524     81       2          50         12.5             500              500

Transfection II

RNAi name    TAGDA #  Plasmid  Reactions  Plasmid (µg/µl)  A: Plasmid for 2.4 µg (µl)  B: RNAi 20 µM for 10 nM (µl)  C: OptiMEM (µl)  D: LF2000 mix (µl)
Lamin A/C    13       PTAP     3          3.4                                          3.75                          750              750
Lamin A/C    13       ATAP     3          2.5                                          3.75                          750              750
TSG101 688   65       PTAP     3          3.4                                          3.75                          750              750
Posh 524     81       PTAP     3          3.4                                          3.75                          750              750
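The RNAi volumes in the two transfection tables follow from a simple dilution calculation, sketched below. The per-well final volume of 2.5 ml is an assumption inferred from the protocol (2 ml of growth medium plus 500 µl of transfection mixture per well), and the function name is invented for the example.

```python
# Assumed final volume per well: 2 ml growth medium + 0.5 ml transfection mixture.
WELL_VOLUME_ML = 2.5


def stock_volume_ul(final_nM, stock_uM, reactions=1, total_ml=WELL_VOLUME_ML):
    """Microlitres of RNAi stock needed for the given number of reactions.

    nM x ml gives pmol, and a stock in uM holds pmol per ul,
    so pmol / (pmol per ul) is the volume in ul.
    """
    per_reaction_ul = final_nM * total_ml / stock_uM
    return per_reaction_ul * reactions


# Transfection I, reaction 1: 50 nM from a 20 uM stock, 2 reactions
volume_lamin = stock_volume_ul(50, 20, reactions=2)

# Transfection II: 10 nM from a 20 uM stock, 3 reactions
volume_second = stock_volume_ul(10, 20, reactions=3)
```

Under the 2.5 ml assumption, these calls reproduce the 12.5 µl and 3.75 µl entries of the tables, which suggests the tabulated volumes are totals for all reactions of a row rather than per-well amounts.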

[0268] Prepare LF2000 mix: 250 µl OptiMEM + 5 µl LF2000 for each reaction. Mix by inversion, 5 times. Incubate 5 minutes at room temperature.

[0269] Prepare RNA+DNA diluted in OptiMEM (Transfection II, A+B+C).

[0270] Add LF2000 mix (Transfection II, D) to diluted RNA+DNA dropwise, mix by gentle vortex, and incubate 1 h while protected from light with aluminum foil.

[0271] Add LF2000 and DNA+RNA to cells, 500 µl/well, mix by gentle rocking and incubate overnight.

[0272] f. Day 5

[0273] Collect samples for VLP assay (approximately 24 hours post-transfection) by the following procedure (cells from one well of each sample are taken for RNA assay, by RT-PCR).

[0274] g. Cell Extracts

[0275] i. Pellet floating cells by centrifugation (5 min, 3000 rpm at 4° C.), save supernatant (continue with supernatant immediately to step h), scrape remaining cells in the medium which remains in the well, add to the corresponding floating cell pellet and centrifuge for 5 minutes, 1800 rpm at 4° C.

[0276] ii. Wash cell pellet twice with ice-cold PBS.

[0277] iii. Resuspend cell pellet in 100 µl lysis buffer and incubate 20 minutes on ice.

[0278] iv. Centrifuge at 14,000 rpm for 15 min. Transfer supernatant to a clean tube. This is the cell extract.

[0279] v. Prepare 10 µl of cell extract samples for SDS-PAGE by adding SDS-PAGE sample buffer to 1X, and boiling for 10 minutes. Remove an aliquot of the remaining sample for protein determination to verify total initial starting material. Save remaining cell extract at -80° C.

[0280] h. Purification of VLPs from cell media

[0281] i. Filter the supernatant from step g through a 0.45 µm filter.

[0282] ii. Centrifuge supernatant at 14,000 rpm at 4° C. for at least 2 h.

[0283] iii. Aspirate supernatant carefully.

[0284] iv. Re-suspend VLP pellet in hot (100° C., warmed for at least 10 min) 1X sample buffer.

[0285] v. Boil samples for 10 minutes, 100° C.

[0286] i. Western Blot analysis

[0287] i. Run all samples from stages A and B on Tris-Glycine SDS-PAGE 10% (120 V for 1.5 h).

[0288] ii. Transfer samples to nitrocellulose membrane (65 V for 1.5 h).

[0289] iii. Stain membrane with Ponceau S solution.

[0290] iv. Block with 10% low fat milk in TBS-T for 1 h.

[0291] v. Incubate with anti p24 rabbit 1:500 in TBS-T o/n.

[0292] vi. Wash 3 times with TBS-T for 7 min each wash.

[0293] vii. Incubate with secondary antibody anti rabbit Cy5 1:500 for 30 min.

[0294] viii. Wash five times for 10 min in TBS-T.

[0295] ix. View in Typhoon gel imaging system (Molecular Dynamics/APBiotech) for fluorescence signal.

[0296] Exemplary RT-PCR Primers for POSH

Exemplary RT-PCR primers for POSH

Name                              Position  Sequence
Sense primer POSH = 271           271       5' CTTGCCTTGCCAGCATAC 3' (SEQ ID NO: 12)
Anti-sense primer POSH = 926c     926C      5' CTGCCAGCATTCCTTCAG 3' (SEQ ID NO: 13)

siRNA duplexes:

siRNA No: 153
siRNA Name: POSH-230
Position in mRNA: 426-446
Target sequence:          5' AACAGAGGCCTTGGAAACCTG 3'    SEQ ID NO: 14
siRNA sense strand:       5' dTdTCAGAGGCCUUGGAAACCUG 3'  SEQ ID NO: 15
siRNA anti-sense strand:  5' dTdTCAGGUUUCCAAGGCCUCUG 3'  SEQ ID NO: 16

siRNA No: 155
siRNA Name: POSH-442
Position in mRNA: 638-658
Target sequence:          5' AAAGAGCCTGGAGACCTTAAA 3'    SEQ ID NO: 17
siRNA sense strand:       5' dTdTAGAGCCUGGAGACCUUAAA 3'  SEQ ID NO: 18
siRNA anti-sense strand:  5' dTdTUUUAAGGUCUCCAGGCUCU 3'  SEQ ID NO: 19

siRNA No: 157
siRNA Name: POSH-U111
Position in mRNA: 2973-2993
Target sequence:          5' AAGGATTGGTATGTGACTCTG 3'    SEQ ID NO: 20
siRNA sense strand:       5' dTdTGGAUUGGUAUGUGACUCUG 3'  SEQ ID NO: 21
siRNA anti-sense strand:  5' dTdTCAGAGUCACAUACCAAUCC 3'  SEQ ID NO: 22

siRNA No: 159
siRNA Name: POSH-U410
Position in mRNA: 3272-3292
Target sequence:          5' AAGCTGGATTATCTCCTGTTG 3'    SEQ ID NO: 23
siRNA sense strand:       5' dTdTGCUGGAUUAUCUCCUGUUG 3'  SEQ ID NO: 24
siRNA anti-sense strand:  5' dTdTCAACAGGAGAUAAUCCAGC 3'  SEQ ID NO: 25
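The relationship between each 21-nucleotide target sequence and its two siRNA strands in the listing above can be sketched as follows. This is an illustrative reconstruction of the pattern visible in the table, not a design tool described by the applicants: the function name is invented, the input is assumed to have the form AA(N19), and the dTdT overhang is written at the position where the table prints it.

```python
# RNA base-pairing used to build the anti-sense strand.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def design_duplex(target_dna, overhang="dTdT"):
    """Return (sense, antisense) strands for a 21-nt AA(N19) DNA target."""
    target = target_dna.upper().replace(" ", "")
    if len(target) != 21 or not target.startswith("AA"):
        raise ValueError("expected a 21-nt target of the form AA(N19)")
    # Drop the leading AA and transcribe the remaining 19 nt to RNA.
    core = target[2:].replace("T", "U")
    # The anti-sense core is the reverse complement of the sense core.
    antisense_core = "".join(COMPLEMENT[base] for base in reversed(core))
    return overhang + core, overhang + antisense_core


sense, antisense = design_duplex("AACAGAGGCCTTGGAAACCTG")  # POSH-230 target
```

Applied to the four POSH targets, this reproduces the sense and anti-sense strands given for SEQ ID NOs: 15-16, 18-19, 21-22 and 24-25.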

[0297] Protocol For Assessing POSH siRNA Effects on the Kinetics of VLP Release

[0298] A1. Transfections

[0299] 1. One day before transfection plate cells at a concentration of 5x10 cells/well in 15 cm plates.

[0300] 2. Two hours before transfection, replace cell media with 20 ml complete DMEM without antibiotics.

[0301] 3. DNA dilution: for each transfection dilute 62.5 µl RNAi in 2.5 ml OptiMEM according to the table below. RNAi stock is 20 µM (recommended concentration: 50 nM, dilution in total medium amount 1:400).

[0302] 4. LF 2000 dilution: for each transfection dilute 50 µl Lipofectamine 2000 reagent in 2.5 ml OptiMEM.

[0303] 5. Incubate diluted RNAi and LF 2000 for 5 minutes at RT.

[0304] 6. Mix the diluted RNAi with diluted LF2000 and incubate for 20-25 minutes at RT.

[0305] 7. Add the mixture to the cells (drop wise) and incubate for 24 hours at 37° C. in a CO2 incubator.

[0306] 8. One day after RNAi transfection split cells (in complete MEM medium, into two 15 cm plates and one well of a 6-well plate).

[0307] 9. One day after the cell split perform HIV transfection according to SP 30-012-01.

[0308] 10. 6 hours after HIV transfection replace medium with complete MEM medium.

[0309] Perform RT-PCR for POSH to assess degree of knockdown.

[0310] A2. Total RNA purification.

[0311] 1. One day after transfection, wash cells twice with sterile PBS.

[0312] 2. Scrape cells in 2.3 ml/200 µl (for 15 cm plate/1 well of a 6-well plate) Tri reagent (with sterile scrapers) and freeze at -70° C.

Treatment      Chase time (hours)  Fraction  Labeling
Control = WT   1                   Cells     A1
                                   VLP       A1 V
               2                   Cells     A2
                                   VLP       A2 V
               3                   Cells     A3
                                   VLP       A3 V
               4                   Cells     A4
                                   VLP       A4 V
               5                   Cells     A5
                                   VLP       A5 V
Posh - WT      1                   Cells     B1
                                   VLP       B1 V
               2                   Cells     B2
                                   VLP       B2 V
               3                   Cells     B3
                                   VLP       B3 V
               4                   Cells     B4
                                   VLP       B4 V
               5                   Cells     B5
                                   VLP       B5 V

[0313] B. Labeling

[0314] 1. Take out starvation medium, thaw and place at 37° C.

[0315] 2. Scrape cells in growth medium and transfer gently into a 15 ml conical tube.

[0316] 3. Centrifuge to pellet cells at 1800 rpm for 5 minutes at room temperature.

[0317] 4. Aspirate supernatant and let tube stand for 10 sec. Remove the rest of the supernatant with a 200 µl pipetman.

[0318] 5. Gently add 10 ml warm starvation medium and resuspend carefully by pipetting up and down with a 10 ml pipette (just turning may not resolve the cell pellet).

[0319] 6. Transfer cells to a 10 cm tube and place in the incubator for 60 minutes. Set an Eppendorf thermo mixer to 37° C.

[0320] 7. Centrifuge to pellet cells at 1800 rpm for 5 minutes at room temperature.

[0321] 8. Aspirate supernatant and let tube stand for 10 sec. Remove the rest of the supernatant with a 200 µl pipetman.

0322 9. Cut a 200 ultip from the end and resuspend 0335 4. Wash beads once with high salt buffer, once cells (-1.5 107 cells in 150 ul RPIM without Met, but with medium salt buffer and once with low salt try not to go over 250 ul if you have more cells) buffer. After each spin don’t remove all solution, but gently in 150 ul starvation medium. Transfer cells to leave 50 ul solution on the beads. After the last spin an Eppendorf tube and place in the thermo mixer. remove Supernatant carefully with a loading tip and Wait 10 sec and transfer the rest of the cells from the leave ~10 ul solution. 10 ml tube to the Eppendorf tube, if necessary add another 50 ul to splash the rest of the cells out (all 0336 5. Add to each tube 20 til 2x SDS sample Specimens should have the same Volume of labeling buffer. Heat to 70° C. for 10 minutes. reaction). 0337 6. Samples were separated on 10% SDS 0323) 10. Pulse: Add 50 ul of S-methionine (spe PAGE. cific activity 14.2 uCi/ul), tightly cup tubes and place in thermo mixer. Set the mixing Speed to the lowest 0338 7. Fix gel in 25% ethanol and 10% acetic acid possible (700 rpm) and incubate for 25 minutes. for 15 minutes. 0324 11. Stop the pulse by adding 1 ml ice-cold 0339 8. Pour off the fixation solution and soak gels chase/stop medium. Shake tube very gently three in Amplify solution (NAMP 100 Amersham) for 15 times and pellet cells at 6000 rpm for 6 sec. minutes. 0325 12. Remove Supernatant with a 1 ml tip. Add 0340) 9. Dry gels on warm plate (60-80° C) under gently 1 ml ice-cold chase/stop medium to the pel WCUU. leted cells and invert gently to resuspend. 0341 10. Expose gels to screen for 2 hours and scan. 0326 13. Chase: Transfer all tubes to the thermo mixer and incubate for the required chase time Example 5 (830:1,2,3,4 and 5 hours; 828: 3 hours only). 
At the end of total chase time, place tubes on ice, add 1 ml 0342 Identification of Drug Targets For Anti-Neoplastic ice-cold chase/stop and pellet cells for 1 minute at Agents 14,000 rpm. Remove Supernatant and transfer Super natant to a Second eppendorf tube. The cell pellet 0343 A database of greater than 500 E3 proteins is freeze at -80 C., until all tubes are ready. assembled. The database contains many of the proteins presented in Table 2. A Subset of proteins is selected based 0327 14. Centrifuge Supernatants for 2 hours at on various characteristics, Such as the presence of certain 14,000 rpm, 4 C. Remove the Supernatant very domains. The expression of genes encoding the proteins is gently, leave 20 ul in the tube (labeled as V) and assessed in cancerous and non-cancerous tissues to identify freeze at -80 C. until the end of the time course. genes of the database that are overexpressed or underex 0328. All steps are done on ice with ice-cold pressed in cancerous tissues. Examples of cancerous and buffers non-cancerous tissues to be tested include: lung, laryn gopharynx, pancreas, liver, rectum, colon, Stomach, breast, 0329. 15. When the time course is over, remove all cervix, uterus, Ovary, testes, prostate and skin. tubes form -80° C. Lyse VLP pellet (from step 14) and cell pellet (step 13) by adding 500 ul of lysis 0344 Genes that are identified as overexpressed in cancer buffer (see Solutions), resuspend well by pipeting up are Subjected to siRNA knockdown in a cancerous cell line, and down three times. Incubate on ice for 15 min Such as HeLa cells. If the knockdown decreases proliferation utes, and Spin in an eppendorf centrifuge for 15 of the cancerous cell line, the gene and the encoded protein minutes at 4 C., 14,000 rpm. Remove Supernatant to are targets for developing anti-neoplastic agents. a fresh tube, discard pellet. 0345 POSH is overexpressed in certain cancerous tis 0330) 16. 
Perform IP with anti-p24 sheep for all sues, and POSH siRNA decreases proliferation of HeLa Samples. cells. 0331 C. Immunoprecipitation 0346 Incorporation. By Reference 0332 1. Preclearing: add to all samples 15 ul ImmunoPure PlusG (Pierce). Rotate for 1 hour at 4 0347 All of the patents and publications cited herein are C. in a cycler, spin 5 min at 4 C., and transfer to a hereby incorporated by reference. new tube for IP. 0348 Equivalents 0333 2. Add to all samples 20 ul of p24-protein G 0349 Those skilled in the art will recognize, or be able to conjugated beads and incubate 4 hours in a cycler at ascertain using no more than routine experimentation, many 4° C. equivalents to the Specific embodiments of the invention 0334 3. Post immunoprecipitations, transfer all described herein. Such equivalents are intended to be immunoprecipitations to a fresh tube. encompassed by the following claims. US 2003/O194725 A1 Oct. 16, 2003 25
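The RNAi dilution stated in step 3 of the transfection protocol (section A1) can be checked with a few lines of arithmetic. The sketch below is illustrative only. It assumes that the "total medium amount" is the 20 ml of complete DMEM from step 2 plus the two 2.5 ml OptiMEM dilutions from steps 3 and 4; the function and variable names are hypothetical and are not part of the protocol.

```python
def dilute(stock_uM, stock_volume_ul, total_volume_ml):
    """Return (dilution factor, final concentration in nM) for a stock added to a total volume."""
    total_volume_ul = total_volume_ml * 1000.0
    factor = total_volume_ul / stock_volume_ul      # e.g. 400 means a 1:400 dilution
    final_nM = stock_uM * 1000.0 / factor           # uM -> nM, then divide by the dilution factor
    return factor, final_nM


# Assumption: total medium = 20 ml DMEM + 2.5 ml OptiMEM (RNAi) + 2.5 ml OptiMEM (LF 2000)
TOTAL_MEDIUM_ML = 20.0 + 2.5 + 2.5

factor, final_nM = dilute(stock_uM=20.0, stock_volume_ul=62.5, total_volume_ml=TOTAL_MEDIUM_ML)
print(f"dilution 1:{factor:g}, final RNAi concentration {final_nM:g} nM")
# prints: dilution 1:400, final RNAi concentration 50 nM
```

Under that volume assumption the result agrees with the 1:400 dilution and the 50 nM recommended concentration given in step 3.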

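The selection logic of Example 5 (choose database entries by domain, keep genes overexpressed in cancerous tissue, then keep those whose siRNA knockdown decreases proliferation) can be sketched as a simple filter. This is a minimal illustration, not the applicants' implementation: the record fields, the fold-change and proliferation thresholds, and the gene names and values are all hypothetical placeholders, not measured data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    gene: str
    domains: frozenset                # annotated domains, e.g. {"RING"}
    tumor_expression: float           # expression in cancerous tissue (arbitrary units)
    normal_expression: float          # expression in matched non-cancerous tissue
    proliferation_after_sirna: float  # proliferation relative to control (control = 1.0)


def select_targets(candidates, required_domain, min_fold=2.0, max_proliferation=0.8):
    """Return genes that carry the domain, are overexpressed, and whose knockdown slows proliferation.

    min_fold and max_proliferation are hypothetical cut-offs chosen for illustration.
    """
    targets = []
    for c in candidates:
        if required_domain not in c.domains:
            continue                                  # step 1: subset by characteristic (domain)
        if c.normal_expression <= 0:
            continue                                  # cannot compute a fold change
        fold = c.tumor_expression / c.normal_expression
        if fold < min_fold:
            continue                                  # step 2: keep overexpressed genes only
        if c.proliferation_after_sirna <= max_proliferation:
            targets.append(c.gene)                    # step 3: knockdown decreases proliferation
    return targets


# Placeholder records (made-up numbers, hypothetical gene names)
EXAMPLE = [
    Candidate("GENE_A", frozenset({"RING"}), 8.0, 2.0, 0.5),
    Candidate("GENE_B", frozenset({"RING"}), 3.0, 3.0, 0.4),
    Candidate("GENE_C", frozenset({"HECT"}), 9.0, 1.0, 0.3),
    Candidate("GENE_D", frozenset({"RING"}), 6.0, 2.0, 1.0),
]

print(select_targets(EXAMPLE, required_domain="RING"))
# prints: ['GENE_A']
```

In practice the cut-offs would depend on the expression assay and the proliferation read-out used; the sketch only shows the order of the three filtering steps.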
TABLE 2.

HS. 1004 Gene: BRPF1 Sequence count: 105 Select G Protein Acc. DNA Acc. Length Description -- - ive 8- r -- E3, 6630865. AAF19605. 1 - a putativeDNA glycosylase 8-hydroxyguanine Ilomo sapicns - similar to bromodomain and is 15295343XP 054520.1 ar 1. 214 PHDfinger containing, 1 (H. w sapiens) Homo sapiens 190352 AAB0219. M91585 1214 Bria bromodomain-containing proteinperegrin bromodomain containing protein, 140kD Homo sapiens

HS10101 Gene: FLJ12875 Sequence count: 197 select GI Protein Ace. DNA Acc. Length Description -- - hypotheticalypothetical proteinprotei 3. 152975six 0157342 '' Fiz875 Homo sapiens) posals BAB14317.1 Akoya, as: tHomosapiens) protein product rolE. asoapsaaahoo. ecoloo 352 AAH10101Say; hypothetical (Homo sapiens)

ga 15559301AAH14010.1 BC014010 352 proteinFLJ12875'SEE, Homo sapiens

5.:: also: poisomos- 352 policypothetical protein----- sapiens

HS. 102652 Gene: ASH1 Sequence count: 190
Select GI Protein Acc. DNA Acc. Length Description

r i mons arosaristos AF2573051sapiens ASH1 (Homo

8922081 Nros. Moss hypothetical(Homosapiens) protein ASH1

Hs 102737 Gene: GP Sequence count: 363
Select GI Protein Acc. DNA Acc. Length Description

goliath protein Homo Sapiens a 3: 14733373XP_003972.2 264 AAH17100 likely ortholog of - smellino 276 mousegl-related zinc finger &X. protein Homo sapiens 767,7054 IAAF67007.1 | AF1556so 276 Al55650-goliath Protein & - Homosapiens goliath protein likely orthologof E 10092651 No. M 018434 276 mouse g1-related zinc finger e protein Homo sapiens

Hs. 1042 selectGene: SSA1GI SequenceProtein count: Acc. 62DNA Acc. Length. Description assis AABsoi - as Stor, 5. 747927 AAAs. - assos gene product

Earoahrosubcoloss as RSoon, FE 337485 |AAA36581.1 M3455 475 2-drossa ibonucleopoein at 338490 Aaaass. M62800 as 52-kD SS-A/Ro autoantigen

... - 88-- -r-, war w 52kD Ro/SSA autoantigen Sjogrensyndrome antigen A1 NP_003132.2NM_003141 475 Sicca syndrome antigen A 15208660 tripartite motifprotein TRIM21 i Homo sapiens

HS. 05280 Gene: GP Sequence count: 83
Select GI Protein Acc. DNA Acc. Length Description
15530305 AAH13948.1 BC013948 237 AAH13948 Similar to RIKEN cDNA 1700045I19 gene (Homo sapiens)

HS. 10590 Gene: ZNF313 Sequence count:500 Select Gl Protein Acc. DNA Acc. Length Description 530486 capagosi -- as d.1963K232Homosapiens (novel protein) ------6Koo Homo assia AAFsial AF265215 AF265215Sapiens 1 DJ963K232- (Homo

5.a 15489.174 AAH13695.1-- BC013695 228 313 E.P.'"Homo sapiens 8923898 Neos. 53.1 Moss 228 zincsJ963K23.2Homo finger protein 313 sapiens

HS. 106826 Gene: BHC80 Sequence count: 167
Select GI Protein Acc. DNA Acc. Length Description

rt spa Aaroo m as AF20848_1sapiens BM-006- (Homo

HisE. 16041692 IAAII15714.1 - so proteinAAH1574 Homo similar sapiens to KIAA1696 - 1992.3462NP 057052 - 634 ERAFSHDAC2(Homosapiens) complex BRAF35/HDAC2 complex 80 kDa 634 protein Homo sapiens

poss. Amos------total BAB1787. AB051483, 635 kaios protein Homo sapiens

---- ... ------: -----...------1------fg 10435111 BAB14492.1 Ak023258 634 HomosapiensEnamed Protein product

Hs 107153 Gene: ING1L Sequence count: 91
Select GI Protein Acc. DNA Acc. Length Description
4115555 BAA36419. AB012853 280 Nail Homo sapiens

E. 9psis AAgiosafossa, 280 aross, p33 Homo sapiens) I -- - --mm 12053588 |CAC20567.1 AJ006851 280 p32 protein Homo sapiens 4504695 NP 001555.1NM 001564. 280 inhibitor of growth-like EE - - Homosapiens) HS.108106

Gene: UHRF1 Sequence count: 118 Sclect G Protein Acc. DNA Acc. Length Description -E. 6815251- AAF2846 9.1 | AF129507 793 ICBP90so Homo transcription sapiens factor- AF274048_1 nuclear zinc is 14190527AAK55744.1 AF274.048 793 fingerprotein Np95 Homo E. y sapiens

i. housanabisi - Akos's so Homosapiensnamed protein product ubiquitin-like, containing PHD and RING finger domains, 1 16507204 NP_0374142 NM_013282 793 transcription factor ICBP90 Homo sapiens

Hs. 108183 Gene: ING4 Sequence count: 193
Select GI Protein Acc. DNA Acc. Length Description

- candidate tumor Suppressor 245 p33ING1 homolog Homo 17456536 XP_006980.2 sapiens) - AF1565521 p29ING4 Homo 249 Sapiens

E: issists AAL977. i. oposagasa Aroso- 262-AF063594-lbrain my036 protein --- x -wa. -- Homosapiens -

3. ---- o - 5730480 IAAD48585.1 | AF10645 24949 Candidate tumor Suppressor p3 E. s.svirus -v- y ING1 homolog Homo, sapiens)

14043612 AAH07781.1 BC007781 249 AAH07781 Unknown (protein for MGC:12557) (Homo sapiens)
sists AAH13038.1 BC013038 221 AAH13038 Similar to RIKEN cDNA 1700027H23 gene (Homo sapiens)
7 705861 Neospas. Mogo 249 song Homo sapiens

HS. 10915 Gene: MGC11279 Sequence count: 127
Select GI Protein Acc. DNA Acc. Length Description

---10178315 CAC084.01.1 bA181144Homosapiens (novel protein) amm-W | 12, Bcooz. 26 AAHO2912 Unknown (protein ::::::: 2011 AAHO292. BC0212 296 forMGC:l 1279) Homo sapicns) hypothetical protein aposts sportion oss 296 MGC11279 Homo sapiens

HS.110457 Gene: WHSC1 Sequence count: 470 Select GI Protein Acc. DNA Acc. Length Description ------W------Y------umammm mm---- Wolf-IIlirschhorn syndrome candidate) protein, isoform 2 IL-5

promoter REII-region-binding 19913346 (NP 055734.1 - 802 proteintrithorax/ashl-related protein 5 multiple mycloma SET domain protein Homo sapiens , Wolf-Hirschhorn syndrome candidatel protein, isoform 1 IL-5 promoter REII-region-binding 19913348NP 579877.1 - 1365 proteintrithorax/ashl-related protein 5 multiple myeloma SET domain protein Homo Sapiens) - - Wolf-Hirschhorn syndrome candidate1 protein, isoform 1 IL-5 1991.3350 (NP 579878-1- "X-X-..... proteintrithorax/ashi-relatedpromoter REII-region-binding protein 5 multiple myeloma SET domain protein Homo Sapiens) - Wolf-Hirschhorn syndrome candidate 1 protein, isoform 1 IL-5

promoter REII-region-binding 1991.3358 NP 5798.90.1 365 proteintrithorax/ashi-related S. protein 5 multiple myeloma SET | domain protein Homo Sapiens

| | | | Wolf Hirschhorn syndrome candidate 1 protein, isoform 5 IL-5

584 promoter REII-region-binding E. 1991.3361 NP 579891.1 - proteintrithorax/ashl-related protein 5 multiple myeloma SET domain protein Homo sapiens

------r- -- E.E----- sessi BAA83042.1 Aboor is KIAA1090sapiens protein (Homo E.; 3249713 AAC24150. AFO71593 1365 Miser type II Homo sapiens

atI aroo Aadios aross is- putativeSapiens WHSC1 protein (Homo

at!. asids addition AFO83387 1365 Sapienst WHSC1 protein Homo

assiss hapin AF083388 1365 sts WHSC1 protein Homo - AF330040 IL-5 promoterREII 12642795|AAK00344.1 AF330040 584 region-binding protein (Homo sapiens r siasts cabass, AJ007042 1365 rx. protein Homo sapiens Wolf Hirschhorn syndrome candidate 1 protein, isoform 4 IL-5 r 27 romoter REII-region-binding 65.946.83 NP 015627. I uron 629 roteintrithorax/ashl-related rotein 5 multiple myeloma SET domain protein/Homo sapiens/ E. 32497.15 AAC2415 1. I AF071594 647 user type IIHomo sapiens/

- - utative WHSC1 protein (Homo 4378022 AAD 19346. 1 | AF083391 629 sapiens/ -:3 2 ------, r s - y ------

23: so Adivas arosso putativesapiens/ WHSCI protein (Homo good aross or putative HSC protein (Homo - sapiens)

Hs. 11050 Gene: FBXO9 Sequence count: 328
Select GI Protein Acc. DNA Acc. Length Description

------as88: 5360123 |AAD soulafisia 434 AF1551141Homosapiens NY-REN-57 antigen &S -- - - 6164737 AAF045181 AF174597 327 A77Homosapiens box protein Tbx' 6103647 |AAF0704. AF176704 447 rootein FBX9 Homo e --- |6808.184 (CAB70786.1 A 137320 sa, hypotheticalSapiens protein (Homo

a . 12653729- AAHO06501 BC000650 327 IAA00650Similaronlyprotein 9 Homo to sapiens F-box

as: 69.12546 NP_036479.1NMam 012347. 447 box proteinES" Fbx9 Homo sapiens riX issio passal NM also- 437 F-boxit. onlyHomo protein sapiens 9 F-boxprotein F-box only protein 9 F-boxprotein 15812203 NP 25842. NM 033481 327 FbX9 Homo sapiens -

HS-1 10953 Gene: - Sequence count:0 Select G Protein Acc. DNA Acc. Length Description solo caboos. m 392 hypothetical protein Homo sapiens

HS.11 123 Gene: DKFZP564G092 Sequence count: 107 Select GI - Protein Acc. DNA Acc. Length Description --- bA57G10.5 (A novel protein.T 103 similarto KIAA0032) Homo Eg|988461i. ICAC04175.1 - Sapiens

| seaE: oposablaioli Abosis oss KIAA1593Sapiens protein (Homo --

E|.a 5419849 CAB46371. AL096715 110 sapiensyPhilPoint"

7661612 Np -- oscis NMosco.- 110 DKFZP564G092 protein (Homo sapiens)

Hs. 11 156 Gene: LOC51255 Sequence count: 257
Select GI Protein Acc. DNA Acc. Length Description

lei. 7106866- Aarossi O is AF151072Sapiens HSPC238 (Homo

sonE h2803913- IAAH02803.1 BC002803 153 Homosapiens'' hypothetical protein hypothetical protein LOC51255 7706039 Neosists. Mois. 153 Homosapiens

HS.112227 Gene: - Sequence count: 0 pNA 8 Select G Protein Ace. Acc Length Description M AF2553031 membrane-associated Fiji 983725 AAGOO432.1 - 1048 nucleicacid binding protein Homo E3 sapiens

HS. 17414 Gene: KIAA 1320 Sequence count: 56 Select G Protein Acc. DNA Acc. Length - Description 17944 be assos s 562 KIAA 1320 protein Homo sapiens) A. 724302 |Baaass labora so kaalso protein Homo sapiens

HS. 18174 Gene: TTC3 Sequence count: 575 Select G Protein Acc. DNA Acc. Length W Description -- - I. 1304,32|BAA 1769.1 D83077 - 2025-TPRE) (Homo sapiens------|------E. 2662364 |BAA2s66. D83327 1941 DCRRl Homo sapiens

1632762 Baalso D84294 2025 TPRD (Homo sapiens)
1632764 BAA12302.1 D84295 1792 possible protein TPRDII (Homo sapiens)

-w- arr - - - -r------1632766 BAA20). D84296 1715 TPRDIII Homo sapiens - tetratricopeptide repeat domain 1083.5037 son NM 00:316, 1792 (TPRStricopeptide repeat protein repeat D) protein 3 Homosapiens

Hs. I 19120 Gene: SMURF1 Sequence count: 153 select G Protein Acc. DNA Acc. Length Description : m similar to NEDD-4 (KIA0093) Fr. 712 similarto P46934 (PID:gl 171682) i. Homo sapiens ote locos - xx 0047327 BAB13451. Aboosas 859 KIAA1625 protein Homo sapiens) E3 ubiquitin ligase SMURF1 3- acosaaroos. AF96. 722 Homosapiens

HS. 119960 Gene: DKFZP727G051 Sequence count: 190
Select GI Protein Acc. DNA Acc. Length Description

. w - Homosapiens -

3. solios, cassosol All in 473 hypothere protein (Homo sapiens)

HS.1 2017 Gene: NEDD4L Sequence count: 228 select GI Protein Acc. DNA Acc. Length Description ------aboo BAA23711.1 AB007899 995 KIAA0439 (Homo sapiens) rwrip wirman-m------

assoccesloosasagasa Aporo is AF210730Sapiens 1 NEDD4La (Homo f soon cabiosi |aliano 820 real protein (Homo
12653675 AAH00621.1 BC000621 858 AAH00621 Unknown (protein for IMAGE:3346045) (Homo sapiens)

ubiquitin-protein ligaseNEDD4 like potential epithelial sodium | | | | | channelEastE regulator neuralprecursor 14719.404 NP 56092 NM 015277 854 own-regulatedt expressed, developmentally gene 4 likchomolog of yeast ubiquitin rotein ligase Rsp5 Homo sapiens

HS. 2429 Gene: TRIM36 Sequence count: 46
Select GI Protein Acc. DNA Acc. Length Description

8 is . . 1864.8883 CAB94831.1, A272269 728 sapiensE"g Pl" - tripartite motif protein 36zinc gig, 18924238NP 061170.1 NM_018700 728 binding protein Rbcc728 Homo fi: sapipiens

HS 121748 Gene: TRIM17 Sequence count: 21 Select GI Protein Acc. DNA Acc. Length Description -- ata: ------

E. 5114351AAD40286.1- AF156271 477 IEEE""Homosapiens - tripartite motif-containing 17testis a 7705825NP 057186.1 Mosto proteinterfringRING finger protein fingerprotein RING finger 16 Homo sapiens

HS.12256 ... Gene: MID2 Sequcnce count:40. Select GI Protein Acc. DNA Acc. Length Description - assis AAF07:41.. . . AF19648. 685 AF196481.E. 1 RINGsapiens finger protein ym-wn mnunu midline 2, isoform 1 TTT 6912.504 Neogas. NM 012216 715 tripartitemotif protein 1 midin 2 Homo sapiens

16445409 NP- 438112. NM_052817|- 685 E.tripartitemotif protein midin2

| | --- Homo sapiens) 5912440 CAB56154.1 Y18880 - 715 midline poten Homo sapiens

HS 12271 Gene: - Sequence count:0 Select: G ProteinAcc. TDNAAcc. length - Description ------5, 6164727AAF04513.1 ------283 AF174592.11 F-boxr. protein---- Fb6 Homosapiens

HS. 122764 Gene: BRAP Sequence count: 108
Select GI Protein Acc. DNA Acc. Length Description
3252872 AAC24200.1 AF035620 600 BRCA1-associated protein 2 (Homo sapiens)

ar rososi in 0067592 NM m-006768 592 BRCA1Homosapiens associated protein

& 2665906 IAAB885.38.1 AF035950 237 PTEPP'i'rotein.IHomo sapiens!

HS.12372 Gene: TRIM2 Sequence count:220 Select GI Protein Acc. DNA Acc. Length Description E.2. 47231 64 xpo 8435.2 - a tripartiteTRIM2Homo motif sapiensprotein - E.E. 3043558 |BAAs43. Aborios o sapiensthat 7 protein Homo

...Fig 12407367AAG53472.1|AF220018if -- . . . 744 EEEproteinTRIM2 Homo sapiens) 15029681 AAH11052.1 BC011052 744 2HomoAH102 sapiens partite motif Protein tripartite motif protein -

TRIM2KIAA0517 protein 13446227 NP_056086.1 NM_015271 744 tripartite motif protein 2 (Homo Sapiens s AH05016 Unknown (protein litziosautosolo BC005016 324 or IMAGE:3636175) (Homo

sapiens) ...we & hypothetical protein (Homo 230 sapiens/ Ex

HS. 12376 Gene: - Sequence count: 0 Select G Protein Acc. DNA Acc. Length |- Description says cabon is en Homo sapiens)

HS. 124024 Gene: DTX1 Sequence count: 72
Select GI Protein Acc. DNA Acc. Length Description

6163343 xposing r 620 deltex homolog 1 Homo sapiens deltex (Homo sapiens)

2981 175 AAC06246. arosso AAHO5816 similar to deltex(Drosophila) homolog 13543301 also BCOO586 Homo sapiens deltex homolog 1 - 4758202 NP_004407.1 NM_004416 620 deltex(Drosophila) homolog 1 hDX-1 Homo sapiens)

HS 1241 86 Gene: RNF2 Sequence count: 63
Select GI Protein Acc. DNA Acc. Length Description

-- ring finger protein 2 11423783 ke oral r 336 Homosapiens AF141327. 1 ring finger protein 4769008 apolitiafrast as BAP-1Horno-sapiens--- - &X & AAH12583 Similar to ring E.8 15214887AAH12583.1W - ecoloss - 3 fingerprotein 2 Homo sapiens

a 6005747 |NP -0091431 NM re 007212. 336 Egge'PHomosapiens

1785643 CAA71596.1 Y10571 336 dinG (Homo sapiens)

HS 12439 Gene: FLJ20188 Sequence count: 126 select G Protein Acc. DNA Acc. Length Description ------o-o-o-o-, - - - - - www.rama AAHO1586 hypothetical 16306784 AAHO1586.1 326 proteinFLJ20188 Homo sapiens

70201- 21 |BAA9002. Akoooo 326 unnamedHomosapiens protein product

- as s 8923179 NP 060173. NM 017703 326 HomosapiensYPothical Potein F208 - unnamed protein product nuous asso, AKO27004 273 (Homosapiens)

HS. 124835 Gene: FLJ20225 Sequence count: 39
Select GI Protein Acc. DNA Acc. Length Description

W w a 7020180 BAA91024.1 AK.000232 227 E"E"P"Homosapiens

: 1 s - E. so Neogloss. Molso 227 HomosapiensIt is protein FLJ20225

HS.12504 Gene: DKFZp76D081 Sequence count: 175 Select G Protein Acc. DNA Acc. Length. Description 2014): cabisco AL157474 hypotheticalSapiens protein (Homo

a 4714485m IAAH10369.1 BC010369- 137 mouseArkadiaAETEC Homo sapiens

hypothetical protein 89221 65 Neocoso. NM orgio 137 DKFZp761D081Homo sapiens)

HS.1253.00 Gene: TRIM34 Sequence count: 39 Select G Protein Acc. DNA Acc. Length Description - tripartite motif protein 34 isoform 1 E.in 18087807 NP 067629.2 a- 488 |responsiveinterferon-responsivering finger protein 21, interferon lfinger protein 1 Homo Sapiens


TT T. Trinartitetripartite motifnroteinmotif protein 34,34 isisoform 3 as 18641345 NP 569074. re- 270 ring finger protein 21, interferon a. -ram U. responsiveinterferon-responsivefinger protein l (Homo sapiens 11022688 B AB17049.1 ABO39902 488 interferon-responsive fingerprotein asses --- 1 middle form (Homo sapiens) 842 interferon-responsive fingerprotein s: 1 1022690 |BAB170501 aboo 1 long form Homo sapiens too. Babi 7051.1 aboo so I stillshort Iorm Homo sapiensfingerprotein -- AF2201431 tripartite motif E. 12407455 AAG53516.1 AF220 143 488 proteinTRIM34 alpha Homo it is sapiens) unnamed protein product re 8 14042869. BABS424. Akoso 48 (Homosapiens AF220144. I tripartite motif 1240745 ZAAG535 17.1 AF220 144 243 protein TRIM34 epsilon/Homo Sapiens/ fift|13093781i. cacos, also no sapiens|hypothetical protein (Homo

HS.127392 Gene: - Sequence count:0 select GI protein Ace. RNA length Description -- AF1892.86.1 p28 ING5H ---saissaalso 240 sapiens) 1 p. 35 Homo

& saro isos: - 240 ps ING5 Homo sapiens

HS. 127799. Gene: BIRC3 Sequence count: 168 Select GI Protein Acc. DNA Acc. Length Description --- inhibitor of apoptosis protein ! f ; bosaacsa). AF070674- 604 1Homo sapiens)

E3 160975 IAAc41943.1 :: L49432 604 proteinTNFR-TRAF - signalling complex 4502139NP 001156.1NM 001165 604 baculoviral IAP repeat

containingprotein 3 cIAP2 hiap-l apoptosis inhibitor 2 TNFR2-TRAF signallingcomplex protein Homo Sapiens) Mamm isolaacsson U375.46 604 MHC inhibitor of apoptosis protein 1 F. usic Acost. U45878 604 HS 127808 Gene: BIRC3 Sequence count: 42 Select G I Protein Acc. DN A Acc. Length - Description span cabalso Alias 180 hypothetical protein Homo sapiens) HS-127950 Gene: BRD1 Sequence count: 88 SelectSelect GI Protein Acc. DNA Acc. Length Description -- dJ522J7.2 (bromodomain 4200325 CAB 1574.1 1058 containing 1 (similar to peregrin, BR140)) Homo sapiens - AF0050671 BRL Homo 6979019 |AAF4320. AF005067 1058 sapiens) - — bromodomain containing protein 3 1132 1642 1058 1BR140-like gene Homo sapiens

poss39. Most hypothetical protein (Homo ; 526.2603 cabisz? I AL080 149 715 Sapiens/

Hs 1287 Gene: TRIM26 Sequence count: 204
Select GI Protein Acc. DNA Acc. Length Description
assic,isors, papaso - 539 widely expressed acid zinc finger (Homo sapiens)
4508005 NP_003440.1 NM_003449 539 tripartite motif-containing 26 acid finger protein zinc finger protein 173 (Homo sapiens)
sain Aaaais. U09825 539 acid finger protein

Hs 129829 Gene: AIRE Sequence count: 8 select GI Protein Acc. DNA Acc. Length Description 2696619-- re- as BAA-390.1BAA23990.1 - ength.s AIRE-1 HomoDescription sapiens boobaas. - 348 are: Homo sapiens) is 2696621 BAA23992.1 - 254 AIRE-3 (Homo sapiens)

apocanosis. 515 AIRE Homo sapiens) fi. possipaasi AB006682 545 AIRE-1 Homo sapiens r g; 2696617 BAA23989.1 AB006683 348 AIRE-2 Homo sapiens - --m-m-m-m------u- is poss |Baaass. AB006685 254 AIRE-3 Homo sapiens

autoimmune regulator AIRE -- isoform 1 AIRE protein autoimmune regulator f 4557291 NP 000374.1 NM_000383 545 polyendocrinopathycandidiasis(autoimmune -- ectodermal dystrophy) autoimmune regulator (APECEDprotein) Ilomo Sapiens autoimmune regulator AIRE isoform2 AIRE protein autoimmune regulator (autoimmune NP 000649.1 NM 000658 348 polyendocrinopathycandidiasis - | ectodermal-dystrophy) autoimmune - - - regulator (APECEDprotein) Homo Sapiens - autoimmune regulator AIRE isoform3 AIRE protein

autoimmune regulator 4557295NP 000650.1 NM 000659 254 (autoimmunepolyendocrinopathycandidiasis lectodermal dystrophy) autoimmune regulator (APECEDprotein) Homo

sapiens, assacaboo. Z97990 545 AIRE protein (Homo sapiens) 8

HS130541 Gene: KIAA1542 Sequence count: 113
Select GI Protein Acc. DNA Acc. Length Description
assas, eason AB040975 iss kinas protein Homo sapiens)
an AAH04950.1 BC004950 322 AAH04950 Unknown (protein for IMAGE:3619689) (Homo sapiens)
HS. 131 731 Gene: FLJ11099 Sequence count: 76
Select GI Protein Acc. DNA Acc. Length Description
7023550 BAA920). Akoo 132 unnamed protein product (Homo sapiens)

- 10434923BAB14423.1 Ako23139 327 I Homosapiensnamed Protein Product

wo-w-r-rm802286, NP 0607001 NMoisso- 152 hypotheticalHomosapiens) Protein F11099

HS13 1859 Genc: FBXO11 Sequence count: 12 select G rest DNA Ace. length Description

6164741 AAF04520.1 - W- - - M. r - - 1 97 sisterosaruHomosapiens Y S 2xxxx sness Arizoni ar 7670 192 Fox protein FBX11 Homo sapiens

HS.32753 Gene: FBXO2 Sequence count: 122
Select GI Protein Acc. DNA Acc. Length Description

------mummy M-m-m-m- Wu------hox onvnroteino Ho ------19263634 AAH25233.1- - F-boxsapiens) only protein 2 (Homo - - - - - 6164731 AAF04515.1|AF174594 257 EHomosapiens) Protein -- - 6018317 AAF01822.1 AF187318 295 f S. i. F-boxa protein Fbx2- - i. onosapiens ------a-

15821.98 NP 06:02 NM 012168, 296 -boxHomo only sapiens protein 2 F-boxprotein

HS 13495 Gene: REQ Sequence count: 273 Select G Protein. Acc. DNA Acc. length Description Warrear AAH14889 requiem, apoptosis 15928853 AAH 48.89. 391 responsezinc finger gene Homo Sapiens T-- - requiem neuroD4 ubi-d4 252970s AAB81203.1 AF001433 391 IES",Homosapiens "P"

- * requiem apoptosis response, 5454004 NP 0.06259.1 NM 006268 391 zincfinger protein ubi-d4 Homo Sapiens) -

AAB58307. U94585 requiem homolog Homo sapiens)

HS. 35890 Gene: PF 1 Sequence count: 84
Select GI Protein Acc. DNA Acc. Length Description

4278861 AAK38349. Ayoos to: E.factor Zinc Homo finger sapiens transcription & AAHO1657 Unknown (protein 12804495 AAHO1657.1 BC001657 487 forIMAGE:3356959) Homo ------, -sapiens------

79.593 13 BAA96.047. I apolo 6 as 7 ku 1523 protein Homo sapiens/

unnamed protein product 104.366.36 BABI 4875. I AKO24290 590 /Homosapients/

HS.3755 Gene: FBXW2 Sequence count: 102
Select GI Protein Acc. DNA Acc. Length Description

gigs sia IAAF046s. - - AF129531.1Homosapiens F-box protein Fbw2

: riai:x$ 479. |Aaraps AF176698 F-boxsapiens protein FBW2 (Homo

F.3. 104.3896 BAB14051. Akoz2484 454 unnamedHomosapiens) protein product - F-box and WD-40 domain protein 912360 INP 036296.1 NM 012164 422 2F-box protein Fbw2 Homo &: Sapiens -

Hs. 137732 Gene: TRIM35 Sequence count: 65 Select GI -o-Protein Acc. DNA Acc. Length Description gigs 147437924,479. XP- 027437.1 m 306 tripartitest motif-containing 35Homo grgy: 5689533 |BAAsosol anoi 504 kaios protein Homo sapiens

HS.1386.17 Gene: TRIP12 Sequence count: 320
Select GI Protein Acc. DNA Acc. Length Description
460711 eaaos, D28476 1992 kaaos (Homo sapiens)
703100 AAC41731. L40383 174 thyroid receptor interactor
10863903 NP_004229.1 NM_004238 1992 thyroid hormone receptor interactor 12 thyroid receptor interacting protein 12 (Homo sapiens)

HS.4084 Gene: RNF7 Sequence count: 191 Select G Protein Acc. RNA length Description --

- Tasmaassos AAD596).ansoo as - inis proteinSAGE. (Homozinc RING sapiens finger 4809218 IAAD301.47.1 - I 113 AF142060 RING finger protein

:38: ------un ------(Homosapiens) -- soiza AADs , CKBBP1AF164679_1 (Homo ring sapiens) finger protein

assic also - w 113 AAHO5966Homosapiens) ring finger protein 7 14250389 AAHO8627. 113 AH08627Homosapiens ring finger protein 7

HS142653 Gene: RFP Sequence count: 334
Select GI Protein Acc. DNA Acc. Length Description
585.1985 CAB55434.1 to d25J64 (ret finger protein) (Homo sapiens)
12275874 AAG50172.1 AF230393 513 AF230393_1 tripartite motif protein TRIM27 alpha (Homo sapiens)
12275876 AAG50173. AF230394 358 AF230394_1 tripartite motif protein TRIM27 beta (Homo sapiens)
15488901 AAH13580 BC013580 513 AAH13580 ret finger protein (Homo sapiens)
337372 AAA36564.1 J03407 513 rfp transforming protein
5730009 NP_006501.1 NM_006510 513 ret finger protein isoform alpha tripartite motif protein TRIM27 (Homo sapiens)
15O1 as in 2.1 NM_030950 358 ret finger protein, isoform beta tripartite motif protein TRIM27 (Homo sapiens)

Hs. 142684 Gene: DKFZP667O116 Sequence count: 78
Select GI Protein Acc. DNA Acc. Length Description
13543372 AAH05847.1 BC005847 857 AAH05847 Unknown (protein for IMAGE:2907142) (Homo sapiens)

------R |221978 Cacoloral asps, 184 hypothetical protein (Homo sapiens/

IIS. 14398 Gene: ING3 Sequence count: 139

Select TGI Protein Ace. DNA Acc. Length Description similar to tumor Suppressor 3. 304 1855 AAC12956.1 408 p33INGsimilar to AF044076 - (PID:g2829208) Homo sapiens ------1-100395.41 Mald Zot. AAG12172.1 AF074968 I 418 -97.PNGHomosapiens Protein

isa 7019962 BAA90942.1 AK000096 418 Ea"P"P"Homosapiens E. lossalagases avorio 418 4. Homo sapiens 4. inhibitor of growth family, X member3 hypothetical protein 9506659 NP_061944.1 NM_019071 418 similar to tumor Suppressor

p33ING1 Homosapiens hypothetical protein (Homo t; ; 1513.1675 CAC48260 AL603623 378 - sapiens)

:::::::: -7 AH09777 Unknown (protein E. anslation 7. I BC009777 92 or MGC: 13446) (Homo sapiens) : as: alora, BC009776 92 gMGC.AH09776 13445) Unknown (Homo (protein sapiens)

HS.143323

Gene: PLU Sequence count: 309 Select G Protein Acc. DNA Acc. Length Description

- 38%; c ar putative DNA/chromatin ------3. 1923370NP 06092 ------los RSt. ii. 3970878 |BAA34803.1 AB015.348 431 rurbo Homo sapiens g 2. - - F------w ------a

4322488 AAD16061.1 AF087481 1580 retinoblastoma binding protein 2 homolog 1 (Homo sapiens)
4902724 CAB43532. A. 132440 1544 PLU-1 protein (Homo sapiens)

------, -r - - 6572291 ICAB63108.1 Anato 1681 RB-binding protein Homo sapiens) a 6453448 CAB61368.1 also 1350 hypothetical protein Homo sapiens |- - 6453463 capsians AL133048 1028 brothetical protein Homo r Eli ...... 6453514 ICAB61395.1 AL133072 916 hypothetical protein Homo sapiens

mer 6808379 (CAB70847. AL137622 626 hypothetical protein Homo sapiens

HS. 144266 Gene: FLJ22612 Sequence count: 14 Select G Protein Acc. DNAx Acc. Length Descriptionh - unnamed protein product

F: logooseabso Akosos- 516 Homosapiens Ed assia sporos. Morris sic hypotheticalFLJ22612(Homo protein sapiens HS.144658 Gene: - Sequence count:0 select'-1.-1.- c ProteinAcc. Acc.DNA Length- Descriptionrintin. |- hPOSH2 based on gi 18676780 with a s protlog4.0 - 729 changeof A->G (Homo sapiens)

HS.46037 Gene: RNF32 Sequence count: 43
Select GI Protein Acc. DNA Acc. Length Description

-- 12-2761.78 - |Aassos. ra so AF325690sapiensqani FKsG33 (Homo AF4412221 ring finger protein ; 20278963 AAM1861. - 362 Affin sapiens ------. -- 12053253 CAB66808.1 AL136874 362 sapiens)Elia protein Homo M-ama - r - | -assos NP -- 112198.1 NM- 030936, 362 (Homosapicns)ling finger Protein 32

Hs. 149918 Gene: GASC1 Sequence count: 330
Select GI Protein Acc. DNA Acc. Length Description

as: 18573213 XP 03462441 - is cellcarcinomalgene amplified in Homo squamous sapiens) 8:S E: 3882281 |BAA4500. AB018323 1100 KIAA0780 protein Homo sapiens)

33 ---1056.7164 BAB16102.1 AB037901 1056 cellcarcinoma-1gra"P"Sa"'s Homo sapiens

HS15237 Gene: FLJ12526 Sequence count: 18
Select GI Protein Acc. DNA Acc. Length Description
13644-171 XP_018257.1 - 155 hypothetical protein FLJ12526 (Homo sapiens)
10434064 baba 15. Ako.2588 155 unnamed protein product (Homo sapiens)
13376152 NP- 0706 1 NM_024787 155 hypothetical protein FLJ12526 (Homo sapiens)

HS. 151411

Gene: KIAA0916 Sequence count: 229 select G Protein Acc. DNA Acc. Length. Description six: gal BAA7939.1 AB0073 | 120 kiwoc protein (Homo sapiens)

------33 19326 acaws AFO75587 as proteinHomosapiens associated with Myc - -- ig 7662380 Posnanmosos, 4641 KIAA0916 protein Homo sapiens)

HS. 151428 Gene: RFP2 Sequence count: 133 Select GI Protein Acc. DNA Acc. Length Description i. sizasaakios - 407 car Homo sapiens) AAK51624 putative tumor 1409479'AAksio24. - 407 suppressorRFP2 Homo sapiens US 2003/0194725 A1 Oct. 16, 2003 48

&S3 ------lamm -wn. - sg: 14594775 CAC43391.1 2)bA34F20.1 Homo sapiens (ret finger protein AF220127.1 tripartite motif proteinTRIM13 alpha Homo ria, 12407423 asso AF2201 27 407 sapiens AF220128 1 tripartite motif E 12407425AAG53501.1 AF220128 175 proteinTRIM13 beta Homo sapiens) -

aE. 965.1927 AAF91315.1 | AF241850 407 AtHomosapiens finger Protein? also caloisi AJ224819 407 uno suppressor Homo sapiens

r&: songs AAHO3579.1 BCOO3579 407 AirHomosapiens ret finger protein 2 ret finger protein 2

5031861 NP_005789.1 NM_005798 407 candidate tumor suppressor involved in B-CLL; tripartite motif protein 13; CLL-associated RING finger (Homo sapiens)

ret finger protein 2; candidate tumor suppressor involved in B-CLL; tripartite motif protein 13; CLL-associated RING finger (Homo sapiens)

Hs.153638 Gene: MLL2 Sequence count: 171
Select GI Protein Acc. DNA Acc. Length Description
passes acsins. AF010403 5262 ALR (Homo sapiens)


passaacsins. AF010404 4957 ALR (Homo sapiens)
4505197 NP_003473.1 NM_003482 5262 myeloid/lymphoid or mixed-lineage leukemia 2; ALL1-related gene (Homo sapiens)

Hs.153639 Gene: SBB103 Sequence count: 230

Select GI Protein Acc. DNA Acc. Length Description
3342562 AAC27647.1 AF077599 317 hypothetical SBB103 protein (Homo sapiens)
50207 NP_005776.1 NM_005785 317 hypothetical SBB103 protein (Homo sapiens)
9956003 AAG01988.1 AY007109 141 similar to Homo sapiens hypothetical SBB103 protein mRNA with GenBank Accession Number AF077599

Hs.153685

Gene: KIAA0322 Sequence count: 18 select G Protein Acc. DNA Acc. Length Description E. 2224,585 Baazoso (Boszo 1so kaans: Homo sapiens

is 10039443 BAB13352.1 AB04.8365 1585 NEPPiquitingsHomosapiens

Hs.15423 Gene: HDCMC04P Sequence count: 157
Select GI Protein Acc. DNA Acc. Length Description
4153862 AAD04721. 592 determined by GENSCAN prediction and spliced EST match to EST R84329 (NID:942735) (Homo sapiens)
7021918 BAA91435.1 - 594 unnamed protein product (Homo sapiens)

3.is: assos Arisai AFO67804 as sapiens)tips HDCMC04P (Homo

Ed 18923726NP 061152.1 NM_01 8682 as hypotheticalHomesapiens protein ------HDCMC04P . --- - -

Hs.15467 Gene: FLJ20725 Sequence count: 105
Select GI Protein Acc. DNA Acc. Length Description
11434324 spons - 305 hypothetical protein (Homo sapiens)
7021002 BAA91346.1 AK000732 305 unnamed protein product (Homo sapiens)
NP_060413.1 NM_017943 305 hypothetical protein FLJ20725 (Homo sapiens)

HS.154680 Gene: DKFZP434M154 Sequence count: 48 Select GI Protein Acc. DNA Acc. Length Description ... it1476105sixpos13301 i u i vour - || 427 PFAPMHomosapiens Protein so cabasso Alsos 361 hypothetical protein Homo sapiens t 3543342 was boss 515

HS.15470 Gene: LOC51132 Sequence count: 66 Select --- G Protein Acc. DNA Acc. Length Description RING zinc finger LIM domain E. 10944.884 cacias 624 bindingprotein Homo sapiens

AF155109 1 putative ring zinc son AAD42875.1 | AF155109 483 fingerprotein NY-REN-43 antigen Homo sapiens unnamed protein product . 7022528 basis AKO01334 624 Homosapiens) T-H -- AAH13357 Unknown (protein forMGC:1561) Homo sapiens AAH13357. ecoss 624 - I so - putative ring zinc finger protein.NY-REN-43 antigen

7705835 NP_057204.1 NM_016120 483 putative ring zinc finger protein NY-REN-43 antigen; RING zinc finger LIM domain binding protein (Homo sapiens)

Select GI Protein Acc. DNA Acc. Length Description .. E. riana kpool - loss ubiquitin-protein(E3) Homo sapiens) isopeptideligase re 285983 easovo D13635 loskaaolo Homo sapiens tools possi Molast easing ------assos, autocol to as E. HS. 155313 Gene: DATF1 Sequence count: 252 Select GI Protein Acc. DNA Acc. Length Description - d 885L7.9.3 (Death re M - associated transcription factor 1 F: o2733867|CAC28883.1 - ' containsHomo sapiens) KIAA0333), isoform 3)

AAH14489 death 1568O267 has 562 associatedtranscription factor 1 Homo sapiens death associated transcriptionfactor 3, 18375617 NP_071388.2 562 1, isoform a death inducer & - obliterator 1 Homo sapiens. - -- death associated transcriptionfactor is 18375619 NP_542986.1 - 562 1, isoform a death inducer obliterator 1 (Homo sapiens) death associated transcriptionfactor -- 18375621 NP 542987.1 544 1,st isoform No. b1 deathHomo inducer sapiens is 2224607 BAA20791.1 asons 99 kaans Homo sapiens) : % worw-o-o-o- - - - - 7023815 BAA920941 AK002127 so (Homosapiens)named Protein product - AAHO0770 hypothetical 12a —12653953. AAHO0770.1 BC000770. 544 inducer-obliterator-1EESTEth (Homo sapiens - x4 mm momma-a-Mimama ------g aissaloist- peogs,1. proteinFLJ11265AAHO4237 hypothetical similar to death US 2003/0194725 A1 Oct. 16, 2003 52

inducer-obliterator-1 (Homo sapiens)

Hs.155968 Gene: ZFP103 Sequence count: 67
Select GI Protein Acc. DNA Acc. Length Description
12728697 XP_002551.2 - 685 zinc finger protein 103 homolog (mouse) (Homo sapiens)
1945615 BAA979. D76444 685 hkf-1 (Homo sapiens)
5031825 NP_005658.1 NM_005667 685 zinc finger protein 103 homolog (mouse); zinc finger protein homologous to Zfp103 in mouse; zinc finger protein expressed in cerebellum (Homo sapiens)

HS. 155983 Gene: KIAA0677 Sequence count: 163 Select G Protein Acc. DNA Acc. Length Description 33271.68 BAA362.basis AB014577- | 1064 Sapiens proteinotein HHomo agig 12803467 AAHO2558.1 BC002558 1064 E.product HomosapiensKIAA0677 gene

fa 7662246 NP- 055478. NM- 014663. 1064 Homosapiens)'g''P'

HS. 156276 Gene: KIAA0783 Sequence count: 218

Select GI Protein Acc. DNA Acc. Length Description

------aspszaalso |AB01826 | 888 KIAA0783 protein-Homo sapiens. ------— KIAA0783 gene product ------

7662304 possisi NM ---oasso 888 Homosapiens

HS. 565 Gene: NEDD4 Sequence count: 104
Select GI Protein Acc. DNA Acc. Length Description

i. ins: P46934 - 927 NED1.HUMANNEDD-4 protein KIAA0093 gene product is related E: 577313 BAA07655.1 D420s 927 toNEDD-4 protein. Homo sapiens

HS. 56637 Gene: CBLC Sequence count: 46 Select G Protein Acc. DNA Acc. Length - Description m Cas-Br-M (murine) T 2014.9596 NP 036248.2 - 474 lectropicretroviral transforming sequence c. CBL-3 Homo sapiens) r: loss Bassos. Aboss 474 Cbl-c Homo sapiens.

i.at 4959421- Aadual AF117646, 474 AF1176461Homosapiens long CBL-3 protein

4959423 AAD34342.1|AF117647. 428 ETShortHomosapiens C. Protein

HS. 157427 Gene: RFPL2 Sequence count: 7 select GI Protein Acc. DNA Acc. Length Description Eg|41.118593967 XP - 009938.3 - retfingerHomosapiens protein-like . 2

3417317 CAA09045.1 AJ010231 288 RET finger protein-like 2 (Homo sapiens)

ad-a 5730011 NP 006596. NM_006605- 288 E.P."Homosapiens -

HS. 1579 Gene: ZNF147 Sequence count: 41 Select GI Protein Acc. DNA Acc. Length Description

1873. Aaho. - 630 AAH16924 zinc finger protein 147 (estrogen-responsive finger protein) (Homo sapiens)
458726 BAA04747.1 D21205 630 estrogen responsive finger protein (efp) (Homo sapiens)
4827065 NP_005073.1 NM_005082 630 zinc finger protein 147; zinc finger protein-147; estrogen-responsive finger protein; tripartite motif protein 25 (Homo sapiens)

HS. 158761 Gene: LOC93349 Sequence count:37 select G Protein Acc. DNA Acc. Length Description E. --anals ke oso. - 245 roducts to Homo unnamed Sapiens) protein hypothetical protein Fa poisons span. 245 BC004921 Homo sapiens

it. 0434890 BAB14413. AK023116- 245 Homosapiensnamed Protein Product & 31. x 3. Isaac AAHO4921. 1 cool 12 AAHO4921forMGC:4821) Unknown (Homo (protein sapiens)

Hs. 15921 Gene: FLJ10759 Sequence count: 80 Select G Protein Acc. DNA Acc. Length Description li:as 7022987 BAA91792.1 AK001621 475 Ea"ProteinHomosapiens Product

g 12654759 AAHO1222.1 BC012 - 475 proteinAli FLJ10759 hypothetical Homo sapiens --- $2:...'.E. 14124950 AAHO7999.1 BC007999 475 proteinFLJ10759Air hypothetical Homo sapiens -E. sons AAhloss.,,,,,,,,, BCO1 1689 T. is AAH1689forMGC: 19672) Unknown (Homo (protein sapiens) |alignati,E|15082476 AAH12152.1is BC012152als, isas EggsAAH12152 Unknown Homo (protein" sapiens

3.in 18922648 NP 060677. NM 018207 475- EPhIP'n''Homosapiens

Hs.159589 Gene: NEUD4 Sequence count: 48
Select GI Protein Acc. DNA Acc. Length Description

---- -r-1 wr-r------unwrax m wavass-wn was ... ------a------4 trary ho - s 17879 possi NMost 353 Homosapiensit. d4 (rat) homolog. rt so AAC50685.1 U43843 353 heroid protcin Homo sapiens) HS.16036 Gene: FLJ12565 Sequence count: 1 12 Select GI Protein Acc. DNA Acc. Length - Description - unnamed protein product | Falli. 0434127- BAB14139,1|AK022627 622 Homosapiens

pairE. logoszeabisco. Ak026068 772 unnamedHomosapiens protein product

g posoncapsai AL136729- 442 ElsieSapiens protein Homo

anf his457 - NP. 07.1347. NM o22064was 772 FLJ12565hypothetical Homo Protein sapiens) Hs. 16537 Gene: ZNF364 Sequence count: 112 Select GI Protein Acc. DNA Acc. Length Description hypothetical protein, similar 17488930XP 039714.2 - 304 it (U06944) PRAJA1 (Homo sapiens) -- hypothetical protein, similar E. 5102894 CAB45280.1 AL079314, 232 to(U06944) PRAJA1 Mus : - musculus Homo sapiens Hs. 165662 Gene: KIAA0675 Sequence count: 131 Select GI Protein- - - Acc. - DNA Acc. Length Description4 ------wr-e-a-wo- ro--oncour r 3327164 baarsson AB014.575 1208 Af protein Homo - - - - E. - - - Sapiens

f is 7662244 NP--- Oss463.1 NM_01.4648 1208 ISPHomosapiens

AF2 79370 508 - - 14582392 Akaoist I a------Sapiens)ity Inari,(Homo US 2003/0194725 A1 Oct. 16, 2003 56

HS. 16577 - Gene: FBXO3 Sequence count: 170 Select G Protein Acc. DNA Acc. Length - Description ------a loosomalcabasco - a hypotheticalSapiens protein (Homo |Est. AF174595 1 F-b in FbX3 is:3 6164733 |AAF04516. AF174595 173 HomosapiensF-box protein Fbx3 .

i.a 6103643 AAF03702.1 AF176702- 410 SapiensBP''''" unnamed protein roduct is 7023521 BAA9 1991.1 AK001943 471 Homosapiensned protein p Free t t only protein 3, isoform 1 F E. 15812186NP 0363.07.2NM 012175 471 box protein FBX3 Homo sapiens e-S F-boxs onlyy protein 3, isoform 2Fre E. 15812188|NP 208385.1 INM 033406 415 box protein FBX3 Homo sapiens

HS. 66204

Gene: PHF1 Sequence count: 195 |Select GI Protein Acc. DNA Acc. Length Description cICK0721 Q.4.1 (PHD finger is 3169118 |CAA16158.1. 457 protein 1)(isoform 1) Homo Sapiens -r to- cICK0721 O.4.1 (PHD ?inger 3x3 3169119 CAA16159.1, - 567 protein 2)(isoform 2) Homo 3. sapiens El 2660720 AACs206. AF029678 457 PHF1 Homo sapiens i.

HD fi in2 Homo acao aacson aposons so, Pie protein2 (Homo E. 14250730 AAHO8834.1 BC008.834 567 AHomosapiens PHP finger Protein

4505777 NP_002627. NM_002636 457 PHD finger protein 1, isoform a (Homo sapiens)

PHD finger protein 1, isoform it: asso, |NP 077084. Mozas bHomo sapiens US 2003/O194725 A1 Oct. 16, 2003 57

Hs. 167750 Gene: RFPL1 Sequence count: 4 Select G Protein Acc. DNA Acc. Length Description

gainia- |CAA09043. aloons 288 relil Homo sapiens rata 3417314 cao. AJO 10229 as RFPL1S (Homo sapiens)

a 0440558 NP 066306.1 NM021026 288 E"F"""Homosapiens

Hs.167751 Gene: RFPL3 Sequence count: 6
Select GI Protein Acc. DNA Acc. Length Description
4468862 CAB38256.1 288 dJ149A16.2 (Ret finger protein-like 3) (Homo sapiens)
3417319 CAA09046.1 AJ010232 288 RET finger protein-like 3 (Homo sapiens)

5730013 NP osos in M006604- 288 Homosapiensfinger protein-like 3

Hs.168095 Gene: RNF20 Sequence count: 101
Select GI Protein Acc. DNA Acc. Length Description

is 18572254 XP 084272.1 - ring fingerprotein 20 E. -- Homosapiens - - - - - FEE. (1427.9233 Aaksson AF26520 975 20AF265230 Homo sapiens) IRING finger protein

zoso- Baazos,- - - - - Akoos, as Homosapiens)named protein product 10433666 BAB1400s. Ako2.300 746 unnamedIIomosapicns Protein Product - 2x25. - - -

.. a 10433974 BAB14081. - AK022532 975 HomosapiensEa"Pon Product 8. . ino finoer nrotei 16554453. NP_062538.4- NM_019592H. : 975 E.Homosapiens "SP" '


HS. 68159 Gene: LOC51283 Sequence count: 202 Select GI Protein Acc. DNA Acc. Length Description i------18605049 XP 02731 1.3 soO apoptosis regulator (Homo & 3. sapiens | in - : a 4 sy:a 7320979 IAAF59975.1|AF173003 450 ET-PP'sHomosapiens) "gat"

a 12804383 IAAHO30541 BC003054 450 A305Homosapiens PP'ss'gulato' - -E no NP 07:45. NMoosa 450 g regulator Homo

HS. 170610 Gene: MAP3K1 Sequence count: 86 Select G Protein Acc. DNA Acc. Length Description s psis actors Afoss 1495 Mekkine Homo sapiens

Hs. 170822 Gene: DKFZP564A022 Sequence count: 32 Select G Protein Acc. DNA Acc. Length Description : a 4042657 BAB55340.1 AK027748 - 258 Ea"P"P"Homosapiens -

i.i. rosaics caboss AL 36620 hypotheticalSapiens protein (Homo : assospizio Moosa ' hypotheticalDKFZ564A022(Homo protein sapiens)

Hs.172084 Gene: PYGO2 Sequence count: 124
Select GI Protein Acc. DNA Acc. Length Description
16160297 XP_034083.2 - 406 similar to Unknown (protein for IMAGE:3627860) (Homo sapiens)
19550451 aloist. - 406 pygopus 2 (Homo sapiens)
13543991 AAH06132.1 BC006132 463 AAH06132 Unknown (protein for IMAGE:3627860) (Homo sapiens)

Hs.172700 Gene: NEURL Sequence count: 99
Select GI Protein Acc. DNA Acc. Length Description
20070955 H26336.1 - 574 neuralized-like (Drosophila) (Homo sapiens)
4103928 AADossil aroon 574 neuralized (Homo sapiens)
4758800 NP_004201.1 NM_004210 574 neuralized-like (Drosophila); neuralized (Drosophila)-like (Homo sapiens)
3157991 Aaciliti U87864 574 neuralized homolog (Homo sapiens)

Hs.172777 Gene: BIRC4 Sequence count: 17
Select GI Protein Acc. DNA Acc. Length Description
13649024 XP_013050.3 - 497 similar to baculoviral IAP repeat-containing 4 (H. sapiens) (Homo sapiens)
4502143 NP_001158.1 NM_001167 497 baculoviral IAP repeat-containing protein 4; apoptosis inhibitor 3; X-linked inhibitor of apoptosis (Homo sapiens)
Ioasaacsosis U32974 497 IAP-like protein ILP

184320- AACSO373.1 U45880 497 E"apotosisprotein

Hs.173980 Gene: NMP200 Sequence count: 217
Select GI Protein Acc. DNA Acc. Length Description
17391461 AAH18665.1 - 504 AAH18665 nuclear matrix protein NMP200 related to splicing factor PRP19 (Homo sapiens)
17391520 AAH18698.1 - 504 AAH18698 nuclear matrix protein NMP200 related to splicing factor PRP19 (Homo sapiens)
5689738 cassissi AJ131186 504 nuclear matrix protein NMP200 (Homo sapiens)
14250536 AAH08719.1 BC008719 504 AAH08719 nuclear matrix protein NMP200 related to splicing factor PRP19 (Homo sapiens)
ions earn NM_014502 504 nuclear matrix protein NMP200 related to splicing factor PRP19 (Homo sapiens)

Hs.17639 Gene: UBE3B Sequence count: 318
Select GI Protein Acc. DNA Acc. Length Description
13507059 AAK28419.1 AF251046 185 AF251046 ubiquitin protein ligase (Homo sapiens)

Hs.177635 Gene: KIAA1095 Sequence count: 117
Select GI Protein Acc. DNA Acc. Length Description
sosz, eason-- AB029018 1098 KIAA1095 protein (Homo sapiens)
7018547 CAB75679.1 AL157498 480 hypothetical protein (Homo sapiens)

HS, 17767

Gene: KIAA1554 Sequence count: 696 Select G toProtein soon Ace. long,DNA Acc. Lengthw-r-m------|- Description 10047173 |BAB1380. AB046774 1320 KIAAisi protein Homo sapiens 10438750 BAB1530. aroos (Homosapiens)unnamed protein product

3. 10438576 | BAB15280.1 AK025914 919 (Homosapiens)Elm P'7"P"


unnamed protein product 10438,270 | BAB15212 I |AK025676 1036 it. 7 104.35940. BABI 1708, AK02.3871509 unnamed protein product Oe US 2003/0194725 A1 Oct. 16, 2003 61 (Homosapiens,

Hs. 179260 Gene: C14orf4 Sequence count: 72 Select G Protein Acc. DNA Acc. Length Description a loss6484 cacios39.1 - || 796 polyglutamine-containing E is esses is . r protcinrotcin HHomo sapiensiens) m 14784721 XP 041 104.1 - chromosome 14 open reading frame a total - o 4 Homo sapiens f 14017947 BAB47494.1 Aposses 723 kinases protein Homo sapiens)

a 12002026 IAAG43156, AF06,3597 103 -(Homosapiens) 7 brin")''P'ei

HS. 179669 Gene: FLJ20637 Sequence count: 50 select GI Protein Acc. DNA Acc. Length Description fire ; : 7020871 BAA91303.1- : AK000644- 379- IrnamedHomosapiens) Protein Product* , - E. NP 060382.1 NM 017912 379 hypotheticalyPothetical proteinin FLJ20637 E. paso - - ww. Homosapiens

Hs.179946 Gene: KIAA1100 Sequence count: 174
Select GI Protein Acc. DNA Acc. Length Description
5689537 BAA83052.1 AB029023 432 KIAA1100 protein (Homo sapiens)
7662486 postonMolso 432 KIAA1100 protein (Homo sapiens)

Hs.179982 Gene: TP53BPL Sequence count: 134 Select GI Protein Acc. DNA Acc. Length Description

9664.----- 146 BAB03714.1 AB045732 1045 SapiensRING-finger protein (Homo i --

it. A 9664. 148 pabolis Aboss Toso IRING-finger protein (Homo 8%& Sapiens


- - - AF0983.00 l topoisomerase I E. ass AAD23379.1 Arosoloss binding RSprotein Homo sapiens ------AH 13655 Unknown (protein

or IMAGE:4152599) (Homo 15489083 AAH13655. I BCO 13655 30 Sapiens/ i tumor protein p53-binding 5032191 NP_005793. 1 (NM 005:802 815 proteintopoisomerase I binding | rotein (Homo sapiens/

265,123 AAC985301 U82939 815 Phinting Protein (Homo apiens)

HS. 180403 Gene: STRIN Sequence count: 143 select Gi Protein Acc. DNA Acc. Length - Description : assm AADiosa AF62680 245 Eco-Homosapiens HSD4 protein E. tion Nposts moon 245 sirn protein (Homo sapiens) : -

i.a 10435ssal BAB14614 || AK023579 15, 19"(Homosapiens) Prote:"'Pride hypothetical protein (Homo 6599.126 CAB63712. I AL1335.57 I83 sapiens)

HS. 80612 Gene: PXMP3 Sequence count: 198 Select GI Protein Acc. DNA Acc. Length Description . AF133826 135 kDa peroxisomal 97 19228 AAF97687.1 a- 305 membraneprotein Homo sapiens - AAHO0661 peroxisomal TT 12653751AAHO0661.1 BC000661 305-Zellweger ESPC" syndrome) (EP. Homo- - - sapiens m AAHO5375 peroxisomal ------13529227AAH053751- - BC005.375 305 ElbaneZellweger syndrome)Pin (P. (Homo sapiens - -- M-r-rm------190190 Aaaaiii. M85038 305 peroxisomal membrane protein 189849 AAC12785.1 M86852 305 peroxisome assembly factor-1


(Homosapiens peroxisomal membrane protein 4506343 pool M_000318305 3Peroxisomal membrane protein-3 (35kD) (Homo sapiens)

HS. 80686 Gene: UBE3A Sequence count: 522 Select G Protein Acc. DNA Acc. Length Description 14954.36 cassai 852 esar Homo sapiens

2361031 AAB69154.1 . ss E6-APHomosapiens) ubiquitin-protein ligase

E6-AP ubiquitin-protein ligase is 2853320 AAC39580.1 -- 39 Homosapiens ------E6-AP ubiquitin-protein ligase girls chaosi | - 48 Homosapiens) ::: aglio-- caust. - - 68 E6-APHomosapiens ubiquitin-protein ligase f A 3421 136 CAA04538.1 - 55 E6-AP ubiquitin-protein ligase iii. — -- (Homosapiens) alignalcaos39.1. - E6-AP(Homosapiens) ubiquitin-pro E6-AP ubiquitin-protein ligase 3421159 canoso - 39 Homosapiens) ubiquitin protein ligase E3A,isoform 1 human papilloma virus E6-associated protein 852 oncogenicprotein-associated protein E6-APCTCL tumor antigen se37-2 Homosapiens) . ubiquitin protein ligase E3A,isoform 3 human papilloma virus E6-associated protein 19718764 NP 570854.1 872 oncogenicprotein-associated protein. E6-APCTCL tumor antigen se37-2 Homosapiens - - ubiquitin protein ligase 19718766NP 000453.2 - 875 E3A,isoform 2 human papilloma : virus E6-associated protein US 2003/0194725 A1 Oct. 16, 2003 64

oncogenic protein-associated protein E6-AP; CTCL tumor antigen se37-2 (Homo sapiens)

a 1385,658- AAG34910.1 AF2730so 852 se37-2|HomoAF2700 CTCsapiens tumor antigen AAHO2582 ubiquitin protein ligase r E3A(human papilloma virus E6 12803511 AAHO2582.1 BC002582 852 associated protein, Angelman |syndrome) Homosapiens AAHO9271 Unknown (protein : 14424,503 AAHO9271.1 BC009271 585 forMGC:15720) Homo sapiens ingry 178745 35542.1 L07557. 874 oncogenic protein-associated ge protein - E6-associated proteinE6 6, 1872514 AAB49301.1 U84404 852 AP/ubiquitin-protein ligase (Homo sapiens 14954.32 cass. X9803 852 isoform II (Homo sapiens) 1495430 CAA66655.1 X98032 s isoform I (Homo sapiens) FE -1495.434 cases. xosos 852 room III Homo sapiens

HS 180933 Gene: CGBP Sequence count: 226 Select GT Protein Acc. DNA Acc. Length Description protein containing CXXC domain le 8100075 Baalso 1. abos 1069 loss 1Homo sapiens s issaaranoi AF49758 656 AFHomosapiens 149758-1 CpG - binding protein

------thrilatio 1 air rtf, a 12053229 CAB66796. AL136862 656 SapiensyPhil Pino" - - CpG binding protein DNA

bindingprotein with PHD finger 7656975 NP_055408.1 NM_014593 656 and CXXC domain (Homo sapiens

Hs.180941 Gene: VPS41 Sequence count: 159

select GI Protein Acc. DNA Acc. Length Description vacuolar protein sorting 4

7657677 s -0.55211. NM-- 04306, 854 (yeasthomolog),vacuolar assembly isoform protein 1 41 : - Homo sapiens

1842093 AAB1756. U87309 854 vpal Homo sapiens

gard posalagiaro- arise 779 sapiens)g135593 hrps-lp (Homo 887370 acro L40398 130 ornia. ------wawnmowwaauum -- -v.w-uumwaw-mow E. 1843,570 AAB47758. 1 U87281 149 hWps4 Ip (Homo sapiens)

HS.181077 Gene: DKFZp5861021 Sequence count: 253 select G Protein Acc. DNA Acc. Length - Description - - - hypothetical protein 18999397 AAID4267. m 594 DKFZp586021 Homo sapiens

frt3. possicaboss AL136921 594 sapiensre protein- Homo -- ypothetical protein X also splisti Morn 594 Resis. sapiens4

HS. 81 161 Gene: KIAA 1972 Sequence count: 114 select G Protein Acc. DNA Acc. Length Description it. isospassssss- - sts kaion protein Homo sapiens AAH13173. Unknown (protein is gi" issaloo sAH13173.1 scosiasts forMGC:17340) Homo sapiens HS. 18380 Gene: ANAPC11 Sequence count: 282 select G Protein Ace. NA Length Description 7106818 Aarons is Arisis. HSPC214 (Homo US 2003/O194725 A1 Oct. 16, 2003 66

HS.348263 Gene: - Sequence count: 0 lso - DNA weal-ester-wrawn-o-war Select G |Protein Acc. Acc Length Description ,

15042064 AAK81892.1 ad 236 AFHomosapiens 646821 IAP-like protein 2 AF420440 1 testi ific inhibitorof & 16902898AAL30369.1 - || 236 apoptosisA Homotests sapiens specinhibitor

&gs 3. baculoviralaClOW IAP repeat-containingining.8 E. 16974128 NP 203127.2 236 IAP-like protein 2 Homo sapiens issos A XP 084020.1 r 236 baculoviral IAP repeat-containing8 ge - - - Homo Sapiens

Hs.348716 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
6164616 AAF04467.1 434 AF129533_1 F-box protein Fbl3b (Homo sapiens)

HS.35032 Gene: RET Sequence count: 68 Select GI Protein Ace. DNAAcc. Length Description 340026 AAAssiss M16029 805 tyrosine kinase m AAHO4257 ret proto-oncogene (multipleendocrine neoplasia in 1327,904 IAAH04257. BC0042.57 1072 MEN2A, MEN2B and medullary E. thyroid carcinoma I, Hirschsprung disease) (Homo sapiens/ - AAHO3072 Similar to retproto Toncogene (multiple endocrine

neoplasia MEN2A, MEN2B AAHO3072. I BC003072 458 andmedullary thyroid curcinoma 1. Hirschsprung disease) (Homo sapiens) -o-r rm ------ret proto-Oncogene,n-mm--- WM isoform

aprecursor, hydroxyaryl-protein NP 000314. INM 000323; 11 14 kinase RET51 oncogene RET : RETransforming sequence US 2003/O194725 A1 Oct. 16, 2003 67

--- --r -r-, m------cadherin family member 12 if(Homo sapiens/ ret proto-oncogene, isoform aprecursor hydroxyaryl-protein kinase RET51 oncogene RET RETtransforming sequence 10862703 - 124. I - 75 1114 cadherin family member 12 (Homo sapiens) ret proto-oncogene, isoform cprecursor hydroxyaryl-protein kinase RET51 oncogene RET it O862701 NP 0.568. NM 020630. 1072 RETtransforming sequence cadherin family member 12 (Homo sapiens) ret proto-oncogene, isoform bprecursor hydroxyaryl-protein kinase RET51 oncogene RET a 10862699 NP 065680. INM_020629 I 106 RETtransforming sequence cadherin family member 12 - (Homo sapiens)

38275 casians, X12949 860 E. kinase (AA 1 - 860) IHomosapiens) i

-m-m-190700 AAA36524. 1 M31213 - 503 encodedproteinpapillary thyroid carcinoma

HS.350518 Gene: TRIM6 Sequence count: 60 Select GI Protein Acc. DNA Acc. Length Description rig: isotops.m Narsial as tripartite(Homosapiens) motif protein 6 AF220030 tripartite motif t 12407391 AAGs. AF22003O 488 proteinTRIM6 (Homo sapiens) Fr. assos passistakovices unnamed protein product

Hs.35384 Gene: RING1 Sequence count: 119
Select GI Protein Acc. DNA Acc. Length Description
3820982 calos 377 dJ1033B10.8.1 (Ring finger protein 1 (RNF1), isoform 1) (Homo sapiens)
12804137 AAH02922.1 BC002922 377 ring finger protein 1 (Homo sapiens)
4506535 NP_002922.1 NM_002931 377 ring finger protein 1 (Homo sapiens)
assos caatssor Z14000 377 RING (Homo sapiens)

Hs.355726 Gene: HT011 Sequence count: 13
Select GI Protein Acc. DNA Acc. Length Description
13650239 XP_017935.1 - 357 uncharacterized hypothalamus protein HT011 (Homo sapiens)
7689021 AAF67650.1 AF220185 360 AF220185_1 uncharacterized hypothalamus protein HT011 (Homo sapiens)
8923810 NP_060942.1 NM_018472 360 uncharacterized hypothalamus protein HT011 (Homo sapiens)

HS.355977

Gene: - Sequence count: 0 Select GI Protein Acc. RNA length description far ponso aacsis: as 684. WWP1 Homo sapiens)

15929915 15380.1 - 304 AAH15380containing protein Similar 1 to(Homo WWaomain sapiens

f? 1855 4931 XP 8757. 922 WW1Homo domain-containing sapiens protein E. -- - -

Hs.356868 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
200670266 AAM09503.1 - 229 AF489517 RNF35 (Homo sapiens)
poosa NP loss. - 229 ring finger RNF35 (Homo sapiens)

HS.35804

Gene: HERC3 Sequence count: 105 -- --- Select GI Protein Acc. DNA Acc. Length Description EA 11436418 XP 003490.1 - so hectHomosapiens domain and RLD 3 sians Baosi D25215 1050 kaans: (Homo sapiens)

7657152 NP_055421. NM_014606 1050 hect domain and RLD 3 (Homo sapiens)

HS.38.25 Gene: SP110 Sequence count: 155 Select G Protein Acc. DNA Acc. Length Description r asso B49515 371 B49515 phosphoprotein 75-human SP110 nuclear body protein,isoform

c interferon-induced protein 75, 52kD interferon-inducedprotein 41, - 713 30kD transcriptional coactivator Sp110 phosphoprotein41 phosphoprotein 75 Homo sapiens SP110 nuclear body protein,isoform a interferon-induced protein 75, 52kD interferon-inducedprotein 41, 17986256 NP_004500.2 - 30kD transcriptional coactivator Sp110 phosphoprotein41 phosphoprotein 75 Homo sapiens

as9964115 . . AAGo9826.1 AE28000s 680 AF280095coactivatorSp110 transcriptional Homo sapiens 423.0654 AAD13402.1 L22343 | 408 nuclear phosphoprotein Homo |-tecs | * * Sapiens - y ar F280094. 1 transcriptional i 98.00494 AAF9318 AF2009, 539 coactivator.Spl 10b (Homo sapiens/ 402205 IAAA18806.1 L22342 248 phosphoprotein US 2003/O194725 A1 Oct. 16, 2003 70

HS.431 Gene: BMI1 Sequence count: 193 Select G Protein Acc. DNA Acc. Length Description

s w AAH1 1652 Unknown (protein 15341688 AAH1652. BCO 1652 326 forMGC:12685) Homo sapiens 291.873 . AAA 19873.1 L13689 326 unive murine leukemia viral (bmi i)oncogene homolog Oncogene so NP 005171.1 NM_005180 326 BMI-1 Homo sapiens

HS.43149 Gene: KIAA 1214 Sequence count: 40 Select Gi Protein Acc. DNA Acc. Length Description asos pass AB030so 462 kaara protein Homo sapiens

HS.44685 Gene: ZFP26 Sequence count: 195 Select G Protein Acc. DNA Acc. Length Description - 4. a 6856967 IAAF30180.1 AF214680 230 is finger Protein E. - -- Homosapiens) W

a 7706777 NP 057506, r-NM 016422 is . 230 protein Homo"g sapiens

HS.46700 Gene: ING1 Sequence count: 112 Select GI Protein Acc. DNA Acc. Length -- Description E. 61884sixpos7109 - 422 inhibowmemberl Horno Sapiensany

? & inhibitor of growth family, memberl inhibitor of growth 1 1992.3771 NP_0055282 422 inhibitor of growth 1 family, r member 1 Homosapiens sps, BAA82886. AB024401 279 s Homo sapiens US 2003/0194725 A1 Oct. 16, 2003 71

r ------urn armwr------wa. ww.m-r. m. ------56892.59 |Baaass. Aboao 422 p47 Homo sapiens) - ris snia passassi Aboro 210 p Homo sapiens ris also aboo Aroos 294 ana Homo sapiens . E. assos AAC00501. Aroans 279 candidatetitle. tumor suppressorsapiens loosaaaara aross AF078835.1sapiens) p33ING1 (Homo AF149721 1 ING1 tumor is 7158365 AAF37421.1 AF149721 279 suppressor, variant A (Homo tale sapiens AF1497.22 ING1 tumor 7158367 IAAF37422.1 AF 149722 210 suppressor, variant B Homo sapiens t m AF149723 1 ING1 tumor 7158369 AAF37423.1 AF149723 235 suppressor, variant C Homo lead - Sapiens)

f t 4097m IAAF07920.-- Aris849 422 AF181849_1Sapiens) p47INGla (Homo

s: gaps . Aarone. arisso 279 sapiensEl p33ING1b Homor ES 13992539 CAC38067.1 AJ310392 279 p33ING 1b (Homo sapiens)

HS.4745 Gene: PSMC1 Sequence count: 530 Select G. Protein Acc. DNA Acc. Length Description ------J. - - isopagahasaoao A A L11 a one : BCOan 13908on 8 AAH13908forMGC:16498) Unknown (Homo (protein sapiens ------AA II005 12 Similar to rotease(prosome, macropain) 1265.348 AAHO05 12. I BC0005122 44440 26S subunit, ATPase 1 (Homo sapiens)

4506207 NP_002793.1 NM 00:02 440 proteasome (prosome, macropain) 26S subunit, ATPase, 1; Proteasome 26S subunit, ATPase, 1 (Homo sapiens)

s:x: loss |AAA35484w || Lo2426 440 Subunitision (S4) regulatory

Hs.48320 Gene: DORFIN Sequence count: 253
Select GI Protein Acc. DNA Acc. Length Description
1923422 Neososo. - 838 an (Homo sapiens)
13366024 BAB39353.1 AB029316 838 ring-IBR-ring domain containing protein Dorfin (Homo sapiens)

r E. roupoonants, aroo 397 (Homosapiens/it.otein product

ait. 10435397 BAB14581. JAKO23455 155 EP"Pi"P"(Homosapiens) 25,a 7023254 BA491900. 1 AK001774 457 "Pole"(Homosapiens) Prode 6102910 CAB39264. I upo 315 sapiensrol ical protein (Homo hypothetical protein (Homo , 58 17213 cabso autops 101 Supiens/

HS.49210 Gene: FBXO4 Sequence count: 65

Select G Protein Acc. DNA Acc. Length Description ---at 6164618 AAF04468. AF1295.34 387 (2-box Protein- F* -- - Homosapiens

age -6103645 AAFoos. AF1760,- os F-boxsapiens protein FBX4 (Homo - . mrm-w-3sissagigin ------o6308.1NM--- - on 2176,387------r - Fiboxw r .only protein- 4 F-boxproteina - i. 15834619 -- - Fbx4 Homo sapiens -- s F-box only protein 4, isoform 2F 15834621 NP 277019.1 NM 033484 307 box protein Fbx4 hypothetical

%. -- protein FLJ10141-- (Homo sapiens) E. associous asso a hypothetical protein (Homo - sapiens7 702.2012 | BAA 91463. 1 AK001003 | 161 unnamed protein product US 2003/0194725 A1 Oct. 16, 2003 73

E. --- (Homosapiens, ww-mwamwrew

Hs.49526 Gene: FBXL4 Sequence count: 66
Select GI Protein Acc. DNA Acc. Length Description
4468288 CAB37981. - 621 dJ273N12.1 (PUTATIVE protein based on EST matches) (Homo sapiens)

fii. 6164723 |Aaros 1. Asoo AF174590.1Homosapiens) F-box protein Fbl4 F-box protein FBL4 Homo s lsions, IAAF03699.1 AF176699 621 Sapiens

r;s 6456735 Aaroyal Arooss 62 AF199355.1Homosapiens F-box protein FBLS F-box and leucine-rich repeat protein 4 F-box protein a 16306588 Noon. NM 012160 621 FBL4 Homo sapiens

Hs.5094 Gene: RNF10 Sequence count: 616
Select GI Protein Acc. DNA Acc. Length Description

- a 5931614BAA84708.1 AB027.196 811 RIE2 sid2705 Homo sapiens -- - TRNF 10, ring finger 10 Fig 19367867|CAB97533.1 AL389976 | 729 RIAA6262RIEalternatively spliced product Homo sapiens

1665791 BAA13392.1 D87451 761 Contains C3HC4-type zinc finger signature (Homo sapiens)
7662653 NP_055683.1 NM_014868 761 ring finger protein 10; KIAA0262 gene product (Homo sapiens)

Hs.53940 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
15982946 AAL11501.1 a rVVs- 485 AF360739_1 SSA protein SS-56

- | | | | | Homosapiens) ------m-1------AF439153. 1 Ross Al related gags Aalaisal - 485 proteinFLJ10369 Homo sapiens 1751121 NP 00:43.4 - 485 Ro/SSA1is: 69(Homo related proteinsapiens - — : 1803787 |XP 0060634 - is Ro/SSA related protein w8. FLJ10369 Homo sapiens

HS.54089 Gene: BARD1 Sequence count: 79 Select GI Protein Acc. DNA Acc. Length Description - -- - - is2: Y 2828068 AAB99978.1 - proteinBRCA1 Homo associated sapiens RING domain BRCA1 associated RING domain lBRCA1-associated RING domain gig 4557349 NPoss. Moss 777 gene 1 BRCA1-associated RING if: domain lHomo sapiens - -assin?i 171015a-r- AAB383.16.1 U76638 777 proteinBRCAssociated Homo sapiens) RING domain

Hs.54580 Gene: RNF27 Sequence count: 248
Select GI Protein Acc. DNA Acc. Length Description
18314488 AAH21925.1 - 55 AAH21925 ring finger protein 27 (Homo sapiens)

12407399 AAG53488.1 AF220034 551 tripartite motif protein TRIM8 (Homo sapiens)
12382258 ason AF281046 551 AF281046_1 glioblastoma expressed RING finger protein GERP (Homo sapiens)
13569866 NP_112174.1 NM_030912 551 ring finger protein 27; tripartite motif protein TRIM8; glioblastoma expressed ring finger protein (Homo sapiens)

Hs.5548 Gene: FBXL5 Sequence count: 280
Select GI Protein Acc. DNA Acc. Length Description

7672734 AAF616. AF142481 691 AF142481_1 F-box protein FLR1 (Homo sapiens)
7688697 AAF67489.1 AF157323 674 AF157323_1 IPSSKP2-like protein (Homo sapiens)
6164725 AAF01s12. AF174591 535 AF174591_1 F-box protein Fbl5 (Homo sapiens)
6103639 AAF03700.1 AF176700 694 F-box protein FBL5 (Homo sapiens)
6456739 Aaroo AF199420 690 AF199420_1 F-box protein FBL4 (Homo sapiens)
7020055 BAA90978. AK000153 636 unnamed protein product (Homo sapiens)
fi. on spoon onio 691 F-box and leucine-rich repeat protein 5, isoform 1; F-box protein FBL5 (Homo sapiens)
3. . o 277077.1 sons 565 F-box and leucine-rich repeat protein 5, isoform 2; F-box protein FBL5 (Homo sapiens)

Hs.5912 Gene: FBXO7 Sequence count: 491
Select GI Protein Acc. DNA Acc. Length Description
saga AAF04471.1 AF2s, 482 AF129537_1 F-box protein Fbx7 (Homo sapiens)
islao Aarons AF233225 522 AF233225_1 F-box protein FBX (Homo sapiens)
assa cabassa AL050254 s: hypothetical protein (Homo sapiens)
14249ss AAH08361.1 BC008361 522 AAH08361 F-box only protein 7 (Homo sapiens)
15812193 NP_036311.2 NM_012179 522 F-box only protein 7; F-box protein 7; F-box protein FBX; F-box protein Fbx7; d.) 149A16.8 (Homo sapiens)

Hs.59545 Gene: RNF15 Sequence count: 142
Select GI Protein Acc. DNA Acc. Length Description

3.x: * poss: absos. 465 unknown Homo sapiens a 20070649 IAAH26930.1 - 465 Egg"P"Homosapiens --- ring finger protein 15 Ro/SSAribonucleoprotein so so M 0.06355 homolog Homo sapiens

2062696 Aansas. U90547

Hs.6092 Gene: FBXL2 Sequence count: 83
Select GI Protein Acc. DNA Acc. Length Description
also Aarosion AF174589 - AF174589_1 F-box protein Fbl2 (Homo sapiens)
also Aaroaasi AF176518 AF176518_1 F-box protein FBL2 (Homo sapiens)

86273 1 leucine-rich repeatscontaining F-box protein & o also AF186273 FBL3 Homo sapiens www. unnamed protein product Homosapiens g zoos |Baaoo-Akolas F-box and leucine-rich repeatprotein 2 F-box protein containing leucine-rich repeats

Homosapiens hypothetical protein (Homo Sapiens/

Hs.61515 Gene: RNF15 Sequence count: 65
Select GI Protein Acc. DNA Acc. Length Description
14043332 AAH07661.1 BC007661 488 AAH07661 Similar to ring finger protein 23 (Homo sapiens)

Hs.62264 Gene: KIAA0937 Sequence count: 142
Select GI Protein Acc. DNA Acc. Length Description


i s 89518 Baators. |IAB023154 653 KIAA0937 protein Homo- sapiens

Hs.62767 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
17450863 xpossm. - 717 KIAA1332 protein (Homo sapiens)

HS.64691 Gene: KIAA0483 Sequence count: 195 select G Protein Acc. DNA Acc. Length Description EE.E 7022998 Baaoosakoos 36s unnamedHomosapiens protein product F. iss 158 possi Moising 368 kaans protein Homo sapiens E. . --war w ------all------Buson BAA32327 I AB007952 299 KIAA0483 protein (Homo sapiens) -

Hs.64794 Gene: ZNF183 Sequence count: 100
Select GI Protein Acc. DNA Acc. Length Description
12341022 AaB670s. - 343 zinc-finger protein (Homo sapiens)
11422613 XP_010437.1 - 343 zinc finger protein 183 (RING finger, C3HC4 type) (Homo sapiens)
18089018 AAH20556.1 - 343 AAH20556 zinc finger protein 183 (RING finger, C3HC4 type) (Homo sapiens)
12654053 AAH00832.1 BC000832 343 AAH00832 zinc finger protein 183 (RING finger, C3HC4 type) (Homo sapiens)
5902158 NP_008909.1 NM_006978 343 zinc finger protein 183 (RING finger, C3HC4 type) (Homo sapiens)
2274982 CAA66907.1 X98253 343 ZNF183 (Homo sapiens)

Hs.65238 Gene: RNF40 Sequence count: 320
Select GI Protein Acc. DNA Acc. Length Description
14779695 XP_034375.1 - 1001 95 kDa retinoblastoma protein binding protein (Homo sapiens)
17391423 AAH18647.1 - 1001 AAH18647 95 kDa retinoblastoma protein binding protein (Homo sapiens)

i. 3327136-- BAA31636.1 AB014561 1001 Sapiensal protein Homo

14042062 BAB55092.1 AK027406 901 unnamed protein product (Homo sapiens)
r sloe BC004527 271 AAH04527 Similar to 95 kDa retinoblastoma protein binding protein; KIAA0661 gene product (Homo sapiens)
13543994 AAH06133.1 BC006133 1001 AAH06133 95 kDa retinoblastoma protein binding protein; KIAA0661 gene product (Homo sapiens)
15079968 AAH11769.1 BC011769 1001 AAH11769 Similar to 95 kDa retinoblastoma protein binding protein (Homo sapiens)
7662230 NP_055586.1 NM_014771 1001 95 kDa retinoblastoma protein binding protein (Homo sapiens)
cariri fa orops AAG13723.1 AF122819 838 AF122819_1 Rb-associated protein (Homo sapiens)

Gene: TRIM4 Sequence count: 98
Select GI Protein Acc. DNA Acc. Length Description
12407377 AAG53477.1 AF220023 500 AF220023_1 tripartite motif protein TRIM4 isoform alpha (Homo sapiens)
12407379 AAG53478.1 AF220024 474 tripartite motif protein TRIM4 isoform beta (Homo sapiens)
14670266 148977. or 500 tripartite motif protein TRIM4 isoform alpha; tripartite motif protein TRIM4; tripartite motif protein 4 (Homo sapiens)
15011941 NP 11oz. NM_033091 474 tripartite motif protein TRIM4 isoform beta; tripartite motif protein TRIM4; tripartite motif protein 4 (Homo sapiens)

Hs.66295 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
firgos stops XP_030360.2 - 728 multi-PDZ-domain-containing protein (Homo sapiens)

Hs.66394 Gene: RNF4 Sequence count: 327
Select GI Protein Acc. DNA Acc. Length Description
1843401 Basion AB000468 190 zinc finger protein (Homo sapiens)
4506561 NP_002929.1 NM_002938 190 ring finger protein 4 (Homo sapiens)
assissaacson U95140 190 RNF4 (Homo sapiens)

HS,6900 Gene: RNF13 Sequence count: 312 Select GI Protein Acc. DNA Acc. Length Description RING zinc finger protein rt spasiaacovo. Arono 381 (Homosapiens------

33.87925 AAC28641.1|AF070558 381 ENGHomosapiens zinc finger protein RAF

g . agosa AAHO97811 BCOO978 381 (Homosapiens)GE ring finger protein 13

14602579 AAH09803.1 BC009803 AAH09803 ring finger protein 13 (Homo sapiens)

: 6005864 NP 0092.13.1 NM- oons. is Zincfingerring finger protein protein 13Homo RING sapiens

HS-69554 Gene: FLJ20552 Sequence count: 232 Select G Protein Ace. DNA Acc. Length Description 297953 Aacolas 103 Rssssss Homo sapiens 1. hypothetical protein FL20552 s h92aso AAH25374.1 - 311 artia , - . . E:to 7020737 BAA91254.1AK.000559 311 Ea"P"P"Homosapiens issinational BCOO 1442 ' AAHO1442proteinFL20552 hypothetical (Homo sapiens) AAH13977 Similar to E 15559245 AAH13977.1 BC013977 | 311 hypotheticalprotein FLJ20552 is - Homo sapiens) is 8923522 NP 060346.1 moirs,6| 311 hypotheticalHomosapiens protein FLJ20552

Hs.7158 Gene: DKFZP566H073 Sequence count: 485 Select GI Protein Acc. DNA Acc. Length Description in3. 10437824 BAB15113. AK025329 350 Homosapiensfami Po Pot - - at 4884104 cassassi AL050.060 324 letical protein Homo . ------14603365 AAH101391 BC00139- 350 proteinAAH039DKFAP560H073 Homosapiens

is: 14149702NP -056343.1NM - 015528 350 PSFASH073(Homosapiens) protein hypothetical protein, similar to(AF037205) RING zinc finger His 5 102896 cent AL079.315 158 rotein (Mus musculus) (Homo - Sapiens/

Hs.7236 Gene: NOSIP Sequence count: 263

Select GI Protein Acc. DNA Acc. Length Description : -- - -- red 4680689 IAAD27734.1|AF132959 301 AF132959-1Homosapiens co-25 protein - - ino erall3. 44245.50AAHO9299.1 BC009299 301 E."protein Homo sapiens a lagoo Aamorpooloo, so AAHO977 eNos interacting - protein Homo sapiens) AAH11249 similar to eNos is0307|AAH1249. BCO11249 so intcractingprotein9 Similar Homo to e sapiens as 77.05716 NP 057057. Moss to eNOS25protein interacting Homo sapiens)protein CGI

Hs.7252 Gene: RAI17 Sequence count: 279
Select GI Protein Acc. DNA Acc. Length Description
asoa BAA86538.1 abososol as KIAA224 protein (Homo sapiens)

Hs.72964 Gene: MKRN3 Sequence count: 12
Select GI Protein Acc. DNA Acc. Length Description
loosaacross 507 ZNF127 (Homo sapiens)
5032243 NP_005655.1 NM_005664 507 zinc finger protein 127 (Homo sapiens)

Hs.7299 Gene: RAI17 Sequence count: 185
Select GI Protein Acc. DNA Acc. Length Description
loya cassassistians 253 hypothetical protein (Homo sapiens)

Hs.7314 Gene: KIAA0614 Sequence count: 312
Select GI Protein Acc. DNA Acc. Length Description
apos- BAA31589. Aboasis 1630 KIAA0614 protein (Homo sapiens)

io91 AAF36539.1 AF17498 1381 GRAF-1 specific protein phosphatase (Homo sapiens)
5911937 CAB55944. AL117469 116 hypothetical protein (Homo sapiens)
goalkaloio. BC0270. 116 AAH06270 Unknown (protein for MGC: 1291) (Homo sapiens)

Hs.7316 Gene: KIAA0804 Sequence count: 125
Select GI Protein Acc. DNA Acc. Length Description
388232923 bassai Boisri 1211 KIAA0804 protein (Homo sapiens)
10434628 BAB14322.1 AK022945 793 unnamed protein product (Homo sapiens)
126.5435 AAH01001.1 BC001001 92 AAH01001 Similar to hBKLF for basic kruppel like factor (Homo sapiens)

Hs.73958 Gene: RAG1 Sequence count: 13
Select GI Protein Acc. DNA Acc. Length Description
14763521 XP_030299.1 - 1043 recombination activating gene 1 (Homo sapiens)
190843 AAAspas. spot 1043 recombination activating protein
4557841 NP_000439. NM_00044s. 1043 recombination activating gene 1 (Homo sapiens)

Hs.74441 Gene: CHD4 Sequence count: 644
Select GI Protein Acc. DNA Acc. Length Description
4557453 NP_001264.1 NM_001273 1912 chromodomain helicase DNA binding protein 4; Mi-2b (Homo sapiens)
1107696 CAA60384.1 X8669 1912 Mi-2 protein (Homo sapiens)

Hs.75090 Gene: TRIM9 Sequence count: 121
Select GI Protein Acc. DNA Acc. Length Description
12407403 AAG53490.1 AF220036 664 AF220036_1 tripartite motif protein TRIM9 isoform alpha (Homo sapiens)
12407405 AAG53491.1 AF220037 710 AF220037_1 tripartite motif protein TRIM9 isoform beta (Homo sapiens)
12407407 AAG53492.1 AF220038 664 AF220038_1 tripartite motif protein TRIM9 isoform gamma (Homo sapiens)
15426583 AAH13414.1 BC013414 550 AAH13414 Unknown (protein for MGC:4626) (Homo sapiens)
16519557 NP_055978.2 NM_015163 710 homolog of rat RING finger Spring (Homo sapiens)
16519559 NP_443210.1 NM_052978 550 tripartite motif protein 9, isoform 2; homolog of rat RING finger Spring (Homo sapiens)
1665803 BAA13398.1 D87458 550 Similar to Human estrogen-responsive finger protein, efp (A49656) (Homo sapiens)

Hs.75275 Gene: UBE4A Sequence count: 257
Select GI Protein Acc. DNA Acc. Length Description
1469175 BAA09475.1 D50916 1073 The KIAA0126 gene is partially related to a yeast gene. (Homo sapiens)
4759288 NP_004779.1 NM_004788 1073 ubiquitination factor E4A (UFD2 homolog, yeast); homolog of yeast (S. cerevisiae) ufd2; ubiquitination factor E4A (homologous to yeast UFD2) (Homo sapiens)

Hs.7540 Gene: FBXL3A Sequence count: 165
Select GI Protein Acc. DNA Acc. Length Description
6164614 AAF04466.1 - 428 AF129532_1 F-box protein Fbl3a (Homo sapiens)
17475754 XP_041209.2 - 428 F-box and leucine-rich repeat protein 3A (Homo sapiens)
7158286 AAF37383.1 AF126028 419 AF126028_1 unknown (Homo sapiens)
16306584 roo . NM_012158 428 F-box and leucine-rich repeat protein 3A; F-box protein Fbl3a (Homo sapiens)

HS. 75450 Gene: DSIPI Sequence count: 545 Select GI Protein Acc. DNA Acc. Length Description - . - usisobabso Aposs 134 GILZ. Homo sapiens - AF153603 1 TSC-22 related soaria |AAD41085. Afisco as protein Homo sapiens ""------is 591916.1 AADsos, Arisan is AEllisco-like to 1 Tsroo-lik Polin-- E 10086253 AAG12456.1 AF228339 glucocorticoid-induced GILZ 38& - Homosapiens

-- -- rrx t 58 7106 CAB53669. ALl 10191 130 betical protein (Homo delta sleep inducing

4758198 prote, Moose 77 peptide,Sapiens/ immunoreactor (Homo -

leucine zipper protein 1834,507 |CAA9064.) Z5078. I 77 1Homosapiens,

Hs.75871 Gene: PRKCBP1 Sequence count: 337
Select GI Protein Acc. DNA Acc. Length Description
spotas BAA86439.1 AB032951 1205 KIAA1125 protein (Homo sapiens)
1037059 NP_036540.1 NM_012408 614 protein kinase C binding protein 1 (Homo sapiens)

------or -- a--- xxa 1265-4363 AAHOI 004. BCOO 1004 133 AAHO10043: Unknown (Homo (protein sapiens! - f usatsaggios aros 764 g2730452 CTCLy tumor antigen E. e 14-3/ Homo sapiens) o w RA CK-like protein PRKCBP1 rEišš.: 7960216 AAF71262. 1 AF233-153 614 ;:

s N w r s 2 fS2 Mr rotein kinase C-binding E. 3142288 AAC72244. I U48251 553 rotein RACK7 (Homo sapiens)

Hs.7627 Gene: HERC1 Sequence count: 133
Select GI Protein Acc. DNA Acc. Length Description
4557026 NP_003913.1 NM_003922 4861 factor p532 (Homo sapiens)
1477565 AAD12586.1 U50078 4861 p532 (Homo sapiens)

Hs.76272 Gene: RBBP2 Sequence count: 207
Select GI Protein Acc. DNA Acc. Length Description
4826968 NP_005047.1 NM_005056 1722 retinoblastoma binding protein 2 (Homo sapiens)
assrs AAB2854.1 s631 1722 retinoblastoma binding protein 2 (Homo sapiens)

Hs.76798 Gene: FBXL7 Sequence count: 87
Select GI Protein Acc. DNA Acc. Length Description
4240169 BAA74863.1 AB020647 483 KIAA0840 protein (Homo sapiens)
6164729 AAF04514.1 AF174593 483 EP'' (Homo sapiens)
ason. Aropas, AF199356 491 AF199356_1 F-box protein FBL6 (Homo sapiens)
6912466 NP_036436.1 NM_012304 491 F-box and leucine-rich repeat protein 7; F-box protein Fbl7; KIAA0840 protein (Homo sapiens)

Hs.76917 Gene: FBXO8 Sequence count: 170 Select G DNA Acc. Length Description Protein Acc. ------AF174596 F-box protein Fbx8 61.64735 IAAF04517.1 AF174596 a Homosapiens AF201932-1 DC10 (Homo 9295168 |AAF86868. AF201932 319 sapiens - 7677406 |AAF671541 AF233224 l 9 AF233224.1Homosapiens F-box protein FBs F-box only protein 8 F-boxprotein 15812208 possini no 2180 31 9 Fbx8 Homo sapiens

Hs.7759 Gene: XAP135 Sequence count: 387
Select GI Protein Acc. DNA Acc. Length Description
13487236 AAK27451.1 - 410 AF338735_1 hypothetical PHD zinc finger protein XAP135 (Homo sapiens)
18088065 apos - 408 AAH20954 hypothetical protein FLJ10975 (Homo sapiens)
19747276 NP_579866.1 - 408 PHD zinc finger protein XAP135, isoform b (Homo sapiens)
7023354 BAA994. Akos, 410 unnamed protein product (Homo sapiens)
8922800 NP_060758.1 NM_018288 410 PHD zinc finger protein XAP135, isoform a (Homo sapiens)

Hs.77617 Gene: SP100 Sequence count: 232
Select GI Protein Acc. DNA Acc. Length Description
approalasianaps 88s AF255565_1 nuclear body protein SP100C (Homo sapiens)
15079448 AAH11562.1 conso 480 AAH11562 Similar to nuclear antigen Sp100 (Homo sapiens)
73.656 AAC50743.1 U3650i 688 SP100-B
32.52011 AAC39790. Aros,www. 322 879 SP100-HMG nuclear autoantigen (Homo sapiens)
178689 4445 7.1 M60618 480 nuclear autoantigen

HS.77823 Gene: FLJ21343 Sequence count: 211 Select GI Protein Acc. DNA Acc. Length Description reg laskossos. 332 hypotheticalaris. protein sapiens rE. hostas BAB15050 AK024996 332 Homosapiensnamed Protein product r - aga- Poros.M - NM opts 332 2.hypothetical 1343 (Homo protein sapiens)

8: 1205314.5 CAB66751.1 Al13617 394 Sapiens/hypothetical Protein (Homo

HS.7838 Gene: MKRN1 Sequence count: 473 Select G Protein Acc. DNA Acc. Length - Description --- Fig 6601434 aafists - 482 non (Homo sapiens) x: - y guez-r-looski so ... is 1Homonakorn, sapiens) ring finger protein, 19684.160 AAH25955.1 482 makorin. ring finger protein, . 1Homo sapiens losopao Aafia Arts as AF17233 afxpprotein ------(Homosapiens) it: 6572964 aritas. AF1927.84 482 1. makorin1 Homo ------

-a-m-posalascabstas also- is hypotheticalsapiens) protein (Homo

at 7305273 positi NMonas as makorin,1 (Homo sapiens) ring finger protein,


HS. 7885 Gene: PICALM Sequence count:370 Select - G Protein Acc. DNA Acc. Length Description 313901 1AAC16702.1 AF060930 104 proteinSPY-FM's Homo sapiens

E. 3139013 AAC16703.1 AF060931 84 Homosapiens)EMP"

. postacion, |AFooto 58 sapiens/E. CALM protein (Homo

: 3139029AAC1671 11 AF060939 78 sapiens)EP 'Pel" . signacioto, |AF007 s. gA.f10 protein (Homo isolactato, aroops 5 I gprotein/Homo III AF10/CALM sapiens fusion Clathrin assembly lymphoid a 6005733NP 009097. INM 007166 652 myeloidleukemia gene (Homo . sapiens) ii. n als AAB07762. 1 U-15976 652 cal

Hs.78893 Gene: PHF3 Sequence count: 221
Select GI Protein Acc. DNA Acc. Length Description
6648928 AAF21292.1 AF091622 2039 AF091622_1 PHD finger protein 3 (Homo sapiens)
atopsa 13438.1 D87685 1723 similar to human transcription factor TFIIS (S34159). (Homo sapiens)
7662018 NP_055968.1 NMoss 2039 PHD finger protein 3 (Homo sapiens)

Hs.792 Gene: ARFD1 Sequence count: 90
Select GI Protein Acc. DNA Acc. Length Description

18490296 aims 10. - 574 ADP-ribosylation factor domain protein 1, 64kD (Homo sapiens)
12275883 acoral AF230397 574 tripartite motif protein TRIM23 alpha (Homo sapiens)
ins ason AF230398 569 AF230398_1 tripartite motif protein TRIM23 beta (Homo sapiens)
loss AAG50178.1 AF230399 546 AF230399_1 tripartite motif protein TRIM23 gamma (Homo sapiens)
292070 AAA35940.1 L04510 574 nucleotide binding protein
4502197 NP_001647.1 NM_001656 574 ADP-ribosylation factor domain protein 1 isoform alpha; ARF domain protein 1; GTP binding protein ARD-1; tripartite motif protein TRIM23 (Homo sapiens)
15208641 NP_150230.1 NM_033227 569 ADP-ribosylation factor domain protein 1 isoform beta; ARF domain protein 1; GTP binding protein ARD-1; tripartite motif protein TRIM23 (Homo sapiens)
15208643 NP_150231.1 NM_033228 546 ADP-ribosylation factor domain protein 1 isoform gamma; ARF domain protein 1; GTP binding protein ARD-1; tripartite motif protein TRIM23 (Homo sapiens)

Hs.79828 Gene: FLJ20333 Sequence count: 162
Select GI Protein Acc. DNA Acc. Length Description
nasoa BAA92571.1 AB037754 741 (Homo sapiens)
zoo 59 BAA91095.1 Akoso 706 unnamed protein product (Homo sapiens)
10434508 BAB14280.1 AK022867 290 unnamed protein product (Homo sapiens)
ansas Aaloon? BC000973 706 AAH00973 hypothetical protein FLJ20333 (Homo sapiens)
rais5233093. N -06029. NM_017760 706 hypothetical protein FLJ20333 (Homo sapiens)

HS.80358 Gene: SMCY Sequence count: 64 Select GI Protein Acc. DNA Ace. Length - Description are 11418363XP 010478.1 so Smcy homolog, Y chromosome it. - - (mouse)Homo sapiens i. similar to Human 1510145 BAA1 3241.1 D87072 1482 XE169protein(P4 1229) Homo sapiens Smcy homolog, Y chromosome (mouse)histocompatibility Y Y

475.950 NP--- 004644. NM-- 004653 1539 chromosomeSelectedantigen SMC (rouse) mousehomolog. Y cDNA on Y, human homolog of Homo sapiens) i solois AACsoso. U52191 1539 sucy

HS.80731 Gene: AMFR Sequence count: 298 Select GIT Protein Ace. DNA Acc. Length Description ; : 1992.3132 NP 001135.2 a 6 4. autocrine motility factor --- | 2-slice - receptor Homo sapicns

F5931955 IAAD56722.| AF124145 643 A.E.E.actorreceptor (Homo sapiens) 73 52 1221 AAA79362. I L35233 323 non motility factor receptor : ------...-a-a-now-wow ------f 338731 |A443 6671. 1 M631 75 323 autocrine motility factor receptor

Hs.81001 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
187019. BAB812. X 358 F-box domain Fbx25-containing protein (Homo sapiens)

Hs.8164 Gene: TRIM37 Sequence count: 115
Select GI Protein Acc. DNA Acc. Length Description
gaps BAA74921.1 AB02070s 970 KIAA0898 protein (Homo sapiens)
15147333 NP_056109.1 NM_015294 964 tripartite motif-containing 37; RING-B-box-coiled-coil protein; MUL protein; Mulibrey nanism (Homo sapiens)

Hs.82023 Gene: DKFZP434B205 Sequence count: 375
Select GI Protein Acc. DNA Acc. Length Description
ho438817 bansas AK026081 566 unnamed protein product (Homo sapiens)
issos, AAH14130.1 Bcolaiso 566 AAH14130 Unknown (protein for MGC:20962) (Homo sapiens)
14495644 AAH09429 BC009429 345 AAH09429 Unknown (protein for IMAGE:2988331) (Homo sapiens)
13528687 AAH04541.1 BC004541 433 Unknown (protein for IMAGE:3951723) (Homo sapiens)
1265,408 AAH00850.1 BC000850 159 AAH00850 Similar to f-box and WD-40 domain protein 5 (Homo sapiens)
AF217998 159 AF217998_1 unknown (Homo sapiens)
3. CAB70851.1 AL137631 hypothetical protein (Homo sapiens)

Hs.8220 Gene: ZNF220 Sequence count: 167
Select GI Protein Acc. DNA Acc. Length Description
5803098 NP_006757.1 NM_006766 2004 zinc finger protein 220; Monocytic leukemia zinc finger protein (Homo sapiens)

| E. 1517914 AACS0662.1 U47742-- 2004 Eylemiafingerprotein in

Fias L404.1987 BAB55062 AK02736, 205 (Homosapiens)EP" Prote” Prode

Hs.82292 Gene: KIAA0215 Sequence count: 73
Select GI Protein Acc. DNA Acc. Length Description
14758564 XP_010180.4 - s RE product
580.5248 AAD51905.1 AF27774 823 unknown (Homo sapiens)
BAA3205. D86969 823 similar to Human zinc finger protein, BR140 (PIR:JC2069) (Homo sapiens)
7662006 NP_055550.1 NM_014735 823 KIAA0215 gene product (Homo sapiens)

Hs.82380 Gene: MNAT1 Sequence count: 175
Select GI Protein Acc. DNA Acc. Length Description
12654033 AAH00820.1 BC000820 309 AAH00820 menage a trois 1 (CAK assembly factor) (Homo sapiens)
4505225 NP_002422.1 NM_002431 309 menage a trois 1 (CAK assembly factor); cyclin G1 interacting protein; cyclin H assembly factor (Homo sapiens)
470082, AAB05248. U18s - 267-1500GX1 gene product
CAA61 2. xers 309 cdk7/cyclin H assembly factor (Homo sapiens)
CAA63356.1 X92669 309 p35 (Homo sapiens)

Hs.82568 Gene: - Sequence count: 0
Select GI Protein Acc. DNA Acc. Length Description
so A40044 - 560 Adola PML-1 protein - human

HS.83293 Gene: DKFZP434A0225 Sequence count: 287 Select GI Protein Acc. DNA Acc. Length Description - 724353 BAA264 AB01707 1214 KIAA386 protein Homo sapiens s ls , , , , unnamed proteinproduct ig 7022270 BAA91537. I akoor so unnamed1.Homosapiens) protein produc at 6807862 canon, AL137349 54 I ponical protein (Homo sapiens/ s

Hs.8375 Gene: TRAF4 Sequence count: 220
Select GI Protein Acc. DNA Acc. Length Description
3435256 AAC32376.1 AF082185 198 tumor necrosis factor receptor associated factor 4A (Homo sapiens)

as. 12804687-- IAAHO1769.1 BC001769 470 EEC.associatedfactor 4 Homo sapiens

i. 4759252NP- 004286.1 NM004295- 470 4NE Homo sapiensPassociatefactor cystein rich domain associated toRING and TRAF protein Homo sin cus X80200 r sapiens

Hs.8383 Gene: BAZ2B Sequence count: 189
Select GI Protein Acc. DNA Acc. Length Description

ats: 668,3500BAA892.12.1- AB032255 1972 fingerdomain"" " 2B (Homo" sapiens)". —— 99.1 |BAA960. AB010509120 Ali poein Hono

3262644 cabissol AL080173 hypothetical protein (Homo sapiens)
7304923 NP_038478.1 NM_013450 1972 bromodomain adjacent to zinc finger domain, 2B (Homo sapiens)
Isaasaanasia, BC012576 643 AAH12576 Unknown (protein for MGC:13472) (Homo sapiens)

Hs.85273 Gene: RBBP6 Sequence count: 118
Select GI Protein Acc. DNA Acc. Length Description
sackpososos - 1560 retinoblastoma-binding protein 6 (Homo sapiens)
spot possia Moo 948 retinoblastoma-binding protein 6 (Homo sapiens)
issils casuasi X85133 948 RB protein binding protein (Homo sapiens)
10439936 BAB15600.1 AK026954 628 unnamed protein product (Homo sapiens)

Hs.85524 Gene: RNF29 Sequence count: 51
Select GI Protein Acc. DNA Acc. Length Description
gascacas. 436 ring finger protein 29 (Homo sapiens)
assascacao.9. apasas as: titin zinc-finger anchoring protein (Homo sapiens)
assas cacao AJ243489 s titin zinc-finger anchoring protein (Homo sapiens)
iaisoscacasao AJ291712 s ring finger protein 29 (Homo sapiens)
last Aaronsor acorsos for AAH07750 Unknown (protein for MGC:12836) (Homo sapiens)
49 as NP_149047.1 NM_033058 s muscle specific ring finger 2 (Homo sapiens)

Hs.85844 Gene: NTRK1 Sequence count: 1572
Select GI Protein Acc. DNA Acc. Length Description
339918 AAA36770.1 M23102 790 trk tyrosine-specific protein kinase