WO 2014/134728 A1, 12 September 2014 (12.09.2014)
(12) INTERNATIONAL APPLICATION PUBLISHED UNDER THE PATENT COOPERATION TREATY (PCT)
(19) World Intellectual Property Organization, International Bureau
(10) International Publication Number: WO 2014/134728 A1
(43) International Publication Date: 12 September 2014 (12.09.2014)
(51) International Patent Classification: C12Q 1/68 (2006.01); G06F 19/20 (2011.01); C40B 30/00 (2006.01)
(21) International Application Number: PCT/CA2014/050174
(22) International Filing Date: 6 March 2014 (06.03.2014)
(25) Filing Language: English
(26) Publication Language: English
(30) Priority Data: 61/774,271, 7 March 2013 (07.03.2013), US
(71) Applicants: UNIVERSITE DE MONTREAL [CA/CA]; 2900 Edouard-Montpetit, Montreal, Quebec H3T 1J4 (CA). THE WALTER AND ELIZA HALL INSTITUTE OF MEDICAL RESEARCH [AU/AU]; 1G Royal Parade, Parkville, Victoria 3052 (AU).
(72) Inventors: SAUVAGEAU, Guy; 7390, de Tilly, Montreal, Quebec H3R 3E3 (CA). MACRAE, Tara; 4584 Ave. Hingston, Montreal, Quebec H4A 2K1 (CA). SARGEANT, Tobias; 23 Olinda Crescent, Olinda, Melbourne, Victoria 3788 (AU).
(74) Agent: GOUDREAU GAGE DUBUC; 2000, McGill College, #2200, Montreal, Quebec H3A 3H3 (CA).
(81) Designated States (unless otherwise indicated, for every kind of national protection available): AE, AG, AL, AM, AO, AT, AU, AZ, BA, BB, BG, BH, BN, BR, BW, BY, BZ, CA, CH, CL, CN, CO, CR, CU, CZ, DE, DK, DM, DO, DZ, EC, EE, EG, ES, FI, GB, GD, GE, GH, GM, GT, HN, HR, HU, ID, IL, IN, IR, IS, JP, KE, KG, KN, KP, KR, KZ, LA, LC, LK, LR, LS, LT, LU, LY, MA, MD, ME, MG, MK, MN, MW, MX, MY, MZ, NA, NG, NI, NO, NZ, OM, PA, PE, PG, PH, PL, PT, QA, RO, RS, RU, RW, SA, SC, SD, SE, SG, SK, SL, SM, ST, SV, SY, TH, TJ, TM, TN, TR, TT, TZ, UA, UG, US, UZ, VC, VN, ZA, ZM, ZW.
(84) Designated States (unless otherwise indicated, for every kind of regional protection available): ARIPO (BW, GH, GM, KE, LR, LS, MW, MZ, NA, RW, SD, SL, SZ, TZ, UG, ZM, ZW), Eurasian (AM, AZ, BY, KG, KZ, RU, TJ, TM), European (AL, AT, BE, BG, CH, CY, CZ, DE, DK, EE, ES, FI, FR, GB, GR, HR, HU, IE, IS, IT, LT, LU, LV, MC, MK, MT, NL, NO, PL, PT, RO, RS, SE, SI, SK, SM, TR), OAPI (BF, BJ, CF, CG, CI, CM, GA, GN, GQ, GW, KM, ML, MR, NE, SN, TD, TG).
Published:
- with international search report (Art. 21(3))
- with sequence listing part of description (Rule 5.2(a))
(54) Title: METHODS AND GENES FOR NORMALIZATION OF GENE EXPRESSION
[FIG. 2]
(57) Abstract: Novel genes that exhibit minimal variation in expression level across different samples, and which may be used as housekeeping genes for normalization of gene expression in quantitative gene expression measurements, are disclosed. A novel method for the identification of housekeeping genes using whole Transcriptome Shotgun Sequencing (RNA-seq) is also disclosed.

METHODS AND GENES FOR NORMALIZATION OF GENE EXPRESSION

CROSS REFERENCE TO RELATED APPLICATIONS

The present application claims the benefit of U.S. Provisional Application Serial No. 61/774,271 filed on March 7, 2013, which is incorporated herein by reference in its entirety.

TECHNICAL FIELD

The present invention generally relates to the normalization of measured levels of a gene of interest in applications involving gene expression analysis.

BACKGROUND

Normalization of measured levels of a gene of interest against a stably expressed control gene is the most important action leading to accuracy in quantitative reverse-transcriptase PCR (qRT-PCR) experiments. However, while control gene levels can vary greatly depending on the samples used, control genes are usually selected based solely on convention [1-6]. The control genes most commonly used were originally selected due to their high expression levels in all tissues rather than their low variability among tissues. Numerous studies have shown that the expression of these genes can vary considerably [1-5], thus casting doubt on the accuracy of relative quantification values.
A couple of studies carried out with this shared goal relied on microarray data meta-analysis [7, 8]. However, microarray data are susceptible to errors resulting from hybridization artifacts and saturation of fluorescent signal, and require complicated normalization [10-12]. Leukemia and other cancer samples are prone to higher variability of gene expression compared to normal tissues due to clonal selection and genetic instability. Given the increased interest in expression profiling and identification of marker genes in cancer for personalized medicine, there is a clear need for optimal normalization of gene expression data by identifying control genes with the least possible variation.

There is thus a need for the identification of genes suitable for normalization of gene expression.

The present description refers to a number of documents, the contents of which are herein incorporated by reference in their entirety.

SUMMARY OF THE INVENTION

In a first aspect, the present invention provides a method for comparing expression levels of a test gene in a plurality of samples, comprising:
a) measuring the expression of one or more of the control genes depicted in Table 1 below in said plurality of samples;
Table 1
b) measuring the expression of the test gene in said plurality of samples;
c) normalizing the expression of the test gene in each sample by comparing expression of the one or more control genes across the samples, and applying normalization to the test gene to obtain normalized expression levels of the test gene; and
d) comparing the normalized expression levels of the test gene across said plurality of samples.
In another aspect, the present invention provides a method for normalizing the levels of a test gene present in a plurality of samples, comprising:
a) measuring the expression of one or more of the control genes depicted in Table 1 across said plurality of samples;
b) comparing the expression levels of the one or more control genes across said plurality of samples;
c) deriving a value for normalizing expression of the one or more control genes across said plurality of samples; and
d) normalizing the expression of the test gene in said plurality of samples based on the value obtained in step c).

In an embodiment, the one or more control genes encodes/encode a protein involved in RNA splicing/processing, and is/are KHDRBS1, RBM22, SNW1, CASC3, SF3A1, POLR2C, PAPOLA, HNRNPH3, HNRNPUL1, RBM8A, GTF2F1, USP39, U2AF1, XRN2 and/or ADAR (i.e. one or any combination of the just-noted RNA splicing/processing genes).

In another embodiment, the above-mentioned one or more control genes encodes/encode a protein involved in proteasome/ubiquitination and is/are USP4, UBE2I, PSMF1, PSMA1, VCP, PSMD6, PSMD7, KHDRBS1 and/or VPS4A, in a further embodiment UBE2I, PSMF1, PSMA1, PSMD6 and/or VPS4A (i.e. one or any combination of the just-noted proteasome/ubiquitination genes).

In another embodiment, the above-mentioned one or more control genes is/are HNRNPL, PCBP2, GNB1, SLC25A3, ZNF207, UBE2I, VPS4A, PSMF1, PSMA1, SRSF9 and/or PSMD6 (i.e. any combination thereof). In an embodiment, the control gene is HNRNPL. In an embodiment, the control gene is PCBP2. In an embodiment, the control gene is GNB1. In an embodiment, the control gene is SLC25A3. In an embodiment, the control gene is ZNF207. In an embodiment, the control gene is UBE2I. In an embodiment, the control gene is VPS4A. In an embodiment, the control gene is PSMF1. In an embodiment, the control gene is PSMA1. In an embodiment, the control gene is SRSF9. In an embodiment, the control gene is PSMD6.
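As an illustration only, steps a) to d) of the normalization method can be sketched in Python. The text above does not prescribe a particular formula for the normalizing value of step c); the sketch assumes a per-sample factor equal to the geometric mean of the control gene levels, scaled by its geometric mean across samples. The function names and all expression values are hypothetical; only the gene symbols HNRNPL and PCBP2 come from the description.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def normalization_factors(control_expr):
    """Derive one normalizing value per sample (step c).

    control_expr maps a control gene name to its measured expression
    in each sample, all lists being in the same sample order.
    ASSUMPTION: the factor is the geometric mean of the control genes
    in a sample, divided by the geometric mean of that quantity over
    all samples. The source does not specify this formula.
    """
    genes = list(control_expr)
    n_samples = len(control_expr[genes[0]])
    per_sample = [
        geometric_mean([control_expr[g][i] for g in genes])
        for i in range(n_samples)
    ]
    reference = geometric_mean(per_sample)
    return [level / reference for level in per_sample]


def normalize(test_expr, factors):
    """Apply the per-sample factors to the test gene (step d)."""
    return [level / factor for level, factor in zip(test_expr, factors)]


# Hypothetical measurements for three samples (arbitrary units).
controls = {
    "HNRNPL": [100.0, 200.0, 50.0],
    "PCBP2": [100.0, 200.0, 50.0],
}
test_gene = [10.0, 20.0, 5.0]

factors = normalization_factors(controls)
normalized = normalize(test_gene, factors)
print(factors)
print(normalized)
```

In this toy input the test gene tracks the control genes exactly, so the normalized levels come out equal across the three samples, up to floating-point rounding. A difference that remains after normalization would be attributed to the test gene rather than to the amount of input material.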
In an embodiment, the expression is measured at the mRNA level. In a further embodiment, the mRNA is reverse transcribed to cDNA prior to the measuring. In a further embodiment, the mRNA or cDNA is amplified prior to said measuring. In a further embodiment, the amplification is by PCR, more particularly real time PCR (RT-PCR) (e.g., quantitative RT-PCR, qRT-PCR).

In an embodiment, the above-mentioned plurality of samples comprises a normal cell sample. In another embodiment, the above-mentioned plurality of samples comprises a tumor cell sample. In another embodiment, the plurality of samples comprises both a normal cell sample and a tumor cell sample. In a further embodiment, the tumor cell sample is a leukemia cell sample, a breast cancer cell sample, a colon cancer cell sample, a kidney cancer cell sample and/or a lung cancer cell sample, more particularly a leukemia cell sample.

In another aspect, the present invention provides a method for identifying a gene useful for normalizing the expression of a test gene across a plurality of samples, comprising:
a) performing whole Transcriptome Shotgun Sequencing (RNA-seq) on said plurality of samples;
b) comparing the level of expression of the genes of the transcriptome across the plurality of samples; and
c) identifying the gene(s) exhibiting a coefficient of variation (CV) of about 25% or less and a maximum fold-change (MFC) of about 10 or less across the plurality of samples.

In an embodiment, the MFC is about 5 or less, more particularly about 2 or less. In an embodiment, the CV is about 20% or less, more particularly about 15% or less.

Other objects, advantages and features of the present invention will become more apparent upon reading of the following non-restrictive description of specific embodiments thereof, given by way of example only with reference to the accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

In the appended drawings:

FIG. 1 shows the distribution of the coefficient of variation of control genes in relation to all genes in combined TCGA RNA-seq data. Mean expression represents the average of all RPKM values for a given gene across the combined TCGA data set (1933 samples).
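The CV and MFC selection criteria stated in the summary can be illustrated with a short Python sketch operating on per-gene RPKM values. It rests on assumptions not stated in the text: the CV is computed as the sample standard deviation divided by the mean, the MFC as the highest value divided by the lowest, and genes with a zero or negative minimum are skipped because their fold-change is undefined. The function names, the gene labels GENE_A to GENE_C, and the RPKM values are hypothetical.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent.

    ASSUMPTION: sample standard deviation (n - 1 denominator) over the mean.
    """
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def max_fold_change(values):
    """ASSUMPTION: ratio of the highest to the lowest value across samples."""
    return max(values) / min(values)


def candidate_control_genes(rpkm, max_cv=25.0, max_mfc=10.0):
    """Return genes whose CV and MFC are both at or below the thresholds.

    rpkm maps a gene name to its RPKM value in each sample.
    """
    selected = []
    for gene, values in rpkm.items():
        if min(values) <= 0:
            # Fold-change is undefined here, so the gene is skipped.
            continue
        cv = coefficient_of_variation(values)
        mfc = max_fold_change(values)
        if cv <= max_cv and mfc <= max_mfc:
            selected.append(gene)
    return selected


# Hypothetical RPKM values for three genes across four samples.
rpkm = {
    "GENE_A": [10.0, 11.0, 9.0, 10.0],   # low CV, low MFC
    "GENE_B": [1.0, 50.0, 5.0, 20.0],    # MFC of 50 exceeds the threshold
    "GENE_C": [10.0, 14.0, 8.0, 12.0],   # CV between 15% and 25%
}

default_hits = candidate_control_genes(rpkm)
strict_hits = candidate_control_genes(rpkm, max_cv=15.0, max_mfc=2.0)
mean_rpkm = {gene: statistics.mean(values) for gene, values in rpkm.items()}
print(default_hits, strict_hits, mean_rpkm)
```

The default thresholds correspond to the broadest embodiment (CV of about 25% or less, MFC of about 10 or less); the second call applies the narrower values of 15% and 2. The `mean_rpkm` dictionary mirrors the mean expression described for FIG. 1, that is, the average RPKM of a gene across all samples.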