Method for a Genome Wide Identification of Expression
Total Page:16
File Type:pdf, Size:1020Kb
(19) TZZ ___T (11) EP 2 177 615 A1 (12) EUROPEAN PATENT APPLICATION (43) Date of publication: (51) Int Cl.: 21.04.2010 Bulletin 2010/16 C12N 15/11 (2006.01) (21) Application number: 08075816.2 (22) Date of filing: 10.10.2008 (84) Designated Contracting States: • Weltmeier, Fridtjof, Dr. AT BE BG CH CY CZ DE DK EE ES FI FR GB GR 30627 Hannover (DE) HR HU IE IS IT LI LT LU LV MC MT NL NO PL PT RO SE SI SK TR (74) Representative: Baumbach, Friedrich Designated Extension States: Patentanwalt AL BA MK RS Robert-Rössle-Strasse 10 13125 Berlin (DE) (71) Applicant: Fraunhofer-Gesellschaft zur Förderung der Remarks: angewandten Forschung e.V. Claims 16-27,29,31,33-44 and 46-47 are deemed to 80686 München (DE) be abandoned due to non-payment of the claims fees (Rule 45(3) EPC). (72) Inventors: • Borlak, Jürgen, Prof. Dr. 31275 Lehrte OT Immensen (DE) (54) Method for a genome wide identification of expression regulatory sequences and use of genes and molecules derived thereof for the diagnosis and therapy of metabolic and/or tumorous diseases (57) The invention is directed to the use of particular located within the chromosomal position specified by par- human genes, nucleic acids hybridizing to said genes, ticular start and end sites. and gene products encoded thereby in the context of the The invention further relates to a method for a diagnosis and/or therapy of metabolic and/or cancerous genomewide identification of functional binding sites at diseases, preferably of diabetes mellitus and/or colorec- specifically targeted DNA sequences with high resolu- tal cancer, wherein the gene is selected from the group tion, wherein the method comprises, or preferably con- of the human chromosomal genes having at least one sists of, the steps of: expression regulatory sequence according to matrix 1 a. chromatin immunoprecipitation and ("de novo" HNF4α matrix) in the range of 100000 nucle- b. DNA-DNA hybridisation for the otides upstream or downstream of their transcription start c. de novo identification of gene targets. site in the human genome, and wherein the at least one expression regulatory sequence according to matrix 1 is EP 2 177 615 A1 Printed by Jouve, 75001 PARIS (FR) EP 2 177 615 A1 Description [0001] The invention is directed to the use of particular human genes, nucleic acids hybridizing to said genes, and gene products encoded thereby in the context of the diagnosis and/or therapy of metabolic and/or cancerous diseases, 5 preferably of diabetes mellitus and/or colorectal cancer. The invention further relates to a method for a genomewide identification of functional binding sites at specifically targeted DNA sequences with high resolution. Areas of application are the life sciences: biology, biochemistry, biotechnology, medicine and medical technology. [0002] Hepatic nuclear factor (HNF)-4α is a member of the nuclear receptor superfamily and known to be expressed in the liver, intestine, and pancreas (for review see Sladek et al., 2001; Schrem, 2002). Many reports have highlighted 10 the importance of HNF4α in the regulation of developmental processes in determining the hepatic phenotype, as well as the regulation of diverse metabolic pathways (e.g., glucose, cholesterol, and fatty acid metabolism) (Sladek et al., 1990; Jiang et al., 1995; Yamagata et al., 1996; Hadzopoulou-Cladaras et al., 1997). Therefore, HNF4α is considered as a hepatic master regulatory protein. In contrast to other members of the nuclear receptor superfamily, HNF4α binds to its cognate DNA binding site as a homodimer (Jiang 1997; Sladek 1990). HNF4α is one of the best characterized 15 transcription factors, and in the past some dozent direct binding sites were reported. The employment of ChlP-chip technologies demonstrated however that these are only the smallest fraction of the actual HNF4α binding sites. Notably, Rada-lglesias et al. (2005) used custom made arrays with a low resolution encompasing the ENCODE regions, i. e. 1% of the genome, for ChlP-chip, and thus mapped 194 HNF4α binding sites in the human hepatoma cell line HepG2. In another study binding sites in hepatocytes and pancreatic islets were mapped, but the approach focused on promoter 20 regions only (Odom et al, 2004; Odom et al, 2006). Based on the findings published by Rada-lglesias et al. (2005) only a small fraction of HNF4α binding sites are located in proximal promoter regions. [0003] As of today, a genome-wide footprinting of binding sites targeted by HNF4α has not been reported. However, such a system would allow to identify the common rationale underlying the genome wide regulation processes induced by protein-DNA binding within the context of diseases caused by HNF4α upregulation, such as adenocarcinomas 25 of the colon may be, namely to identify specific regulatory sequences in the chromosomal genome which (a) bind to HNF4α and (b) bind to proteins other than HNF4α, wherein said proteins bind to HNF4α. [0004] The aim of the invention is thus to provide a method allowing a genome-wide high resolution map of binding sites relevant for the transcription regulation induced by HNF4α, and the identification of human chromosomal genes having at least one specific regulatory sequence in their natural environment, and the use of said genes or gene products 30 encoded thereby or of RNA hybridizing to said genes for the diagnosis and therapy of diseases, preferably of adeno- carcinomas of the colon, caused by an deregulation of HNF4α [0005] To this end, the implementation of the actions and embodiments as described in the claims provides appropriate means to fulfill these demands in a satisfying manner. [0006] Thus, the invention in its different aspects and embodiments is implemented according to the claims. 35 [0007] In the first aspect, the invention is directed to the use of a human gene, in particular the coding region thereof, or a gene product encoded thereby or of DNA or RNA sequences hybridizing to said gene and coding for a polypeptide having the function of said gene product, for the therapy and/or diagnosis of metabolic and/or cancerous diseases and/or to screen for and to identify drugs against metabolic and/or cancerous diseases, such as diabetes mellitus and/or colorectal cancer may be, wherein 40 the gene is selected from the group of the human chromosomal genes having at least one expression regulatory sequence according to matrix 1 ("de novo" HNF4α matrix) in the range of 100000 nucleotides upstream or downstream of their transcription start site in the human genome, and wherein the at least one expression regulatory sequence according to matrix 1 is located within the chromosomal position specified by the start and end sites according to Tables 9-32 (chromosomes 1-22, X-, Y-chromosome, wherein Table 9 refers to the human chromosome 1, Table 10 refers to the 45 human chromosome 2, etc., Table 30 refers to the human chromosome 22, Table 31 refers to the human X-chromosome, and Table 32 refers to the human Y-chromosome). [0008] The term "gene" according to the invention is directed to both the template strand, which refers to the sequence of the DNA that is copied during the synthesis of mRNA, and to the coding strand corresponding to the codons that are translated into a protein. The genes according to the invention and gene products encoded thereby can be easily derived 50 from the common databases, as such are known to the person skilled in the art, wherein the UCSC Genome Database is particularly preferred: Karolchik D, Kuhn, RM, Baertsch R, Barber GP, Clawson H, Diekhans M, Giardine B, Harte RA, Hinrichs AS, Hsu F, Miller W, Pedersen JS, Pohl A, Raney BJ, Rhead B, Rosenbloom KR, Smith KE, Stanke M, Thakkapallayil A, Trumbower H, Wang T, Zweig AS, Haussler D, Kent WJ. The UCSC Genome Browser Database: 2008 update. Nucleic Acids Res. 2008 Jan;36:D773, which is incorporated herein by reference. 55 [0009] The term "coding region" according to the invention is directed to the portion of DNA or RNA that is transcribed into the mRNA, which then is translated into a protein. This does not include gene regions such as a recognition site, initiator sequence, or termination sequence. [0010] The term "RNA sequences hybridizing to said gene" or "RNA sequences, which hybridize to said gene", re- 2 EP 2 177 615 A1 spectively, according to the invention relates to RNA molecules hybridizing with the template strand of said gene, in particular with the coding region. The term "DNA sequences hybridizing to said gene" or "DNA sequences, which hybridize to said gene", respectively, according to the invention preferably relates to DNA molecules hybridizing with the coding strand of said gene, in 5 particular with the coding region thereof. The term "hybridizing" as used herein refers to conventional hybridization conditions, preferably to hybridization conditions under which the Tm value is between 37˚C to 70˚C, preferably between 41˚C to 66˚C.[fl] An example for low stringency conditions is e.g. hybridisation under the conditions 42˚C, 2xSSC, 0.1 % SDS and an example for high stringency conditions is e.g. hybridization under the conditions: 65˚C, 2xSSC, 0.1 % SDS, wherein in the case that washing is 10 necessary for equilibrium, the hybridization solution is used for the washings. Most preferably, the term hybridization refers to stringent hybridization conditions. The term "function of said gene product" is directed to biological activities of the polypeptide encoded by the selected gene, wherein the biological activities may easily be derived from common gene or protein databases, such as the ncbi or the SWISS-Prot databases may be and the biological activities can be determined according to the literature provided 15 therein.