
US008795959B2

(12) United States Patent
     Ryan
(10) Patent No.: US 8,795,959 B2
(45) Date of Patent: *Aug. 5, 2014

(54) ISOLATED GLUCOKINASE GENOMIC POLYNUCLEOTIDE FRAGMENTS FROM CHROMOSOME 7

(71) Applicant: Ryogen LLC, Suffern, NY (US)

(72) Inventor: James Ryan, Augusta, GA (US)

(73) Assignee: Ryogen LLC, Suffern, NY (US)

(*) Notice: Subject to any disclaimer, the term of this patent is extended or adjusted under 35 U.S.C. 154(b) by 0 days. This patent is subject to a terminal disclaimer.

(21) Appl. No.: 13/680,178

(22) Filed: Nov. 19, 2012

(65) Prior Publication Data
     US 2013/0130250 A1    May 23, 2013

Related U.S. Application Data

(60) Division of application No. 12/533,105, filed on Jul. 31, 2009, now Pat. No. 8,313,899, which is a division of application No. 10/642,946, filed on Aug. 18, 2003, now Pat. No. 7,588,915, which is a continuation of application No. 09/957,956, filed on Sep. 21, 2001, now abandoned.

(60) Provisional application No. 60/234,422, filed on Sep. 21, 2000.

(51) Int. Cl.
     C12Q 1/68     (2006.01)
     [entry illegible in source]
     C07H 21/02    (2006.01)
     C07H 21/04    (2006.01)

(52) U.S. Cl.
     USPC ...... 435/6; 435/69.1; 435/91.1; 435/91.31; 435/455; 536/23.1; 536/23.2; 536/23.5; 536/24.31

(58) Field of Classification Search
     USPC ...... 435/6, 91.1, 91.31, 69.1, 455; 536/23.1, 536/24.3, 24.31, 23.2, 23.5
     See application file for complete search history.

(56) References Cited

U.S. PATENT DOCUMENTS

5,541,060 A       7/1996   Bell
5,624,803 A       4/1997   Noonberg
5,972,334 A      10/1999   Denney
6,783,961 B1      8/2004   Edwards
6,812,339 B1     11/2004   Venter
8,313,899 B2 *   11/2012   Ryan ...... 435/455
2002/0048763 A1   4/2002   Penn
2003/0077808 A1   4/2003   Rosen
2003/0204075 A9  10/2003   Wang
2007/0015162 A1   1/2007   Rosen
2007/0031842 A1   2/2007   Rosen

FOREIGN PATENT DOCUMENTS

WO 9520678    8/1995
WO 0058467   10/2000

OTHER PUBLICATIONS

Sulston et al., Genome Research, vol. 8, pp. 1097-1108 (1998).*
Stoffel et al., Proc. Natl. Acad. Sci., vol. 89, pp. 2698-2702 (1992).*
Tanizawa et al., Molecular Endocrinol., vol. 6, pp. 1070-1081 (1992).
Table 24 of U.S. Appl. No. 60/231,498.
[author illegible in source], Proc. Natl. Acad. Sci. [volume and first page illegible]-14800. 1999.
Altschul, Nucleic Acids Res. 25: 3389-3402. 1997.
Burge, J. Mol. Biol. 268: 78-94. 1997.
EMBL database Accession No. Q9UES0 May 1, 2000 SNARE protein Ykt6 (Fragment) Homo sapiens.
International Search Report for PCT/2001/29454, filed Mar. 27, 2003.
Layne, J. Biol. Chem. 273: 15654-15660. 1998.
Maestrini, Hum. Mol. Gen. 2: 761-766. 1993.
McNew, J. Biol. Chem. 272: 17776-17783. 1997.
Muise, Biochem J. 343: 341-345. 1999.
Nuttal, Bone 27: 177-184. 2000.
Ohno, Biochem Biophys Res Comm 228: 411-414. 1996.
Perez, "Characterization of the 5'-flanking region of the gene encoding the 50 kDa subunit of human DNA polymerase delta" Biochem Biophys Acta 1493: 231-236. 2000.
[entry illegible in source]
Tanizawa, Mol. Endocrinol 6: 1070-1081. 1992.
Waterston, R.H. GenBank Accession No. AC0006454.4 (gi:28261662) Submitted Jan. 28, 1999, bases 1-153203.
Waterston, R.H. GenBank Accession No. AC0006456.3 (gi:21322189) Submitted Jan. 28, 1999, bases 1-75609.
Waterston, R.H. GenBank Accession No. AC0006454.2 (gi:4337283) Submitted Jan. 28, 1999, bases 1-151965.
Zhang, Genomics 29: 179-186. 1995.
U.S. Appl. No. 13/680,203 Non-Final Office Action May 30, 2013.
U.S. Appl. No. 13/680,223 Non-Final Office Action May 30, 2013.
U.S. Appl. No. 09/957,956 Non-Final Office Action Dec. 4, 2002.
U.S. Appl. No. 09/957,956 Non-Final Office Action May 21, 2003.
U.S. Appl. No. 10/642,946 Non-Final Office Action Oct. 17, 2006.
U.S. Appl. No. 10/642,946 Non-Final Office Action Oct. 26, 2007.
U.S. Appl. No. 10/642,946 Non-Final Office Action Mar. 25, 2008.
U.S. Appl. No. 10/642,946 Non-Final Office Action Oct. 16, 2008.

(Continued)

Primary Examiner: Jane Zara

(74) Attorney, Agent, or Firm: Cheryl H. Agris; Agris & von Natzmer, LLP

(57) ABSTRACT

Provided are isolated genomic polynucleotide fragments that encode human SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein (AEBP1) and DNA directed 50 kD regulatory subunit (POLD2), vectors and hosts containing these fragments and fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments. The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a pathological disorder.

2 Claims, No Drawings

(56) References Cited

OTHER PUBLICATIONS

U.S. Appl. No. 10/642,946 Notice of Allowance Apr. 28, 2009.
U.S. Appl. No. 12/533,105 Non-Final Office Action May 21, 2010.
U.S. Appl. No. 12/533,105 Final Office Action Nov. 29, 2010.
U.S. Appl. No. 12/533,105 Non-Final Office Action Mar. 5, 2012.
U.S. Appl. No. 12/533,105 Notice of Allowance Jul. 16, 2012.
U.S. Appl. No. 12/533,130 Non-Final Office Action May 20, 2010.
U.S. Appl. No. 12/533,130 Final Office Action Nov. 29, 2010.
U.S. Appl. No. 12/533,130 Non-Final Office Action Mar. 29, 2012.
U.S. Appl. No. 12/533,130 Notice of Allowance Jul. 24, 2012.
U.S. Appl. No. 12/533,164 Non-Final Office Action Jun. 15, 2010.
U.S. Appl. No. 12/533,164 Final Office Action Dec. 29, 2010.
U.S. Appl. No. 12/533,164 Non-Final Office Action Mar. 2, 2012.
U.S. Appl. No. 12/533,164 Notice of Allowance Jul. 11, 2012.
U.S. Appl. No. 12/533,087 Non-Final Office Action Apr. 6, 2011.
U.S. Appl. No. 12/533,087 Final Office Action Oct. 4, 2011.
U.S. Appl. No. 12/533,087 Notice of Allowance Jan. 12, 2012.
Sequence Alignment U.S. Appl. No. 12/533,105 Non-final Office Action May 21, 2010 SEQ ID No. 6 NT 20,485-33,460 Result 3 (vs AC006454.4 PRI May 30, 2003.
Sequence Alignment U.S. Appl. No. 12/533,130 Non-final Office Action May 20, 2010 SEQ ID No. 5 Result 3 (vs AC006454.4 PRI May 30, 2003 GI:28261662) No SEQ ID.
Sequence Alignment U.S. Appl. No. 12/533,164 Non-final Office Action Jun. 15, 2010 No SEQ ID, Result 3 (vs AC006454.4 PRI May 20, 2003 GI:2921662) Result 7 (vs AF239710.1 PRI Sep. 15, 2000 GI:9963917.
U.S. Appl. No. 60/231,498, filed Sep. 8, 2000, Venter. (Priority for US Patent 6,812,339).
Table 1 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 2 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 3 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 4 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 5 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 6 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 7 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 8 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 9 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 10 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 11 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 12 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 13 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 14 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 15 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 16 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 17 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 18 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 19 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 20 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 21 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 22 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.
Table 23 of U.S. Appl. No. 60/231,498, Sep. 8, 2000.

* cited by examiner

ISOLATED GLUCOKINASE GENOMIC POLYNUCLEOTIDE FRAGMENTS FROM CHROMOSOME 7

PRIORITY CLAIM

This application is a divisional of application Ser. No. 12/533,105, filed Jul. 31, 2009, which is a divisional of Ser. No. 10/642,946, filed Aug. 18, 2003, now U.S. Pat. No. 7,588,915, issued Sep. 15, 2009, which is a continuation of application Ser. No. 09/957,956, filed Sep. 21, 2001, now abandoned, the contents of which all are incorporated herein by reference. Application Ser. No. 09/957,956 is a non-provisional application of and claims priority under 35 U.S.C. §119(e) to provisional application Ser. No. 60/234,422, filed Sep. 21, 2000, also incorporated herein by reference.

FIELD OF THE INVENTION

The invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 (AEBP1) and DNA directed 50 kD regulatory subunit (POLD2), vectors and hosts containing these fragments and fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments. The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a pathological disorder.

BACKGROUND OF THE INVENTION

Chromosome 7 contains genes encoding, for example, epidermal growth factor receptor, collagen-1-Alpha-1-chain, SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 and DNA polymerase delta small subunit (POLD2). SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 and DNA polymerase delta small subunit (POLD2) are discussed in further detail below.

SNARE YKT6

SNARE YKT6, a substrate for prenylation, is essential for vesicle-associated endoplasmic reticulum-Golgi transport (McNew, J. A. et al., J. Biol. Chem. 272, 17776-17783, 1997). It has been found that depletion of this function stops cell growth and manifests a transport block at the endoplasmic reticulum level.

Human Glucokinase

Human glucokinase (ATP:D-hexose 6-phosphotransferase) is thought to play a major role in glucose sensing in pancreatic islet beta cells (Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-1081) and in the liver. Glucokinase defects have been observed in patients with noninsulin-dependent diabetes mellitus (NIDDM). Mutations in the human glucokinase gene are thought to play a role in the early onset of NIDDM. The gene has been shown by Southern blotting to exist as a single copy on chromosome 7. It was further found to contain 10 exons including one exon expressed in islet beta cells and the other expressed in liver.

Human Adipocyte Enhancer Binding Protein 1

The adipocyte-enhancer binding protein 1 (AEBP1) is a transcriptional repressor having carboxypeptidase B-like activity which binds to a regulatory sequence (adipocyte enhancer 1, AE-1) located in the proximal promoter region of the adipose P2 (aP2) gene, which encodes the adipocyte fatty acid binding protein (Muise et al., 1999, Biochem. J. 343:341-345). B-like carboxypeptidases remove C-terminal arginine and lysine residues and participate in the release of active peptides, such as insulin, alter receptor specificity for polypeptides and terminate polypeptide activity (Skidgel, 1988, Trends Pharmacol. Sci. 9:299-304). For example, they are thought to be involved in the onset of obesity (Naggert et al., 1995, Nat. Genet. 10:1335-1342). It has been reported that obese and hyperglycemic mice homozygous for the fat mutation contain a mutation in the CP-E gene.

Full length cDNA clones encoding AEBP1 have been isolated from human osteoblast and adipose tissue (Ohno et al., 1996, Biochem. Biophys. Res. Commun. 228:411-414). Two forms have been found to exist due to alternative splicing. This gene appears to play a significant role in regulating adipogenesis. In addition to playing a role in obesity, adipogenesis may play a role in osteopenic disorders. It has been postulated that adipogenesis inhibitors may be used to treat osteopenic disorders (Nuttal et al., 2000, Bone 27:177-184).

DNA Polymerase Delta Small Subunit (POLD2)

DNA polymerase delta core is a heterodimeric enzyme with a catalytic subunit of 125 kD and a second subunit of 50 kD and is an essential enzyme for DNA replication and DNA repair (Zhang et al., 1995, Genomics 29:179-186). cDNAs encoding the small subunit have been cloned and sequenced. The gene for the small subunit has been localized to human chromosome 7 via PCR analysis of a panel of human-hamster hybrid cell lines. However, the genomic DNA has not been isolated and the exact location on chromosome 7 has not been determined.

OBJECTS OF THE INVENTION

Although cDNAs encoding the above-disclosed proteins have been isolated, their location on chromosome 7 has not been determined. Furthermore, genomic DNA encoding these polypeptides has not been isolated. Noncoding sequences can play a significant role in regulating the expression of polypeptides as well as the processing of RNA encoding these polypeptides.

There is clearly a need for obtaining genomic polynucleotide sequences encoding these polypeptides. Therefore, it is an object of the invention to isolate such genomic polynucleotide sequences.

SUMMARY OF THE INVENTION

The invention is directed to an isolated genomic polynucleotide, said polynucleotide obtainable from human chromosome 7 having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of

(a) a polynucleotide encoding a polypeptide selected from the group consisting of human SNARE YKT6 depicted in SEQ ID NO:1, human glucokinase depicted in SEQ ID NO:2, human adipocyte enhancer binding protein 1 (AEBP1) depicted in SEQ ID NO:3 and DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4;

(b) a polynucleotide selected from the group consisting of SEQ ID NO:5 which encodes human SNARE YKT6 depicted in SEQ ID NO:1, SEQ ID NO:6 which encodes human glucokinase depicted in SEQ ID NO:2, SEQ ID NO:8 which encodes human adipocyte enhancer binding protein 1 depicted in SEQ ID NO:3 and SEQ ID NO:7 which encodes DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4;

(c) a polynucleotide which is a variant of SEQ ID NOS:5, 6, 7, or 8;

(d) a polynucleotide which is an allelic variant of SEQ ID NOS:5, 6, 7, or 8;

(e) a polynucleotide which encodes a variant of SEQ ID NOS:1, 2, 3, or 4;

(f) a polynucleotide which hybridizes to any one of the polynucleotides specified in (a)-(e);

(g) a polynucleotide that is a reverse complement to the polynucleotides specified in (a)-(f) and

(h) a polynucleotide containing at least 10 transcription factor binding sites selected from the group consisting of AP1FJ-Q2, AP1-C, AP1-Q2, AP1-Q4, AP4-Q5, AP4-Q6, ARNT-01, CEBP-01, CETS1P54-01, CREL-01, DELTAEF1-01, FREAC7-01, GATA1-02, GATA1-03, GATA1-04, GATA1-06, GATA2-02, GATA3-02, GATA-C, GC-01, GFII-01, HFH2-01, HFH3-01, HFH8-01, IK2-01, LMO2COM-01, LMO2COM-02, LYF1-01, MAX-01, NKX25-01, NMYC-01, S8-01, SOX5-01, SP1-Q6, SAEBP1-01, SRV-02, STAT-01, TATA-01, TCF11-01, USF-01, USF-C and USF-Q6

as well as nucleic acid constructs, expression vectors and host cells containing these polynucleotide sequences.

The polynucleotides of the present invention may be used for the manufacture of a gene therapy for the prevention, treatment or amelioration of a medical condition by adding an amount of a composition comprising said polynucleotide effective to prevent, treat or ameliorate said medical condition.

The invention is further directed to obtaining these polypeptides by

(a) culturing host cells comprising these sequences under conditions that provide for the expression of said polypeptide and

(b) recovering said expressed polypeptide.

The polypeptides obtained may be used to produce antibodies by

(a) optionally conjugating said polypeptide to a carrier protein;

(b) immunizing a host animal with said polypeptide or peptide-carrier protein conjugate of step (a) with an adjuvant and

(c) obtaining antibody from said immunized host animal.

The invention is further directed to polynucleotides that hybridize to noncoding regions of said polynucleotide sequences as well as antisense oligonucleotides to these polynucleotides as well as antisense mimetics. The antisense oligonucleotides or mimetics may be used for the manufacture of a medicament for prevention, treatment or amelioration of a medical condition.

The invention is further directed to kits comprising these polynucleotides and kits comprising these antisense oligonucleotides or mimetics.

In a specific embodiment, the noncoding regions are transcription regulatory regions. The transcription regulatory regions may be used to produce a heterologous peptide by expressing in a host cell, said transcription regulatory region operably linked to a polynucleotide encoding the heterologous polypeptide and recovering the expressed heterologous polypeptide.

The polynucleotides of the present invention may be used to diagnose a pathological condition in a subject comprising

(a) determining the presence or absence of a mutation in the polynucleotides of the present invention and

(b) diagnosing a pathological condition or a susceptibility to a pathological condition based on the presence or absence of said mutation.

DETAILED DESCRIPTION OF THE INVENTION

The invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 and DNA directed 50 kD regulatory subunit (POLD2), which in a specific embodiment are the SNARE YKT6, human glucokinase, human adipocyte enhancer binding protein 1 and DNA directed 50 kD regulatory subunit (POLD2) genes, as well as vectors and hosts containing these fragments and polynucleotide fragments hybridizing to noncoding regions, as well as antisense oligonucleotides to these fragments.

As defined herein, a "gene" is the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region, as well as intervening sequences (introns) between individual coding segments (exons).

As defined herein "isolated" refers to material removed from its original environment and is thus altered "by the hand of man" from its natural state. An isolated polynucleotide can be part of a vector, a composition of matter or could be contained within a cell as long as the cell is not the original environment of the polynucleotide.

The polynucleotides of the present invention may be in the form of RNA or in the form of DNA, which DNA includes genomic DNA and synthetic DNA. The DNA may be double-stranded or single-stranded and if single-stranded may be the coding strand or non-coding strand.

The human SNARE YKT6 polypeptide has the amino acid sequence depicted in SEQ ID NO:1 and is encoded by the genomic DNA sequence shown in SEQ ID NO:5. The genomic DNA for SNARE YKT6 gene is 39,000 base pairs in length and contains seven exons (see Table 4 below for location of exons). As will be discussed in further detail below, the SNARE YKT6 gene is situated in genomic clone AC006454 at nucleotides 36,001-75,000.

The human glucokinase is depicted in SEQ ID NO:2 and is encoded by the genomic DNA sequence shown in SEQ ID NO:6. The human glucokinase genomic DNA is 46,000 base pairs in length and contains ten exons (see Table 3 below for location of exons).

The human adipocyte enhancer binding protein 1 has the amino acid sequence depicted in SEQ ID NO:3 and is encoded by the genomic DNA sequence shown in SEQ ID NO:8. The adipocyte enhancer binding protein 1 gene is 16,000 base pairs in length and contains 21 exons (see Table 2 below for location of exons). As will be discussed in further detail below, the human AEBP1 gene is situated in genomic clone AC006454 at nucleotides 137,041-end.

POLD2 has an amino acid sequence depicted in SEQ ID NO:4 and a genomic DNA sequence depicted in SEQ ID NO:7. The POLD2 gene is 19,000 base pairs in length and contains ten exons (see Table 1 below for location of exons). As will be discussed in further detail below, the POLD2 gene is situated in genomic clone AC006454 at nucleotides 119,001-138,000.

The polynucleotides of the invention have at least a 95% identity and may have a 96%, 97%, 98% or 99% identity to the polynucleotides depicted in SEQ ID NOS:5, 6, 7 or 8 as well as the polynucleotides in reverse sense orientation, or the polynucleotide sequences encoding the SNARE YKT6, human glucokinase, AEBP1, or POLD2 polypeptides depicted in SEQ ID NOS:1, 2, 3, or 4 respectively.

A polynucleotide having 95% "identity" to a reference nucleotide sequence of the present invention, is identical to the reference sequence except that the polynucleotide sequence may include on average up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence. The query sequence may be an entire sequence, the ORF (open reading frame), or any fragment specified as described herein.

As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence because of 5' or 3' deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for 5' and 3' truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5' or 3' ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5' and 3' of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention. Only bases outside the 5' and 3' bases of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence are calculated for the purposes of manually adjusting the percent identity score.

For example, a 95 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5' end of the subject sequence and therefore, the FASTDB alignment does not show a matched/alignment of the first 5 bases at the 5' end. The 5 unpaired bases represent 5% of the sequence (number of bases at the 5' and 3' ends not matched/total number of bases in the query sequence) so 5% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 95 bases were perfectly matched the final percent identity would be 95%. In another example, a 95 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5' or 3' of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only bases 5' and 3' of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are made for purposes of the present invention.

A polypeptide that has an amino acid sequence at least, for example, 95% "identical" to a query amino acid sequence is identical to the query sequence except that the subject polypeptide sequence may include on average, up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted (indels), or substituted with another amino acid. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the reference sequence or in one or more contiguous groups within the reference sequence.

A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment, the query and subject sequences are either both nucleotide sequences or both amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is shorter.

If the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence.

The invention also encompasses polynucleotides that hybridize to the polynucleotides depicted in SEQ ID NOS:5, 6, 7 or 8. A polynucleotide "hybridizes" to another polynucleotide, when a single-stranded form of the polynucleotide can anneal to the other polynucleotide under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the "stringency" of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a temperature of 42° C., can be used (e.g., 5×SSC, 0.1% SDS, 0.25% milk, and no formamide; or 40% formamide, 5×SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher temperature of 55° C., e.g., 40% formamide, with 5× or 6×SSC. High stringency hybridization conditions correspond to the highest temperature of 65° C., e.g., 50% formamide, 5× or 6×SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA.

Polynucleotide and Polypeptide Variants

The invention is directed to both polynucleotide and polypeptide variants. A "variant" refers to a polynucleotide or polypeptide differing from the polynucleotide or polypeptide of the present invention, but retaining essential properties thereof. Generally, variants are overall closely similar and in many regions, identical to the polynucleotide or polypeptide of the present invention.

The variants may contain alterations in the coding regions, non-coding regions, or both. Especially preferred are polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are preferred. Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are also preferred.

The invention also encompasses allelic variants of said polynucleotides. An allelic variant denotes any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in polymorphism within populations. Gene mutations can be silent (no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene.

The amino acid sequences of the variant polypeptides may differ from the amino acid sequences depicted in SEQ ID NOS:1, 2, 3 or 4 by an insertion or deletion of one or more amino acid residues and/or the substitution of one or more amino acid residues by different amino acid residues. Preferably, amino acid changes are of a minor nature, that is conservative amino acid substitutions that do not significantly affect the folding and/or activity of the protein; small deletions, typically of one to about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to about 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain.

Examples of conservative substitutions are within the group of basic amino acids (arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids (glycine, alanine, serine, threonine and methionine). Amino acid substitutions which do not generally alter the specific activity are known in the art and are described, for example, by H. Neurath and R. L. Hill, 1979, In, The Proteins, Academic Press, New York. The most commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, as well as these in reverse.

Noncoding Regions

The invention is further directed to polynucleotide fragments containing or hybridizing to noncoding regions of the SNARE YKT6, AEBP1, human glucokinase and POLD2 genes. These include but are not limited to an intron, a 5' non-coding region, a 3' non-coding region and splice junctions (see Tables 1-4), as well as transcription factor binding sites (see Table 5). The polynucleotide fragments may be a short polynucleotide fragment which is between about 8 nucleotides to about 40 nucleotides in length. Such shorter fragments may be useful for diagnostic purposes. Such short polynucleotide fragments are also preferred with respect to polynucleotides containing or hybridizing to polynucleotides containing splice junctions. Alternatively larger fragments, e.g., of about 50, 150, 500, 600 or about 2000 nucleotides in length may be used.

TABLE 1
Exon/Intron Regions of Polymerase, DNA directed, 50 kD regulatory subunit (POLD2) Genomic DNA

EXONS   LOCATION (nucleotide no.)   (Amino acid no.)
1.      11546 ... 11764             1-73
2.      15534 ... 15656             74-114
3.      15857 ... 15979             115-155
4.      16351 ... 16464             156-193
5.      16582 ... 16782             194-260
6.      17089 ... 17169             261-287
7.      17327 ... 17484             288-339
8.      17704 ... 17829             340-381
9.      18199 ... 18303             382-416
10.     18653 ... 18811             417-469

Stop codon tga at 18812-14
Poly A at 18885-90

TABLE 2
AEBP1 (adipocyte enhancer binding protein 1), vascular smooth muscle-type. Reverse strand coding.

EXONS   LOCATION (nucleotide no.)   (Amino acid no.)
21.     1301 ... 1966               1158-937
20.     2209 ... 2304               936-905
19.     2426 ... 2569               904-857
18.     2651 ... 3001               856-740
17.     3238 ... 3417               739-680
16.     3509 ... 3706               679-614
15.     3930 ... 4052               613-573
14.     4320 ... 4406               572-544
13.     4503 ... 4646               543-496
12.     4750 ... 4833               495-468
11.     5212 ... 5352               467-421
10.     5435 ... 5545               420-384
9.      6219 ... 6272               383-366
8.      6376 ... 6453               365-340
7.      6584 ... 6661               339-314
6.      7476 ... 7553               313-288
5.      7629 ... 7753               287-247
4.      7860 ... 7931               246-223
3.      8050 ... 8121               222-199
2.      8673 ... 9014               198-85
1.      10642 ... 10893             84-1

Stop codon 1298-1300
Poly A-site 1013-18

TABLE 3
Glucokinase

EXONS   LOCATION (nucleotide no.)   (Amino acid no.)
1.      20485 ... 20523             1-13
2.      25133 ... 25297             14-68
3.      26173 ... 26328             69-120
4.      27524 ... 27643             121-160
5.      28535 ... 28630             161-192
6.      28740 ... 28838             193-225
7.      30765 ... 30950             226-287
8.      31982 ... 32134             288-338
9.      32867 ... 33097             339-415
10.     33314 ... 33460             416-464

Stop codon 33461-3

TABLE 4
SNARE YKT6. Reverse strand coding.

EXONS   LOCATION (nucleotide no.)   (Amino acid no.)
7.      4320 ... 4352               198-188
6.      5475 ... 5576               187-154
5.      [illegible in source]       153-132
4.      9107 ... 9211               131-97
3.      10114 ... 10215             96-63
2.      11950 ... 12033             62-35
1.      15362 ... 15463             34-1

Stop codon at 4317-19
Poly A-site: 4245-4250

TABLE 5
TRANSCRIPTION FACTOR BINDING SITES

Column order in the source: SNARE YKT6, GLUCOKINASE, POLD2, AEBP1. Rows with fewer than four counts had empty cells whose positions are not recoverable from the source; their counts are listed in the order printed.

BINDING SITES    COUNTS
AP1FJ-Q2         11, 11
AP1-C            15, 15, 7, 6
AP1-Q2           9, 5
AP1-Q4           7, 4
AP4-Q5           36, 5, 43
AP4-Q6           17, 23
ARNT-01          7, 5
CEBP-01          7
CETS1P54-01      6
CREL-01          7
DELTAEF1-01      64, 12, 5, 50
FREAC7-01        4
GATA1-02         19
GATA1-03         12, 6
GATA1-04         25, 6
GATA1-06         8, 5
GATA2-02         10
GATA3-02         5
GATA-C           11, 6
GC-01            4
GFII-01          6
HFH2-01          5
HFH3-01          10
HFH8-01          4
IK2-01           49, 29
LMO2COM-01       41, 6, 27
LMO2COM-02       31, 5, 7
LYF1-01          10, 13, 6
MAX-01           4
MYOD-01          7
MYOD-Q6          32, 19, 7, 12
MZF1-01          99, 40, 15, 94
NF1-Q6           5, 7
NFAT-Q6          43, 8, 7, 8
NFKAPPAB50-01    4
NKX25-01         13, 14, 5
NMYC-01          12, 8
S8-01            30, 4
SOX5-01          21, 20, 4, 4
SP1-Q6           8
SAEBP1-01        4
SRV-02           5
STAT-01          6
TATA-01          8
TCF11-01         47, 28, 5, 19
USF-01           12, 8, 6, 8
USF-C            16, 12, 12, 8
USF-Q6           6

In a specific embodiment, such noncoding sequences are expression control sequences. These include but are not limited to DNA regulatory sequences, such as promoters, enhancers, repressors, terminators, and the like, that provide for the regulation of expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are also control sequences.

In a more specific embodiment of the invention, the expression control sequences may be operatively linked to a polynucleotide encoding a heterologous polypeptide. Such expression control sequences may be about 50-200 nucleotides in length and specifically about 50, 100, 200, 500, 600, 1000 or 2000 nucleotides in length. A transcriptional control sequence is "operatively linked" to a polynucleotide encoding a heterologous polypeptide sequence when the expression control sequence controls and regulates the transcription and translation of that polynucleotide sequence.

... or POLD2 gene may be accomplished in a number of ways. For example, if an amount of a portion of a SNARE YKT6 gene, the human glucokinase gene, the POLD2 gene or AEBP1 gene, or its specific RNA, or a fragment thereof, is available and can be purified and labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad. Sci. U.S.A. 72:3961). The present invention provides such nucleic acid probes, which can be conveniently prepared from the specific sequences disclosed herein, e.g., a hybridizable probe having a nucleotide sequence corresponding to at least a 10, and preferably a 15, nucleotide fragment of the sequences depicted in SEQ ID NOS:5, 6, 7 or 8. Preferably, a fragment is selected that is highly unique to the encoded polypeptides. Those DNA fragments with substantial homology to the
The term “opera probe will hybridize. As noted above, the greater the degree of tively linked' includes having an appropriate start signal homology, the more stringent hybridization conditions can be (e.g., ATG) in front of the polynucleotide sequence to be used. In one embodiment, low stringency hybridization con expressed and maintaining the correct reading frame to per ditions are used to identify a homologous SNAREYKT6, the mit expression of the DNA sequence under the control of the human glucokinase, the AEBP1, or POLD2 polynucleotide. expression control sequence and production of the desired However, in a preferred aspect, and as demonstrated experi product encoded by the polynucleotide sequence. If a gene mentally herein, a nucleic acid encoding a polypeptide of the that one desires to insert into a recombinant DNA molecule invention will hybridize to a nucleic acid derived from the does not contain an appropriate start signal. Such a start signal 25 polynucleotide sequence depicted in SEQID NOS:5, 6, 7 or can be inserted upstream (5') of and in reading frame with the 8 or a hybridizable fragment thereof, under moderately strin gene. gent conditions; more preferably, it will hybridize under high Expression of Polypeptides stringency conditions. Isolated Polynucleotide Sequences Alternatively, the presence of the gene may be detected by The human chromosome 7 genomic clone of accession 30 assays based on the physical, chemical, or immunological number AC006454 has been discovered to contain the properties of its expressed product. For example, cDNA SNARE YKT6 gene, the human glucokinase gene, the clones, or DNA clones which hybrid-select the proper AEBP1 gene, and the POLD2 gene by Genscan analysis mRNAS, can be selected which produce a protein that, e.g., (Burge et al., 1997, J. Mol. Biol. 268:78-94), BLAST2 and has similar or identical electrophoretic migration, isoelectric TBLASTN analysis (Altschulet al., 1997, Nucl. Acids Res. 
35 focusing behavior, proteolytic digestion maps, or antigenic 25:3389-3402), in which the sequence of AC006454 was properties as known for the SNARE YKT6, the human glu compared to the SNARE YKT6 cDNA sequence, accession cokinase, the AEBP1, or POLD2 polynucleotide. number NM 006555 (McNew et al., 1997, J. Biol. Chem. A gene encoding SNARE YKT6, the human glucokinase, 272:17776-177783), the human glucokinase cDNA sequence the AEBP1, or POLD2 polypeptide can also be identified by (Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-1081), 40 mRNA selection, i.e., by nucleic acid hybridization followed accession number NM 000162 (major form) and M69051 by in vitro translation. In this procedure, fragments are used to (minor form), AEBP1 c)NA sequence, accession number isolate complementary mRNAs by hybridization. Immuno NM 001 129 (accession number D86479 for the osteoblast precipitation analysis or functional assays of the in vitro type) (Layne et al., 1998, J. Biol. Chem. 273:15654-15660) translation products of the products of the isolated mRNAs and the POLD2 cl DNA sequence, accession number 45 identifies the mRNA and, therefore, the complementary DNA NM 006230 (Zhanget al., 1995, Genomics 29:179-186). fragments, that contain the desired sequences. The cloning of the nucleic acid sequences of the present Nucleic Acid Constructs invention from Such genomic DNA can be effected, e.g., by The present invention also relates to nucleic acid constructs using the well known polymerase chain reaction (PCR) or comprising a polynucleotide sequence containing the exon/ antibody screening of expression libraries to detect cloned 50 intron segments of the SNARE YKT6 gene (nucleotides DNA fragments with shared structural features. See, e.g., 4320-15463 of SEQ ID NO:5), human glucokinase gene Innis et al., 1990, PCR: A Guide to Methods and Application, (nucleotides 20485-33460 of SEQ ID NO:6), AEBP1 gene Academic Press, New York. 
Other nucleic acid amplification (nucleotides 1301-13893 of SEQID NO:8) or POLD2 gene procedures such as ligase chain reaction (LCR), ligated acti (nucleotides 11546-18811 of SEQID NO:7) operably linked vated transcription (LAT) and nucleic acid sequence-based 55 to one or more control sequences which direct the expression amplification (NASBA) or long chain PCR may be used. In a of the coding sequence in a suitable host cell under conditions specific embodiment, 5' or 3' non-coding portions of each compatible with the control sequences. Expression will be gene may be identified by methods including but are not understood to include any step involved in the production of limited to, filter probing, clone enrichment using specific the polypeptide including, but not limited to, transcription, probes and protocols similar or identical to 5' and 3’ “RACE 60 post-transcriptional modification, translation, post-transla protocols which are well known in the art. For instance, a tional modification, and secretion. method similar to 5' RACE is available for generating the The invention is further directed to a nucleic acid construct missing 5' end of a desired full-length transcript. (Fromont comprising expression control sequences derived from SEQ Racine et al., 1993, Nucl. Acids Res. 21:1683-1684). ID NOS: 5, 6, 7 or 8 and a heterologous polynucleotide Once the DNA fragments are generated, identification of 65 Sequence. the specific DNA fragment containing the desired SNARE "Nucleic acid construct” is defined herein as a nucleic acid YKT6 gene, the human glucokinase gene, the AEBP1 gene, molecule, either single- or double-stranded, which is isolated US 8,795,959 B2 13 14 from a naturally occurring gene or which has been modified to gene. Other useful promoters for yeast host cells are contain segments of nucleic acid which are combined and described by Romanos et al., 1992, Yeast 8: 423-488. 
juxtaposed in a manner which would not otherwise exist in Eukaryotic promoters may be obtained from the genomes nature. The term nucleic acid construct is synonymous with of viruses such as polyoma virus, fowlpox virus, adenovirus, the term expression cassette when the nucleic acid construct 5 bovine papilloma virus, avian sarcoma virus, cytomegalovi contains all the control sequences required for expression of rus, a retrovirus, hepatitis-B virus and SV40. Alternatively, a coding sequence of the present invention. The term "coding heterologous mammalian promoters, such as the actin pro sequence' is defined herein as a portion of a nucleic acid moter or immunoglobulin promoter may be used. sequence which directly specifies the amino acid sequence of The constructs of the invention may also include enhanc its protein product. The boundaries of the coding sequence 10 ers. Enhancers are cis-acting elements of DNA, usually from about 10 to about 300 bp that act on a promoter to increase its are generally determined by a ribosome binding site transcription. Enhancers from globin, elastase, albumin, (prokaryotes) or by the ATG start codon (eukaryotes) located alpha-fetoprotein, and insulin enhancers may be used. How just upstream of the open reading frame at the 5' end of the ever, an enhancer from a virus may be used; examples include mRNA and a transcription terminator sequence located just 15 SV40 on the late side of the replication origin, the cytomega downstream of the open reading frame at the 3' end of the lovirus early promoter enhancer, the polyoma enhancer on the mRNA. A coding sequence can include, but is not limited to, late side of the replication origin and adenovirus enhancers. DNA, cDNA, and recombinant nucleic acid sequences. 
The control sequence may also be a suitable transcription The isolated polynucleotide of the present invention may terminator sequence, a sequence recognized by a host cell to be manipulated in a variety of ways to provide for expression terminate transcription. The terminator sequence is operably of the polypeptide. Manipulation of the nucleic acid sequence linked to the 3' terminus of the nucleic acid sequence encod prior to its insertion into a vector may be desirable or neces ing the polypeptide. Any terminator which is functional in the sary depending on the expression vector. The techniques for host cell of choice may be used in the present invention. modifying nucleic acid sequences utilizing recombinant The control sequence may also be a suitable leader DNA methods are well known in the art. 25 sequence, a nontranslated region of an mRNA which is The control sequence may be an appropriate promoter important for translation by the host cell. The leader sequence sequence, a nucleic acid sequence which is recognized by a is operably linked to the 5' terminus of the nucleic acid host cell for expression of the nucleic acid sequence. The sequence encoding the polypeptide. Any leader sequence that promoter sequence contains transcriptional control is functional in the host cell of choice may be used in the sequences which regulate the expression of the polynucle 30 present invention. otide. The promoter may be any nucleic acid sequence which The control sequence may also be a polyadenylation shows transcriptional activity in the host cell of choice includ sequence, a sequence which is operably linked to the 3' ter ing mutant, truncated, and hybrid promoters, and may be minus of the nucleic acid sequence and which, when tran obtained from genes encoding extracellular or intracellular scribed, is recognized by the host cell as a signal to add polypeptides either homologous or heterologous to the host 35 polyadenosine residues to transcribed mRNA. Any polyade cell. 
nylation sequence which is functional in the host cell of Examples of suitable promoters for directing the transcrip choice may be used in the present invention. tion of the nucleic acid constructs of the present invention, The control sequence may also be a signal peptide coding especially in a bacterial host cell, are the promoters obtained region, which codes for an amino acid sequence linked to the from the E. colilac operon, the prokaryotic beta-lactamase 40 amino terminus of the polypeptide which can direct the gene (VIIIa-Komaroffet al., 1978, Proc. Natl. Acad. Sci. USA encoded polypeptide into the cells secretory pathway. The 5' 75: 3727-3731), as well as the tac promoter (DeBoer et al., end of the coding sequence of the nucleic acid sequence may 1983, Proc. Natl. Acad. of Sciences USA 80: 21-25). Further inherently contain a signal peptide coding region naturally promoters are described in “Useful proteins from recombi linked in translation reading frame with the segment of the nant bacteria' in Scientific American, 1980, 242: 74-94; and 45 coding region which encodes the secreted polypeptide. Alter in Sambrook et al., 1989, supra. natively, the 5' end of the coding sequence may contain a Examples of suitable promoters for directing the transcrip signal peptide coding region which is foreign to the coding tion of the nucleic acid constructs of the present invention in sequence. The foreign signal peptide coding region may be a filamentous fungal host cell are promoters obtained from required where the coding sequence does not normally con the genes encoding Aspergillus Oryzae TAKA amylase, Rhi 50 tain a signal peptide coding region. 
Alternatively, the foreign zomucor mieheiaspartic proteinase, Aspergillus niger neutral signal peptide coding region may simply replace the natural alpha-amylase, Aspergillus niger acid stable alpha-amylase, signal peptide coding region in order to obtain enhanced Aspergillus niger or Aspergillus awamori glucoamylase secretion of the polypeptide. However, any signal peptide (glaA), Rhizomucor miehei lipase, Aspergillus Oryzae alka coding region which directs the expressed polypeptide into line protease, Aspergillus Oryzae triose phosphate isomerase, 55 the secretory pathway of a host cell of choice may be used in Aspergillus nidulans acetamidase, Fusarium oxysporum the present invention. trypsin-like protease (WO 96/00787), NA2-tpi (a hybrid of The control sequence may also be a propeptide coding the promoters from the genes encoding Aspergillus niger region, which codes for an amino acid sequence positioned at neutral alpha-amylase and Aspergillus Oryzae triose phos the amino terminus of a polypeptide. The resultant polypep phate isomerase), and mutant, truncated, and hybrid promot 60 tide is known as a proenzyme or propolypeptide (or a ers thereof. Zymogen in Some cases). A propolypeptide is generally inac In a yeast host, useful promoters are obtained from the tive and can be converted to a mature active polypeptide by Saccharomyces cerevisiae enolase (ENO-1) gene, the Sac catalytic or autocatalytic cleavage of the propeptide from the charomyces cerevisiae galactokinase gene (GAL1), the Sac propolypeptide. 
The propeptide coding region may be charomyces cerevisiae alcohol dehydrogenase/glyceralde 65 obtained from the Bacillus subtilis alkaline protease gene hyde-3-phosphate dehydrogenase genes (ADH2/GAP), and (aprE), the Bacillus subtilis neutral protease gene (nprT), the the Saccharomyces cerevisiae 3-phosphoglycerate kinase Saccharomyces cerevisiae alpha-factor gene, the Rhizomucor US 8,795,959 B2 15 16 mieheiaspartic proteinase gene, or the Myceliophthora ther Examples of bacterial selectable markers are the dal genes mophila laccase gene (WO95/33836). from Bacillus subtilis or Bacillus licheniformis, or markers Where both signal peptide and propeptide regions are which confer antibiotic resistance Such as amplicillin, kana present at the amino terminus of a polypeptide, the propeptide mycin, chloramphenicol or tetracycline resistance. Suitable region is positioned next to the amino terminus of a polypep markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, tide and the signal peptide region is positioned next to the MET3, TRP1, and URA3. An example of suitable selectable amino terminus of the propeptide region. markers for mammalian cells are those that enable the iden It may also be desirable to add regulatory sequences which tification of cells competent to take of the nucleic acids of the allow the regulation of the expression of the polypeptide present invention, such as DHFR or thymidine kinase. An relative to the growth of the host cell. Examples of regulatory 10 systems are those which cause the expression of the gene to be appropriate host cell when wild-type DHFR is employed is turned on or offin response to a chemical or physical stimu the CHO cell line deficient in DHFR activity, prepared and lus, including the presence of a regulatory compound. Regu propagated as described by Urlaub et al., Proc. Natl. Acad. latory systems in prokaryotic systems would include the lac, Sci. USA, 77:4216 (1980). tac, and trp operator systems. 
In yeast, the ADH2 system or 15 The vectors of the present invention preferably contain an GAL1 system may be used. In filamentous fungi, the TAKA element(s) that permits stable integration of the vector into alpha-amylase promoter, Aspergillus niger glucoamylase the host cell genome or autonomous replication of the vector promoter, and the Aspergillus Oryzae glucoamylase promoter in the cell independent of the genome of the cell. may be used as regulatory sequences. Other examples of For integration into the host cell genome, the vector may regulatory sequences are those which allow for gene ampli rely on the polynucleotide sequence encoding the polypep fication. In eukaryotic systems, these include the dihydro tide or any other element of the vector for stable integration of folate reductase gene which is amplified in the presence of the vector into the genome by homologous or nonhomolo methotrexate, and the metallothionein genes which are ampli gous recombination. Alternatively, the vector may contain fied with heavy metals. In these cases, the nucleic acid additional nucleic acid sequences for directing integration by sequence encoding the polypeptide would be operably linked 25 homologous recombination into the genome of the host cell. with the regulatory sequence. The additional polynucleotide sequences enable the vector to Expression Vectors be integrated into the host cell genome at a precise location(s) The present invention also relates to recombinant expres in the chromosome(s). To increase the likelihood of integra sion vectors comprising a nucleic acid sequence of the present tion at a precise location, the integrational elements should invention, a promoter, and transcriptional and translational 30 preferably contain a sufficient number of nucleic acids, Such stop signals. 
The various nucleic acid and control sequences as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, described above may be joined together to produce a recom and most preferably 800 to 1,500 base pairs, which are highly binant expression vector which may include one or more homologous with the corresponding target sequence to convenient restriction sites to allow for insertion or substitu enhance the probability of homologous recombination. The tion of the nucleic acid sequence encoding the polypeptide at 35 integrational elements may be any sequence that is homolo such sites. Alternatively, the polynucleotide of the present gous with the target sequence in the genome of the host cell. invention may be expressed by inserting the nucleic acid Furthermore, the integrational elements may be non-encod sequence or a nucleic acid construct comprising the sequence ing or encoding nucleic acid sequences. On the other hand, into an appropriate vector for expression. In creating the the vector may be integrated into the genome of the host cell expression vector, the coding sequence is located in the vector 40 by non-homologous recombination. so that the coding sequence is operably linked with the appro For autonomous replication, the vector may further com priate control sequences for expression. prise an enabling the vector to replicate The recombinant expression vector may be any vector autonomously in the host cell in question. Examples of bac (e.g., a plasmidor virus) which can be conveniently subjected terial origins of replication are the origins of replication of to recombinant DNA procedures and can bring about the 45 plasmids pBR322, p OC19, p.ACYC177, and pACYC184 per expression of the nucleic acid sequence. The choice of the mitting replication in E. coli, and puB110, pE 194, pTA1060, vector will typically depend on the compatibility of the vector and pAMS.1 permitting replication in Bacillus. 
Examples of with the host cell into which the vector is to be introduced. origins of replication for use in a yeast host cell are the 2 The vectors may be linear or closed circular plasmids. micron origin of replication, ARS1, ARS4, the combination The vector may be an autonomously replicating vector, i.e., 50 of ARS1 and CEN3, and the combination of ARS4 and CEN6. a vector which exists as an extrachromosomal entity, the The origin of replication may be one having a mutation which replication of which is independent of chromosomal replica makes its functioning temperature-sensitive in the host cell tion, e.g., a plasmid, an extrachromosomal element, a min (see, e.g., Ehrlich, 1978, Proceedings of the National Acad ichromosome, or an artificial chromosome. The vector may emy of Sciences USA 75: 1433). contain any means for assuring self-replication. Alternatively, 55 More than one copy of a polynucleotide sequence of the the vector may be one which, when introduced into the host present invention may be inserted into the host cell to increase cell, is integrated into the genome and replicated together production of the gene product. An increase in the copy with the chromosome(s) into which it has been integrated. number of the polynucleotide sequence can be obtained by Furthermore, a single vector or plasmid or two or more vec integrating at least one additional copy of the sequence into tors or plasmids which together contain the total DNA to be 60 the host cell genome or by including an amplifiable selectable introduced into the genome of the host cell, or a transposon marker gene with the nucleic acid sequence where cells con may be used. taining amplified copies of the selectable marker gene, and The vectors of the present invention preferably contain one thereby additional copies of the nucleic acid sequence, can be or more selectable markers which permit easy selection of selected for by cultivating the cells in the presence of the transformed cells. 
A selectable marker is a gene the product of 65 appropriate selectable agent. which provides for biocide or viral resistance, resistance to The procedures used to ligate the elements described above heavy metals, prototrophy to auxotrophs, and the like. to construct the recombinant expression vectors of the present US 8,795,959 B2 17 18 invention are well known to one skilled in the art (see, e.g., plex polysaccharides. Vegetative growth is by hyphal elonga Sambrook et al., 1989, supra). tion and carbon catabolism is obligately aerobic. In contrast, Host Cells Vegetative growth by yeasts such as Saccharomyces cerevi The present invention also relates to recombinant host siae is by budding of a unicellular thallus and carbon catabo cells, comprising a nucleic acid sequence of the invention, lism may be fermentative. which are advantageously used in the recombinant produc Fungal cells may be transformed by a process involving tion of the polypeptides. A vector comprising a nucleic acid protoplast formation, transformation of the protoplasts, and sequence of the present invention is introduced into a host cell regeneration of the cell wall in a manner known perse. Suit so that the vector is maintained as a chromosomal integrant or able procedures for transformation of Aspergillus host cells as a self-replicating extra-chromosomal vector as described 10 are described in EP238 023 and Yelton et al., 1984, Proceed earlier. The term "host cell encompasses any progeny of a ings of the National Academy of Sciences USA 81: 1470 parent cell that is not identical to the parent cell due to muta 1474. Suitable methods for transforming Fusarium species tions that occur during replication. The choice of a host cell are described by Malardier et al., 1989, Gene 78: 147-156 and will to a large extent depend upon the gene encoding the WO 96/00787. Yeast may be transformed using the proce polypeptide and its source. 15 dures described by Becker and Guarente. In Abelson, J. N. 
The host cell may be a unicellular microorganism, e.g., a and Simon, M. I., editors, Guide to Yeast Genetics and prokaryote, or a non-unicellular microorganism, e.g., a , Methods in Enzymology, Volume 194, pp eukaryote. Useful unicellular cells are bacterial cells such as 182-187, Academic Press, Inc., New York; I to et al., 1983, gram positive bacteria including, but not limited to, a Bacillus Journal of Bacteriology 153: 163; and Hinnen et al., 1978, cell, or a Streptomyces cell, e.g., Streptomyces lividans or Proc. e Natl Acadf Scis USA 75: 1920. Streptomyces murinus, or gram negative bacteria Such as E. Methods of Production coli and Pseudomonas sp. The present invention also relates to methods for producing The introduction of a vector into a bacterial host cell may, a polypeptide of the present invention comprising (a) culti for instance, be effected by protoplast transformation (see, Vating a host cell under conditions conducive for production e.g., Chang and Cohen, 1979, Molecular General Genetics 25 of the polypeptide; and (b) recovering the polypeptide. 168: 111-115), using competent cells (see, e.g., Young and In the production methods of the present invention, the Spizizin, 1961, Journal of Bacteriology 81:823-829, or Dub cells are cultivated in a nutrient medium suitable for produc nau and Davidoff-Abelson, 1971, Journal of Molecular Biol tion of the polypeptide using methods known in the art. For ogy 56: 209-221), electroporation (see, e.g., Shigekawa and example, the cell may be cultivated by shake flask cultivation, Dower, 1988, Biotechniques 6: 742-751), or conjugation (see, 30 Small-scale or large-scale fermentation (including continu e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169: ous, batch, fed-batch, or solid state fermentations) in labora 5771-5278). tory or industrial fermentors performed in a suitable medium The host cell may be a eukaryote. 
such as a mammalian cell (e.g., human cell), an insect cell, a plant cell or a fungal cell. Mammalian host cells that could be used include but are not limited to human HeLa, embryonic kidney cells (293), lung cells, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese Hamster ovary (CHO) cells. These cells may be transfected with a vector containing a transcriptional regulatory sequence, a protein coding sequence and transcriptional termination sequences. Alternatively, the polypeptide can be expressed in stable cell lines containing the polynucleotide integrated into a chromosome. The co-transfection with a selectable marker such as dhfr, gpt, neomycin, hygromycin allows the identification and isolation of the transfected cells.

The host cell may be a fungal cell. "Fungi" as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra). The fungal host cell may also be a yeast cell. "Yeast" as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980). The fungal host cell may also be a filamentous fungal cell. "Filamentous fungi" include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other com-

and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from cell lysates.

The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. In a specific embodiment, an enzyme assay may be used to determine the activity of the polypeptide. For example, AEBP1 activity can be determined by measuring carboxypeptidase activity as described by Muise and Ro, 1999, Biochem. J. 343:341-345. Here, the conversion of hippuryl-L-arginine, hippuryl-L-lysine or hippuryl-L-phenylalanine to hippuric acid may be monitored spectrophotometrically. POLD2 activity may be detected by assaying for DNA polymerase activity (see, for example, Ng et al., 1991, J. Biol. Chem. 266:11699-11704).

The resulting polypeptide may be recovered by methods known in the art. For example, the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.

The polypeptides of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989).

Antibodies

According to the invention, the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptides produced according to the method of the present invention may be used as an immunogen to generate antibodies to any of these polypeptides. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library.

Various procedures known in the art may be used for the production of antibodies. For the production of antibody, various host animals can be immunized by injection with the polypeptide or a fragment thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the polypeptide or fragment thereof can optionally be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.

For preparation of monoclonal antibodies directed toward the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptide, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In an additional embodiment of the invention, monoclonal antibodies can be produced in germ-free animals utilizing recent technology (PCT/US90/02545). According to the invention, human antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV virus in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96). In fact, according to the invention, techniques developed for the production of "chimeric antibodies" (Morrison et al., 1984, J. Bacteriol. 159-870; Neuberger et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule specific for the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptide together with genes from a human antibody molecule of appropriate biological activity can be used; such antibodies are within the scope of this invention.

According to the invention, techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce polypeptide-specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for the SNARE YKT6, AEBP1, human glucokinase or POLD2 polypeptides.

Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragment which can be produced by pepsin digestion of the antibody molecule; the Fab' fragments which can be generated by reducing the disulfide bridges of the F(ab')2 fragment, and the Fab fragments which can be generated by treating the antibody molecule with papain and a reducing agent.

In the production of antibodies, screening for the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), "sandwich" immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of a particular polypeptide, one may assay generated hybridomas for a product which binds to a particular polypeptide fragment containing such epitope. For selection of an antibody specific to a particular polypeptide from a particular species of animal, one can select on the basis of positive binding with the polypeptide expressed by or isolated from cells of that species of animal.

Immortal, antibody-producing cell lines can also be created by techniques other than fusion, such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al., "Hybridoma Techniques" (1980); Hammerling et al., "Monoclonal Antibodies And T-cell Hybridomas" (1981); Kennett et al., "Monoclonal Antibodies" (1980); see also U.S. Pat. Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570; 4,466,917; 4,472,500; 4,491,632; 4,493,890.

Uses of Polynucleotides

Diagnostics

Polynucleotides containing noncoding regions of SEQ ID NOS:5, 6, 7 or 8 may be used as probes for detecting mutations from samples from a patient. Genomic DNA may be isolated from the patient. A mutation(s) may be detected by Southern blot analysis, specifically by hybridizing restriction digested genomic DNA to various probes and subjecting to agarose electrophoresis.

Polynucleotides containing noncoding regions may be used as PCR primers and may be used to amplify the genomic DNA isolated from the patients. Additionally, primers may be obtained by routine or long range PCR, that can yield products containing more than one exon and intervening intron. The sequence of the amplified genomic DNA from the patient may be determined using methods known in the art. Such probes may be between 10-100 nucleotides in length and may preferably be between 20-50 nucleotides in length.

The invention is thus directed to kits comprising these polynucleotide probes. In a specific embodiment, these probes are labeled with a detectable substance.

Antisense Oligonucleotides and Mimetics

The invention is further directed to antisense oligonucleotides and mimetics to these polynucleotide sequences. Antisense technology can be used to control gene expression through triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide to DNA or RNA. A DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription or RNA processing (triple helix; see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby preventing transcription and the production of said polypeptides.

The antisense oligonucleotides or mimetics of the present invention may be used to decrease levels of a polypeptide. For example, SNARE YKT6 has been found to be essential for vesicle-associated endoplasmic reticulum-Golgi transport and cell growth. Therefore, the SNARE YKT6 antisense oligonucleotides of the present invention could be used to inhibit cell growth and in particular, to treat or prevent tumor growth. POLD2 is necessary for DNA replication. POLD2 antisense sequences could also be used to inhibit cell growth. Glucokinase and AEBP1 antisense sequences may be used to treat hyperglycemia.

The antisense oligonucleotides of the present invention may be formulated into pharmaceutical compositions. These compositions may be administered in a number of ways depending upon whether local or systemic treatment is desired and upon the area to be treated. Administration may be topical (including ophthalmic and to mucous membranes including vaginal and rectal delivery), pulmonary (e.g., by inhalation or insufflation of powders or aerosols, including by nebulizer; intratracheal, intranasal, epidermal and transdermal), oral or parenteral. Parenteral administration includes intravenous, intraarterial, subcutaneous, intraperitoneal or intramuscular injection or infusion; or intracranial, e.g., intrathecal or intraventricular, administration.

Pharmaceutical compositions and formulations for topical administration may include transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids and powders. Conventional pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be necessary or desirable.

Compositions and formulations for oral administration include powders or granules, suspensions or solutions in water or non-aqueous media, capsules, sachets or tablets. Thickeners, flavoring agents, diluents, emulsifiers, dispersing aids or binders may be desirable.

Compositions and formulations for parenteral, intrathecal or intraventricular administration may include sterile aqueous solutions which may also contain buffers, diluents and other suitable additives such as, but not limited to, penetration enhancers, carrier compounds and other pharmaceutically acceptable carriers or excipients.

Pharmaceutical compositions of the present invention include, but are not limited to, solutions, emulsions, and liposome-containing formulations. These compositions may be generated from a variety of components that include, but are not limited to, preformed liquids, self-emulsifying solids and self-emulsifying semisolids.

The pharmaceutical formulations of the present invention, which may conveniently be presented in unit dosage form, may be prepared according to conventional techniques well known in the pharmaceutical industry. Such techniques include the step of bringing into association the active ingredients with the pharmaceutical carrier(s) or excipient(s). In general, the formulations are prepared by uniformly and intimately bringing into association the active ingredients with liquid carriers or finely divided solid carriers or both, and then, if necessary, shaping the product.

The compositions of the present invention may be formulated into any of many possible dosage forms such as, but not limited to, tablets, capsules, liquid syrups, softgels, suppositories, and enemas. The compositions of the present invention may also be formulated as suspensions in aqueous, non-aqueous or mixed media. Aqueous suspensions may further contain substances which increase the viscosity of the suspension including, for example, sodium carboxymethylcellulose, sorbitol and/or dextran. The suspension may also contain stabilizers.

In one embodiment of the present invention, the pharmaceutical compositions may be formulated and used as foams. Pharmaceutical foams include formulations such as, but not limited to, emulsions, microemulsions, creams, jellies and liposomes. While basically similar in nature these formulations vary in the components and the consistency of the final product. The preparation of such compositions and formulations is generally known to those skilled in the pharmaceutical and formulation arts and may be applied to the formulation of the compositions of the present invention.

The formulation of therapeutic compositions and their subsequent administration is believed to be within the skill of those in the art. Dosing is dependent on severity and responsiveness of the disease state to be treated, with the course of treatment lasting from several days to several months, or until a cure is effected or a diminution of the disease state is achieved. Optimal dosing schedules can be calculated from measurements of drug accumulation in the body of the patient. Persons of ordinary skill can easily determine optimum dosages, dosing methodologies and repetition rates. Optimum dosages may vary depending on the relative potency of individual oligonucleotides, and can generally be estimated based on EC50 as found to be effective in in vitro and in vivo animal models.

In general, dosage is from 0.01 µg to 10 g per kg of body weight, and may be given once or more daily, weekly, monthly or yearly, or even once every 2 to 20 years. Persons of ordinary skill in the art can easily estimate repetition rates for dosing based on measured residence times and concentrations of the drug in bodily fluids or tissues. Following successful treatment, it may be desirable to have the patient undergo maintenance therapy to prevent the recurrence of the disease state, wherein the oligonucleotide is administered in maintenance doses, ranging from 0.01 µg to 10 g per kg of body weight, once or more daily, to once every 20 years.

Gene Therapy

As noted above, SNARE YKT6 is necessary for cell growth, POLD2 is involved in DNA replication and repair, AEBP1 is involved in repressing adipogenesis and glucokinase is involved in glucose sensing in pancreatic islet beta cells and liver. Therefore, the SNARE YKT6 gene may be used to modulate or prevent cell apoptosis and treat such disorders as virus-induced lymphocyte depletion (AIDS); cell death in neurodegenerative disorders characterized by the gradual loss of specific sets of neurons (e.g., Alzheimer's Disease, Parkinson's disease, ALS, retinitis pigmentosa, spinal muscular atrophy and various forms of cerebellar degeneration), cell death in blood cell disorders resulting from deprivation of growth factors (anemia associated with chronic disease, aplastic anemia, chronic neutropenia and myelodysplastic syndromes) and disorders arising out of an acute loss of blood flow (e.g., myocardial infarctions and stroke). The glucokinase gene may be used to treat diabetes mellitus. The AEBP1 gene may be used to modulate or inhibit adipogenesis and treat obesity, diabetes mellitus and/or osteopenic disorders. POLD2 may be used to treat defects in DNA repair such as xeroderma pigmentosum, progeria and ataxia telangiectasia.

As described herein, the polynucleotide of the present invention may be introduced into a patient's cells for therapeutic uses. As will be discussed in further detail below, cells can be transfected using any appropriate means, including viral vectors, as shown by the example, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA. See, for example, Wolff, Jon A, et al., "Direct gene transfer into mouse muscle in vivo," Science, 247, 1465-1468, 1990; and Wolff, Jon A, "Human dystrophin expression in mdx mice after intramuscular injection of DNA constructs," Nature, 352, 815-818, 1991. As used herein, vectors are agents that transport the gene into the cell without degradation and include a promoter yielding expression of the gene in the cells into which it is delivered. As will be discussed in further detail below, promoters can be general promoters, yielding expression in a variety of mammalian cells, or cell specific, or even nuclear versus cytoplasmic specific. These are known to those skilled in the art and can be constructed using standard molecular biology protocols. Vectors have been divided into two classes:

a) Biological agents derived from viral, bacterial or other sources.

b) Chemical physical methods that increase the potential for gene uptake, directly introduce the gene into the nucleus or target the gene to a cell receptor.

Biological Vectors

Viral vectors have higher transaction (ability to introduce genes) abilities than do most chemical or physical methods to introduce genes into cells. Vectors that may be used in the present invention include viruses, such as adenoviruses, adeno associated virus (AAV), vaccinia, herpesviruses, baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression. Polynucleotides are inserted into vector genomes using methods well known in the art.

Retroviral vectors are the vectors most commonly used in clinical trials, since they carry a larger genetic payload than other viral vectors. However, they are not useful in non-proliferating cells. Adenovirus vectors are relatively stable and easy to work with, have high titers, and can be delivered in aerosol formulation. Pox viral vectors are large and have several sites for inserting genes; they are thermostable and can be stored at room temperature.

Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, phosphoglycerate kinase (PGK) promoter, and the like. Alternatively, the promoter may be an endogenous adenovirus promoter, for example the E1a promoter or the Ad2 major late promoter (MLP). Similarly, those of ordinary skill in the art can construct adenoviral vectors utilizing endogenous or heterologous poly A addition signals.

Plasmids are not integrated into the genome and the vast majority of them are present only from a few weeks to several months, so they are typically very safe. However, they have lower expression levels than retroviruses and since cells have the ability to identify and eventually shut down foreign gene expression, the continuous release of DNA from the polymer to the target cells substantially increases the duration of functional expression while maintaining the benefit of the safety associated with non-viral transfections.

Chemical/Physical Vectors

Other methods to directly introduce genes into cells or exploit receptors on the surface of cells include the use of liposomes and lipids, ligands for specific cell surface receptors, cell receptors, and calcium phosphate and other chemical mediators, microinjections directly to single cells, electroporation and homologous recombination. Liposomes are commercially available from Gibco BRL, for example, as LIPOFECTIN and LIPOFECTACE, which are formed of cationic lipids such as N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl dioctadecylammonium bromide (DDAB). Numerous methods are also published for making liposomes, known to those skilled in the art.

For example, nucleic acid-lipid complexes: lipid carriers can be associated with naked nucleic acids (e.g., plasmid DNA) to facilitate passage through cellular membranes. Cationic, anionic, or neutral lipids can be used for this purpose. However, cationic lipids are preferred because they have been shown to associate better with DNA which, generally, has a negative charge. Cationic lipids have also been shown to mediate intracellular delivery of plasmid DNA (Felgner and Ringold, Nature 337:387 (1989)). Intravenous injection of cationic lipid-plasmid complexes into mice has been shown to result in expression of the DNA in lung (Brigham et al., Am. J. Med. Sci. 298:278 (1989)). See also, Osaka et al., J. Pharm. Sci. 85(6):612-618 (1996); San et al., Human Gene Therapy 4:781-788 (1993); Senior et al., Biochimica et Biophysica Acta 1070:173-179 (1991); Kabanov and Kabanov, Bioconjugate Chem. 6:7-20 (1995); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Behr, J.-P., Bioconjugate Chem. 5:382-389 (1994); Behr et al., Proc. Natl. Acad. Sci., USA 86:6982-6986 (1989); and Wyman et al., Biochem. 36:3008-3017 (1997).

Cationic lipids are known to those of ordinary skill in the art. Representative cationic lipids include those disclosed, for example, in U.S. Pat. No. 5,283,185; and e.g., U.S. Pat. No. 5,767,099. In a preferred embodiment, the cationic lipid is N4-spermine cholesteryl carbamate (GL-67) disclosed in U.S. Pat. No. 5,767,099. Additional preferred lipids include N4-spermidine cholesteryl carbamate (GL-53) and 1-(N4-spermine)-2,3-dilaurylglycerol carbamate (GL-89).

The vectors of the invention may be targeted to specific cells by linking a targeting molecule to the vector. A targeting molecule is any agent that is specific for a cell or tissue type of interest, including for example, a ligand, antibody, sugar, receptor, or other binding molecule.

Invention vectors may be delivered to the target cells in a suitable composition, either alone, or complexed, as provided above, comprising the vector and a suitably acceptable carrier. The vector may be delivered to target cells by methods known in the art, for example, intravenous, intramuscular, intranasal, subcutaneous, intubation, lavage, and the like. The vectors may be delivered via in vivo or ex vivo applications. In vivo applications involve the direct administration of an adenoviral vector of the invention formulated into a composition to the cells of an individual. Ex vivo applications involve the transfer of the adenoviral vector directly to harvested autologous cells which are maintained in vitro, followed by readministration of the transduced cells to a recipient.

In a specific embodiment, the vector is transfected into antigen-presenting cells. Suitable sources of antigen-presenting cells (APCs) include, but are not limited to, whole cells such as dendritic cells or macrophages; purified MHC class I molecule complexed to β2-microglobulin and foster antigen presenting cells. In a specific embodiment, the vectors of the present invention may be introduced into T cells or B cells using methods known in the art (see, for example, Tsokos and Nepom, 2000, J. Clin. Invest. 106:181-183).

The invention described and claimed herein is not to be limited in scope by the specific embodiments herein disclosed, since these embodiments are intended as illustrations of several aspects of the invention. Any equivalent embodiments are intended to be within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims.

Various references are cited herein, the disclosure of which are incorporated by reference in their entireties.
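As a rough illustration of the antisense design principle stated in the description, in which a DNA oligonucleotide is made complementary to a region of the gene, the following Python sketch computes the reverse complement of a target stretch and checks a candidate against the 10-100 and 20-50 nucleotide ranges given for probes. The function names and the example sequence are hypothetical and are not taken from SEQ ID NOS:5-8; applying the probe length ranges to an antisense oligonucleotide is an assumption made only for illustration.

```python
# Illustrative sketch only: helper names and the example target are hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(target: str) -> str:
    """Return the DNA strand complementary to `target`, written 5'->3'."""
    target = target.upper()
    unknown = set(target) - set(COMPLEMENT)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # Complement each base while reading the target backwards.
    return "".join(COMPLEMENT[base] for base in reversed(target))


def within_length_range(oligo: str, preferred: bool = False) -> bool:
    """Check the 10-100 nt range, or the preferred 20-50 nt range."""
    low, high = (20, 50) if preferred else (10, 100)
    return low <= len(oligo) <= high


if __name__ == "__main__":
    target = "ATGGCCTTAGCAGGTCCATTGACC"  # hypothetical 24-nt region
    oligo = antisense_oligo(target)
    print(oligo, within_length_range(oligo, preferred=True))
```

Reversing the target before complementing keeps the result in the conventional 5'->3' orientation, which is how oligonucleotides are normally written and ordered.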

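The dosage range stated in the description (0.01 µg to 10 g per kg of body weight) can be turned into absolute bounds for a given body weight with simple arithmetic. The sketch below is only a worked example of that multiplication; the function name and the 70 kg example weight are assumptions, and the code says nothing about which dose within the range would be appropriate.

```python
# Bounds quoted in the description: 0.01 µg to 10 g per kg of body weight.
MIN_DOSE_G_PER_KG = 0.01e-6  # 0.01 µg expressed in grams
MAX_DOSE_G_PER_KG = 10.0


def dose_bounds_grams(body_weight_kg: float) -> tuple[float, float]:
    """Return (lower, upper) bounds in grams for the stated per-kg range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    return (MIN_DOSE_G_PER_KG * body_weight_kg,
            MAX_DOSE_G_PER_KG * body_weight_kg)


if __name__ == "__main__":
    low, high = dose_bounds_grams(70.0)  # 70 kg is a hypothetical example
    print(f"{low:.1e} g to {high:.0f} g")
```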
SEQUENCE LISTING

<160> NUMBER OF SEQ ID NOS: 8

<210> SEQ ID NO 1
<211> LENGTH: 198
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 1

Met Lys Leu Tyr Ser Leu Ser Val Leu Tyr Gly Glu Ala Lys Val 1 5 15

Val Leu Leu Lys Ala Ala Tyr Asp Val Ser Ser Phe Ser Phe Phe Gln 25 30

Arg Ser Ser Val Gln Glu Phe Met Thr Phe Thr Ser Gln Leu Ile Val 35 40 45

Glu Arg Ser Ser Gly Thr Arg Ala Ser Val Lys Glu Gln Asp 50 55 60

Leu His Val Val Arg Asn Asp Ser Leu Ala Gly Val Val 65 70

Ala Asp Asn Glu Tyr Pro Ser Arg Val Ala Phe Thr Leu Leu Glu 85 90 95

Val Leu Asp Glu Phe Ser Gln Val Asp Arg Ile Asp Trp Pro Val 105 110

Gly Ser Pro Ala Thr Ile His Tyr Pro Ala Leu Asp Gly His Leu Ser 115 120 125

Arg Tyr Gln Asn Pro Arg Glu Ala Asp Pro Met Thr Val Gln Ala 130 135 140

Glu Leu Asp Glu Thr Lys Ile Ile Leu His Asn Thr Met Glu Ser Leu 145 150 155 160

Leu Glu Arg Gly Glu Leu Asp Asp Leu Val Ser Ser Glu Val 165 170

Leu Gly Thr Gln Ser Ala Phe Tyr Thr Ala Arg Lys Gln Asn 180 185 190

Ser Cys Ala Ile Met 195

<210> SEQ ID NO 2
<211> LENGTH: 464
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 2

Met Pro Arg Pro Arg Ser Gln Leu Pro Gln Pro Asn Ser Gln Val Glu 1 5 15

Gln Ile Leu Ala Glu Phe Gln Leu Gln Glu Glu Asp Leu Lys Val 25 30

Met Arg Arg Met Gln Lys Glu Met Asp Arg Gly Leu Arg Leu Glu Thr 35 40 45

His Glu Glu Ala Ser Val Lys Met Leu Pro Thr Tyr Val Arg Ser Thr 50 55 60

Pro Glu Gly Ser Glu Val Gly Asp Phe Leu Ser Leu Asp Leu Gly Gly 65 70 75 80

Thr Asn Phe Arg Val Met Leu Val Val Gly Glu Gly Glu Glu Gly 85 90 95

Gln Trp Ser Val Lys Thr His Gln Thr Ser Ile Pro Glu Asp 105 110

Ala Met Thr Gly Thr Ala Glu Met Leu Phe Ile Ser Glu 115 120 125

Ile Ser Asp Phe Leu Asp Lys His Gln Met His Leu Pro 130 135 140

Leu Gly Phe Thr Phe Ser Phe Pro Val Arg His Glu Asp Ile Asp Lys 145 150 155 160

Gly Ile Leu Leu Asn Trp Thr Gly Phe Ala Ser Gly Ala Glu 165 170

Gly Asn Asn Val Val Gly Leu Leu Arg Asp Ala Ile Arg Arg Gly 180 185 190

Asp Phe Glu Met Asp Val Val Ala Met Val Asn Asp Thr Val Ala Thr 195 200

Met Ile Ser Tyr Glu Asp His Gln Cys Glu Val Gly Met Ile 210 215

Val Gly Thr Gly Asn Ala Met Glu Glu Met Gln Asn Val 225 230 235 240

Glu Leu Val Glu Gly Asp Glu Gly Arg Met Val Asn Thr Glu Trp 245 250 255

Gly Ala Phe Gly Asp Ser Gly Glu Leu Asp Glu Phe Leu Leu Glu Tyr 260 265 270

Asp Arg Leu Val Asp Glu Ser Ser Ala Asn Pro Gly Gln Gln Leu Tyr 280 285

Glu Lys Leu Ile Gly Gly Lys Met Gly Glu Leu Val Arg Leu Val 290 295 300

Leu Leu Arg Leu Val Asp Glu Asn Leu Leu Phe His Gly Glu Ala Ser 305 310 315

Glu Gln Leu Arg Thr Arg Gly Ala Phe Glu Thr Arg Phe Val Ser Gln 325 330 335

Val Glu Ser Asp Thr Gly Asp Arg Lys Gln Ile Asn Ile Leu Ser 340 345 350

Thr Leu Gly Leu Arg Pro Ser Thr Thr Asp Asp Ile Val Arg Arg 355 360 365

Ala Cys Glu Ser Val Ser Thr Arg Ala Ala His Met Ser Ala Gly 370 375 380

Leu Ala Gly Val Ile Asn Arg Met Arg Glu Ser Arg Ser Glu Asp Val 385 390 395 400

Met Arg Ile Thr Val Gly Val Asp Gly Ser Val Leu His Pro 405 410 415

Ser Phe Glu Arg Phe His Ala Ser Val Arg Arg Leu Thr Pro Ser 420 425 430

Glu Ile Thr Phe Ile Glu Ser Glu Glu Gly Ser Gly Arg Gly Ala 435 440 445

Ala Leu Val Ser Ala Val Ala Ala Cys Met Leu Gly Gln 450 455 460

<210> SEQ ID NO 3
<211> LENGTH: 1158
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 3

Met Ala Ala Val Arg Gly Ala Pro Leu Leu Ser Cys Leu Leu Ala Leu 1 5 10 15

Leu Ala Leu Cys Pro Gly Gly Arg Pro Gln Thr Val Leu Thr Asp Asp 20 25 30

Glu Ile Glu Glu Phe Leu Glu Gly Phe Leu Ser Glu Leu Glu Pro Glu 35 40 45

Pro Arg Glu Asp Asp Val Glu Ala Pro Pro Pro Pro Glu Pro Thr Pro 50 55 60

Arg Val Arg Lys Ala Gln Ala Gly Gly Lys Pro Gly Lys Arg Pro Gly 65 70 75 80

Thr Ala Ala Glu Val Pro Pro Glu Lys Thr Lys Asp Lys Gly Lys Lys 85 90 95

Gly Lys Lys Asp Lys Gly Pro Lys Val Pro Lys Glu Ser Leu Glu Gly 100 105 110

Ser Pro Arg Pro Pro Lys Lys Gly Lys Glu Lys Pro Pro Lys Ala Thr 115 120 125

Lys Lys Pro Lys Glu Lys Pro Pro Lys Ala Thr Lys Lys Pro Lys Glu 130 135 140

Glu Pro Pro Lys Ala Thr Lys Lys Pro Lys Glu Lys Pro Pro Lys Ala 145 150 155 160

Thr Lys Lys Pro Pro Ser Gly Lys Arg Pro Pro Ile Leu Ala Pro Ser 165 170 175

Glu Thr Leu Glu Trp Pro Leu Pro Pro Pro Pro Ser Pro Gly Pro Glu 180 185 190

Glu Leu Pro Gln Glu Gly Gly Ala Pro Leu Ser Asn Asn Trp Gln Asn 195 200 205

Pro Gly Glu Glu Thr His Val Glu Ala Gln Glu His Gln Pro Glu Pro 210 215 220

Glu Glu Glu Thr Glu Gln Pro Thr Leu Asp Tyr Asn Asp Gln Ile Glu 225 230 235 240

Arg Glu Asp Tyr Glu Asp Phe Glu Tyr Ile Arg Arg Gln Lys Gln Pro 245 250 255

Arg Pro Pro Pro Ser Arg Arg Arg Arg Pro Glu Arg Val Trp Pro Glu 260 265 270

Pro Pro Glu Glu Lys Ala Pro Ala Pro Ala Pro Glu Glu Arg Ile Glu 275 280 285

Pro Pro Val Lys Pro Leu Leu Pro Pro Leu Pro Pro Asp Tyr Gly Asp 290 295 300

Gly Tyr Val Ile Pro Asn Tyr Asp Asp Met Asp Tyr Tyr Phe Gly Pro 305 310 315 320

Pro Pro Pro Gln Lys Pro Asp Ala Glu Arg Gln Thr Asp Glu Glu Lys 325 330 335

Glu Glu Leu Lys Lys Pro Lys Lys Glu Asp Ser Ser Pro Lys Glu Glu 340 345 350

Thr Asp Lys Trp Ala Val Glu Lys Gly Lys Asp His Lys Glu Pro Arg 355 360 365

Lys Gly Glu Glu Leu Glu Glu Glu Trp Thr Pro Thr Glu Lys Val Lys 370 375 380

Cys Pro Pro Ile Gly Met Glu Ser His Arg Ile Glu Asp Asn Gln Ile 385 390 395 400

Arg Ala Ser Ser Met Leu Arg His Gly Leu Gly Ala Gln Arg Gly Arg 405 415

Leu Asn Met Gln Thr Gly Ala Thr Glu Asp Asp Asp Gly Ala 425 430

Trp Ala Glu Asp Asp Ala Arg Thr Gln Trp Ile Glu Val Asp Thr 435 440 445

Arg Arg Thr Thr Phe Thr Gly Val Ile Thr Gln Gly Arg Asp Ser 450 455 460

Ser Ile His Asp Asp Phe Val Thr Thr Phe Phe Val Gly Phe Ser Asn 465 470

Asp Ser Gln Thr Trp Val Met Thr Asn Gly Glu Glu Met Thr 485 490 495

Phe His Gly Asn Val Asp Asp Thr Pro Val Leu Ser Glu Leu Pro 500 505

Glu Pro Val Val Ala Arg Phe Ile Arg Ile Pro Leu Thr Trp Asn 515 520 525

Gly Ser Leu Met Arg Leu Glu Val Leu Gly Cys Ser Val Ala Pro 530 535 540

Val Ser Tyr Ala Gln Asn Glu Val Val Ala Thr Asp Asp Leu 545 550 555 560

Asp Phe Arg His His Ser Asp Met Arg Gln Leu Met Lys Val 565 570 575

Val Asn Glu Glu Cys Pro Thr Ile Thr Arg Thr Ser Leu Gly 585 590

Ser Ser Arg Gly Leu Ile Tyr Ala Met Glu Ile Ser Asp Asn Pro 595 605

Gly Glu His Glu Leu Gly Glu Pro Glu Phe Arg Tyr Thr Ala Gly Ile 610 615

His Gly Asn Glu Val Leu Gly Arg Glu Leu Leu Leu Leu Leu Met Gln 625 630 635 640

Leu Arg Glu Arg Asp Gly Asn Pro Arg Val Arg Ser Leu 645 650 655

Val Gln Asp Thr Ile His Leu Val Pro Ser Leu Asn Pro Asp Gly 660 665 670

Glu Val Ala Ala Gln Met Gly Ser Glu Phe Gly Asn Trp Ala Leu 675 685

Gly Leu Trp Thr Glu Glu Gly Phe Asp Ile Phe Glu Asp Phe Pro Asp 690 695 700

Leu Asn Ser Val Leu Trp Gly Ala Glu Glu Arg Trp Val Pro Tyr 705

Arg Val Pro Asn Asn Asn Leu Pro Ile Pro Glu Arg Leu Ser Pro 725 730 735

Asp Ala Thr Val Ser Thr Glu Val Arg Ala Ile Ile Ala Trp Met Glu 740 745 750

Asn Pro Phe Val Leu Gly Ala Asn Leu Asn Gly Gly Glu Arg Leu 755 760 765

Val Ser Pro Tyr Asp Met Ala Arg Thr Pro Thr Gln Glu Gln Leu 770 775

Leu Ala Ala Ala Met Ala Ala Ala Arg Gly Glu Asp Glu Asp Glu Val 785 790 795

Ser Glu Ala Gln Glu Thr Pro Asp His Ala Ile Phe Arg Trp Leu Ala 805 810 815

Ile Ser Phe Ala Ser Ala His Leu Thr Leu Thr Glu Pro Tyr Arg Gly 820 825 830

Gly Cys Gln Ala Gln Asp Tyr Thr Gly Gly Met Gly Ile Val Asn Gly 835 840 845

Ala Lys Trp Asn Pro Arg Thr Gly Thr Ile Asn Asp Phe Ser Tyr Leu 850 855 860

His Thr Asn Cys Leu Glu Leu Ser Phe Tyr Leu Gly Cys Asp Lys Phe 865 870 875 880

Pro His Glu Ser Glu Leu Pro Arg Glu Trp Glu Asn Asn Lys Glu Ala 885 890 895

Leu Leu Thr Phe Met Glu Gln Val His Arg Gly Ile Lys Gly Val Val 900 905 910

Thr Asp Glu Gln Gly Ile Pro Ile Ala Asn Ala Thr Ile Ser Val Ser 915 920 925

Gly Ile Asn His Gly Val Lys Thr Ala Ser Gly Gly Asp Tyr Trp Arg 930 935 940

Ile Leu Asn Pro Gly Glu Tyr Arg Val Thr Ala His Ala Glu Gly Tyr 945 950 955 960

Thr Pro Ser Ala Lys Thr Cys Asn Val Asp Tyr Asp Ile Gly Ala Thr 965 970 975

Gln Cys Asn Phe Ile Leu Ala Arg Ser Asn Trp Lys Arg Ile Arg Glu 980 985 990

Ile Met Ala Met Asn Gly Asn Arg Pro Ile Pro His Ile Asp Pro Ser 995 1000 1005

Arg Pro Met Thr Pro Gln Gln Arg Arg Leu Gln Gln Arg Arg Leu 1010 1015 1020

Gln His Arg Leu Arg Leu Arg Ala Gln Met Arg Leu Arg Arg Leu 1025 1030 1035

Asn Ala Thr Thr Thr Leu Gly Pro His Thr Val Pro Pro Thr Leu 1040 1045 1050

Pro Pro Ala Pro Ala Thr Thr Leu Ser Thr Thr Ile Glu Pro Trp 1055 1060 1065

Gly Leu Ile Pro Pro Thr Thr Ala Gly Trp Glu Glu Ser Glu Thr 1070 1075 1080

Glu Thr Tyr Thr Glu Val Val Thr Glu Phe Gly Thr Glu Val Glu 1085 1090 1095

Pro Glu Phe Gly Thr Lys Val Glu Pro Glu Phe Glu Thr Gln Leu 1100 1105 1110

Glu Pro Glu Phe Glu Thr Gln Leu Glu Pro Glu Phe Glu Glu Glu 1115 1120 1125

Glu Glu Glu Glu Lys Glu Glu Glu Ile Ala Thr Gly Gln Ala Phe 1130 1135 1140

Pro Phe Thr Thr Val Glu Thr Tyr Thr Val Asn Phe Gly Asp Phe 1145 1150 1155

<210> SEQ ID NO 4
<211> LENGTH: 469
<212> TYPE: PRT
<213> ORGANISM: Homo sapiens

<400> SEQUENCE: 4

Met Phe Ser Glu Gln Ala Ala Gln Arg Ala His Thr Leu Leu Ser Pro 1 5 10 15

Pro Ser Ala Asn Asn Ala Thr Phe Ala Arg Val Pro Val Ala Thr Tyr 25

Thr Asn Ser Ser Gln Pro Phe Arg Leu Gly Glu Arg Ser Phe Ser Arg 35 40 45

Gln Tyr Ala His Ile Ala Thr Arg Leu Ile Gln Met Arg Pro Phe 50 55 60

Leu Glu Asn Arg Ala Gln Gln His Trp Gly Ser Gly Val Gly Val Lys 65 70

Leu Glu Leu Gln Pro Glu Glu Lys Val Val Gly Thr 85 90 95

Leu Phe Ala Met Pro Leu Gln Pro Ser Ile Leu Arg Glu Val Ser 105 110

Glu Glu His Asn Leu Leu Pro Gln Pro Pro Arg Ser Lys Ile His 115 120 125

Pro Asp Asp Glu Leu Val Leu Glu Asp Glu Leu Gln Arg Ile Leu 130 135 140

Lys Gly Thr Ile Asp Val Ser Leu Val Thr Gly Thr Val Leu Ala 145 150 155 160

Val Phe Gly Ser Val Arg Asp Asp Gly Lys Phe Leu Val Glu Asp 165 175

Phe Ala Asp Leu Ala Pro Gln Lys Pro Ala Pro Pro Leu Asp Thr 180 185 190

Asp Arg Phe Val Leu Leu Val Ser Gly Leu Gly Leu Gly Gly Gly Gly 195

Gly Glu Ser Leu Leu Gly Thr Gln Leu Leu Val Asp Val Val Thr Gly 210 215 220

Gln Leu Gly Asp Glu Gly Glu Gln Ser Ala Ala His Val Ser Arg 225 230 235 240

Val Ile Leu Ala Gly Asn Leu Leu Ser His Ser Thr Gln Ser Arg Asp 245 250 255

Ser Ile Asn Lys Ala Leu Thr Thr Gln Ala Ala Ser 260 265 270

Val Glu Ala Val Lys Met Leu Asp Glu Ile Leu Leu Gln Leu Ser Ala 285

Ser Val Pro Val Asp Val Met Pro Gly Glu Phe Asp Pro Thr Asn 290 295 300

Thr Leu Pro Gln Gln Pro Leu His Pro Met Phe Pro Leu Ala Thr 305 310 315 320

Ala Ser Thr Leu Gln Leu Val Thr Asn Pro Gln Ala Thr Ile 325 330 335

Asp Gly Val Arg Phe Leu Gly Thr Ser Gly Gln Asn Val Ser Asp Ile 340 345 350

Phe Arg Tyr Ser Ser Met Glu Asp His Leu Glu Ile Leu Glu Trp Thr 355 360 365

Leu Arg Val Arg His Ile Ser Pro Thr Ala Pro Asp Thr Leu Gly 370 375

Tyr Pro Phe Lys Thr Asp Pro Phe Ile Phe Pro Glu Pro His 385 390 395 400

Val Phe Gly Asn Thr Pro Ser Phe Gly Ser Ile Ile Arg 405 415

Gly Pro Glu Asp Gln Thr Val Leu Leu Val Thr Val Pro Asp Phe Ser 420 425 430

Ala Thr Gln Thr Ala Leu Val Asn Leu Arg Ser Leu Ala Gln 435 440 445

aatagtttgc tigagaatgat gig titt coagc titcatccatg tcc ctacaaa gogacatgaac 6200
t cat catttt titatggctgc atag tatt co atggtgtata totgccacat tittaggagga 6260
gcttgtacca ttccttctga aactatt coa atcaaaagaa aaa.gagagaa toc tocc taa 6320
Ctcatttitat gaggc.ca.gca t catcctgat accalaagggit ggcagagaga gacacaacaa 6380
aaaaagaatt ttagac caat atcCttgatgaac attgaag caaaaatcct cagtaaaata 6440
ctggcaaacc gaatccagca acacatcaaa aagcttatcc accatgatca agtgggcttic 6500
atcc ctdgga tigcaaggctg gttcaacata cqaaaat cag taaacgtaat coagcatata 6560
aacagaacca aagacaaaaa cca catgatt atctgaatag atgcagaaaa gogo ctittgac 6620
aaaattcaac aaccct catg ctaaaaactic ticaataaatt agg tattgat giggacgitatic 6680
tcaaaataat aagagctato tatgacaaac ccacago caa tat catactgaatggacaaa 6740
aactggaagc attic cctittgaaaactggca caagacitggg atgcc ct ct c ticaccacticc 6800
ttitt Caac at agtgttggaa gttctggcca gggcaat cag gtaggagaag gaaataaagg 6860
gtattolaatt aagaaaagag gaagttcaaat tdtcc ct gtt togcagatgac atgattgtat 6920
atctagaaaa ccc catcgtc. tcagoccaaa atcto cittaa gctgataagc aactt cagoa 6980
aagttct cagg atacaaaatc aatgtgcaaa aat cacaagc agt cittatac accaataa.ca 7040
gacagaga.gc caaatcatga gtgaactic cc attcacaatt gcttcaaaga gaataaaata 7100

Cctaggaatc Caacttacaa gggatgtgaa ggacct Cttic aaggagaact acaaacgact 7160
gct caatgaa ataaaagagg atacaaacaa atggaagaac attic catgct catggg tagg 7220
aagaat Cagt atctgaaaa tigCCatact gcc Calaggta atttatagat ticalatgcCat 7280
c cctat caag ctaccalatga ctittctt cac agaattggaa aaaactaaag titcatatgga 7340
accaaaaaag agc.ccgcatt gccaagt caa t cctaagcca aaagaacaaa gctggaggca 7400
t cacact acc tacttictaa citatactaca aggctacagt aaccaaaac a gcatgctact 7460
ggtaccaaaa cagagatata gag caatgga acagaacaga gcc ct cagaa ataatgcc.gc 7520
at atctacaa gcatctgatc tittgacaaac ctdacaaaaa caa.gcaatgg ggaaaggatt 7580
C cct atttaa taaatggtgc tigggaaaact ggctago cat atgtagaaag ctgaaactgg 7640
atcc ct tcct tacaccittat acaaaaatta attcaagatg gattaaagaci ttacatgtta 7700
gacctaaaac cataaaaacc ctagaagaaa acctaggcaa taccatt cag gacataggca 7760
tgggcaagga Ctt catgtct aaaacaccala aag caatggc aacaaaagcc aaaattgaca 7820
aatgggat ct aattaalacta aagagcttct gcacago aaa agaaact acc atcagagtga 7880
acaggcaa.cc tacagaatgg gagaaaattt ttgcaaccta ct catctgac alaagggctaa 7940
tatic cagaat ctacaatgaa citcaaacaaa tttacaagaa aaaaacaaac aacco catca 8000
acaaatgggc gaaggatatgaacagacact tct caaaaga agacatttat gtagccaaaa 8060
aacacatgaaaaaatgct catcatcactgg ccatcagaga aatgcaaatc aaaac cacaa 8120
tgagatacca t ct cacacca gttagaatgg tat cattaa aaagt cagga aacaa.caggt 8180
gctggagagg atgtggagaa at aggaacac ttitta cactg. titcgtgggac ttaalactag 8240
ttcaac catt gtggaagt ca gtgtggcgat tcct caggga tictagaactg gaaataccat 8300
ttgacc cago catcc catta citagg tatat acccaaagga ttataaatca togctgctata 8360
aggacacatg cacacg tatgttt attgttgg cactgttcac aatagcaaag acttggalacc 8420
alacc caaatgaac cottctt tttgcttgcg ttgttgaaag aaggcaagtic tatggatagg 8480
aatgagtgag gcacagct co Ctgaggatgc catat cttgc ccgtttcttg tdt attalagt 8540

tggcct Cotg tagttt tagg aag cagotgt ggc ct cagac C catctgctg talacct ct a 5720
citccatattt attgcactitt ctdtctgtga gcgt.cggittt ct citcct ct a taacaatagg 5780
ataataatga cactaccatg ccttgcaaaa atgctacaag gigttcactga gataaatctg 5840
gagagt catg cctgaaaaat agtaagttcgt tataaaggg aagctgctat taataaataa 5900
agctttittct tttitttitttt tttgagatgg aatct cactic toggcgcc tag gotggagtgc 5960
agtgatgcaa tottggctica citgcaacctic cqc ct cotgt gttcaa.gcaa toctic ctact 6020
t cagcatcct cagtagctgg gactacaggit gcgcaccacc atgc.ccggct agtttitttac 6080
atttittaaag ct attaatag gocagccaca gtggct catg cctataatcc cagcactittg 6140
ggaa.gctgag gCaggtggat C 6161

What is claimed is:

1. A method of identifying a nucleotide sequence variant of a 5'-noncoding region, 3'-noncoding region or intron region of SEQ ID NO:6, wherein said variant encodes a protein that has human glucokinase activity, wherein SEQ ID NO:6 consists of a 5'-noncoding region shown in sequence segment 1-20484 of SEQ ID NO:6, a 3'-non coding region shown in sequence segment 33461-45,980 of SEQ ID NO:6, exon regions shown in sequence segments 20485-20523, 25133-25297, 26173-26328, 27524-27643, 28535-28630, 29740-28838, 30765-30950, 31982-32134, 32867-33097, 33314-33460 of SEQ ID NO:6, and intron regions shown in sequence segments 20524-25132, 25298-26172, 26329-27523, 27644-28534, 28631-28739, 28839-30764, 30951-31981, 32135-32866, 33098-33313 of SEQ ID NO:6; or its complementary sequence comprising
(a) isolating genomic polynucleotide from a sample and
(b) determining the presence or absence of a nucleotide sequence variation in said genomic polynucleotide by comparing the nucleotide sequence of SEQ ID NO:6 with the nucleotide sequence of the isolated genomic polynucleotide and establishing if and where a difference occurs between the two nucleic acid sequences thereby identifying a nucleotide sequence variant of SEQ ID NO:6 or its complement.

2. A method for detecting the presence of: (a) a nucleic acid molecule 45,980 nucleotides in length which is at least 99% identical to SEQ ID NO:6 which encodes a polypeptide that has human glucokinase activity, wherein SEQ ID NO:6 consists of a 5'-noncoding region shown in sequence segment 1-20484 of SEQ ID NO:6, a 3'-non coding region shown in sequence segment 33461-45,980 of SEQ ID NO:6, exon regions shown in sequence segments 20485-20523, 25133-25297, 26173-26328, 27524-27643, 28535-28630, 29740-28838, 30765-30950, 31982-32134, 32867-33097, 33314-33460 of SEQ ID NO:6, and intron regions shown in sequence segments 20524-25132, 25298-26172, 26329-27523, 27644-28534, 28631-28739, 28839-30764, 30951-31981, 32135-32866, 33098-33313 of SEQ ID NO:6; (b) a fragment of (a), comprising at least nucleotides 20485-33460 of SEQ ID NO:6 which encodes a polypeptide having human glucokinase activity and (c) a nucleic acid molecule which is a complement of the nucleic acid molecules specified in (a)-(b) in a sample, comprising contacting the sample with a polynucleotide probe comprising at least 20 contiguous nucleotides that hybridizes to said nucleic acid molecule under stringent conditions and determining whether the polynucleotide probe binds to said nucleic acid molecule in the sample.
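
The comparison recited in step (b) of claim 1 (comparing SEQ ID NO:6 with the isolated genomic sequence and establishing if and where a difference occurs) can be illustrated with a minimal Python sketch. It is an illustration only, not the patented procedure. It assumes the two sequences have already been aligned to equal length without gaps, which the claim does not specify. The names classify_position and find_variants and the offset parameter are hypothetical. Region boundaries are the 1-based segment coordinates recited in the claim. Because one recited exon segment (29740-28838) does not agree with the intron segments on either side of it, the sketch treats any position from 20485 to 33460 that lies outside a listed intron as exon.

```python
# Region boundaries (1-based, inclusive) as recited in claim 1 for SEQ ID NO:6.
FIVE_PRIME = (1, 20484)
THREE_PRIME = (33461, 45980)
INTRONS = [
    (20524, 25132), (25298, 26172), (26329, 27523), (27644, 28534),
    (28631, 28739), (28839, 30764), (30951, 31981), (32135, 32866),
    (33098, 33313),
]


def classify_position(pos):
    """Return the region label for a 1-based position in SEQ ID NO:6."""
    if not FIVE_PRIME[0] <= pos <= THREE_PRIME[1]:
        raise ValueError(f"position {pos} is outside 1-45980")
    if pos <= FIVE_PRIME[1]:
        return "5'-noncoding"
    if pos >= THREE_PRIME[0]:
        return "3'-noncoding"
    for start, end in INTRONS:
        if start <= pos <= end:
            return "intron"
    # Assumption: anything left between 20485 and 33460 is exon.
    return "exon"


def find_variants(reference, sample, offset=1):
    """Compare two pre-aligned, equal-length sequences base by base.

    offset is the SEQ ID NO:6 coordinate of the first base supplied.
    Returns a list of (position, reference_base, sample_base, region).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    variants = []
    for i, (ref_base, smp_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != smp_base:
            pos = offset + i
            variants.append((pos, ref_base, smp_base, classify_position(pos)))
    return variants


if __name__ == "__main__":
    # Toy four-base window starting at coordinate 20522 (made-up bases).
    for variant in find_variants("ACGT", "ACCT", offset=20522):
        print(variant)
```

Since claim 1 is directed to variants in the 5'-noncoding, 3'-noncoding or intron regions, a caller would keep only the results whose region label is not "exon". Real genomic comparisons would also need gapped alignment to handle insertions and deletions, which this sketch leaves out.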