( 12 ) United States Patent
Total Page:16
File Type:pdf, Size:1020Kb
US010127346B2 (12 ) United States Patent (10 ) Patent No. : US 10 , 127 ,346 B2 Dewey et al . (45 ) Date of Patent: Nov . 13 , 2018 (54 ) SYSTEMS AND METHODS FOR Fan et al. Noninvasive diagnosis of fetal aneuploidy by shotgun INTERPRETING A HUMAN GENOME USING sequencing DNA from maternal blood Proceedings of the National A SYNTHETIC REFERENCE SEQUENCE Academy of Sciences USA vol. 16266 - 16271 (2008 ) . * Chen et al. -717A > G polymorphism of human C - reactive protein ( 75 ) Inventors: Frederick Dewey , Redwood City , CA gene associated with coronary heart disease in ethnic Han Chinese : (US ) ; Euan A . Ashley , Menlo Park , CA the Beijing atherosclerosis study Journal of Molecular Medicine (US ) ; Matthew Wheeler, Sunnyvale , vol. 83 , pp . 72 -78 ( 2005 ). * CA (US ) ; Michael Snyder , Stanford , Wheeler et al. The complete genome of an individual by massively CA (US ) ; Carlos Bustamante , Emerald parallel DNA sequencing Nature vol. 452 , j pp . 872 -877 ( 2008 ). * Hills , CA (US ) Candore et al . Pharmacogenomics : A Tool to Prevent and Cure Coronary Heart Disease Current Pharmaceutical Design vol. 13 pp . ( 73 ) Assignee : The Board of Trustees of the Leland 3726 - 3734 ( 2007 ) . * Stanford Junior University , Stanford , Dewey et al . Phase Whole -Genome Genetic Risk in a Family CA (US ) Quartet Using a Major Allele Reference Sequence PLoS Genetics vol. 7 , article e1002280 ( 2011 ) . * ( * ) Notice : Subject to any disclaimer , the term of this Abecasis et al. , “ Merlin — rapid analysis of dense genetic maps patent is extended or adjusted under 35 using sparse gene flow trees ” , Nature Genetics , Jan . 2002 , vol. 30 , U . S . C . 154 ( b ) by 978 days . pp . 97 - 101. Adzhubei et al. , “ A method and server for predicting damaging ( 21 ) Appl. No. : 13/ 445 , 923 missense mutations” , Nat Methods 7 , 248 - 249 (2010 ) . Ashley et al. , " Clinical assessment incorporating a personal genome” , ( 22 ) Filed : Apr. 13 , 2012 Lancet 375 , 1525 - 1535 ( 2010 ) . Baudat et al. , " PRDM9 is a major determinant of meiotic recom (65 ) Prior Publication Data bination hotspots in humans and mice” , Science 327, 836 -840 ( 2010 ) . US 2013 /0073217 A1 Mar . 21 , 2013 Bentley et al. , " Accurate whole human genome sequencing using US 2015 /0370963 A9 Dec . 24 , 2015 reversible terminator chemistry ” , Nature , Articles , Nov. 6 , 2008 , vol. 456 , pp . 53 - 59 . Related U . S . Application Data Bezemer et al ., “ No association between the common MTHFR (60 ) Provisional application No . 61/ 474 ,749 , filed on Apr. 677C - > T polymorphism and venous thrombosis : results from the 13 , 2011, provisional application No . 61 / 502, 236 , MEGA study” , Arch Intern Med 167 , 497 - 501 ( 2007 ) . Bodenreider , “ The Unified Medical Language System (UMLS ) : filed on Jun . 28 , 2011 . integrating biomedical terminology ” , Nucleic Acids Research , 2004 , vol. 32 Database issue, D267 -D270 . ( 51 ) Int . Ci. Broman et al. , “ Comprehensive human genetic maps: individual and GOON 5 /02 ( 2006 .01 ) sex - specific variation in recombination ” , Am J Hum Genet 63 , GO6F 19 / 12 ( 2011 .01 ) 861 - 869 ( 1998 ) . G06F 19 / 18 ( 2011. 01 ) Chamary et al . , “ Hearing silence : non -neutral evolution at synony GO6F 19 /22 ( 2011. 01 ) mous sites in mammals ” , Nature ReviewsGenetics 2006 ; 7 : 98 -108 . G06F 19 / 24 ( 2011 . 01 ) Chen et al. , “ The reference human genome demonstrates high risk GO6F 15 / 00 ( 2006 . 01 ) of type 1 diabetes and other disorders ” , Pac Symp Biocomput, G11C 17700 ( 2006 .01 ) 231 - 242 (2011 ) . De Bakker et al ., “ A high -resolution HLA and SNP haplotype map (52 ) U . S . CI. for disease association studies in the extended human MHC ” , Nat GO6F 19 / 12 (2013 .01 ) ; G06F 19 / 18 Genet 38 , 1166 - 1172 (2006 ) . ( 2013 . 01 ) ; G06F 19 / 24 ( 2013 .01 ) ; GOON 5 / 02 Donnelly, “ The Probability that Related Individuals Share Some ( 2013 .01 ) ; G06F 19 /22 (2013 . 01 ) Section of Genome Identical by Descent” , Theoretical Population (58 ) Field of Classification Search Biology, 1983 , vol. 23 , pp . 34 -63 . CPC . .. C12Q 1 /6883 ; C120 2600 / 172 ; C120 Durbin et al. , “ A map of human genome variation from population 2600 / 118 ; G06F 19 / 18 ; G06F 19 /22 ; scale sequencing ” , Nature 467 , 1061 - 1073 (2010 ) . G06F 19 / 24 ; G06F 19 / 3437 ; G06F (Continued ) 19 /3443 See application file for complete search history . Primary Examiner - John S Brusca ( 56 ) References Cited (74 ) Attorney, Agent, or Firm — KPPB LLP U . S . PATENT DOCUMENTS ABSTRACT 8 ,008 ,013 B2 8 / 2011 Gaffney et al. (57 ) 8 , 163, 896 B1 4 / 2012 Bentwich et al. 2004 /0161779 Al 8 / 2004 Gingeras In an embodiment of the present invention , three novel 2012 / 0059594 Al 3 / 2012 Hatchwell et al . human reference genome sequences were developed based 2013 /0116930 A1 5 / 2013 Karczewski et al . on the most common population -specific DNA sequence 2013 /0116931 A15 / 2013 Karczewski et al. (“ major allele ” ) . Methods were developed for their integra tion into interpretation pipelines for highthroughput whole OTHER PUBLICATIONS genome sequencing . Shendure et al. Next- generation DNA sequencing Nature Biotech nology vol. 26 , pp . 1135 - 1145 ( 2008 ). * 7 Claims, 71 Drawing Sheets US 10 ,127 ,346 B2 Page 2 ( 56 ) References Cited Ng et al. , “ Accounting for Human Polymorphisms Predicted to Affect Protein Function ” , Genome Research , 2002 , vol. 12 , pp . 436 - 446 OTHER PUBLICATIONS Ng et al ., “ Predicting the Effects of Amino Acid Substitutions on Protein Function ” , Annu . Ref . Genomics Hum . Genet ., 2006 , vol . 7 , Eyre -Walker , “ An analysis of codon usage in mammals : selection or pp . 61 -80 . mutation bias ? " , J . Mol. Evol. 1991 ; 33 : 442 -449 . Ng et al ., “ Sift : Predicting amino acid changes that affect protein Fitch , “ Toward Defining the Course of Evolution : Minimum Change function ” , Nucleic Acids Res 31, 3812 -3814 (2003 ) . for a Specific Tree Topology ” , Systematic Zoology 20 , 406 - 416 Ormond et al. , “ Challenges in the clinical application o whole ( 1971 ) . genome sequencing ” , Lancet 375 , 1749 - 1751 (2010 ) . Hamosh et al. , " Online Mendelian Inheritance in Man ( OMIM ) a Paigen et al ., “ The recombinational anatomy of a mouse chromo knowledgebase of human genes and genetic disorders ” , Nucleic some” , PLoS Genet 4 , e1000119 (2008 ) . Acids Research vol. 30 , 52 -55 , 2002 . Parvanov et al. , “ Prdm9 controls activation of mammalian recom Hofacker, “ Vienna RNA secondary structure server ” , Nuc . Acid bination hotspots” , Science 327 , 835 ( 2010 ) . Res. 2003 ; V 31 No. 13 : 3429 - 3431. Petkov et al . , “ Crossover interference underlies sex differences in Hoppe et al. , “ Marburg I polymorphism of factor VII -activating recombination rates ” , Trends Genet 23 , 539 -542 (2007 ) . protease is associated with idiopathic venous thromboembolism ” , Price et al. , " Principal components analysis corrects fro stratifica Blood 105 , 1549 - 1551 (2005 ) . tion in genome- wide associate studies ” , Nature Genetics , Aug . Ikemura , “ Codon usage and tRNAcontent in unicellular and multicel 2006 , vol . 38 , No. 8 , pp . 904 - 909 . lular organisms” , Mol . Biol . Evol. 1985 2 : 13 - 34 . Pruitt et al . , “ NCBI reference sequences (RefSeq ) : a curated non Johnson et al ., " SNAP : a web -based tool for identification and redundant. sequence database of genomes, transcripts and proteins ” , annotation of proxy SNPs using HapMap ” , Bioinformatics 24 , Nucleic Acids Res 35 , D61- 65 ( 2007 ) . 2938 - 2939 (2008 ) . Purcell et al. , “ Plink : A Tool Set for Whole -Genome Association and Kimchi- Sarfaty et al. , “ A “ Silent” Polymorphism in the MDR1 Population - Based Linkage Analyses” , The American Journal of Gene Changes Substrate Specificity” , Science 2007 ; V 315 No . Human Genetics , Sep . 2007 , vol. 81 , pp . 559 -575 . 5811 : 525 - 528 . Ramensky et al . , “ Human non - synonymous SNPs : server and Kimura , “ Evolutionary rate at the molecular level” , Nature 217 , survey ” , Nucleic Acids Research , 2002 , vol. 30 , No. 7 , pp . 3894 624 -626 ( 1968 ) . 3900 . Klein et al ., “ Estimation of the warfarin dose with clinical and Ridker et al. , “ Interrelation of hyperhomocyst( e ) inemia , factor V pharmacogenetic data ” , N Engl J Med 360 , 753 - 764 ( 2009 ) . Lei den , and risk of future venous thromboembolism ” , Circulation Kong et al ., “ Fine -scale recombination rate differences between 95 , 1777 -1782 (1997 ) . sexes, populations and individuals ” , Nature 467, 1099 - 1103 (2010 ) . Ridker et al ., “ Mutation in the gene coding for coagulation factor V Kong et al ., “ Parental origin of sequence variants associated with and the risk ofmyocardial infarction , stroke , and venous thrombosis complex diseases " , Nature 462, 868 -874 ( 2009 ) . in apparently healthy men ” , N Engl J Med 332, 912 - 917 ( 1995 ) . Kong et al. , “ Sequence variants in the RNF212 gene associate with Rivas et al. , " Secondary structure alone is generally not statistically genome- wide recombination rate” , Science 319 , 1398 -1401 (2008 ) . significant for the detection of noncoding RNAs" , Bioinformatics Koster et al . , “ Venous thrombosis due to poor anticoagulant response 1999 ; V 16 No . 7 : 583 -605 . to activated protein C : Leiden Thrombophilia Study ” , Lancet 342 , Roach et al. , “ Analysis of genetic inheritance in a family quartet by 1503 - 1506 ( 1993 ) . whole -genome sequencing” , Science 328 , 636 -639 (2010 ) . Kruglyak et al . , “ Parametric and Nonparametric Linkage Analysis :