US010167509B2 (12 ) United States Patent ( 10 ) Patent No. : US 10 , 167, 509 B2 Regan et al. ( 45 ) Date of Patent : Jan . 1 , 2019 (54 ) ANALYSIS OF NUCLEIC ACIDS ( 56 ) References Cited (71 ) Applicant: Bio - Rad Laboratories , Inc ., Hercules , U . S . PATENT DOCUMENTS CA (US ) 4 ,437 ,975 A 3/ 1984 Gillespie et al. 4 ,683 , 194 A * 7 / 1987 Saiki et al ...... 435 /6 . 12 (72 ) Inventors : John F . Regan , San Mateo , CA ( US ) ; 4 ,683 , 202 A 7 / 1987 Mullis Serge Saxonov , Oakland , CA (US ) ; 4 ,883 ,750 A 11/ 1989 Whiteley et al. 4 ,988 ,617 A 1/ 1991 Landegren et al. Michael Y . Lucero , San Francisco , CA 5 ,242 , 794 A 9 / 1993 Whiteley et al. ( US) ; Benjamin J . Hindson , 5 ,409 , 818 A 4 / 1995 Davey et al. Livermore , CA (US ); Phillip 5 ,413 , 909 A 5 / 1995 Bassam et al . Belgrader , Severna Park , MD ( US) ; 5 , 426 , 180 A 6 / 1995 Kool 5 ,432 ,054 A 7 / 1995 Saunders et al. Simant Dube , Kirkland , WA (US ) ; 5 ,476 , 930 A 12 / 1995 Letsinger et al. Austin P . So , Pleasanton , CA (US ) ; 5 , 494 ,810 A 2 / 1996 Barany et al. Jeffrey C . Mellen , San Francisco , CA 5 ,554 ,517 A 9 / 1996 Davey et al. ( US ) ; Nicholas J . Heredia , Mountain 5 , 593 , 826 A 1 / 1997 Fung et al. House , CA (US ) ; Kevin D . Ness , 5 , 786 , 146 A 7 / 1998 Herman et al . 5 , 861, 245 A 1 / 1999 Mcclelland et al . Pleasanton , CA (US ) ; Billy W . Colston , 5 ,866 ,337 A 2 / 1999 Schon Jr. , San Ramon , CA (US ) 5 , 871, 921 A 2 / 1999 Landegren et al . 5 ,952 ,481 A * 9 / 1999 Markham et al . 23 . 2 (73 ) Assignee : Bio - Rad Laboratories, Inc ., Hercules , 6 , 063 ,603 A 5 /2000 Davey et al. CA (US ) 6 , 143 ,496 A * 11/ 2000 Brown ...... BOIL 3 /5027 422 / 504 Subject to any disclaimer , the term of this 6 , 143 ,504 A 11/ 2000 Das et al . ( * ) Notice : 6 , 180 ,349 B1 1 /2001 Ginzinger et al. patent is extended or adjusted under 35 6 , 200 ,756 B1 3 /2001 Herman et al. U .S .C . 154 (b ) by 123 days . 6 , 203, 989 B1 3/ 2001 Goldberg et al. 6 ,235 ,472 B1 5 / 2001 Landegren et al. ( 21 ) Appl. No. : 14 / 848, 311 6 ,251 , 594 B1 6 / 2001 Gonzalgo et al. 6 , 258 , 540 B1 7 / 2001 Lo et al. 6 ,265 , 171 B1 7 / 2001 Herman et al. (22 ) Filed : Sep . 8 , 2015 6 ,331 , 393 B1 12 /2001 Laird et al. (65 ) Prior Publication Data (Continued ) US 2016 / 0076099 Al Mar. 17 , 2016 FOREIGN PATENT DOCUMENTS EP 745140 B1 11/ 2000 Related U . S . Application Data EP 0964704 B1 2 / 2002 (62 ) Division of application No. 13 / 385 , 277 , filed on Feb . (Continued ) 9 , 2012 , now Pat. No. 9 , 127 ,312 . OTHER PUBLICATIONS Barrett et al ., Haploview : analysis and visualization of LD and (60 ) Provisional application No . 61/ 441 , 209 , filed on Feb . haplotype maps. Bioinformatics 21 ( 2 ) : 263 ( 2005 ) . * 9 , 2011 , provisional application No. 61 /444 , 539, filed Beer et al. , On - Chip , Real- Time, Single - Copy Polymerase Chain on Feb . 18 , 2011 , provisional application No . Reaction in Picoliter Droplets . Analytical Chemistry 79 : 8471 ( 2007 ) . * 61 /454 ,373 , filed on Mar . 18 , 2011 , provisional Carter NP. , Methods and strategies for analyzing copy number application No . 61/ 476 , 115 , filed on Apr. 15 , 2011 , variation using DNA microarrays. Nature Genetics 39 : S16 (2007 ) . * provisional application No . 61/ 478 ,777 , filed on Apr. 25 , 2011, provisional application No . 61/ 484 ,197 , (Continued ) filed on May 9 , 2011, provisional application No . Primary Examiner — Ethan C Whisenant 61/ 490 , 055 , filed on May 25 , 2011. (74 ) Attorney, Agent, or Firm — Kolisch Hartwell, P .C . (51 ) Int. Cl. C12Q 1/ 68 ( 2018 . 01 ) ( 57 ) ABSTRACT CO7H 21 /00 ( 2006 .01 ) Provided herein are improved methods, compositions, and C12Q 1/ 6883 ( 2018 .01 ) kits for analysis of nucleic acids. The improved methods , c120 1/ 683 (2018 .01 ) compositions, and kits can enable copy number estimation C12Q 1 /6858 ( 2018 . 01 ) of a nucleic acid in a sample . Also provided herein are (52 ) U .S . CI. methods, compositions , and kits for determining the linkage CPC ...... C120 1 /6883 (2013 . 01 ) ; C12Q 1 /683 of two or more copies of a target nucleic acid in a sample ( 2013 .01 ); C12Q 1 /6858 (2013 .01 ) ; C12Q ( e. g . whether the two or more copies are on the same 2600 /158 ( 2013 .01 ) ; C12Q 2600 / 172 or different ) or for phasing (2013 . 01 ) alleles . (58 ) Field of Classification Search CPC ...... C12Q 1/ 68 ; C07H 21 /00 18 Claims, 60 Drawing Sheets See application file for complete search history. Specification includes a Sequence Listing . US 10 ,167 ,509 B2 Page 2

(56 ) References Cited 2008 /0169184 AL 7 / 2008 Brown et al . 2008 / 0254474 A1 10 / 2008 Laird et al. U . S . PATENT DOCUMENTS 2009 / 0004646 A1 1 / 2009 Schuster et al . 2009 / 0026082 A1 1 /2009 Rothberg et al. 6 ,410 ,276 B1 6 / 2002 Burg et al . 2009 /0035762 A1 2 /2009 Sampas 6 ,440 ,706 B1 * 8 /2002 Vogelstein ...... C12Q 1 /6818 2009 / 0047669 Al 2 /2009 Zhang et al. 435 / 6 . 12 2009 / 0068648 Al 3 / 2009 Yakhini et al. 6 , 558 , 928 B1 5 / 2003 Landegren 2009 / 0069194 Al 3 / 2009 Ramakrishnan 6 ,582 ,938 B1 6 /2003 Su et al. 2009 / 0098547 A1 4 / 2009 Ghosh 6 ,596 ,493 B1 7 / 2003 Reeves et al . 2009 / 0155776 A1 6 / 2009 Lo et al . 6 , 790 , 328 B2 9 / 2004 Jacobson et al. 2009 /0215633 A1 8 / 2009 Van Eijk et al. 6 ,858 ,412 B2 2 / 2005 Willis et al. 2009 /0239308 AL 9 / 2009 Dube et al. 6 ,884 ,586 B2 4 /2005 Van Ness et al . 2009 /0317798 A1 12 /2009 Heid et al. 6 , 927 ,028 B2 8 / 2005 Dennis et al. 2010 / 0070455 Al 3 /2010 Halperin et al. 6 ,953 ,663 B1 10 / 2005 Lipshutz et al. 2010 / 0092973 A14 / 2010 Davies et al. 7 ,041 , 481 B2 5 / 2006 Anderson et al . 2010 / 0092981 A1 4 / 2010 Shuber 7 , 074 , 564 B2 7 / 2006 Landegren 2010 /0093550 A1* 4 /2010 Stuelpnagel ...... C12Q 1/ 6837 7 ,256 ,020 B2 8 / 2007 Lyamichev et al. 506 / 3 7 , 320 , 860 B2 1 / 2008 Landegren et al. 2010 /0105052 A14 / 2010 Drmanac et al . 7 , 332 ,275 B2 2 / 2008 Braun et al . 2010 /0105112 A1 4 / 2010 Holtze et al. 7 , 351 ,528 B2 4 / 2008 Landegren 2010 /0129799 A1* 5/ 2010 Guomundsson . .. . . C12Q 1/ 6886 7 , 368, 242 B2 5 / 2008 Miao et al. 435 / 6 . 14 7 , 424 , 368 B2 9 / 2008 Huang et al. 2010 /0173394 AL 7 / 2010 Colston , Jr . et al . 7 ,510 , 829 B2 3 / 2009 Faham et al. 2010 / 0203538 AL 8 /2010 Dube et al. 7 ,618 , 780 B2 11/ 2009 Thompson 2010 /0255492 Al 10 /2010 Quake et al . 7 ,622 , 280 B2 11/ 2009 Holliger et al. 2010 / 0256013 A1 10 /2010 Quake et al. 7 .674 , 587 B23 / 2010 Lipshutz et al. 2010 / 0256015 A1 10 / 2010 Lim et al. 7 , 700 , 323 B2 4 / 2010 Willis et al. 2010 /0273165 Al 10 /2010 Ehrich et al . 7 ,700 , 325 B2 * 4 /2010 Cantor .. C12Q 1/ 6827 2010 /0291568 A1 * 11/ 2010 Olson ...... C12Q 1/ 6858 435 / 6 . 11 435 / 6 . 11 7 ,718 , 364 B2 5 / 2010 Hoon et al. 2010 /0317542 Al 12 /2010 Lim et al. 7 ,718 , 370 B2 5 / 2010 Dhallan 2010 /0330619 A1 12 / 2010 Willis et al. 7 , 754 ,428 B2 7 / 2010 Lo et al. 2011/ 0015084 AL 1 / 2011 Christian et al. RE41 , 780 E 9 / 2010 Anderson et al. 2011/ 0029251 AL 2/ 2011 Huang et al. 7 ,790 ,388 B2 9 /2010 Landegren et al. 2011 /0033848 A1 * 2 / 2011 Simons ...... C12Q 1/ 6809 7 ,790 , 393 B2 9 / 2010 Lyamichev et al. 435 /6 . 12 7 , 799 ,531 B2 9 / 2010 Mitchell et al . 2011/ 0039724 A1 2 / 2011 Lo et al. 7 ,811 ,757 B2 10 / 2010 Shuber 2011/ 0105353 A1 5 / 2011 Lo et al . 7 , 822 , 555 B2 10 / 2010 Huang et al. 2011 /0143949 A1 6 / 2011 Heid et al. 7 , 888 ,017 B22 / 2011 Quake et al. 2011/ 0159499 AL 6 / 2011 Hindson et al . 8 ,606 , 526 B1 * 12 / 2013 Fernandez et al ...... 702 / 19 2011/ 0183330 A1 * 7 / 2011 Lo et al...... 435 /6 . 11 9 , 127 , 312 B2 * 9 / 2015 Regan ...... C12Q 1 /683 2011/ 0217712 A1 9 / 2011 Hiddessen et al . 2002 /0151040 Al 10 /2002 O 'Keefe et al . 2011 /0244455 A1 * 10 / 2011 Larson ...... C12Q 1/ 686 2003 / 0082543 A1 * 5 / 2003 Su et al...... 435 /6 435 / 6 . 11 2003/ 0219769 AIA1 "* 1111/ 2003 OlsonOlson ...... C12Q 1 /6876 2012 /0171683 AL 7 /2012 Ness et al. 435 / 6 . 11 2012 /0220494 A1 * 8 / 2012 Samuels ...... C12N 15 / 1075 2004 / 0005614 AL 1 / 2004 Kum et al. 506 / 16 2004 / 0005710 A1 1 /2004 Son et al. 2012/ 0252015 Al 10 /2012 Hindson et al. 2004 /0091905 Al 5 / 2004 Guo 2012 /0322058 AL 12 /2012 Regan et al. 2004 /0101835 Al 5 / 2004 Willis et al. 2013/ 0040824 A1* 2/ 2013 Lo . . C12Q 1 /6886 2004 / 0137470 A1 7 / 2004 Dhallan 506 / 2 2004 / 0157243 AL 8 / 2004 Huang et al. 2013/ 0295568 A1 11/ 2013 Link 2004 / 0259778 A1 * 12 / 2004 Kotani ...... CO7K 14 / 723 800 / 8 2014 / 0087962 A1 * 3 /2014 Keys ...... C12Q 1/ 6858 2005 /0026180 A1 2 / 2005 Willis et al. 506 / 9 2005 / 0064476 A1 3 /2005 Huang et al. 2014 /0162266 Al * 6 /2014 Klitgord et al ...... 435 /6 . 11 2005 /0112636 A1 5 / 2005 Hurt et al . 2015 /0038356 AL 2 /2015 Karlin - Neumann et al. 2005 /0130217 AL 6 / 2005 Huang et al . 2016 / 0362729 Al 12 / 2016 Regan et al . 2005 /0158739 AL 7 / 2005 Jeddeloh et al. 2005 /0272070 Al 12 / 2005 Ehrich et al. FOREIGN PATENT DOCUMENTS 2006 /0073511 AL 4 / 2006 Jones et al. 2006 / 0078893 A14 / 2006 Griffiths et al . EP 951565 B1 3 / 2003 2006 / 0094111 A1 * 5 / 2006 Saito et al ...... 435 /325 EP 1366192 B1 12 / 2007 2006 /0134674 A1 6 / 2006 Huang et al. EP 1644519 B1 12 / 2008 2006 /0275789 Al 12 / 2006 Willis et al . EP 1066414 B1 6 / 2009 2006 /0281098 A1 12 / 2006 Miao et al. WO 8810315 AL 12 / 1988 WO WO 88 / 10315 AL 12 / 1988 2006 /0286580 A1 12 / 2006 Lin et al . WO 9006995 A1 6 / 1990 2006 /0292585 Al 12/ 2006 Nautiyal et al. WO WO 90 /06995 A1 6 / 1990 2007/ 0065823 AL 3 / 2007 Dressman et al. WO 9709069 AL 3 / 1997 2007 / 0172873 A1 7 / 2007 Brenner et al . WO WO 97 /09069 A1 3 / 1997 2007/ 0178479 AL 8 / 2007 Willis et al. WO 9741254 AL 11 / 1997 2007 /0225487 A1 9 / 2007 Nilsson et al. WO WO 97 / 41254 Al 11 / 1997 2007 / 0264642 AL 11/ 2007 Willis et al. WO 9949079 AL 9 / 1999 2008 / 0014589 AL 1/ 2008 Link et al. wo WO 99 / 49079 Al 9 / 1999 2008 / 0108063 A1 5 / 2008 Lucero et al . Wo 01004360 A3 1 / 2001 2008 / 0125324 A1 * 5 / 2008 Petersdorf ...... C12Q 1 /6813 WO 0177383 A2 10 / 2001 506 / 1 wo WO 01 / 77383 A2 10 / 2001 2008 /0138809 A1 6 /2008 Kapur et al. WO 02057491 A2 7 / 2002 US 10 ,167 ,509 B2 Page 3

References Cited WO 2013093530 A1 6 /2013 ( 56 ) wo 2013109731 A1 7 / 2013 FOREIGN PATENT DOCUMENTS WO 2015048571 A2 4 /2015 WO WO 2002 /057491 A2 7 /2002 WO 0177383 A3 12 / 2002 OTHER PUBLICATIONS WO WO 01/ 77383 A3 12 / 2002 WO 03012119 A2 2 / 2003 Ding et al ., Direct molecular haplotyping of long - range genomic WO WO 03012119 A2 2 /2003 DNA with M1- PCR . PNAS100 ( 13 ) : 7449 (2003 ). * WO 03035671 A2 5 / 2003 Fan et al. , Detection of Aneuploidy with Digital Polymerase Chain WO WO 03 /035671 A2 5 / 2003 Reaction . Analytical Chemistry 79 :7576 (2007 ). * WO 02057491 A3 10 / 2003 Iafrate et al . ,Detection oflarge - scale variation in the . WO WO 2002 /057491 A3 10 / 2003 WO 03012119 A3 11 /2003 Nature Genetics 36 ( 9 ) : 949 ( 2004 ) . * WO 03035671 A3 11 / 2003 Kato et al. , Population - genetic nature of copy numbervariations in WO WO 03 / 035671 A3 11 / 2003 the human genome. Human Molecular Genetics 19 ( 5 ) : 761 ( 2010 . * wo WO 03012119 A3 11 / 2003 Kwok et al. , Single -Molecule Analysis for Molecular Haplotyping . WO 2004044225 A2 5 / 2004 Human Mutation 23 : 442 ( 2004 ) . * WO WO 04 /044225 A2 5 / 2004 Lo et al. , Digital PCR for the molecular detection of fetal chromo WO 2004096825 A 11 / 2004 WO WO 2004 /096825 AL 11 /2004 somal aneuploidy . PNAS 104 (32 ) : 13116 (2007 ) . * WO WO 2005 /038051 A2 4 / 2005 Menzel et al. , Experimental Generation of SNP Haplotype Signa WO 2005038051 A2 5 / 2005 tures in Patients with Sickle Cell Anaemia . PLoS one 5 ( 9 ) e13004 WO 2005054506 A2 6 / 2005 (2010 ). * WO 2005111242 A2 11 / 2005 Michalatos -Beloin et al ., Molecular haplotyping of genetic markers WO WO 2005 / 111242 A2 11 / 2005 10 kb apart by allele -specific long - range PCR . Nucleic Acids WO 2005111242 A3 1 / 2006 Research 24 ( 23 ) : 4841 ( 1996 ) . * WO WO 2005 / 111242 A3 1 / 2006 Mitra et al. , Digital genotyping and haplotyping with polymerase WO 2004044225 A3 4 / 2006 colonies . PNAS 100 ( 10 ) : 5926 (2003 ) . * WO WO 04 /044225 A3 4 / 2006 Ruano et al. , PNAS 87 : 6296 ( 1990 ). * wo 2007008605 A1 1 / 2007 WO 2007037678 A2 4 / 2007 Salem et al. ,Human Genomics 2 ( 1 ) : 39 ( 2005 ) .* WO 2007044091 A2 4 / 2007 Templeton et al. , Genetics 120 : 1145 ( 1988 ). * WO WO 2007 /044091 A2 4 / 2007 Tost et al ., Molecular haplotyping at high throughput. Nucleic Acids WO 2007087312 A2 8 / 2007 Research 30 ( 19 ) :e96 (2002 ). * WO 2007092473 A2 8 / 2007 Davis et al . (Editors ) Basic Methods in Molecular Biology pp . 47 -65 WO WO 2007 /087312 A2 8 / 2007 ( 1991 ) . Elsevier Scientific Publishing, NY, NY . * WO WO 2007 /092473 A2 8 / 2007 Alitalo et al. , Homogeneously staining chromosomal regions con WO 2007100243 A1 9 / 2007 tain amplified copies of an abundantly expressed cellular oncogene WO WO 2007 / 100243 A1 9 / 2007 ( c- mnc ) in malignant neuroendocrine cells from a human colon WO 2007044091 AL 11/ 2007 carcinoma. PNAS 80 : 1707 ( 1983 ). * WO WO 2007/ 044091 A3 11/ 2007 Barber et al ., GAPDH as a housekeeping : analysis of GAPDH WO 2007147076 A2 12 / 2007 mRNA expression in a panel of 72 human tissues . Physiological WO 2007147079 A2 12 / 2007 Genetics 21 : 389 (2005 ) . * WO WO 2007 / 147076 A2 12 / 2007 Braun et al. , Myf- 6 , a new member of the human gene family of WO WO 2007 / 147079 A2 12 / 2007 myogenic determination factors : evicence for a gene cluster on WO 2007147079 A3 3 / 2008 chromosome 12 . EMBO Journal 9 ( 3 ) : 821 ( 1990 ) . * WO WO 2007 / 147079 A3 3 / 2008 Bignell et al. , High -Resolution Analysis of DNA Copy No . Using WO 2007147076 A3 4 / 2008 Oligonucleotide Microarrays. Genome Research 14 :287 ( 2004 ) . * WO WO 2007 / 147076 A3 4 / 2008 Capero et al ., MET and KRAS Gene Amplification Mediates WO 2008063227 A2 5 / 2008 Acquired Resistance to MET Tyrosine Kinase Inhibitors . Cancer WO WO 2008 /063227 A2 5 / 2008 Research 70 ( 19 ) : 7580 ( 2010 ) . * WO 2008063227 A3 11 / 2008 Dalgleish et el . Copy Number of a Human Type I alpha2 Collagen WO 2008134153 Al 11 / 2008 Gene Journaol of Biological Chemistry 257 ( 22 ) : 13816 ( 1982 ). * WO WO 2008 / 063227 A3 11 / 2008 Dressman et al. , Transforming single DNA molecules into fluores wo 2009014848 A2 1 /2009 cent magnetic particles for detection and enumeration of genetic WO WO 2009 /014848 A2 1 / 2009 wo 2009033178 A1 3 / 2009 variations . PNAS 100 ( 15 ) : 8817 ( 2003 ) . * WO WO 2009 /033178 Al 3 / 2009 Kalinina et al . , Nanoliter scale PCR with TaqMen detection . Nucleic WO 2009014848 A3 12 / 2009 Acids Research 25 ( 10 ) : 1999 ( 1997 ) . * WO WO 2009 /014848 A3 12 / 2009 Kallioniemi et al. , ERBB2 amplification in breast cancer analyzed WO 2010009365 A1 1 / 2010 by fluorescence in situ hybridization . PNAS 89 : 5321 ( 1992 ) . * WO WO 2010 /009365 Al 1 / 2010 Kant et al ., Evolution and organization of the fibrinogen locus on WO 2010036352 A1 4 / 2010 chromosome 4 :Gene duplication accompanied by transposition and WO WO 2010 /036352 A1 4 / 2010 inversion . PNAS 82 : 2344 ( 1985 ) . * WO 2011000604 A2 1 / 2011 Kaufhold et al. , Identical Confer High -Level Resistance to WO WO 2011 /000604 A2 1 / 2011 Gentamicin upon Enterococcus faecalis , Enterococcus faecium , and WO 2011034631 A1 3 / 2011 Streptococcus agalactiae , Antimicrobial Agents and Chemotherapy WO WO 2011 /034631 A1 3 / 2011 36 ( 6 ) : 1215 ( 1992 ). * WO 2011066476 Al 6 / 2011 Kilpatrick , CW . , Noncryogenic Preservation ofMammalian Tissues WO WO 2011 /066476 A1 6 / 2011 for DNA Extraction : An Assessment of Storage Methods. Biochemi WO 2011000604 A3 2 / 2012 cal Genetics 40 ( 1/ 2 ) : 53 ( 2002 ) . * WO WO 2011 /000604 A3 2 / 2012 Leamon et al. , A massively parallel Pico TiterPlate . . . based WO 2012116331 A2 8 / 2012 platform for discrete picoliter -scale polymerase chain reactions . WO WO 2012 / 116331 A2 8 / 2012 Electrophoresis 24 :3769 ( 2003) . * WO 2012129436 Al 9 / 2012 Manuelidis et al . Genomic representation of the Hind 11 1. 9 kb wo 2013049443 Al 4 / 2013 repeated DNA . Nucleic Acids Research 10 ( 10 ) : 3221 ( 1982 ) . * US 10 ,167 ,509 B2 Page 4

( 56 ) References Cited Gerdes, et al. Computer - assisted prenatal aneuploidy screening for chromosome 13 , 18 , 21 , X and Y based on multiplex ligation OTHER PUBLICATIONS dependent probe amplification (MLPA ), Eur J Hum Genet . Feb . 2005; 13 ( 2 ): 171 -5 . Murthy et al. Copy Number Analysis of c - erb - B2 (HER - 2 / neu ) and Guo , et al. Simultaneous detection of trisomies 13 , 18 , and 21 with Topoisomerase IIa Genes in Breast Carcinoma by Quantitative multiplex ligation - dependent probe amplification -based real - time Real - Time Polymerase Chain Reaction Using Hybridization Probes PCR . Clin Chem . Sep . 2010 , 56 ( 9 ) : 1451 - 9 . and Fluorescence In Situ Hybridization . Archives of Pathology & Hardenbol, et al. Multiplexed genotyping with sequence -tagged Laboratory Medicine 129 : 39 ( 2005 ). * molecular inversion probes . Nat Biotechnol . Jun . 2003 ;21 ( 6 ) :673 - 8 . Shumaker at al . Mutation Detection by Solid Phase Primer Exten Epub May 5 , 2005 . PMID : 12730666 . sion . Human Mutation 7 : 346 (1996 ). * Hayatsu , et al . DNA methylation analysis : speedup of bisulfite Slamon et al . , Human Breast Cancer : Correlation of Relapse and mediated deamination of cytosine in the genomic sequencing pro Survival with Amplification of the HER - 2 /neu Oncogene. Science cedure . Proc . Jpn . Acad . Ser. B . 2004 ; 80 : 189 - 194 . 235 : 177 ( 1987 ) . * Herzenberg , et al. Petal cells in the blood of pregnant women : Wilke et al. , Diagnosis of Haploidy and Triploidy Based on Mea detection and enrichment by fluorescence -activated cell sorting . surement of Gene Copy Number by Real - Time PCR . Human Proc Natl Acad Sci US A . Mar. 1979 ;76 ( 3) : 1453 -5 . Mutation 16 : 431 (2000 ) . * Komiyama, et al. Catalysis of diethylenetriamine for bisulfite Wu et al. , Comparative DNA Sequence Analysis of Mouse and induced deamination of cytosine in oligodeoxyribonucleotides. Tet Human Protocadherin Gene Clusters . Genome Research 11 : 389 rahedron Letters. 1994 ; 35 ( 44 ) : 8185 - 8188 . ( 2001) . * Lieber. , The mechanism of double - strand DNA break repair by James G . Wetmur et al. , “ Molecular haplotping by linking emulsion nonhomologous DNA end - joining pathway. u Rev Biochem . PCR analysis of paraoxonase 1 haplotypes and phenotypes ” , Nucleic 2010 ; 79 : 181- 211 . Acids Research , vol. 33 , No . 8 , published online May 10 , 2005 , pp . Ng, et al. The concentration of circulating corticotropin - releasing 2615 - 2619 . hormone mRNA in maternal plasma is increased in preeclampsia . Richard Williams et al. , “ Amplification of complex gene libraries by Clin Chem . May 2003 ;49 ( 5 ) :727 - 31. emulsion PCR ” , Nature Methods, published online Jun . 21, 2006 , 6 Panne , et al. The MerBC endonuclease translocates DNA in a pgs . reaction dependent on GTP hydrolysis. J Mol Biol. Jul. 2 , Yolanda Schaerli et al. , “ The potential of microfluidic water - in -oil 1999 ; 290 ( 1 ) : 49 -60 . droplets in experimental biology ” , Molecular BioSystems, vol . 5 , Putney , et al. A DNA fragment with an alpha -phosphorothioate published online Oct. 12 , 2009 , pp . 1392 - 1404 . nucleotide at one end is asymmetrically blocked from digestion by European Patent Office, “ European Search Report " in connection exonuclease III and can be replicated in vivo . Proc Natl Acad Sci U with related European Patent Application No . 12744865 , dated Dec . S A . Dec . 1981; 78 ( 12 ): 7350 - 4 . 15 , 2014 , 9 pgs . Raleigh . Organization and function of the merBC genes of Escherichia James G . Wetmur et al. , “Molecular haplotyping by linking emul coli K - 12 . Mol Microbiol. May 1992 ; 6 ( 9 ) : 1079 - 86 . sion PCR : analysis of paraoxonase 1 haplotypes and phenotypes” , Schouten , et al . MLPA for prenatal diagnosis of commonly occur Nucleic Acids Research , vol. 33 , No . 8 , May 10 , 2005 , pp . 2615 ring aneuploidies. Methods Mol Biol. 2008 ;444 : 111 - 22 . 2619 . Sivaraman , et al . Codon choice in genes depends on flanking Stephan Menzel et al. , “ ExperimentalGeneration of SNP Haplotype sequence information implications for theoretical reverse transla Signatures in Patients with Sickle Cell Anaemia ” , PLoS ONE , vol . tion . Nucleic Acids Res . Feb . 2008 ;36 ( 3) : e16 . Epub Jan . 18 , 2008 . 5 , Issue 9 , Sep . 24 , 2010 , 8 pages. Takai, et al. Comprehensive analysis of CpG islands in human James G . Wetmur et al ., “ Linking Emulsion PCR Haplotype Analy chromosomes 21 and 22 . Proc Natl Acad Sci US A . Mar. 19 , sis ” , Methods in Molecular Biology , vol . 687 , 2011 , pp . 165- 175 . 2002 ; 99 ( 6 ) : 3740 - 5 . Epub Mar. 12 , 2002 . Chan , et al . Hypermthylated RASSF1A in maternal plasma: A Timmermans , et al . Characterization of a meiotic crossover in maize universal fetal DNA marker that improves the reliability of nonin identified by a restriction fragment length polymorphism -based vasive prenatal diagnosis. Clin . Chem . 2006 ; 52 ( 12 ) : 2211 - 2218 . method . Genetics. Aug . 1996 ; 143 ( 4 ) : 1771 -83 . Milbury , et al. PCR -based methods for for the enrichment of Tong , et al . Noninvasive prenatal detection of trisomy 21 by an minority alleles and mutations . Clin . Chem . 2009 ; 55 ( 4 ) :632 -640 . epigenetic - genetic chromosome- dosage approach . Clin Chem . Jan . Aitman , et al. Copy number polymorphism in Fcgr3 predisposes to 2010 ; 56 ( 1 ) : 90 - 8 . Epub Oct . 22 , 2009 . glomerulonephritis in rats and humans . Nature . Feb . 16 , Wang , et al . Analysis of molecular inversion probe performance for 2006 ;439 ( 7078 ) : 851 - 5 . allele copy number determination . Genome Biol . 2007 ; 8 ( 11) :R246 . Balikova. Autosomal- dominant microtia linked to five tandem cop Wu , et al. The ligation amplification reaction (LAR ) _ amplification ies of a copy - number- variable region at chromosome 4p16 . Am J of specific DNA sequences using sequential rounds of template Hum Gent. Jan . 2008 ; 82 ( 1 ) : 181 - 7 . dependent ligation . Genomics. May 1989 ;4 (4 ): 560 - 9 . Bhat , et al. Single molecule detection in nanofluidic digital array Yang , et al. A novel universal real - time PCR system using the enables accurate measurement of DNA copy number. Anal Bioanal attached universal duplex probes for quantitative analysis of nucleic Chem . May 2009 ; 394 ( 2 ) : 457 -67 . acids. BMC Mol Biol. Jun . 4 , 2008 ;9 :54 13 pages . Boorsman , et al . Comparison of multiplex ligation -dependent probe Zejskova, et al . Feasibility of fetal- derived hypermethylated RASSF1A amplification and karyotyping in prenatal diagnosis . Obstet Gynecol . sequence quantification in maternal plasma - next step toward reli Feb . 2010 ; 115 ( 2 Pt 1 ): 297 - 303. able non - invasive prenatal diagnostics . Exp Mol Pathol. Dec . Cappuzzo , et al. Increased HER2 gene copy number is associated 2010 ; 89 ( 3 ) :241 - 7 . with response to gefitinib therapy in epidermal growth factor Caughey , et al. Chorionic villus sampling compared with amnio receptor- positive non -small -cell lung cancer patients . J Clin Oncol. centesis and the difference in the rate of pregnancy loss . Obstet Aug . 1 , 2005 ; 23 ( 22 ) :5007 - 18 . Gynecol. Sep . 2006 ; 108 ( 3 Pt 1 ) :612 -6 . Dahl, et al. Multiplex amplification enable by selective circulariza Nicolaidis , et al. Origin and mechanisms of non - disjunction in tion of large sets of genomic DNA fragments. Nucleic Acids Res. human autosomal trisomies . Hum Repord . Feb . 1998 ; 13 ( 2 ) : 313 - 9 . Apr. 28 , 2005 ; 33 ( 8 ) :e71 . Kane, et al. Rapid , high - throughput, culture -based PCR methods to Fan , et al. Microfluidic digital PCR enables rapid prenatal diagnosis analyze samples for viable sports of Bacillus anthracis and its of fetal aneuploidy. Am J Obstet Gynecol . May 2009 ;200 ( 5 ) :543 . surrogates . J Microbio Methods. 2009 ; 76 : 278 -284 . e1- 7 . Motomura , Kazushi et al ., “ Genetic Recombination between Human Frommer, et al . A genomic sequencing protocol that yields a Immunodeficiency Virus Type 1 (HIV - 1 ) and HIV - 2 , Two Distinct positive display of 5 -methylcytosine resides in individual DNA Lentiviruses ” , Journal of Virology , vol . 82 , No . 4 , Feb . 2008 , pp . strands . Proc Natl Acad Sci US A . Mar. 1 , 1992 ;89 ( 5 ) : 1827 - 31. 1923 - 1933 . US 10 ,167 ,509 B2 Page 5

( 56 ) References Cited Soni et al ., Progress Toward Ultrafast DNA Sequencing Using Solid -State Nanopores, Clin Chem , Nov. 2007 , 53 ( 11 ) , 1996 - 2001 , OTHER PUBLICATIONS Epub Sep . 21 , 2007 . Stewart et al. , Dependence ofMerBC Cleavage on Distance Between Tadmor, Arbel D . et al ., " Supporting Online Material for Probing Recognition Elements , Biol Chem , Apr. -May 1998 , 379 ( 4 - 5 ): 611 - 6 . Individual Environmental Bacteria for Viruses by Using Microfluidic Stewart et al . “ Methyl- Specific DNA Binding by MerBC , A Modi Digital PCR ” , Science , vol. 333 , No. 6038 , Jun . 30 , 2011, 48 pgs. fication - Dependent Restriction Enzyme” , J Mol Biol , May 12 , 2000 , Takehisa , Jun et al. , “ Human Immunodeficiency Virus Type 1 298 ( 4 ): 611 -22 Intergroup (M /O ) Recombination in Cameroon ” , Journal of Virol Sudmant et al ., “ Diversity of Human Copy Number Variation and ogy , vol. 73 , No. 8, pp . 6810 -6820 . Multicopy Genes ” , Science, Oct. 29, 2010 , 330 (6004 ) :641 -6 . Wang , Jianbin et al. , “Genome -wide Single - Cell Analysis of Recom Sutherland et al ., “ MerBC A MultisubunitGTP - Dependent Restric bination Activity and De Novo Mutation Rates in Human Sperm ” , tion Endonuclease ” , J Mol Microbiol, 1992 , 225 :327 - 334 . CELL , vol. 150 , No . 2 , Jul . 20 , 2012, pp . 402 -412 . Syvanen , “ Accessing Genetic Variation :Genotyping Single Nucleo Barz , Wolfgang , Substantive Examiner , European Patent Office , tide Polymorphisms” , Nat Rev Genet, Dec . 2001, 2 ( 12 ): 930 -42 . " Communication Pursuant to article 94 ( 3 ) EPC ” in connection with Takai et al . “ Comprehensive Analysis of CpG Islands in Human related European Patent Application No . 12744865. 2 - 1404 , dated Chromosomes 21 and 22 ” , Proc Natl Aca Sci USA , Mar. 19 , 2002 , Feb . 23 , 2017 , 5 pages . 99 (6 ) : 3740 - 5 , Epub Mar . 12 , 2002. Anwar, Muhammad Zohaib et al . , “ Gene Locater : Genetic Linkage Taylor et al. , “ Functional Copy -Number Alterations in Cancer ” , Analysis Software Using Three - Point Testcross” , Bioinformation , PLoS One , Sep . 11 , 2008 , 3 ( 9 ) : e3179 . Dated Mar . 17 , 2012 , pp . 243 - 245 , vol. 8 ( 5 ) . Timmermans et al ., “ Characterization of Meiotic Crossover in Boni, Maciej F . et al ., “ Guidelines for Identifying Homologous Maize Identified by a Restriction Fragment Length Polymorphism Recombination Evens in Influenza A Virus ” , Plos One , dated May Based Method ” , Genetics, Aug. 1996 , 143 ( 4 ) : 1771 -83 . 3 , 2010 , pp . 1 - 11 , vol. 5 ( 5 ). Tong et al. , Noninvasive Prenatal Detection of Trisomy 21 by an Dear, Paul H ., “Happy Mapping” , BNSDOCID , dated 1997 , pp . Epigentic -Genetic Chromosome- Dosage Approach , Clin Chem , Jan . 95 - 123 2010 , 56 ( 1 ) : 90 - 8 , Epub Oct . 22 , 2009 . Dear , Paul H . et al. , “ Happy Mapping : Linkage Mapping Using a Tosch et al. , “ Large Branched Self- Assembled DNA Complexes” , J . Physical Analogue of Meoisis ” , Nucleic Acids Research , vol. 21, Of Physics : Conference Series, 2007 , 61 : 1241 - 1245 doi : 10 . 1088 / No. 1 , dated 1993 , pp . 13 - 20 , vol. 21 ( 1 ) . 17426596 /61 / 1 / 245 International Conference on Nanoscience and “ The First Office Action for Application No. 2012800176169 ” , The Technology ( ICN & T 2006 ) . State Intellectual Property Office of the People' s Republic of China , UK Combined Office Action and Search Report dated Apr . 7 , 2011 for GB1100787. 9 . dated Jul. 3 , 2014 , 3 pages. Van Opstal et al. , “ Rapid Aneuploidy Detection With Multiplex Malou , Nada et al . “ Immuno - PCR : A Promising Ultrasensitive Ligation -Dependent Probe Amplification : A Prospective Study of Diagnostic Method to Detect Antigens and Antibodies” , Trends in 4000 Amniotic Fluid Samples” , Eur J Hum Genet, Jan . 2009 , Microbiology, vol. 19 , No. 6 , dated Jun . 2011, pp . 295 -302 , vol. 17 ( 1 ) : 112 - 21 . 19 (6 ). Vogelstein et al. , “ Digital PCR ” , Proc Natl Acad Sci USA , Aug . 3 , Pole , Jessica C . M . et al ., “ Single -Molecule Analysis of Genome 1999 , 96 ( 16 ) :9236 -41 . Rearrangements in Cancer ” , Nucleic Acids Research , Mar . 30 , Wang et al ., “ Allele Quantification Using Molecular Inversion 2011, pp . 1 - 13, vol. 39 ( 13) . Probes (MIP ) " , Nucleic Acids Res , Nov . 28 , 2005 , 33 (21 ) : e183 . Canadian Intellectual Property Office, “ Office Action ” in connection Wang et al ., “ Digital Karyotyping " , Proc Natl Acad Sci USA , Dec . with related Canadian Patent Application No. 2 , 826 ,748 , dated Jan . 10 , 2002 , 99 (25 ): 16156 -61 . 10 , 2018, 6 pgs . Weber et al. , " Chromosome -Wide and Promoter- Specific Analyses Wei, Hua et al ., “ The Fidelity Index Provides a Systematic Quantita Identify Sites of Differential DNA Methylation in Normal and tion of Star Activity of DNA Restriction Endonucleases” , Nucleic Transformed Human Cells ” , Nat Genet , Aug . 2005 , 37 ( 8 ) :853 -62 , Acids Research , Mar. 28 , 2008 , pp . 1 - 10 , vol. 36 ( 9 ) . Epub Jul. 10 , 2005 . Wikipedia , “ Genectic Linkage” , Wikipedia , The Free Encyclodpedia , White et al. , Digital PCR Provides Sensitive and Absolute Calibra dated Mar. 20 , 2017 , pp . 1 - 6 . tion for High Throughput Sequencing BMC Genomics, Mar. 19 , Satoh et al . , “ Interim Safety Analysis From TYTAN : A Phase III 2009, 10 :116 , 12 pages . Asian Study of Lapatinib in Combination With Paclitaxel as Second White et al . , “ Retrotransposons in the Flanking Regions of Normal Line Therapy in Gastric Cancer” , J . Clin . Oncol, 2010 , 28 : 15s . Plant Genes : A Role for Copia -Like Elements in the Evolution of Schouten et al. , “ MLPA for Prenatal Diagnosis of Commonly Gene Structure and Expression ” , Proc Natl Acad Sci USA , Dec . 6 , Occurring Aneuploidies” , Methods of Mol Biol, 2008 , 444 : 111 - 22 . 1994 , 91 (25 ) : 11792- 6 . Schouten et al. , “ Relative Quantification of 40 Nucleic Acid Sequences Wong et al. “ A Comprehensive Analysis of Common Copy -Number by Multiplex Ligation -Dependent Probe Amplification ” , Nucleic Variations in the Human Genome” , Am J Hum Genet, Jan . 2007 , Acids Res ., 2002, 30 ( 12 ): e57 . 80 ( 1 ) : 91 - 104 , Epub Dec . 5 , 2006 . Sebat et al ., “ Large - Scale Copy Number Polymorphism in The Wright, “ The Use of Cell - Free Fetal Nucleic Acids in Maternal Human Genome” , Science , Jul . 23, 2004 , 305 ( 5683 ) : 525 - 8 . Blood for Non - Invasive Prenatal Diagnosis " , Hum Reprod Update , Sebat et al. , “ Strong Association of De Novo Copy Number Muta Jan . - Feb . 2009 , 15 ( 1 ) : 139 -51 , Epub Oct . 22 , 2008 . tions With Autism ” , Science, Apr. 20 , 2007 , 316 ( 5823 ) :445 - 9 , Epub Wu et al . , “ The Litigation Amplification Reaction (LAR ) Mar. 15 , 2007 . Amplification of Specific DNA Sequences Using Sequential Rounds Shifman et al. , “ Quantitative Technologies for Allele Frequency of Template -Dependent Ligation ” , Genomics ,May 1989 , 4 ( 4 ) : 560 Estimation of SNPs in DNA Pools” , Mol Cell Probes , Dec . 2002 , 9. 16 (6 ): 429 - 34 . Yang et al. , “ A Novel Universal Real- Time PCR System Using the Shlien et al. , “Copy Number Variations and Cancer ” , Genome Med , Attached Universal Duplex Probes for Quantitative Analysis of Jun . 16 , 2009 , 1( 6 ): 62 . Nucleic Acids” , BMC Mol Biol, Jun . 24 , 2008 , 9 : 54 13 pages . Shlien et al. , “ Excessive Genomic DNA Copy Number Variation in Zejskova et al. , “ Feasibility of Fetal- Derived Hypermethylated the Li- Fraumeni Cancer Predisposition Syndrome” , Proc Natl Acad RASSFIA Sequence Quantification in Maternal Plasma — Next Step Sci USA , Aug . 12 , 2008 , 105 (32 ) : 11264 - 9 , Epub Aug . 6 , 2008 . Toward Reliable Non - Invasive Prenatal Diagnostics ” , Exp Mol Sivaraman et al. , " Codon Choice in Genes Depends on Flanking Pathol, Dec . 2010 , 89 ( 3 ) :241 - 7 . Sequence Information — Implications for TheoreticalReverse Trans Kane et al. , “ Rapid High - Throughput Culture - Based PCR Methods lation ” , Nucleic Acid Res, Feb . 2008 , 36 (3 ): e16 , Epub Jan . 18 , to Analyze Samples for Viable Spores of Bacillus Anthracis and Its 2008 . Surrogates” , J Microbio Methods, 2009 , 76 :278 - 284 . US 10 ,167 ,509 B2 Page 6

( 56 ) References Cited U . S . Appl . No. 13 /400 ,030 , filed Feb . 17 , 2012 , Hindson et al. Aitman et al. , " Copy Number Polymorphism in Fergr3 Predisposes OTHER PUBLICATIONS to Glomerulonephritis in Rats and Humans ” , Nature, Feb . 16 , 2006 , 439 ( 7078 ): 851 - 5. Letant et al. , “ Most- Probable -Number Rapid Viability PCR Method Alkan et al ., “ Personalized Copy Number and Segmental Duplica to Detect Viable Spores Bacillus Anthracis in Swab Samples” , J . tion Maps Using Next -Generation Sequencing ” , Nat Genet, Oct . Microbio Methods , 2010 , 81 : 200 - 202 . 2009 , 41 ( 10 ): 1061- 7 , Epub Aug . 30 , 2009 . Kiss et al. , “ High - Throughput Quantitative Polymerase Chain Reac tion in Picoliter Droplets ” , Anal Chem , Dec . 1, 2008 , 80 ( 23 ): 8975 Balikova , “ Autosomal- Dominant Microtia Linked to Five Tandem 81 . of a Copy - Number - Variable Region at Chromosome 4p16 ” , Am J Office Action dated Sep . 5 , 2014 for U . S . Appl . No. 13 /400 ,030 . Hum Genet, Jan . 2008 , 82 ( 1 ) : 181- 7 . Brenner et al. , “ Gene Expression Analysis by Massively Parallel Barringer et al. , Blunt - End and Single -Strand Ligations by Escherichia Signature Sequencing (MPSS ) on Microbead Arrays” ,Nat Biotechnol, coli Ligase : Influence on an In Vitro Amplification Scheme, Gene , Jun . 2000 , 18 (6 ) :630 - 4 . Apr. 30 , 1990 , 89 ( 1 ) : 117 - 22 . Chan et al ., “ Hypermethylated RASSF1A in Maternal Plasma: A Bennetzen et al ., “ Active Maize Genes Are Unmodified and Flanked Universal Fetal DNA Marker That Improves the Reliability of by Diverse Classes of Modified , Highly Repetitive DNA ” , Genome, Noninvasive Prenatal Diagnosis ” , Clin Chem , 2006 , 52 ( 12 ) :2211 Aug. 1994 , 37 (4 ): 565- 76 . 2218 . Bhat et al ., “ Single Molecule Detection in Nonofluidic Digital Array Conze et al ., “ Analysis of Genes, Transcripts , and via DNA Enables Accurate Measurement of DNA Copy Number” , Anal Ligation ” , Annu Rev Anal Chem ( Palo Alto Calif ), 2009 , 2 :215 - 39 . Bioanal Chem , May 2009 , 394 ( 2 ) :457 -67 . International Search Report and Written Opinion dated Aug . 3, 2012 Bianchi et al. “ Isolation of Fetal DNA From Nucleated Erythrocytes for PCT/ US2012 /025760 . in Maternal Blood” , Proc Natl Acad Sci USA , May 1990 , 87 ( 9 ) :3279 International Search Report and Written Opinion dated Aug. 3 , 2012 83 . for PCT/US2012 / 024573 . Boorsman et al. , “ Comparison of Multiplex Multiplex Ligation Milbury et al. , “ PCR - Based Methods for the Enrichment of Minor Dependent Probe Amplification and Karyotyping in Prenatal Diag ity Alleles and Mutations” , Clin Chem , 2009 , 55 (4 ): 632 - 640 . nosis ” , Obstet Gynecol, Feb . 2010 , 115 ( 2 Pt 1 ): 297 - 303 . Shi et al. , “ Digital Quanitification of Gene Expression Emulsion Brouzes et al ., “ Droplet Microfluidic Technology for Single - Cell PCR ” , Electrophoresis , Jan . 2010 , 31 ( 3 ) :528 - 34 . Tsui et al. , “ Epigenetic Approaches for the Detection of Fetal DNA High - Throughput Screening” , Proc Natl Acad Sci USA, Aug . 25 , in Maternal Plasma” , Chimerism , Jul. 2010 , 1 ( 1 ) : 30 - 35 . 2009 , 106 ( 34 ) : 14195 - 200 . Yoshimoto et al . , “ High -Resolution Analysis of DNA Copy Number Bruch et al. , “ Trophoblast- Like Cells Sorted From Peripheral Mater Alterations and Gene Expression in Renal Clear Cell Carcinoma” , nal Blood Using Flow Cytometry : A Multiparametric Study Involv J Pathol, Dec . 2007 , 213 ( 4 ): 392 -401 . ing Transmission Electron Microscopy and Fetal DNA Amplifica Caughey et al. , " Chronic Villus Sampling Compared With Amnio tion ” , Prenat Diagn , Oct. 1991 , 11 ( 10 ): 787- 98 . centesis and the Difference in the Rate of Pregnancy Lost” , Obstet Cappuzzo et al. , “ Increased HER2 Gene Copy Number is Associ Gynecol, Sep . 2006 , 108 ( 3 Pt 1 ) :612 - 6 . ated With Response to Gefitnib Therapy in Epidermal Growth D ’Alton , “ Prenatal Diagnostic Procedures” , Semin Perinatol, Jun . Factor Receptor - Positive Non - Small -Cell Lung Cancer Patients ” , J 1994 , 18 ( 3 ) : 140 -62 . Clin Oncol, Aug . 1, 2005 , 23 (22 ) : 5007 - 18 . Dhallan et al. , “ Methods to Increase the Percentage of Free Fetal Clark et al ., “ High Sensitivity Mapping of Methylated Cytosines” , DNA Recovered From the Maternal Circulation ” , JAMA , Mar. 3 , Nucleic Acids Res, Aug . 11 , 1994 , 22 ( 15 ) :2990 - 7 . 2004 , 291( 9 ) : 1114 - 9 . Cook et al. , “ Copy -Number Variations Associated With Neuropsychiatric Nicolaidis et al. , " Origin and Mechanisms of Non - Disjunction in Conditions” , Nature , Oct. 16 , 2008 , 455 (7215 ) : 919 -23 . Human Autosomal Trisomies” , Hum Reprod , Feb . 1998 , 13 ( 2 ) : 313 Dahl et al. , “ Circle - to -Circle Amplification for Precise and Sensitive 9 . DNA Analysis ” , Proc Natl Acad Sci USA ,Mar . 30 , 2004 , 101 ( 13 ) :4548 Kalinina et al. , “ Nanoliter Scale PCR With TaqMan Detection ” , 53 , Epub Mar. 15 , 2004 . Nucleic Acids Research , 25 ( 10 ) : 1999 ( 1997 ) . Dahl et al ., “ DNA Methylation Analysis Techniques” , Biogerontol Manuelidis et al ., “ Genomic Representation of the Hind II 1. 9 kb ogy , 2003, 4 ( 4 ) : 233 - 50 . Repeated DNA ” , Nucleic Acids Research , 10 ( 10 ) : 3221 ( 1982 ). Dahl et al. , “ Multiplex Amplification Enabled by Selective Circu Murthy et al. , “ Copy Number Analysis of c -erb - B2 (HER - 2 /neu ) larization of Large Sets of Genomic DNA Fragments ” , Nucleic and Topoisomerase IIa Genes in Breast Carcinoma by Quantitative Acids Res ., Apr. 28 , 2005 , 33 ( 8 ) : e71 . Real- Time Polymerase Chain Using Hybridization Probes and Fluo Dube et al . , “Mathematical Analysis of Copy Number Variation in rescence in Situ Hybridization ” , Archives of Pathology & Labora a DNA Sample Using Digital PCR on a Nanofluidic Device " , PLos tory Medicine , 129 : 39 ( 2005 ) . One , 2008 , 3 ( 8 ) :e2876 , doi: 10 . 1371 / journal. pone .0002876 . Shumaker et al. , “Mutation Detection by Solid Phase Primer Exten Elmore, “ Apoptosis : A Review of Programmed Cell Death ” , Toxicol sion ” , Human Mutation , 7: 346 ( 1996 ). Pathol, Jun . 2007 , 35 ( 4 ) : 495 - 516 . International Preliminary Report on Patentability dated May 30 , Elnifro et al. , “ P . E . Multiplex PCR : Optimization and Application in 2012 for PCT /US2010 /058124 . Diagnostic Virology ” , Clin . Microbiol. Rev. , 13 , 559 -570 ( 2000 ). International Preliminary Report on Patentability dated Jul. 29 , Fan et al. , “ Analysis of the Size Distributions of Fetal and Maternal 2008 for PCT/ US2007 /001796 . Cell - Free DNA by Paired - End Sequencing ” , Clin Chem ., Aug. International Preliminary Report on Patentability dated Sep . 2 , 2008 2010 , 56 ( 8 ): 1279 - 86 , Epub Jun . 17 , 2010 . for PCT/ NL 2007 /000055 . Fan et al. , “ Microfluidic Digital PCR Enables Rapid Prenatal Office Action dated Dec . 5 , 2012 for U . S . Appl. No. 12 /954 ,634 . Diagnosis of Fetal Aneuplpidy ” , Am J Obstet Gynecol, May 2009 , European Search Report dated May 8 , 2013 for EP Application No . 200 ( 5 ) :543 .el - 7 . 10833991 . 2 . Fan et al. , “ Noninvasive Diagnosis of Fetal Aneuploidy by Shotgun Mazutis et al . , “ Droplet - Based Microfluidic Systems for High Sequencing DNA From Maternal Blood ” , Proc Natl Acad Sci USA , Through - Put Amplification and Analysis” , Analytical Chemistry , Oct . 21 , 2008 , 105 ( 42 ) : 16266 - 71 , Epub Oct . 6 , 2008 . American Chemical Society , 2009 , 81 ( 12 ) : 4813 -4821 . Fire et al. , “ Rolling Replication of Short DNA Circles” , Proc Natl Boettger et al. , “ Structural Haplotypes and Recent Evolution of the Acad Sci USA , May 9 , 1995 , 92 ( 10 ) :4641 - 5 . Human 17421. 31 Region ” , Nat Genet , Jul. 1 , 2012 , 44 ( 8 ) :881 - 5 , Fraga et al ., “ DNA Methylation : A Profile of Methods and Appli doi: 10. 1038/ ng .2334 . cations” , Biotechniques, Sep . 2002, 33 ( 3) : 632 , 634 , 636 -49 . Supplementary note for Boettger et al. , " Structural Haplotypes and Frommer et al . , “ A Genomic Sequencing Protocol That Yields a Recent Evolution of the Human 17q21 . 31 Region ” , Nat Genet, Jul. Positive Display of 5 -Methylcystosine Residues in Individual DNA 1 , 2012, 44 ( 8 ): 881 - 5 , doi : 10 . 1038/ ng. 2334. Strands ” , Proc Natl Acad Sci USA , Mar . 1 , 1992 , 89 ( 5 ) : 1827 - 31. US 10 ,167 ,509 B2 Page 7

( 56 ) References Cited Hindson et al ., " High - Throughput Droplet Digital PCR System for Absolute Quantitation of DNA Copy Number” , Anal Chem , Nov . OTHER PUBLICATIONS 15 , 2011 , 83 ( 22 ) :8604 - 10 , Epub Oct . 28 , 2011 . Hochstenbach et al ., Rapid Detection of Chromosomal Aneuploidies Gerdes et al. , " Computer - Assisted Prenatal Aneuploidy Screening in Uncultured Amniocytes by Multiplex Ligation -Dependent Probe for Chromosome 13 , 18 , 21 X and Y Based on Multiplex Ligation Amplification (MLPA ) , Prenat Diagn , Nov. 2005 , 25 ( 11 ) : 1032 - 9 . Dependant Probe Amplification (MLPA )" , Eur J Hum Genet, Feb . Huber et al . , “ High -Resolution Liquid Chromatography of DNA 2005, 13 (2 ): 171 -5 . Fragments on Non -Porous Poly (Styrene -Divinylbenzene ) Par Gonzalez et al. , “ Fluidic Interconnect for Modular Assembly of ticles” , Nucleic Acids Res . , Mar. 11 , 1993 , 21 ( 5 ) : 1061 - 6 . Chemical Microsystems” , Sensors and Actuators, 1998 , B49 : 40 - 45 . Hyman et al ., “ Multiplex Identification of Microbes ” , Appl Environ Gonzalez et al. “ The Influence of CCL3L1 Gene - Containing Seg Microbiol, Jun . 2010 , 76 ( 12 ) :3904 - 10 , Epub Apr. 23 , 2010 . mental Duplications on HIV - 1 /AIDS Susceptibility ” , Science , Mar. International Search Report and Written Opinion dated Apr. 7 , 2011 for PCT/ US2010 /058124 . 4 , 2005 , 307 (5714 ) : 1434 - 40 , Epub Jan . 6 , 2005 . Ji et al. , “ Molecular Inversion Probe Assay for Allelic Quantitation ” , Office Action dated Oct . 3, 2013 for U . S . Appl . No . 12 /954 ,634 . Methods Mol Biol, 2009 , 556 :67 - 87 . Wetmur , James G . et al . , “Molecular Haplotping by Linking Emul Kass et al. , “ How Does DNA Methylation Repress Transcription ? ” , sion PCR : Analysis of Paraoxonase 1 Haplotypes and Phenotypes ” , Trends Genet , Nov . 1997 , 13 ( 11 ) : 444 - 9 . Nucleic Acids Research , vol. 33 , No . 8 , published online May 10 , Kato et al ., " A New Packing for Separation of DNA Restriction 2005, pp . 2615 - 2619 . Fragments by High Performance Liquid Chromatography ” , J Biochem ., Williams, Richard et al ., “ Amplification of Complex Gene Libraries Jan . 1984 , 95 ( 1 ) :83 - 6 . by Emulsion PCR ” , Nature Methods, published online Jun . 21 , Komiyama et al ., “ Catalysis of Diethylenetriamine for Bisulfate 2006 , 6 pages . Induced Deamination of Cytosine in Oligodeoxyribonucleotides” , Schaerli , Yolanda et al. , “ The Potential of Microfluidic Water- In - Oil Tetrahedron Letters , 1994 , 35 (44 ) :8185 - 8188 . Droplets in Experimental Biology ” , Molecular BioSystems, vol. 5 , Kwoh et al. , “ Transcription - Based Amplification System and Detec published online Oct . 12 , 2009 , pp . 1392 - 1404 . tion of Amplified Human Immunodeficiency Virus Type 1 With a European Patent Office , “ European Search Report ” in connection Bead - Based Sandwich Hybridization Format” , Proc Natl Acad Sci With Related European Patent Application No . 12744865 , dated USA , Feb . 1989 , 86 ( 4 ) : 1173 - 7 . Dec . 15 , 2014 , 9 pages. Landegren et al ., “ A Ligase -Mediated Gene Detection Technique ” , Wetmur, James G . et al. , “ Molecular Haplotyping by Linking Science, Aug. 26 , 1988 , 241( 4869 ): 1077 - 80 . Emulsion PCR : Analysis of Paraoxonase 1 Haplotypes and Pheno Larijani et al . , “ Methylation Protects Cytidines From AID -Mediated types” , Nucleic Acids Research , vol. 33 , No . 8, May 10 , 2005, pp . Deamination ” , Mol Immunol, Mar . 2005 , 42 ( 5 ) :599 -604 . 2615 - 2619 . Leng et al. , “ Agarose Droplet Microfluidics for Highly Parallel and Menzel, Stephan et al. , “ Experimental Generation of SNP Haplotype Efficient Single Molecule Emulsion PCR ” , Lab Chip , Nov . 7 , 2010 , 10 ( 21 ) : 2841 - 3 . Signatures in Patients With Sickle Cell Anaemia ” , PLoS One, vol . Li et al. , “ Detection of Paternally Inherited Fetal PointMutations for 5 , Issue 9 , Sep . 24 , 2010 , 8 pages. Beta - Thalassemia Using Size - Fractionated Cell -Free DNA in Mater Wetmur, James G . et al. , “ Linking Emulsion PCR Haplotype nal Plasma” , JAMA , Feb . 16 , 2005 , 293 ( 7 ) :843 - 9 . Analysis ” , Methods in Molecular Biology, vol. 687, 2011 , pp . Lieber, “ The Mechanism of Double -Strand DNA Break Repair by 165- 175 . the Nonhomologous DNA End - Joining Pathway ” , U Rev Biochem , European Patent Office , “ Extended European Search Report " in 2010 , 79 : 181 -211 . Connection With Related European Patent Application No. 12744865 . Liu et al ., “ Rolling Circle DNA Synthesis : Small Circular Oligo 2 , dated Apr. 20 , 2015 , 11 pages . nucleotides as Efficient Templates for DNA Polymerases” , J Am Braun et al. , “ Myf- 6 , A New Member of the Human Gene Family Chem Soc , Feb . 21, 1996 , 118 ( 7 ) : 1587 - 1594 . of Myogenic Determination Factors: Evidence for a Gene Cluster Lizardi et al. , “ Mutation Detection and Single -Molecule Counting on Chromosome 12 ” , EMBO Journal 9 ( 3 ) : 821 ( 1990 ) . Using Isothermal Rolling -Circle Amplification ” , Nat Genet , Jul . Bignell et al. , “ High -Resolution Analysis of DNA Copy Number 1998 , 19 ( 3 ) : 225 - 32 . Using Oligonucleotide Microarrays" , Genome Research 14 , 287 , Lo et al. , “ Noninvasive Prenatal Diagnosis of Fetal Chromosomal 2004 . Aneuploidies by Maternal Plasma Nucleic Acid Analysis ” , Clin Dalgleish et al. , “ Copy Number of Human Type I Alpha2 Collagen Chem , Mar. 2008 , 54 ( 3 ) :461 - 6 . Gene, ” Journal of Biological Chemistry 257 , ( 22 ) : 13816 , ( 1982 ) . Lo et al. , “ Presence of Fetal DNA in Maternal Plasma and Serum ” , Wang et al . , “ Ananlysis of Molecular Inversion Probe Performance Lancet , Aug. 16 , 1997 , 350 (9076 ): 485 -7 . for Allele Copy Number Determination ” , Genome Biol , 2007 , Lupski, “ Genomic Rearrangements and Sporadic Disease ” , Nat 8 ( 11 ) :R246 . Genet, Jul . 2007 , 39 ( 7 Suppl) :S43 - 7 . Guatelli et al. , “ Isothermal, In Vitro Amplification of Nucleic Acids Marguiles et al ., “ Genome Sequencing in Microfabricated High by a Multienzyme Reaction Modeled After Retroviral Replication ” , Density Picolitre Reactors ” , Nature , Sep . 15 , 2005 , 437 ( 7057 ): 376 Proc Natl Acad Sci USA , Mar. 1990 , 87 ( 5 ) : 1874 - 8 . 80 , Epub Jul. 31, 2005. Guo et al ., “ Simultaneous Detection of Trisomies 13 , 17 , and 21 Martienssen et al. , “ DNA Methylation in Eukaryotes” , Curr Opin With Multiplex Ligation -Dependent Probe Amplification -Based Real Genet Dev, Apr. 1995 , 5 ( 2 ) :234 - 42 . Time PCR ” , Clin Chem , Sep . 2010 , 56 ( 9 ): 1451 - 9 . Martienssen , “ Transposons, DNA Methylation and Gene Control” , Hardenbol et al. , “ Highly Multiplexed Molecular Inversion Probe Trends Genet, Jul. 1998 , 14 (7 ): 263- 4 . Genotyping : Over 10 ,000 Targeted SNPs Genotyped in a Single Moelans et al. , " Absence of Chromosome 17 Polysomy in Breast Tube Assay " , Genome Res. , Feb . 2005 , 15 (2 ): 269 -75 . Cancer: Analysis by CEP17 Chromogenic in Situ Hybridization and Hardenbol et al. , “ Multiplexed Genotyping With Sequence - Tagged Multiplex Ligation -Dependent Probe Amplification ” , Breast Cancer Molecular Inversion Probes” , Nat Biotechnol, Jun . 2003 , 21 (6 ) :673 Res Treat , Feb . 2010 , 120 ( 1 ) 1 - 7 . 8 , Epub May 5, 2003 , PMID : 12730666 . Moore et al . , “ Key Features of Cereal Genome Organization as Harris et al ., “ Single -Molecule DNA Sequencing of a ViralGenome ” , Revealed by the use of Cytosine Methylation -Sensitive Restriction Science , Apr. 4 , 2008 , 320 (5872 ) : 106 - 9 . Endonucleases" , Genomics, Mar. 1993 , 15 ( 3 ) :472 - 82 . Hayatsu et al. , “ DNA Methylation Analysis : Speedup of Bisulfate Ng et al . , “MRNA of Placental Origin is Readily Detectable in Mediated Deamination of Cytosine in the Genomic Sequencing Maternal Plasma” , Proc Natl Acad Sci USA , Apr. 15 , 2003 , 100 ( 8 ): 4748 Procedure” , Proc . Jpn . Acad . Ser . B , 2004 , 80 : 189 - 194 . 53. Epub Mar. 18 , 2003 . Herzenberg et al. , “ Fetal Cells in the Blood of Pregnant Women : Ng et al . , “ The Concentration of Circulating Corticotropin Detection and Enrichment by Fluorescence - Activated Cell Sorting” , Releasing Horomone MRNA in Maternal Plasma is Increased in Proc Natl Acad Sci USA , Mar. 1979 , 76 ( 3 ) : 1253 - 5 . Preeclampsia ” , Clin Chem , May 2003, 49 ( 5 ) : 727 -31 . US 10 ,167 ,509 B2 Page 8

( 56 ) References Cited Pinto et al. , “ Functional Impact of Global Rare Copy Number Variation in Autism Spectrum Disorders ” , Nature , Jul. 15 , 2010 , OTHER PUBLICATIONS 466 (7304 ) :368 - 72 , Epub Jun . 9 , 2010 . Putney et al. , “ A DNA Fragment With an Alpha -Phosphorothiaoate Nolte , “ Branched DNA Signal Amplification for Direct Quantitation Nucleotide At One End is Asymmetrically Blocked From Digestion of Nucleic Acid Sequences in Clinical Specimens ” , Adv Clin Chem , 1998 , 33 :201 - 35 . by Exonuclease III and Can Be Replicated in Vivo ” , Proc Natl Acad Nygren et al. , “ Quantification of Fetal DNA by Use of Methylation Sci USA , Dec . 1981 , 78 ( 12 ) :7350 - 4 . Based DNA Discrimination ” , Clin Chem , Oct . 2010 , 56 ( 10 ) : 1627 Qin et al ., “ Studying Copy Number Variations Using a Nanofluidic 35 , Epub Aug . 20 , 2010 . Platform ” , Nucleic Acids Res , Oct. 2008, 36 ( 18 ) : e116 . Olek et al. , “ A Modified and Improved Method for Bisulphite Based Raleigh , “ Organization and Function of the MerBC Genes of Cytosine Methylation Analysis ” , Nucleic Acids Res , Dec . 15, 1996 , Escherichia coli K - 12" , Mol Microbiol, May 1992 , 6 ( 9 ) : 1079 -86 . 24 ( 24 ) :5064 - 6 . Redon et al. , “ Global Variation in Copy Number in the Human Panne et al. , “ The MerBC Endonuclease Translocates DNA in a Genome” , Nature , Nov . 23 , 2006 , 444 ( 7118 ) :444 -54 . Reaction Dependent on GTP Hydrolysis” , J Mol Biol, Jul. 2 , 1999 , Rickert et al ., “ Multiplexed Real - Time PCR Using Universal Report 290 ( 1 ) : 49 -60 . ers ” , Clin Chem , Sep . 2004 , 50 ( 9 ) : 1680 - 3 . Pathak , “ Circulating Cell -Free DNA in Plasma/ Serum of Lung Robertson , “ DNA Methylation and Human Disease” , Nat Rev Cancer Patients as a Potential Screening and Prognostic Tool" , Clin Genet , Aug . 2005 , 6 ( 8 ) :597 -610 . Chem , Oct . 2006 , 52 ( 10 ) :1833 -42 . Ropers, “ New Perspectives for the Elucidation of Genetic Disor Perry et al. , “ Diet and the Evolution of Human Amylase Gene Copy ders ” , Am J Hum Genet , Aug. 2007 , 81 ( 2 ) : 199 - 207 , Epub . Jun . 29 , Number Variation ” , Nat Genet , Oct. 2007 , 39 ( 10 ) : 1256 -60 , Epub 2007 . Sep . 9 , 2007 San Miguel et al. , “ Nested Retrotransposons in the Intergenic Pielberg et al. , “ A Sensitive Method for Detecting Variation in Copy Regions of the Maize Genome” , Science , 1996 , 274 : 765 . Numbers of Duplicated Genes” , Genome Res, Sep . 2003 , 13 (9 ): 2171 * cited by examiner U . S . Patent Jan . 1 , 2019 Sheet 1 of 60 US 10 , 167, 509 B2

OBTAIN SAMPLE OF POLYNUCLEOTIDES

PHYSICALLY SEPARATE TARGET SEQUENCES

SEPARATE SAMPLE INTO APLURALITY OF PARTITIONS

ENUMERATE PARTITIONS WITH TARGET SEQUENCE

2012 ESTIMATE COPY NUMBER OF TARGET FIG . 1 U . S . Patent Jan . 1 , 2019 Sheet 2 of 60 US 10 , 167, 509 B2

42

2 009090 centromere ** PSE 97

. . 2 SE ADASINAST VERSOS 02 . 845 ERK os FREZE,

22

Target sequence AR SA centromere EFFOGATESGES SESSE 2018 - 1 :12 Lates . A CE10 ! SHOWS 1723 RS FUNST DOXTHEN NARA . LA . . GAN ZA BESTELLENmit CHASE Target sequence ENVERLOHELUST DER Figure2

LUGA AVANTURA centromere XLOTT ESFW SK

.

. HOTELS centromere SWART CA

32

ES MA VUK

1 . Target sequences AYAM TECAREER U . S . Patent Jan . 1 , 2019 Sheet 3 of 60 US 10 , 167, 509 B2

* * *

Separatesecondsubsampleintoa Determinelinkageofthetarget Figure3a Physicallyseparatetarget sequencesinafirstsubsample Separatefirstsubsampleinto Estimatecopynumberoftargetsubsamplefirstthesequencein pluralityofpartitions Estimatecopynumberoftarget sequenceinthesecondsubsample sequence 1 . Separateasampleintosubsamples apluralityofpartitions

320 ????????????????????????????????????????????????????????? 322 324 326 328 330 332 334 atent Jan . 1 , 2019 Sheet 4 of 60 US 10 , 167 ,509 B2

ESTIMATECOPYNUMBEROFTARGET DONOTPRE-AMPLIFYTHESECOND SUBSAMPLE PARTITIONTHENON-AMPLIFIEDSECOND SUBSAMPLEINTOMULTIPLEPARTITIONS SEQUENCEINTHESECONDSUBSAMPLE

w

???????????????????????? ????????????????????? 348 350 352

CORRELATETHECOPYNUMBERFROM 1 OBTAINSAMPLEOFPOLYNUCLEOTIDES DMDETHESAMPLEINTOATLEASTTWOSUBSAMPLES FIRSTANDSECONDSUBSAMPLESTODETERMINELINKAGE FIG.3B

wwwwwwwwwwwwwwwwwwwwwwwwwww wwnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn 3402 WITHSHORTCYCLEPCR PARTITIONTHEPRE-AMPLIFIEDFIRST ESTIMATECOPYNUMBEROFTARGET 354 PRE-AMPLIFYTHEFIRSTSUBSAMPLE SUBSAMPLEINTOMULTIPLEPARTITIONS SEQUENCEINTHEFIRSTSUBSAMPLE 336

344 346 U . S . Patent Jan . 1 , 2019 Sheet 5 of 60 US 10 , 167, 509 B2

-

I 26 30 OR POTENTIALHAPLOTYPES -G 40 OBTAINSAMPLE PARTITIONSAMPLE AMPLIFYPOLYMORPHICLOCI COLLECTALLELESPECIFICAMPLIFICATIONDATAFOREACHLOCUS CORRELATETHEDATABETWEENLOC SELECTHAPLOTYPE GENOTYPE *** TTTTTTTT Figure4 Figuresabout CELL Figure6 60% 77274 U . S . Patent Jan . 1 , 2019 Sheet 6 of 60 US 10 , 167, 509 B2

A

frin'

? ' o sivee 118 OX VICROXSIGNAL ?138134 O AAAA toe meetingROX COLLECTDATA106 . D FORMDROPLETS .

. 10 -400 COM.FAM ww |hoàn 140tem 22 116 AND SIGNAL CORRELATEDATA w 1324 A 22 126 ROX wennernameyang 13Ovo 3 wwwwwwwwwww AMPLIFYALLELESEQUENCES 1 Figure7-82 OBTAIN SAMPLE FAM the

mining mmminiminiminimin m M U . S . Patent Jan . 1 , 2019 Sheet 7 7 of 60 US 10, 167, 509 B2

VIC 0000000000000000000 0 10000000000000000000 ROX to + FAMOWEYVIC.Rox ONLY Figure8 ot coded FAM ROX remote +

L LLLLLLL LLLLLLLL 1607 FAM ONLY

000000000000000000000000 000000000000000 ...... GREATER NUMBER OF POSITIVEDROPLETS FEWER U . S . Patent Jan . 1 , 2019 Sheet 8 of 60 US 10 , 167, 509 B2

Enumeratepartitionswithtarget1and betweentarget1and2 ** * target2 Figure9 Partitionsample . Obtainsampleofpolynucleotides10 Usealgorithmtopredictfragmentation

M

.* 900 M* 920 940 960 980 U . S . Patent Jan . 1 , 2019 Sheet 9 of 60 US 10 , 167, 509 B2

OS organice- T2 Figure10 or100Kbases/bp maybesub T1andT2linked;T1fragmentT2fragmentSSGsag 1Kor10K

linkednotT2andT1 T1

A B " U . S . Patent Jan . 1 , 2019 Sheet 10 of 60 US 10 , 167, 509 B2

????????????????

VANNAAAAAAAAAAAAAAAAA AmplifysequencesthatflanktheCpG Figure11 enzyme Partitionsample island Determineifflankingsequencesarein Determinemethylationstatus Obtainsampleofpolynucleotides sensitivemethylationwithTreat vvvvvvvvvvvvlat thesameordifferentpartitions

1

wwwww VAN VIIVVVVIVAL 1100 1110 1120 1130 1140 1150 1160 U . S . Patent Jan . 1 , 2019 Sheet 11 of 60 US 10 , 167, 509 B2

po VIC

0 . 0 . 0 . 2 Unmethylated CpGisland TS CpGisland FAMandVICindifferentpartitions FAM

t Figure12 Digestwithmethylationsensitiverestrictionenzymes ddPCR VIC

Methylated Me CpGisland Me CpGisland Me CpGisland FAMandVicinsamepartition FAM NUN U . S . Patent Jan . 1 , 2019 Sheet 12 of 60 US 10 , 167, 509 B2

Cost/100units $12.20 $6.10 $12.80 $10.00 $13.20 306.$ $77.33 LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL L * FIG.13

#Cuts 26 24 24 18

EnzymePicker:MRGPRX1CNV AmpliconLength:70bp MRGPRX1CNVAssay: VisualizeRestrictionSitesRRRRRRRRR Sequence#1 PrefixLength:1000bpSuffixLength:1000bp SingleDigests R FastDdel MpyF31 Name Nall Ddel Mbol Tsel LLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLLL U . S . Patent Jan . 1 , 2019 Sheet 13 of 60 US 10 , 167, 509 B2

AmpliconCuts:0 PrefixCuts:6 SuffixCuts:4 Results Cutsbothsides UU

RS

FIG.14 Selectanenzymeandpress'Cuttodetermineifcutswithintheamplicon(areen)orsurroundingregionsred,blue. Youcanhoveroverthecolorbarwithyourmousetoseesequenceitself,andwhetherornotcutscomecloseamplicon.Red colorbarsindicatearestrictionsitemaybealteredbymutation;greenaresitesthatemergeduetomutations.

SequenceFASTACodes:chr1118955710*18956379,-

39.33% Haelll GGCC 89 CutterAnalysis:MRGPRX1CNV AmpliconLength:70bp CNVMRGPRX1Assay: PickRestrictionDigests PrefixLength:300bpSuffixLength:300bp FragmentLength: Fragment%GC: MAAN TestCutters Enzyme: Sequence: U . S . Patent Jan . 1 , 2019 Sheet 14 of 60 US 10 , 167, 509 B2

www .

ww

w

ve

Figure15A wwwwwwwwww 0 AN 11 www 11 1

EnterBySNP p wwwwwww tuttutustutatutartphotosho ettetettelee M VELLA internalSitespage MRGPRX1CNV www EnterByLocation indegraa:Celsius EditAssay:MRGPRX1CNV ITTAAGCTTCATCAGTATCCCCCA TCAAAGTAGGAAAACATCATCACAGGA JACCATCTCTAAAATCCT FAM MGB SecondaryStructure?: ForwardPrimer: ReversePrimer: OptimalAnnealTemp: ReferenceSource: EnteredBy: EnterByPrimer ProbeSequence: Quencher: *Name: Gene: Dye:

MA inn U . S . Patent Jan . 1 , 2019 Sheet 15 of 60 US 10 , 167, 509 B2

Actions Edit Edit Reporter Ryank Ryank Ryank Ryank Ryank

Stored Stored Stored Stored Stored FIG.15B

ForwardPrimer:TTAAGCTTCATCAGTATCCCCCA43%GC61.3°MAG000ReversePrimer.CAAAGTAGGAAAACATCATCACAGGA38%GC613°MAG000 5'-FAMACCATCTCTAAAATCCTMGB335%GC68.5°MAG000 SampleReferenceCNV#EvidencePlateOther

GCAG0.001895607970bp)39%(chr1118956010-Region:Amplicon - on2010/1001 #:5'-caaagtaggaaaacatcatcacaggatagaggattagagatggtatgggggatactgatgaagcttaa3-;5TTAAGCTTCATCAGTATCCCCCAtaccatctctaaaatcctctaTCCTGTGATGATGTTTTCCTACTTTG3 WN

Sequence(h919refseg);

ConfirmedSampleCNVS - EnteredBy: Match#1: MRGPRX1CNV ear Probe: NA10851 NA11994 NA12155 NA12872 NA12873 U .S . Patentatent damJan. .1 .1 , 20192019 Sheet 16 of 60 US 10, 167 , 509 B2

: :

P OpticalReader OTKU 3.ReadDroplets VIVA

23 13 : :: : : :: : : : :: : :: : :: : :: : :: :: :: : :: :: : : :: : : : : : : : : : : : : : :: :

VE'.

22 .

. . .

Figure16 ta 2.CycleDroplets BulkPCRThermocycler 29:

I.

21 Tutti :: : : : :

: 1.MakeDroplets DropletGenerator Sy U . S . Patent Jan . 1 , 2019 Sheet 17 of 60 US 10 , 167, 509 B2

2x A . 2

.

9 X PA Figure17 YAYINtha Oil/SampleRatio-1.4 TAK 4? WAAR ** ** * VY 1psinom * Oil/SampleRatio=2. * * 4 * 1psi 70 - lama KAWur he 3 *

Sample:MasterMix/NTC U . S . Patent Jan . 1 , 2019 Sheet 18 of 60 US 10 , 167, 509 B2

titi 1400

www. 1200

teetit 1000 twit weetChipShop,35ng/ulDNAr=1.4 samoistfamousChipshop,SDbufferr=1.4 DNA18ng/ul,r=1.4 DNA18ng/ul,r=2. wayMasterMix,r-1.4 sevesbenNTCOS,*2. 800 www Figure18 SampleFlowRate,nl/s 600 YANMAR monD 400 Thehigheroil/sampleratioshiftsthejettingthreshold Richey AAAM ALALA

200

0 1200 um ,breakoff before Extension Max U . S . Patent Jan . 1 , 2019 Sheet 19 of 60 US 10 , 167 , 509 B2

75 ul)(ng/ Rajíundig37.5 Rajíundig7518.| Rajiundig3.75 19205undig37.51 19205undig18.75 19205undig3.75 Rajidigested37.51 18.75Rajidigested 3.75Rajidigested 37.519205digested 18.7519205digested 3.7519205digested! Rajiundig75 19205undig75 19205digested75 019205digested Conc8SampleType Rajiundig 19205undig Rajidigested Rajidigested LE MINNAN

XYV rberginhandiagonale bagongpagkataposanginterna extension)N=normal(nojeftingor

3.5 e

h deild J=jetting E-extension ANAN papantang up nghargaham LLA shah

! FIG.19 wwwwwwwwwwwww 2.5 wir

w

y

b . my phongyuting W nginmeng 1 V X fotogrammiRA 5,1

XXXXXXXXX i VVVA LAAAAA

V artimin ...... /

V ... . h . . ????? INI . XXX wwwwww AAAAAA IN NN *** * *

15N sample# nompsi 10

LAAAAAAAAAAAA w ymienione U . S . Patent Jan . 1 , 2019 Sheet 20 of 60 US 10 , 167 , 509 B2

AVY AARA PONOVO OTONOTONAPONOROUVOD Www

Beta03CNVcommandomniummedforunhamnummmmmmmmmmmmandcomment Beta06CNV

eryVery mamlanma hindi

2

TA ettebb wwwwwwwwww hemmettetett Figure20A Beta02CNVoverymanagemensammaundangannya Beta05CNV

My

ESE N ywise Beta01CNV Beta04CNV PA mig

W U . S . Patent Jan . 1 , 2019 Sheet 21 of 60 US 10 , 167, 509 B2

TETTEN Beta03CNV Beta06CNV ???

toistuuduttore

7

womenandmakananmen KETE Figure20B Beta02CNVannarsmore Beta05CNV

andwheremoreonnowwomenand VA 2 . 1 - . WWW

-

ANA Beta01CNV Beta04CNV www U . S . Patent Jan . 1 , 2019 Sheet 22 of 60 US 10 , 167, 509 B2

P KOK KONNEKTÖR KOMMER MED NON NONNONKÉPONDRON

H07- alul - 12873

A04 -alul - 12872 9.82 Male Female H04- alul - 18516 71.7 OS H01- alul -18573

1.1. . 16798.7 F04 -alul - 18502 F01- alul -18501 058. STESSO E04 -alul - 19239 02.8 C04- alul -19221 Figure21 6.89 c Sample CopyNumberVariation Cosyo 004 -alul - 18853 G01- alul - 10851 .8589.156* OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO B01- alul - 18537 MA C01- alul - 19205

SED USA F07- alul - 18854 )(AChinese Caucasion(B) 3.86 -EEEEEEEEEEEEEEN ASSES B04 -alul - 11994 oueqnjosSO 3.9788 krivivivivivvvvvvvv G07- alul -18552

#

Number Copy U . S . Patent Jan . 1 , 2019 Sheet 23 of 60 US 10 , 167, 509 B2

18853 18854 digestedwith 19239 Samples Msel 18502 19204 19221 art 18516 18507 M HHHH 18552 ANANANANANANANANANANANANANANANV 18537 Sample Figure22 NARA 19108 nawanananananananananananananawiamariwinierijumimiminiwawimwi 19205 - 18916 19109

a ênk?tcáchlinhtinhlinh NNNNN 18501

i

L 11994 - 10851

- 12872 12873

bei werden

? Attention 18573 Number Copy CCL3L1 U . S . Patent Jan . 1 , 2019 Sheet 24 of 60 US 10 , 167, 509 B2

6615 CN2 NA19108 0620412. CN2 NA19108 CN2 NA19108 1.95 CN2 NA19108 204 CN2 NA19108 €099 CN2 NA19108 87.5 CN6 NA18916 1972 CN6 NA18916 + 6699 CN6 NA18916 82.5 CN6 NA18916 CN6 NA18916 50932%8,5 89.4 CN6 NA18916 tit . . 44• + CN5 NA19205 CN5 NA19205 99.0245. CN5 NA19205

.

. CN5 NA19205 81.894.403.1 CN5 NA19205 +44 CN5 NA19205 01.5* CN4 NA19221 CN4 NA19221 COPYNUMBERVARIATION *01.476.32 182.3 CN4 NA19221 CN4 NA19221 FIG.23A CN4 NA19221 96.2 CN4 NA19221 . CN3 NA18502 11 CN3 NA18502 CN3 NA18502 . CN3 NA18502 02.3053. 89.2 CN3 NA18502 . .. CN3 NA18502 922.7.2991.2 CN2 NA18507 99.1871.96.1 CN2 NA18507 CN2 NA18507 CN2 NA18507 1 .. CN2 NA18507 CN2 NA18507 4.N

+ 96.1 IT CN1 NA11994

I+ .1ooo0.954968 + CN1 NA11994 ba 993.0994.0 CN1 NA11994 1.03 CN1 NA11994 CN1 NA11994 1.01 ad CN1 NA11994 NG27 NUMBER COPY U . S . Patent Jan . 1 , 2019 Sheet 25 of 60 US 10 , 167, 509 B2

???????????} CN2 NA19108 NaMw 201

a Mmmmm wwwwwwwwwwwwwwwwww CN6 NA18916RON 1 mmmeinwwwwww 5.92 Koledandantelange com as CN5 NA19205

wwwwwww am 01922 -23.94777 CN4 NA19221 SAMPLE FIG.23B klubaduudadangan •--**4

4 wwwwwwwwwwwwww 4 COPYNUMBERVARIATION C comowwwwwwwwwwwwwwwww CN3 NA18502

o ????????? -2.96

nowhow CN2 NA18507

the

. on mmmmmmm-1.96 wwwwwwwwwwAm . 00out WA ja buin in die alimenten mingine niin limitingin mimi toimia nimine mimi wa waimbaji NA11994

do00 1-0.996

ww NUMBER COPY U . S . Patent Jan . 1 , 2019 Sheet 26 of 60 US 10 , 167, 509 B2

3- 18507 3 -18507 30 -18507 06 13.12531253.15829893.0153035 10 . 3 -19108 3 -19108

.ttttteet 3- 19108 25 2 - 18916 4tttt4400977 2- 18916 2- 18916 2062.08 SAMPLE FIG.24A COPYNUMBERVARIATION tttttt 2- 10851 2 -10851

.deceteoorP 2- 10851 2- 10851 &$oooo9 2 -10851 EVOTTO ./2172.07521552.05252.0532113.2200

. 2 -10851

+.t¢¢¢990097 1 -18573 .¢¢¢Goor 1 -18573

€¢¢ 1¢¢¢¢¢o000 1 -18573

HAN NUMBER COPY U . S . Patent Jan . 1 , 2019 Sheet 27 of 60 US 10 , 167, 509 B2

EYLE000 18507-3F07

.tt

1

•44

???? 19108-3E07 3.13min C

¢¢¢¢¢4

>

E$¢¢ 18501-2C0718916007 VARIATIONNUMBERCOPY SAMPLE FIG.24B

2.06

HES¢¢¢o0> 10851-2B07 2.03

. 14¢¢¢©0 1+

. . . 11 cesco04901> CO99991 . C • 4¢¢¢OOOO ! 18573-2A07 .tettico get&¢¢¢¢¢ 2.11

3

NUMBER COPY U . S . Patent Jan . 1 , 2019 Sheet 28 of 60 US 10 , 167, 509 B2

55. 0 55. 8 57. 8 60 .4 62. 9 64 .6 16000 DO2 D04 D06 008 010 012 140001 12000 10000 :24 : SMN2 440 8000 . 2 ES FAMAmplitude 60004 .2 4000 3076 2000 20000 140000 160000 CONC: 312 VIC PO $: 42733 NEG: 116506 D01 D02 D03 D04 D05 D06 DO7 D08 D09 010 011 012

VICAmplitude SMN1

20000 40000 60000 120000 140000 160000 EVENT NUMBER CORIELL SAMPLE 18507 DIGESTEDwww E WITH CVIQI FIG . 25 U . S . Patent Jan . 1 , 2019 Sheet 29 of 60 US 10 , 167, 509 B2

A FAM CNV: 0 .582 CONC: 166BURI FAM POS : 25474 NEG : 141112 55. 0 55 .8 57. 8 60. 4 629 . 64 .6 14000 12000 10000 KI! 16M 45 V . 241 ex WON9 SMN2 . :: R . SeBIRON :: 10 8000 X2 FAMAmplitude 6000 60004000 2000 - m ade the 00007 40000 60000 80000 100000 120000 140000 160000 Web

CONC: 571 VIC POS: 72445 NEG :94141 A E01 E02 E03 E04 E05 E06 E07 E08 E09 E10 E11 E12

VICAmplitude TERT

20000 40000 60000 00008 120000 140000 160000 EVENT NUMBER FIG . 26 U . S . Patent Jan . 1 , 2019 Sheet 30 of 60 US 10 , 167, 509 B2

153A

132A 121A 106A #2222222222222 89A 65A SMN2COPYNUMBER 56A

4.0 40A 25A 15A

wiiiiiiiiiiiii 1.023 NUMBERJOY COPY SMN2 SMNICOPYNUMBER FIG.27 153A 141A ANANANANANAN. 132A ?????????? 121A 0. 106A 89A NUMBER COPY SMN2 SMN1COPYNUMBER ERRORBARSINDICATE95%CONFIDENCEINTERVALSOF 2222222222 00mithommiamiben EACHCNVMEASUREMENT NUMBER COPY SMN1 U . S . Patent Jan . 1 , 2019 Sheet 31 of 60 US 10 , 167, 509 B2

Figure 28 Her 2 Duplex # 2 Duplex # 1 CNV = 2 Centromere » Duplex # 3 Tert (ref cene ) SMN 21 SMN 111 CNV = 2 CNV = ? CNV = ?

" FAM + Vic + :35456 FAM + Vic -: 125636 FAM 26000 Vic + :74068 FAM - Vic -: 231645 " R2 = 0 .9998 9 Measured(SMN2/SMN1) SMN2(FAM)FAMAmplitude XX HHHHH

12 23 Estimated (SMN2 / SMN1 ) o 1440 2880 4320 5760 7200 VIC Amplitude SMNI ( VIC ) (SMN2 , SMNI) combinations Theoretically yielding same ratio possible ratios

ratio 1 0 , 1; 0 ,2 ;VY 0 ,3 ; 0 ,4 ratio2 0 . 25 ratio3 1 , 3 0 .33 ratio4 1 , 2 ; 2 , 4 0 .5 ratio5 2 , 3 0 .66 EstimatedSMN2CopyNumber ratio6 3. 4 0 .75 ratio7 1 , 1 ; 2 , 2 ; 3 , 3 ; 4 , 4 ratio8 4 , 3 1 .33 ration 3 , 2 1 .5 0 1 2 3 4 5 ratio 10 2 ,1 ; 4 ,2 Estimated SMN1 Copy Number ratio11 3 , 1 U . S . Patent Jan . 1 , 2019 Sheet 32 of 60 US 10 , 167, 509 B2

100

...... w

.

.

.

.

.

.

.

Tumor. Normal . .

.

. . 10 . i . . . . ------ERBB2Copynumber

. Figure29 .

. . . I

.

.

.

.

.

.

.

.

.

. mundo wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww 100spamman OL number Copy GRB7 U . S . Patent - Jan . 1 , 2019 Sheet 33 of 60 | US 10 , 167 , 509 B2

UNCUT CUT NA18507 UNCUT CUT NA19108 UNCUT CUT NA18853

? COPYNUMBERVARIATION UNCUT SAMPLE FIG.30 CUT D1234152 UNCUT CUT 01234106 UNCUT .???????????|2 CUT ????????????? 01234090 CUT=Rsai NUMBER COPY U . S . Patent Jan . 1 , 2019 Sheet 34 of 60 US 10 , 167, 509 B2

MRG RPP30

T .

160000 *

*

* 140000 140000 1 RE * * Figure31 FluorescenceIntensityat25cycles 120000 120000 TET 100000 100000

www E02503E04FOZF03F04G02G03G04HOZHO3 80000 EOZE03504F02F03F04GO2G03G04HO2HO 80000

Y - EventNumber EventNumber * 00009 MAXX 00009 * ABS Shoulisha Conc:67.8VICPos10707Neg152605 CNV:2.87Conc973FAMPos15147Neg148165 Lt fanum 40000 40000

months 20000 20000 famamo1000 4800- ws2880ron 960 1920 Amplitude FAM Amplitude VIC U . S . Patent Jan . 1 , 2019 Sheet 35 of 60 US 10 , 167, 509 B2

21 11 VICSignal(RPP30) 579 : : : : : : : : : : : : : : : :: : : : : : : : : : : : : : : : : BC106 AAAAAAAAAAArray

+

+ + www :: : : : : : : : : : : : : : + 584 w 19108 YAAYYYYYversveree w :: : : : :: :: : :: : :: :: :: : : : 589 : : : : : : : : : : : : 28853

: : : : : : : : : : : : : : : :: AverageFluorescenceofTriplicateMeasurementsatcycle25 Awwwwwww 589 : : : : : : : : : : : : : 18507

*

. . . P :: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : :: : 23 : 59 : 55 : 9977777773 hWwWliv : : : 1111111111 Figure32 3218 ...... : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : . : : BC106

V : : : : : : : : : : : : : : : : : : :: : : : : : : : : : : : : : : : :: : : : : : : : : : : : : : :: : :: : : : : . :: : : : : 3333333333333333333333333 :: 3445 ...... 19108

Xo FAMSignal(MRGPRX1) ...... :: : : : : : : : : : :: : : : : : : : : : ...... 3318 ...... 18853

: : : : : : : : 2355555599977329793 :: : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : : 33333333333333333333 3163 ...... : :. : . :. : . :. . : : : : : :: :: : : : : :: : : : : : : : :: : : : : : : : : :: :: : :: : : 18507

wisie 4000 3500 3000 2500 2000 1500 1000 500 Fluorescence U . S . Patent Jan . 1 , 2019 Sheet 36 of 60 US 10 , 167, 509 B2

MRGMRG RPP30

004 . 00009 } 004 . 160000 2 5 IR PA3 140000 002003 MU 140000 004DOZ003 > - 20000 1 - - 20000 1 FluorescenceIntensityat28cycles - Figure33 CO2C03 100000 05032 100000 1501 . 804 . EventNumber EventNumber 803 BOOK AQ2A03ADBOZB03BugCO2C03004 CNV:1.82Conc974FAMPos15672Neg153130 20 Conc:107VICPos17105Neg151697 WS . 40000 .ELIT 40000 A02A03A04802 - 0000Z 000022002 48007 1400mm 1200 1000 Amplitude FAM Amplitude VIC RFU RFU U . S . Patent Jan . 1 , 2019 Sheet 37 of 60 US 10 , 167, 509 B2

COPY NUMBER W

MRGPRX1(CON) 87.2RPP30(CON) MRGPRX1CN Sasa:444*+ 90.4 08.2 444444444444444444 $ 3000 > > 620099 BC106

11117 MRGPRX1CN 0000409989995 1.68 +PP 19108 RPP30107 89CON.9MRGPRX1)( ddPCRCONCENTRATION/COPYNUMBERVARIATION MRGPRX1CN SAMPLE FIG.34 V99 utsa666666644 18853

> RPP30(CON)134 MRGPRX1(CON)111 Iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii CON)RPP30( * *

* 2.97

ECCLELLE ??????? 18507 4.98 2.031 *4¢¢¢¢11. MRGPRXI(CON) MRGPRX1CN

*80

+100

+

AAAAAA A AAAAAAAAAAAAAAAAAAAAAAA ) UL/ COPIES( CONCENTRATION U . S . Patent Jan . 1 , 2019 Sheet 38 of 60 US 10 , 167, 509 B2

BC106

mg5gmwnmgzm . . . . . * . * . * . . " . 19108 VICSignal(RPP30) tuvatMuut 810mmccccccmann 18853

AverageFluorescenceofTriplicateMeasurementsat28cycles 762 : 18507 Figure35 NAALALLAVALAVA 3185 8C106 winter WEEEEEEEEE222712 : : : : : : : : : : : : : : : : : : : 3372 2323XXXXX2ZKER * * * * * * * 19108 FAMSignal(MRGPRX1)

3196 3 18853

3001 18507

mm 4000 3500 3000 2500 2000 1500 1000 500 Fluorescence U . S . Patent Jan . 1 , 2019 Sheet 39 of 60 09 B2

2632 : : : : : : : : : : : : : : : :: :

T : : : : : : : : : : : : : : : : : : : 1 : 31 : 33 55 :

- 291027152878 . 1850718853191088C106BC106 :: : : : : : : :: : : : : : :: : 4104 HET : : : : : : : : : :

40cycles a53 : 33 : 15 : 3 2 4495 : : : : : : : : : : : : : : : : : : : : : 1956 ...... AverageFluorescenceforTriplicateMeasurements

: : : : : : : : : 1998 ...... 3.4127 www

:: : : : : : : : : : : : : : : : : ... : :

MYYYYYYYYYY . 1924 T4604 . 1877 185071885319108BC106 Fluorescence Figure36 31cycles : : : : : : : : : : : : : : : : : : : : : : : : :: : : : : : : : : : :: : AverageFluorescenceforTriplicateMeasurements . ALANAN 46744723 : : : : : : : : : : : : : : : ::

W

: : : : : : : : : : : : : : : : : : : Muhunuttum46644694 6000T540546856045278 5000 4000 3000 2000 1000 1850718853191088C106BC106 Fluorescence : : 21 34cycles ?????? AverageFluorescenceforTriplicateMeasurements 863384838440 23333333333333333 19TEE3323333333

8252 : :: : : : : : : : :: . 10000 8000 6000 4000 2000 Fluorescence U . S . Patent Jan . 1 , 2019 Sheet 40 of 60 US 10 , 167 ,509 B2

3

E :

10 9

8 TETESE5312ZLER

9 7 232323777777777777XXXXXXXXXX 8 XSEX* 6 7 MRGPRX1 10.

6 Figure37 9.

5 MRGPRX1 8 37555555555555555555555555533333333333333333

4 .: 00000000000000000 MRGPRX1 7 . 3137137237 *3 3

2 12?45675 2

1 1 EE3333333333EEEEEEEEEE 11:17 A 23 B 2: U . S . Patent Jan . 1 , 2019 Sheet 41 of 60 US 10, 167 , 509 B2

-Nobandsuggestsnodeletionispresentand PCRfailedtoamplifythe25KBamplicon, sincetheconditionsonlyweregoodfor momme-Bandindicatesthepresenceofadeletion. ampliconsofsmallersize. MRGPRX1PCRproduct Long-rangePCR

Figure38 Cis CNV=2 18853 19108 Trans CNV=2 18507 19239 CNV=1 18573 11994

.

1000-www.hmmm 700-www 10380 3000-1500- U . S . Patent Jan . 1 , 2019 Sheet 42 of 60 US 10 , 167, 509 B2

Bandindicatesthepresenceofadeletion. MRGPRX1PCRproduct

H S6 (2,0) ND- S6 3 RE- SO 2.7Kb ALAW ARA (2.0) ND . S5

E RE- S5

istitih HeidettiinP ND - S4 S4 (1,) Figure39 HRWild RE- S4 24kb ND- s3 Samples S3 (1,) RE- SZ wwwvwwww ND- S2 * S2 (1,0) RE- S2 1.6kbpermemory m ON - IS S1 (1,0) ASMA RE- S1

# Copy MRGPRX1 10380 U . S . Patent Jan . 1 , 2019 Sheet 43 of 60 US 10 , 167, 509 B2

Figure 40

1 . 6 kb 24 kb 2 . 7 kb HA

EFFDB

5000 3000 1000 manat $ 1 S2 S3 S4 S5 S6 ( 1, 0 ) (1 , 0 ) (1 , 1 ) ( 1. 1) (2 , 0 ) (2 , 0 )

CN)Sample(Digested ILSEN %Differencefrom entre U . S . Patent Jan . 1 , 2019 Sheet 44 of 60 US 10 , 167, 509 B2

-

VIC Figure41 1Kor10K or100Kbases

FAM atent Jan . 1 , 2019 Sheet 45 of 60 US 10 ,167 ,509 B2

Figure42

.

5 4 .

.: •?.??, 1 4.?? Totalfragmentationbetweentargets 's,654 Partialfragmentationbetweentargets

targetsbetweenfragmentationNo:

,

B. ? A U . S . Patent Jan . 1 , 2019 Sheet 46 of 60 US 10 , 167, 509 B2

20.0

wwwww WWW 15.0 w

hat Figure43 W 10.0 etty DNAyield(1,000sofGenomeEquivalentspermlplasma)

5.0

t

. 1KbMilepostassay kb 1 than larger DNA of Percentage U . S . Patent Jan . 1 , 2019 Sheet 47 of 60 US 10 , 167, 509 B2

CNVmeasurementsinplasmaby targetingchromosomes21and1

Figure44 RRRRRRRR 100 100

A

W 80 ANN

60 20406080 PercentageofDNAlargerthan1kb PercentageofDNAlargerthan1kb 40

20

.

0 finansefina en m fem.www forma en

1.5

CNV Measured CNV Measured U . S . Patent Jan . 1 , 2019 Sheet 48 of 60 US 10 , 167, 509 B2

Figure 45 Restriction Methylation Inc . Enzyme Sensitivity Digestion Buffer Test Assays EcoRI T OL CpG TECORI Buffer T 37 www . TMRGPRX1 CYP2D6 TCCL3L7

* * * * * * * * * * * QL RPP30 QL RPP30 QL RPP30

Rsal COL Dam NEB 4 37 MRGPRX1 CYP2D6 TCCL3L1 QL _ RPP30 Chmr 16 QLRPP30 Msel T ok NEB 4 , BSA 37 IMRGPRXT TCYP2D6 TCCL3L7 QL _ RPP30 QL RPP30 QL _ RPP30 OL Dam , CpG Imp INEB 4 MRGPRX1 TChmr X TCCL3L1 QL RPP30 QL RPP30 QL RPP30 Alul okTNEB 4 MRGPRX1? TSRY T CCL3L1 QL _ RPP30 QL RPP30 QL _ RPP30 TaqalT OL Dam NEB 4 , BSA MRGPRX1 CYP2D6 { CCL3L1 QL RPP30 QL RPP30 QL RPP30 Bsm } MRGPRX1 CYP2D6 CCL3L1 ???????????????????????? iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiOKNEB4 165 vitittiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii QL RPP30 TQL RPP30 QL RPP30 Bstyi ok NEB2 60 MRGPRX1 CYP2D6 TCCL3L1 QL RPP30 QL RPP30 QL RPP30 Xhol CpG Imp NEB 4 , BSA 37 MRGPRX1 CYP2D6 CCL3L1

poooooooooooo000000000000000 poodDogodb0000000OOOOOOOOOOOOOOOOQL RPP30 QL RPP30 QL RPP30 Dpnil Dam Dpnil Buffer 37 MRGPRXT DOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOCYP2D6 TCCL3L1 ?????????????????????????????????????????????????????????????????????????????????????????????????? QL RPP30 QL RPP30 QL RPP30 Haelll ok NEB 4 L MRGPRXT ChrmX CCL3L1 QL RPP30 QL RPP30 QL RPP30 Hpall CpG NEB7 IMRGPRXT CYP2D6 TCCL3L1 QL _ RPP30 QL _ RPP30 QL _RPP30 Hhal CpGNEB 4 , BSAT MRGPRX1 CYP2D6 CCL3L1 ?????????????????????????????????????????? Chrm 14 QL RPP30 Chrm 14 Msp ! 10 NEB 4 MRGPRX1 CYP2D6 CCL3L1 ...... QL_ RPP30 QL RPP30 TQL _ RPP30 Tsel COL CpGNEB 4 MRGPRXT1111UUUUUUUU CYP2D6 TCCL3L1 QL _ RPP30 QL RPP30 QL RPP30 Bstul CpGTNEB 4 60 MRGPRX1 CYP2D6 CCL3L1 Chrm 14 TQL RPP30 Chrm 14 NalllI OK NEB 4 , BSA I IMRGPRX1 CYP2D6 TCCL3L1 TQL RPP30 QL RPP30 TQL RPP30 ...... atent Jan . 1 , 2019 Sheet 49 of 60 US 10 , 167, 509 B2

Figure 46 EcoRI Titration ( N = 4 ) 1800 1600 1400 1200 800 600 w Ave Target Conc ddPCRConcentration 400 ?????????????? ?????????????????????????????????????????????????????????the one wifes Ave RPP30 Conc

20 01 20 20 1.25 1.25 1.25 UND(+Buff) 0.3125 UND(+Buff) 0.3125 UND(+Buff) 0.3125 0.078125 0.01953125 0.004882813 0.0781250.01953125 004882813 .0 0.0781250.01953125 0.004882813 MRGPRX1 CYP2D6 CCL3L1

Ave CNV ( N = 4 )

1

owNWACON

LO 20 O 20 1.25 1.25 20 1.25 UND(+Buff) 0.3125 UND(+Buff) 0.3125 UND(+Buff) 0.3125 0.078125 0.01953125 0.004882813 0.078125 0.01953125 0.004882813 0.078125 0.01953125 0.004882813 MRGPRX1 CYP2D6 CCL3L1 U. S . Patenatent Jan . 1 , , 2019 Sheet 5050 ofof 6060 US 10 , 167, 509 B2

Figure 46 (Continued ) usa( uogjeni ) N = +( 1800 ______0092 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA U RRRRRRRRRRRRRRR R A A 000L

1200 2 1000 800 009 Ave Target Conc ddPCRConcentration 400 . . . . . apaAve RPP30 Conc 200 ?????????? ???? 20 20 20 1.25 1.25 1.25 UND(+Buff) UND(+Buff) 0.3125 UND(+Buff) 0.3125 0.3125 0.078125 0.0781250.01953125 0.004882813 0.078125 0.01953125 0.004882813 0.01953125 0.004882813

MRGPRX1 CYP2D6 CCL3L1 M

Ave CNV (N = 4 )

w w wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww O-NwV 20 20 25.1 20 1.25 1.25 0.3125 0.3125 UND(+Buff) UND(+Buff) 0.3125 UND(+Buff) 0.078125 0.078125 0.01953125 0.004882813 0.078125 0.01953125 0.01953125 0.004882813 0.004882813 w MRGPRX1 CYP2D6 CCL3L1 U . S . Patent Jan . 1 , 2019 Sheet 51 of 60 09 B2

Figure 46 (Continued ) Dpnll Titration ( N = 4 )

?????????????????????????????????????????????????????????????????????????????????????????

ave With

??????????????????????????????????????????????????????????????????????????????????????? ?????????????????????????????????

. Ave Target Conc ddPCRConcentration Sampagne Ave RPP30 Conc

20 20 1.25 20 1.25 NNN 1.25 UND(+Buff) 0.3125 0.3125 UND(+Buff) 0.3125 0.078125 UND(+Buff) 0.078125 0.019531250.004882813 0.078125 0.01953125 0.004882813 0.01953125 0.004882813 MRGPRX1 CYP2D6 CCL3L1

Ave CNV (N = 4 )

?

?

?

? ??????????????????? 20 1.25 20 25.1 20 1.25 UND(+Buff) 31250. 0.078125 UND(+Buff) 0.3125 UND(+Buff) 0.3125 0.078125 0.01953125 0.004882813 0.078125 0.01953125 0.004882813 0.01953125 0.004882813 MRGPRX1 CYP2D6 CCL3L1 U . S . Patent Jan. 1 , 2019 Sheet 52 of 60 US 10 , 167 ,509 B2

Figure 46 (Continued ) Xhot Titration ( N = 4 )

Ave Target Conc ddPCRConcentration - Ave RPP30 Conc

20 20 $ U 201 25.1 1.25 1.25 UND(+Buff) 0.3125 UND(+Buff) 0.3125 0.078125 UND(+Buff) 0.31250.078125 0.078125 NNNNNNNNNNNN 0.019531250.004882813 0.01953125 0.004882813 0.019531250.004882813 MRGPRX1 CYP2D6 CCL3L1

Ave CNV ( N = 4 )

?

?

??? LLLLLLLLLLLLLLLLL

?????????????????????????????????

20 1.25 20 1.25 20 1.25 0.3125 0.3125 UND(+Buff) 0.078125 UND(+Buff) 0.3125 UND(+Buff) 0.078125 0.01953125 0.004882813 0.078125 0.01953125 0.004882813 0.01953125 0.004882813 MRGPRX1 CYP2D6 CCL3L1 U . S . Patent Jan . 1 , 2019 Sheet 53 of 60 US 10 , 167 ,509 B2

193, .7 1 wei 65C 120000 LL 60 100000

LL L

L

X 80000 . wwwwwwwwww L FO6 w Neg:113437 FO5L 60000 CNV:2.33Conc207FAMPos26076 . EventNumber 12 m . :12

. F04 1X FO3 ww 57C FO2 20000 140004 55C E100008000 6000 2000 012000FAMAmplitude 191 dapatidumok

E12 :47.! 65C 140000

?

W 120000 Figure47 U * 100000 E06 000033EventNumber 0.5X CNV:2.28Conc307FAMPos42109 Neg:117097 E05 00009 59C TH u

.

.

.

. 266 me . .

40000 .

.

.

.

.

.

. 55C . .

200000 . u .

.

.

.

RescuedTarget .

.

.

.

.

Amplification .

. 2000 . 6000 . 2000 . 14000 6000 . 10000€8000 . 200002 E10000 1600014000 812000 . A Š4000

.

.

. OBECAM012 . .:: 160000 : 0 65C 1110 0 340000 HIE 600 000zv_

. . D07 006 l 000083 DO5 EventNumber CNV:0.195Conc433FAMPos7259Neg163920 . ??????????????????????????????????????????????????.???????????????????????????????????????????????????????? OX D04 00009 60C D03 . 40000 FailedTarget 55C DOZ 20000 100 16000.14000 10000 60004000 .20002000. AmplificationConcentration Amplitudeg12000 E8000 FAM Amplitude VIC OptimalAnnealing GCAdditive Annealing Temperature Gradient Temperature atent Jan . 1 , 2019 Sheet 54 of 60 US 10 , 167, 509 B2

Copy Number

D

2 EEKERBZD +907 alenoesojepljen2=ANO ?gualajaipuej?bjet uopleguenb 955 Lt?n?! (p?nu?puog) XG'O {ANO)ZX??????????????? 195 7402907 1eDie}{nlp30?O!1e1???? aolaiaeieigelle!30??olleague2009 07=?njenpappadx

XO T tot 19205 091 +602210 DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD tablet'aa!ppe39XOUI Aduebus?op DDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD ) uL/ copies ( Concentration atent Jan . 1 , 2019 Sheet 55 of 60 US 10 , 167, 509 B2

Figure 48

Alul

NEB Ferm ?????????????????????????????????????????????????????????????????? CNV :0 .0761 Conc :25 . 9 FAM Pos: 9044 Neg :345174

18000 . . ::: : :

16000 .

14000 : : . . 13 ** 12000 ME 10000 QL - er2 - T1, cut 2 times3 . 4 8000 LY FAMAmplitude 6000 4000 si . ! A 2000 * * * :.: 0 72000 144000 216000 288000 360000 Conc :680 VIC Pos: 174681 Neg :179537 12000 10000 RPP30 , should not be cut 88000 6000 . . . HU : M . ..** * PT :.

4000 !? : R VICAmplitude H .. SA A

11 . Š 2000 - - 4 ...... 2 WW

.:

. . - 2000mm WWW Wyn 72000 144000 216000 288000 360000 Event Number U . S . Patent Jan . 1 , 2019 Sheet 56 of 60 US 10 , 167, 509 B2

Figure 48 ( Continued )

Z NEB LE

000002

Fermentas GO 4444* LA 24 44* + are forti7496 100* . +4 16 ..

2 142242

PercentTargetLeftUndigested S *** * * SyTAAL vee<< Am 2 ** * **

1444 NAALRA st st it at at at O04X 0 .04x 0.008x 0.008x Fold Over- Digestion U . S . Patent Jan . 1 , 2019 Sheet 57 of 60 US 10 , 167, 509 B2

Figure 48 ( Continued ) Msel

NEB Ferm Ferm VROUW CNV : 1 . 06 Conc : 384 FAM Pos: 157918 Neg : 337094 12000 mm 10000 MP _ 10K , cut 2 times

. X . 8000 wwwwww ART . . FAMAmplitude 4000 2000. godineht 104000 208000 312000 416000 520000 Conc: 726 VIC Pos: 255585 Neg: 239427 5600

.

. 4480 *. RPP30 , should not be cut ? . .

. 3360% : . :* : : 2 40:1 .66 27 ; AY VICAmplitude .

1120 .

2 . .02 OLAR . Mwenci 0 104000 208000 312000 416000 520000 Event Number U . S . Patent Jan . 1 , 2019 Sheet 58 of 60 US 10 , 167, 509 B2

Figure 48 (Continued )

NEB S Fermentas

EEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEEE

-

-

- PercentTargetLeftUndigested -

-

-

-

-

0.2x 0.2x NDND 0.04x0.04x 0.008x 0.008x .0016x .0016x Fold Over- Digestion U . S . Patent Jan . 1 , 2019 -Sheet -59 of 60 . US 10, 167 , 509 B2

Fewpartitionswitht ManyVIConly partitions FAMandVIC usuallycolocalize VICbutnotFAM

turistici: FAM target

* Figure49 * : : : : : : : : : : : : : : : : :

minimiminiminimi : *2223, : : (targetdeletion) : : FAMFAM target FAM target FAM target liimimineinimene Vs. VIC marker VIC marker VIC marker VIC marker

m

helli U . S . Patent Jan . 1 , 2019 Sheet 60 of 60 US 10 , 167 ,509 B2

FAM/VICassay FAM VIC closeaftertranslocation Centromere Translocation->

*

:: : : VIC Breaks FAM/VICassay . LLLLLL ondifferent

+ chromosomes Figure50 Centromere FAMEL

=FAM VIC FAM/VICassay closeafterinversion Inversion->

FAM/VICassay farapart FAM VIC Centromere Breaks US 10 , 167 ,509 B2 ANALYSIS OF NUCLEIC ACIDS including non - viable phenotypes , increased risk of disease , or inability to metabolize a class of medications , among CROSS -REFERENCE TO RELATED others . APPLICATIONS Information on haplotype can be useful information for 5 life sciences applications , the medical field , and in applied This application is a divisional of U . S . patent application markets, such as forensics . The issue of haplotype determi Ser. No. 13 /385 ,277 , filed Feb . 9 . 2012 , which issued Sep . 8 , nation , arises in many contexts of human ( and non -human ) 2015 as U .S . Pat. No . 9, 127 ,312 and is entitled ANALYSIS genetics. For example , many genetic associations are tied to OF NUCLEIC ACIDS ; which application claim the benefit and thus predicted by haplotypes. The HLA ( human leuko of the following U . S . provisional patent applications : ( 1 ) 10 cyte antigen ) region is one prominent instance where par ticular genetic diseases have been associated with various U .S . Provisional Patent Application Ser . No . 61 /441 , 209 , haplotypes of the major histocompatibility complex . filed Feb . 9 , 2011 ; ( 2 ) U . S . Provisional Patent Application Conventional genotyping technologies interrogate differ Ser. No. 61 /444 ,539 , filed Feb . 18 , 2011 ; ( 3 ) U . S . Provi ent loci of sequence variation , such as SNPs ( single nucleo sional Patent Application Ser. No. 61 /454 ,373 , filed Mar . 18 ,· 15 tide polymorphisms) , in isolation from one another. Thus, 2011; (4 ) U . S . Provisional Patent Application Ser. No . the technologies can confidently determine that a pair of 61 /476 , 115 , filed Apr. 15 , 2011 ; ( 5 ) U . S . Provisional Patent distinct alleles is present at each of two linked loci in a Application Ser. No. 61 /478 ,777 , filed Apr. 25 , 2011; (6 ) sample of genetic material . However, the technologies can U . S . Provisional Patent Application Ser. No . 61/ 484 , 197 , not tell which combination of alleles from the two loci is filed May 9 , 2011 ; and ( 7 ) U . S . Provisional Patent Appli - 20 located on the same chromosome copy . A new approach to cation Ser. No. 61/ 490 , 055 , filed May 25 , 2011 . Each of determining haplotypes is needed to overcome this obstacle . these priority documents is incorporated herein by reference Similarly , new methods are needed for determining the in its entirety for all purposes. probability of fragmentation between two target nucleic acid sequences in a sample , determining levels of degradation in BACKGROUND OF THE INVENTION 25 a sample of nucleic acids ( e . g ., DNA , RNA ) , assessing alternative splicing, or detecting inversions , translocations, Determining the copy number of a target nucleic acid in or deletions . a sample from a subject can provide useful information for a variety of clinical applications. However , some assays for SUMMARY OF THE INVENTION determining copy number of a target nucleic acid can 30 In some aspects , this disclosure provides a method of underestimate the copy number of the target nucleic acid if detecting variations in copy number of a target nucleic acid multiple copies of the target nucleic acid are on the same comprising : a . contacting a sample comprising a plurality of polynucleotide in the sample . For instance , in a digital polynucleotides with at least one agent, wherein the poly polymerase chain reaction (dPCR ) , a spatially isolated poly 35 nucleotides comprise a first and second target nucleic acid ; nucleotide comprising two copies of a target sequence can b . subjecting the sample to conditions that enable the agent be counted as only having one copy of the target nucleic to cleave a specifically -selected target site between the two acid . There is a need for improved methods, compositions, target nucleic acids when the two target nucleic acids are and kits for copy number estimation of nucleic acid target located on the same polynucleotide, thereby separating the sequences in assays , such as dPCR assays , that take into 40 two target nucleic acids; c . separating the sample contacted account the linkage of target nucleic acidid sequences ( e . g . , with the agent into a plurality of spatially isolated partitions; whether multiple copies of a target nucleic acid sequence are d . enumerating the number of spatially isolated partitions on the same polynucleotide in a sample ) . comprising the target nucleic acid ; and e . determining a copy Knowledge ofhaplotypes or phasing ofneighboring poly number of the target nucleic acid based on said enumerating . morphisms can be useful in a variety of settings . Humans are 45 In some embodiments , the spatially isolated partitions are diploid organisms , because each chromosome type is rep - droplets within an emulsion . In some embodiments , the resented twice as a pair of individual chromosomes in each target nucleic acids are present at an average concentration of a person ' s somatic cells . One copy of the pair is inherited of less than about five copies per droplet, less than about four from the person ' s father and the other copy from the copies per droplet , less than about three copies per droplet , person ' s mother . Therefore , most genes exist as two copies 50 less than about two copies per droplet , or less than about one in each diploid cell . However, the copies generally have copy per droplet . In some embodiments, the specifically many loci where sequence variation occurs between the selected target site is located between the first and second copies to form distinct sequences known as alleles . target nucleic acids . In some embodiments , the method It can often be useful to know the pattern of alleles, the further comprises subjecting the first and second target haplotype , for each individual chromosome of a chromo - 55 nucleic acids to an amplification reaction . some pair . For example , if a person has inactivating muta - In some embodiments of this aspect, the sequences of the tions at two different loci within a gene, the mutations may first and second target nucleic acids are identical or near be of limited consequence if present together on the same identical. In some embodiments , the first and second target individual chromosome, but could exert a major effect if nucleic acids have different sequences. In some embodi distributed between both individual chromosomes of a chro - 60 ments , the specifically -selected target site is a site capable of mosome pair. In the first case , one copy of the gene is digestion with a restriction enzyme. In some embodiments , inactivated at two different loci, but the other copy is the agent is one or more restriction enzymes. In some available to supply active gene product . In the second case , embodiments , the specifically -selected site and the first each copy of the gene has one of the inactivating mutations, target nucleic acid are located within the same gene . In some so neither gene copy supplies active gene product. Depend - 65 embodiments , the target nucleic acid is correlated with a ing on the gene in question , mutation of both copies of the disease or disorder. In some embodiments , the method gene could lead to a variety of physiological consequences further comprises enumerating ( or counting ) the number of US 10 , 167 ,509 B2 spatially isolated partitions comprising a reference nucleic a PCR reaction ; b . separating the sample into a plurality of acid . In some embodiments , the reference nucleic acid is spatially - isolated partitions , wherein the spatially - isolated present at a fixed number of copies in a genome from which partitions comprise on average less than five target nucleic the sample is derived . In some embodiments , the reference acids ; c . subjecting the samples to a PCR amplification nucleic acid is a housekeeping gene . In some embodiments , 5 reaction in order to detect the target nucleic acids ; d . the reference nucleic acid is present at two copies per diploid detecting the fluorescence intensity of the fluorescent labels genome from which the sample is derived . In some embodi- before any of the reagents for the PCR reaction become ments , the determining the copy number of the target nucleic limiting, wherein a higher fluorescence intensity within a acid comprises dividing the number of enumerated target partition is indicative of the presence of more than one target nucleic acids by the number of enumerated reference nucleic 10 nucleic acid on a polynucleotide ; e . enumerating the number acids. of partitions that have a fluorescence intensity above a In some embodiments of this aspect , the agent does not threshold value indicative of one copy of the target nucleic cut the sequence of the target nucleic acid . In some embodi - acid ; f . enumerating the number of partitions that have a ments, the agent does not cut the sequence of the reference fluorescence intensity above a threshold value indicative of nucleic acid . In some embodiments , the target nucleic acid 15 multiple copies of the target nucleic acid ; and g . either : ( i ) is present in multiple copies on a single polynucleotide . In calculating the copy number of the target nucleic acid based some embodiments , the one or more restriction enzymes on the numbers obtained in step e and step for ( ii ) deter have more than one recognition sequence between the two mining whether two target nucleic acids are present on the target sequences . In some embodiments, the one or more same polynucleotide. restriction enzymes do not exhibit significant star activity . In 20 In some embodiments , an increased copy number of the some embodiments , the one or more restriction enzymes target nucleic acid is correlated with a disease or disorder. comprise two or more restriction enzymes . In some embodi- In yet another aspect , this disclosure provides a method of ments , the software is used to select the one or more identifying a plurality of target nucleic acids as being present restriction enzymes . In some embodiments , the one or more on the same polynucleotide comprising , a . separating a restriction enzymes are heat - inactivated after digesting the 25 sample comprising a plurality of polynucleotides into at polynucleotides . In some embodiments , the temperature of least two subsamples , wherein the polynucleotides comprise the heat inactivation is below the melting point of the a first and second target nucleic acid ; b . contacting the first restricted target fragments in order to maintain the double subsample with an agent capable of physically separating stranded nature of the target fragments . In some embodi- the first target nucleic acid from the second target nucleic ments, a control restriction enzyme digest is performed to 30 acid if they are present on the same polynucleotide ; c . measure the efficiency of digestion of nucleic acid by the one following step b , separating the first subsample into a first or more restriction enzymes. In some embodiments, the set of partitions; d . determining the number of partitions in percentage of linked target sequences that are fragmented in the first set of partitions that comprise the target nucleic acid ; the sample is determined . e . separating a second subsample into a second set of In another aspect, this disclosure provides a method of 35 partitions; f . determining the number of partitions in the detecting variations in copy number of a nucleic acid second set of partitions that comprise a target nucleic acid ; comprising : a . providing a sample comprising a plurality of and g . comparing the value obtained in step d with the value polynucleotides, the plurality comprising a first and second obtained in step f to determine the whether the first and target nucleic acid located within at least one of the plurality second target nucleic acid are present within the same of polynucleotides ; b . cleaving the at least one polynucle - 40 polynucleotide . otide between the first and a second target nucleic acids, In some embodiments , the first and second target nucleic when the first and second target nucleic acids are present acids comprise the same sequence . In some embodiments , within the same polynucleotide to form a cleaved sample ; c . the agent is a restriction enzyme. In some embodiments , separating the cleaved sample into a plurality of spatially linkage of the first and second target nucleic acid is indicated isolated regions ; d . enumerating the number of spatially 45 when the number obtained in step d is significantly higher isolated regions comprising the target nucleic acids ; and e . than the number obtained in step f. In some embodiments , determining a copy number of the target nucleic acid based the restriction enzyme recognizes a site between the first and on said enumerating ; wherein two of the at least two target second target nucleic acid . In some embodiments , the first nucleic acids are located within a same region of the same and second target sequences are located within less than 1 polynucleotide . 50 megabase of each other. In some embodiments , after con In some embodiments, less than 1 megabase separates the tacting the sample with one or more restriction enzymes, the two target nucleic acids . In some embodiments , less than 1 sample is stored for less than 24 hrs before the subsample is kilobase separates the two target nucleic acids . In some separated . In some embodiments , the target sequences are embodiments , the cleaving is accomplished with a restric not physically separated in the second subsample . In some tion enzyme . In some embodiments , the plurality of spa - 55 embodiments , the method further comprises determining tially - isolated regions are droplets within an emulsion . In whether the first and second target nucleic acids are on the some embodiments , the method further comprises conduct - same chromosome or different chromosomes . In some ing a PCR reaction prior to the enumerating of step d . In embodiments , the sequence of the first target nucleic acid is some embodiments , the method does not comprise a different from the sequence of the second target nucleic acid . sequencing reaction . 60 In some embodiments , the sequence of the first target In another aspect , the method comprises a method of nucleic acid is a genetic variation of the second target detecting variations in copy number of a target nucleic acid nucleic acid . In some embodiments , the genetic variation is comprising: a . obtaining a sample comprising ( i ) a plurality a single nucleotide polymorphism . In some embodiments , of polynucleotides , wherein at least one of the polynucle - wherein the first and second target nucleic acids are within otides comprises a first target nucleic acid and a copy of the 65 the same gene . In some embodiments, at least two of the first target nucleic acid ; ( ii ) a probe with a fluorescent label polynucleotides are chromosomes . In some embodiments , to detect the target nucleic acid ; and ( iii ) reagents to enable the first and second target nucleic acid are located on US 10 , 167 , 509 B2 separate chromosomes . In some embodiments , the first and set of partitions ; d ) enumerating the number of partitions second target nucleic acid are located on the same chromo from the first subsample that contain the first and second some. In some embodiments , at least one of the chromo - target nucleic acids together ; e ) separating the second sub somes comprises two copies of the first target nucleic acid . sample into a second set of partitions; f ) enumerating the In some embodiments , at least one of the chromosomes 5 number of partitions from the second subsample that contain comprises at least three copies of the first target nucleic acid . the first and second target nucleic acids together ; and g ) In other embodiments , a chromosome comprises at least 1 , comparing the value of step f with that of step d in order to 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , or 10 copies of the first target nucleic determine the probability that the first and second target acid . In some embodiments , one or more of the restriction nucleic acids are linked on the same polynucleotide . enzymes is a methyl- sensitive restriction enzyme. 10 In some embodiments , the short cycle PCR comprises less In yet another aspect, this disclosure provides a method of than 24 cycles of PCR reaction . In some embodiments , the determining the probability of fragmentation of polynucle - method further comprises use of an algorithm to determine otides in a sample comprising : a . obtaining a sample com the probability that the first and second target nucleic acids prising ( i ) a plurality of polynucleotides, wherein at least one are phased or present on the same polynucleotide. of the polynucleotides comprises a first target nucleic acid 15 In another aspect, this disclosure provides a method of and a second target nucleic acid ; ( ii ) a first labeled probe to identifying the probability of a deletion of a target nucleic detect the first target nucleic acid ; ( iii ) a second labeled acid from a chromosome comprising: ( a ) subdividing a probe to detect the second target nucleic acid ; and ( iii ) sample into multiple partitions wherein the sample com reagents to enable a PCR reaction ; b . separating the sample prises: ( i ) a pair of chromosomes such that at least one of the into a plurality of spatially - isolated partitions, wherein the 20 chromosomes contains a first target nucleic acid and wherein spatially - isolated partitions comprise on average less than both chromosomes contain the same marker nucleic acid ; five target polynucleotides; c . subjecting the samples to a ( ii ) a first labeled probe to detect the first target nucleic acid ; PCR amplification reaction in order to detect the first and ( iii ) a second labeled probe to detect the marker nucleic acid ; second target nucleic acids ; d . detecting the labeled probe in and ( iii ) reagents to enable a PCR reaction ; ( b ) performing the partitions ; e . based on step d , enumerating the number of 25 an amplification reaction to detect the first target nucleic acid partitions that comprise both the first and second target and the marker nucleic acid within the partitions ; ( c ) enu nucleic acids ; and f. correlating the number of partitions that merating the number of partitions containing the marker comprise both the first and second target nucleic acids with nucleic acid and no target nucleic acid ; and ( d ) determining the degree of fragmentation of the sample . In some embodi- the probability of a deletion of the target nucleic acid from ments, the first and second target nucleic acid are within one 30 at least one of the chromosomes based on the value of step megabase apart . In some embodiments , in a normal subject, c , where a higher value in step c is correlated with an the first and second target nucleic acids are separated by increased probability that the target nucleic acid is absent greater than one kilobase . In some embodiments , the method from one of the chromosomes within the pair of chromo further comprises calculating the expected number of par - somes . titions containing both first and second target nucleic acids 35 In some embodiments , the target nucleic acid is suspected if the target nucleic acids are physically linked . In some to be present in multiple copies on at least one of the embodiments , the probability of fragmentation is positively chromosomes. In some embodiments , the marker and the correlated with a decreased number of partitions that contain target nucleic acid are located in close proximity to each both the first and second target nucleic acid . other on at least one of the chromosomes. In some embodi In yet another different but related aspect , this disclosure 40 ments , themarker and the target nucleic acid are within 5000 provides a method for determining the probability of frag - base pairs apart . In some embodiments , the marker nucleic mentation of two genetically linked loci in a sample of acid comprises a single nucleotide polymorphism . In some polynucleotides comprising : a. performing digital PCR embodiments , the method further comprises performing a (dPCR ) on said sample , wherein the dPCR comprises sepa - separate assay to determine fragmentation of a different rating the polynucleotides into separated units ; b . determin - 45 polynucleotide . In some embodiments , the method further ing a first sum of the number of units with signal indicating comprises using the results of the separate assay to aid the the presence of a first locus and the number of units with determining of step d , wherein a determination of a high signal indicating the presence of the first locus and a second probability of a deletion is strengthened by a determination locus ; c . determining a second sum of the number of units that the different polynucleotide is fragmented . In some with signal indicating the presence of the second locus and 50 embodiments , the method further comprises performing a the number of units with signal indicating the presence of the separate assay to determine the presence of high molecule first locus and the second locus; and d . inputting the first and weight DNA in the sample . second sums into an algorithm to determine the percentage In yet another aspect, this disclosure provides a method of of the two genetically linked loci in the sample that are identifying the probability of a deletion or translocation of a fragmented . 55 target region from a polynucleotide comprising : (a ) subdi In some embodiments , the polynucleotides in the sample viding a sample into multiple partitions wherein the sample are partially degraded . In some embodiments , the polynucle - comprises: ( i ) a polynucleotide suspected of comprising a otides are DNA . In some embodiments , the polynucleotides first marker nucleic acid and a second marker nucleic acid , are RNA . In some embodiments , the polynucleotides com - wherein the first and second marker nucleic acids are prise a mixture of DNA and RNA . 60 separated by at least one megabase and wherein the target The present disclosure also provides a method for deter region is suspected to be positioned between the first and mining the probability that a first target nucleic acid is second marker nucleic acids ; ( ii ) a first labeled probe to present on the same polynucleotide as a second targetel detect the first marker nucleic acid ; ( iii ) a second labeled nucleic acid comprising : a ) dividing a sample of polynucle - probe to detect the second marker nucleic acid ; and ( iii ) otides into at least two subsamples ; b ) in a first subsample , 65 reagents to enable a PCR reaction ; (b ) performing an ampli pre - amplifying the first and second target nucleic acid with fication reaction to detect the first and second marker nucleic short cycle PCR ; c ) separating the first subsample into a first acids within the partitions ; (c ) enumerating the number of US 10 , 167 ,509 B2 partitions containing both the first and second marker locus with allele - specific amplification data for the second nucleic acids within the partitions ; and ( d ) determining the locus from the same volumes. probability of a deletion or translocation of the target region In some embodiments of this aspect , the step of partition based on the value of step c , where a higher value in stepc ing includes a step of forming an emulsion , in which the is correlated with an increased probability that the target 5 volumes are droplets . In some embodiments, the step of nucleic acid is absent from the polynucleotide . In some collecting may include collecting allele - specific amplifica embodiments , the polynucleotide is a chromosome. In some tion data from individual droplets of the emulsion . In some embodiments , the step of forming an emulsion embodiments , the target region is a specific region of a includes a step of passing the aqueous phase through an chromosome known to be present in a wild -type subject. In 10 orifice such thatmonodisperse droplets of the aqueous phase some embodiments , the method further comprises perform are generated . In some embodiments , the step of partitioning ing a separate assay to determine fragmentation of a different includes a step of forming droplets that are about 10 to 1000 polynucleotide . In some embodiments, the method further micrometers in diameter. In some embodiments, the step of comprises using the results of the separate assay to aid the partitioning includes a step of forming at least about 1000 determining of step d , wherein a determination of a high311 15 volumes . In some cases, the step of partitioning includes a probability of a deletion or translocation is strengthened by step of partitioning an aqueous phase including optically a determination that the different polynucleotide is not distinguishable fluorescent probes capable of hybridizing fragmented . specifically to each allele sequence amplified . In one aspect, this disclosure provides methods for hap In some embodiments , the nucleic acid is obtained or lotype analysis comprising : ( a ) partitioning an aqueous 20 derived from a diploid subject. In some cases, the nucleic phase containing nucleic acid into a plurality of discrete acid is genetic material obtained from a subject . In some volumes ; ( b ) amplifying in the volumes at least one allele cases , the nucleic acid includes cDNA obtained by reverse sequence from each of a first polymorphic locus and a transcription of RNA obtained from a subject . In some cases , second polymorphic locus that exhibit sequence variation in the subject is a multicellular organism . In some cases, the the nucleic acid ; ( c ) determining at least one measure of 25 subject is a person . co -amplification of allele sequences from both loci in the In some embodiments , the step of amplifying includes a same volumes ; and ( d ) selecting a haplotype of the first and step of amplifying a pair of different allele sequences from second loci based on the at least one measure of co - the first locus , and the step of collecting data includes a step amplification . of collecting data that distinguishes amplification of each In some embodiments of this aspect , the first and second 30 allele sequence of the pair in individual droplets . loci are contained in a target region of the nucleic acid , In some embodiments , the step of correlating includes a wherein the step of partitioning results in an average con - step of separately correlating allele- specific amplification centration of less than several copies of the target region per data for each allele sequence of the first locus with allele volume. In some embodiments , the step of partitioning specific amplification data for the allele sequence of the results in an average concentration of less than about one 35 second locus . copy of the target region per volume. In some cases, the step of correlating includes applying a In some embodiments of this aspect, the step of deter - threshold to the allele- specific amplification data to convert mining at least one measure includes a step of determining it to binary form , and the step of correlating is performed at least one correlation coefficient for allele - specific ampli - with the binary form of the data . In some cases, the step of fication data of the first locus correlated with allele - specific 40 correlating further comprises a step of determining a number amplification data of the second locus from the same vol- of volumes exhibiting co - amplification of a particular allele umes . In some cases , the step of determining at least one sequence from both loci . In some cases, the step of corre measure includes a step of determining a first correlation lating further comprises a first step of determining a first coefficient and a second correlation coefficient for allele - number of volumes exhibiting co - amplification of the sec specific amplification data of a first allele sequence and a 45 ond locus allele sequence and a first allele sequence from the second allele sequence of the first locus correlated respec - first locus and a second number of droplets exhibiting tively with allele - specific amplification data of the second co - amplification of the second locus allele sequence and a locus from the same volumes , and wherein the step of second allele sequence from the first locus , and a second step selecting a haplotype is based on a step of comparing the of comparing the first and second numbers of volumes . first and second correlation coefficients with each other. In 50 In some embodiments , the step of selecting a haplotype is some cases, the step of determining at least one measure based on which allele - specific amplification data for the first includes a step of determining a number of volumes that locus exhibits a higher correlation with such allele -specific exhibit co - amplification of a particular allele sequence of the amplification data for the second locus . In some embodi first locus and a particular allele sequence of the second ments , the step of selecting a haplotype is based on corre locus, and wherein the step of selecting a haplotype is based 55 lation coefficients corresponding to respective distinct allele on the number of volumes exhibiting co - amplification . The sequences amplified from the first locus . In some embodi step of determining a number of volumes may further ments , the step of selecting a haplotype is based on whether comprise a step of determining a first number of volumes the step of correlating indicates a negative or a positive and a second number of volumes that exhibit respective correlation for co -amplification of particular allele co -amplification of a first allele sequence or a second allele 60 sequences of the first and second loci in the same volumes . sequence of the first locus with a particular allele sequence In another aspect , this disclosure provides a system for of the second locus, and wherein the step of selecting a haplotype analysis comprising : ( a ) a droplet generator con haplotype is based on first and second numbers of volumes . figured to form droplets of an aqueous phase including In some embodiments , the step of determining at least one nucleic acid ; ( b ) a detector configured to collect allele measure comprises collecting allele -specific amplification 65 specific amplification data for each of the loci from indi data for each of the loci from individual volumes , and vidual droplets ; and ( c ) a processor configured to correlate correlating allele - specific amplification data for the first allele - specific amplification data for the first locus with US 10 , 167 , 509 B2 In allele - specific amplification data for the second locus from chromosome type in the genetic material of a subject , in the same volumes and to select a haplotype of the nucleic accordance with aspects of the present disclosure . acid for the first and second loci based on correlation of the FIG . 7 is a schematic view of a flowchart illustrating allele - specific amplification data . performance of an exemplary version of the method of FIG . In some aspects , this disclosure provides a method of 5 4 , with droplets as partitions and with the genetic material determining the degree ofmethylation of CpG islands within from the subject of FIG . 6 being analyzed to distinguish the a sample of genomic DNA comprising : a. obtaining a sample potential haplotypes presented in FIG . 6 , in accordance with of genomic DNA , wherein the genomic DNA comprises a aspects of the present disclosure . first target nucleic acid that is separated from a second target FIG . 8 is a graph illustrating an alternative approach to nucleic acid by a CpG island ; b . contacting the sample of " correlating the amplification data of FIG . 7 , in accordance genomic DNA with a methyl- sensitive restriction enzyme with aspects of the present disclosure . under conditions suitable for enabling digestion of unmeth - FIG . 9 illustrates a flowchart for predicting fragmentation ylated nucleic acids within the genomic DNA ; C . contacting between two targets . the sample with a probe that detects the presence of the first 15 FIG . 10 illustrates linked and unlinked targets . Part A target nucleic acid ; d . contacting the sample with a probe illustrates unlinked targets T1 and T2 . Part B illustrates a that detects the presence of the second target nucleic acid ; e. mixture of linked T1 and T2 and unlinked T1 and T2. Part separating the sample into a plurality of spatially isolated C illustrates different spacings between T1 and T2 . partitions ; f. within the partitions , conducting an amplifica - FIG . 11 illustrates a flowchart for analyzing methylation tion reaction suitable for detecting the first and second target 20 burden . nucleic acids; g. determining the number of partitions that FIG . 12 illustrates an embodiment of a methylation bur comprise both the first and second target nucleic acids ; h . den assay. comparing the number obtained in step g with a number of FIGS . 13 and 14 illustrate information that can be con partitions comprising first or second target nucleic acids ; and sidered when selecting a restriction enzyme. i. from the comparison in step h , determining the degree of 25 FIGS . 15A and 15B illustrate assay information that can methylation of CpG islands within the genomic DNA be entered into a database . FIG . 15A discloses SEQ ID NOS: sample . In some embodiments , the genomic DNA is human 26 - 28 , respectively , in order of appearance . FIG . 15B dis genomic DNA . In some embodiments , the genomic DNA closes SEQ ID NOS : 26 - 28 and 46 - 47 , respectively , in order comprises maternal and fetal DNA . In some embodiments , of appearance . the genomic DNA comprises fetal DNA . 30 FIG . 16 illustrates an example of a workflow for a ddPCR experiment. INCORPORATION BY REFERENCE FIG . 17 illustrates maximum extension in droplet genera tion . All publications, patents , and patent applications men FIG . 18 illustrates maximum extension as a function of tioned in this specification are herein incorporated by ref- 35 sample flow rate . erence to the same extent as if each individual publication , FIG . 19 depicts droplet properties of undigested samples patent, or patent application was specifically and individu 1 - 10 and digested samples 11 -20 . ally indicated to be incorporated by reference . FIGS. 20A and 20B illustrate drift of CNV values of BRIEF DESCRIPTION OF THE DRAWINGS 40 stored , digested DNA . FIG . 21 illustrates high variability in the copy number of The novel features of the invention are set forth with the amylase gene among individuals . particularity in the appended claims. A better understanding FIG . 22 illustrates copy number variation for CCL3L1. of the features and advantages of the present invention will FIGS . 23A and 23B illustrate copy number variation for be obtained by reference to the following detailed descrip - 45 the MRGPRX1. tion that sets forth illustrative embodiments , in which the FIGS. 24A and 24B illustrate copy number variation for principles of the invention are utilized , and the accompany . CYP2D6 . ing drawings of which : FIG . 25 illustrates an optimal common annealing tem FIG . 1 illustrates a flowchart for estimating the copy perature identified using a simple gradient plate . number of a target sequence . 50 FIG . 26 illustrates an optimal common annealing tem FIG . 2 illustrates an example where two target sequences perature identified using a simple gradient plate . are on a maternal chromosome and an example where one FIG . 27 illustrates SM1 and SMN2 copy number . target sequence is on a maternal chromosome and one is on FIG . 28 illustrates copy number variation for SMN1 and a paternal chromosome. SMN2. FIG . 3a illustrates a flowchart for determining the linkage 55 FIG . 29 illustrate analysis of HER2 . of a target sequence . FIG . 30 illustrates results of a linkage experiment. Six FIG . 3b illustrates an alternative workflow for determin - samples were subjected to a restriction digest or not, and ing the linkage of a target sequence . copy number of a target sequence was determined . The data FIG . 4 is a flowchart listing steps that may be performed suggest that samples 4 and 5 have two target sequences on in an exemplary method of haplotype analysis using ampli- 60 the same chromosome . fication performed in sample partitions , in accordance with FIG . 31 illustrates an analysis of fluorescence intensity in aspects of the present disclosure . a real- time PCR experiment after 25 cycles for MRGPRX1 FIG . 5 is a schematic view of selected aspects of an and RPP30 . exemplary system for performing the method of FIG . 4 , in FIG . 32 illustrates average fluorescence of triplication accordance with aspects of present disclosure . 65 measurements for 4 samples in a real - time PCR experiment . FIG . 6 is a schematic view of exemplary haplotypes that Samples suspected of having two targets on one chromo may be created by a pair of SNPs located on the same some and 0 on another (18853 and 19108 ) have higher US 10 , 167 ,509 B2 11 12 fluorescence than samples suspected of having one copy of FIG . 50 illustrates examples of genetic rearrangements a target on a chromosome ( trans - configuration ) ( 18507 and that can be analyzed with a collocation assay . BC106 ). FIG . 33 illustrates an analysis of fluorescence intensity in DETAILED DESCRIPTION OF THE a real - time PCR experiment after 28 cycles for MRGPRX1 5 INVENTION and RPP30 . Overview FIG . 34 illustrates a comparison of ddPCR concentration In general, provided herein are methods, compositions, and copy number variation . In samples 18853 and 19108 , and kits for analyzing nucleic acid sequence , e . g ., digital depressed CNV for uncut samples is observed at 28 cycles* *. 10 partitioning . For example , provided herein are methods , FIG . 35 illustrates average fluorescence of triplicate mea compositions , and kits for estimating the number of copies surements for 4 samples in a ddPCR experiment after 28 of a target nucleic acid sequence in a sample , e. g. , a genome. cycles. Samples suspected of having two targets on one Also provided herein are methods , compositions , and kits chromosome and 0 on another (18853 and 19108 ) have for determining linkage or haplotype information of one or higher fluorescence than samples suspected of havingring one 15 moremo target sequences in a sample , e .g . , a genome. The copy of a target on each of two chromosomes ( trans haplotyping information can be information regarding configuration ) ( 18507 and BC106 ) . whether or not multiple copies of one target sequence are on FIG . 36 illustrates average fluorescence of triplicate mea a single or multiple chromosomes . Using the concept of surements for 4 samples in a ddPCR experiment after 31, 34 , collocation of different targets within the same partition , it and 40 cycles . Samples suspected of having two targets on 20 can be practical to infer phase , i. e ., whether a particular one chromosome and 0 on another ( 18853 and 19108 ) have allele of one mutation or a SNP is physically linked to an higher fluorescence than samples suspected of having one allele of another mutation or a SNP . In another aspect , copy of a target on each of two chromosomes ( trans methods, compositions, and kits are provided herein for configuration ( 18507 and BC106 ) at cycle 31 but not cycle determining the extent of fragmentation or degradation in a 34 or 40 . 25 nucleic acid sample ( e . g ., a genomic DNA sample , RNA FIG . 37 illustrates an experimental setup for a long range sample , mRNA sample , DNA sample , CRNA sample , cDNA PCR experiment. Long - range PCR for scenarios A & B are sample, miRNA sample, siRNA sample ) , by, e . g . , digitally expected to fail because the distance between primers is too analyzing collocation signal. In another aspect, methods are great. However , PCR should work for scenario C containing provided herein for analyzing nucleic acid methylation mowe 30 status , alternative splicing , finding inversions, transloca a chromosome that has a deletion for MRGPRX1. Arrows 30 stattions , and deletions. indicate primers . Copy Number Variation Estimation FIG . 38 illustrates a long -range PCR experiment with 6 Copy number variation of one or more target sequences samples . Bands migrating at 1500 indicate PCR products can play a role in a number of diseases and disorders . One from MRGPRX1 internal primers, and bands migrating JUSTjust 2535 method to analyze copy number variation of a target below 3000 indicate PCR products from primers flanking sequence is through a digital analysis , such as digital PCR , the MRGPRX1 gene . or droplet digital PCR . However, digital analysis of copy FIG . 39 illustrates long -range PCR products and MRG number of a target sequence can underestimate the number PRX1 CNV results from six samples , further distinguishing of copies of a target nucleic acid sequence in a sample if the samples with one or two MRGPRX1 copies on the same 40 multiple copies of the target nucleic acid sequence are on the chromosome by restriction digest analysis . same polynucleotide in a sample . For example , in a digital FIG . 40 illustrates the percentage difference from digested PCR assay that has multiple compartments (e . g ., partitions, sample of the estimated copy number. spatially isolated regions ) , nucleic acids in a sample can be FIG . 41 is a schematic illustrating sequences recognized partitioned such that each compartment receives on average by FAM and VIC probes separated by 1K , 10K , or 100K 45 about 0 , 1 , 2, or several target polynucleotides . Each parti bases . tion can have, on average , less than 5 , 4 , 3 , 2 , or 1 copies of FIG . 42 illustrates fragments of nucleic acid . T1 and T2 a target nucleic acid per partition ( e . g . , droplet) . In some are target sequences . FIG . 42A illustrates a scenario in which cases , at least 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , T1 and T2 are always on separate nucleic acids ( total 60 , 70 , 80 , 90 , 100 , 125 , 150 , 175 , or 200 partitions ( e . g . , fragmentation ). FIG . 42B illustrates a scenario in which T1 50 dropletsnumber )of have compartments zero copies that of contain a target a polynucleotidenucleic acid , Thecan and T2 are always linked on a nucleic acid ( no fragmenta be enumerated . However , if two copies of a target nucleic tion ) . FIG . 42C illustrates a scenario in which T1 and T2 are acid sequence are on a single polynucleotide a compartment linked on some nucleic acids and are also on separate nucleic containing that polynucleotide can be counted as having acids (partial fragmentation ). only one target sequence . FIG . 43 illustrates a DNA quality assessment. Provided herein are methods for physically separating FIG . 44 illustrates contamination by large DNA in plasma target nucleic acids sequences . Often , the methods provided affects CNV values . herein avoid underestimating copy numbers of a target FIG . 45 illustrates restriction enzymes for titration for sequence due to the presence of multiple copies of the target CNV analysis. 60 sequence on a single polynucleotide. FIG . 1 illustrates an FIG . 46 illustrates restriction enzyme titration for CNV overview of an embodiment of a method of copy number analysis . estimation ( 101) ; this figure and the remaining figures FIG . 47 illustrates development of protocols to optimize provided in this disclosure are for illustrative purposes only challenging ddPCR CNV assays . and are not intended to limit the invention . The steps in FIG . FIG . 48 illustrates measurement of restriction enzyme 65 1 can be performed in any suitable order and combination activity with ddPCR . and can be united with any other steps of the present FIG . 49 illustrates haplotyping through collocation . disclosure. A first sample of polynucleotides is obtained US 10 , 167 ,509 B2 13 14 ( 121 ); the first sample can be , e . g ., a genomic DNA sample of genome copies of a target nucleic acid sequence in a The target nucleic acid sequences in the first sample can be sample . In another embodiment, estimating the copy number physically separated ( e . g . , by contacting the first sample can comprise comparing the number of partitions compris with one or more restriction enzymes ) ( 141) . The first ing the target sequence to the number of partitions compris sample can be separated into a plurality of partitions ( 161 ) . 5 ing the reference nucleic acid sequence . In another embodi The number of partitions with the target sequence can be ment, a CNV estimate is determined by a ratio of the enumerated ( 181 ) . The copy number of the target can then concentration of target nucleic acid sequence to a reference be estimated (201 ) . sequence . The target nucleic acids can be identical; or, in other In another embodiment, the method further comprises the cases , the target nucleic acids can be different. In some 10 step of analyzing a second sample , wherein the second cases , the target nucleic acids are located within the same sample and the first sample are derived from the same gene. In some cases , the target nucleic acids are each located sample ( e . g . , a nucleic acid sample is split to the first sample in a different copy ( identical or near identical copy ) of a and the second sample ) . The method can further comprise gene . In still other cases , the target sequences are located not contacting the second sample with one or more restric within introns, or in a region between genes . Sometimes , one 15 tion enzymes . In some cases , the method further comprises target sequence is located in a gene ; and the second target separating the second sample into a plurality of partitions. sequence is located outside of the gene . In some cases , a The method can further comprise enumerating the number target sequence is located within an exon . of partitions of the second sample that comprise the target In some cases , a genome comprises one target sequence . sequence . In another embodiment, the method further com In some cases , a genome comprises two or more target 20 prises enumerating the number of partitions of the second sequences. When a genome comprises two or more target sample that comprise a reference sequence . In another sequences , the target sequences can be about, or more than embodiment, the method comprises estimating the copy about 50 % , 55 % , 60 % , 65 % , 70 % , 75 % , 80 % , 85 % , 90 % , number of the target sequence in the second sample . In 95 % , or 100 % identical . another embodiment, estimating the copy number of the Physically separating two target sequences can comprise 25 target sequence in the second sample comprises comparing physically separating the target sequences by cleaving a the number of partitions from the second sample with the specific site on the nucleic acid sequence . In some cases, the target sequence and the number of partitions from the second physically separating target nucleic acid sequences can sample with the reference sequence . comprise contacting the first sample with one or more The copy number of the target sequence from the first restriction enzymes. Physically separating the target nucleic 30 sample and the copy number of the target sequence in the acid sequences can comprise digesting a polynucleotide at a second sample can be compared to determine whether the site located between the target nucleic acid sequences. In copy number of the target sequence in the second sample some cases , the target nucleic acid sequences are each was underestimated . The degree to which the copy number located within a gene . In some cases , the site that is targeted was underestimated may be indicative of whether interro for digestion is located between the two genes . In some 35 gated copies were all on one chromosome or if at least one cases, the site selected for digestion is located in a gene ; and , copy was on one homologous chromosome and at least one in some cases, the gene is the same gene as the gene which copy was on the other homologous chromosome. Values contains the target sequences . In other cases, the site selected closer to one per diploid genome may indicate the first case , for digestion is located in a different gene from that of the while values closer to two may indicate the second case . target sequence . In some cases , a target sequence and the site 40 Additional methods of determining copy number differ targeted for digestion are located in the same gene ; and the ences by amplification are described , e . g . , in U . S . Patent target sequence is located upstream of the site targeted for Application Publication No . 20100203538 . Methods for digestion . In other cases, a target sequence and the site determining copy number variation are described in U . S . targeted for digestion are located in the same gene ; but the Pat. No. 6 , 180 , 349 and Taylor et al. ( 2008 ) PLoS One 3 ( 9 ) : target sequence is located downstream of the site targeted for 45 e3179 . digestion . In some cases , target nucleic acids can be sepa - When employing methods described herein , a variety of rated by treatment of a nucleic acid sample with one ormore features can be considered : restriction enzymes . In some cases , target nucleic acids can Sample preparation : Properties of nucleic acids to be be separated by shearing . In some cases , target nucleic acids considered can include secondary structure , amplicon can be separated by sonication . 50 length , and degree of fragmentation . An assay can be Following the physical separation step ( e . g ., digesting performed to determine the degree of fragmentation of a with one or more restriction enzymes ), the sample can be nucleic acid sample . If the degree of fragmentation of a partitioned into multiple partitions. Each of the plurality of nucleic acid sample is too high , the sample can be discarded partitions can comprise about 0 , 1 , 2 or several target from an analysis . Steps can be taken to eliminate secondary polynucleotides . Each partition can have , on average , less 55 structure of nucleic acids in a sample . Secondary structure of than 5 , 4 , 3 , 2 , or 1 copies of a target nucleic acid per a nucleic acid can be modulated , for example , by regulating partition ( e . g ., droplet) . In some cases, at least 0 , 1 , 2 , 3 , 4 , the temperature of a sample or by adding an additive to a 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , 125 , 150 , sample . It can be determined whether a potential amplicon 175 , or 200 droplets have zero copies of a target nucleic acid . is too large to be efficiently amplified . In one embodiment, Often , target nucleic acid is amplified in the partitions . In 60 a Bioanalyzer is used to assess nucleic acid ( e . g . , DNA ) some cases, the amplification comprises use of one or more fragmentation . In another embodiment, size exclusion chro TaqMan probes. matography is used to assess nucleic acid ( e . g ., DNA ) In another embodiment, the method further comprises the fragmentation . step of enumerating the number of partitions comprising a Dynamic range : Increasing the number of partitions or reference nucleic acid sequence . A reference nucleic acid 65 spatially isolated regions can increase the dynamic range of sequence can be known to be present in a certain number of a method . Template nucleic acid can be diluted into a copies per genome and can be used to estimate the number dynamic range . US 10 , 167 , 509 B2 15 16 Accuracy : If a homogenous sample is used , CNV values ence nucleic acid sequence . The mean intensity can corre can be expected to fall on integer values (self -referencing ) . spond to the number of starting copies of the target. If Drop -out amplification can cause inaccurate concentration multiple targets are linked along a single polynucleotide measurements and , therefore, inaccurate CNV determina strand, the intensity in the partition ( e . g . , droplet ) that tions . Additives ( e . g . , DMSO ) can be added in GC -rich 5 captures this strand may be higher than that of a partition assays . ( e . g . , droplet) that captures a strand with only a single copy Multiplexing : An experiment can be multiplexed . For of the target. Excess presence of positive droplets with example , two colors can be used in the methods provided higher mean amplitudes can suggest the presence of a herein : FAM : BHQ and NFQ -MGB assays ; VIC : NFQ - haplotype with multiple CNV copies . Conversely, presence MGB , TAMRA . HEX : BHQ . 5 ' and 3 ' labeling can be used , 10 of positive droplets with only low mean amplitudes can and an internal labeled dye can be used . In some cases , the suggest that only haplotypes with single CNV copies are number of colors used in the methods provided herein is present in the sample . In another embodiment , the number of greater than two, e . g ., greater than 3 , 4 , 5 , 6 , 7 , 8 , 9 , or 10 cycles used to estimate CNV can be optimized based on the colors . size of the partitions and the amount of reagent in the Precision : Increased precision can be accomplished in 15 partitions . For example, smaller partitions with lower several ways . In some cases , increasing the number of amounts of reagent may require fewer amplification cycles droplets in a dPCR experiment can increase the ability to than larger partitions that would be expected to have higher resolve small differences in concentration between target amounts of reagent. and reference nucleic acids. Software can enable “ metawell ” The method can be useful because it can be used to analysis by pooling replicates from individual wells . In 20 analyze even target copies that are near each other on the some cases , the methods provided herein enable detection of polynucleotide , e . g . , less than about 10 , 9 , 8 , 7 , 6 , 5 , 4 , 5 , 2 , a difference in copy number that is less than 30 % , 20 % , 1 , 0 . 7 0 .5 , 0 . 3 , 0 . 2 , 0 . 1 , 0 . 05 , or 0 .01 megabases apart; or that 15 % , 14 % , 13 % , 12 % , 11 % , 10 % , 9 % , 8 % , 7 % , 6 % , 5 % , are very near each other on the polynucleotide, e .g ., less than 4 % , 3 % , 2 % , or 1 % . about 10 , 9 , 8 , 7 , 6 , 5 , 4 , 3 , 2 , or 1 kilobase apart . In some Assay landscape : Target gene assays described herein can 25 cases, the method is useful for analyzing target copies that be combined with commercially available or custom are very close to each other on the polynucleotide , e . g . , designed target gene assays . within about 1 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , 200 , Copy number variations described herein can involve the 300 , 400 , 500 , 600 , 700 , 800 , 900 , or 950 base pairs (bp ' s ) loss or gain of nucleic acid sequence . Copy number varia - apart . In some cases , the method is useful for analyzing tions can be inherited or can be caused by a de novo 30 target copies that are separated by zero ( 0 ) base pairs . In mutation . A CNV can be in one or more different classes . some cases, the method can be applied to identical , near See, e . g . Redon et al . ( 2006 ) Global variation in copy identical, and completely different targets . number in the human genome. Nature 444 pp . 444 -454 . A Additional embodiments of methods for estimating the CNV can result from a simple de novo deletion , from a copy number of one or more target sequences are described simple de novo duplication , or from both a deletion and 35 herein . duplication . A CNV can result from combinations of multi Determining Linkage of Target Sequences allelic variants. A CNV can be a complex CNV with de novo Methods described herein can indicate whether two or gain . A CNV can include about, or more than about 0 , 1 , 2 , more target sequences are linked on a polynucleotide ( e . g . , 3 , 4 , 5 , 6 , 7 , 8 , 9 , or 10 contiguous genes. A CNV can include the methods can be used to determine the linkage of target about 1 to about 10 , about 1 to about 5 , about 1 to about 4 , 40 sequences ) . In one embodiment, a method is provided about 1 to about 3 , about 1 to about 2 , about 0 to about 10 , comprising physically separating target sequence copies about 0 to about 5 , or about 0 to about 2 contiguous genes. ( e . g ., by using one or more restriction enzymes ) so that the A copy number variation can involve a gain or a loss of copies can be assorted independently into partitions for a about, or more than about, 100 , 500 , 1000 , 2000 , 3000 , digital readout, and using a readout of undigested DNA 4000 , 5000 , 6000 , 7000 , 8000 , 9000 , 10 ,000 , 20 , 000 , 45 together with a readout from digested DNA to estimate how 30 ,000 , 40 ,000 , 50 ,000 , 60, 000 , 70, 000 , 80, 000 , 90, 000 , the target copies are linked . For example ,methods described 100 , 000 , 200 , 000 , 500 ,000 , 750 , 000 , 1 million , 5 million , or herein can be used to determine if the target sequences are 10 million base pairs . In some cases , a copy number varia - present on the same chromosome or if they are on different tion can involve the gain or loss of about 1 , 000 to about chromosomes ( see e . g . , FIG . 2 ) . FIG . 2 illustrates a nucleus 10 ,000 ,000 , about 10 ,000 to about 10 ,000 , 000 , about 100 , 50 (left ) in which a maternal chromosome comprises two 000 to about 10 ,000 ,000 , about 1 ,000 to about 100 ,000 , or copies of a target sequence , but the corresponding paternal about 1, 000 to about 10 , 000 base -pairs of nucleic acid chromosome comprises no copies; in the nucleus on the sequence . A copy number variation can be a deletion , right, a maternal chromosome and the corresponding pater insertion , or duplication of nucleic acid sequence . In some n al chromosome each comprise one copy of the target. cases , a copy number variation can be a tandem duplication . 55 FIG . 3a illustrates a workflow of an embodiment of a In another embodiment, CNV haplotypes can be esti- method , without being restricted to any order of the steps . In mated from fluorescent signals generated by real- time PCR one aspect , a method (320 ) is provided comprising a ) or ddPCR of partitioned samples . Before the late stages of separating a sample comprising a plurality of polynucle a real- time PCR or ddPCR experiment, when reagents can otides into at least two subsamples ( 322 ) ; b ) physically become limiting, a partition with a higher copy number of a 60 separating physically linked target sequences in a first sub target sequence can have a higher signal than a partition with sample ( 324 ) ; c ) separating the first subsample into a first set a lower copy number of the target sequence . In one embodi- of a plurality of partitions (326 ); d ) estimating the copy ment, a sample ( e . g . , a subsample of a sample used in a number of a target sequence in the first subsample ( 328 ) ; e ) linkage experiment) can be partitioned , and PCR can be separating a second subsample into a second set of a performed on the partitions (e . g. , droplets ). The mean fluo - 65 plurality of partitions ( 330 ); f) estimating the copy number rescence intensity of partitions can be determined as they of the target sequence in the second subsample ( 332 ) ; g ) undergo exponential amplification for a target and /or refer comparing the estimated copy number of the target sequence US 10 , 167 , 509 B2 17 18 in the first subsample to the estimated copy number of the Often , the greater the difference between copy numbers in target sequence in the second subsample to determine the the first subsample and the second subsample , the more haplotypes of the target sequence in the sample ( 334 ) . likely it is that one of the chromosomes does not carry a copy In one embodiment , physically separating physically of the target . linked target sequences in the first subsample comprises 5 FIG . 3b illustrates a workflow of another embodiment of contacting the first subsample with one or more restriction a method , without being restricted to any order of the steps . enzymes . In another embodiment, contacting the sample In one aspect, a method (336 ) is provided comprising , a ) comprising polynucleotides with one or more restriction obtaining a sample of polynucleotides ( 338 ) comprising a enzymes comprises digesting nucleic acid sequence between plurality of polynucleotides into at least two subsamples at least two target nucleic acid sequences . In some cases , 10 ( 340 ) ; b ) pre - amplifying target sequence in the first sub physically linked target nucleic acids can be separated by sample with short cycle PCR ( 342 ); c ) separating the first contacting a nucleic acid sample with one or more restriction subsample into a first set of a plurality of partitions (344 ) ; d ) enzymes . In some cases , physically linked target nucleic estimating the copy number of a target sequence in the first acids can be separated by shearing . In some cases, physi - subsample ( 346 ); e ) taking a second subsample that has not cally linked target nucleic acids can be separated by soni- 15 been pre - amplified ( 348 ) into a second set of a plurality of cation . partitions ( 350 ) ; f ) estimating the copy number of the target In another embodiment, each of the plurality of partitions sequence in the second subsample ( 352 ) ; g ) comparing the of the first and second subsample comprise about 0 , 1 , 2 or estimated copy number of the target sequence in the first several target polynucleotides . Each partition can have, on subsample to the estimated copy number of the target average , less than 5 , 4 , 3 , 2 , or 1 copies of a target nucleic 20 sequence in the second subsample to determine the linkage acid per partition ( e . g . , droplet ) . In some cases , at least 0 , 1 , of the target sequence in the sample ( 354 ) . 2 , 3 , 4 , 5 , 6 , 7 , 8, 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , In some cases , the preamplification used to separate 125 , 150, 175 , or 200 partitions ( e . g . , droplets ) have zero targets is Specific Target Amplification (STA ) (Qin et al. copies of a target nucleic acid . (2008 ) Nucleic Acids Research 36 e16 ) , which entails per In another embodiment, target sequences are amplified in 25 forming a short pre - amplification step to generate separate the partitions . unlinked amplicons for the target nucleic acids . In another embodiment, estimating the copy number of In one embodiment, pre -amplifying target sequence in the the target sequence in the first subsample comprises enu - first subsample comprises contacting the first subsample merating the number of partitions of the first subsample with a reaction mixture comprising DNA polymerase , comprising the target sequence . In another embodiment , 30 nucleotides , and primers specific to the target sequence and estimating the copy number of the target sequence in the first amplifying the target sequence for a limited number of subsample comprises enumerating the number of partitions cycles. Optionally , the method also comprises using primers of the first subsample comprising a reference nucleic acid for a reference sequence and , optionally , amplifying the sequence . In another embodiment, estimating the copy num reference sequence for a limited number of cycles . In some ber of the target sequence in the first subsample comprises 35 embodiments , the number for the number of cycles can comparing the number of partitions of the first subsample range from 4 - 25 cycles . In some cases , the number of cycles comprising the target sequence to the number of partitions is less than 25 , 24 , 23 , 22 , 21, 20 , 19 , 18 , 17 , 16 , 15, 14 , 13 , comprising the reference nucleic acid sequence in the first 12 , 11 , 10 , 9 , 8 , 7 , 6 , 5 , or 4 cycles . The number of cycles subsample . may vary depending on the droplet size and the quantity of In another embodiment , the method further comprises not 40 available reagents . For example , few cycles may be neces contacting the second subsample with one or more restric - sary for partitions (e . g ., droplets ) that are of smaller size . tion enzymes. In another embodiment, estimating the copy T he pre -amplified first subsample may be partitioned into number of the target sequence in the second subsample multiple partitions , each partition comprising on average comprises enumerating the number of partitions of the less than one target polynucleotide . Each partition can have , second subsample that comprise the target sequence . In 45 on average , less than 5 , 4 , 3 , 2 , or 1 copies of a target nucleic another embodiment, estimating the copy number of the acid per partition (e . g ., droplet ) . In some cases , at least 0 , 1 , target sequence in the second subsample comprises enumer - 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20, 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , ating the number of partitions of the second subsample that 125 , 150 , 175 , or 200 partitions ( e . g . , droplets ) have zero comprise a reference sequence . In another embodiment, copies of a target nucleic acid . estimating the copy number of the target sequence in the 50 In another embodiment, estimating the copy number of second subsample comprises comparing the number of the target sequence in the first subsample comprises enu partitions from the second subsample with the target merating the number of partitions of the first subsample sequence and the number of partitions from the second comprising a reference nucleic acid sequence . In another subsample with the reference sequence . In another embodi embodiment, estimating the copy number of the target ment, the reference sequence for the first and second sub - 55 sequence in the first subsample comprises comparing the sample is the same sequence or a different sequence . number of partitions of the first subsample comprising the In another embodiment, determining haplotypes of the target sequence to the number of partitions comprising the target sequence comprises comparing the estimated copy reference nucleic acid sequence in the first subsample . number of the target sequence in the first subsample to the In another embodiment, the method further comprises not estimated copy number of the target sequence in the second 60 subjecting the second to a pre - amplification step . In another subsample . In another embodiment, the haplotypes com - embodiment, the second subsample is partitioned into mul prises two copies of the target sequence on a single poly - tiple partitions, each partition containing on average about 0 , nucleotide and no copies on the homologous polynucleotide. 1 , 2 , or several target polynucleotides. Each partition can In another embodiment, the haplotyping comprises one copy have , on average, less than 5 , 4 , 3 , 2 , or 1 copies of a target of a target sequence on a first polynucleotide and a second 65 nucleic acid per partition ( e . g . , droplet ). In some cases, at copy of the target sequence on a second (possibly homolo least 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , gous ) polynucleotide . 90 , 100 , 125, 150 , 175 , or 200 partitions ( e .g ., droplets ) have US 10 , 167 ,509 B2 19 20 zero copies of a target nucleic acid . In another embodiment, of probes with two different labels ( e . g ., VIC and FAM ) to estimating the copy number of the target sequence in the detect the same target sequence. For example , a nucleic acid second subsample comprises enumerating the number of sequence can be separated into a plurality of spatially partitions of the second subsample that comprise the target isolated partitions, the target sequence can be amplified in sequence . In another embodiment , estimating the copy num - 5 the partitions, and the two different probes can be used to ber of the target sequence in the second subsample com - detect the target sequence . The nucleic acid sample can be prises enumerating the number of partitions of the second partitioned such that on average about 0 , 1 , 2 , or several subsample that comprise a reference sequence . In another target polynucleotides are in each partition . Each partition embodiment, estimating the copy number of the target can have, on average , less than 5 , 4 , 3 , 2 , or 1 copies of a sequence in the second subsample comprises comparing the 10 target nucleic acid per partition ( e . g . , droplet ) . In some number of partitions from the second subsample with the cases , at least 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , target sequence and the number of partitions from the second 60 , 70 , 80 , 90 , 100 , 125 , 150 , 175 , or 200 partitions ( e . g . , subsample with the reference sequence . In another embodi droplets ) have zero copies of a target nucleic acid . ment, the reference sequence for the first and second sub If a partition comprises two targets linked on a polynucle sample is the same sequence or a different sequence. 15 otide , the partition can have signal for VIC only (VICNIC ), In another embodiment, determining haplotypes of the FAM only (FAM /FAM ) , or VIC and FAM . Overabundance target sequence comprises comparing the estimated copy of partitions with VIC and FAM signal in a partition number of the target sequence in the first subsample to the compared to what is expected from random dispersion of estimated copy number of the target sequence in the second FAM and VIC targets can indicate that the sample contained subsample . In another embodiment, the haplotypes comprise 20 polynucleotides that have at least two targets linked on a two copies of the target sequence on a single polynucleotide polynucleotide . Lack of overabundance of partitions with and no copies on the homologous polynucleotide . In another both VIC and FAM signal can indicate that two target embodiment, the haplotypes comprise one copy of a target nucleic acid sequences are not linked in a sample . sequence on a first polynucleotide and a second copy of the Collocalization target sequence on a second (possibly homologous ) poly - 25 Sample partitioning and the ability to analyze multiple nucleotide . targets in a partition can allows one to detect when targets The greater the difference between copy numbers in the are spatially clustered together in the sample . This can be first subsample and the second subsample , the more likely it done by assessing whether the number of partitions with a is that one of the chromosomes does not carry a copy of the particular combination of targets is in statistical excess target . 30 compared to what would be expected if the targets were In yet another aspect, this disclosure provides a method of randomly distributed in the partitions . The extent of over identifying a plurality of target nucleic acids as being present abundance of such partitions can be used to estimate the on the same polynucleotide comprising , a . separating a concentration of the combination of targets . sample comprising a plurality of polynucleotides into at For example , one can measure two targets: A and B using least two subsamples , wherein the polynucleotides comprise 35 a digital PCR ( e . g ., ddPCR ) . For example , there would be a first and second target nucleic acid ; b . contacting the first four types of droplets : droplets negative for both targets , subsample with an agent capable of physically separating droplets positive for A , droplets positive for B and droplets the first target nucleic acid from the second target nucleic positive for both . Under random distribution the number of acid if they are present on the same polynucleotide; c . double positive droplets should be close to ( total number of following step b , separating the first subsample into a first 40 droplets ) * ( fraction of droplets with at least B ) * ( fraction of set of partitions; d . determining the number of partitions in droplets with at least A ) . If the number of double positive the first set of partitions that comprise the target nucleic acid ; droplets significantly exceeds the expectation , an inference e . separating a second subsample into a second set of can be made that the two targets are in proximity to each partitions; f . determining the number of partitions in the other in the sample . This result can mean that target A and second set of partitions that comprise a target nucleic acid ; 45 B are physically linked by virtue of, e . g . , being on the same and g . comparing the value obtained in step d with the value polynucleotide , that they are part of the same /nucleic obtained in step f to determine the whether the first and acid complex , that they are part of the same exosome, or that second target nucleic acid are present within the same they are part of the same cell. polynucleotide. The presence of a particular target in a partition may be The sample can be of sufficiently high molecular weight 50 assessed by using a fluorophore specific to that target as part so that if a pair of targets is on the same chromosome, they of a probe - based TaqMan assay scheme. For example , when can be mostly linked in solution as well . If the nucleic acid measuring two targets A and B , one can use FAM for A and ( e . g . , DNA ) in a sample is completely unfragmented , the VIC for B . In some embodiments different targets can be readout can be 0 , 1 , or 2 copies of the target ( integers ). assessed with the same fluorophore or intercalating dye However, because nucleic acid ( e . g . , DNA ) can be partially 55 using endpoint fluorescence to distinguish partitions con degraded , copy numbers can span non - integer values , as taining A from those containing B from those containing A well as numbers greater than 2 . Another step can be taken to and B . assess nucleic acid fragmentation of a sample , e . g ., by using Alternative Splicing gels , a Bioanalyzer, size exclusion chromatography, or a Alternative splicing is a pervasive feature of eukaryotic digital PCR co -location method (milepost assay ) . If a 60 biology . This phenomenon occurs when different transcripts nucleic acid sample is found to be overly fragmented , this are constructed from the same genomic region , by employ decreases the likelihood information can be gleaned about ing exons in different configurations often depending on linkage . cell - type , cell state , or external signals . For example , a This approach can used to determine smaller copy number particular gene may have two exons: exon A and exon B and states, e. g ., 2 , 3 , 4 . 65 three possible transcripts : Transcript 1 containing A only , In another embodiment, a method of linkage determina - Transcript 2 containing B only , Transcript 3 containing A tion of a target nucleic acid sequence is provided making use and B . Being able to specify which transcripts are actually US 10 , 167, 509 B2 21 expressed and their levels can have multiple implications for ( region suspected of having tandem copies of the target ) can our understanding of basic biology, disease , diagnostics, and be used . In another embodiment, DNA polynucleotides can therapy . be partitioned into droplets . Partitioning DNA polynucle Collocalization can be used to detect and quantify alter - otides into droplets can be beneficial, as it can permit native splicing by measuring concentrations of RNA mol- 5 detection of two types of DNA species : a ) the DNA segment ecules that possess two or more exons of interest . For with tandemly arranged targets and b ) a DNA segment with example , one assay may target one exon , while another the deletion of the target. If a similar reaction is performed assay may target another exon . Statistically significant col- in bulk ( e .g ., without partitioning polynucleotides ) , the localization of the two assays may suggest the presence of smaller PCR product representing the DNA with the deleted RNA molecules containing both exons . Additional exons 10 target can outcompete the PCR product representing the and locations on potential transcript molecules can be tar DNA segment with tandemly arranged target sequences . As geted and assessed in this fashion . a result , only one PCR product can be generated . The size Rearrangements difference of these PCR products can be estimated using, In another embodiment, two assays (amplicons ) can be e . g . , gel electrophoresis or a Bioanalyzer . constructed that are normally far apart from each other on a 15 In some cases, DNA with tandemly arranged copies of a polynucleotide ( e . g . , two genes separated by millions of by target nucleic acid sequence can be too large to be success on a chromosome) . One assay is on one channel (e . g ., FAM ), fully PCR amplified (e . g. , > 20 KB in size ). In these cases, the other on another channel ( e . g . , VIC ). In a digital ampli often only the smaller PCR product is amplified , represent fication method , e . g ., dPCR or ddPCR , normally , co -local - ing the DNA segment with the deleted target nucleic acid ization in the same partition ( e . g . , droplet) should not be 20 sequence . If the target nucleic acid sequence is too long to observed above the baseline statistical expectation . If col- permit generation of a PCR product, PCR can be performed localization of FAM and VIC occurs ( e . g ., as measured a on a chromosome that contains a deletion for the target linkage analysis described herein ) , that can be an indication nucleic acid sequence . In this case, a product can be gen that the two loci were brought in the vicinity of each other erated if the PCR is over a region deleted for the sequence , on the genome. This result can indicate an inversion or a 25 but a product may not be generated if the target sequence is translocation depending on where the loci are normally . The present because the distance between the primers can be too assays can also be multiplexed on the same channel if their great. endpoint fluorescences are distinct enough . More than two In some embodiments , long range PCR can be used to assays can be multiplexed to catch multiple inversion / resolve linkage or determine copy number estimation . In translocation events or to account for the fact that a given 30 some cases , long range PCR is used in conjunction with the translocation may present with different breakpoints . methods provided herein . In some cases , genotypes of Detection of rearrangements can be used for diagnosing parents or other relatives can be used (alone or in combi and prognosing a variety of conditions, including cancer and nation with the methods provided herein ) to infer the copy fetal defects . Detection of rearrangements can be used to number state of the target individual. select one or more therapeutic treatments for a subject. For 35 In some cases , the method may comprise cloning a example , detection of translocation t ( 9 :22 ) (q34 . 1 ;q11 . 2 ) can chromosomal region using recombinant DNA technology lead to generation of a BCR -ABL fusion protein , associated and sequencing individual copies of the chromosomal with chronic myelogenous leukemia (CML ) . CML patients region . In some embodiments , the method comprises ( a ) that express BCR - ABL can be treated with imatinib using next- generation sequencing , to identify information (Gleevec ) . 40 related to polymorphisms that are closely spaced ( e . g . , less Rearrangements that can be detected with methods than about 2000 nucleotides , less than about 1000 nucleo described here include, e . g ., inversions , translocations , tides , less than about 500 nucleotides , less than about 200 duplications , or deletions ( see e . g . , FIG . 50 ) . ??nucleotides " , or less than about 100 nucleotides apart ) and are Confirming Linkage (Haplotype ) Information Generated present together in the same sequencing read and ( b ) using by Digital Experiment 45 a method provided herein to identify information related to In another embodiment, linkage information determined polymorphisms that are further apart ( e . g . , greater than about using digital analysis and restriction enzyme digest of 5 , 10 , 50 , 100 , 150 , 200 , 250 , 300 , 400 , 450 , 500 , 550 ,600 , samples can be confirmed by one or more other assays . In 650 , 700 , 750 , 800 , 850 , 900 , 950 , 1000 , 1250 , 1500 , 1750 , one embodiment, signal generation during a real- time PCR 2000 , 2500 , 3000 , 3500 , 4000 , 4500 , or 5000 nucleotides or ddPCR experiment of a partitioned sample as described 50 apart ). In some embodiments , the method comprises using a herein can be used to confirm linkage information . In one method provided herein to identify information related to embodiment, a sample ( e . g ., a subsample of a sample used polymorphisms that are further apart ( e . g . , greater than about in a linkage experiment ) can be partitioned , and PCR can be 5 , 10 , 50 , 100 , 150 , 200 , 250 , 300 , 400 , 450 , 500 , 550 , 600 , performed on the partitions ( e . g ., droplets ) . The mean fluo - 650 , 700 , 750 , 800 , 850 , 900 , 950 , 1000 , 1250 , 1500 , 1750 , rescence intensity of partitions can be determined as they 55 2000 , 2500 , 3000 , 3500 , 4000 , 4500 , or 5000 nucleotides undergo exponential amplification for a target and / or refer - apart) . In some cases , the method comprises using a method ence nucleic acid sequence . Partitions with a polynucleotide provided herein in conjunction with using the genotype with multiple ( e . g . , 2 ) linked copies of a target nucleic acid information for the parents or other close relatives of a sequence can have higher fluorescence intensity than drop - subject to infer phase information using Mendelian rules of lets with only one copy of a target nucleic acid sequence . 60 inheritance . However , this approach cannot phase every In another embodiment, long range PCR can be used to polymorphism . Some embodiments comprise a method pro confirm linkage information . For example , PCR can be used vided herein used in conjunction with statistical approaches to detect the presence of two tandemly arranged copies of a to linkage determination . target nucleic acid sequence on the same chromosome Haplotypes ( cis - configuration ) , and it can be used to detect deletion of 65 A haplotype can refer to two or more alleles that are the target nucleic acid sequence on another chromosome. In present together or linked on a single chromosome ( e . g . , on one embodiment , primers outside of the amplified region the same chromosome copy ) and /or on the same piece of US 10 , 167 , 509 B2 23 24 nucleic acid and /or genetic material. Phasing can be the (IV ) exemplary haplotype analysis with amplification in process of determining whether or not alleles exist together droplets , and ( V ) selected embodiments . on the same chromosome. Determination of which alleles in a genome are linked can be useful for considering how genes I. Definitions are inherited . The present disclosure provides a system , 5 including method and apparatus, for haplotype analysis by Technical terms used in this disclosure have the meanings amplification of a partitioned sample . that are commonly recognized by those skilled in the art . FIG . 4 shows a flowchart listing steps that can be per However, the following terms may have additional mean ings, as described below . formed in an exemplary method ( 20 ) of haplotype analysisand . 10 Sequence variation can be any divergence in genome The steps can be performed in any suitable order and sequence found among members of a population or between / combination and can be united with any other steps of the among copies of a chromosome type of a subject and / or a present disclosure. A sample can be obtained (22 ), generally sample . Sequence variation also may be termed polymor from a subject with a diploid or higher complement of phism . chromosomes. The sample can be partitioned ( 24 ). Parti - 151 Locus can be a specific region of a genome, generally a tioning the sample may include partitioning or dividing an relatively short region of less than about one kilobase or less aqueous phase that includes nucleic acid of the sample . A than about one -hundred nucleotides. pair (or more ) of polymorphic loci may be amplified ( 26 ). Polymorphic locus can be a locus at which sequence Allele - specific amplification data may be collected for each variation exists in the population and /or exists in a subject polymorphic locus (28 ). Amplification data for the polymor- 20 and /or a sample . A polymorphic locus can be generated by phic loci and from the samevolumes may be correlated ( 30 ) . two or more distinct sequences coexisting at the same A haplotype for the polymorphic loci can be selected ( 32 ) . location of the genome. The distinct sequences may differ A method of haplotype analysis may be performed with a from one another by one or more nucleotide substitutions, a sample obtained from a subject, such as a person . An deletion / insertion , and /or a duplication of any number of aqueous phase containing nucleic acid of the sample may be 25 nucleotides , generally a relatively small number of nucleo partitioned into a plurality of discrete volumes , such as tides, such as less than about 50 , 10 , or 5 nucleotides , among droplets . Each volume may contain on average less than others. In exemplary embodiments , a polymorphic locus is about one genome equivalent of the nucleic acid , such that created by a single nucleotide polymorphism (a “ SNP ” ) , each volume contains on average less that about one copy of namely , a single nucleotide position that varies within the an allele of a first polymorphic locus and an allele of a linked 30 population . second polymorphic locus . At least one allele sequence from Allele can be one of the two or more forms that coexist at each of the first polymorphic locus and the second poly - a polymorphic locus. An allele also may be termed a variant. morphic locus in the nucleic acid may be amplified . Distin - An allele may be the major or predominant form or a minor guishable allele - specific amplification data for each of the or even very rare form that exits at a polymorphic locus . loci may be collected from individual volumes. Allele - 35 Accordingly , a pair of alleles from the same polymorphic specific amplification data for the first locus may be corre - locus may be present at any suitable ratio in a population , lated with allele - specific amplification data for the second such as about 1 : 1 , 2 : 1 , 5 : 1 , 10 : 1 , 100 : 1 , 1000 : 1 , etc . locus from the same volumes . A haplotype of the nucleic Allele sequence can be a string of nucleotides that char acid for each of the first and second loci may be selected acterizes , encompasses , and / or overlaps an allele . Amplifi based on correlation of the allele - specific amplification data . 40 cation of an allele sequence can be utilized to determine In general , the method can rely on co -amplification , in the whether the corresponding allele is present at a polymorphic same volumes , of allele sequences from distinct loci , if the locus in a sample partition . allele sequences constitute a haplotype of the subject, and , Haplotype can be two or more alleles that are present conversely, lack of co - amplification if they do not. together or linked on a single chromosome ( e . g ., on the same A system for haplotype analysis may be capable of 45 chromosome copy ) and / or on the same piece of nucleic acid performing the method disclosed herein . The system may and / or genetic material; haplotype can also refer to two or comprise a droplet generator configured to form droplets of more target nucleic acids that are present together or linked an aqueous phase including nucleic acid . The system also on a single chromosome. The target nucleic acids can be the may comprise a detector configured to collect allele -specific same or different . amplification data for each of the loci from individual 50 Linkage can be a connection between or among alleles droplets . The system further may comprise a processor. The from distinct polymorphic loci and can also be a connection processor may be configured to correlate allele - specific between or among target nucleic acids that are identical or amplification data for the first locus with allele - specific nearly identical. Polymorphic loci that show linkage (and /or amplification data for the second locus from the same are linked ) generally include respective alleles that are volumes and to select a haplotype of the nucleic acid based 55 present together on the same copy of a chromosome, and on correlation of the allele - specific amplification data . may be relatively close to one another on the same copy , Optionally , the sample may be divided into subsamples. such as within about 10 , 1 , or 0 . 1 megabases, among others . Optionally , the first subsample may be contacted with a restriction enzyme that cleaves a site between the polymor II . System Overview for Haplotype Analysis phic loci; and the second subsample may optionally be 60 exposed to a restriction enzyme. Optionally , allele -specific FIG . 4 shows a flowchart listing steps that may be amplification data from the first subsample may be corre - performed in an exemplary method 20 ofhaplotype analysis . lated with allele -specific amplification data from the second The steps may be performed in any suitable order and subsample . combination and may be united with any other steps of the Further aspects of the present disclosure are presented in 65 present disclosure. the following sections: ( 1 ) definitions, ( II ) system overview , sample may be obtained , indicated at 22 . The sample ( III ) exemplary potential haplotypes created by linked SNPs, may be obtained from a subject, generally a subject with a US 10 , 167 ,509 B2 25 26 diploid or higher complement of chromosomes. In other In some embodiments , one or both of the steps indicated words, the subject typically has at least two sets of chro - at 28 and 30 may be substituted by a step of determining at mosomes, and at least a pair of each type of chromosome in least one measure of co -amplification of allele sequences the subject' s cells . For example , somatic cells of humans from both loci in the same volumes . Any suitable measure ( s ) each contain two copies of chromosome 1 , 2 , 3 , etc . , to give 5 of co - amplification may be used , such as at least one 23 chromosome pairs (two sets of chromosomes ) and a total correlation coefficient obtained by correlation of allele of 46 chromosomes . specific amplification data for the polymorphic loci from the The sample may be partitioned , indicated at 24 . Partition - same volumes . In other examples , the measure of co ing the sample may include partitioning or dividing an amplification may be at least one value representing at least aqueous phase that includes nucleic acid of the sample . 10 one number or frequency of co - amplification of an allele Partitioning divides the aqueous phase into a plurality of sequence from each locus. Further aspects of correlating discrete and separate volumes, which also may be called amplification data and determining measures of co - amplifi partitions . The volumes may be separated from one another cation are described elsewhere in the present disclosure , by fluid , such as a continuous phase ( e . g ., an oil) . Alterna such as in Section IV . tively , the volumes may be separated from one another by 15 In some embodiments , the sample containing polynucle walls , such as the walls of a sample holder . The volumes otides may be divided into two or more subsamples . The first may be formed serially or in parallel. In some embodiments , subsample may be exposed to a restriction enzyme which the volumes are droplets forming a dispersed phase of an cleaves at a site between the two polymorphic loci. The first emulsion . subsample may then be partitioned into multiple partitions . A pair ( or more ) of polymorphic loci may be amplified , 20 Allele - specific amplification data may then be collected for indicated at 26 . More particularly , at least one allele each polymorphic locus , as described herein . The second sequence from each of the polymorphic loci may be ampli - subsample , having not been exposed to a restriction enzyme fied . Each allele sequence is characteristic of a correspond which cleaves at a site between the two polymorphic loci, ing allele of the locus. In some embodiments , only one allele may be partitioned into multiple partitions. Allele - specific sequence may be amplified from each locus , or a pair of 25 amplification may then be collected for each polymorphic allele sequences may be amplified from at least one of the locus . Amplification data from the first and second sub loci. The particular allele sequences and number of distinct samples may be correlated to determine the haplotype for allele sequences that are amplified may be determined by the the polymorphic loci. particular primer sets included in the aqueous phase before A haplotype for the polymorphic loci may be selected , the aqueous phase is partitioned . 30 indicated at 32 . Selection may be based on correlation of Allele - specific amplification data may be collected for amplification data and / or based on the at least one measure each polymorphic locus, indicated at 28 . The data may relate of co - amplification . The haplotype may be selected from to distinguishable amplification (or lack thereof) of each of among a set of potential haplotypes for the polymorphic loci the allele sequences in individual volumes. The data may be being investigated . The selected haplotype generally detected from distinguishable probes corresponding to and 35 includes designation of at least a pair of particular alleles capable of hybridizing specifically to each of the allele that are likely to be linked to one another on the same sequences amplified . The data may be collected in parallel or chromosome copy of the subject. serially from the volumes . In exemplary embodiments , the FIG . 5 shows a schematic view of selected aspects of an data may be collected by optical detection of amplification exemplary system 40 for performing method 20 of FIG . 4 . signals . For example , optical detection may include detect - 40 The system may include a droplet generator (DG ) 42 , a ing fluorescence signals representing distinguishable ampli- thermocycler ( TC ) 44 , a detector (DET ) 46 , and a processor fication of each allele sequence . (PROC ) 48 . Arrows 50 -54 extend between system compo Amplification data for the polymorphic loci and from the nents to indicate movement of droplets (50 and 52 ) and data same volumes may be correlated , indicated at 30 . Correla (54 ) , respectively . tion generally determines which allele sequences are most 45 Droplet generator 42 can form droplets of an aqueous likely to be present together in individual volumes , and thus phase including nucleic acid . The droplets may be formed originally linked to one another on the same chromosome serially or in parallel . copy in genetic material of the subject. Correlation may Thermocycler 44 can expose the droplets to multiple include determining at least one correlation coefficient cor - cycles of heating and cooling to drive amplification , such as responding to co - amplification of distinct allele sequences in 50 PCR amplification , of allele sequences . The thermocycler the same volumes . In some cases , correlation may include may be a batch thermocycler, which amplifies all of the determining a pair of correlation coefficients corresponding droplets in parallel, or may be a flow -based thermocycler , to co - amplification of each of a pair of allele sequences of which amplifies droplets serially , among others . the same locus with an allele sequence of another locus. Detector 46 collects amplification data , such as allele Correlation also may include comparing correlation coeffi - 55 specific amplification data from the droplets . The detector cients with each other and / or with a threshold , or may m ay , for example , be a fluorescence detector, and may detect include determining whether a correlation coefficient is droplets serially or in parallel. negative or positive . In some embodiments , correlation may Processor 48 , which also may be termed a controller , can be performed with amplification data that has been con - be in communication with detector 46 and can be pro verted to a binary form by applying a threshold that distin - 60 grammed to process amplification data from the detector. guishes amplification - positive and amplification -negative The processor, which may be a digital processor, may be signals . Correlation also or alternatively may include com - programmed to process raw data from the detector, such as paring the numbers of volumes that exhibit co - amplification to subtract background and /or normalize droplet data based of different sets of allele sequences and /or comparing the on droplet size . The processor also or alternatively may be number of volumes that exhibits co -amplification of a set of 65 programmed to apply a threshold to convert the data to allele sequences versus the number that exhibits amplifica - binary form , to perform a correlation of amplification data , tion of only one of the allele sequences . to calculate and /or compare one or more measures of US 10 , 167 ,509 B2 27 28 co - amplification , to select a haplotype based on the corre phase 118 separating the droplets from one another. The lation and /or measures, or any combination thereof. droplets may be monodisperse , that is , of substantially the Further aspects of droplet generators , thermocyclers , same size . Exemplary degrees of monodispersity that may detectors , and controllers are described in U . S . Patent Appli - be suitable are described in U . S . Patent Application Publi cation Publication No . 2010 / 0173394 A1, published Jul. 8 . 5 cation No . 2010 /0173394 A1, published Jul. 8 , 2010 , which 2010 , which is incorporated herein by reference . is incorporated herein by reference . Fragments 98 may distribute randomly into the droplets as III. Exemplary Potential Haplotypes Created by they are formed . At a proper dilution of fragments 98 in the Linked SNPs aqueous phase that is partitioned , and with a proper selection 10 of droplet size , an average of less than about one copy or FIG . 6 schematically illustrates a haplotyping situation molecule of target region 80 is contained in each droplet . created by linked SNPs in which the genetic material of a Accordingly , some droplets, such as the empty droplet diploid subject 60 has two different nucleotides at each of indicated at 120, contain no copies of the target, many two different loci. The goal of haplotyping is to determine contain only one copy of the target region , some contain two which nucleotide at the first locus is combined with which 15 or more copies of the target ( e . g ., the droplet indicated at nucleotide at the second locus on each chromosome copy . 122 ) , and some contain only one of the allele sequences of Subject 60 can have either of two alternative haplotype the target region ( e . g ., the droplets indicated at 124 ) . configurations 62 , 64 created by a pair of single nucleotide Allele sequences can be amplified , indicated at 126 . Here , polymorphisms 66 , 68 . Each configuration represents two two allele sequences, 104 and 108 , are amplified from locus haplotypes: configuration 62 has haplotypes ( G , C ) and ( A , 20 76 and only allele sequence 110 is amplified from locus 78 T ) , and configuration 64 has haplotypes ( G , T ) and ( A , C ) . ( also see FIG . 6 ) . Amplified copies of each allele sequence A cell 70 of the subject includes a pair of chromosome are indicated at 104 ', 108 ', and 110 '. In other embodiments , copies 72 , 74 ofthe same type . (Other types of chromosomes only one allele sequence may be amplified from each locus , that may be present in the cell are not shown. ) Chromosome or at least two allele sequences may be amplified from each copies 72 , 74 may be mostly identical in sequence to each 25 locus, among others . (For example , allele sequence 106 may other, but the copies also typically have many loci of be amplified with the same primers that amplify allele sequence variation , such as polymorphic loci 76 , 78 , where sequence 110 , but amplification of allele sequence 106 is not the two chromosome copies differ in sequence . Loci 76 , 78 shown here to simplify the presentation .) are contained in a genome region or target region 80 , which Allele - specific amplification data can be collected from is outlined by a dashed box in the nucleus of cell 70 and 30 the droplets , indicated at 130 . In this example, fluorescence which is shown enlarged adjacent the cell as a composite data is collected , with a different, distinguishable fluorescent sequence that represents a genotype 82 for loci 76 , 78 . (Only dye , each included in a different allele - specific probe , pro one strand of each chromosome copy and target region is viding amplification signals for each allele sequence 104 ' , shown in FIG . 6 (and FIG . 7 ) to simplify the presentation . ) 108 ', 110 '. In particular, the dyes FAM , VIC , and ROX emit Genotype 82 may be determined by any suitable geno - 35 FAM - , VIC - , and ROX signals 132 - 136 that relate to ampli typing technology , either before haplotype analysis or as a fication of allele sequences 104 , 108 , and 110 , respectively . part of a haplotype analysis . Genotype 82 shows that the In other embodiments , allele - specific amplification of all single polymorphic nucleotide of locus 76 is a “ G ” and an four allele sequences 104 - 110 or of only two allele “ A ” on chromosome copies 72 and 74 (or vice versa ), and sequences (one from each locus) may be detected . for locus 78 is a “ C ” and a “ T .” However, the genotype does 40 The amplification data is correlated , indicated at 140 , not indicate how the individual nucleotides of the two loci and /or at least one measure of co -amplification of allele are combined on chromosome copies 72 , 74 . Accordingly, sequences in the same droplets is determined Graphs 142 , the genotype can be produced by alternative , potential 144 schematically illustrate an approach to correlation and haplotype configurations 62 , 64 . Haplotype analysis , as or determination ofmeasures of co - amplification . Graph 142 disclosed herein , permits determination of which of the 45 plots FAM and ROX signal intensities for individual drop potential haplotypes are present in the subject ' s genetic lets ( represented by dots in the plot) , while graph 144 plots material. VIC and ROX signal intensities for individual droplets . Signal values that represent amplification -negative (“ -” ) IV . Exemplary Haplotype Analysis with and amplification -positive (“ + ” ) droplets for a given signal Amplification in Droplets 50 type ( and thus a given allele sequence ) are indicated adja cent each axis of the graphs . FIG . 7 schematically illustrates performance of an exem Lines 146 , 148 represent a best - fit of the amplification plary version 88 of the method of FIG . 4 . Here , genetic data of each graph to a linear relationship . However, the two material from the subject of FIG . 6 is analyzed to distinguish fits have associated correlation coefficients of opposite the alternative , potential haplotype configurations described 55 polarity . The amplification data in graph 142 provides a in the preceding section . negative correlation coefficient, because there is a negative A sample 90 is obtained , indicated at 92 . The sample is correlation for co -amplification of allele sequence 104 (as disposed in an aqueous phase 94 including nucleic acid 96 reported by FAM signals ) and allele sequence 110 (as of the subject. In this view , for simplification , only frag - reported by ROX signals ) in the same droplets . In contrast, ments 98 containing genome region 80 are depicted . Frag - 60 the amplification data in graph 144 provides a positive ments 98 are long enough that only a minority ( e . g . , incom - correlation coefficient, because there is a positive correlation plete fragments 100 , 102 ) fail to include an allele sequence for co -amplification of allele sequence 108 (as reported by 104 - 110 from both loci 76 , 78 (also see FIG . 6 ) . The aqueous VIC signals ) and allele sequence 110 (as reported by ROX phase may be configured for PCR amplification of allele signals ) in the same droplets . The correlation coefficients sequences 104 - 110 . 65 may be compared to one another to select a haplotype . For Droplets 112 are formed , indicated at 114 . The droplets example , the haplotype may be selected based on which may be part of an emulsion 116 that includes a continuous correlation coefficient is larger ( e . g . , closer to 1 . 0 ) and /or US 10 , 167 ,509 B2 30 which is positive (if only one is positive ) . Here, a first they are still linked in a sample or are separated in the haplotype including allele sequences 104 and 106 and a sample using the methods, compositions, and kits described second haplotype including allele sequences 108 and 110 herein can be about 2 to about 10 , about 2 to about 8 , about may be selected based on the positive correlation of VIC and 2 to about 6 , about 2 to about 4 , about 3 to about 10 , about ROX signals . In some embodiments , a haplotype may be 5 3 to about 8 , about 3 to about 6 , about 4 to about 10 , or about selected based on only one correlation , such as based on 4 to about 6 . whether a correlation coefficient is negative or positive or The number of base pairs between each of the genetically based on comparison of the correlation coefficient to a linked loci can be about, at least about , or less than about 10 predefined value . bp , 25 bp , 50 bp , 75 bp , 100 bp , 250 bp , 500 bp , 750 bp , 1000 FIG . 8 shows a bar graph 160 illustrating an alternative 10 bp , 2000 bp , 3000 bp , 4000 bp , 5000 bp , 6000 bp , 7000 bp , approach to correlating the amplification data of FIG . 7 . The 8000 bp , 9000 bp , 10 ,000 bp , 15 ,000 bp , 20 , 000 bp , 33 ,000 amplification data of FIG . 7 has been converted to binary bp , 50 ,000 bp , 75 ,000 bp , 100 ,000 bp , 250 ,000 bp , 500 ,000 form by comparing each type of droplet signal ( FAM , VIC , bp , 750 , 000 bp , 1 ,000 , 000 bp , 1 , 250 ,000 bp , 1 , 500 , 000 bp , and ROX ) with a threshold that distinguishes amplification - 2 , 000 ,000 bp , 5 ,000 , 000 bp , or 10 , 000 , 000 bp . The number positive droplets (assigned a “ 1 ” ) from amplification -nega - 15 ofbase pairs between each of the genetically linked loci can tive droplets ( assigned a “ O ” ) for each allele sequence . be about 10 to about 10 ,000 , 000 bp , about 100 to about Graph 160 tabulates the binary form of the data to present 10 , 000 ,000 bp , about 1 ,000 to about 10 ,000 ,000 bp , about the number of amplification -positive droplets for various 1 ,000 to about 1, 000 ,000 bp , about 1 ,000 to about 500 ,000 allele sequences alone or in combination . The leftward two bp , about 1, 000 to about 100 ,000 bp , about 3000 to about bars , indicated at 162 , allow a comparison of the number of 20 100 ,000 bp , about 1000 to about 33 ,000 bp , about 1 ,000 to droplets that contain only allele sequence 104 (FAM ) with about 10 , 000 bp , or about 3 , 000 to about 33 ,000 bp . The the number that contain both allele sequences 104 ( FAM ) number of base pairs between each genetically linked alleles and 110 (ROX ) . The leftward data shows that amplification can be 0 bp . of allele sequence 104 does not correlate well with ampli - In some embodiments , a method of haplotyping com fication of allele sequence 110 . In other words , allele 25 prises examining if two alleles at two different loci co sequences 104 and 110 tend not to be co - amplified in the localize to the same spatially - isolated partition . In one same droplets . The rightward two bars , indicated at 164 , embodiment, additional alleles at the two loci can be ana allow a comparison of the number of droplets that contain lyzed . For example , if two alleles at two different loci do not only allele sequence 108 (VIC ) with the number that contain co - localize in a digital experiment, one or more other alleles both allele sequences 108 (VIC ) and 110 (ROX ) . The 30 at the two loci can be analyzed to provide a positive control rightward data shows that amplification of allele sequence for colocalization . For example , assume a maternally - inher 108 correlates well with amplification of allele sequence ited chromosome has allele A is at locus 1 and allele Y is at 110 . In other words, allele sequences 108 and 110 tend to be locus 2 , 100 bp away from locus 1 . On the corresponding co -amplified in the same droplets . The leftward pair of bars paternally - inherited chromosome, assume allele B is at locus and the rightward pair of bars considered separately or 35 1 and allele Z is at locus 2 . If a nucleic acid sample collectively indicate a haplotype in which allele sequence comprising these nucleic acids is separated into spatially 108 is associated with allele sequence 110 . isolated partitions, and amplification for allele A and allele A sample comprising genetically linked loci can be sub - Z is performed , the amplification signal for allele A and jected to fragmentation before being analyzed by the meth allele Z should rarely or never colocalize to a single partition ods, compositions, or kits described herein . A sample com - 40 because allele A and allele Z are not linked . A digital analysis prising genetically linked loci can be fragmented by, e . g ., can be performed to confirm that allele A and allele Y are mechanical shearing, passing the sample through a syringe , linked on thematernally - inherited chromosome or that allele sonication , heat treatment ( e . g ., 30 mins at 90° C . ) , and /or B and allele Z are linked on the paternally - inherited chro nuclease treatment ( e . g ., with DNase , RNase , endonuclease , mosome. exonuclease , restriction enzyme ) . A sample comprising 45 Haplotyping with Two Colors genetically linked loci can be subjected to no or limited While embodiments shown herein demonstrate the use of processing before being analyzed . a three - color system to measure phasing, phasing may also In another embodiment, using droplet digital PCR be measured using a two color system . For example , if two (ddPCR ) , a duplex reaction can be performed targeting two heterozygous SNPs ( Aa and Bb ) need to be phased , one may genomic loci, e . g . , two genes on a common chromosome. 50 design a FAM assay targeting A and a VIC assay targeting The droplets can be categorized into four populations B . Excess of partitions containing both A and B would be according to their fluorescence . For example , if a FAM indicative of linkage between A and B , suggesting that the labeled probe is used to detect to one loci, and a VIC - labeled two haplotypes are A - B and a - b . Absence of such excess probe is used to detect another loci, the four populations can may be suggestive of the alternative combination of haplo be FAM + /VIC + , FAM + /VIC - , FAM - /VIC + , and FAM - 1 55 types : A - b and a - B . One can determine that the DNA is of VIC - . By comparing the number of droplets with each of high enough molecular weight to make this later inference . these populations, it is possible to determine the frequency In order to confirm the alternative combination of haplo at which loci co -segregate to the same droplet . Using types , it can be necessary to run another duplex assay in a Poisson statistics , the percentage of species that are actually separate well , where a different combination of alleles is linked to one another can be estimated versus instances 60 targeted . For example , one may run a FAM assay targeting where two separated loci are in the same droplet by chance . A and a VIC assay targeting b . Excess of partitions contain The number of genetically linked loci that can be exam - ing both A and b would be indicative of linkage between A ined to determine if they are still linked in a sample or are and b , suggesting that the two haplotypes are A -b and a -B . separated in the sample using the methods, compositions, Reference Sequences and kits described herein can be about , at least about, or 65 In methods involving the analysis of copy number ( or more than about 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , or 10 . The number of other applications described herein ) , it is useful to count the genetically linked loci that can be examined to determine if number of times a particular sequence ( e . g . , target) is found , US 10 , 167 , 509 B2 31 32 e . g . , in a given genome. In some embodiments , this analysis algorithm can be used to predict fragmentation between the is done by assessing ( or comparing ) the concentrations of a first and second target sequence (980 ) . target nucleic acid sequence and of a reference nucleic acid If two different loci ( T1 and T2 ) are on different poly sequence known to be present at some fixed number of nucleotides, a sample with the polynucleotides ( 920 ) will copies in every genome . For the reference , a housekeeping 5 contain polynucleotides with T1 only and T2 only ( see FIG . gene ( e . g . , a gene that is required for the maintenance of 10A ) . However , if T1 and T2 are on the same polynucle basic cellular function ) can be used that is present at two otide , a sample containing polynucleotides with T1 and T2 copies per diploid genome. Dividing the concentration or can have three species: fragmented polynucleotides with T1 , amount of the target by the concentration or amount of the fragmented polynucleotides with T2 , and fragmented poly reference can yield an estimate of the number of target 10 nucleotides with T1 and T2 (FIG . 10B ) . The longer the copies per genome. One ormore references can also be used distance between T1 and T2 , the higher the probability of to determine target linkage . fragmentation between T1 and 12 . The sample can be A housekeeping gene that can be used as reference in the partitioned (FIG . 9 : 940 ) . A digital analysis can be per methods described herein can include a gene that encodes a formed , such as digital PCR or droplet digital PCR , and transcription factor, a transcription repressor, an RNA splic - 15 partitions with signal for T1 , T2 , and T1 and T2 can be ing gene , a translation factor, tRNA synthetase , RNA bind - enumerated ( 960 ) . An algorithm can be developed and used ing protein , ribosomal protein , RNA polymerase, protein to determine the probability of fragmentation between T1 processing protein , heat shock protein , histone, cell cycle and T2 (980 ). The algorithm can make use of the number of regulator, apoptosis regulator, oncogene , DNA repair / repli - bases or base pairs between T1 and T2 if known. This cation gene, carbohydrate metabolism regulator, citric acid 20 method can be used to determine the extent of fragmentation cycle regulator, lipid metabolism regulator, amino acid of a DNA sample . If there are a number of partitions metabolism regulator , nucleotide synthesis regulator, containing signal for T1 and T2 is greater than the number NADH dehydrogenase , cytochrome C oxidase , ATPase , of partitions one would expect T1 and T2 to be in the same mitochondrial protein , lysosomal protein , proteosomal pro - partition , this observation can indicate that T1 and T2 are tein , ribonuclease , oxidase / reductase , cytoskeletal protein , 25 linked . cell adhesion protein , channel or transporter, receptor, It can be advantageous to use the above methods on a kinase , growth factor, tissue necrosis factor, etc . Specific nucleic acid ( e . g . , DNA ) sample to ensure that DNA is of examples of housekeeping genes that can be used in the high enough molecular weight that linkage information is methods described include , e . g . , HSP90 , Beta - actin , tRNA , preserved in the sample . rRNA , ATF4 , RPP30 , and RPL3 . 30 In any of the methods described herein making use of For determining the linkage of a target , one of the loci DNA , an assay can be performed to estimate the fragmen genetically linked to another locus can be a common refer - tation of the DNA in the sample , and the methods can ence , e . g . , RPP30 . Any genetically linked loci can be used incorporate the information on fragmentation of the DNA . in the methods described herein . In another embodiment, results of an assay can be normal A single copy reference nucleic acid ( e . g ., gene ) can be 35 ized based on the extent of fragmentation of DNA in a used to determine copy number variation . Multi -copy ref- sample . erence nucleic acids ( e . g . , genes ) can be used to determine Nucleic acid fragmentation can also bemeasured by , e . g . , copy number to expand the dynamic range . For example , the gels , a Bioanalyzer, or size exclusion chromatography . multi - copy reference gene can comprise about, or more than Measurement of Methylation Burden about 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14, 15 , 16 , 17 , 18 , 40 A milepost assay can be used to determine methylation 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , status of a CPG island . In one embodiment, a method is 35 , 36 , 37, 38 , 39, 40, 41 , 42, 43 , 44 , 45 , 46 , 47, 48 , 49 , 50 , provided herein comprising using linkage of amplicons that 51, 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , 62 , 63 , 64 , 65, 66 , can be measured through ddPCR (Milepost assay ). The 67 , 68 , 69 , 70 , 71, 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , distance between amplicons can span greater than 10 kb , 83 , 84 , 85 , 86 , 87, 88 , 89 , 90 , 91, 92, 93 , 94 , 95 , 96 , 97 , 98 , 45 fully covering the range of sizes observed for CpG islands. 99 , 100 , 500 , 1000 , 2000 , 3000 , 4000 , 5000 , 6000 , 7000 , CPG islands can be common in the 5 ' region of human 8000 , 9000 , 10 ,000 , 20 , 000 , 30 ,000 , 40 , 000 , 50 ,000 , 60 ,000 , genes , including in promoter sequences. Methylation of 70 ,000 , 80 ,000 , 90 ,000 , or 100 ,000 copies in a genome. CpG islands can play a role in gene silencing during Multiple different nucleic acids ( e . g ., multiple different X - chromosome inactivation and imprinting . Aberrant meth genes ) can be used as a reference . 50 ylation of promoter region CpG islands can be associated Determining Probability of Nucleic Acid Fragmentation with transcriptional inactivation of tumor suppressor genes Digital analysis can be performed to determine the extent in neoplasia . For example , abnormal methylation of tumor of fragmentation between two markers in a nucleic acid suppression genes p16 , hMLH1, and THBSi has been sample . FIG . 9 illustrates a workflow (900 ) . The steps in observed . There is an association between microsatellite FIG . 9 can be performed in any suitable order and combi- 55 instability and promoter- associated CpG island methylation . nation and can be united with any other steps of the present Determination of methylation status can play a role in disclosure . A sample of polynucleotides can be obtained determining fetal DNA load in maternal plasma. Methyl (920 ) . The sample can be partitioned into a plurality of ation of CpG sequences can be a marker for cancer. partitions (940 ) such that each partition contains on average FIG . 11 illustrates a workflow of a method of determining only about 0 , 1 , 2 , or several target polynucleotides . Each 60 the methylation status of a CpG island ( 1100 ) . The steps in partition can have , on average , less than 5 , 4 , 3 , 2 , or 1 copies FIG . 11 can be performed in any suitable order and combi of a target nucleic acid per partition ( e . g ., droplet ) . In some nation and can be united with any other steps of the present cases , at least 0 , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , disclosure . A sample of polynucleotides can be obtained 60 , 70 , 80 , 90 , 100 , 125 , 150 , 175 , or 200 partitions ( e . g ., ( 1110 ) . The polynucleotide sample can be treated with a droplets ) have zero copies of a target nucleic acid . 65 methylation sensitive enzyme (1120 ). The sample can be The partitions can be assayed to enumerate partitions with partitioned into a plurality of partitions ( 1130 ) such that each a first target and a second target sequence ( 960 ) and an partition contains on average only about 0 , 1 , 2 , or several US 10 , 167, 509 B2 33 34 target polynucleotides . Each partition can have, on average , for the same region of interest. Alternatively , conversion of less than 5 , 4 , 3 , 2 , or 1 copies of a target nucleic acid per C - T through bisulfite treatment can change the melting partition ( e .g ., droplet ) . In some cases, at least 0 , 1 , 2 , 3 , 4 , temperature of the amplicon , which can be measured 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , 125 , 150 , through melting curve analysis post amplification . The limit 175 , or 200 partitions ( e . g . , droplets ) have zero copies of a 5 of sensitivity is about 1 - 5 % with this method , however . target nucleic acid . Nucleic acid sequences that flank the 5 Furthermore complete conversion requires extensive and 3' end of the CpG island can be amplified ( 1140 ). The bisulfite treatment, which can lead to degradation of DNA . partitions can be assayed to determine if the sequences that In one embodiment of the method , two amplicons ( e . g . , flank the CPG island are in the same or different partitions FAM & VIC labeled ) are designed flanking the CpG island ( 1150 ) to determine methylation status ( 1160 ) . If the CpG is 10 of interest . DNA samples are digested with one or more methylated , it will not be cleaved by a methylation sensitive methylation sensitive restriction enzymes , and digested and enzyme, and the flanking sequences (mileposts ) can be in the undigested samples can be analyzed via digital PCR , e . g . , same partition . If the CpG is not methylated , the CpG island ddPCR . Undigested samples should exhibit 100 % linkage of can be cleaved by the methylation sensitive enzyme, and the the FAM and VIC amplicons , and digested samples will flanking sequences (mileposts ) can be in different partitions. 15 exhibit 0 % linkage of the FAM and VIC amplicons, indi In another embodiment, a methylated CPG island can be cating a difference in methylation status . The extended cleaved by a methylation sensitive enzyme that can cleave precision of this system can detect at least about 1/ 10 , 1 /30 , methylated but not unmethylated sequence . In another 1100 , 1/ 200 , 1/ 300 , 1 /400 , 1/ 500 , 1600, 1/ 100 , 1/ 800 , 1900 , 1/ 1000 , 1/ 2000 , embodiment, a methylated CPG island can be cleaved by a 1/ 3000, 1/ 4000 , 1/ 5000 , 1/ 6000 , 1 /1000 , 1/ 8000 , 1/ 9000 , 1/ 10 ,000 differences methylation sensitive enzyme that can recognize ( e . g . , bind ) 20 in methylation status between two samples. FIG . 12 illus methylated sequence by not unmethylated sequence . trates an embodiment of the method . In another embodiment, the mileposts (markers ) can be The regions to be amplified can be within about, or at least within a CpG island . In another embodiment, the mileposts about, 1 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , 110 , 120 , (markers ) can flank a single CpG sequence . In another 130 , 140 , 150 , 160 , 170 , 180 , 190 , 200 , 250 , 300 , 350 , 400 , embodiment, the mileposts (markers ) can flank about, or at 25 450 , or 500 bp of the 5 ' or 3 ' end of the CpG island . least about 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , One or more methylation sensitive restriction enzymes 80 , 90 , 100 , 150 , 200 , 250 , 300 , 350 , 400 , or 500 CpGs ( e . g ., can be used in the methods, compositions, and kits provided one marker is 5 ' of the CpGs and the other marker is 3 ' of herein . The one or more methylation sensitive enzyme can the CpGs ). include , e . g . , Dpnl, Acc651, Kpnl, Apal, Bsp1201, Bsp1431, In another embodiment, demethylation of a CpG island 30 Mbol, BspOi, Nhel, Cfr91 , Smal, Csp6l, Rsal, Ecl13611 , can be determined . The methylation status of a CpG island Sac?, EcoRII , Mval, Hpall, MSPJI, LpnPI, FsnEI, Dpnll , can be measured over a period of time ( e . g ., in samples taken MerBc, or Mspl. In one embodiment, a methylation sensi at different time points , e . g ., days , months, years ). The tive restriction enzyme can cleave nucleic acid that is not methylation status of a CpG island among samples from methylated , but cannot cleave nucleic acid this is methyl different individuals can be compared . 35 ated . The one or more methylation sensitive enzymes can be A CPG can be a cytosine linked to a guanine through a about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , phosphodiester bond , and a CpG island can be a cluster of 18 , 19 , 20 , 21 , 22 , 23 , 24 , or 25 methylation -sensitive such dinucleotide sequences . A CPG island , or CG island , enzymes . can be a region of a genome that contains a high frequency In some embodiments , nucleic acid is subject to Dam of CPG sites . A CPG island can be about, less than about, or 40 methylation and or Dem methylation . more than about , 100 , 150 , 200 , 250 , 300 , 350 , 400 , 450 , Table 1 includes a list of restriction enzymes for which the 500 , 550 , 600 , 650 , 700 , 750 , 800 , 850 , 900 , 950 , 1000 , ability of the restriction enzyme to cleave DNA can be 1100 , 1200 , 1300 , 1400 , 1500 , 1600 , 1700 , 1800 , 1900 , blocked or impaired by CpG methylation . 2000 , 2100 , 2200 , 2300 , 2400 , 2500 , 2600 , 2700 , 2800 , 2900 , 3000 , 3100 , 3200 , 3300 , 3400 , 3500 , 3600 , 3700 , 45 TABLE 1 3800 , 3900 , 4000 , 4100 , 4200, 4300 , 4400 , 4500 , 4600 , 4700 , 4800 , 4900 , 5000 , 5100 , 5200 , 5300 , 5400 , 5500 , List of restriction enzymes whose ability to 5600 , 5700 , 5800 , 5900 , or 6000 base pairs in length . A CPG cleave can be blocked or impeded by CpG island can be about 100 to about 6000 bp , about 100 to about methylation 5000 bp , about 100 to about 4000 bp , about 100 to about 50 Enzyme Sequence 3000 bp , about 100 to about 2000 bp , about 100 to about 1000 bp , about 300 to about 6000 bp , about 300 to about AatII GACGT / C 5000 bp , about 300 to about 4000 bp , about 300 to about Acc651 G /GTACC 3000 bp , about 300 to about 2000 bp , or about 300 to about 1000 bp . In one embodiment, a CpG island can be a stretch 55 ACCI GT /MKAC of DNA at least 200 bp in length with a C + G content of at least 50 % and an observed CpG / expected CPG ratio of at Acil CCGC ( - 3 / - 1 ) least 0 .6 (Gardiner - Garden , M . and Frommer M . ( 1987 ) J. Acl? AA / CGTT Mol. Biol. 196 : 261 - 282 ) . In another embodiment, a CpG island can be stretch of DNA of greater than 500 bp with a 60 Afel AGC /GCT G + C content equal to or greater than 55 % and an observed Aget A / CCGGT CpG / expected CpG ratio of at least 0 .65 ( Takai D . and Jones P . (2001 ) PNAS 99: 3740 -3745 ) . AgeI - HFTM A / CCGGT In one method to identify methylated cytosines , bisulfite treatment is used to convert cytosines to uracils ; methylated 65 Ahdi GACNNN /NNGTC (SEQ ID NO : 1 ) cytosines, however , are protected . Subsequent PCR converts Alel CACNN / NNGTG (SEQ ID NO : 2 ) uracil to a thymidine, resulting in a different sequence output US 10 , 167 , 509 B2 35 36 TABLE 1 - continued TABLE 1 - continued List of restriction enzymes whose ability to List of restriction enzymes whose ability to cleave can be blocked or impeded by CpG cleave can be blocked or impeded by CpG methylation . methylation . Enzyme Sequence Enzyme Sequence

ApaI GGGCC / C BstUI CG / CG ApaLI G / TGCAC 10 Bst2171 GTA / TAC Asci GG / CGCGCC BtOZI GCGATG ( 10 / 14 ) AsisI GCGAT / CGC BtsIMutI CAGTG ( 2 / 0 ) Aval C / YCGRG 15 Cac8I GCN / NGC AvaII G / GWCC ClaI AT / CGAT Bae? ( 10 / 15 ) ACNNNNGTAYC ( 12 / 7 ) Dpni GA / TC ( SEQ ID NO : 3 ) DraIII CACNNN / GTG Bani G / GYRCC 20 DraIII - HFTM CACNNN /GTG BbvCI CCTCAGC ( - 5 / - 2 ) Drdi GACNNNN / NNGTC ( SEQ ID NO : 9 ) ???AI ACGGC ( 12 / 14 ) Eael Y / GGCCR BcgI ( 10 / 12 ) CGANNNNNNTGC ( 12 / 10 ) 25 ( SEQ ID NO : 4 ) Eagi C / GGCCG BCODI GTCTC ( 1 / 5 ) EagI - HF “M C / GGCCG BfUAI ACCTGC (4 / 8 ) Earl CTCTTC ( 1 / 4 ) 30 Bfuci /GATC Ecii GGCGGA ( 11 / 9 ) Bgli GCCNNNN / NGGC (SEQ ID NO : 5 ) Eco53kI GAG / CTC BmgBI CACGTC ( - 3 / - 3 ) ECORI G / AATTC 35 BsaAI YAC /GTR ECORI - HF TM G / AATTC BsaBI GATNN / NNATC ( SEQ ID NO : 6 ) ECORV GAT / ATC BsaHI GR / CGYC ECORV - HF TM GAT / ATC Bsal GGTCTC ( 1 / 5 ) 40 Faul CCCGC (4 / 6 ) BsaI -HFTM GGTCTC ( 1 / 5 ) Fnu4HI GC / NGC BseYI CCCAGC ( - 5 / - 1 ) FokI GGATG ( 9 / 13 ) BsiEI CGRY / CG 45 FseI GGCCGG / CC BsiWI C /GTACG Fspi TGC / GCA Bsli CCNNNNN / NNGG ( SEQ ID NO : 7 ) Hae?? RGCGC / Y BSMAI GTCTC ( 1 / 5 ) 50 HgaI GACGC ( 5 / 10 ) BsmBI CGTCTC ( 1 / 5 ) Hhal GCG / C BsmFI GGGAC ( 10 / 14 ) Hinc II GTY / RAC BspDI AT / CGAT 55 HinfI G / ANTC BspEI T / CCGGA HinP11 G / CGC BsrBI CCGCTC ( - 3 / - 3 ) Hpal GTT / AAC BsrFI R / CCGGY ?pa?? C / CGG BSSHII G / CGCGC Hpy16611 GTN / NAC BsSKI / CCNGG Hpy188III TC / NNGA BstAPI GCANNNN /NTGC ( SEQ ID NO : 8 ) Hpy991 CGWCG / 65 . BstBI TT / CGAA HpYAV CCTTC (6 / 5 ) US 10 , 167 , 509 B2 37 38 TABLE 1 - continued TABLE 1 - continued List of restriction enzymes whose ability to List of restriction enzymes whose ability to cleave can be blocked or impeded by CpG cleave can be blocked or impeded by CpG methylation . methylation . Enzyme Sequence Enzyme Sequence PyuI - HFTM CGAT / CG H???H4 IV A / CGT RsaI GT /AC I - Ceul CGTAACTATAACGGTCCTAAGGTAGCGAA ( - 9 / 10 13 ) Rsril CG / GWCCG (SEQ ID NO : 10 ) SacII CCGC / GG I - Scel TAGGGATAACAGGGTAAT ( - 9 / - 13 ) ( SEQ ID NO : 11 ) Sall G / TCGAC Kasi G / GCGCC SalI - HF G / TCGAC MboI GATC Sau3AI /GATC Mlul A / CGCGT Sau961 G / GNCC Mmel TCCRAC (20 / 18 ) 20 ScrFI CC / NGG MSPA1I CMG / CKG SfaNI GCATC ( 5 / 9 ) MWOI GCNNNNN / NNGC ( SEQ ID NO : 12 ) Sfil GGCCNNNN / NGGCC ( SEQ ID NO : 16 ) Nael GCC / GGC 25 SfoI GGC / GCC Nari GG / CGCC SgrAI CR / CCGGYG Nb . BtsI GCAGTG Smal CCC / GGG Ncil CC / SGG 30 SnaBI TAC /GTA NOOMIV G / CCGGC StyD41 / CCNGG Nhel G / CTAGC Tfil G / AWTC NheI - HF TM G / CTAGC Tlil C / TCGAG NlaIV GGN / NCC Tsel G / CWGC NotI GC / GGCCGC TSPMI C / CCGGG NotI - HF TM GC / GGCCGC 40 XhoI C / TCGAG Nrul TCG / CGA Xmal C / CCGGG Nt . BbVCI CCTCAGC ( - 5 / - 7 ) Zral GAC / GTC Nt . BSMAI GTCTC ( 1 / - 5 ) 45 In another embodiment, a nucleic acid comprising a CpG Nt . CviPII (0 / - 1 ) CCD sequence or CpG island is contacted with one or more DNA PaeR7I C / TCGAG methyltransferase (MTases ), and a method of analyzing methylation of the CpG sequence described herein is per PhoI GG / CC formed . A DNA methyltransferase can transfer a methyl 50 PI - PspI TGGCAAACAGCTATTATGGGTATTATGGGT group from S - adenosylmethionine to a cytosine or adenine ( - 13 / - 17 ) ( SEQ ID NO : 13 ) residue . The methyltransferase can be , e . g . , a CpG MTase . PI - Scel AT CTATGTC GGGTGC GGAGAAAGAGGTAAT The DNA MTase can be an mba MTase (an MTase that can ( - 15 / - 19 ) ( SEQ ID NO : 14 ) generate N6 -methyladenine ) ; an m4C MTase (an MTase that 55 can generate N4 -methylcytosine ) ; or an m5C MTase ( an PleI GAGTC ( 4 / 5 ) MTase that can generate C5 -methylcytosine . The MTase can Pmel be, e. g. , DNMT1, DNMT3A , and DNMT3B . GTTT / AAAC Separation Pmli CAC / GTG Physical separation of target sequences can occur in a PshAI GACNN / NNGTC ( SEQ ID NO : 15 ) 60 sequence - specific or non - sequences specific manner. Non sequence specific means for separating target sequences PspOMI G / GGCCC include use of a syringe, sonication , heat treatment ( e . g . , 30 PspXI mins at 90° C . ) , and some types of nuclease treatment ( e . g . , VC / TCGAGB with DNase , RNase , endonuclease , exonuclease ) . Pvul CGAT / CG 65 Restriction Enzymes A sequence specific method of separation of nucleic acid sequences can involve use of one or more restriction US 10 , 167 ,509 B2 39 40 enzymes . One or more restriction enzymes can be used in A Type IIb restriction endonuclease can have a recogni any of the methods described herein . For example , restric tion sequence with is bipartite or interrupted . The subunit tion enzymes can be used to separate target copies in order structure of a Type IIb restriction endonuclease can be a to estimate copy number states accurately , to assess phasing , heterotrimer . Cofactors and activators of Type IIb restriction to generate haplotypes, or determine linkage , among other 5 endonucleases can include magnesium and AdoMet ( for methods . One or more enzymes can be chosen so that the methylation ) . A Type IIb restriction endonuclease can cleave nucleic acid ( e . g . , DNA or RNA ) between the target nucleic at a cleavage site on both strands on both sides of a acids sequences is restricted , but the regions to be amplified recognition site a defined , symmetric , short distance away or analyzed are not. In some embodiments , restriction and leave a 3 ' overhang . Examples of Type IIb restriction enzymes can be chosen so that the restriction enzyme does 10 endonucleases include, e . g . , Bcgl, Bsp241, Cjel, and CjePi. cleave within the target sequence , e . g . , within the 5 ' or 3' end A Type Ile restriction endonuclease can have a recogni of a target sequence . For example , if target sequences are tion site that is palindromic , palindromic with ambiguities , tandemly arranged without spacer sequence , physical sepa - or non -palindromic . The subunit structure of a Type Ile ration of the targets can involve cleavage of sequence within restriction endonuclease can be a homodimer or monomer. the target sequence . The digested sample can be used in the 15 Cofactors and activators of Type Ile restriction endonuclease digital analysis ( e . g ., ddPCR ) reaction for copy number can include magnesium , and a second recognition site that estimation , linkage determination , haplotyping , examining can act in cis or trans to the endonuclease can act as an RNA or DNA degradation , or determining methylation allosteric affector . A Type Ile restriction enzyme can cleave burden , e. g ., of a CpG island . a cleavage site in a defined manner with the recognition Restriction enzymes can be selected and optimal condi- 20 sequence or a short distance away . Activator DNA can be tions can be identified and validated across numerous used to complete cleavage . Examples of Type Ile restriction sample and assay types for broad applications, e . g . , digital enzymes include, e . g ., Nael, Narl, BspMI, Hpall, Sa II , PCR ( e . g . , ddPCR ) for CNV determinations and any of the EcoRII , Eco571, AtuBI, Cfr91 , SauBMKI, and Ksp6321. other methods described herein . A Type IIs restriction enzyme can have a recognition Computer software can be used to select one or more 25 sequence that is non -palindromic . The recognition sequence restriction enzymes for the methods , compositions , and / or can be contiguous and without ambiguities . The subunit kits described herein . For example , the software can be structure of a Type IIs restriction endonuclease can be Qtools software. monomeric . A cofactor that can be used with a Type IIs One or more restriction enzyme used in the methods , restriction enzyme can be magnesium . A Type IIs restriction compositions , and / or kits described herein can be any 30 enzyme can cleave at a cleavage site in a defined manner restriction enzyme, including a restriction enzyme available with at least one cleavage site outside the recognition from New England BioLabs , Inc . ( see www .neb .com ) . A sequence . Examples of Type IIs restriction enzymes include, restriction enzyme can be , e. g ., a restriction endonuclease , e. g. , Fokl, Alw261, Bbvl, Bsrl , Earl, Hphl, Mboll , SfaNI, homing endonuclease , nicking endonuclease , or high fidelity and Tth1111. (HF ) restriction enzyme. A restriction enzyme can be a Type 35 A Type III enzyme can cleave at a short distance from a I, Type II , Type III , or Type IV enzyme or a homing recognition site and can require ATP . Sadenosyl- L -methio endonuclease . In some cases , the restriction digest occurs nine can stimulate a reaction with a Type III enzyme but is under conditions of high star activity . In some cases , the not required . A Type III enzyme can exist as part of a restriction digest occurs under conditions of low star activ complex with a modification methylase . The recognition ity . 40 sequence of a Type III restriction endonuclease can be A Type I enzyme can cleave at sites remote from the non - palindromic . Cofactors and activators that can be used recognition site ; can require both ATP and Sadenosyl- L - with Type III restriction endonucleases include , e . g . , mag methionine to function , and can be a multifunctional protein nesium , ATP ( not hydrolyzed ) , and a second unmodified site with both restriction and methylase activities . The recogni- in the opposite orientation , a variable distance away . tion sequence for a Type I restriction endonuclease can be 45 Examples of Type III restriction endonucleases include, e . g . , bipartite or interrupted . The subunit configuration of a EcoP151 , EcoPI, HinfIII , and StyLTI. restriction endonuclease can be a penatmeric complex . A Type IV enzyme can target methylated DNA . Examples Coactivators and activators of Type I restriction endonu - of Type IV restriction enzymes include , e . g ., MorBC and cleases include , e. g . , magnesium , AdoMet ( S - Adenosyl Mrr systems of E . coli . methionine ; SAM , SAMe, SAM - e ), and ATP . Type I restric - 50 The restriction enzyme can be a homing endonuclease . A tion endonucleases can cleave at a cleavage site distant and homing endonuclease can be a double stranded DNase . A variable form the recognition site . Examples of Type I homing endonuclease can have large , asymmetric recogni restriction endonucleases can include , e . g . , EcoKI, ECOAI, tion sites ( e . g . , 12 - 40 base pairs ) . Coding sequences for EcoBI, CfrAI, StyLTII , StyLTIII, and StySPI. homing endonucleases can be embedded in introns or A Type II enzyme can cleave within or at short specific 55 inteins . An intein can be a " protein intron ” that can excise distances from a recognition site ; can require magnesium ; itself and rejoin the remaining portions ( the exteins ) with a and can function independent of methylase . The recognition peptide bond . A homing endonuclease can tolerate some sequence for a Type II restriction endonuclease can be sequence degeneracy within its recognition sequence . The palindromic or an interrupted palindrome. The subunit struc - specificity of a homing endonuclease can be 10 - 12 base ture of a Type II restriction endonuclease can be a homodi- 60 pairs . Examples of homing endonucleases include 1 - Ceul , mer Cleavage of a cleavage site with a Type II restriction Scel, I - Ppol, PI- Scel , PI- Psp?, and PI- Scel . endonuclease can result in fragments with a 3' overhang , 5 ' A restriction enzyme used in the methods, compositions , overhang, or a blunt end . Examples of Type II restriction and / or kits herein can be a dimer , trimer , tetramer , pentamer, endonucleases include , e . g . , EcoRI, BamHI, Kpnl, Notl, hexamer , etc . Pstl, Smal, and Xhol. 65 The one or more restriction enzymes used in the methods , There are several subtypes of Type II restriction enzymes , compositions and / or kits described herein can be a compo including Type IIb , Type IIs , and Type Ile . nent of a hybrid or chimeric protein . For example , a domain US 10 , 167 ,509 B2 41 42 of a restriction enzyme comprising an enzymatic activity Scal, Scal- HFTM , ScrFI, SexAI, SfaNI, Sfcl, Sfil, Sfol, ( e . g . , endonuclease activity ) can be fused to another protein , SgrAI, Smal, Smll, SnaBI, Spel, Sphl, Sphi- HFTM , Sspl, e . g . , a DNA binding protein . The DNA binding protein can Sspl- HFTM , Stul, StyD41, Styl, Styl- HFTM , Swal, Taqal, target the hybrid to a specific sequence on a DNA . The Tfil, Tlil , Tsel, Tsp451, Tsp5091, TspMI, TspRI, Tth1111, nucleic acid cleavage activity of the domain with enzymatic 5 Xbal, Xcml, Xhol, Xmal, Xmnl, and Zral. activity can be sequence specific or sequence non -specific . The one or more restriction enzymes used in the methods, For example , the non - specific cleavage domain from the compositions, and /or kits described herein can be derived type IIs restriction endonuclease Fokl can be used as the enzymatic (cleavage ) domain of the hybrid nuclease . The from a variety of sources. For example , the one or more sequence the domain with the enzymatic activity can cleave 10 restriction enzymes can be produced from recombinant can be limited by the physical tethering of the hybrid to nucleic acid . The one or more restriction enzymes can be DNA by the DNA binding domain . The DNA binding produced from recombinant nucleic acid in a heterologous domain can be from a eukaryotic or prokaryotic transcrip host ( e . g . , in a bacteria , yeast, insect, or mammalian cell ). tion factor. The DNA binding domain can recognize about, The one or more restriction enzymes can be produced from or at least about, 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15, 15 recombinantrece nucleic acid in a heterologous host and purified 16 , 17 , 18 , 19 , 20 , 21, 22, 23 , 24 , or 25 base pairs of from the heterologous host. The one or more restriction continuous nucleic acid sequence . In some cases , the restric - enzymes can be purified from a native source , e . g ., a tion enzyme is a 4 -base cutter, 6 - base cutter , or 8 -base cutter . bacterium or archaea . If more than one restriction enzyme is The DNA binding domain can recognize about 9 to about 18 used , at least one of the restriction enzymes can be from a base pairs of sequence . The DNA binding domain can be , 20 recombinant source and at least one of the more than one e . g . , a zinc finger DNA binding domain . The hybrid can be restriction enzymes can be from a native source . a zinc finger nuclease ( e . g . , zinc finger nuclease) . The hybrid A recognition site for the one or more restriction enzymes protein can function as a multimer ( e . g ., dimer, trimer, can be any of a variety of sequences . For example , a tetramer, pentamer, hexamer, etc . ) . recognition site for the one or more restriction enzymes can Examples of specific restriction enzymes that can be used 25 be a palindromic sequence . A recognition site for the one or in the methods , compositions, and / or kits described herein more restriction enzymes can be a partially palindromic include Aaal, Aagl, Aarl, Aasi , Aatl , AatII , Aaul, Abal, sequence . In some embodiments , a recognition site for the Abel, Abri, AccI, Accll , AccIII , Acc161, Acc361, Acc651, one or more restriction enzymes is not a palindromic Acc1131, AccB11, AccB21, AccB71, AccBSI, AccEBI, Acel, sequence . A recognition site for the one or more restriction Acell , AcelII, Acil, Acll, AcINI, AclWI, Acpl, Acpll , AcpII , 30 enzymes can be about , or more than about, 1 , 2 , 3 , 4 , 5 , 6 , Acsi, Acul, Acvl, Acyl, Adel, Aeul, Afal, Afa22MI, 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 or 20 bases or Afal6RI, Afel, Afl, AflII, AAIII , Agel, AgeI- HF , AglI, Ahal, base pairs . A recognition site for a restriction enzyme can be Ahall , AhalII , AhaB81, Ahdi, Ahli , Ahyl, Ait ), Ajnl, Ajol, about 2 to about 20 , about 5 to about 20 , about 5 to about 15 , Alel, Alfi, Alil , AliAJI, Alol, Alul, Alwi, Alw211, Alw261, about 5 to about 10 , about 7 to about 20 , about 7 to about 15 , Alw441, AlwNI, AlwXI, Ama871, Acol , AocII , Aori , 35 or about 7 to about 10 bases or base pairs . Aor13HI, Aor51HI, Aosl, Aosll , Apal , ApaB1, ApaCI, Two or more restriction enzymes can be used to digest a ApaLI, ApaORI, ApeKI, Apil , Apol, Apyl, Aqul, AscI , Asel , polynucleotide. The two or more restriction enzymes can AselII, AsilII , Aval, Avall, AvrII , BaeGI, Bael, BamHI, recognize the same or different recognition sites. There can BamHI- HF , Banl, BanII , BbsI, BbvCI, Bbvl, Bccl, BceAi, be one or more recognition sites for a single restriction BcgI, BciVI, Bcll, BcoDI, Bfal, BfUAI, BfUCI, Bgli, BglII , 40 enzyme between two target nucleic acid sequences on a Blpi, BmgBI, Bmrl, Bmti , Bpml, Bpu10I, BpuEI, BsaAI, single polynucleotide . There can be about , or at least about, BsaBI, BsaHI, Bsal, Bsal -HF , BsaJI , BsaWI, BsaXI, BseRI, 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 or more recognition sites for a BseYI, Bsgl, BsiEl, BsiHKAI, BsiWI, Bsll , BsmAI, single restriction enzyme between two target nucleic acid BSmBI, BsmFI, Bsml, BsoBI, Bsp12861, BspCNI, BspDI, sequences on a single polynucleotide . There can be two or BspEI, BspHI, BspMI, BspQI, BsrBI, BsrDI, BsrFI, BsrGI, 45 more different restriction enzyme recognition sites between Bsrl, BssHII , BssKI, BssSI, BstAPI, BstBI, Bstell , BstNI, two target nucleic acid sequences on a single polynucleotide . BstUI, BstXI, Bst YI, BstZ171, Bsu361, Btgl, BtgZI, BtsCI , There can be about, or at least about, 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , Btsi , BtsIMuti, Cac81, Clal, CspCI, CviAII , CviKI- 1 , 10 or more different restriction enzyme recognition sites CviQI, Ddel, Dpnl, DpnII , Dral, DralII, DralII -HFTM , between two target nucleic acid sequences on a single Drdi , Eael, Eagl , Eagl -HFTM , Earl , Ecil , Eco53kl , EcoNI, 50 polynucleotide. There can be one or more different restric Eco01091 , EcoP151, EcoRI, EcoRI- HFTM , EcoRV, EcoRV tion enzyme restriction sites between two target nucleic acid HFTM , Fati , Faul, Fnu4HI, Foki, Fsel, FspEI, Fspi, Haell, sequences on a single polynucleotide . There can be about, or HaeIII , Hgal , Hhal, HincII, HindIII , HindIII- HFTM , Hinfl , at least about, 1, 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 or more restriction HinP11, Hpal, Hpall , Hphi, Hpy16611 , Hpy1881, Hpy188III , enzyme restriction sites between two target nucleic acid Hpy991, HpyAV , HpyCH4111 , HpyCH4IV , HpyCH4V , 55 sequences on a single polynucleotide . 1 -Ceul , 1 - Scel , Kasl, KpnI, KpnI- HFTM , LpnPI, Mbol, A restriction enzyme digest can comprise one or more Mboll , Mfel, Mfel- HFTM , MluCI, Mlul, Mlyl, Mmel, isoschizomer. Isoschizomers are restriction endonucleases Mnll , Msci, Msel, Msll , MspAli , Mspl, MspJI , Mwol, that recognize the same sequence . The isoschizomers can Nael , Narl , Nb . BbvCI, Nb. Bsml, Nb . BsrDI, Nb .Btsi , Ncil , have different cleavage sites; these enzymes are referred to Ncol, Ncol- HFTM , Ndel, NgoMIV , Nhel, Nhel -HFTM , 60 as neoschizomers . NlaIII, NlaIV , NmeAIII , Notl , NotI -HFTM , Nrul, Nsil, In some embodiments , cleavage by a restriction enzyme Nspl, Nt. Alwi, Nt. BbvCI, Nt. BsmAI, Nt. BspQI, Nt. BstNBI, results in a blunt end . In some embodiments, cleavage by a Nt. CviPII , Pad , PaeR71, Pcil, PAFI, PAMI, Phol, PI- Pspl , restriction enzyme does not result in a blunt end . In some PI- Scel, Plel, Pmel, PmlI , PpuMI, PshAI, Psil , PspGI, embodiments , cleavage by a restriction enzyme results in PspOMI, PspXI, Pstl , PstI - HFTM , Pvul, Pvul- HFTM , 65 two fragments , each with a 5 ' overhang . In some embodi Pvull, Pvull -HFTM , Rsal, RsrII , Sacl, SacI- HFTM , SacII , ments , cleavage by a restriction enzyme results in two Sall , Sall- HFTM , Sapl, Sau3AI, Sau961, Sbfi , SbfI -HFTM , fragments each with a 3 ' overhang . US 10 , 167 ,509 B2 43 44 Primers for one or more amplification reactions can be One or more restriction enzymes can be incubated with a designed to amplify sequences upstream and downstream of sample comprising polynucleotides for about, or more than restriction enzyme cleavage site . about, 1 , 2, 3 , 4 , 5, 6 , 7 , 8 , 9, 10 , 11, 12 , 13 , 14 , 15 , 16 , 17 , In one embodiment, a restriction enzyme does not cut the 18 , 19 , 20 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31, 32 , 33 , target nucleic acid sequence or a reference amplicon . One 5 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , can use a reference sequence , e . g ., a genome sequence , to 50, 51 , 52 , 53 , 54, 55, 56 , 57 , 58 , 59 , or 60 minutes. One or predict whether a restriction enzyme will cut a nucleic acid more restriction enzymes can be incubated with a sample sequence . In another embodiment , a restriction enzyme does cut the target nucleic acid sequence . The cleavage can occur comprising polynucleotides for about, ormore than about, 1, near ( within about 5 , 10 , 15 , 25, 50 , or 100 bp ) of the 5 ' or 10 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 3 ' end of the target sequence, within the target sequence . 20 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32, 33 , 34 , 35 , In another embodiment, a restriction enzyme does not cut 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , or 48 hours . One the target or the reference nucleic acid sequence or amplicon or more restriction enzymes can be incubated with a sample even if the sequence or amplicon contains one or more comprising polynucleotides for about 1 to about 60 min ., SNPs. SNP information can be obtained from several data - 15 aboutapo 1 min . to about 48 hrs, about 1 min . to about 24 hrs , bases , most readily from dbSNP (www .ncbi . nlm .nih . gov / about 1 min . to about 20 hrs, about 1 min to about 16 hrs , projects /SNP /) . about 0 . 5 hr to about 6 hrs, about 0 . 5 hr to about 3 hrs , about One or more methylation sensitive restriction enzymes 1 hr to about 10 hrs , about 1 hr to about 5 hr, or about 1 hr can be used in the methods, compositions, and kits provided to about 3 hr. herein . The one or more methylation sensitive enzyme can 20 A restriction enzyme digest can be performed at a tem include , e . g ., Dpnl, Acc651, Kpnl, Apal, Bsp1201, Bsp1431, perature of about, or more than about 5 , 6 , 7 , 8 , 9 , 10 , 11, 12 , Mbol, BspOI, Nhel, Cfr91, Smal, Csp6l, Rsal, Ec113611 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , SacI , EcoRII, Mval, Hpall , or Mspl. In one embodiment, a 29 , 30 , 31, 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41, 42 , 43 , 44 , methylation sensitive restriction enzyme can cleave nucleic 45 , 46 , 47 , 48 , 49, 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , acid that is not methylated , but cannot cleave nucleic acid 25 61, 62 , 63 , 64 , or 65° C . A restriction enzyme digest can be this is methylated . performed at a temperature of about 10 to about 65° C . , The restriction enzymes used in the present disclosure can about 20 to about 65° C ., about 30 to about 65° C . , about 37 be selected to specifically digest a selected region of nucleic to about 65° C . , about 40 to about 65° C ., about 50 to about acid sequence . In one embodiment, the one or more restric - 65° C ., about 25 to about 37° C ., or about 30 to about 37° tion enzymes cut between target nucleic acid sequences or 30 C . target amplicons. One or more enzymes can be chosen The pH of a restriction enzyme digest using one or more whose recognition sequences occur e. g. , once or multiple restriction enzymes can be about 2 , 2. 5 , 3 , 3 .5 , 4 , 4 . 5, 5 , 5 .5 , times near the target nucleic acid sequences or target 6 , 6 .5 , 6 .6 , 6 .7 , 6 .8 , 6 . 9 , 7 , 7 . 1 , 7 .2 , 7 .3 , 7 . 4 , 7 .5 , 7 .5 , 7 .6 , 7 .7 , amplicons . In one embodiment , care is taken to ensure that 7 . 8 , 7 . 9 , 8 , 8 . 1 , 8 . 2 , 8 . 3 , 8 . 4 , 8 . 5 , 8 . 6 , 8 . 7 , 8 . 8 , 8 . 9 , 9 , 9 . 5 , 10 , these recognition sequences are not affected by the presence 35 10 . 5 , 11 , 11 . 5 , 12 , or 12 . 5 . The pH of a restriction enzyme of SNPs . digest can be about 5 to about 9 , about 5 to about 8 , about In one embodiment , a restriction enzyme is an efficient 5 to about 7 , about 6 to about 9 , or about 6 to about 8 . but specific ( no star activity ) cutter . This property , along restriction enzyme digest can contain one or more with digestion time and enzyme concentration , can be deter- buffers . The one or more buffers can be , e . g ., tris - HCl, mined in advance by performing appropriate enzyme titra - 40 bis - tris - propane -HCI , TAPs , bicine, tris , tris - acetate , tris tion experiments . In one embodiment, a restriction enzyme HCl, tricine , TAPSO , HEPES , TES , MOPS , PIPES , cacody can have star activity . Star activity can be the cleavage of late , SSC , phosphate buffer , collidine , veronal acetate , sequences that are similar but not identical to a defined MES ., ADA , ACES , cholamine chloride, acetamidoglycine , recognition sequence . glycinamide , maleate , CABS , piperidine, glycine, citrate , The ratio of the number of “ units ” of a restriction enzyme 45 glycylglycine , malate , formate, succinate , acetate , propi to an amount of nucleic acid ( e . g ., DNA or RNA ) can be , onate , pyridine, piperazine , histidine , bis - tris , ethanolamine , e . g . , about, or at least about , 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , carbonate , MOPSO , imidazole , BIS - TRIS propane , BES , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , MOBS, triethanolamine ( TEA ) , HEPPSO , POPSO , hydra 28 , 29 , 30 , 31, 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , zine, Trizma ( tris ) , EPPS , HEPPS , bicine , HEPBS, AMPSO , 44 , 45 , 46 , 47 , 48 , 49, 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 50 taurine (AES ), borate , CHES , 2 - amino - 2 -methyl - 1 -propanol 60 , 61, 62 , 63 , 64 , 65 , 66 , 67, 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , (AMP ) , ammonium hydroxide , or methylamine . The con 76 , 77 , 78 , 79 , 80 , 81, 82 , 83 , 84 , 85 , 86 , 87, 88 , 89 , 90 , 91 , centration of a buffer in a solution can be , e . g . , about 1 , 2 , 92 , 93, 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11, 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 106 , 107 , 108 , 109 , 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 118 , 119 , 120 , 121 , 122 , 123 , 124 , 125 , 126 , 127 , 128 , 129 , 55 37 , 38 , 39 , 40 , 41, 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 130 , 131, 132, 133 , 134 , 135 , 136 , 137 , 138 , 139 , 140 , 141 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61, 62 , 63 , 64 , 65 , 66 , 67 , 68 , 142 , 143 , 144 , 145 , 146 , 147 , 148 , 149 , 150 , 155 , 160 , 165, 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81, 82 , 83 , 84 , 170 , 175 , 180 , 185, 190 , 195 , 200 , 205 , 210 , 215 , 220 , 225 , 85 , 86 , 87 , 88 , 89, 90 , 91, 92 , 93, 94 , 95 , 96 , 97 , 98, 99 , or 230 , 235 , 240 , 245, 250 , 300 , 350 , 400 , 500 , 600 , 700 , 800 , 100 mM . The concentration of buffer in a solution can be 900 , 1000 , 2000 , 3000 , 4000 , 5000 , 6000 , 7000 , 8000 , 9000 , 60 about 10 to about 100 mM , about 10 to about 75 mM , about 10 ,000 , 12 , 000 , 14 , 000 , 15 , 000 , 16 , 000 , 18 ,000 , or 20 , 000 25 to about 75 mm , or about 10 to about 50 mM . units /. mu . g of nucleic acid . The ratio of the number of units A restriction enzyme digest using one or more restriction of restriction enzyme to an amount of nucleic acid can be enzymes can comprise bovine serum albumin ( BSA ) . The about 1 to about 20 ,000 , about 1 to about 10 ,000 , about 1 to concentration of BSA in a restriction digest can be about, or about 5 , 000 , about 100 to about 10 , 000 , about 100 to about 65 more than about 0 .01 , 0 .02 , 0 . 03 , 0 .04 , 0 . 05 , 0 . 06 , 0 .07 , 0 .08 , 1 ,000 , about 50 to about 500 , or about 50 to about 250 0 .09 , 0 . 1 , 0 . 2 , 0 . 3 , 0 . 4 , 0 . 5 , 0 .6 , 0 . 7 , 0 . 8 , 0 . 9 , 1 , 1 . 5 , 2 , 3 , 4 , units / .mu . g . 5 , 6 , 7 , 8 , 9 , or 10 mg/ml . The concentration of BSA in a US 10 , 167 , 509 B2 45 46 restriction digest can be about 0 .01 to about 10 mg/ ml , about enzymes that can be used in a restriction enzyme digest can 0 .01 to about 1 mg/ml , about 0 .05 to about 1 mg/ ml , or about be , e .g . , about 1 to about 20 , about 1 to about 15 , about 1 to 0 .05 to about 0 . 5 mg/ ml . about 10 , about 1 to about 7 , about 1 to about 6 , about 1 to A restriction enzyme digest using one or more restriction about 5 , about 1 to about 4 , about 1 to about 3 , or about 1 enzymes can comprise glycerol. Glycerol can be at a con - 5 to about 2 . centration (volume to volume ) of about, or at least about, 1 , There is some evidence that a PCR works better when the 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , size of the fragment containing the amplicon or targets is 20 , 21 , 22 , 23 , 24 , or 25 percent. The concentration of relatively small . Therefore, selecting restriction enzymes glycerol in a restriction enzyme digest can be about 1 to with cutting sites near the amplicons or target can be about 25 % , about 1 to about 20 % , about 1 to about 15 % , 10 desirable . For example , a restriction enzyme recognition site about 1 to about 10 % , or about 1 to about 5 % . or cleavage site can be within about, or less than about, 1 , A restriction enzyme digest can comprise one or more 2 , 3 , 4 , 5 , 6 , 7, 8 , 9 , 10 , 11, 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , organic solvents , e . g . , DMSO , ethanol, ethylene glycol, 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31, 32 , 33 , 34 , 35 , dimethylacetamide, dimethylformamide , or suphalane. A 36 , 37 , 38 , 39 , 40 , 41, 42 , 43 , 44 , 45 , 46 , 47 , 48, 49, 50 , 51, restriction enzyme digest can be free of one or more organic 15 52, 53 , 54 , 55, 56 , 57, 58, 59 , 60, 61, 62 , 63, 64 , 65, 66 , 67 , solvents . 68 , 69 , 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , A restriction enzyme digest can comprise one or more 84 , 85 , 86 , 87 , 88, 89, 90 , 91, 92, 93 , 94 , 95 , 96 , 97 , 98 , 99, divalent cations . The one or more divalent cations can be , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111 , e . g . , Mg2 + , Mn2 + , Cu2+ , Co2 + , or Zn2+ . 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121 , 122 , 123 , A restriction digest can comprise one or more salts . The 20 124 , 125 , 126 , 127, 128 , 129 , 130 , 131 , 132 , 133 , 134 , 135 , one or more salts can include , for example , potassium 136 , 137 , 138 , 139 , 140 , 141, 142 , 143 , 144 , 145 , 146 , 147 , acetate , potassium chloride, magnesium acetate, magnesium 148 , 149 , 150 , 155 , 160 , 165 , 170 , 175 , 180 , 185 , 190 , 195 , chloride , sodium acetate , or sodium chloride . The concen - 200 , 205 , 210 , 215 , 220 , 225 , 230 , 235 , 240 , 245 , 250 , 300 , tration of each of the one or more salts can be , e .g ., about, 350 , 400 , 500 , 600 , 700 , 800 , 900 , 1000 , 2000 , 3000 , 4000 , or more than about, 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 25 5000 , 6000 , 7000 , 8000 , 9000 , 10 ,000 , 12 ,000 , 14 , 000 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29, 15 , 000 , 16 ,000 , 18 , 000 , or 20 , 000 base pairs from the 5 ' end 30 , 31, 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , or 3 ' end of one of the targets on a polynucleotide. A 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61 , restriction enzyme recognition site or cleavage site can be 62 , 63 , 64 , 65 , 66 , 67 , 68 , 69 , 70 , 71, 72 , 73 , 74 , 75 , 76 , 77 , within about 1 to about 10 ,000 , about 1 to about 5 , 000 , about 78 , 79 , 80 , 81, 82 , 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 30 1 to about 2 , 500 , about 1 to about 1 , 000 , about 1 to about 94 , 95 , 96 , 97 , 98 , 99 , 100 , 101 , 102 , 103 , 104 , 105 , 106 , 100 , about 100 to about 1000 , about 100 to about 500 , or 107, 108 , 109, 110 , 111 , 112 , 113 , 114 , 115 , 116 , 117 , 118 , about 100 to about 250 bp from the 5 ' or 3 ' end of a target 119, 120 , 121, 122, 123 , 124, 125 , 126 , 127, 128 , 129 , 130 , nucleic acid sequence. 131, 132 , 133 , 134 , 135 , 136 , 137 , 138 , 139 , 140 , 141 , 142 , A single sample can be analyzed for multiple CNVs. In 143 , 144 , 145 , 146 , 147 , 148 , 149 , 150 , 155 , 160 , 165 , 170 , 35 this case , it can be desirable to select the smallest number of 175 , 180 , 185 , 190 , 195 , 200 , 205 , 210 , 215 , 220 , 225 , 230 , digests that would work well for the entire set of CNVs . In 235 , 240 , 245, or 250 mM . The concentration of each of the one embodiment, a single restriction enzyme cocktail can be one or more salts can be about 5 to about 250 , about 5 to found that does not cut within any of the amplicons or target about 200 , about 5 to about 150 , about 5 to about 100 , about nucleic acid sequences but has recognition sites or cleavage 10 to about 100 , about 10 to about 90 , about 10 to about 80 , 40 sites near each one of them . A restriction enzyme recognition about 10 to about 70 , about 10 to about 60 , or about 10 to site or cleavage site can be within about , or less than about , about 50 mM . 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19, A restriction digest can comprise one or more reducing 20 , 21 , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , agents . The one or more reducing agents can inhibit the 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51, formation of disulfide bonds in a protein . A reducing agent 45 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61, 62 , 63 , 64 , 65 , 66 , 67, can be, for example , dithiothreitol (DTT ) , 2 -mercaptoetha 68 , 69 , 70 , 71 , 72, 73 , 74 , 75 , 76 , 77 , 78 , 79, 80 , 81, 82 , 83 , nol (BME ) , 2 -mercaptoethylamine -HC1 , tris ( 2 -carboxythyl ) 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , phosphine ( TCEP ) , or cysteine -HC1 , The concentration of 100 , 101 , 102 , 103 , 104 , 105 , 106 , 107 , 108 , 109 , 110 , 111, the one or more reducing agents in a restriction enzyme 112 , 113 , 114 , 115 , 116 , 117 , 118 , 119 , 120 , 121, 122 , 123 , digest can be about, or more than about, 0 .01 , 0 . 02 , 0 .03 , 50 124 , 125 , 126 , 127 , 128 , 129 , 130 , 131, 132 , 133 , 134 , 135 , 0 .04 , 0 .05 , 0 .06 , 0 .07 , 0 .08 , 0 .09 , 0 . 1 , 0 . 2 , 0 . 3 , 0 . 4 , 0 . 5 , 0 .6 , 136 , 137 , 138 , 139, 140 , 141, 142 , 143 , 144 , 145 , 146 , 147, 0 . 7 , 0 . 8 , 0 . 9 , 1 , 1 . 1 , 1 . 2 , 1 . 3 , 1 . 4 , 1 . 5 , 1 . 6 , 1 . 7 , 1 . 8 , 1 . 9 , 2 , 3 , 148 , 149 , 150 , 155 , 160 , 165 , 170 , 175 , 180 , 185 , 190 , 195 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , or 200 , 205 , 210 , 215 , 220 , 225 , 230 , 235 , 240 , 245 , 250 , 300 , 25 mM . The concentration of the one or more reducing 350 , 400 , 500 , 600 , 700 , 800 , 900 , 1000 , 2000 , 3000 , 4000 , agents in a restriction enzyme digest can be about 0 .01 to 55 5000 , 6000 , 7000 , 8000 , 9000 , 10 ,000 , 12 ,000 , 14 ,000 , about 25 mM , about 0 .01 to about 15 mM , about 0 .01 to 15 , 000 , 16 , 000 , 18 , 000 , or 20 , 000 base pairs from the 5 ' end about 10 mM , about 0 .01 to about 5 mM , about 0 . 1 to about or 3 ' end of one of the targets on a polynucleotide . A 5 mM , or about 0 . 5 to about 2 .5 mM . restriction enzyme recognition site or cleavage site can be More than one restriction enzyme can be used in a within about 1 to about 20 ,000 , about 10 to about 20 ,000 , restriction enzyme digest of nucleic acid . For example , 60 about 100 to about 20 , 000 , about 1000 to about 20 , 000 , multiple -digests can be employed if one or more of the about 10 to about 10, 000 , about 10 to about 1000 , about 10 restriction enzymes do not efficiently cut a nucleic acid , or to about 100 , about 50 to about 20 , 000 , about 50 to about if they do not all work universally well across all samples 1000 , about 50 to about 500 , about 50 to about 250 , about ( e . g . , because of SNPs) . The number of different restriction 50 to about 150 , or about 50 to 100 base pairs from the 5 ' end enzymes that can be used in a restriction digest can be about , 65 or 3 ' end of one or the targets on a polynucleotide . or at least about, 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , Appropriate software can be written and / or used to auto 15 , 16 , 17 , 18 , 19 , or 20 . The number of different restriction mate the process of restriction enzyme choice and present an US 10 , 167 ,509 B2 47 48 interface for a user, e . g . , an experimental biologist , to choose digest can be , e . g ., about, or at least about , 1 , 2 , 3 , 4 , 5 , 6 , the most appropriate enzymes given the criteria above . 7 , 8 , 9 , 10 , 11 , 12, 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, 22 , 23, Additional considerations can be employed by the software , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31, 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , such as enzyme cost, enzyme efficiency, buffer compatibility 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , 52 , 53 , 54 , 55 , of restriction enzymes , methylation sensitivity , number of 5 56 , 57 , 58 , 59 , 60 , 61, 62 , 63 , 64 , 65 , 66 ,67 , 68, 69 , 70 , 71, cleavage sites in a segment of nucleic acid , or availability. 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 . 87 . The software can be used on a computer. An algorithm can 88 , 89 , 90 , 91, 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , or 100 mM . The be generated on a computer readable medium and be used to concentration of the one or more chelating agents can be select one or more restriction enzymes for digesting nucleic about 1 to about 100 mM , about 1 to about 75 mM , or about acid . A computer can be connected to the interns and can be 10 25 to about 75 mM . used to access a website that can permit selection of restric A control assay and template can be used to measure the tion endonucleases . A web tool can be used to select efficiency of a restriction enzyme digestion step . restriction enzymes that will cut around an amplicon in order Samples to separate linked gene copies for CNV estimation . For Samples to be analyzed using the methods, compositions, example , enzymes and assays can be stored in a database 15 and kits provided herein can be derived from a non - cellular and selection of a restriction enzyme can be automatic . entity comprising nucleic acid ( e . g . , a virus ) or from a Additional statistics that can be considered include, e . g ., cell -based organism ( e . g ., member of archaea , bacteria , or length of shortest fragment, % GC content, frequency of cuts eukarya domains) . A sample can be obtained in some cases around (or in ) an amplicon , and cost of enzymes . QTools can from a hospital, laboratory , clinical or medical laboratory. be used to assist in the selection of one or more restriction 20 The sample can comprise nucleic acid , e . g ., RNA or DNA . enzymes . FIGS. 13 and 14 illustrate information that can be The sample can comprise cell - free nucleic acid . In some considered when selecting a restriction enzyme. cases , the sample is obtained from a swab of a surface , such For assay storage for data analysis , a researcher can enter as a door or bench top . assays by location or primer sequence . QTools can auto The sample can from a subject, e . g . , a plant, fungi, matically retrieve and stores amplicon sequences and known 25 eubacteria , archeabacteria , protest, or animal. The subject SNPs and compute thermodynamic parameters . As research - may be an organism , either a single - celled or multi - cellular ers use the assay more , they can enter additional data , organism . The subject may be cultured cells , which may be including confirmed sample CNVs and annealing tempera - primary cells or cells from an established cell line , among tures . others . The sample may be isolated initially from a multi Digestion with more than one enzyme, performed serially 30 cellular organism in any suitable form . The animal can be a or together in a single tube , can help to ensure complete fish , e . g . , a zebrafish . The animal can be a mammal. The cutting of difficult targets . A series of restriction enzyme mammal can be , e . g ., a dog , cat, horse , cow , mouse , rat, or digests of one sample can be performed with different pig . The mammal can be a primate , e . g ., a human , chim enzymes , e . g . , about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , or 10 . panzee , orangutan , or gorilla . The human can be a male or One or more restriction enzymes in a digest can be 35 female . The sample can be from a human embryo or human inactivated following the restriction enzyme digest . In some fetus . The human can be an infant, child , teenager, adult, or embodiments , the one or more restriction enzymes cannot be elderly person . The female can be pregnant, can be sus inactivated by exposure to heat. Most restriction enzymes pected of being pregnant , or planning to become pregnant. can be heat- inactivated after restriction by raising the tem - The sample can be from a subject ( e . g . , human subject ) perature of the restriction reaction . The temperature for 40 who is healthy. In some embodiments , the sample is taken heat- inactivation can be , e. g . , about, or more than about, 50 , from a subject ( e . g ., an expectant mother ) at least 4 , 5 , 6 , 7 , 51 , 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59 , 60 , 61, 62 , 63 , 64 , 65 , 66 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, 22 , 23 , 24 , 67 , 68 , 69 , 70 , 71, 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 25 , or 26 weeks of gestation . In some embodiments , the 83 , 84 , 85 , 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , subject is affected by a genetic disease , a carrier for a genetic 99 , or 100 degrees Celsius . The temperature for heat inac - 45 disease or at risk for developing or passing down a genetic tivation can be about 50 to about 100 , about 50 to about 90 , disease , where a genetic disease is any disease that can be about60 to about 90 , about65 to about 90 , about65 to about linked to a genetic variation such as mutations, insertions, 85 , or about 65 to about 80 degrees Celsius . The duration of additions, deletions , translocation , point mutation , trinucle heat- inactivation can be , e . g ., about, or more than about , 1 , otide repeat disorders and / or single nucleotide polymor 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11, 12 , 13 , 14 , 15 , 16 , 17, 18 , 19 , 50 phisms ( SNPs ). In other embodiments , the sample is taken 20 , 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29, 30 , 31 , 32 , 33 , 34 , 35 , from a female patient of child -bearing age and , in some 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51 , cases , the female patient is not pregnant or of unknown 52 , 53 , 54 , 55 , 56 , 57 , 58 , 59, 60 , 65 , 70 , 80 , 90 , 100 , 110 , pregnancy status. In still other cases , the subject is a male 120 , 180 , 240 , or 300 minutes. The duration of heat patient, a male expectant father, or a male patient at risk of, inactivation can be about 5 to about 300 , about 5 to about 55 diagnosed with , or having a specific genetic abnormality . In 200 , about 5 to about 150 , about 5 to about 100 , about 5 to some cases , the female patient is known to be affected by , or about 75 , about 5 to about 50 , about 5 to about 40 , about 5 is a carrier of, a genetic disease or genetic variation , or is at to about 30 , about 5 to about 35 , about 5 to about 25 , about risk of, diagnosed with , or has a specific genetic abnormal 5 to about 20 , or about 10 to about 20 minutes . The ity . In some cases , the status of the female patient with temperature of heat - inactivation can be below the melt point 60 respect to a genetic disease or genetic variation may not be of the restricted target fragments , so as to maintain double known . In further embodiments , the sample is taken from stranded template copies. any child or adult patient of known or unknown status with A restriction enzyme digest can be stopped by addition of respect to copy number variation of a genetic sequence . In one or more chelating agents to the restriction enzyme some cases, the child or adult patient is known to be affected digest. The one or more chelating agents can be , e . g . , EDTA , 65 by , or is a carrier of, a genetic disease or genetic variation . EGTA , citric acid , or a phosphonate . The concentration of The sample can be from a subject who has a specific the one or more chelating agents in a restriction enzyme disease, disorder, or condition , or is suspected of having (or US 10 , 167, 509 B2 49 50 at risk of having ) a specific disease , disorder or condition . e . g ., heart , skin , liver , lung, breast, stomach , pancreas, For example, the sample can be from a cancer patient , a bladder, colon , gall bladder, brain , etc . patient suspected of having cancer , or a patient at risk of When the nucleic acid is RNA , the source of the RNA can having cancer. The cancer can be, e . g ., acute lymphoblastic be any source described herein . For example , the RNA can leukemia ( ALL ) , acute myeloid leukemia ( AML ) , adreno - 5 a cell - free mRNA , can be from a tissue biopsy , core biopsy , fine needle aspirate , flash frozen , or formalin - fixed paraffin cortical carcinoma, Kaposi Sarcoma, anal cancer, basal cell embedded ( FFPE ) sample . The FFPE sample can be depar carcinoma, bile duct cancer , bladder cancer, bone cancer, affinized before the RNA is extracted . The extracted RNA osteosarcoma, malignant fibrous histiocytoma, brain stem can be heated to about 30 , 31, 32 , 33 , 34 , 35 , 36 , 37 , 38 , 39 , glioma , brain cancer , craniopharyngioma, ependymoblas 10 40 , 41 , 42 , 43, 44 , 45 , 46 , 47 , 48, 49, 50 , 51, 52 , 53 , 54 , 55 , toma, ependymoma , medulloblastoma, medulloeptithe 56 , 57 , 58 , 59 , 60 , 61, 62 , 63 , 64 , 65 , 66 , 67, 68 , 69, 70 , 71 , lioma, pineal parenchymal tumor, breast cancer , bronchial 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , 86 , 87 , tumor, Burkitt lymphoma, Non -Hodgkin lymphoma, carci 88 , 89 , 90 , 91, 92 , 93 , 94 , 95 , 96 , 97 , 98 , or 99 .degree . C . noid tumor, cervical cancer , chordoma, chronic lymphocytic before analysis . The extracted RNA can be heated to any leukemia (CLL ), chromic myelogenous leukemia ( ML ) , 15 these temperatures for about, or at least about, 15 min , 30 colon cancer, colorectal cancer, cutaneous T - cell lymphoma, min , 45 min , 60 min , 1. 5 hr, 2 hr, 2 .5 hr, 3 hr, 3 .5 hr, 4 hr, ductal carcinoma in situ , endometrial cancer, esophageal 4 . 5 hr, 5 hr, 5 . 5 hr, 6 hr, 6 . 5 hr, 7 hr, 7 . 5 hr, 8 hr, 8 . 5 hr, 9 cancer , Ewing Sarcoma, eye cancer , intraocular melanoma , hr . 9 . 5 hr, or 10 hr . retinoblastoma, fibrous histiocytoma, gallbladder cancer, RNA can be used for a variety of downstream applica gastric cancer , glioma, hairy cell leukemia , head and neck 20 tions . For example , the RNA can be converted to cDNA with cancer , heart cancer , hepatocellular ( liver ) cancer , Hodgkin a reverse transcriptase and the cDNA can optionally be lymphoma, hypopharyngeal cancer , kidney cancer, laryn - subject to PCR , e . g ., real- time PCR . The RNA or cDNA can geal cancer, lip cancer, oral cavity cancer , lung cancer, be used in an isothermal amplification reaction , e .g ., an non -small cell carcinoma, small cell carcinoma, melanoma, isothermal linear amplification reaction . The RNA , resulting mouth cancer, myelodysplastic syndromes , multiple 25 cDNA , or molecules amplified therefrom can be used in a myeloma, medulloblastoma, nasal cavity cancer, paranasal microarray experiment, gene expression experiment, North sinus cancer, neuroblastoma, nasopharyngeal cancer, oral ern analysis , Southern analysis , sequencing reaction , next cancer, oropharyngeal cancer, osteosarcoma, ovarian cancer, generation sequencing reaction , etc . Specific RNA pancreatic cancer, papillomatosis, paraganglioma, parathy - sequences can be analyzed , or RNA sequences can be roid cancer, penile cancer , pharyngeal cancer, pituitary 30 globally analyzed . tumor, plasma cell neoplasm , prostate cancer, rectal cancer, Nucleic acids can be extracted from a sample by means renal cell cancer , rhabdomyosarcoma , salivary gland cancer , available to one of ordinary skill in the art . Sezary syndrome, skin cancer , nonmelanoma, small intes The sample may be processed to render it competent for tine cancer, soft tissue sarcoma, squamous cell carcinoma, amplification . Exemplary sample processing may include testicular cancer, throat cancer , thymoma , thyroid cancer , 35 lysing cells of the sample to release nucleic acid , purifying urethral cancer, uterine cancer, uterine sarcoma, vaginal the sample ( e . g ., to isolate nucleic acid from other sample cancer , vulvar cancer, Waldenstrom Macroglobulinemia , or components , which may inhibit amplification ) , diluting Wilms Tumor. The sample can be from the cancer and / or concentrating the sample , and / or combining the sample with normal tissue from the cancer patient . reagents for amplification , such as a DNA / RNA polymerase The sample can be from a subject who is known to have 40 ( e . g ., a heat- stable DNA polymerase for PCR amplification ) , a genetic disease , disorder or condition . In some cases, the dNTPs ( e . g . , dATP, ACTP , dGTP, and dTTP ( and / or dUTP ) ) , subject is known to be wild - type or mutant for a gene , or a primer set for each allele sequence or polymorphic locus portion of a gene , e . g ., CFTR , Factor VIII ( F8 gene ), beta to be amplified , probes ( e . g . , fluorescent probes , such as globin , hemachromatosis , GOPD , neurofibromatosis , TAQMAN probes or molecular beacon probes , among oth GAPDH , beta amyloid , or pyruvate kinase gene . In some 45 ers ) capable of hybridizing specifically to each allele cases, the status of the subject is either known or not known , sequence to be amplified , Mg2 + , DMSO , BSA , a buffer, or and the subject is tested for the presence of a mutation or any combination thereof, among others . In some examples, genetic variation of a gene , e . g . , CFTR , Factor VIII ( F8 the sample may be combined with a restriction enzyme, gene ) , beta globin , hemachromatosis , G6PD , neurofibroma - uracil -DNA glycosylase (UNG ) , reverse transcriptase , or tosis , GAPDH , beta amyloid , or pyruvate kinase gene . 50 any other enzyme of nucleic acid processing. The sample can be aqueous humour, vitreous humour, Target Polynucleotide bile , whole blood, blood serum , blood plasma, breast milk , The methods described herein can be used for analyzing cerebrospinal fluid , cerumen , enolymph , perilymph , gastric or detecting one or more target nucleic acid molecules. The juice , mucus, peritoneal fluid , saliva , sebum , semen , sweat , term polynucleotide , or grammatical equivalents , can refer tears , vaginal secretion , vomit , feces, or urine . The sample 55 to at least two nucleotides covalently linked together . A can be obtained from a hospital , laboratory, clinical or nucleic acid described herein can contain phosphodiester medical laboratory . The sample can be taken from a subject. bonds , although in some cases , as outlined below (for The sample can comprise nucleic acid . The nucleic acid can example in the construction of primers and probes such as be , e . g . , mitochondrial DNA , genomic DNA , mRNA , label probes ) , nucleic acid analogs are included that can siRNA , miRNA, CRNA , single -stranded DNA, double - 60 have alternate backbones, comprising , for example, phos stranded DNA , single - stranded RNA , double - stranded phoramide (Beaucage et al. , Tetrahedron 49 ( 10 ) : 1925 RNA , URNA , rRNA , or cDNA . The sample can comprise ( 1993 ) and references therein ; Letsinger, J . Org . Chem . cell - free nucleic acid . The sample can be a cell line , genomic 35 : 3800 ( 1970 ) ; Sprinzl et al. , Eur. J . Biochem . 81: 579 DNA , cell -free plasma , formalin fixed paraffin embedded ( 1977 ) ; Letsinger et al. , Nucl. Acids Res . 14 : 3487 ( 1986 ) ; ( FFPE ) sample, or flash frozen sample . A formalin fixed 65 Sawai et al, Chem . Lett . 805 ( 1984 ) , Letsinger et al . , J . Am . paraffin embedded sample can be deparaffinized before Chem . Soc . 110 :4470 ( 1988 ); and Pauwels et al. , Chemica nucleic acid is extracted . The sample can be from an organ , Scripta 26 :141 91986 )) , phosphorothioate (Mag et al. , US 10 , 167 , 509 B2 51 52 Nucleic Acids Res. 19 :1437 ( 1991 ); and U .S . Pat. No . rare population of polynucleotides exists within a larger 5 ,644 ,048 ), phosphorodithioate (Briu et al ., J. Am . Chem . population of polynucleotides . Soc . 111 :2321 ( 1989 ) , Omethylphophoroamidite linkages The number of copies of a target nucleic acid sequence in (see Eckstein , Oligonucleotides and Analogues : A Practical a sample ( e . g ., a genome) of a subject whose sample is Approach , Oxford University Press ) , and peptide nucleic 5 analyzed using the methods, compositions, and kits pro acid ( also referred to herein as “ PNA ” ) backbones and vided herein can be 0 , or about, or at least about 1 , 2 , 3 , 4 , linkages ( see Egholm , J . Am . Chem . Soc . 114 : 1895 ( 1992 ) ; 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, Meier et al. , Chem . Int. Ed . Engl. 31 : 1008 ( 1992 ) ; Nielsen , 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , 37 , Nature , 365 : 566 ( 1993 ) ; Carlsson et al. , Nature 380 : 207 38 , 39 , 40 , 41, 42, 43 , 44 , 45 , 46 , 47, 48 , 49 , 50 , 51, 52 , 53 , ( 1996 ) , all of which are incorporated by reference ). Other 10 54 , 55 , 56 , 57, 58, 59 , 60 , 61, 62, 63, 64 , 65 , 66 , 67, 68 , 69 , analog nucleic acids include those with bicyclic structures 70 , 71 , 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 80 , 81 , 82 , 83 , 84 , 85 , including locked nucleic acids (also referred to herein as 86 , 87 , 88 , 89 , 90 , 91 , 92 , 93 , 94 , 95 , 96 , 97 , 98 , 99 , 100 , 150 , “ LNA ” ), Koshkin et al. , J. Am . Chem . Soc . 120 . 13252 3 200 , 250 , 300 , 350 , 400 , 450 , 500 , 550 , 600 , 650 , 700 , 750 , ( 1998 ) ; positive backbones (Denpcy et al. , Proc . Natl . Acad . 800 , 850 , 900 , 950 , 1000 , 5000 , 10 , 000 , 20 ,000 , 50 ,000 , or Sci. USA 92 :6097 ( 1995 ) ; non - ionic backbones ( U . S . Pat. 15 100 , 000 . The number of copies of a target nucleic acid Nos. 5 ,386 , 023 , 5 ,637 ,684 , 5 ,602 ,240 , 5 ,216 , 141 and 4 ,469 , sequence in a genome of a subject whose sample is analyzed 863; Kiedrowshi et al ., Angew . Chem . Intl. Ed . English using the methods , compositions, and kits provided herein 30 :423 ( 1991 ) ; Letsinger et al. , J. Am . Chem . Soc. 110 :4470 can be about 1 to about 20 , about 1 to about 15 , about 1 to ( 1988 ) ; Letsinger et al. , Nucleoside & amp ; Nucleotide about 10 , about 1 to about 7 , about 1 to about 5 , about 1 to 13 : 1597 ( 1994 ) ; Chapters 2 and 3 , ASC Symposium Series 20 about 3 , about 1 to about 1000 , about 1 to about 500 , about 580 , “ Carbohydrate Modifications in Antisense Research ” , 1 to about 250 , about 1 to about 100 , about 10 to about 1000 , Ed . Y . S . Sanghui and P . Dan Cook ; Mesmaeker et al. , about 10 to about 500 , about 10 to about 250 , about 10 to Bioorganic & Medicinal Chem . Lett . 4 : 395 ( 1994 ) ; Jeffs et about 100 , about 10 to about 50 , about 10 to about 20 , about al. , J . Biomolecular NMR 34 : 17 ( 1994 ) ; Tetrahedron Lett . O to about 100 , about 0 to about 50 , about 0 to about 25 or 37 :743 ( 1996 ) ) and non - ribose backbones, including those 25 about 0 to about 10 . described in U . S . Pat . Nos. 5 ,235 ,033 and 5 ,034 ,506 , and The target nucleic acid sequence can be on one chromo Chapters 6 and 7 , ASC Symposium Series 580 , “ Carbohy - some. If the target nucleic acid is in a sample derived from drate Modifications in Antisense Research ” , Ed . Y . S . San - a human subject, the target nucleic acid sequence can be on ghui and P . Dan Cook . Nucleic acids containing one or more one or more of chromosome 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , carbocyclic sugars are also included within the definition of 30 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21, 22 , X , or Y . The target nucleic acids ( see Jenkins et al. , Chem . Soc . Rev . ( 1995 ) pp nucleic acid can be on about , or more than about, 1 , 2 , 3 , 4 , 169 176 ) . Several nucleic acid analogs are described in 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , Rawls , C & E News Jun . 2 , 1997 page 35 . “ Locked nucleic 22 , or 23 chromosomes . Two or more copies of the target acids ” are also included within the definition of nucleic acid nucleic acid sequence can be on the same or different analogs . LNAs are a class of nucleic acid analogues in which 35 chromosomes . In a human subject, two or more copies of the the ribose ring is “ locked ” by a methylene bridge connecting target nucleic acid sequence can be on chromosome 1 , 2 , 3 , the 2 - 0 atom with the 4 '- C atom . All of these references are 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 21 , hereby expressly incorporated by reference. These modifi - 22 , X , or Y . Two or more copies of the target nucleic acid cations of the ribose - phosphate backbone can be done to sequence can be on one polynucleotide ( e . g ., chromosome) increase the stability and half -life of such molecules in 40 in a subject, but the target nucleic acids can be separated in physiological environments . For example , PNA :DNA and a sample taken from the subject due to handling of the LNADNA hybrids can exhibit higher stability and thus can sample ( e . g ., by fragmentation ) . be used in some embodiments. The target nucleic acids can When two copies of a target nucleic acid are on the same be single stranded or double stranded , as specified , or polynucleotide , e . g ., same chromosome, the two copies can contain portions of both double stranded or single stranded 45 be spaced apart on the polynucleotide by about, more than sequence . Depending on the application , the nucleic acids about, or less than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 25 , 50 , can be DNA ( including , e . g . , genomic DNA , mitochondrial 75 , 100 , 200 , 300 , 400 , 500 , 600 , 700 , 800 , 900 , 1000 , 2000 , DNA , and cDNA ), RNA ( including , e. g ., mRNA and rRNA ) 3000 , 4000 , 5000 , 6000 , 7000 , 8000 , 9000 , 10 ,000 , 20 ,000 , or a hybrid , where the nucleic acid contains any combination 30 , 000 , 40 ,000 , 50 ,000 , 60 , 000 , 70 ,000 , 80 , 000 , 90 ,000 , of deoxyribo - and ribo - nucleotides , and any combination of 50 100 , 000 , 200 ,000 , 300 ,000 , 400 , 000 , 500 , 000 ,600 , 000 , 700 , bases , including uracil, adenine, thymine, cytosine, guanine, 000 , 800 ,000 , 900 ,000 , 1 million , 2 million , 3 million , 4 inosine, xathanine hypoxathanine, isocytosine , isoguanine , million , 5 million , 6 million , 7 million , 8 million , 9 million , etc . 10 million , 20 million, 30 million , 40 million , 50 million , 60 The methods and compositions provided herein can be million , 70 million , 80 million , 90 million , or 100 million used to evaluate a quantity of polynucleotides (e . g ., DNA , 55 bases or base pairs . Two target nucleic acids can be spaced RNA , mitochondrial DNA, genomic DNA, mRNA , siRNA , apart by about 100 to about 100 , 000 , about 100 to about miRNA, CRNA , single -stranded DNA , double -stranded 10 , 000 , about 100 to about 1, 000 , about 10 to about 10 ,000 , DNA , single - stranded RNA , double - stranded RNA , TRNA , or about 10 to about 1 ,000 bases or base pairs. rRNA , DNA , etc . ) . The methods and compositions can be The target sequence can be a gene . For example , the gene used to evaluate a quantity of a first polynucleotide com - 60 can be ERBB2, EGFR , BRCA1, BRCA2, APC , MSH2, pared to the quantity of a second polynucleotide . The MSH6 , MLH1, CYP2D6 , a low - copy repeat (LCR ) - rich methods can be used to analyze the quantity of synthetic sequence ( see e. g . , Balikova et al . ( 2008 ) Am J . Hum Genet. plasmids in a solution ; to detect a pathogenic organism ( e . g ., 82 : 181 - 187 ) , TASIR1, GNATI, IMPDHI, OPN1SW , microbe , bacteria , virus, parasite , retrovirus, lentivirus, HIV OR2A12 , OR2A14 , OR2A2, OR2A25 , OR2A5, OR2A1, 1 , HIV - 2 , influenza virus, etc .) within a sample obtained 65 OR2A42 , OR2A7, OR4F21, OR4F29 , OR4C6 , OR4P4 , from a subject or obtained from an environment. The OR4S2 , OR5D13 , ROMI, TAS @ R14 , TAS2R44 , methods also can be used in other applications wherein a TAS2R48 , TAS2R49 , TAS2R50 , OR6C2 , OR6C4 , US 10 , 167 , 509 B2 53 54 OR6C68, OR6C70 , OR4MI, OR4Q3 , OR4K1, OR4K2 , nucleic acid sequence in each droplet and detecting the OR4K5 , OR4N2, OR4K13 , OR4K14 , OR4K15 , OR4M2, presence of a target nucleic acid sequence . In some cases , OR4N4 , OR1F1 , ACTG1, FSCN2, OR2Z1, OR11H1, the sequence that is amplified is present on a probe to the MYH9, SKI, TP73 , TNFRSF25 , RAB3B , VAV3, RALB , genomic DNA , rather than the genomic DNA itself. In some BOK , NAT6 , TUSC2, TUSC4 , TAB33B , C6orf210 , ESR1, 5 cases , at least 200 , 175 , 150 , 125 , 100 , 90 , 80 , 70 , 60 , 50 , 40 , MAFK , MADILI, MYC , VAV2 , MAP2K8, CDKNIC , 30 , 20 , 10 , or 0 droplets have zero copies of a target nucleic WT1, WIT - 1 , C1QTNF4 , MEN1, CCND1, ORAOV1, acid . MLL2 , C13orf10 , TNFAIP2 , AXIN1, BCAR1, TAX1BP3 , Information about an amplification reaction can be NF1, PHB , MAFG , CIQTNF1, YES1, DCC , SH3GL1, entered into a database . For example , FIGS. 15A and 15B TNFSF9 , TNFSF7 , TNFSF14 , VAVI, RAB3A , PTOV1, 10 illustrate assay information that can be entered into a data BAX , RRAS , BCAS4 , HIC2, NROB2, TTN , SGCB , SMA3, base . SMA4 , SMN2, LPA , PARK2, GCK , GPR51, BSCL2, A2M , Primers TBXA2R , FKRP , or COMT. Primers can be designed according to known parameters The target sequence can encode a microRNA, e . g . , hsa - for avoiding secondary structures and self -hybridization . let -7g , hsa -mir - 135a - 1 , hsa -mir - 95 , hsa -mir218 - 1 , hsa -mir - 15 Different primer pairs can anneal and melt at about the same 320 , has- let - 7a - 1 , has - let- 7d , has- let- 7f - 1 , has -mir - 202 , has temperatures, for example , within about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , mir - 130a , has- mir - 130a , has -mir338 , has -mir - 199a - 1 , has - 9 or 10° C . of another primer pair . In some cases, greater mir - 181c, has -mir - 181d , has -mir - 23a , has -mir - 24 - 2 , has than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 15 , 20 , 25 , 30 , 35 , 40 , mir - 27a , has -mir - 150 , hasmir - 499 , has -mir - 124a - 3 , or has 45 , 50 , 100 , 200 , 500 , 1000 , 5000 , 10 ,000 or more primers mir - 185 . 20 are initially used . Such primers may be able to hybridize to The target sequence can be any sequence listed in Wong the genetic targets described herein . In some embodiments , et al . (2007 ) Am J of Hum Genetics 80 : 91 - 104 . about 2 to about 10 , 000 , about 2 to about 5 ,000 , about 2 to Amplification and Detection about 2 ,500 , about 2 to about 1 , 000 , about 2 to about 500 , The methods described herein can make use of nucleic about 2 to about 100 , about 2 to about 50 , about 2 to about acid amplification . Amplification of target nucleic acids can 25 20 , about 2 to about 10 , or about 2 to about 6 primers are be performed by any means known in the art. Amplification used . can be performed by thermal cycling or isothermally . In Primers can be prepared by a variety ofmethods including exemplary embodiments , amplification may be achieved by but not limited to cloning of appropriate sequences and the polymerase chain reaction ( PCR ) . direct chemical synthesis using methods well known in the Examples of PCR techniques that can be used include , but 30 art (Narang et al. , Methods Enzymol. 68 : 90 ( 1979 ); Brown are not limited to , quantitative PCR , quantitative fluorescent et al. , Methods Enzymol. 68 : 109 ( 1979 ) ) . Primers can also PCR ( QF - PCR ), multiplex fluorescent PCR (MF -PCR ), real be obtained from commercial sources such as Integrated time PCR (RT - PCR ) , single cell PCR , restriction fragment DNA Technologies , Operon Technologies , Amersham Phar length polymorphism PCR (PCR - RFLP ) , PCR -RFLP /RT - macia Biotech , Sigma, and Life Technologies . The primers PCR - RFLP , hot start PCR , nested PCR , in situ polony PCR , 35 can have an identical melting temperature . The melting in situ rolling circle amplification ( RCA ) , bridge PCR , temperature of a primer can be about 30 , 31 , 32 , 33 , 34 , 35 , picotiter PCR , digital PCR , droplet digital PCR , and emul- 36 , 37 , 38 , 39 , 40 , 41 , 42 , 43 , 44 , 45 , 46 , 47 , 48 , 49 , 50 , 51, sion PCR . Other suitable amplification methods include the 52, 53 , 54 , 55 , 56 , 57 , 58 , 59, 60 , 61 , 62 , 63 , 64 , 65 , 66 , 67 , ligase chain reaction (LCR ) , transcription amplification , 68 , 69 , 70 , 71, 72 , 73 , 74 , 75 , 76 , 77 , 78 , 79 , 81, 82 , 83 , 84 , molecular inversion probe (MIP ) PCR , self - sustained 40 or 85° C . In some embodiments , the melting temperature of sequence replication , selective amplification of target poly - the primer is about 30 to about 85° C . , about 30 to about 80° nucleotide sequences , consensus sequence primed poly - C ., about 30 to about 75° C . , about 30 to about 70° C ., about merase chain reaction (CP -PCR ) , arbitrarily primed poly - 30 to about 65° C ., about 30 to about 60° C ., about 30 to merase chain reaction ( AP - PCR ), degenerate about 55° C . , about 30 to about 50° C ., about 40 to about 85° oligonucleotide -primed PCR (DOP - PCR ) and nucleic acid 45 C ., about 40 to about 80° C ., about 40 to about 75° C ., about based sequence amplification (NABSA ). Other amplifica - 40 to about 70° C ., about 40 to about 65° C ., about 40 to tion methods that can be used herein include those described about 60° C ., about 40 to about 55° C . , about 40 to about 50° in U . S . Pat . Nos. 5 ,242 ,794 ; 5 ,494 , 810 ; 4 , 988, 617 ; and C ., about 50 to about 85° C . , about 50 to about 80° C . , about 6 ,582 , 938 . Amplification of target nucleic acids can occur on 50 to about 75° C . , about 50 to about 70° C . , about 50 to a bead . In other embodiments , amplification does not occur 50 about 65° C ., about 50 to about 60° C ., about 50 to about 55° on a bead . Amplification can be by isothermal amplification , C ., about 52 to about 60° C ., about 52 to about 58° C ., about e .g ., isothermal linear amplification . A hot start PCR can be 52 to about 56° C ., or about 52 to about 54° C . performed wherein the reaction is heated to 95° C . for two The lengths of the primers can be extended or shortened minutes prior to addition of the polymerase or the poly - at the 5' end or the 3' end to produce primers with desired merase can be kept inactive until the first heating step in 55 melting temperatures. One of the primers of a primer pair cycle 1. Hot start PCR can be used to minimize nonspecific can be longer than the other primer. The 3 ' annealing lengths amplification . Other strategies for and aspects of amplifica of the primers, within a primer pair, can differ . Also , the tion are described in U . S . Patent Application Publication No. annealing position of each primer pair can be designed such 2010 /0173394 A1, published Jul. 8 , 2010 , which is incor - that the sequence and length of the primer pairs yield the porated herein by reference . 60 desired melting temperature . An equation for determining Techniques for amplification of target and reference the melting temperature of primers smaller than 25 base sequences are known in the art and include the methods pairs is the Wallace Rule ( Td = 2 ( A + T ) + 4 ( G + C ) ) . Computer described in U . S . Pat. No . 7 ,048 ,481 . Briefly , the techniques programs can also be used to design primers , including but can includemethods and compositions that separate samples not limited to Array Designer Software (Arrayit Inc . ), Oli into small droplets , in some instances with each containing 65 gonucleotide Probe Sequence Design Software for Genetic on average less than about 5 , 4 , 3 , 2 , or one target nucleic Analysis (Olympus Optical Co . ) , NetPrimer, and DNAsis acid molecule (polynucleotide ) per droplet , amplifying the from Hitachi Software Engineering . The TM (melting or US 10 , 167, 509 B2 55 56 annealing temperature ) of each primer can be calculated other non - target polynucleotides likely to be in a sample using software programs such as Net Primer ( free web based (e .g ., genomic DNA outside the region occupied by the program at http :/ /www .premierbiosoft .com /netprimer / in - target polynucleotide ) . dex . html) . The annealing temperature of the primers can be A label ( fluorophore, dye ) used on a probe ( e . g . , a Taqman recalculated and increased after any cycle of amplification , 5 probe ) to detect a target nucleic acid sequence or reference including but not limited to about cycle 1 , 2 , 3 , 4 , 5 , about nucleic acid sequence in the methods described herein can cycle 6 to about cycle 10 . about cycle 10 to about cycle 15 . be , e . g . , 6 - carboxyfluorescein (FAM ), tetrachlorofluorescein ( TET ), 4 ,7 , 2 -trichloro - 7' - phenyl- 6 -carboxyfluorescein about cycle 15 to about cycle 20 , about cycle 20 to about (VIC ) , HEX , Cy3 , Cy 3 . 5 , Cy 5 , Cy 5 . 5 , Cy 7 , tetrameth cycle 25 , about cycle 25 to about cycle 30 , about cycle 30 10 ylrhodamine , ROX , and JOE. The label can be an Alexa to about cycle 35, or about cycle 35 to about cycle 40 . After 10 Fluor dye , e . g ., Alexa Fluor 350 , 405 , 430 , 488 , 532, 546 , the initial cycles of amplification , the 5 ' half of the primers 555 , 568 , 594 , 633 , 647 , 660 , 680 , 700 , and 750 . The label can be incorporated into the products from each loci of can be Cascade Blue, Marina Blue , Oregon Green 500 , interest; thus the TM can be recalculated based on both the Oregon Green 514 , Oregon Green 488 , Oregon Green sequences of the 5 ' half and the 3 ' half of each primer. 15 488 -X , Pacific Blue , Rhodamine Green , Rhodol Green , The annealing temperature of the primers can be recal- Rhodamine Green - X , Rhodamine Red - X , and Texas Red - X . culated and increased after any cycle of amplification , The label can be at the 5 ' end of a probe . 3 ' end of the probe . including but not limited to about cycle 1 , 2 , 3 , 4 , 5 , about at both the 5 ' and 3 ' end of a probe , or internal to the probe . cycle 6 to about cycle 10 , about cycle 10 to about cycle 15 , A unique label can be used to detect each different locus in about cycle 15 to about cycle 20 , about cycle 20 to about 20 an experiment. cycle 25 , about cycle 25 to about cycle 30 , about cycle 30 A probe, e. g. , a Taqman probe , can comprise a quencher , to about 35 , or about cycle 35 to about cycle 40 . After the e .g . , a 3 ' quencher . The 3 ' quencher can be , e . g ., TAMARA , initial cycles of amplification , the 5 ' half of the primers can DABCYL , BHQ - 1, BHQ - 2, or BHQ - 3 . In some cases , a be incorporated into the products from each loci of interest, quencher used in the methods provided herein is a black hole thus the TM can be recalculated based on both the sequences 25 quencher (BHQ ) . In some cases , the quencher is a minor of the 5 ' half and the 3' half of each primer . groove binder (MGB ) . In some cases, the quencher is a DNA Polymerase fluorescent quencher . In other cases, the quencher is a Any DNA polymerase that catalyzes primer extension can non - fluorescent quencher (NFQ ) . be used including but not limited to E . coli DNA poly - probe can be about, or at least about, 5 , 6 , 7 , 8 , 9 , 10 , merase , Klenow fragment of E . coli DNA polymerase 1, T7 30 11 , 12, 13 , 14 , 15, 16 , 17 , 18 , 19 , 20 , 21, 22 , 23, 24 , 25 , 26 , DNA polymerase , T4 DNA polymerase , Taq polymerase, 27 , 28 , 29, 30 , 31 , 32, 33 , 34 , 35, 36 , 37 , 38 , 39, or 40 bases Pfu DNA polymerase , Pfx DNA polymerase , Tth DNA long . A probe can be about 8 to about 40 , about 10 to about polymerase , Vent DNA polymerase , bacteriophage 29 , 40 , about 10 to about 35 , about 10 to about 30 , about 10 to REDTaqTM Genomic DNA polymerase , or sequence . A about 25 , about 10 to about 20 , about 15 to about 40 , about thermostable DNA polymerase can be used . The DNA 35 15 to about 35 , about 15 to about 30 , about 15 to about 25 , polymerase can have 3 ' to 5 ' exonuclease activity . The DNA about 15 to about 20 , about 18 to about 40 , about 18 to about polymerase can possess 5 ' to 3 ' exonuclease activity . The 35, about 18 to about 30 , about 18 to about 25 , or about 18 DNA polymerase can possess both 3 to 5 ' exonuclease to 22 bases . activity and 5 ' to 3 ' exonuclease activity . Reagents and Additives Thermocycling 40 Solution and reagents for performing a PCR reaction can Any number of PCR cycles can be used to amplify the include buffers . The buffered solution can comprise about, DNA , e . g ., about, more than about, or less than about 1 , 2 , more than about, or less than about 1 , 5 , 10 , 15 , 20 , 30 , 50 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11, 12 , 13 , 14 , 15 , 16 , 17 , 18 , 19 , 20 , 100 , or 200 mM Tris . In some cases, the concentration of 21, 22 , 23 , 24 , 25 , 26 , 27 , 28 , 29 , 30 , 31 , 32 , 33 , 34 , 35 , 36 , potassium chloride can be about, more than about, or less 37 , 38 , 39, 40 , 41, 42 , 43 , 44 or 45 cycles . The number of 45 than about 10 , 20 , 30 , 40 , 50 , 60 , 80 , 100 , 200 mM . The amplification cycles can be about 1 to about 45 , about 10 to buffered solution can comprise about 15 mM Tris and 50 about 45 , about 20 to about 45 , about 30 to about 45 , about mM KC1. The nucleotides can comprise deoxyribonucle 35 to about 45 , about 10 to about 40 , about 10 to about 30 , otide triphosphate molecules, including dATP, DCTP, dGTP , about 10 to about 25 , about 10 to about 20 , about 10 to about dTTP , in concentrations of about, more than about, or less 15 , about 20 to about 35 , about 25 to about 35 , about 30 to 50 than about 50 , 100 , 200 , 300 , 400 , 500 , 600 , or 700 UM each . about 35 , or about 35 to about 40 . In some cases , a non - canonical nucleotide , e . g ., DUTP is Thermocycling reactions can be performed on samples added to amplification reaction to a concentration of about, contained in droplets . The droplets can remain intact during more than about , or less than about 50 , 100 , 200 , 300 , 400 , thermocycling . Droplets can remain intact during thermo - 500 , 600 , or 700 , 800 , 900 , or 1000 uM . In some cases , cycling at densities of greater than about 10 , 000 droplets / 55 magnesium chloride (MgCl , ) is added to an amplification mL , 100 ,000 droplets /mL , 200 , 000 droplets/ mL , 300 ,000 reaction at a concentration of about , more than about, or less droplets /mL , 400 ,000 droplets /mL , 500 ,000 droplets /mL , than about 1 .0 , 2 .0 , 3 .0 , 4 .0 , or 5 . 0 mM . The concentration 600 , 000 droplets / mL , 700 , 000 droplets /mL , 800 , 000 drop of MgCl, can be about 3 . 2 mM . lets /mL , 900 ,000 droplets /mL or 1 , 000 ,000 droplets /mL . In A non - specific blocking agent such as BSA or gelatin other cases , two or more droplets may coalesce during 60 from bovine skin can be used , wherein the gelatin or BSA thermocycling . In other cases, greater than 100 or greater is present in a concentration range of approximately 0 . 1 to than 1 ,000 droplets may coalesce during thermocycling . about 0 . 9 % w / v . Other possible blocking agents can include Probes betalactoglobulin , casein , dry milk , or other common block Universal probes can be designed by methods known in ing agents . In some cases , preferred concentrations of BSA the art . In some embodiments , the probe is a random 65 and gelatin are about 0 . 1 % w / v . sequence . The universal probe can be selected to ensure that In some embodiments , an amplification reaction can also it does not bind the target polynucleotide in an assay, or to comprise one or more additives including , but not limited to , US 10 , 167 ,509 B2 57 58 non - specific background /blocking nucleic acids ( e . g ., Computers salmon sperm DNA ) , biopreservatives ( e . g . sodium azide ), Following acquisition of fluorescence detection data , a PCR enhancers ( e . g . Betaine, Trehalose, etc . ) , and inhibitors computer can be used to store and process the data . A ( e . g . RNAse inhibitors ) . The one or more additives can computer -executable logic can be employed to perform such include , e . g . , 2 -pyrrolidone , acetamide , N -methylpyrolidone 5 functions as subtraction of background fluorescence , assign ( NMP ), B -hydroxyethylpyrrolidone (HEP ), propionamide , ment of target and /or reference sequences , and quantifica NN - dimethylacetamide ( DMA ), N -methylformamide tion of the data . A computer can be useful for displaying , storing , retrieving , or calculating diagnostic results from the (MMP ), NN -dimethylformamide (DMF ) , formamide, molecular profiling ; displaying , storing , retrieving , or cal N -methylacetamide (MMA ), dimethyl sulfoxide (DMSO ), 10 culating raw data from genomic or nucleic acid expression polyethylene glycol, betaine , tetramethylammonium chlo analysis ; or displaying , storing , retrieving , or calculating any ride ( TMAC ) , 7 -deaza -2 '- deoxyguanosine , bovine serum sample or patient information useful in the methods of the albumin (BSA ), T4 gene 32 protein , glycerol, or nonionic present invention . detergent ( Triton X - 100 , Tween 20 , Nonidet P - 40 (NP -40 ) Digital Analysis Tween 40 , SDS (e .g ., about 0 . 1 % SDS )) , salmon sperm 15 A digital resreadout assay, e . g . , digital PCR , can be used to DNA , sodium azide, betaine ( N , N , N -trimethylglycine ; [car count targets ( e . g ., target nucleic acid sequences ) by parti boxymethyl ] trimethylammonium ) , formamide , trehalose , tioning the targets in a sample and identifying partitions dithiothreitol (DTT ), betamercaptoethanol (BME ), a plant containing the target. A digital readout is an all or nothing polysaccharide , or an RNase inhibitor. analysis in that it specifies whether a given partition contains In some embodiments , an amplification reaction com - 20 the target of interest , but does not necessarily indicate how prises one or more buffers. The one or more buffers can many copies of the target are in the partition . For example , comprise , e .g ., TAPS , bicine , Tris , Tricine, TAPSO , HEPES , a single polynucleotide containing two targets can be in a TES , MOPS , PIPES , cacodylate , SSC , ADA , ACES , partition , but under normal analysis conditions , the partition cholamine chloride , acetamidoglycine , glycinamide, will only be considered to contain one target. If the targets maleate , phosphate , CABS , piperidine , glycine , citrate , gly - 25 on the same polynucleotide are separated by a large number cylglycine , malate , formate , succinate , acetate , propionate , of base pairs , some of the target nucleic acid sequences may pyridine , piperazine , histidine , bis - tris , ethanolamine , car - be separated by fragmentation during purification of a bonate , MOPSO , imidazole , BIS - TRIS propane , BES , sample — some linked target nucleic acid sequences may not MOBS , triethanolamine ( TEA ), HEPPSO , POPSO , hydra remain physically linked after sample preparation . Digital zine , Trizma (tris ), EPPS , HEPPS , bicine, HEPBS, AMPSO , 30 PCR is described generally , e . g. , at Vogelstein and Kinzler taurine (AES ) , borate , CHES , 2 - amino - 2 - methyl- 1 -propanol ( 1999 ) PNAS 96 :9236 - 9241. Applications of this technology ( AMP ) , ammonium hydroxide , methylamine, or MES . include, e . g ., high - resolution CNV measurements , follow -up In some cases , a non -ionic Ethylene Oxide /Propylene to genome- wide association studies , cytogenetic analysis , Oxide block copolymer is added to an amplification reaction CNV alterations in cancerous tissue , and CNV linkage in a concentration of about 0 . 1 % , 0 . 2 % , 0 . 3 % , 0 . 4 % , 0 . 5 % , 35 analysis . 0 . 6 % , 0 . 7 % , 0 . 8 % , 0 . 9 % , or 1 . 0 % . Common biosurfactants In general, dPCR can involve spatially isolating (or par include non - ionic surfactants such as Pluronic F -68 , Tetron - titioning ) individual polynucleotides from a sample and ics , Zonyl FSN . Pluronic F -68 can be present at a concen carrying out a polymerase chain reaction on each partition . tration of about 0 . 5 % w / v . The partition can be , e . g . , a well ( e . g ., wells of a microwell In some cases magnesium sulfate can be substituted for 40 plate ), capillary , dispersed phase of an emulsion , a chamber magnesium chloride , at similar concentrations . A wide range ( e . g ., a chamber in an array of miniaturized chambers ), a of common , commercial PCR buffers from varied vendors droplet , or a nucleic acid binding surface . The sample can be can be substituted for the buffered solution . distributed so that each partition has about 0 , 1 , or 2 target Detection polynucleotides . Each partition can have , on average , less Fluorescence detection can be achieved using a variety of 45 than 5 , 4 , 3 , 2 , or 1 copies of a target nucleic acid per detector devices equipped with a module to generate exci- partition ( e . g . , droplet ). In some cases , at least 0 , 1 , 2 , 3 , 4 , tation light that can be absorbed by a fluorescer, as well as 5 , 6 , 7 , 8 , 9 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 90 , 100 , 125 , 150 , a module to detect light emitted by the fluorescer. In some 175 , or 200 partitions ( e . g ., droplets ) have zero copies of a cases , samples ( such as droplets ) may be detected in bulk . target nucleic acid . After PCR amplification , the number of For example , samples can be allocated in plastic tubes that 50 partitions with or without a PCR product can be enumerated . are placed in a detector that measures bulk fluorescence The total number of partitions can be about, or more than from plastic tubes . In some cases, one or more samples ( such about, 500, 1000, 2000 , 3000, 4000 , 5000 , 6000 , 7000 , as droplets ) can be partitioned into one or more wells of a 8000 , 9000 , 10 , 000 , 11, 000 , 12 ,000 , 13 , 000 , 14 ,000 , 15 , 000 , plate , such as a 96 -well or 384 -well plate , and fluorescence 16 ,000 , 17 , 000 , 18 ,000 , 19 , 000 , 20 , 000 , 30 , 000 , 40 , 000 , of individual wells may be detected using a fluorescence 55 50 ,000 , 60 , 000 , 70 ,000 , 80 ,000 , 90 , 000 , 100 ,000 , 150 ,000 , plate reader. 200 , 000 , 500 ,000 , 750 , 000 , or 1 ,000 ,000 . The total number In some cases, the detector further comprises handling of partitions can be about 500 to about 1 , 000 ,000 , about 500 capabilities for droplet samples , with individual droplets to about 500 , 000 , about 500 to about 250, 000 , about 500 to entering the detector, undergoing detection , and then exiting about 100 ,000 , about 1000 to about 1 , 000 ,000 , about 1000 the detector . For example , a flow cytometry device can be 60 to about 500 ,000 , about 1000 to about 250 ,000 , about 1000 adapted for use in detecting fluorescence from droplet to about 100 , 000 , about 10 ,000 to about 1 , 000 ,000 , about samples . In some cases , a microfluidic device equipped with 10 , 000 to about 100 , 000 , or about 10 , 000 to about 50 , 000 . pumps to control droplet movement is used to detect fluo In one embodiment, the digital PCR is droplet digital rescence from droplets in single file . In some cases, droplets PCR . In some embodiments of a droplet digital PCR experi are arrayed on a two - dimensional surface and a detector 65 ment, less than about 0 . 00001, 0 . 00005 , 0 . 00010 , 0 . 00050 , moves relative to the surface, detecting fluorescence at each 0 .001 , 0 .005 , 0 .01 , 0 .05 , 0 . 1 , 0 . 5 , 1 , 2 , 2 . 5 , 3 , 3 . 5 , 4 , 4 . 5 , 5 , position containing a single droplet. 6 , 7, 8 , 9 , or 10 copies of target polynucleotide can detected . US 10 , 167, 509 B2 59 60 In some cases, less than about 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11 , from RNA of the subject , among others . The nucleic acid 12, 13, 14 , 15 , 20 , 25, 30 , 35 , 40 , 45 , 50 , 55, 60 , 65 , 70 , 75 , may have any suitable average length . Generally , the aver 80 , 85 , 90 , 95 , 100 , 150 , 200 , 250 , 300 , 350 , 400 , 450 , or 500 age length is substantially greater than the distance on a copies of a target polynucleotide are detected . In some cases, chromosome between the polymorphic loci to be analyzed . the droplets described herein are generated at a rate of 5 With this average length , alleles linked in the subject are also greater than 1 , 2 , 3 , 4 , 5 , 10 , 50 , 100 , 200 , 300 , 400 , 500 , linked frequently in the isolated nucleic acid and thus tend 600 , 700 , 800 , 900 , or 1000 droplets /second . to distribute together to the same volumes when the aqueous Droplet digital PCR (ddPCRTM ) can offer a practical phase is partitioned . In some embodiments , each primer set solution for validating copy number variations identified by may be capable of amplifying at least a pair of distinct next generation sequencers and microarrays . Methods using 10 alleles from a polymorphic locus. ddPCRTM can empower one person to screen many samples . Each volume may be partitioned to contain any suitable e . g ., hundreds of samples , for CNV analysis in a single work average concentration of nucleic acid . Generally , the process shift . In one embodiment , a ddPCRTM workflow is provided of partitioning, in combination with a suitable starting that involves using one or more restriction enzymes to concentration of the nucleic acid in the aqueous phase , separate tandem copies of a target nucleic acid sequence 15 produces volumes that have an average of less than about prior to assembling a duplex TaqMan® assay that includes several genome equivalents of the nucleic acid per volume. reagents to detect both the target nucleic acid sequence ( e . g ., Although the method may be performed with an average of a first gene ) and a single - copy reference nucleic acid more than one genome equivalent per volume ( e . g . , about sequence (e . g . , a second gene ). When ddPCR is used , the two genome equivalents per volume) , the analysis generally reaction mixture can then be partitioned into about, less than 20 becomes more efficient and reliable , with less background , about, or more than about, 500 , 1000 , 2000 , 3000 , 4000 , by limiting the concentration to an average of less than about 5000 , 6000, 7000 , 8000 , 9000 , 10 ,000 , 11 , 000 , 12 , 000 , one genome equivalent per volume. Accordingly , each vol 13 ,000 , 14 , 000 , 15 , 000 , 16 ,000 , 17 , 000 , 18 ,000 , 19 , 000 , ume may contain on average less than about one copy or 20 ,000 , 30 , 000 , 40 ,000 , 50 ,000 , 60 , 000 , 70 , 000 , 80 , 000 , molecule of a target region that includes each polymorphic 90 ,000 , 100 ,000 , 150 ,000 , 200 ,000 , 500 , 000 , 750 ,000 , 25 locus and /or an average of less than about one copy of any 1 ,000 ,000 , 2 ,000 , 000, 3 , 000, 000 , 4 ,000 ,000 , 5 ,000 , 000 , allele sequence of each polymorphic locus. 6 ,000 ,000 , 7 ,000 ,000 , 8 ,000 , 000 , 9 ,000 ,000 , or 10 ,000 ,000 An integrated , rapid , flow -through thermal cycler device nanoliter droplets that can be thermo- cycled to end -point can be used in the methods described herein . See , e. g ., before being analyzed . In some cases, the droplets are International Application No . PCT /US2009 / 005317 , filed greater than one nanoliter; in other cases , the droplets are 30 Sep . 23 , 2009 . In such an integrated device , a capillary is less than one nanoliter ( e . g . , picoliter ) . The number of wound around a cylinder that maintains 2 , 3 , or 4 tempera droplets per reaction can be about 1000 to about 1 ,000 ,000 , ture zones. As droplets flow through the capillary , they are about 1000 to about 750 , 000 , about 1000 to about 500 , 000 , subjected to different temperature zones to achieve thermal about 1000 to about 250 ,000 , about 1000 to about 100 ,000 , cycling . The small volume of each droplet results in an about 1000 to about 50 , 000 , about 1000 to about 30 , 000 , 35 extremely fast temperature transition as the droplet enters about 1000 to about 10 ,000 , about 10 ,000 to about 1 , 000 , each temperature zone . 000 , about 10 , 000 to about 750 ,000 , about 10 ,000 to about A digital PCR device ( e . g . , droplet digital PCR device ) for 500 ,000 , about 10 ,000 to about 250 ,000 , about 10 , 000 to use with the methods, compositions, and kits described about 100 ,000 , about 10 ,000 to about 50 , 000 , or about herein can detect multiple signals (see e . g . U . S . Provisional 10 ,000 to about 30 ,000 . The number of droplets per reaction 40 Patent Application No. 61/ 454 ,373 , filed Mar. 18 , 2011 , can be about 20 , 000 to about 1 , 000 ,000 , about 20 , 000 to herein incorporated by reference in its entirety ) . about 750 , 000 , about 20 ,000 to about 500 ,000 , about 20 ,000 Droplet digital PCR can involve the generation of thou to about 250 ,000 , about 20 ,000 to about 200 ,000 , about sands of discrete , robust microdroplet reactors per second . 20 ,000 to about 50 , 000 , about 50 , 000 to about 100 ,000 , ddPCR can involve standard thermal cycling with installed about 50 , 000 to about 200 , 000 ; or about 50 , 000 to about 45 base instruments , which can make digital data accessible 300 ,000 . immediately to researchers . Rapid interrogation of each An analysis can occur in a two - color reader . The fraction droplet can yield counts of target molecules present in the of positive - counted droplets can enable the absolute con - initial sample . centrations for the target and reference nucleic acid FIG . 16 illustrates an example of a general workflow for sequences ( e . g ., genes ) to be measured . This information can 50 a ddPCR experiment. As shown in FIG . 16 , the process can be used to determine a relative copy number. For example , start by partitioning a sample into multiple partitions ( e . g ., at least 20 , 000 PCR replicates per well can provide the droplets ), followed by thermal cycling the sample in a statistical power to resolve higher -order copy number dif - thermal cycler . The fluorescence of the droplets can then be ferences . This low - cost method can reliably generate copy detected using a reader ( e . g ., an optical reader ) . number measurements with 95 % confidence intervals that 55 Droplet Generation span integer without overlap of adjacent copy number states . The present disclosure includes compositions and meth This technology is capable of determining the linkage of ods using droplet digital PCR . The droplets described herein copy number variants , and it can be used to determine include emulsion compositions ( or mixtures of two or more whether gene copies are on the same or different chromo - immiscible fluids) described in U . S . Pat. No . 7 ,622 , 280 , and somes . 60 droplets generated by devices described in International The volumes may have any suitable size . In some embodi Application No . PCT/ US2009 /005317 , filed Sep . 23 , 2009 . ments , the volumes may have a diameter or characteristic The term emulsion , as used herein , can refer to a mixture of cross - sectional dimension of about 10 to 1000 micrometers . immiscible liquids ( such as oil and water ) . Oil- phase and /or The nucleic acid that is partitioned may have any suitable water- in -oil emulsions allow for the compartmentalization characteristics . The nucleic acid may include genetic mate - 65 of reaction mixtures within aqueous droplets. The emulsions rial of the subject ( e . g . , the subject' s genomic DNA and /or can comprise aqueous droplets within a continuous oil RNA ) , messenger RNA of the subject, and /or cDNA derived phase. The emulsions provided herein can be oil- in -water US 10 , 167 ,509 B2 61 emulsions, wherein the droplets are oil droplets within a about 0 .2 to about 1 .0 , about 0 .3 to about 1. 0 , about 0 . 4 to continuous aqueous phase . The droplets provided herein are about 1 . 0 , or about 0 . 5 to about 1 . 0 uM . The concentration designed to prevent mixing between compartments , with of primers can be about 0 . 5 UM . The aqueous phase can each compartment protecting its contents from evaporation comprise one or more probes for fluorescent detection , at a and coalescing with the contents of other compartments . 5 concentration of about , more than about, or less than about The mixtures or emulsions described herein can be stable 0 .05 , 0 . 1 , 0 . 2 , 0 . 3 , 0 . 4 , 0 . 5 , 0 .6 , 0 . 7 , 0 . 8 , 0 . 9 , 1 . 0 , 1 . 2 , 1 . 4 , or unstable . The emulsions can be relatively stable and have 1 . 6 , 1 . 8 , or 2 . 0 uM . The aqueous phase can comprise one or minimal coalescence . Coalescence occurs when small drop - more probes for fluorescent detection , at a concentration of lets combine to form progressively larger ones . In some about 0 .05 to about 2 .0 , about 0 . 1 to about 2 .0 , about 0 . 25 cases, less than about 0 .00001 % , 0 . 00005 % , 0 .00010 % , 10 to about 2 . 0 , about 0 . 5 to about 2 . 0 , about 0 . 05 to about 1 , 0 .00050 % , 0 . 001 % , 0 .005 % , 0 .01 % , 0 .05 % , 0 . 1 % , 0 . 5 % , about 0 . 1 to about 1, or about 0 . 1 to about 0 . 5 uM . The 1 % , 2 % , 2 . 5 % , 3 % , 3 . 5 % , 4 % , 4 . 5 % , 5 % , 6 % , 7 % , 8 % , 9 % , concentration of probes for fluorescent detection can be or 10 % of droplets generated from a droplet generator about 0 . 25 uM . Amenable ranges for target nucleic acid coalesce with other droplets . The emulsions can also have concentrations in PCR are between about 1 pg and about 500 limited flocculation , a process by which the dispersed phase 15 ng . comes out of suspension in flakes . The oil phase can comprise a fluorinated base oil which Splitting a sample into small reaction volumes as can be additionally stabilized by combination with a fluo described herein can enable the use of reduced amounts of rinated surfactant such as a perfluorinated polyether . In some reagents , thereby lowering the material cost of the analysis . cases , the base oil can be one or more of HFE 7500 , FC -40 , Reducing sample complexity by partitioning also improves 20 FC - 43 , FC - 70 , or another common fluorinated oil . In some the dynamic range of detection because higher -abundance cases , the anionic surfactant is Ammonium Krytox (Krytox molecules are separated from low - abundance molecules in AM ) , the ammonium salt of Krytox FSH , or morpholino different compartments , thereby allowing lower- abundance derivative of Krytox - FSH . Krytox - AS can be present at a molecules greater proportional access to reaction reagents, concentration of about, more than about, or less than about which in turn enhances the detection of lower- abundance 25 0 .1 % , 0 .2 % , 0 . 3 % , 0 .4 % , 0 .5 % , 0. 6 % , 0 . 7 % , 0 .8 % , 0 . 9 % , molecules. 1 . 0 % , 2 . 0 % , 3 . 0 % , or 4 . 0 % w / w . In some preferred embodi Droplets can be generated having an average diameter of ments , the concentration of Krytox - AS is 1 . 8 % . In other about, less than about, or more than about 0 .001 , 0 .01 , 0 .05 , preferred embodiments , the concentration of Krytox -AS is 0 . 1, 1 , 5 , 10 , 20 , 30 , 40 , 50 , 60 , 70 , 80 , 100 , 120 , 130 , 140 , 1 .62 % . Morpholino derivative of Krytox -FSH can be pres 150 , 160 , 180 , 200 , 300 , 400 , or 500 microns . Droplets can 30 ent at a concentration of about 0 . 1 % , 0 . 2 % , 0 . 3 % , 0 . 4 % , have an average diameter of about 0 .001 to about 500 , about 0 . 5 % , 0 .6 % , 0 .7 % , 0 . 8 % , 0 . 9 % , 1. 0 % , 2 .0 % , 3 .0 % , or 4 . 0 % 0 .01 to about 500 , about 0 . 1 to about 500 , about 0 . 1 to about w / w . The concentration of morpholino derivative ofKrytox 100 , about 0 .01 to about 100 , or about 1 to about 100 FSH can be about 1 . 8 % . The concentration of morpholino microns. Microfluidic methods of producing emulsion drop - derivative of Krytox -FSH can be about 1 .62 % . lets using microchannel cross - flow focusing or physical 35 The oil phase can further comprise an additive for tuning agitation are known to produce either monodisperse or the oil properties , such as vapor pressure or viscosity or polydisperse emulsions. The droplets can be monodisperse surface tension . Nonlimiting examples include perfluoro droplets . The droplets can be generated such that the size of octanol and 1H , 1H , 2H , 2H -Perfluorodecanol . 1H , 1H ,2H , said droplets does not vary by more than plus or minus 5 % 2H - Perfluorodecanol can be added to a concentration of of the average size of said droplets . In some cases, the 40 about, more than about, or less than about 0 .05 % , 0 .06 % , droplets are generated such that the size of said droplets does 0 .07 % , 0 . 08 % , 0 .09 % , 1 .00 % , 1 . 25 % , 1 .50 % , 1 .75 % , not vary by more than plus or minus 2 % of the average size 2 . 00 % , 2 . 25 % , 2 . 50 % , 2 . 75 % , or 3 .00 % w / w . 1H , 1H , 2H , of said droplets . A droplet generator can generate a popu - 2H -Perfluorodecanol can be added to a concentration of lation of droplets from a single sample , wherein none of the about 0 . 18 % w /w . droplets vary in size by more than plus or minus about 0 . 1 % , 45 The emulsion can be formulated to produce highly 0 . 5 % , 1 % , 1 . 5 % , 2 % , 2 . 5 % , 3 % , 3 . 5 % , 4 % , 4 . 5 % , 5 % , 5 . 5 % , monodisperse droplets having a liquid - like interfacial film 6 % , 6 . 5 % , 7 % , 7 . 5 % , 8 % , 8 . 5 % , 9 % , 9 . 5 % , or 10 % of the that can be converted by heating into microcapsules having average size of the total population of droplets . a solid - like interfacial film ; such microcapsules can behave Higher mechanical stability can be useful for microfluidic as bioreactors able to retain their contents through a reaction manipulations and higher- shear fluidic processing ( e . g . , in 50 process such as PCR amplification . The conversion to microfluidic capillaries or through 90 degree turns, such as microcapsule form can occur upon heating . For example , valves , in a fluidic path ). Pre- and post -thermally treated such conversion can occur at a temperature of greater than droplets or capsules can be mechanically stable to standard about 50 , 60 , 70 , 80 , 90 , or 95 degrees Celsius. In some cases pipet manipulations and centrifugation . this heating occurs using a thermocycler . During the heating A droplet can be formed by flowing an oil phase through 55 process, a fluid ormineral oil overlay can be used to prevent an aqueous sample. The aqueous phase can comprise a evaporation . Excess continuous phase oil may or may not be buffered solution and reagents for performing a PCR reac - removed prior to heating . The biocompatible capsules can be tion , including nucleotides , primers , probe ( s ) for fluorescent resistant to coalescence and /or flocculation across a wide detection , template nucleic acids, DNA polymerase enzyme, range of thermal and mechanical processing . and optionally , reverse transcriptase enzyme. 60 Following conversion , the capsules can be stored at about, The aqueous phase can comprise one or more buffers more than about, or less than about 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 15 , and / or additives described herein . 20 , 25 , 30 , 35 , or 40 degrees Celsius. These capsules can be Primers for amplification within the aqueous phase can useful in biomedical applications, such as stable , digitized have a concentration of about, more than about , or less than encapsulation of macromolecules , particularly aqueous bio about 0 .05 , 0 . 1 , 0 . 2 , 0 . 3 , 0 . 4 , 0 . 5 , 0 . 6 , 0 . 7 , 0 . 8 , 0 . 9 , 1 . 0 , 1 . 2 , 65 logical fluids containing a mix of nucleic acids or protein , or 1 . 5 , 1 . 7 , or 2 . 0 uM . Primer concentration within the aqueous both together ; drug and vaccine delivery ; biomolecular phase can be about 0 .05 to about 2 , about 0 . 1 to about 1. 0 , libraries ; clinical imaging applications, and others . US 10 , 167 ,509 B2 63 64 The microcapsules can contain one or more polynucle applications can be used to diagnose , predict, determine , or otides and may resist coalescence , particularly at high tem assess the nucleic acids in an embryo produced by in vitro peratures . Accordingly, PCR amplification reactions can fertilization or other assisted reproductive technology . Fur occur at a very high density ( e . g . , number of reactions per thermore , the methods provided herein can be used to unit volume ) . In some cases , greater than about 100 , 000 , 5 provide information to an expectant parent ( e . g . , a pregnant 500 , 000 , 1 , 000 ,000 , 1 ,500 ,000 , 2 , 000 ,000 , 2 ,500 , 000 , 5 , 000 , woman ) in order to assess CNV or genetic phasing within 000 , or 10 , 000 ,000 separate reactions can occur per ml. In the genome of a developing fetus. In other cases, the some cases, the reactions occur in a single well , e . g . , a well methods provided herein can be used to help counsel of a microtiter plate , without inter- mixing between reaction patients as to possible genetic attributes of future offspring . volumes . The microcapsules can also contain other compo - 10 In some cases , the methods can be used in connection with nents necessary to enable a PCR reaction to occur, e . g ., an Assisted Reproductive Technology . For example , the primers , probes, dNTPs, DNA or RNA polymerases, etc . information can be used to assess CNV or genetic phasing These capsules exhibit resistance to coalescence and floc - in a sample taken from an embryo produced by in vitro culation across a wide range of thermal and mechanical fertilization . processing . 15 One or more CNVs can be found in a cancer cell . For In one embodiment , droplet generation can be improved example , EGFR copy number can be increased in non -small after the size of DNA is reduced by, e . g . , digestion , heat cell lung cancer . CNVs can be associated with efficacy of a treatment, or shearing . therapy . For example , increased HER2 gene copy number FIG . 17 displays several images of droplets showing a ) can enhance the response to gefitinib therapy in advanced droplet formation as a droplet is pinched by inflow of oil 20 non - small cell lung cancer. See Cappuzzo F . et al . ( 2005 ) J . from the sides and b ) stretching /necking down as the droplet Clin . Oncol. 23 : 5007 - 5018. High EGFR gene copy number pulls away from the bulk fluid can predict for increased sensitivity to lapatinib and capecit FIG . 18 shows the effect of increasing DNA load . FIG . 18 abine . See Fabi et al. ( 2010 ) J . Clin . Oncol. 28 : 15s ( 2010 plots maximum extension versus flow rate . Extension is ASCO Annual Meeting ) . High EGFR gene copy number is measured from the center of the cross to the farthest extent 25 associated with increased sensitivity to cetuximab and pani of the droplet just as it breaks off Some droplet extension is tumumab . tolerable , but if it becomes excessive, a long " thread ” is In one embodiment, a method is provided comprising drawn that connects the droplet to the bulk fluid . As the determining number of copies of a target sequence using a droplet breaks off , this thread may collapse to microdroplets, method described herein , and designing a therapy based on leading to undesirable polydispersity . In extreme cases, the 30 said determination . In one embodiment, the target is EGFR , droplet does not break off ; instead the aqueous phase flows and the therapy comprises administration of cetuximab , as a continuous phase down the center of the channel, while panitumumab , lapatinib , and /or capecitabine . In another the oil flows along the channel walls, and no droplets are embodiment, the target is ERBB2, and the therapy com formed . prises trastuzumab (Herceptin ) . One way to decrease extension is to decrease flow rate . 35 Copy number variation can contribute to genetic variation Decreasing flow rate can have the undesirable side effects of among humans. See e . g . Shebat J. et al. ( 2004 ) Science 305 : lower throughput and also increased droplet size. The purple 525 - 528 . ( B ), teal ( E ) and green ( A ) curves have zero DNA . These Diseases associated with copy number variations can samples can tolerate high flow rates without substantially include , for example , DiGeorge /velocardiofacial syndrome increasing their extension into the channel. 40 (22q11 . 2 deletion ) , Prader - Willi syndrome ( 15q11 - q13 dele The blue ( D ) , orange ( F ) and red ( C ) curves have higher tion ), Williams- Beuren syndrome ( 7q11 . 23 deletion ) , DNA loads. For these conditions, higher flow rates cause Miller - Dieker syndrome (MDLS ) ( 17p13. 3 microdeletion ), droplet extension into the channel . Low flow rates can be Smith -Magenis syndrome (SMS ) ( 17p11 . 2 microdeletion ) , used to avoid excessive droplet extension . FIG . 19 shows Neurofibromatosis Type 1 (NF1 ) ( 17q11. 2 microdeletion ) , undigested samples 1 - 10 and digested samples 11 - 20 in an 45 Phelan -McErmid Syndrome ( 22q13 deletion ) , Rett syn experiment to investigate droplet properties . DNA load is drome ( loss -of - function mutations in MECp2 on chromo shown in the right- most column ; pressure ( roughly propor - some Xq28 ) , Merzbacher disease (CNV of PLP1 ) , spinal tional to flow rate ) is shown in the 2nd row . The table is color muscular atrophy (SMA ) (homozygous absence of telom and letter coded : J (RED ) indicates jetting , E ( YELLOW ) erec SMN1 on chromosome 5q13 ) , Potocki- Lupski Syn indicates extension , and N (GREEN ) indicates normal ( no 50 drome (PTLS , duplication of chromosome 17p . 11 . 2 ) . Addi jetting or extension ) droplet generation . As can be seen , tional copies of the PMP22 gene can be associated with digestion (with restriction enzymes ) resulted in improved Charcot- Marie - Tooth neuropathy type IA (CMT1A ) and droplet generation , even at high DNA loads and high flow hereditary neuropathy with liability to pressure palsies rates . (HNPP ) . The methods of detecting CNVs described herein Applications 55 can be used to diagnose CNV disorders described herein and The methods described herein can be used for diagnosing in publications incorporated by reference. The disease can or prognosing a disorder or disease . be a disease described in Lupski J . (2007 ) Nature Genetics The methods and compositions provided herein can be 39: S43 - S47 . useful for both human and non - human subjects. The appli - Aneuploides , e . g ., fetal aneuploidies , can include, e . g . , cations of the methods and compositions provided herein are 60 trisomy 13 , trisomy 18 , trisomy 21 (Down Syndrome) , numerous , e . g . , high - resolution CNV measurements , follow - Klinefelter Syndrome (XXY ) , monosomy of one or more up to genome -wide association studies, cytogenetic analysis , chromosomes ( X chromosome monosomy, Turner ' s syn CNV alterations in cancerous tissue , CNV linkage analysis , drome) , trisomy X , trisomy of one or more chromosomes, as well as haplotype analysis . tetrasomy or pentasomy of one or more chromosomes ( e . g . , The applications provided herein include applications for 65 XXXX , XXYY , XXXY, XYYY , XXXXX , XXXXY, diagnosing , predicting , determining or assessing the genetic XXXYY , XYYYY and XXYYY ) , triploidy ( three of every characteristics of a fetus or embryo . In some cases , the chromosome, e . g . 69 chromosomes in humans) , tetraploidy