US 20140235474A1
(19) United States
(12) Patent Application Publication, Tang et al.
(10) Pub. No.: US 2014/0235474 A1
(43) Pub. Date: Aug. 21, 2014

(54) METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF A GENETIC VARIATION
(75) Inventors: Lin Tang, San Diego, CA (US); Cosmin Deciu, San Diego, CA (US)
(73) Assignee: Sequenom, Inc., San Diego, CA (US)
(21) Appl. No.: 14/127,912
(22) PCT Filed: Jun. 20, 2012
(86) PCT No.: PCT/US2012/043388
     § 371 (c)(1), (2), (4) Date: Apr. 7, 2014
Related U.S. Application Data
(60) Provisional application No. 61/500,842, filed on Jun. 24, 2011.
Publication Classification
(51) Int. Cl.: G06F 19/18 (2006.01); C12Q 1/68 (2006.01)
(52) U.S. Cl.: CPC: G06F 19/18 (2013.01); C12Q 1/6874 (2013.01); USPC: 506/9; 702/19

(57) ABSTRACT
Provided in part herein are methods and processes that can be used for non-invasive assessment of a genetic variation which can lead to diagnosis of a particular medical condition or conditions. Such methods and processes can, for example, identify dissimilarities or similarities for one or more features between a subject data set and a reference data set, generate a multidimensional matrix, reduce the matrix into a representation and classify the representation into one or more groups. Methods and processes described herein are applicable to data in biotechnology and other fields.

[Front-page figure. Recoverable labels: LM; dissimilarity matrix (n x n); classification. Recoverable axis names: dim1; GC content.]

Patent Application Publication Aug. 21, 2014 Sheet 1 of 15 US 2014/0235474 A1

[FIG. 1a. Panels labeled seq.count.raw, seq.count, and lib.conc.]

[FIG. 1b. x-axis: GC content.]


[FIG. 4. x-axis: dim1. Legend: Reference: T21; Reference: Normal; Test.]

[FIG. 5a. Panels: chr13, chr18, chr21. x-axis: dim1.]

[FIG. 5b. ROC: LM-MDS. x-axis: 1-Specificity. AUC T21: 0.9944; AUC T13: 0.9597; AUC T18: 0.9451.]


[FIG. 7. Z-score based method: FC14, 15, 17, 18, 30 and 34 (chr21). x-axis: log sequence count. Legend: uniplex 14; uniplex 15; uniplex 17; uniplex 18; 4-plex FC30; 4-plex FC34.]

[FIG. 9a. ROC: z-score. x-axis: 1-Specificity. AUC (T21): 0.9913; AUC (T13): 0.7533; AUC (T18): 0.933.]

[FIG. 9b. ROC: z-score (GC corrected). x-axis: 1-Specificity. AUC (T21): 0.9948; AUC (T13): 0.9803; AUC (T18): 0.9577.]

[FIG. 10. ROC: LM-based Gender Prediction. y-axis: Sensitivity; x-axis: 1-Specificity. AUC: 0.987.]

[FIG. 11. Axis label: fetal fraction estimate from sequencing.]

US 2014/0235474 A1 Aug. 21, 2014

METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF A GENETIC VARIATION

RELATED PATENT APPLICATIONS

[0001] This patent application is a national stage of international patent application number PCT/US2012/043388, filed Jun. 20, 2012, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF A GENETIC VARIATION, naming Lin TANG and Cosmin DECIU as inventors, and designated by Attorney Docket No. SEQ-6032-PC, which claims the benefit of U.S. Provisional Patent Application No. 61/500,842, filed Jun. 24, 2011, entitled METHODS AND PROCESSES FOR NON-INVASIVE ASSESSMENT OF A GENETIC VARIATION, naming Lin TANG and Cosmin DECIU as inventors, and designated by Attorney Docket No. SEQ-6032-PV. The entire content of the foregoing patent applications is incorporated herein by reference, including all text, tables, and drawings.

FIELD

[0002] Technology provided herein relates in part to methods and processes for non-invasive assessment of a genetic variation.

BACKGROUND

[0003] Genetic information of all living organisms (e.g., animals, plants and microorganisms) and other forms of replicating genetic information like viruses is encoded in deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). Genetic information is the succession of nucleotides or modifications thereof representing the primary structure of real or hypothetical DNA/RNA molecule or strands with the capacity to carry information. In humans, the complete genome contains about 30,000 genes located on 24 chromosomes (The Human Genome, T. Strachan, BIOS Scientific Publishers, 1992). Each gene codes for a specific protein, which after its expression via transcription and translation, fulfills a specific biochemical function within a living cell.

[0004] Identifying genetic variations or variances can lead to diagnosis of particular medical conditions including fetal aneuploidy, fetal gender determination, fetal DNA/RNA/fraction estimation, pathogen infection and other conditions such as cancer and other diseases, for example. Personalized therapy regimens based on a patient’s identified genetic variance can result in life saving medical interventions.

[0005] Many medical conditions caused by genetic variations are known and include hemophilia, thalassemia, Duchenne Muscular Dystrophy (DMD), Huntington’s Disease (HD), Alzheimer’s Disease and Cystic Fibrosis (CF) (Human Genome Mutations, D. N. Cooper and M. Krawczak, BIOS Publishers, 1993). Genetic diseases such as these can result from a single addition, substitution, or deletion of a single nucleotide in the deoxyribonucleic acid (DNA) forming the particular gene. Certain birth defects are the result of chromosomal abnormalities such as Trisomy 21 (Down’s Syndrome), Trisomy 13 (Patau Syndrome), Trisomy 18 (Edward’s Syndrome), Monosomy X (Turner’s Syndrome) and other sex chromosome aneuploidies such as Klinefelter’s Syndrome (XXY). Medical conditions such as fetal aneuploidy, fetal gender prediction, and fetal DNA/RNA (or fetal fraction) estimation can be determined by analysis of fetal locus-independent markers and fetal specific markers for placental mRNA, DNA, or DNA methylation patterns. Further, some DNA sequences may predispose an individual to any of a number of diseases such as diabetes, arteriosclerosis, obesity, various autoimmune diseases and cancer (e.g., colorectal, breast, ovarian, lung).

SUMMARY

[0006] The invention provides in part a method for non-invasive assessment of a genetic variation comprising: (a) identifying one or more dissimilarities for a feature between a subject data set and a reference data set by a statistical analysis wherein the subject data set comprises genomic nucleic acid sequence information of a sample from a subject and the reference data set comprises genomic nucleic acid sequence information of a biological specimen from one or more reference persons; (b) generating a multidimensional matrix from the dissimilarities; (c) reducing the multidimensional matrix into a reduced data set representation of the matrix; (d) classifying into one or more groups the reduced data set representation by one or more linear modeling analysis algorithms thereby providing a classification; and (e) determining the presence or absence of a genetic variation for the sample based on the classification. In some embodiments the method further comprises obtaining genomic nucleic acid sequence information of a sample from a subject and obtaining genomic nucleic acid sequence information of a biological specimen from one or more reference persons. In certain embodiments, the method further comprises receiving the subject data set and the reference data set. In some embodiments, the genetic variation is a fetal aneuploidy. In certain embodiments, the genetic variation is a fetal gender. In other embodiments, the genetic variation is a fetal fraction estimation. In certain embodiments, the subject is a pregnant female and the reference persons are pregnant females. In some embodiments, the reference persons do not include the subject. In other embodiments, the reference data set comprises genomic nucleic acid sequence information of a biological specimen from one or more reference persons and the subject. In certain embodiments, the sample is blood serum or blood plasma from the subject. In some embodiments, the genomic nucleic acid sequence information is from a multiplex sequence analysis. In other embodiments, the method comprises reiterating identification of the one or more dissimilarities in a pairwise analysis between each pair in the subject data set and the reference data set. In certain embodiments, the subject data set and the reference data set comprise a fluorescent signal or sequence tag information. In other embodiments, the method comprises quantifying the signal or tag using a technique selected from the group consisting of flow cytometry, quantitative polymerase chain reaction (qPCR), gel electrophoresis, gene-chip analysis, microarray, mass spectrometry, cytofluorimetric analysis, fluorescence microscopy, confocal laser scanning microscopy, laser scanning cytometry, affinity chromatography, manual batch mode separation, electric field suspension, sequencing, and combination thereof. In certain embodiments, the statistical analysis is selected from the group consisting of decision tree, counternull, multiple comparisons, omnibus test, Behrens-Fisher problem, bootstrapping, Fisher’s method for combining independent tests of significance, null hypothesis, type I error, type II error, exact test, one-sample Z test, two-sample Z test, paired Z-test, one-sample t-test, paired t-test, two-sample pooled t-test having equal variances, two-sample unpooled t-test having unequal variances, one-proportion Z-test, two-proportion Z-test pooled, two-proportion Z-test unpooled, one-sample chi-square test, two-sample F test for equality of variances, confidence interval, credible interval, significance, meta analysis, simple linear regression, robust linear regression, and combination thereof. In some embodiments, the method for reducing the multidimensional matrix is selected from the group consisting of metric and non-metric multi-dimensional scaling, Sammon’s non-linear mapping, principal component analysis and combinations thereof. In other embodiments, the linear modeling analysis algorithm is selected from the group consisting of analysis of variance, Anscombe’s quartet, cross-sectional regression, curve fitting, empirical Bayes methods, M-estimator, nonlinear regression, linear regression, multivariate adaptive regression splines, lack-of-fit sum of squares, truncated regression model, censored regression model, simple linear regression, segmented linear regression, decision tree, k-nearest neighbor, support vector machine, neural network, linear discriminant analysis, quadratic discriminant analysis, and combinations thereof. In certain embodiments, the reference data set comprises features from pregnant females who are between 25 years old and 30 years old. In some embodiments, the reference data set comprises features from pregnant females who are between 30 years old and 35 years old. In other embodiments, the reference data set comprises features from pregnant females who are between 35 years old and 40 years old. In certain embodiments, the reference data set comprises features from pregnant females who are in the first trimester of pregnancy. In some embodiments, the reference data set comprises features from pregnant females who are in the second trimester of pregnancy. In other embodiments, the subject data set comprises features from pregnant females who are in the first trimester of pregnancy. In certain embodiments, the reference data set comprises features chosen from one or more of a physiological condition, genetic or proteomic profile, genetic or proteomic characteristic, response to previous treatment, weight, height, medical diagnosis, familial background, results of one or more medical tests, ethnic background, body mass index, age, presence or absence of at least one disease or condition, species, ethnicity, race, allergies, gender, presence or absence of at least one biological, chemical, or therapeutic agent in the subject, pregnancy status, lactation status, medical history, blood condition, and combinations thereof. In some embodiments, a statistical sensitivity and a statistical specificity is determined from the classified reduced data set representation. In other embodiments, the statistical sensitivity and statistical specificity are independently between 90% and 100%.

[0007] The invention also in part provides a method for non-invasive assessment of a genetic variation comprising: (a) obtaining a subject data set comprising genomic nucleic acid sequence information of a sample from a subject; (b) obtaining a reference data set comprising genomic nucleic acid sequence information of a biological specimen from one or more reference persons; (c) identifying one or more dissimilarities for a feature between the subject data set and the reference data set by a statistical analysis; (d) generating a multidimensional matrix from the dissimilarities; (e) reducing the multidimensional matrix and transforming the matrix into a reduced data set representation of the matrix; (f) classifying into one or more groups the reduced data set representation by one or more linear modeling analysis algorithms thereby providing a classification; and (g) determining the presence or absence of a genetic variation for the sample based on the classification.

[0008] The invention also in part provides a method for non-invasive assessment of fetal gender or fetal fraction estimation comprising: (a) receiving a subject data set comprising genomic nucleic acid sequence information of a biological specimen sample from a subject; (b) receiving a reference data set comprising genomic nucleic acid sequence information of a biological specimen from one or more reference persons; (b) classifying into one or more groups the subject data set for a feature by one or more linear modeling analysis algorithms based on the reference data set thereby providing a classification; and (c) determining fetal aneuploidy or fetal gender for the sample based on the classification. In certain embodiments, the method further comprises performing linear modeling analysis in a pairwise analysis between each pair in the subject data set and the reference data set.

[0009] The invention also in part provides an apparatus that identifies the presence or absence of a genetic variation comprising a programmable processor that implements a data set dimensionality reducer wherein the reducer implements a method comprising: (a) identifying one or more dissimilarities for a feature between a subject data set and a reference data set by a statistical analysis wherein the subject data set comprises genomic nucleic acid sequence information of a sample from a subject and the reference data set comprises genomic nucleic acid sequence information of a biological specimen from one or more reference persons; (b) generating a multidimensional matrix from the dissimilarities; (c) reducing the multidimensional matrix into a reduced data set representation of the matrix; (d) classifying into one or more groups the reduced data set representation by one or more linear modeling analysis algorithms thereby providing a classification; and (e) determining the presence or absence of a genetic variation for the sample based on the classification.

[0010] The invention also in part provides a computer program product, comprising a computer usable medium having a computer readable program code embodied therein, the computer readable program code adapted to be executed to implement a method for generating a reduced data set representation, the method comprising: (a) identifying one or more dissimilarities for a feature between a subject data set and a reference data set by a statistical analysis wherein the subject data set comprises genomic nucleic acid sequence information of a sample from a subject and the reference data set comprises genomic nucleic acid sequence information of a biological specimen from one or more reference persons; (b) generating a multidimensional matrix from the dissimilarities; (c) reducing the multidimensional matrix into a reduced data set representation of the matrix; (d) classifying into one or more groups the reduced data set representation by one or more linear modeling analysis algorithms thereby providing a classification; and (e) determining the presence or absence of a genetic variation for the sample based on the classification.

[0011] Certain embodiments are described further in the following description, examples, claims and drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

[0012] The drawings illustrate embodiments of the technology and are not limiting. For clarity and ease of illustration, the drawings are not made to scale and, in some instances, various aspects may be shown exaggerated or enlarged to facilitate an understanding of particular embodiments.

[0013] FIG. 1a shows the relationship among raw log sequence count, filtered log sequence count and library concentration. FIG. 1b shows that the log sequence count ratio is highly correlated with GC content.

[0014] FIG. 2 shows a diagram of the LM-MDS algorithm.

[0015] FIGS. 3a and 3b show that LM-MDS transformed samples from different flow cells into the same space for classification.

[0016] FIG. 4 shows an LM-MDS classification plot for the in-house dataset.

[0017] FIGS. 5a and 5b show LM-MDS classification for the Hong Kong dataset.

[0018] FIGS. 6a and 6b show that detection of trisomy 21 samples with pair-wise t-tests introduces false positives.

[0019] FIG. 7 shows a Z-score based method in detecting trisomy 21 samples.

[0020] FIG. 8 shows LM-MDS on 4-plex flow cells 30 and 34.

[0021] FIGS. 9a and 9b show ROC (Receiver Operating Characteristic) plots for classification with the Z-score based method.

[0022] FIG. 10 shows an ROC plot of LM-based gender prediction.

[0023] FIG. 11 shows fetal fraction estimate from sequencing.

DETAILED DESCRIPTION

[0024] In the following detailed description, reference is made to the accompanying drawings, which form a part hereof. In the drawings, similar symbols typically identify similar components, unless context dictates otherwise. Illustrative embodiments described in the detailed description, drawings, and claims do not limit the technology. Some embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented herein. It will be readily understood that aspects of the present disclosure, as generally described herein, and illustrated in the drawings, can be arranged, substituted, combined, separated, and designed in a wide variety of different configurations, all of which are explicitly contemplated herein.

[0025] Genetic Variations/Medical Conditions

[0026] Technology described herein can be used to identify the presence or absence of a genetic variation that is, or is associated with, a medical condition(s). Non-limiting examples of medical conditions are provided hereafter.

[0027] Fetal Gender

[0028] In some embodiments, the prediction of a fetal gender is determined. Gender determination generally is based on sex chromosomes. In humans, there are two sex chromosomes, the X and Y chromosomes. Individuals with XX are female and XY are male. Other variations may include XO, XYY, XXX, and XXY.

Chromosome Abnormalities

[0029] In some embodiments, the presence or absence of a fetal chromosome abnormality is determined. Chromosome abnormalities include, without limitation, a gain or loss of an entire chromosome or a region of a chromosome comprising one or more genes. Chromosome abnormalities include monosomies, trisomies, polysomies, loss of heterozygosity, deletions and/or duplications of one or more nucleotide sequences (e.g., one or more genes), including deletions and duplications caused by unbalanced translocations. The terms “aneuploidy” and “aneuploid” as used herein refer to an abnormal number of chromosomes in cells of an organism. As different organisms have widely varying chromosome complements, the term “aneuploidy” does not refer to a particular number of chromosomes, but rather to the situation in which the chromosome content within a given cell or cells of an organism is abnormal.

[0030] The term “monosomy” as used herein refers to lack of one chromosome of the normal complement. Partial monosomy can occur in unbalanced translocations or deletions, in which only a portion of the chromosome is present in a single copy (see deletion (genetics)). Monosomy of sex chromosomes (45, X) causes Turner syndrome.

[0031] The term “disomy” refers to the presence of two copies of a chromosome. For organisms such as humans that have two copies of each chromosome (those that are diploid or “euploid”), it is the normal condition. For organisms that normally have three or more copies of each chromosome (those that are triploid or above), disomy is an aneuploid chromosome complement. In uniparental disomy, both copies of a chromosome come from the same parent (with no contribution from the other parent).

[0032] The term “trisomy” refers to the presence of three copies, instead of the normal two, of a particular chromosome. The presence of an extra chromosome 21, which is found in Down syndrome, is called trisomy 21. Trisomy 18 and Trisomy 13 are the two other autosomal trisomies recognized in live-born humans. Trisomy of sex chromosomes can be seen in females (47, XXX) or males (47, XXY which is found in Klinefelter’s syndrome; or 47, XYY).

[0033] The terms “tetrasomy” and “pentasomy” as used herein refer to the presence of four or five copies of a chromosome, respectively. Although rarely seen with autosomes, sex chromosome tetrasomy and pentasomy have been reported in humans, including XXXX, XXXY, XXYY, XYYY, XXXXX, XXXXY, XXXYY, XXYYY and XYYYY.

[0034] Chromosome abnormalities can be caused by a variety of mechanisms. Mechanisms include, but are not limited to (i) nondisjunction occurring as the result of a weakened mitotic checkpoint, (ii) inactive mitotic checkpoints causing non-disjunction at multiple chromosomes, (iii) merotelic attachment occurring when one kinetochore is attached to both mitotic spindle poles, (iv) a multipolar spindle forming when more than two spindle poles form, (v) a monopolar spindle forming when only a single spindle pole forms, and (vi) a tetraploid intermediate occurring as an end result of the monopolar spindle mechanism.

[0035] The terms “partial monosomy” and “partial trisomy” as used herein refer to an imbalance of genetic material caused by loss or gain of part of a chromosome. A partial monosomy or partial trisomy can result from an unbalanced translocation, where an individual carries a derivative chromosome formed through the breakage and fusion of two different chromosomes. In this situation, the individual would have three copies of part of one chromosome (two normal copies and the portion that exists on the derivative chromosome) and only one copy of part of the other chromosome involved in the derivative chromosome.

[0036] The term “mosaicism” as used herein refers to aneuploidy in some cells, but not all cells, of an organism. Certain chromosome abnormalities can exist as mosaic and non-mosaic chromosome abnormalities. For example, certain trisomy 21 individuals have mosaic Down syndrome and some have non-mosaic Down syndrome. Different mechanisms can lead to mosaicism. For example, (i) an initial zygote may have three 21st chromosomes, which normally would result in simple trisomy 21, but during the course of cell division one or more cell lines lost one of the 21st chromosomes; and (ii) an initial zygote may have two 21st chromosomes, but during the course of cell division one of the 21st chromosomes was duplicated. Somatic mosaicism most likely occurs through mechanisms distinct from those typically associated with genetic syndromes involving complete or mosaic aneuploidy. Somatic mosaicism has been identified in certain types of cancers and in neurons, for example. In certain instances, trisomy 12 has been identified in chronic lymphocytic leukemia (CLL) and trisomy 8 has been identified in acute myeloid leukemia (AML). Also, genetic syndromes in which an individual is predisposed to breakage of chromosomes (chromosome instability syndromes) are frequently associated with increased risk for various types of cancer, thus highlighting the role of somatic aneuploidy in carcinogenesis. Methods and protocols described herein can identify presence or absence of non-mosaic and mosaic chromosome abnormalities.

[0037] Following is a non-limiting list of chromosome abnormalities that can be potentially identified by methods described herein.

Chromosome | Abnormality | Disease Association
X | XO | Turner’s Syndrome
Y | XXY | Klinefelter’s syndrome
Y | XYY | Double Y syndrome
Y | XXX | Trisomy X syndrome
Y | XXXX | Four X syndrome
Y | Xp21 deletion | Duchenne’s/Becker syndrome, congenital adrenal hypoplasia, chronic granulomatous disease
Y | Xp22 deletion | steroid sulfatase deficiency
Y | Xq26 deletion | X-linked lymphoproliferative disease
1 | 1p (somatic) monosomy trisomy | neuroblastoma
2 | monosomy trisomy 2q | growth retardation, developmental and mental delay, and minor physical abnormalities
3 | monosomy trisomy (somatic) | Non-Hodgkin’s lymphoma
4 | monosomy trisomy (somatic) | Acute non lymphocytic leukemia (ANLL)
5 | 5p | Cri du chat; Lejeune syndrome
5 | 5q (somatic) monosomy trisomy | myelodysplastic syndrome
6 | monosomy trisomy (somatic) | clear-cell sarcoma
7 | 7q11.23 deletion | William’s syndrome
7 | monosomy trisomy | monosomy 7 syndrome of childhood; somatic: renal cortical adenomas; myelodysplastic syndrome
8 | 8q24.1 deletion | Langer-Giedon syndrome
8 | monosomy trisomy | myelodysplastic syndrome; Warkany syndrome; somatic: chronic myelogenous leukemia
9 | monosomy 9p |
9 | monosomy 9p partial trisomy | Rethore syndrome
9 | trisomy | complete trisomy 9 syndrome; mosaic trisomy 9 syndrome
10 | monosomy trisomy (somatic) | ALL or ANLL
11 | 11p | Aniridia; Wilms tumor
11 | 11q | Jacobson Syndrome
11 | monosomy (somatic) trisomy | myeloid lineages affected (ANLL, MDS)
12 | monosomy trisomy (somatic) | CLL, Juvenile granulosa cell tumor (JGCT)
13 | 13q | 13q-syndrome; Orbeli syndrome
13 | 13q14 deletion | retinoblastoma
13 | monosomy trisomy | Patau’s syndrome
14 | monosomy trisomy (somatic) | myeloid disorders (MDS, ANLL, atypical CML)
15 | 15q11-q13 deletion monosomy | Prader-Willi, Angelman’s syndrome
15 | trisomy (somatic) | myeloid and lymphoid lineages affected (e.g., MDS, ANLL, ALL, CLL)
16 | 16q13.3 deletion | Rubenstein-Taybi
16 | monosomy trisomy (somatic) | papillary renal cell carcinomas (malignant)
17 | 17p- (somatic) | 17p syndrome in myeloid malignancies
17 | 17q11.2 deletion | Smith-Magenis
17 | 17q13.3 | Miller-Dieker
17 | monosomy trisomy (somatic) | renal cortical adenomas