© 2014 Nature America, Inc. All rights reserved.

Technical Reports

An ultrasensitive method for quantitating circulating tumor DNA with broad patient coverage

Aaron M Newman, Scott V Bratman, Jacqueline To, Jacob F Wynne, Neville C W Eclov, Leslie A Modlin, Chih Long Liu, Joel W Neal, Heather A Wakelee, Robert E Merritt, Joseph B Shrager, Billy W Loo Jr, Ash A Alizadeh & Maximilian Diehn

Circulating tumor DNA (ctDNA) is a promising biomarker for noninvasive assessment of cancer burden, but existing ctDNA detection methods have insufficient sensitivity or patient coverage for broad clinical applicability. Here we introduce cancer personalized profiling by deep sequencing (CAPP-Seq), an economical and ultrasensitive method for quantifying ctDNA. We implemented CAPP-Seq for non–small-cell lung cancer (NSCLC) with a design covering multiple classes of somatic alterations that identified mutations in >95% of tumors. We detected ctDNA in 100% of patients with stage II–IV NSCLC and in 50% of patients with stage I, with 96% specificity for mutant allele fractions down to ~0.02%. Levels of ctDNA were highly correlated with tumor volume and distinguished between residual disease and treatment-related imaging changes, and measurement of ctDNA levels allowed for earlier response assessment than radiographic approaches. Finally, we evaluated biopsy-free tumor screening and genotyping with CAPP-Seq. We envision that CAPP-Seq could be routinely applied clinically to detect and monitor diverse malignancies, thus facilitating personalized cancer therapy.

Affiliations: Institute for Stem Cell Biology and Regenerative Medicine, Stanford University, Stanford, California, USA. Division of Oncology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, California, USA. Department of Radiation Oncology, Stanford University, Stanford, California, USA. Division of Thoracic Surgery, Department of Cardiothoracic Surgery, Stanford School of Medicine, Stanford University, Stanford, California, USA. Stanford Cancer Institute, Stanford University, Stanford, California, USA. Division of Hematology, Department of Medicine, Stanford Cancer Institute, Stanford University, Stanford, California, USA. These authors contributed equally to this work. Correspondence should be addressed to M.D. ([email protected]) or A.A.A. ([email protected]).

Received 15 August 2013; accepted 6 November 2013; published online 6 April 2014; doi:10.1038/nm.351

nature medicine advance online publication

Analysis of ctDNA has the potential to revolutionize detection and monitoring of tumors. Noninvasive access to cancer-derived DNA is particularly attractive for solid tumors, which cannot be repeatedly sampled without invasive procedures. In NSCLC, PCR-based assays have been used to detect recurrent point mutations in genes such as KRAS (encoding kirsten rat sarcoma viral oncogene homolog) or EGFR (encoding epidermal growth factor receptor) in plasma DNA, but the majority of patients lack mutations in these genes. Recently, approaches employing massively parallel sequencing have been used to detect ctDNA. However, the methods reported to date have been limited by modest sensitivity, applicability to only a minority of patients, the need for patient-specific optimization and/or cost.

To overcome these limitations, we developed a new strategy for analysis of ctDNA. Our approach, called CAPP-Seq, combines optimized library preparation methods for low DNA input masses with a multiphase approach to design a 'selector' consisting of biotinylated DNA oligonucleotides that target recurrently mutated regions in the cancer of interest. To monitor ctDNA, the selector is applied to tumor DNA to identify a patient's cancer-specific genetic aberrations and then directly to circulating DNA to quantify them (Fig. 1a). Here we demonstrate the technical performance and explore the clinical utility of CAPP-Seq in patients with early- and advanced-stage NSCLC.

RESULTS

Design of a CAPP-Seq selector for NSCLC
For the initial implementation of CAPP-Seq, we focused on NSCLC, although our approach is generalizable to any cancer for which recurrent mutations have been identified. To design a selector for NSCLC (Fig. 1b, Supplementary Table 1 and Online Methods), we began by including exons covering recurrent mutations in potential driver genes from the Catalogue of Somatic Mutations in Cancer (COSMIC) and other sources. Next, using whole-exome sequencing (WES) data from 407 patients with NSCLC profiled by The Cancer Genome Atlas (TCGA), we applied an iterative algorithm to maximize the number of missense mutations per patient while minimizing the selector size (Supplementary Fig. 1 and Supplementary Table 1).

Approximately 8% of NSCLCs harbor rearrangements involving the receptor tyrosine kinase genes ALK (encoding anaplastic lymphoma receptor tyrosine kinase), ROS1 (encoding c-ros oncogene 1 receptor tyrosine kinase) or RET (encoding the RET proto-oncogene). To utilize the low false detection rate inherent in the unique junctional sequences of structural rearrangements, we included the introns and exons spanning recurrent fusion breakpoints in these genes in the final design phase (Fig. 1b). To detect fusions in tumor and plasma DNA, we developed a breakpoint-mapping algorithm optimized for ultradeep coverage data (Supplementary Methods). Application of this algorithm to next-generation sequencing (NGS) data from two NSCLC cell lines known to harbor fusions with previously uncharacterized breakpoints readily identified the breakpoints at nucleotide resolution (Supplementary Fig. 2).
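The iterative selector design can be illustrated with a small sketch. It is not the authors' algorithm: it assumes a simple greedy rule that repeatedly adds the exon with the highest recurrence index (RI, taken here as unique newly covered patients per kb of exon), and the exon names, lengths, patient sets and the `min_ri` cutoff are all hypothetical.

```python
def recurrence_index(patients, length_bp):
    """Unique patients with a mutation in the exon, per kb of exon."""
    return len(patients) / (length_bp / 1000.0)


def greedy_selector(exons, min_ri=1.0):
    """Greedily add exons that cover the most not-yet-covered patients per kb.

    exons: dict mapping exon name -> (length_bp, set of mutated patient ids).
    Returns (chosen exon names in order, covered patients, total selector bp).
    This greedy rule is an assumption made for illustration only.
    """
    covered, chosen, total_bp = set(), [], 0
    remaining = dict(exons)
    while remaining:
        # Score each candidate by the RI of the patients it would newly cover.
        name, (length_bp, patients) = max(
            remaining.items(),
            key=lambda kv: recurrence_index(kv[1][1] - covered, kv[1][0]),
        )
        gain = recurrence_index(patients - covered, length_bp)
        if gain < min_ri:
            break  # no remaining exon adds enough new patients per kb
        chosen.append(name)
        covered |= patients
        total_bp += length_bp
        del remaining[name]
    return chosen, covered, total_bp


# Hypothetical toy input: three exons and four patients.
toy = {
    "exonA": (200, {"p1", "p2", "p3"}),
    "exonB": (1000, {"p3", "p4"}),
    "exonC": (500, {"p1", "p2"}),
}
chosen, covered, total_bp = greedy_selector(toy)
```

In this toy case the short, highly recurrent exon is picked first, and an exon whose patients are already covered is skipped, which is the intuition behind maximizing mutations per patient while keeping the selector small.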

Collectively, the NSCLC selector design targets 521 exons and 13 introns from 139 recurrently mutated genes, in total covering ~125 kb (Fig. 1b). Within this small target (0.004% of the human genome), the selector identifies a median of four single nucleotide variants (SNVs) and covers 96% of patients with lung adenocarcinoma or squamous cell carcinoma. To validate the number of mutations covered per tumor, we examined the selector region in WES data from an independent cohort of 183 patients with lung adenocarcinoma. The selector covered 88% of patients with a median of four SNVs per patient, approximately fourfold more than would be expected from random sampling of the exome (P < 1.0 × 10⁻⁶; Fig. 1c), thus validating our selector design algorithm.

Methodological optimization and performance assessment
We performed deep sequencing with the NSCLC selector to achieve ~10,000× coverage (preduplication removal) based on considerations of sequencing depth, median number of reporters and ctDNA detection limit (Fig. 1d). We profiled a total of 90 samples, including two NSCLC cell lines, 17 primary tumor samples with matched peripheral blood leukocytes (PBLs) and 40 plasma samples from 18 human adults, including 5 healthy subjects and 13 patients with NSCLC (Supplementary Table 2). To assess and optimize selector performance, we first applied it to circulating DNA purified from healthy control plasma and observed efficient and uniform capture of genomic DNA (Supplementary Table 2). Sequenced plasma DNA fragments had a median length of ~170 bp (Fig. 2a), which closely corresponds to the length of DNA contained within a chromatosome. By optimizing library preparation from small quantities of plasma DNA, we increased recovery efficiency by >300% and decreased bias for libraries constructed from as little as 4 ng of DNA (Supplementary Fig. 3). Consequently, fluctuations in sequencing depth were minimal (Fig. 2b,c).

The detection limit and accuracy of CAPP-Seq are affected by (i) the input number and recovery rate of circulating DNA molecules, (ii) sample cross-contamination, (iii) potential allelic bias in the capture reagent and (iv) PCR or sequencing errors. We examined each of these elements in turn. First, by comparing the number of input DNA molecules per sample with estimates of library complexity (Supplementary Methods), we calculated a circulating DNA molecule recovery rate of ≥49% (Supplementary Table 2). This was in agreement with molecule recovery yields calculated following PCR (Supplementary Fig. 4a). Second, by analyzing patient-specific homozygous single nucleotide polymorphisms (SNPs) across samples, we found cross-contamination of ~0.06% in multiplexed plasma DNA (Supplementary Fig. 4b and Supplementary Methods), prompting us to exclude any tumor-derived SNV from further analysis if found as a germline SNP in another profiled patient. Next, we evaluated the allelic skew in heterozygous germline SNPs within patient PBL samples and observed minimal bias toward capture of reference alleles (Supplementary Fig. 4c). Finally, we analyzed the distribution of nonreference alleles across the selector for the 40 plasma DNA samples, excluding tumor-derived SNVs and germline SNPs. We found mean and median background rates of 0.006% and 0.0003%, respectively (Fig. 2d), both considerably lower than previously reported NGS-based methods for ctDNA analysis.

Figure 1 Development of CAPP-Seq. (a) Schematic depicting design of CAPP-Seq selectors and their application for assessing ctDNA. (b) Multiphase design of the NSCLC selector. Phase 1: genomic regions harboring known and suspected driver mutations in NSCLC are captured. Phases 2–4: addition of exons containing recurrent SNVs using WES data from lung adenocarcinomas and squamous cell carcinomas from TCGA (n = 407). Recurrence index (RI) is equal to total unique patients with mutations covered per kb of exon. Phases 5 and 6: addition of exons of predicted NSCLC drivers and introns and exons harboring breakpoints in rearrangements involving ALK, ROS1 and RET. Bottom: increase of selector length during each design phase. (c) Analysis of the number of SNVs per lung adenocarcinoma covered by the NSCLC selector in the TCGA WES cohort (Training; n = 229) and an independent lung adenocarcinoma WES data set (Validation; n = 183). Results are compared to selectors randomly sampled from the exome (P < 1.0 × 10⁻⁶ for the difference between random selectors and the NSCLC selector; Z-test, Online Methods). (d) Analytical modeling of CAPP-Seq, whole-exome sequencing and whole-genome sequencing for different detection limits of ctDNA in plasma. Calculations are based on the median number of mutations detected per NSCLC for CAPP-Seq (i.e., 4) and the reported number of mutations in NSCLC exomes and genomes. The vertical dashed line represents the median fraction of ctDNA detected in plasma from patients in this study. Additional details, including assumed sequencing throughput (i.e., bases) per lane, are described in Online Methods.

[Figure 1 graphic. Panels: (a) population-level analysis (recurrent mutations, CAPP-Seq selector library) and patient-level analysis (tissue biopsy, tumor/normal genomic DNA and mutation discovery; blood draw, cell-free DNA and mutation recovery; personalized markers); (b) number of targeted genomic regions and selector length (kb) across the phases of selector design (known drivers, max coverage, RI ≥3, RI ≥2, predicted drivers, add fusions; 125 kb), with patients with NSCLC (%) covered by ≥1, ≥2 and ≥3 SNVs; (c) cumulative percentage of patients versus mutations per patient (log₂ scale) for the Training, Validation and Random sets; (d) probability of detection in plasma versus detection limit in plasma (allele %) for CAPP-Seq, exome and genome sequencing.]
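The interplay of sequencing depth, number of reporters and detection limit summarized in Figure 1d can be illustrated with a simple binomial sketch. This is a toy model under stated assumptions, not the model from the Online Methods: every unique molecule at every reporter position is treated as an independent draw that is mutant with probability equal to the ctDNA fraction, and background errors are ignored. The function names, the unique depth of 5,000 and the conversion factor of about 3.3 pg of DNA per haploid human genome are assumptions introduced here.

```python
import math


def detection_probability(ctdna_fraction, unique_depth, n_reporters,
                          min_mutant_molecules=1):
    """P(observing at least min_mutant_molecules mutant molecules).

    Binomial model over unique_depth * n_reporters independent draws.
    """
    n = unique_depth * n_reporters
    p = ctdna_fraction
    p_below = sum(
        math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
        for k in range(min_mutant_molecules)
    )
    return 1.0 - p_below


def genome_equivalents(input_ng, pg_per_haploid_genome=3.3):
    """Approximate haploid genome copies in a given mass of DNA."""
    return input_ng * 1000.0 / pg_per_haploid_genome


# Hypothetical scenario: 0.02% ctDNA, 5,000 unique molecules per position.
p_one_reporter = detection_probability(0.0002, 5000, 1)
p_four_reporters = detection_probability(0.0002, 5000, 4)
copies_in_4ng = genome_equivalents(4.0)
```

Under these assumptions, tracking four reporters instead of one raises the chance of seeing at least one mutant molecule substantially at the same depth, which is why the number of mutations covered per patient matters as much as raw coverage. The genome-equivalent estimate also shows why molecule recovery is limiting at low input: a few nanograms of plasma DNA contain only on the order of a thousand copies of each locus.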

Nongermline plasma DNA could be present in the absence of cancer owing to contributions from preneoplastic cells from diverse tissues, and such 'biological' background may also affect CAPP-Seq sensitivity. We hypothesized that biological background, if present, would be particularly high for recurrently mutated positions in known cancer driver genes and therefore analyzed mutation rates of 107 cancer-associated SNVs in all 40 plasma samples, excluding somatic mutations found in each patient's tumor. Although the median fractional abundance was comparable to the global selector background (~0%), the mean was marginally higher at ~0.01% (Fig. 2e). Notably, we detected one mutational hotspot (tumor suppressor TP53, R175H) at a median frequency of ~0.18% across all plasma DNA samples, including those from patients and healthy subjects (Fig. 2f). As we observed the frequency of this mutant allele to be significantly above global background (P < 0.01), we hypothesize that it reflects true biological clonal heterogeneity and thus excluded it as a potential reporter. To address background more generally, we also normalized for allele-specific differences in background rate when assessing the significance of ctDNA detection (Supplementary Methods). As a result, we found that biological background is not a major factor affecting ctDNA quantitation at detection limits above ~0.01%.

Next, we empirically benchmarked the detection limit and linearity of CAPP-Seq (Fig. 2g and Supplementary Fig. 5a). We accurately detected defined inputs of NSCLC DNA at fractional abundances between 0.025% and 10% with high linearity (R² ≥ 0.994). We observed only marginal improvements in error metrics above a threshold of four SNP reporters (Fig. 2h,i and Supplementary Fig. 5b,c), which is equivalent to the median number of SNVs per tumor identified by the selector. Moreover, the fractional abundance of fusion breakpoints, insertions and deletions (indels) and copy number alterations (CNAs) correlated highly with expected concentrations (Supplementary Fig. 5d).

Figure 2 Analytical performance. (a–c) Quality parameters from a representative CAPP-Seq analysis of plasma DNA, including length distribution of sequenced circulating DNA fragments (fragment counts are represented on the y axis) (a) and depth of sequencing coverage (y axis) across all genomic regions in the selector (b). (c) Variation in sequencing depth (y axis) across plasma DNA samples from four patients. Depth is shown as mean ± s.e.m. (orange envelope). (d) Analysis of allelic background rate for 40 plasma DNA samples collected from 13 patients with NSCLC and 5 healthy individuals (details of all plasma DNA samples sequenced are shown in Supplementary Table 2). The y axis denotes the fraction of all alleles and selector positions tested. Details are given in Supplementary Methods. Perc., percentile. (e) Analysis of biological background in d focusing on 107 recurrent somatic mutations from a previously reported SNaPshot panel. Mutations found in a given patient's tumor were excluded. The mean frequency over all subjects was ~0.01%. A single outlier mutation (TP53 R175H) is indicated by a yellow diamond. (f) Individual mutations from e ranked by most to least recurrent, according to mean frequency across the 40 plasma DNA samples. The P value threshold of 0.01 (dotted line) corresponds to the 99th percentile of global selector background in d. (g) Dilution series analysis of expected versus observed frequencies of mutant alleles using CAPP-Seq. Five concentrations of fragmented HCC78 DNA spiked into control circulating DNA are shown (n = 14 reporter alleles). (h) Analysis of the effect of the number of SNVs considered on the estimates of fractional abundance using data from g. Data are presented as means ± 95% confidence interval. (i) Analysis of the effect of the number of SNVs considered on the mean correlation coefficient between expected and observed cancer fractions (blue solid line) using data from panel h. 95% confidence intervals are shown for e,f. Statistical variation for g is shown as s.e.m.

[Figure 2 graphic. Panels: (a) frequency versus fragment length (bp); (b) depth versus CAPP-Seq selector position (ordered by genomic region); (c) depth versus CAPP-Seq selector position (ordered by decreasing mean depth); (d) selector-wide background rate (%) for n = 40 plasma DNA samples, with median, mean, 75th and 95th percentiles marked; (e) biological background rate (%) per plasma DNA sample (n = 35 NSCLC samples, 5 healthy); (f) biological background rate (%) for a ranked list of recurrent somatic mutations (top 25 and bottom of 107 alleles tested), with TP53 R175H above the P < 0.01 threshold; (g) CAPP-Seq predicted fraction (%) versus known fraction (%), R² = 0.994; (h) mean reporter fraction (%) versus number of reporters considered for spikes of 0.025%, 0.05%, 0.1%, 0.5% and 1%; (i) correlation coefficient versus number of reporters considered.]
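The linearity reported for the dilution series is a coefficient of determination between known and measured fractions. A minimal sketch of that calculation follows. The spike-in fractions are the five listed in the Figure 2 legend, but the measured values are made up for illustration and are not data from the study.

```python
def r_squared(expected, observed):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(expected)
    if n != len(observed) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(expected) / n
    mean_y = sum(observed) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(expected, observed))
    sxx = sum((x - mean_x) ** 2 for x in expected)
    syy = sum((y - mean_y) ** 2 for y in observed)
    if sxx == 0 or syy == 0:
        raise ValueError("inputs must not be constant")
    return (sxy * sxy) / (sxx * syy)


known = [0.025, 0.05, 0.1, 0.5, 1.0]          # spike-in fractions (%)
measured = [0.03, 0.045, 0.11, 0.48, 1.02]    # hypothetical read-outs (%)
linearity = r_squared(known, measured)
```

Because the spike-in levels span almost two orders of magnitude, the highest concentrations dominate this statistic, so a high R² on its own says little about accuracy near the detection limit. That is one reason to inspect the low-fraction points separately, as panels h and i of Figure 2 do.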

Somatic mutation detection and tumor burden quantitation
We next applied CAPP-Seq to the discovery of somatic mutations in tumor samples collected from 17 patients with NSCLC (Table 1 and Supplementary Table 3), including formalin-fixed surgical resections, needle biopsy specimens and malignant pleural fluid. At a mean sequencing depth of ~5,000× (preduplicate removal) in tumor and paired germline samples (Supplementary Table 2), we detected 100% of previously identified SNVs and fusions and discovered many additional somatic variants (Table 1 and Supplementary Table 3). Moreover, we characterized breakpoints at base-pair resolution and identified partner genes for each of eight known fusions involving ALK or ROS1 (Table 1 and Supplementary Fig. 2). Tumors containing fusions were almost exclusively from never-smokers and contained fewer SNVs than those lacking fusions, as expected. Excluding patients with fusions, we identified a median of six SNVs (three missense) per patient (Table 1), in line with our selector design-stage predictions (Fig. 1b,c).

Next, we assessed the sensitivity and specificity of CAPP-Seq for disease monitoring and minimal residual disease detection using plasma samples from 5 healthy controls and 35 samples collected from 13 patients with NSCLC (Table 1 and Supplementary Table 4). We integrated information content across multiple instances and classes of somatic mutations into a ctDNA detection index. This index is analogous to a false-positive rate and is based on a decision tree in which fusion breakpoints take precedence because of their nonexistent background and in which P values from multiple reporter types are integrated (Online Methods). When we applied this approach in a receiver operating characteristic (ROC) analysis, CAPP-Seq achieved an area under the curve (AUC) of 0.95, with maximal sensitivity and specificity of 85% and 96%, respectively, for all plasma DNA samples from untreated patients and healthy controls. Sensitivity among patients with stage I tumors was 50%, and among those with stage II–IV tumors, it was 100%, with a specificity of 96% for both groups (Fig. 3a,b). Moreover, when considering both pre- and post-treatment samples, CAPP-Seq exhibited robust performance, with AUC values of 0.89 for all stages and 0.91 for stages II–IV (P < 0.0001, Z-test, Online Methods; Supplementary Fig. 6). Furthermore, by adjusting the ctDNA detection index, we could increase specificity up to 98% while still capturing two-thirds of all cancer-positive samples and three-fourths of stage II–IV cancer-positive samples (Supplementary Fig. 6). Thus, CAPP-Seq can achieve robust assessment of tumor burden and can be tuned to deliver a desired sensitivity and specificity.

Table 1 Patient characteristics and pretreatment CAPP-Seq monitoring results

Case  Age  Sex  Histology   Stage  TNM        Smoking history  Indels
P12   86   F    SCC         IA     T1bN0M0    Heavy            1
P1    66   F    Adeno       IB     T2aN0M0    Heavy            4
P16   82   F    Adeno       IB     T2aN0M0    Heavy            2
P17   85   F    Adeno       IB     T2aN0M0    Heavy            0
P13   90   F    SCC         IIB    T3N0M0     Heavy            0
P2    61   F    Large cell  IIIA   T3N1M0     Heavy            1
P3    67   F    Adeno       IIIB   T1bN3M0    Light            0
P14   55   F    Adeno       IIIB   T1aN3M0    Heavy            0
P15   41   F    Adeno       IIIB   T3N3M0     Light            1
P4    47   F    Adeno       IV     T2aN2M1b   Heavy            0
P5    49   M    Adeno       IV     T1bN0M1a   None             0
P6    54   M    Adeno       IV     T3N2M1b    None             0
P9    49   M    Adeno       IV     T4N3M1a    None             0
P10   35   M    Adeno       IIIA   T4N0M0     None             0
P11   38   M    Adeno       IIIA   T3N2M0     None             0
P7    50   M    Adeno       IV     T1aN2M1b   Light            0
P8    48   M    Adeno       IV     T4N0M1b    None             0

Remaining Table 1 columns (values in extracted order; the row assignment is not recoverable from this copy):
No. of SNVs (nonsilent): 25 (10), 12 (3), 26 (5), 12 (3), 1 (0), 2 (1), 3 (2), 4 (3), 3 (2), 8 (5), 1 (1), 5 (4), 2 (2), 6 (3), 0, 0, 0
Fusion: ALK, ALK, ALK, ROS1, ROS1, ROS1, ALK, ALK, ROS1
Partner: EML4, EML4, CD74, SLC34A2, MKX, EML4, KIF5B, FYN
ctDNA (%): 0.04, 1.0, 3.2, 0.039, 0.58, 0.05, 0.095, 0.896, 1.78, 0.019, 0.025, ND, ND, −, −, −, −
Pretreatment ctDNA (pg ml⁻¹): 350.2, 143.8, 108.1, 269.8, 10.2, 16.2, 64.7, ND, ND, 3.8, 2.1, 2.5, 1.9, −, −, −, −
Tumor (ml): 121.8, 339.3, 66.2, 82.1, 12.4, 23.1, 10.2, 22.5, 23.1, NA, 5.2, 7.9, 5.5, −, −, −, −

Smoking history, ≥20 pack years (Heavy), >0 and <20 pack years (Light). SCC, small cell cancer; Adeno, adenocarcinoma; TNM, tumor, node and metastasis classification system. ND, mutant DNA was not detected above background (Online Methods); NA, tumor volume could not be reliably assessed. Dashes indicate that a plasma sample was not available. Additional details are provided in Supplementary Tables 3 and 4.

Figure 3 Sensitivity and specificity analysis. (a) ROC analysis of plasma DNA samples from pretreatment samples and healthy controls, divided into all stages (n = 13 patients) and stages II–IV (n = 9 patients). AUC values are significant at P < 0.0001 (Z-test, Online Methods). Sn, sensitivity; Sp, specificity. (b) Raw data related to a. TP, true positive; FP, false positive; TN, true negative; FN, false negative. (c) Concordance between tumor volume, measured by CT or PET-CT, and concentration (pg ml⁻¹) of ctDNA from pretreatment samples (n = 9), measured by CAPP-Seq. Patients P6 and P9 were excluded owing to inability to accurately assess tumor volume and differences related to the capture of fusions, respectively. Of note, linear regression was performed in non-log space; the log-log axes and dashed diagonal line are for display purposes only.

[Figure 3 graphic. Panels: (a) sensitivity (%) versus 100 – specificity (%) for pretreatment plasma DNA; all stages: Sn 85%, Sp 96%, AUC 0.95; stages II–IV: Sn 100%, Sp 96%, AUC 0.99; (b) patient-specific reporters across plasma DNA samples (n = 18) from healthy controls C1–C5, stage I patients (P12, P1, P16, P17) and stage II–IV patients (P13, P14, P2, P3, P15, P4, P5, P6, P9), scored as TP, FP, TN or FN; * includes indel(s), ** includes fusion(s); (c) tumor volume (cm³) versus ctDNA (pg ml⁻¹), R² = 0.89, P = 0.0002.]
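The stage-specific sensitivities quoted above follow from simple patient counts, which the sketch below recomputes. The true-positive and false-negative counts are inferred here from the reported percentages together with the cohort sizes in the Figure 3 legend (13 patients overall, 9 with stage II–IV disease, hence 4 with stage I); they are not taken from a count table. The specificity example uses purely hypothetical counts.

```python
def sensitivity(tp, fn):
    """Fraction of true cancer cases that test positive."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of cancer-free cases that test negative."""
    return tn / (tn + fp)


# Inferred counts: 50% of 4 stage I patients, 100% of 9 stage II-IV patients.
stage_i = sensitivity(tp=2, fn=2)
stage_ii_iv = sensitivity(tp=9, fn=0)
all_stages = sensitivity(tp=2 + 9, fn=2 + 0)

# Hypothetical counts, only to show how a 96% specificity could arise.
example_specificity = specificity(tn=24, fp=1)
```

Pooling the two strata gives 11 of 13 patients detected, which rounds to the 85% overall sensitivity reported for pretreatment samples. This shows that the overall figure is driven mostly by the stage II–IV patients.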

© 2014 Nature America, Inc. All rights reserved. and/or indel reporters ranged from ~0.02% to 3.2% ( to 3.2% ~0.02% from ranged reporters indel and/or to by therapy.in of plasma SNV responses ctDNA Fractions detected clinical and volumes tumor measured radiographically with relate cor ctDNA of levels detectable significantly whether We asked next samples plasma in burden tumor NSCLC of Monitoring specificity. and sensitivity desired a deliver to tuned be can and den Fig. 6 and samples three-fourths of stage II–IV cancer-positive samples cancer-positive ( all of two-thirds capturing still while mean as ( point time P16-3; and P1-3 (P1-2, cancer-negative ( mutations tumor primary the in patients from samples DNA plasma All screening. cancer or ( response. complete CR, NSCLC. IB stage with patients ( disease. of dead DOD, disease; of evidence no NED, ( NSCLC IIB stage with patient ( (right). samples plasma and (left) tumor primary the in shown are clone T790M-containing and clone dominant the of abundance fractional The NSCLC. IV stage with a patient in mutation NSCLC. IV stage with a patient in a fusion) ( breakpoints rearrangement three using ( (SNVs/indel) indel an and SNVs using NSCLC IIIB stage with a patient in treatment to response in changes ( indicated. as therapies combination as administered were crizotinib or (bev), bevacizumab (pem), pemetrexed (cis), cisplatin 4 Figure na of ctDNA levels Absolute samples. pretreatment in of ~0.1% median d g a PET-CT t –1 Treatment Mont Mont u ctDNA (pg ml ) Imaging SNVs/indel (CAPP-Seq) ND (CAPP-Seq) 100 status 10 r ). ). Thus, CAPP-Seq can achieve robust assessment of tumor bur

–1 1 0 h CT e medicine e

ctDNA (pg ml ) h

0 1 2 3 ± Stage IIIB Noninvasive detection and monitoring of ctDNA. ( ctDNA. of monitoring and detection Noninvasive Mutant allele frequency (%) s.e.m. Scale bars ( bars Scale s.e.m. carbo/paclitaxel 10 20 30 40 50 Pretreatment Stage IB Supplementary Table 4 Supplementary 0 Radiation /cetuximab 5 0 Dominant clone:allSNVs Dominant clone: Sublone: Tu ND Tumor 0 0 P5 (stageIV) Surgery + EGFR Tu 1 5

Surgery advance online publication online advance Plasma T790M

(months) (months) EGF Time Time CR 11 3 3 1 R therapy: Chemo- cis/pem L858R a Supplementary Methods Supplementary , e b ( 0 1 2 3 4 5 ) and a patient with stage IIIB NSCLC ( NSCLC IIIB stage with a patient ) and n

,

=3)

e ) Mutant allele frequency (% frequency allele Mutant 15 – 15 ). The lowest fraction of ctDNA among positive samples was ~0.4% (dashed horizontal line). Data in in Data line). horizontal (dashed ~0.4% was samples positive among ctDNA of fraction lowest The ). h Surgery ), 10 cm. 10 ), NED 32 32 Tumor volume b Supplementary Methods Supplementary ). Tu, tumor; Ef, pleural effusion; ND, not detected. ( detected. not ND, effusion; pleural Tu,). Ef, tumor; e 19 19 , P1 f TMEM132D NED P1 ) CAPP-Seq results from post-treatment plasma DNA samples are predictive of clinical outcomes in a in outcomes clinical of predictive are samples DNA plasma post-treatment from results ) CAPP-Seq 35 5 e 100 10 1 Treatment PET-CT Imaging Mont

ctDNA (pg ml–1) status

in pretreatment plasma significantly correlated with tumor volume as measured by computed tomography (CT) and positron emission tomography (PET) imaging (R² = 0.89, P = 0.0002; Fig. 3).

[Figure 4, panels a–i. Plot content is not recoverable from the extracted text. Axes: ctDNA (pg ml⁻¹), tumor volume (cm³), mutant allele frequency (%), time (months).] (a–h) Disease monitoring using CAPP-Seq. (c) Concordance between different reporters (SNVs and fusions). (d) Detection of a subclonal EGFR T790M resistance mutation. (g,h) Monitoring of tumor burden following complete tumor resection (g) and SABR (h). (i) Exploratory analysis of the potential application of CAPP-Seq for biopsy-free tumor genotyping; samples with detectable mutations are shown. The number following the hyphen in each sample (e.g., -1) represents the plasma time point. Carbo, carboplatin; SD, stable disease; PD, progressive disease; PR, partial response; TMEM132D, transmembrane protein 132D gene.

To determine whether ctDNA concentrations reflect disease burden in longitudinal samples, we analyzed plasma DNA from three patients with advanced NSCLC undergoing distinct therapies (Fig. 4a–c). As in pretreatment samples, ctDNA levels highly correlated with tumor volumes during therapy (R² = 0.95 for patient 15 (P15); R² = 0.85 for P9). This behavior was observed whether the mutation types measured were a collection of SNVs and an indel (P15, Fig. 4a), multiple fusions (P9, Fig. 4b) or SNVs and a fusion (P6, Fig. 4c).

Of note, in one patient (P9), we identified both a classic EML4-ALK fusion and two previously unreported fusions involving ROS1: FYN-ROS1 and ROS1-MKX (Supplementary Fig. 7). All fusions were confirmed by quantitative PCR (qPCR) amplification of genomic DNA and were independently recovered in plasma samples (Supplementary Table 4). To the best of our knowledge, this is the first observation of ROS1 and ALK fusions in the same individual with NSCLC.

We designed the NSCLC CAPP-Seq selector to detect multiple SNVs per tumor. In one patient (P5), this design allowed us to identify a dominant clone with an activating EGFR mutation as well as an erlotinib-resistant subclone with a 'gatekeeper' EGFR T790M mutation²⁶. The ratios between clones were identical in a tumor biopsy and simultaneously sampled plasma (Fig. 4d), demonstrating that our method has potential for detecting and quantifying clinically relevant subclones.

Patients with stage II or III NSCLC undergoing definitive radiotherapy often have surveillance CT or PET-CT scans that are difficult to interpret owing to radiation-induced inflammatory and fibrotic changes in the lung and surrounding tissues²⁷. For patient P13, who was treated with radiotherapy for stage IIB NSCLC, imaging showed a large mass that was interpreted to represent residual disease. However, ctDNA at the same follow-up time point was undetectable (Fig. 4e), and the patient remained disease free 22 months later, which supports the ctDNA result. Another patient (P14) was treated with chemoradiotherapy for stage IIIB NSCLC, and follow-up imaging revealed a near-complete response (Fig. 4f). However, the ctDNA concentration slightly increased following therapy, suggesting progression of occult microscopic disease. Indeed, clinical progression was detected 7 months later, and the patient ultimately succumbed to NSCLC. These data highlight the promise of ctDNA analysis for identifying patients with residual disease after therapy.

We next asked whether the low detection limit of CAPP-Seq would allow monitoring in early-stage NSCLC. Patients P1 (Fig. 4g) and P16 (Fig. 4h) underwent surgery and stereotactic ablative radiotherapy (SABR), respectively, for stage IB NSCLC. We detected ctDNA in pretreatment plasma of patient P1 but not at 3 or 32 months following surgery, which suggests that this patient was free of disease and probably cured²⁸. For patient P16, the initial surveillance PET-CT scan following SABR showed a residual mass that was interpreted to represent either residual tumor or postradiotherapy inflammation. We detected no evidence of residual disease by ctDNA, supporting the latter hypothesis, and the patient remained free of disease at last follow-up 21 months after therapy. Taken together, these results demonstrate the potential utility of CAPP-Seq for measuring tumor burden in early- and advanced-stage NSCLC and for monitoring ctDNA during distinct types of therapy.

Biopsy-free cancer screening and tumor genotyping
Finally, we explored whether CAPP-Seq analysis of ctDNA could potentially be used for cancer screening and biopsy-free tumor genotyping. As proof of principle, we blinded ourselves to the mutations present in each patient's tumor and applied a new statistical method to test for the presence of cancer DNA in each plasma sample in our cohort (Supplementary Fig. 2 and Methods). By implementing our cancer screening method for high specificity, we correctly classified 100% of patient plasma samples with ctDNA above fractional abundances of 0.4% with a false-positive rate of 0% (Fig. 4i).

CAPP-Seq could therefore potentially improve upon the low positive predictive value of low-dose CT screening in patients at high risk of developing NSCLC²⁹.

Separately, when we specifically examined the ability of CAPP-Seq to noninvasively detect actionable mutations in EGFR and KRAS, we correctly identified 100% of mutations at allelic fractions greater than 0.1% with 99% specificity. CAPP-Seq may therefore have utility for biopsy-free tumor genotyping in patients with locally advanced or metastatic NSCLC. However, methodological improvements will be required to detect and genotype stage I tumors without prior knowledge of tumor genotype.

DISCUSSION
In this study, we present CAPP-Seq as a new method for ctDNA quantitation. Its key features include high sensitivity and specificity, lack of a need for patient-specific optimization and coverage of nearly all patients with NSCLC. To our knowledge, CAPP-Seq is the first NGS-based method for ctDNA analysis that achieves both an ultralow detection limit and broad patient coverage at a reasonable cost. Our approach also reduces the potential impact of stochastic noise and biological variability (for example, mutations near the detection limit or subclonal tumor evolution) on tumor burden quantitation by integrating information content across multiple instances and classes of somatic mutations. These features facilitated the detection of minimal residual disease and ctDNA quantitation from stage I NSCLC tumors. Although we focused on NSCLC, our method could be applied to any malignancy for which recurrent mutation data are available.

In many patients, levels of ctDNA are considerably lower than the detection thresholds of previously described sequencing-based methods. For example, pretreatment ctDNA concentration is <0.5% in the majority of patients with lung and colorectal carcinomas. Following therapy, ctDNA concentrations typically drop, thus requiring even lower detection thresholds. Previously published ctDNA detection methods employing amplicon⁸,¹⁰,¹¹, whole-exome¹² or whole-genome⁹,³²,³³ sequencing would not be sensitive enough to detect ctDNA in most patients with NSCLC, even at tenfold or greater sequencing costs (Fig. 1d and Supplementary Fig. 8).

To further expand the potential clinical applications of ctDNA quantitation, additional gains in the detection threshold are desirable. Potential approaches include using barcoding strategies that suppress PCR errors resulting from library preparation³⁴,³⁵ and increasing the amount of plasma used for ctDNA analysis above the average of ~1.5 ml used in our study. A second potential limitation of CAPP-Seq is inefficient capture of fusions, which could lead to underestimates of tumor burden (for example, P9; Supplementary Methods). However, this bias can be analytically addressed when other reporter types are present (for example, P6; Supplementary Table 4). Finally, although we found that CAPP-Seq could quantitate CNAs, our current selector design did not prioritize these types of aberrations. We anticipate that adding coverage for certain CNAs will prove useful for monitoring various types of cancers.

In summary, targeted hybrid capture and high-throughput sequencing of plasma DNA allows for highly sensitive and noninvasive detection of ctDNA in the vast majority of patients with NSCLC at low cost. CAPP-Seq could therefore be routinely applied clinically and has the potential for accelerating the personalized detection, therapy and monitoring of cancer. We anticipate that CAPP-Seq will prove valuable in a variety of clinical settings, including the assessment of cancer DNA in alternative biological fluids and specimens with low cancer cell content.

METHODS
Methods and any associated references are available in the online version of the paper.

Accession codes. Raw sequencing data were deposited in the Sequence Read Archive with accession number SRP040228.

Note: Any Supplementary Information and Source Data files are available in the online version of the paper.

ACKNOWLEDGMENTS
We thank S. Quake and members of his lab for suggestions and N. Neff for technical assistance. This work was supported by the US Department of Defense (M.D., A.A.A., A.M.N.), the US National Institutes of Health Director's New Innovator Award Program (M.D.; 1-DP2-CA186569), the Ludwig Institute for Cancer Research (M.D., A.A.A.), the Radiological Society of North America (S.V.B.; #RR1221), an Association of American Cancer Institutes Translational Cancer Research Fellowship (S.V.B.) and a grant from both the Siebel Stem Cell Institute and the Thomas and Stacey Siebel Foundation (A.M.N.). A.A.A. and M.D. are supported by Doris Duke Clinical Scientist Development Awards.

AUTHOR CONTRIBUTIONS
A.M.N., S.V.B., A.A.A. and M.D. developed the concept, designed the experiments, analyzed the data and wrote the manuscript. S.V.B. performed the molecular biology experiments, and A.M.N. performed the bioinformatics analyses. C.L.L. helped develop analytical pipeline software. S.V.B., J.T., J.F.W., N.C.W.E., L.A.M., J.W.N., H.A.W., J.B.S., R.E.M., B.W.L. Jr. and M.D. provided patient specimens. A.A.A. and M.D. contributed equally as senior authors. All authors commented on the manuscript at all stages.

COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.

Reprints and permissions information is available online at http://www.nature.com/reprints/index.html.

1. Taniguchi, K. et al. Quantitative detection of EGFR mutations in circulating tumor DNA derived from lung adenocarcinomas. Clin. Cancer Res. 17, 7808–7815 (2011).
2. Rosell, R. et al. Screening for epidermal growth factor receptor mutations in lung cancer. N. Engl. J. Med. 361, 958–967 (2009).
3. Kuang, Y. et al. Noninvasive detection of EGFR T790M in gefitinib or erlotinib resistant non-small cell lung cancer. Clin. Cancer Res. 15, 2630–2636 (2009).
4. Gautschi, O. et al. Origin and prognostic value of circulating KRAS mutations in lung cancer patients. Cancer Lett. 254, 265–273 (2007).
5. Leary, R.J. et al. Development of personalized tumor biomarkers using massively parallel sequencing. Sci. Transl. Med. 2, 20ra14 (2010).
6. McBride, D.J. et al. Use of cancer-specific genomic rearrangements to quantify disease burden in plasma from patients with solid tumors. Genes Chromosom. Cancer 49, 1062–1069 (2010).
7. He, J. et al. IgH gene rearrangements as plasma biomarkers in non-Hodgkin's lymphoma patients. Oncotarget 2, 178–185 (2011).
8. Forshew, T. et al. Noninvasive identification and monitoring of cancer mutations by targeted deep sequencing of plasma DNA. Sci. Transl. Med. 4, 136ra168 (2012).
9. Leary, R.J. et al. Detection of chromosomal alterations in the circulation of cancer patients with whole-genome sequencing. Sci. Transl. Med. 4, 162ra154 (2012).
10. Narayan, A. et al. Ultrasensitive measurement of hotspot mutations in tumor DNA in blood using error-suppressed multiplexed deep sequencing. Cancer Res. 72, 3492–3498 (2012).
11. Dawson, S.J. et al. Analysis of circulating tumor DNA to monitor metastatic breast cancer. N. Engl. J. Med. 368, 1199–1209 (2013).
12. Murtaza, M. et al. Non-invasive analysis of acquired resistance to cancer therapy by sequencing of plasma DNA. Nature 497, 108–112 (2013).
13. Crowley, E., Di Nicolantonio, F., Loupakis, F. & Bardelli, A. Liquid biopsy: monitoring cancer-genetics in the blood. Nat. Rev. Clin. Oncol. 10, 472–484 (2013).
14. Forbes, S.A. et al. COSMIC (the Catalogue of Somatic Mutations in Cancer): a resource to investigate acquired mutations in human cancer. Nucleic Acids Res. 38, D652–D657 (2010).
15. Ding, L. et al. Somatic mutations affect key pathways in lung adenocarcinoma. Nature 455, 1069–1075 (2008).
16. Youn, A. & Simon, R. Identifying cancer driver genes in tumor genome sequencing studies. Bioinformatics 27, 175–181 (2011).
17. Bergethon, K. et al. ROS1 rearrangements define a unique molecular class of lung cancers. J. Clin. Oncol. 30, 863–870 (2012).
18. Kwak, E.L. et al. Anaplastic lymphoma kinase inhibition in non-small-cell lung cancer. N. Engl. J. Med. 363, 1693–1703 (2010).
19. Pao, W. & Hutchinson, K.E. Chipping away at the lung cancer genome. Nat. Med. 18, 349–351 (2012).
20. Imielinski, M. et al. Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing. Cell 150, 1107–1120 (2012).
21. Govindan, R. et al. Genomic landscape of non-small cell lung cancer in smokers and never-smokers. Cell 150, 1121–1134 (2012).
22. Koivunen, J.P. et al. EML4-ALK fusion gene and efficacy of an ALK kinase inhibitor in lung cancer. Clin. Cancer Res. 14, 4275–4283 (2008).
23. Rikova, K. et al. Global survey of phosphotyrosine signaling identifies oncogenic kinases in lung cancer. Cell 131, 1190–1203 (2007).
24. Fan, H.C., Blumenfeld, Y.J., Chitkara, U., Hudgins, L. & Quake, S.R. Noninvasive diagnosis of fetal aneuploidy by shotgun sequencing DNA from maternal blood. Proc. Natl. Acad. Sci. USA 105, 16266–16271 (2008).
25. Su, Z. et al. A platform for rapid detection of multiple oncogenic mutations with relevance to targeted therapy in non-small-cell lung cancer. J. Mol. Diagn. 13, 74–84 (2011).
26. Kobayashi, S. et al. EGFR mutation and resistance of non-small-cell lung cancer to gefitinib. N. Engl. J. Med. 352, 786–792 (2005).
27. Iyengar, P. & Timmerman, R.D. Stereotactic ablative radiotherapy for non-small cell lung cancer: rationale and outcomes. J. Natl. Compr. Canc. Netw. 10, 1514–1520 (2012).
28. Nesbitt, J.C., Putnam, J.B. Jr., Walsh, G.L., Roth, J.A. & Mountain, C.F. Survival in early-stage non-small cell lung cancer. Ann. Thorac. Surg. 60, 466–472 (1995).
29. Aberle, D.R. et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N. Engl. J. Med. 365, 395–409 (2011).
30. Diehl, F. et al. Detection and quantification of mutations in the plasma of patients with colorectal tumors. Proc. Natl. Acad. Sci. USA 102, 16368–16373 (2005).
31. Diehl, F. et al. Analysis of mutations in DNA isolated from plasma and stool of colorectal cancer patients. Gastroenterology 135, 489–498 (2008).
32. Chan, K.C. et al. Cancer genome scanning in plasma: detection of tumor-associated copy number aberrations, single-nucleotide variants, and tumoral heterogeneity by massively parallel sequencing. Clin. Chem. 59, 211–224 (2013).
33. Heitzer, E. et al. Tumor associated copy number changes in the circulation of patients with prostate cancer identified through whole-genome sequencing. Genome Med. 5, 30 (2013).
34. Schmitt, M.W. et al. Detection of ultra-rare mutations by next-generation sequencing. Proc. Natl. Acad. Sci. USA 109, 14508–14513 (2012).
35. Shiroguchi, K., Jia, T.Z., Sims, P.A. & Xie, X.S. Digital RNA sequencing minimizes sequence-dependent bias and amplification noise with optimized single-molecule barcodes. Proc. Natl. Acad. Sci. USA 109, 1347–1352 (2012).

ONLINE METHODS
Patient selection. Between April 2010 and June 2012, patients undergoing treatment for newly diagnosed or recurrent NSCLC were enrolled in a study approved by the Stanford University Institutional Review Board and provided informed consent. Enrolled patients had not received blood transfusions within 3 months of blood collection. Patient characteristics are listed in Supplementary Table 3. All treatments and radiographic examinations were performed as part of standard clinical care. Volumetric measurements of tumor burden were based on visible tumor on CT and calculated according to the ellipsoid formula: (length/2) × width².

Sample collection and processing. Peripheral blood from patients was collected in EDTA Vacutainer tubes (Becton Dickinson). Blood samples were processed within 3 h of collection. Plasma was separated by centrifugation at 2,500g for 10 min, transferred to microcentrifuge tubes and centrifuged at 16,000g for 10 min to remove cell debris. The cell pellet from the initial spin was used for isolation of germline genomic DNA from PBLs with the DNeasy Blood & Tissue Kit (Qiagen). Matched tumor DNA was isolated from formalin-fixed, paraffin-embedded specimens or from the cell pellet of pleural effusions. Genomic DNA was quantified by Quant-iT PicoGreen dsDNA Assay Kit (Invitrogen).

Cell-free DNA purification and quantification. Circulating DNA was isolated from 1–5 mL plasma with the QIAamp Circulating Nucleic Acid Kit (Qiagen). The concentration of purified plasma DNA was determined by qPCR using an 81-bp amplicon on chromosome 1 (ref. 24) and a dilution series of intact male human genomic DNA (Promega) as a standard curve. Power SYBR Green was used for qPCR on a HT7900 Real Time PCR machine (Applied Biosystems), using standard PCR thermal cycling parameters.

Next-generation sequencing library construction. Indexed Illumina NGS libraries were prepared from plasma DNA and shorn tumor, germline and cell line genomic DNA. For patient plasma DNA, 7–32 ng DNA were used for library construction without additional fragmentation. For tumor, germline and cell line genomic DNA, 69–1,000 ng DNA was sheared before library construction with a Covaris S2 instrument using the recommended settings for 200-bp fragments. Details are provided in Supplementary Table 2.

The NGS libraries were constructed using the KAPA Library Preparation Kit (Kapa Biosystems) employing a DNA polymerase possessing strong 3′→5′ exonuclease (or proofreading) activity and displaying the lowest published error rate (i.e., highest fidelity) of all commercially available B-family DNA polymerases³⁶,³⁷. The manufacturer's protocol was modified to incorporate with-bead enzymatic and cleanup steps using Agencourt AMPure XP beads (Beckman-Coulter)³⁸. Ligation was performed for 16 h at 16 °C using 100-fold molar excess of indexed Illumina TruSeq adapters. Single-step size selection was performed by adding 40 µL (0.8×) of PEG buffer to enrich for ligated DNA fragments. The ligated fragments were then amplified using 500 nM Illumina backbone oligonucleotides and 4–9 PCR cycles, depending on input DNA mass. Library purity and concentration were assessed by spectrophotometer (NanoDrop 2000) and qPCR (KAPA Biosystems), respectively. Fragment length was determined on a 2100 Bioanalyzer using the DNA 1000 Kit (Agilent).

Library design for hybrid selection. Hybrid selection was performed with a custom SeqCap EZ Choice Library (Roche NimbleGen). This library was designed through the NimbleDesign portal (v1.2.R1) using genome build hg19 NCBI Build 37.1/GRCh37 and with Maximum Close Matches set to 1. Input genomic regions were selected according to the most frequently mutated genes and exons in NSCLC and were chosen to iteratively maximize the number of mutations per tumor while minimizing selector size. These regions were identified from the COSMIC database, TCGA and other published sources as described in the Supplementary Methods. Final selector coordinates are provided in Supplementary Table 1.

Hybrid selection and next-generation sequencing. NimbleGen SeqCap EZ Choice was used according to the manufacturer's protocol with modifications. Between 9 and 12 indexed Illumina libraries were included in a single capture hybridization. Following hybrid selection, the captured DNA fragments were amplified with 12 to 14 cycles of PCR using 1× KAPA HiFi Hot Start Ready Mix and 2 µM Illumina backbone oligonucleotides in four to six separate 50-µL reactions. The reactions were then pooled and processed with the QIAquick PCR Purification Kit (Qiagen). Multiplexed libraries were sequenced using 100-bp paired-end runs on an Illumina HiSeq 2000. Raw sequencing data have been deposited in the Sequence Read Archive under accession number SRP040228.

Mapping and quality control. Paired-end reads were mapped to the hg19 reference genome with BWA 0.6.2 (default parameters)³⁹ and sorted and indexed with SAMtools⁴⁰. Quality control (QC) was assessed using a custom Perl script to collect a variety of statistics, including mapping characteristics, read quality and selector on-target rate (i.e., number of unique reads that intersect the selector space divided by all aligned reads), generated respectively by SAMtools flagstat, FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and BEDTools coverageBed⁴¹. Plots of fragment length distribution and sequence depth and coverage were automatically generated for visual QC assessment. To mitigate the impact of sequencing errors, analyses not involving fusions were restricted to properly paired reads, and only bases with Phred quality scores ≥30 (≤0.1% probability of a sequencing error) were further analyzed.

Detection thresholds. Two dilution series were performed to assess the linearity and accuracy of CAPP-Seq for quantitating ctDNA. In one experiment, shorn genomic DNA from a NSCLC cell line (HCC78) was spiked into circulating DNA from a healthy individual, and in a second experiment, shorn genomic DNA from one NSCLC cell line (NCI-H3122) was spiked into shorn genomic DNA from a second NSCLC line (HCC78). A total of 32 ng DNA was used for library construction. Following mapping and quality control, homozygous reporters were identified as alleles unique to each sample with at least 20× sequencing depth and an allelic fraction >80%. Fourteen such reporters were identified between HCC78 genomic DNA and plasma DNA (Fig. 2g,h), whereas 24 reporters were found between NCI-H3122 and HCC78 genomic DNA (Supplementary Fig. 5).

Bioinformatics pipeline. Details of bioinformatics methods are supplied in the Supplementary Methods. Briefly, for detection of SNVs and indels, we employed VarScan 2 (ref. 42) with strict postprocessing filters to improve variant call confidence, and for fusion identification and breakpoint characterization, we used an algorithm called FACTERA (Supplementary Methods). To quantify tumor burden in plasma DNA, allele frequencies of reporter SNVs and indels were assessed using the output of SAMtools mpileup⁴⁰, and fusions, if detected, were enumerated with FACTERA.

Statistical analyses. The NSCLC selector was validated in silico using an independent cohort of lung adenocarcinomas²⁰ (Fig. 1c). To assess statistical significance, we analyzed the same cohort using 10,000 random selectors sampled from the exome, each with an identical size distribution to the CAPP-Seq NSCLC selector. The performance of random selectors had a normal distribution, and P values were calculated accordingly. Of note, all identified somatic lesions were considered in this analysis.

Related to Figure 1d, the probability (P) of recovering at least two reads of a single mutant allele in plasma for a given depth and detection limit was modeled by a binomial distribution. Given P, the probability of detecting all identified tumor mutations in plasma (for example, median of 4 for CAPP-Seq) was modeled by a geometric distribution. Estimates are based on 250 million 100-bp paired-end reads per lane (for example, using an Illumina HiSeq 2000 platform). Moreover, an on-target rate of 60% was assumed for CAPP-Seq and WES.

To evaluate the impact of reporter number on tumor burden estimates, we performed Monte Carlo sampling (1,000×), varying the number of reporters available {1,2,…,n} in two spiking experiments (Fig. 2g–i and Supplementary Fig. 4).
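The binomial model described under Statistical analyses (the probability of recovering at least two reads of a single mutant allele for a given depth and allele fraction) can be sketched as follows. The function name and the example depth of 10,000× are illustrative assumptions; only the binomial step is shown, because the parameterization of the subsequent geometric model is not given in the text.

```python
def prob_at_least_two_reads(depth: int, allele_fraction: float) -> float:
    """P(X >= 2) for X ~ Binomial(depth, allele_fraction).

    depth: number of reads covering the position (assumed independent draws).
    allele_fraction: fraction of molecules carrying the mutant allele.
    """
    if depth < 0 or not 0.0 <= allele_fraction <= 1.0:
        raise ValueError("depth must be >= 0 and allele_fraction in [0, 1]")
    q = 1.0 - allele_fraction
    p_zero = q ** depth                                   # no mutant reads
    p_one = depth * allele_fraction * q ** (depth - 1) if depth >= 1 else 0.0
    return max(0.0, 1.0 - p_zero - p_one)


if __name__ == "__main__":
    # Hypothetical example: 10,000x depth, mutant allele fraction of 0.02%.
    print(round(prob_at_least_two_reads(10_000, 0.0002), 3))
```

Under these assumptions, a single reporter at an expected two mutant reads is recovered with at least two reads only a little more than half the time, which illustrates why detection at very low allele fractions benefits from both high depth and multiple reporters.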

To assess the significance of tumor burden estimates in plasma DNA using patient-specific SNVs, we compared SNV frequencies to the null distribution of selector-wide background alleles. Indels were analyzed separately using mutation-specific background rates and Z-score statistics. Fusion breakpoints were considered significant when present with >0 read support due to their ultralow false detection rate.

For each patient, we calculated a ctDNA detection index (akin to a false-positive rate) based on P value integration from his or her array of reporters (Table 1 and Supplementary Table 4). Specifically, for cases where only a single reporter type was present in a patient's tumor, the corresponding P value was used. If SNV and indel reporters were detected and if each independently had a P value <0.1, we combined their respective P values using Fisher's method⁴³. Otherwise, given the prioritization of SNVs in the selector design, the SNV P value was used. If a fusion breakpoint identified in a tumor sample (i.e., involving ROS1, ALK or RET) was recovered in plasma DNA from the same patient, it trumped all other mutation types, and its P value (~0) was used. If a fusion detected in the tumor was not found in corresponding plasma (potentially owing to hybridization inefficiency; Supplementary Methods), the P value for any remaining mutation type(s) was used. The ctDNA detection index was considered significant if the metric was ≤0.05 (approximate false-positive rate ≤5%), the threshold that maximized CAPP-Seq sensitivity and specificity in ROC analyses (determined by Euclidean distance to a perfect classifier; i.e., a true-positive rate equal to 1 and a false-positive rate equal to 0; Figs. 3 and 4, Table 1 and Supplementary Table 4).
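The rules for combining reporter P values into a ctDNA detection index can be sketched in code. This is an illustrative reading of the text, not the authors' implementation: the function names and argument layout are assumptions, and Fisher's method is computed with the closed-form chi-square survival function for even degrees of freedom so that only the standard library is needed.

```python
import math
from typing import Optional, Sequence


def fisher_combined_p(p_values: Sequence[float]) -> float:
    """Fisher's method: X = -2 * sum(ln p_i) follows chi-square with 2k d.f.

    For even degrees of freedom the survival function is
    exp(-X/2) * sum_{j=0}^{k-1} (X/2)^j / j!.
    """
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)  # equals X / 2
    term, total = 1.0, 1.0
    for j in range(1, len(p_values)):
        term *= half_x / j
        total += term
    return min(1.0, math.exp(-half_x) * total)


def ctdna_detection_index(
    snv_p: Optional[float] = None,
    indel_p: Optional[float] = None,
    fusion_in_plasma: bool = False,
) -> Optional[float]:
    """Hypothetical helper following the prioritization described in the text."""
    if fusion_in_plasma:
        return 0.0  # a recovered fusion trumps other types (P value ~0)
    if snv_p is not None and indel_p is not None:
        if snv_p < 0.1 and indel_p < 0.1:
            return fisher_combined_p([snv_p, indel_p])
        return snv_p  # SNVs are prioritized otherwise
    return snv_p if snv_p is not None else indel_p


if __name__ == "__main__":
    index = ctdna_detection_index(snv_p=0.05, indel_p=0.05)
    print(index, index is not None and index <= 0.05)
```

In this sketch, two reporters that are each only borderline (P = 0.05) combine to an index below the 0.05 threshold, which is the intended benefit of integrating evidence across reporter types.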

42. 41. 40. 39. 38. 37. 36. GraphPad using 6. Prism performed were estimates significance and analyses ROC determine true classification) categories ( to data insufficient (i.e., or cancer-unknown cured) was patient (i.e., negative cancer- body), patient’s the in present was cancer (i.e., cancer-positive into samples patient grouped and ourselves we ‘unblinded’ specificity, and sitivity sen To calculate pairs). 520 or samples, plasma 40 across reporters of somatic sets patient-specific (13 samples DNA plasma deidentified of grid entire the across index detection ctDNA our We applied etc. then treatment, point, 43. Additional details are presented in the the in presented are details Additional
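The decision rule for the ctDNA detection index and the ROC-based choice of threshold can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names (`fisher_combine_two`, `detection_index`, `closest_to_perfect_cutoff`) and the convention of passing `None` for an absent reporter type are hypothetical. The per-type P values are taken as given inputs, and a fusion recovered in plasma is represented by an index of 0, standing in for the "~0" stated in the text. For two P values, Fisher's method has the closed form q(1 − ln q) with q = p1·p2, because the combined statistic follows a chi-square distribution with 4 degrees of freedom.

```python
import math
from typing import Optional, Sequence

SIGNIFICANCE_CUTOFF = 0.05  # threshold stated in the text


def fisher_combine_two(p1: float, p2: float) -> float:
    """Fisher's method for two independent P values.

    -2*(ln p1 + ln p2) is chi-square distributed with 4 degrees of freedom;
    its upper tail simplifies to q*(1 - ln q), where q = p1*p2.
    """
    q = p1 * p2
    if q <= 0.0:  # guard against log(0)
        return 0.0
    return q * (1.0 - math.log(q))


def detection_index(snv_p: Optional[float] = None,
                    indel_p: Optional[float] = None,
                    fusion_in_plasma: bool = False) -> Optional[float]:
    """Integrate per-type P values following the rule described in the text.

    None means that reporter type is absent (hypothetical convention).
    Returns None when no reporter type is available, a case the text
    does not cover.
    """
    if fusion_in_plasma:
        return 0.0  # a recovered fusion trumps all other mutation types
    if snv_p is None:
        return indel_p  # single reporter type (or none at all)
    if indel_p is None:
        return snv_p
    if snv_p < 0.1 and indel_p < 0.1:
        return fisher_combine_two(snv_p, indel_p)
    return snv_p  # otherwise SNVs are prioritized


def closest_to_perfect_cutoff(indices: Sequence[float],
                              is_cancer: Sequence[bool],
                              candidates: Sequence[float]) -> float:
    """Return the candidate cutoff whose ROC point is nearest to (FPR=0, TPR=1).

    A sample is called positive when its index is <= cutoff. Assumes at
    least one cancer-positive and one cancer-negative sample.
    """
    n_pos = sum(is_cancer)
    n_neg = len(is_cancer) - n_pos
    best_cutoff, best_dist = candidates[0], float("inf")
    for c in candidates:
        tp = sum(1 for s, y in zip(indices, is_cancer) if y and s <= c)
        fp = sum(1 for s, y in zip(indices, is_cancer) if not y and s <= c)
        dist = math.hypot(1.0 - tp / n_pos, fp / n_neg)
        if dist < best_dist:
            best_cutoff, best_dist = c, dist
    return best_cutoff
```

A sample would then be called ctDNA-positive when `detection_index(...)` returns a value at or below `SIGNIFICANCE_CUTOFF`. Because SNVs are prioritized, `detection_index(snv_p=0.2, indel_p=0.01)` returns the SNV value 0.2 rather than a combined value.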

36. Quail, M.A. et al. Optimal enzymes for amplifying sequencing libraries. Nat. Methods 9, 10–11 (2012).
37. Oyola, S.O. et al. Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes. BMC Genomics 13, 1 (2012).
38. Fisher, S. et al. A scalable, fully automated process for construction of sequence-ready human exome targeted capture libraries. Genome Biol. 12, R1 (2011).
39. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
40. Li, H. et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics 25, 2078–2079 (2009).
41. Quinlan, A.R. & Hall, I.M. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
42. Koboldt, D.C. et al. VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. Genome Res. 22, 568–576 (2012).
43. Fisher, R.A. Statistical Methods for Research Workers (Oliver and Boyd, 1925).