Supporting Information
Gupta et al. 10.1073/pnas.1418577112 (www.pnas.org/cgi/content/short/1418577112)

SI Materials and Methods

Case History. A 58-y-old male patient presented in the clinic with renal failure and hypercalcemia in April 2006 and was diagnosed with ISS Stage IIIb MM, with multiple lytic bone lesions and 63% plasma cells in bone marrow biopsy. The patient was initially treated with bortezomib/dexamethasone, followed by tandem autologous stem cell transplants in October 2006 and January 2007. However, the patient relapsed in January 2008 with karyotypic abnormalities, including hypoploidy and del(17p), and received salvage therapy with lenalidomide/dexamethasone, followed by bortezomib/dexamethasone. Bone marrow sample from August 2008 (MM-S sample) showed effacement of marrow cavity with plasma cells and multiple new karyotypic abnormalities, including structural losses of chromosomes 1, 2, 7, 13, 14, and 17. At this time, the patient achieved very good partial response (VGPR) to cyclophosphamide but relapsed again in October 2009, with 90% plasma cells in bone marrow sample (MM-R sample). The patient died of progressive disease in February 2010, ∼4 y after first diagnosis.

Study Design, Cell Samples, and DNA Preparation for Optical Mapping and DNA Sequencing. From bone marrow aspirates collected from the patient in August 2008 (MM-S; patient was sensitive to subsequent treatments) and October 2009 (MM-R; patient was refractory to all subsequent treatments), CD138(+) cells were isolated using the autoMACS Pro Separation System (Miltenyi Biotec) as described previously (1). Mesenchymal stromal cells, cultured from the MM-R sample (passage 3), were used as the matched normal sample.

For optical mapping, high molecular weight genomic DNA was extracted using a gentle liquid lysis protocol. Briefly, cell samples were thawed on ice, washed with 1 mL of ice-cold Dulbecco’s phosphate-buffered saline (DPBS) (Gibco) twice and resuspended in 1× TE (10 mM Tris·HCl, 1 mM EDTA, pH 8.0) with 34 pg/μL Lambda DashII DNA (Stratagene) and 0.5 mg/mL Proteinase K (Bioline USA), at a concentration of 20–200 cells per μL. The cell suspension was distributed into 200 μL aliquots, incubated at 72 °C for 10 min, and then left overnight at 25 °C and finally stored at 4 °C for further use.

For whole genome sequencing, genomic DNA was isolated from frozen cell samples using the DNA Isolation Kit for Cells and Tissues (Roche) following the manufacturer’s protocol. Paired-end whole genome sequencing was done by Beckman Coulter Genomics on an Illumina HiSeq 2000 platform.

Optical Mapping.

Cleaning optical mapping surfaces. The 22 × 22 mm cover glass slips (Fisherfinest; Fisher Scientific) were mounted in a custom polytetrafluoroethylene (PTFE) rack in a sealed reaction vessel. They were cleaned by heating at 70 °C with Nano-strip (Cyantek) for 1 h, followed by multiple rinses with high purity water, and then heating at 104 °C with 12.1 N hydrochloric acid (Fisher Scientific) for 6 h, and a final set of rinses with high purity water. The cleaned surfaces were stored in absolute ethanol (Pharmco-AAPER) until use.

Derivatizing optical mapping surfaces. The cleaned glass coverslips were loaded on a custom PTFE rack and incubated in 250 mL of ddH2O with 90 μL of N-trimethoxylsilylpropyl-N,N,N-trimethylammonium chloride (50% solution in methanol) and 10 μL of vinyltrimethoxysilane (as per Gelest Inc.) with light agitation at 65 °C for 17.5 h. They were rinsed twice with distilled water, and then twice with absolute ethanol and finally stored for use in absolute ethanol. This process imparts positive charge to the cleaned surfaces, which helps in immobilizing DNA in later steps (2).

DNA deposition and digestion. The optical mapping surfaces were taken out of ethanol incubation and left to dry under ambient conditions. Once dried, high molecular weight genomic DNA was elongated on optical mapping surfaces via capillary flow in a microfluidic device made of poly-dimethylsiloxane with microchannels that were 10 mm long, 100 μm wide, and 3 μm high (3). DNA is immobilized on positively charged glass surfaces via electrostatic interactions and elongated via flow-based forces in microchannels (3). A 3.3% polyacrylamide overlay [111 μL of 29:1 acrylamide:bisacrylamide solution (Amresco), 889 μL of ddH2O, 7.5 μL of 10% ammonium persulfate (APS), 0.8 μL of tetramethylethylenediamine (TEMED)] was then deposited on the arrays of elongated DNA molecules (4). The polyacrylamide overlay prevents desorption of small fragments of DNA, which are formed after restriction digestion. The surfaces were then rinsed twice with 1 mL of TE, equilibrated with 200 μL of restriction digestion buffer [1× New England Biolabs (NEB) buffer 3], and finally incubated in a humidified chamber with 20 U of BamHI (New England Biolabs) in 200 μL, 1× NEB buffer 3 solution containing 0.02% Triton X-100 (Roche) at 37 °C. The digestion time varied from 30 min to 2 h based on the quality of data obtained in preliminary data collection. After digestion was complete, the BamHI solution was washed off with two rinses of 1× TE. Restriction fragments from DNA molecules were then stained with DNA intercalating dye YOYO-1 (12 μL of 0.5 μM in β-mercaptoethanol/TE; Life Technologies). The restriction enzyme (BamHI) introduces double-stranded DNA cuts at cognate restriction sites on DNA molecules, which leads to small gaps in elongated molecules (∼1 μm wide) because of coil relaxation at ends of newly formed DNA fragments. The DNA fragments were imaged in the next step using epifluorescence microscopy.

Data collection via epifluorescence microscopy and image processing. The arrays of microchannels were imaged at optical mapping workstations, which comprise a Zeiss 135M inverted microscope (Carl Zeiss) equipped with a 63× oil immersion objective and illuminated by a 488-nm Argon ion laser (Spectra-Physics). Other components (e.g., a high speed CCD camera and shutter, stage, and focus controllers) were interconnected and controlled to yield an automated image collection and processing system (3, 5). As images along arrays of microchannels were collected, they were processed using a custom image processing software to extract single-molecule Rmaps.

Rmap coverage analysis. The hidden Markov model-based coverage analysis algorithm summarizes the Rmap alignments by a single number (midpoint) that represents locations, which are then modeled as realizations of a nonhomogeneous Poisson process. The nonhomogeneity of alignment data, which arises due to varied cognate site density across the genome (for restriction endonuclease), is accounted for by using alignment data from a normal genome, which are used to define intervals with counts that follow a negative binomial distribution. These counts are then modeled by a hidden Markov model, which incorporates spatial dependence in the data and allows more natural estimation of certain parameters. As a result, the sample genome is partitioned into high, normal, and low copy number states (6–8). The results were plotted using Circos (circos-0.64) (9) and annotated using the UCSC genome browser (genome.ucsc.edu/index.html) (10).

Iterative Rmap assembly. The computational framework for Rmap assembly has been described in detail in Teague et al. (5). Briefly, each stage of this iterative process is similar to sequence assembly and consists of three steps. First, single-molecule maps are aligned to an in silico restriction map derived from human reference sequence (NCBI Build 37) using an in-house alignment software called Software for Optical Mapping Analysis (SOMA). This step clusters similar maps to the genomic region where they align. Single maps, however, have experimental errors comprising false extra cuts, false missing cuts, and sizing issues, which are modeled with different error models (5, 6). In the second step, these map clusters are assembled by Gentig, a Bayesian maximum-likelihood assembler (11), to generate consensus restriction maps. This step addresses the experimental noise from individual maps. Finally, consensus maps are aligned back to the in silico reference map, which helps us identify differences in our samples. In the first iteration, the in silico reference maps are used as reference. In the following iterations, the consensus maps generated in previous iterations are used as reference. Based on previous work (5), we determined that eight iterations were sufficient to provide good accuracy and coverage for the human genome. Using this approach, we generated genome-wide physical assemblies for the three genomes being analyzed.

marked, indexed, and mate-fixed using picard tools (v1.84; broadinstitute.github.io/picard/). The formatted bam files were realigned around indels and base recalibrated with the GATK package (14, 15) before being used for all downstream analysis.

Somatic copy number analysis. Control-FREEC (16) was used for somatic copy number analysis using a window size of 50 kb and a step size of 10 kb.

Structural variation analysis. Read pair (BreakDancer, version 1.3.6) (17), split read (Pindel, version 0.2.4o) (18), and read depth (CNVnator, version 0.2.7) (19) based methods were used to identify structural variation in all three samples. The breakdancer-max from the BreakDancer package was used with default settings. With CNVnator, the genome was divided into 100 bp bins. It yielded an average read depth per bin of 61.649 ± 10.1042, 62.6684 ± 28.2886, and 100.998 ± 23.8661 for normal, MM-S, and MM-R samples after GC correction. Pindel was run on all three samples together using the default parameters except for -x 7, which defines the maximum event size as 524 kb. The output from these programs was analyzed and compared using custom scripts in R (20).

SNP/SNV analysis and comparison of SNVs with previous studies. The Genome Analysis ToolKit (GATK) pipeline was used for SNP
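
The iterative Rmap assembly described in this section starts from an in silico restriction map of the reference sequence. As a rough illustration of what such a map contains, the sketch below cuts a toy sequence at every BamHI site (recognition sequence GGATCC, cut after the first G) and returns the ordered fragment lengths. The function name, the toy sequence, and the choice to report lengths in base pairs are assumptions made for illustration; this is not SOMA or the pipeline used in the study.

```python
def in_silico_digest(sequence, site="GGATCC", cut_offset=1):
    """Return ordered fragment lengths (bp) after cutting at every site.

    cut_offset is the position inside the recognition site where the top
    strand is cut; BamHI cuts G^GATCC, so the offset is 1.
    """
    seq = sequence.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    # Fragment boundaries are the sequence ends plus every cut position.
    boundaries = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


# Hypothetical 26-bp toy sequence containing two BamHI sites.
toy = "AAAA" + "GGATCC" + "TTTTTTTT" + "GGATCC" + "CC"
fragments = in_silico_digest(toy)
print(fragments)
```

An ordered list of fragment sizes like this is the kind of object that single-molecule maps are compared against; real optical maps additionally carry sizing error and missing or extra cuts, which this sketch does not model.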
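
The Rmap coverage analysis described in this section models per-interval alignment counts with a hidden Markov model over low, normal, and high copy number states. The sketch below shows one way such a partition could be decoded, under loudly stated assumptions: Viterbi decoding is used here as a stand-in (the source does not say how states are inferred), emissions are negative binomial with a fixed dispersion `r`, the state means are 0.5×, 1×, and 1.5× the count expected from the normal genome, and the self-transition probability `stay` is fixed. None of these values come from the paper, whose algorithm (refs. 6–8) estimates its own parameters.

```python
import math

STATES = ("low", "normal", "high")
# Assumed copy ratios relative to the normal-genome expectation.
RATIO = {"low": 0.5, "normal": 1.0, "high": 1.5}


def nb_logpmf(k, mean, r):
    """Log-probability of count k under a negative binomial (mean, dispersion r)."""
    p = r / (r + mean)
    return (math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
            + r * math.log(p) + k * math.log(1.0 - p))


def viterbi_copy_states(counts, expected, r=50.0, stay=0.9):
    """Most likely state path for interval counts given expected normal counts."""
    switch = (1.0 - stay) / (len(STATES) - 1)
    log_t = {(a, b): math.log(stay if a == b else switch)
             for a in STATES for b in STATES}

    def emit(state, i):
        return nb_logpmf(counts[i], RATIO[state] * expected[i], r)

    # Uniform prior over the three states for the first interval.
    score = {s: math.log(1.0 / len(STATES)) + emit(s, 0) for s in STATES}
    back = []
    for i in range(1, len(counts)):
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda q: score[q] + log_t[(q, s)])
            pointers[s] = prev
            new_score[s] = score[prev] + log_t[(prev, s)] + emit(s, i)
        back.append(pointers)
        score = new_score

    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]


# Hypothetical counts: blocks of normal, low, normal, high, normal coverage.
counts = [98, 103, 100, 97, 102, 101, 52, 48, 50, 47, 53, 51,
          99, 104, 96, 100, 102, 98, 148, 153, 150, 146, 155, 149,
          101, 97, 100, 103, 99, 102]
expected = [100] * len(counts)
states = viterbi_copy_states(counts, expected)
```

The transition penalty is what supplies the spatial dependence mentioned in the text: a single noisy interval is not enough to trigger a state change, whereas a sustained run of low or high counts is.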
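
The somatic copy number analysis in this section uses a 50 kb window with a 10 kb step, so consecutive windows overlap by 40 kb. The sketch below illustrates only that windowing arithmetic on toy read positions. The function names are hypothetical, and clipping the final windows at the chromosome end is an assumption; Control-FREEC's own handling of chromosome ends and its normalization steps are not reproduced here.

```python
def sliding_windows(chrom_length, window=50_000, step=10_000):
    """Yield half-open (start, end) windows; trailing windows are clipped."""
    start = 0
    while start < chrom_length:
        yield start, min(start + window, chrom_length)
        start += step


def window_counts(read_starts, chrom_length, window=50_000, step=10_000):
    """Count read start positions falling in each sliding window."""
    return [
        (start, end, sum(1 for pos in read_starts if start <= pos < end))
        for start, end in sliding_windows(chrom_length, window, step)
    ]


# Hypothetical read start positions on a 100 kb toy chromosome.
toy_reads = [5_000, 12_000, 49_999, 50_000, 95_000]
counts_per_window = window_counts(toy_reads, chrom_length=100_000)
```

Because each read can fall in up to five overlapping windows, neighboring window counts are correlated, which is worth keeping in mind when interpreting window-level copy number profiles.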