Alignment Alignment Read Ends Pairing Valid Fragment


Supplementary Figures and Tables

[Supplementary Figure 1: pipeline schematic. Panel labels: Alignment (Hi-C reads, ligation site, End 1, End 2, >25 bps, chimeric reads, uni-reads, multi-reads); Read ends pairing (uniquely mapping read pairs, multi-mapping read pairs, unmapped reads, singleton reads, low quality multi-reads, >99); Valid fragment filtering (valid read pairs: 50 bps < d1 + d2 < 800 bps, >25k bps; short-range contacts: <25k bps; invalid alignments: d1 + d2 > 800 bps, d1 + d2 < 50 bps; dangling end, self circle, religation).]

Supplementary Figure 1 mHi-C pipeline (Alignment - Read end pairing - Valid fragment filtering).
1. Read ends are aligned to the reference genome separately, allowing multi-reads, and chimeric reads are rescued.
2. Read ends are paired by their read query names. Multi-reads form more than one read pair with the same read query name. Read ends that fail to align form either unmapped reads or singleton reads and are discarded. Multi-reads with ends aligning to more than 99 positions are regarded as low quality multi-reads and are excluded from the downstream analysis.
3. Validation checking filters out short-range contacts and alignments far away from restriction enzyme recognition sites. Contacts residing within the same restriction fragment, i.e., dangling end or self circle, as well as contacts between adjacent fragments (religation), are discarded.
The above three processing steps are applied to each read independently, enabling parallel implementation.
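The filtering rules in step 3 can be illustrated with a minimal Python sketch. This is not the mHi-C implementation: the `ReadPair` record, the `classify` function, and the category names are hypothetical, the thresholds are taken from the figure labels (50 bps < d1 + d2 < 800 bps, 25k bps), and the sketch assumes that the short-range and same/adjacent-fragment checks apply only to pairs on the same chromosome and that fragments are indexed consecutively per chromosome. Dangling ends and self circles are not distinguished here, because that would require strand information that the caption does not spell out.

```python
from dataclasses import dataclass

# Thresholds read off the figure labels; treat them as illustrative defaults.
MIN_SUM, MAX_SUM = 50, 800     # bounds on d1 + d2 (bps)
MIN_CIS_DISTANCE = 25_000      # pairs closer than this are short-range contacts


@dataclass
class ReadPair:
    """Hypothetical record for one aligned read pair."""
    chrom1: str
    pos1: int
    frag1: int  # index of the restriction fragment holding end 1 (per chromosome)
    d1: int     # distance from end 1 to its restriction site
    chrom2: str
    pos2: int
    frag2: int
    d2: int


def classify(pair: ReadPair) -> str:
    """Assign a read pair to one of the categories named in the caption."""
    same_chrom = pair.chrom1 == pair.chrom2
    if same_chrom and pair.frag1 == pair.frag2:
        return "same_fragment"        # dangling end or self circle
    if same_chrom and abs(pair.frag1 - pair.frag2) == 1:
        return "religation"           # adjacent restriction fragments
    if not (MIN_SUM < pair.d1 + pair.d2 < MAX_SUM):
        return "invalid_alignment"    # too far from (or too close to) the sites
    if same_chrom and abs(pair.pos1 - pair.pos2) < MIN_CIS_DISTANCE:
        return "short_range"
    return "valid"
```

Because `classify` looks at one read pair at a time and keeps no shared state, it could be mapped over chunks of reads in parallel, which is consistent with the caption's remark about parallel implementation.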

[Supplementary Figure 2: pipeline schematic. Panel labels: Valid fragment filtering (valid read pairs: 50 bps < d1 + d2 < 800 bps, >25k bps); Duplicate removal (uni-reads, multi-reads; A: 1 mismatch, 2 mismatches; B: multi-reads); Genome binning (40Kb bins; uni-bin pairs; multi-bin pairs; multi-reads reduced to uni-bin pairs); mHi-C (Prob = 0.9 versus 0.1; multi-reads reduced to multi-bin pairs, then to uni-bin pairs); Contact matrix (bin j, bin k; 3 contact counts).]

Supplementary Figure 2 mHi-C pipeline (Duplicate removal - Genome binning - mHi-C).
4. PCR duplicates are removed such that when a uni-read and a multi-read have the same alignment position and strand direction, the uni-read is kept. When multi-reads overlap with other multi-reads, the ones with alphabetically larger IDs are removed.
5. The genome is split into fixed-size non-overlapping intervals, i.e., bins, or into fixed numbers of restriction fragments; as a result, read alignment position pairs are reduced to bin pairs. Multi-reads whose candidate alignment positions all fall into the same bin are reduced to uni-bin pairs.
6. The mHi-C model estimates an allocation probability for each potential contact and enables filtering of contacts by thresholding this allocation probability.
7. Uni-reads and thresholded multi-reads are used to construct the contact matrix.
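Steps 5 to 7 can be sketched in Python as follows. This is only an illustration under stated assumptions, not the mHi-C code: `to_bin_pair` and `build_contact_counts` are hypothetical names, the 40Kb bin size comes from the figure, the allocation probabilities are supplied by the caller as a stand-in for the model output of step 6, and the default threshold of 0.5 is an arbitrary choice for the example.

```python
from collections import Counter

BIN_SIZE = 40_000  # 40Kb bins, as drawn in the figure


def to_bin_pair(chrom1, pos1, chrom2, pos2, bin_size=BIN_SIZE):
    """Reduce an alignment position pair to an order-independent bin pair."""
    a = (chrom1, pos1 // bin_size)
    b = (chrom2, pos2 // bin_size)
    return tuple(sorted((a, b)))


def build_contact_counts(uni_reads, multi_reads, threshold=0.5):
    """Count contacts per bin pair from uni-reads and thresholded multi-reads.

    uni_reads:   iterable of (chrom1, pos1, chrom2, pos2) alignments.
    multi_reads: iterable of (alignments, probs), where `alignments` lists the
                 candidate alignments of one multi-read and `probs` maps each
                 candidate bin pair to an allocation probability. The
                 probabilities are placeholders for the model estimates.
    """
    counts = Counter()
    for alignment in uni_reads:
        counts[to_bin_pair(*alignment)] += 1
    for alignments, probs in multi_reads:
        bin_pairs = {to_bin_pair(*a) for a in alignments}
        if len(bin_pairs) == 1:
            # All candidates share one bin pair: treat as a uni-bin pair.
            counts[next(iter(bin_pairs))] += 1
            continue
        # Otherwise keep the most probable bin pair only if it passes the threshold.
        best = max(bin_pairs, key=lambda bp: probs.get(bp, 0.0))
        if probs.get(best, 0.0) > threshold:
            counts[best] += 1
    return counts
```

For example, a multi-read whose two candidate bin pairs carry probabilities 0.9 and 0.1 (the values shown in the figure) contributes one count to the first bin pair, while a multi-read split 0.5 and 0.5 is dropped at a threshold of 0.5.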

[Supplementary Figure 3: bar charts for IMR90 replicates rep1 to rep6. Panel a: number of reads (y-axis). Panel b: percentage of multi-reads (y-axis), with per-bar labels ranging from 9.6% to 12.24%. Bar groups: read end 1 without chimeric reads, read end 1 chimeric reads, read end 2 without chimeric reads, read end 2 chimeric reads.]

Supplementary Figure 3 Multi-reads due to chimeric reads (IMR90). Both chimeric reads and multi-reads require extra processing to rescue. a. Numbers of read ends with and without chimeric rescue are displayed along with what percentage of these sets are multi-reads. Darker shades on the bars represent multi-reads. Multi-reads constitute a larger percentage of the usable reads compared to chimeric reads. b. Same information as (a) but displayed in terms of percentages. The actual percentages of multi-reads (y-axis) for each category are also displayed on top of each bar. As expected, chimeric reads lead to larger percentages of multi-reads.

[Supplementary Figure 4: panels a to e for P. falciparum samples (Rings, Trophozoites Rep1, Trophozoites Rep2, Schizonts). a, b: number of reads and percentage of multi-mapping reads per sample, for read ends 1 and 2 with and without chimeric reads. c: Uni&Multi-mapping bin-pair contact count versus uni-mapping bin-pair contact count. d: change in the number of significant contacts (×100%) versus multi-mapping reads posterior probability threshold (0.5 to 0.9), in sub-panels labeled 0.001, 0.01 and 0.05; gain and loss. e: Uni-setting versus Uni&Multi-setting, with categories Uni (FDR 1%), Uni.Specific (Uni&Multi FDR 1%), Uni.Specific (Uni&Multi FDR 10%), Uni&Multi (FDR 1%), Uni&Multi.Specific (Uni FDR 1%), Uni&Multi.Specific (Uni FDR 10%).]

Supplementary Figure 4 Multi-reads due to chimeric reads and improvement in the number of significant contacts due to multi-reads (P. falciparum). a, b. Same as Supplementary Figure 3a, b, but for P. falciparum. c. mHi-C leads to improved bin coverage by the Uni&Multi-setting compared to the Uni-setting across all the P. falciparum samples. Dashed line is y = x. d. Percentage change in the numbers of significant contacts: red and blue depict gain and loss of the Uni&Multi-setting compared to the Uni-setting, respectively. e. Recovery of significant contacts identified at FDR 1% by analysis at FDR 10%. Uni&Multi.Specific (Uni FDR 10%) is the set of significant contacts that are identified at 1% FDR by the Uni&Multi-setting but remain unrecoverable by the Uni-setting even with a liberal FDR of 10%. More detailed explanation is provided in Supplementary Figure 10.

[Supplementary Figure 5: contact matrix heatmaps, Uni-setting versus Uni&Multi-setting. Panel a: Chromosome 1, 74.50 MB to 93.00 MB. Panel b: Chromosome 3, 65.20 MB to 83.70 MB.]

Supplementary Figure 5 Gaps in contact matrices are filled in after incorporating multi-reads (IMR90).