Incorporating Sequence Capture Into Library Preparation for MinION™


Incorporating sequence capture into library preparation for MinION™, GridION™ and PromethION™

Hybrid sequence capture allows users to select thousands of loci of interest simultaneously prior to sequencing, making more efficient use of the sequencing run.

Contact: [email protected] More information at: www.nanoporetech.com and publications.nanoporetech.com

Sequence capture uses complementary probes to enrich for targets of interest

Sequence capture is a technique which allows the enrichment of specific regions of interest from a genome. It is useful when:
i) the user is not interested in analysing the entire genome
ii) the genome is too large for the throughput of the sequencer
iii) the user wishes to save money and time on sequencing and analysis
iv) the regions are longer than can be amplified by PCR, or too many PCRs would be required.

Sequence capture is performed during library preparation by hybridising the library fragments to probes which are specific to the regions of interest (Fig. 1).

Fig. 1 Long-read sequence capture a) workflow overview b) multiplexed sequence capture
Labels recoverable from the figure: gDNA; fragmentation; end-prep; barcoded PCR-adapter ligation; PCR with universal primers; pooling and sequence capture; size-selection and hybridisation to biotinylated capture probes; capture of probe-template duplexes onto beads; elution and PCR; elution, amplification, adapter-ligation and sequencing; amplicon-sequencing protocol.
Panel labels: a) Randomly fragmented ~6 kb library b) Human exome report containing regions of interest

Resequencing analysis workflow for sequence-capture experiments

We have released an updated analysis workflow for sequence-capture experiments. To illustrate this, we captured the human exome using Agilent's SureSelect Human All Exon V6 panel, and generated ~2.35 Gb of 1D sequence data from a MinION run, representing ~40x average coverage. We analysed the data using the resequencing analysis workflow (Fig. 2). Following basecalling, reads are mapped to the human exome reference sequence. Individual gene information is displayed, including coverage and read-accuracy distribution at that position. Future releases of this application will support the uploading of target regions, highlight known SNPs in the target regions and allow those SNPs to be displayed with a confidence value.

Fig. 2 Analysis report of sequence-capture data for the human exome
Report panels: reads by exit status (workflow successful: 465,170 reads, 82.8%; alignment quality too low: 96,424 reads); alignments per gene; gene coverage against reference sequence position; accuracy distribution (count against accuracy, %).
Key figures: 465,170 alignments; 17,702 genes; 86.6% average accuracy.
Example gene shown: USP17L8 (NCBI gene ID 100287144, Ensembl ENSG00000231396), 350 alignments, 114 X estimated coverage.

Gene       Estimated coverage   Alignment count
USP17L8    114 X                350
USP17L7    87 X                 127
USP17L10   81 X                 105
USP17L12   73 X                 90
POM121L7   67 X                 115
CTAGE9     62 X                 90
CTAGE1     61 X                 148
FAM90A26   58 X                 197
MST1       58 X                 190
TSPY2      57 X                 94
Figure panels (read-alignment images not reproduced):
a) BRCA1, position: chr17:41569479–41650296, band: 17q21.31, 80,818 bp
b) Tyr1563Ter, wild type is C and SNP is G
c) Arg1443Gly, wild type is C and SNP is G
Genome browser view: Chromosome 1: 16,536,698–16,639,185, forward strand.
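The coverage and read-status figures quoted for the human exome run can be cross-checked with simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, the calculation assumes every sequenced base contributes to the target (the poster does not state an on-target rate or the size of the capture target), and the read counts are those shown in the Fig. 2 report.

```python
def implied_target_size(total_bases: float, mean_coverage: float) -> float:
    """Target size (in bases) implied by a yield and an average coverage.

    Assumes all sequenced bases fall on the target, which is an
    idealisation; real captures include some off-target reads.
    """
    return total_bases / mean_coverage


def success_fraction(successful_reads: int, failed_reads: int) -> float:
    """Fraction of analysed reads that completed the workflow."""
    return successful_reads / (successful_reads + failed_reads)


# Values quoted in the poster text and in the Fig. 2 report.
total_bases = 2.35e9       # ~2.35 Gb of 1D sequence data
mean_coverage = 40         # ~40x average coverage
successful_reads = 465_170 # "Workflow successful"
failed_reads = 96_424      # "Alignment quality too low"

target_mb = implied_target_size(total_bases, mean_coverage) / 1e6
passed_pct = success_fraction(successful_reads, failed_reads) * 100

print(f"Implied target size: {target_mb:.2f} Mb")
print(f"Reads passing the workflow: {passed_pct:.1f}%")
```

Under these assumptions the two read counts reproduce the 82.8% success rate shown in the report, and the quoted yield and coverage imply a target of a little under 60 Mb.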