Structural Variation Discovery: the Easy, the Hard

Total Page:16

File Type:pdf, Size:1020Kb

Structural Variation Discovery: the Easy, the Hard STRUCTURAL VARIATION DISCOVERY: THE EASY, THE HARD AND THE UGLY by Fereydoun Hormozdiari B.Sc., Sharif University of Technology, 2004 M.Sc., Simon Fraser University, 2007 A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY in the School of Computing Science Faculty of Applied Science c Fereydoun Hormozdiari 2011 SIMON FRASER UNIVERSITY Fall 2011 All rights reserved. However, in accordance with the Copyright Act of Canada, this work may be reproduced without authorization under the conditions for Fair Dealing. Therefore, limited reproduction of this work for the purposes of private study, research, criticism, review and news reporting is likely to be in accordance with the law, particularly if cited appropriately. APPROVAL Name: Fereydoun Hormozdiari Degree: Doctor of Philosophy Title of thesis: Structural Variation Discovery: the easy, the hard and the ugly Examining Committee: Dr. Ramesh Krishnamurti Chair Dr. S. Cenk Sahinalp, Senior Supervisor Computing Science, Simon Fraser University Dr. Evan E. Eichler, Supervisor Genome Sciences, University of Washington Dr. Artem Cherkasov, Supervisor Urologic Sciences, University of British Columbia Dr. Inanc Birol, Supervisor Bioinformatics group leader, GSC, BC Cancer Agency Dr. Fiona Brinkman, Internal Examiner Molecular Biology and Chemistry, Simon Fraser Univer- sity Dr. Serafim Batzoglou, External Examiner Computer Science, Stanford University Date Approved: August 22nd, 2011 ii Partial Copyright Licence Abstract Comparison of human genomes shows that along with single nucleotide polymorphisms and small indels, larger structural variants (SVs) are common. Recent studies even suggest that more base pairs are altered as a result of structural variations (including copy number variations) than as a result of single nucleotide variations or small indels. It is also been known that structural variations can cause a loss or gain of functionality and can have phenotypic effects. Recently, with the advent of high-throughput sequencing technologies, the field of genomics has been revolutionized. The realization of high-throughput sequencing platforms now makes it feasible to detect the full spectrum of genomic variation (including SVs) among many individual genomes, including cancer patients and others suffering from diseases of genomic origin. In addition, high- throughput sequencing technologies make it possible to extend the scope of structural variation studies to a point previously unimaginable as exemplified with the 1000 Genomes Project. In this dissertation we consider the structural variation discovery problem using high-throughput sequencing technologies. We provide combinatorial formulations for this problem under a maxi- mum parsimony assumption, and design approximation algorithms for them. We also extend our proposed algorithms to consider conflicts between potential structural variations and resolve them. It should be noted that our algorithms are able to detect most of the well-known structural variation types including small insertions, deletions, inversions, and transpositions. Finally we extend our algorithms to allow simultaneous discovery of structural variations in multiple genomes and thus improve the final comparative results between different donors. iii To My Mom and Dad And All Other Cancer Survivors iv “If they Release All Political Prisoners, I will Forgive My Son’s Murderers” — Sohrab Arabi’s Mother, ONE OF THE MARTYRS OF IRAN GREEN MOVEMENT , June 2009 v Acknowledgments First and foremost I offer my sincerest and utmost gratitude to my senior supervisor, Dr Cenk Sahi- nalp, who has supported me throughout my graduate studies with his encouragement, guidance and resourcefulness. In addition to technical skills of computer science, I have learned the requirements of being a good researcher from him. I thank him one more time for all his help. I would like to give my regards and appreciations to Dr. Evan Eichler, his experience in genomics was a crucial factor in the success of this thesis. My visit to his lab greatly influenced the direction of my PhD research. I would also like to thank other members of my committee Dr. Inanc Birol, Dr. Artem Cherkasov and Dr. Fiona Brinkman for their help in my graduate studies. I would like to sincerely thank Dr. Serafim Batzoglou, who kindly accepted to be my external examiner. The ideas and comments, that he shared during his visit, have provided me with additional directions for extending my research. I would also like to thank my other collaborators Can Alkan, Iman Hajirasouliha, Phuong Dao, Raheleh Salari, Faraz Hach, Deniz Yorukoglu, and Farhad Hormozdiari. I benefited from these collaborations, and hope to continue working with them. I would also like to thank all my dear friends, Laleh Samii, Hamed Yaghoubi, Sara Ejtemaee, Shahab Tasharrofi, Mohammad Tayebi, Navid Imani, Alireza Hojjati, Azadeh Akhtari, Pashootan Vaezi, Ali Khalili, Amir Hedayaty, Hos- sein Jowhari, and Kamyar Khodamoradi. Last, but not least, I thank my family for the the support they gave me in all these years. Their motivation was a huge driving force for me. Fereydoun Hormozdiari vi Contents Approval ii Abstract iii Dedication iv Quotation v Acknowledgments vi Contents vii List of Tables xi List of Figures xiii 1 Introduction 1 1.1 Genome........................................ 1 1.2 GenomeVariation................................. 2 1.2.1 Non Sequencing-Based Variation Discovery . ........ 2 1.3 High throughput sequencing . ..... 4 1.4 De novo assembly and comparison . ..... 5 1.4.1 Denovoassembly .............................. 6 1.4.2 Edit distance and block edit distance . ....... 9 1.5 Mapping based strategy . 12 1.5.1 MappingAlgorithms .. ...... ..... ...... ...... .... 15 1.5.2 Read-PairAnalysis . 17 vii 1.5.3 Read-DepthAnalysis. 22 1.5.4 Split-ReadAnalysis . 22 1.6 ThesisOverview .................................. 23 2 Optimal pooling for genome re-sequencing 27 2.1 Problem definition and general algorithmic approach . ............. 29 2.1.1 Poolingproblem ............................... 30 2.2 Methods........................................ 32 Ci,b 2.2.1 The pooling problem under f(Ci,b)= 2 ................ 33 Ci,b 2.2.2 The balanced pooling problem under f(Ci,b)= 2 ........... 34 2.2.3 The pooling problem under f(Ci,b)= Ci,b − 1 ............... 35 2.2.4 The balanced pooling problem under f(Ci,b)= Ci,b − 1 .......... 38 2.3 Results and discussion . 40 2.3.1 Pooling experiments with LSMnC/LSMnS algorithms . ........ 41 2.3.2 Pooling experiments with GAHP/GABHP algorithms . ........ 44 2.4 OtherApplications ............................... 47 2.4.1 Fosmid clone resequencing . 48 2.4.2 Exomesequencing .............................. 48 3 Structural Variation Discovery Via Maximum Parsimony 49 3.1 Insertion, Deletion, Inversion and Transposition Events............... 51 3.1.1 Notations and Definition . 51 3.1.2 Structural Variation Detection based on Maximum Parsimony . 54 3.1.3 O(log n) approximation for MPSV . 55 3.1.4 MaximalValidClusters . 57 3.2 Novel Insertion Discovery . ..... 66 3.2.1 NovelSeq pipeline............................... 67 3.3 Maximum Parsimony Structural Variation with Conflict Resolution . 72 3.3.1 Conflicting SV clusters in haploid and diploid genome sequences . 73 3.3.2 Formal definition of the MPSV-CR problem . ..... 75 3.3.3 Computational complexity of MPSV-CR . 76 3.3.4 An efficient solution to the MPSV-CR problem . ...... 78 3.4 ExperimentalResults . 79 3.4.1 BestMappingvsAllMapping . 79 viii 3.4.2 Mobile Element Insertion Discovery . ...... 80 3.4.3 Deletion and Inversion Discovery on NA18507 . ....... 82 3.4.4 Novel Insertion Discovery on NA18507 . ..... 87 3.4.5 Structural Variation Prediction with VariationHunter-CR .......... 90 4 Alu transposition map on human genomes 92 4.1 Discovery and Validation . ..... 93 4.2 Familial Transmission . 100 4.2.1 Genotyping.................................. 102 4.3 Genomecomparisons ............................... 103 5 Simultaneous Structural Variation Discovery 106 5.1 Introduction.................................... 107 5.2 Methods........................................ 110 5.2.1 Simultaneous Structural Variation Discovery among Multiple Genomes . 110 5.2.2 The Algorithmic Formulation of the SSV-MG Problem . ........ 111 5.2.3 Complexity of SSV-MG problem . 113 5.2.4 An simple approximation algorithm for SSV-MG problem ......... 115 5.2.5 A maximum flow-based update for Red-Black Assignment . ........ 116 5.2.6 An O(1 + ωmax ) approximation algorithm for limited read mapping loci . 117 ωmin 5.2.7 Efficient heuristic methods for the SSV-MG problem . ......... 119 5.3 Results......................................... 121 5.3.1 Mobile element insertions . 122 5.3.2 Deletions ................................... 123 6 Conclusion and Discussion 127 6.1 FutureWorks ..................................... 129 Appendices A Conflicting Valid Clusters for Haploid Genome 131 B Alu insertions in genes and exons 134 C Alu Insertion Genotyping 137 ix Bibliography 139 x List of Tables 1.1 In this table a summarization of different methods used for structural variation dis- covery based on the read pair, read depth and split read analysis is provided. 14 3.1 Summary of mobile element insertion prediction results in the Venter genome. We show the precision and recall rates of our mobile element (Alu, NCAI, and SVA) insertion discovery. We compare our mobile element insertion predictions with both [154] and the HuRef genome assembly[87]. The results demonstrate
Recommended publications
  • Michael Antoni
    Stress Management Effects on Biological and Molecular Pathways in Women Treated for Breast Cancer APS/NCI Conference on “Toward Precision Cancer Care: Biobehavioral Contributions to the Exposome” Chicago IL Michael H. Antoni, Ph.D. Department of Psychology Div of Health Psychology Director, Center for Psycho- Oncology Director, Cancer Prevention and Control Research, Sylvester Cancer Center University of Miami E.g., Stress Management for Women with Breast Cancer • Rationale – Breast Cancer (BCa) is a stressor – Challenges of surgery and adjuvant tx – Patient assets can facilitate adjustment – Cognitive Behavioral Stress Management (CBSM) can fortify these assets in women with BCa – Improving Psychosocial Adaptation may Affect Physiological Adaptation Theoretical Model for CBSM during CA Tx Negative Adapt Positive Adapt Physical Funct Awareness C ∆ Cog Appraisals B Physical Emot Processing + Health Beh. S Health M Relaxation QOL Social Support Normalize endocrine and immune regulation Antoni (2003). Stress Management for Women with Breast Cancer. American Psychological Association. Assessment Time Points T1 T2 T3 T4 B SMART-10 wks. 2-8 wks post 3 months post 6 months post surgery One year post surgery Topics of CBSM Week Relaxation Stress Management 1 PMR 7 Stress & Awareness 2 PMR 4/D.B. Stress & Awareness/Stress Appraisals 3 D.B./PMR Disease-Specific, Automatic Thoughts 4 Autogenics Auto. Thghts, Distortions, Thght Rep. 5 D.B./Visualiz. Cognitive Restructuring 6 Sunlight Med. Effective Coping I 7 Color Meditation Effective Coping
    [Show full text]
  • Interactions Between the Parasite Philasterides Dicentrarchi and the Immune System of the Turbot Scophthalmus Maximus.A Transcriptomic Analysis
    biology Article Interactions between the Parasite Philasterides dicentrarchi and the Immune System of the Turbot Scophthalmus maximus.A Transcriptomic Analysis Alejandra Valle 1 , José Manuel Leiro 2 , Patricia Pereiro 3 , Antonio Figueras 3 , Beatriz Novoa 3, Ron P. H. Dirks 4 and Jesús Lamas 1,* 1 Department of Fundamental Biology, Institute of Aquaculture, Campus Vida, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain; [email protected] 2 Department of Microbiology and Parasitology, Laboratory of Parasitology, Institute of Research on Chemical and Biological Analysis, Campus Vida, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain; [email protected] 3 Institute of Marine Research, Consejo Superior de Investigaciones Científicas-CSIC, 36208 Vigo, Spain; [email protected] (P.P.); antoniofi[email protected] (A.F.); [email protected] (B.N.) 4 Future Genomics Technologies, Leiden BioScience Park, 2333 BE Leiden, The Netherlands; [email protected] * Correspondence: [email protected]; Tel.: +34-88-181-6951; Fax: +34-88-159-6904 Received: 4 September 2020; Accepted: 14 October 2020; Published: 15 October 2020 Simple Summary: Philasterides dicentrarchi is a free-living ciliate that causes high mortality in marine cultured fish, particularly flatfish, and in fish kept in aquaria. At present, there is still no clear picture of what makes this ciliate a fish pathogen and what makes fish resistant to this ciliate. In the present study, we used transcriptomic techniques to evaluate the interactions between P. dicentrarchi and turbot leucocytes during the early stages of infection. The findings enabled us to identify some parasite genes/proteins that may be involved in virulence and host resistance, some of which may be good candidates for inclusion in fish vaccines.
    [Show full text]
  • Classifying Transport Proteins Using Profile Hidden Markov Models And
    Classifying Transport Proteins Using Profile Hidden Markov Models and Specificity Determining Sites Qing Ye A Thesis in The Department of Computer Science and Software Engineering Presented in Partial Fulfillment of the Requirements for the Degree of Master of Computer Science (MCompSc) at Concordia University Montréal, Québec, Canada April 2019 ⃝c Qing Ye, 2019 CONCORDIA UNIVERSITY School of Graduate Studies This is to certify that the thesis prepared By: Qing Ye Entitled: Classifying Transport Proteins Using Profile Hidden Markov Models and Specificity Determining Sites and submitted in partial fulfillment of the requirements for the degree of Master of Computer Science (MCompSc) complies with the regulations of this University and meets the accepted standards with respect to originality and quality. Signed by the Final Examining Committee: Chair Dr. T.-H. Chen Examiner Dr. T. Glatard Examiner Dr. A. Krzyzak Supervisor Dr. G. Butler Approved by Martin D. Pugh, Chair Department of Computer Science and Software Engineering 2019 Amir Asif, Dean Faculty of Engineering and Computer Science Abstract Classifying Transport Proteins Using Profile Hidden Markov Models and Specificity Determining Sites Qing Ye This thesis develops methods to classifiy the substrates transported across a membrane by a given transmembrane protein. Our methods use tools that predict specificity determining sites (SDS) after computing a multiple sequence alignment (MSA), and then building a profile Hidden Markov Model (HMM) using HMMER. In bioinformatics, HMMER is a set of widely used applications for sequence analysis based on profile HMM. Specificity determining sites (SDS) are the key positions in a protein sequence that play a crucial role in functional variation within the protein family during the course of evolution.
    [Show full text]
  • BIOGRAPHICAL SKETCH NAME: Berger
    BIOGRAPHICAL SKETCH NAME: Berger, Bonnie eRA COMMONS USER NAME (credential, e.g., agency login): BABERGER POSITION TITLE: Simons Professor of Mathematics and Professor of Electrical Engineering and Computer Science EDUCATION/TRAINING (Begin with baccalaureate or other initial professional education, such as nursing, include postdoctoral training and residency training if applicable. Add/delete rows as necessary.) EDUCATION/TRAINING DEGREE Completion (if Date FIELD OF STUDY INSTITUTION AND LOCATION applicable) MM/YYYY Brandeis University, Waltham, MA AB 06/1983 Computer Science Massachusetts Institute of Technology SM 01/1986 Computer Science Massachusetts Institute of Technology Ph.D. 06/1990 Computer Science Massachusetts Institute of Technology Postdoc 06/1992 Applied Mathematics A. Personal Statement Advances in modern biology revolve around automated data collection and sharing of the large resulting datasets. I am considered a pioneer in the area of bringing computer algorithms to the study of biological data, and a founder in this community that I have witnessed grow so profoundly over the last 26 years. I have made major contributions to many areas of computational biology and biomedicine, largely, though not exclusively through algorithmic innovations, as demonstrated by nearly twenty thousand citations to my scientific papers and widely-used software. In recognition of my success, I have just been elected to the National Academy of Sciences and in 2019 received the ISCB Senior Scientist Award, the pinnacle award in computational biology. My research group works on diverse challenges, including Computational Genomics, High-throughput Technology Analysis and Design, Biological Networks, Structural Bioinformatics, Population Genetics and Biomedical Privacy. I spearheaded research on analyzing large and complex biological data sets through topological and machine learning approaches; e.g.
    [Show full text]
  • Human CCL3L1 Copy Number Variation, Gene Expression, and The
    bioRxiv preprint doi: https://doi.org/10.1101/249508; this version posted January 17, 2018. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is Pagemade available 1 of 19 under aCC-BY 4.0 International license. 1 Human CCL3L1 copy number variation, gene expression, and the role of the CCL3L1-CCR5 axis 2 in lung function 3 4 Adeolu B Adewoye (1)*, Nick Shrine (2)*, Linda Odenthal-Hesse (1), Samantha Welsh (3), 5 Anders Malarstig (4), Scott Jelinsky (5), Iain Kilty (5), Martin D Tobin (2,6), Edward J Hollox (1)†, 6 Louise V Wain (2,6)† 7 8 1. Department of Genetics and Genome Biology, University of Leicester, Leicester, UK 9 2. Department of Health Sciences, University of Leicester, Leicester, UK 10 3. UK Biobank, Stockport, UK 11 4. Pfizer Worldwide Research and Development, Stockholm, Sweden. 12 5. Pfizer Worldwide Research and Development, Cambridge, Massachusetts, USA 13 6. National Institute of Health Research Biomedical Research Centre, University of 14 Leicester, Leicester, UK 15 16 * † These authors contributed equally to this work. 17 18 Corresponding authors: Edward J Hollox ([email protected]), Louise V Wain 19 ([email protected]) 20 21 Keywords: copy number variation, lung function, CCL3L1, CCR5, CNV, UK Biobank 22 23 Abstract 24 25 The CCL3L1-CCR5 signaling axis is important in a number of inflammatory responses, including 26 macrophage function, and T-cell-dependent immune responses. Small molecule CCR5 27 antagonists exist, including the approved antiretroviral drug maraviroc, and therapeutic 28 monoclonal antibodies are in development.
    [Show full text]
  • Comparative Transcriptomics Reveals Similarities and Differences
    Seifert et al. BMC Cancer (2015) 15:952 DOI 10.1186/s12885-015-1939-9 RESEARCH ARTICLE Open Access Comparative transcriptomics reveals similarities and differences between astrocytoma grades Michael Seifert1,2,5*, Martin Garbe1, Betty Friedrich1,3, Michel Mittelbronn4 and Barbara Klink5,6,7 Abstract Background: Astrocytomas are the most common primary brain tumors distinguished into four histological grades. Molecular analyses of individual astrocytoma grades have revealed detailed insights into genetic, transcriptomic and epigenetic alterations. This provides an excellent basis to identify similarities and differences between astrocytoma grades. Methods: We utilized public omics data of all four astrocytoma grades focusing on pilocytic astrocytomas (PA I), diffuse astrocytomas (AS II), anaplastic astrocytomas (AS III) and glioblastomas (GBM IV) to identify similarities and differences using well-established bioinformatics and systems biology approaches. We further validated the expression and localization of Ang2 involved in angiogenesis using immunohistochemistry. Results: Our analyses show similarities and differences between astrocytoma grades at the level of individual genes, signaling pathways and regulatory networks. We identified many differentially expressed genes that were either exclusively observed in a specific astrocytoma grade or commonly affected in specific subsets of astrocytoma grades in comparison to normal brain. Further, the number of differentially expressed genes generally increased with the astrocytoma grade with one major exception. The cytokine receptor pathway showed nearly the same number of differentially expressed genes in PA I and GBM IV and was further characterized by a significant overlap of commonly altered genes and an exclusive enrichment of overexpressed cancer genes in GBM IV. Additional analyses revealed a strong exclusive overexpression of CX3CL1 (fractalkine) and its receptor CX3CR1 in PA I possibly contributing to the absence of invasive growth.
    [Show full text]
  • The Genome-Wide Landscape of Copy Number Variations in the MUSGEN Study Provides Evidence for a Founder Effect in the Isolated Finnish Population
    European Journal of Human Genetics (2013) 21, 1411–1416 & 2013 Macmillan Publishers Limited All rights reserved 1018-4813/13 www.nature.com/ejhg ARTICLE The genome-wide landscape of copy number variations in the MUSGEN study provides evidence for a founder effect in the isolated Finnish population Chakravarthi Kanduri1, Liisa Ukkola-Vuoti1, Jaana Oikkonen1, Gemma Buck2, Christine Blancher2, Pirre Raijas3, Kai Karma4, Harri La¨hdesma¨ki5 and Irma Ja¨rvela¨*,1 Here we characterized the genome-wide architecture of copy number variations (CNVs) in 286 healthy, unrelated Finnish individuals belonging to the MUSGEN study, where molecular background underlying musical aptitude and related traits are studied. By using Illumina HumanOmniExpress-12v.1.0 beadchip, we identified 5493 CNVs that were spread across 467 different cytogenetic regions, spanning a total size of 287.83 Mb (B9.6% of the human genome). Merging the overlapping CNVs across samples resulted in 999 discrete copy number variable regions (CNVRs), of which B6.9% were putatively novel. The average number of CNVs per person was 20, whereas the average size of CNV per locus was 52.39 kb. Large CNVs (41 Mb) were present in 4% of the samples. The proportion of homozygous deletions in this data set (B12.4%) seemed to be higher when compared with three other populations. Interestingly, several CNVRs were significantly enriched in this sample set, whereas several others were totally depleted. For example, a CNVR at chr2p22.1 intersecting GALM was more common in this population (P ¼ 3.3706 Â 10 À44) than in African and other European populations. The enriched CNVRs, however, showed no significant association with music-related phenotypes.
    [Show full text]
  • Effects of Sex and Aging on the Immune Cell Landscape As Assessed by Single-Cell Transcriptomic Analysis
    Effects of sex and aging on the immune cell landscape as assessed by single-cell transcriptomic analysis Zhaohao Huanga,1, Binyao Chena,1, Xiuxing Liua,1,HeLia,1, Lihui Xiea,1, Yuehan Gaoa,1, Runping Duana, Zhaohuai Lia, Jian Zhangb, Yingfeng Zhenga,2, and Wenru Sua,2 aState Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, China; and bDepartment of Clinical Research Center, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou 510060, China Edited by Lawrence Steinman, Stanford University School of Medicine, Stanford, CA, and approved June 22, 2021 (received for review November 19, 2020) Sex and aging influence the human immune system, resulting in the immune system integrates numerous interconnected compo- disparate responses to infection, autoimmunity, and cancer. How- nents, pathways, and cell types in sex and aging. ever, the impact of sex and aging on the immune system is not yet To this end, we profiled the transcriptomes of peripheral im- fully elucidated. Using small conditional RNA sequencing, we mune cells sampled from men and women of two distinct age found that females had a lower percentage of natural killer (NK) groups. Then, we investigated sex- and age-related differences in cells and a higher percentage of plasma cells in peripheral blood PBMC immune cell composition, molecular programs, and com- compared with males. Bioinformatics revealed that young females munication network. Extensive differences in all of these aspects exhibited an overrepresentation of pathways that relate to T and were observed between sexes, particularly in aging. Compared – B cell activation. Moreover, cell cell communication analysis revealed with males, females exhibited a higher expression of T cell (TC)– evidence of increased activity of the BAFF/APRIL systems in females.
    [Show full text]
  • Supplementary Figure 1. Network Map Associated with Upregulated Canonical Pathways Shows Interferon Alpha As a Key Regulator
    Supplementary Figure 1. Network map associated with upregulated canonical pathways shows interferon alpha as a key regulator. IPA core analysis determined interferon-alpha as an upstream regulator in the significantly upregulated genes from RNAseq data from nasopharyngeal swabs of COVID-19 patients (GSE152075). Network map was generated in IPA, overlaid with the Coronavirus Replication Pathway. Supplementary Figure 2. Network map associated with Cell Cycle, Cellular Assembly and Organization, DNA Replication, Recombination, and Repair shows relationships among significant canonical pathways. Significant pathways were identified from pathway analysis of RNAseq from PBMCs of COVID-19 patients. Coronavirus Pathogenesis Pathway was also overlaid on the network map. The orange and blue colors in indicate predicted activation or predicted inhibition, respectively. Supplementary Figure 3. Significant biological processes affected in brochoalveolar lung fluid of severe COVID-19 patients. Network map was generated by IPA core analysis of differentially expressed genes for severe vs mild COVID-19 patients in bronchoalveolar lung fluid (BALF) from scRNA-seq profile of GSE145926. Orange color represents predicted activation. Red boxes highlight important cytokines involved. Supplementary Figure 4. 10X Genomics Human Immunology Panel filtered differentially expressed genes in each immune subset (NK cells, T cells, B cells, and Macrophages) of severe versus mild COVID-19 patients. Three genes (HLA-DQA2, IFIT1, and MX1) were found significantly and consistently differentially expressed. Gene expression is shown per the disease severity (mild, severe, recovered) is shown on the top row and expression across immune cell subsets are shown on the bottom row. Supplementary Figure 5. Network map shows interactions between differentially expressed genes in severe versus mild COVID-19 patients.
    [Show full text]
  • Measurement of Copy Number at Complex Loci Using High-Throughput Sequencing Data
    METHODS ARTICLE published: 01 August 2014 doi: 10.3389/fgene.2014.00248 The CNVrd2 package: measurement of copy number at complex loci using high-throughput sequencing data HoangT.Nguyen1,2,3,TonyR.Merriman1,3* and Michael A. Black 1,3 1 Department of Biochemistry, University of Otago, Dunedin, New Zealand 2 Department of Mathematics and Statistics, University of Otago, Dunedin, New Zealand 3 Department of Biochemistry, Virtual Institute of Statistical Genetics, University of Otago, Dunedin, New Zealand Edited by: Recent advances in high-throughout sequencing technologies have made it possible to Alexandre V. Morozov, Rutgers accurately assign copy number (CN) at CN variable loci. However, current analytic methods University, USA often perform poorly in regions in which complex CN variation is observed. Here we Reviewed by: report the development of a read depth-based approach, CNVrd2, for investigation of David John Studholme, University of Exeter, UK CN variation using high-throughput sequencing data. This methodology was developed Mikhail P.Ponomarenko, Institute of using data from the 1000 Genomes Project from the CCL3L1 locus, and tested using Cytology and Genetics of Siberian data from the DEFB103A locus. In both cases, samples were selected for which paralog Branch of Russian Academy of ratio test data were also available for comparison. The CNVrd2 method first uses Sciences, Russia observed read-count ratios to refine segmentation results in one population. Then a *Correspondence: Tony R. Merriman, Department of linear regression model is applied to adjust the results across multiple populations, in Biochemistry, University of Otago, combination with a Bayesian normal mixture model to cluster segmentation scores into 710 Cumberland Street, Dunedin groups for individual CN counts.
    [Show full text]
  • CCL3L1 Copy Number, HIV Load, and Immune Reconstitution in Sub
    Aklillu et al. BMC Infectious Diseases 2013, 13:536 http://www.biomedcentral.com/1471-2334/13/536 RESEARCH ARTICLE Open Access CCL3L1 copy number, HIV load, and immune reconstitution in sub-Saharan Africans Eleni Aklillu1, Linda Odenthal-Hesse2, Jennifer Bowdrey2, Abiy Habtewold1,3, Eliford Ngaimisi1,4, Getnet Yimer1,3, Wondwossen Amogne5,6, Sabina Mugusi7, Omary Minzi4, Eyasu Makonnen3, Mohammed Janabi8, Ferdinand Mugusi8, Getachew Aderaye5, Robert Hardwick2, Beiyuan Fu9, Maria Viskaduraki10, Fengtang Yang9 and Edward J Hollox2* Abstract Background: The role of copy number variation of the CCL3L1 gene, encoding MIP1α, in contributing to the host variation in susceptibility and response to HIV infection is controversial. Here we analyse a sub-Saharan African co- hort from Tanzania and Ethiopia, two countries with a high prevalence of HIV-1 and a high co-morbidity of HIV with tuberculosis. Methods: We use a form of quantitative PCR called the paralogue ratio test to determine CCL3L1 gene copy number in 1134 individuals and validate our copy number typing using array comparative genomic hybridisation and fiber-FISH. Results: We find no significant association of CCL3L1 gene copy number with HIV load in antiretroviral-naïve pa- tients prior to initiation of combination highly active anti-retroviral therapy. However, we find a significant associ- ation of low CCL3L1 gene copy number with improved immune reconstitution following initiation of highly active anti-retroviral therapy (p = 0.012), replicating a previous study. Conclusions: Our work supports a role for CCL3L1 copy number in immune reconstitution following antiretroviral therapy in HIV, and suggests that the MIP1α -CCR5 axis might be targeted to aid immune reconstitution.
    [Show full text]
  • The Myth of Junk DNA
    The Myth of Junk DNA JoATN h A N W ells s eattle Discovery Institute Press 2011 Description According to a number of leading proponents of Darwin’s theory, “junk DNA”—the non-protein coding portion of DNA—provides decisive evidence for Darwinian evolution and against intelligent design, since an intelligent designer would presumably not have filled our genome with so much garbage. But in this provocative book, biologist Jonathan Wells exposes the claim that most of the genome is little more than junk as an anti-scientific myth that ignores the evidence, impedes research, and is based more on theological speculation than good science. Copyright Notice Copyright © 2011 by Jonathan Wells. All Rights Reserved. Publisher’s Note This book is part of a series published by the Center for Science & Culture at Discovery Institute in Seattle. Previous books include The Deniable Darwin by David Berlinski, In the Beginning and Other Essays on Intelligent Design by Granville Sewell, God and Evolution: Protestants, Catholics, and Jews Explore Darwin’s Challenge to Faith, edited by Jay Richards, and Darwin’s Conservatives: The Misguided Questby John G. West. Library Cataloging Data The Myth of Junk DNA by Jonathan Wells (1942– ) Illustrations by Ray Braun 174 pages, 6 x 9 x 0.4 inches & 0.6 lb, 229 x 152 x 10 mm. & 0.26 kg Library of Congress Control Number: 2011925471 BISAC: SCI029000 SCIENCE / Life Sciences / Genetics & Genomics BISAC: SCI027000 SCIENCE / Life Sciences / Evolution ISBN-13: 978-1-9365990-0-4 (paperback) Publisher Information Discovery Institute Press, 208 Columbia Street, Seattle, WA 98104 Internet: http://www.discoveryinstitutepress.com/ Published in the United States of America on acid-free paper.
    [Show full text]