IDENTIFICATION AND ANNOTATION OF TRANSPOSABLE ELEMENTS AND AGENT- AND GIS-BASED MODELING OF PATHOGEN TRANSMISSION

A Dissertation

Submitted to the Graduate School of the University of Notre Dame in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

by

Ryan C. Kennedy

Gregory R. Madey, Co-Director
Frank H. Collins, Co-Director

Graduate Program in Computer Science and Engineering
Notre Dame, Indiana
January 2011

Abstract

by Ryan C. Kennedy

The work presented here has two primary components: 1) the identification and annotation of transposable elements (TEs) and 2) a spatially-aware agent-based model of pathogen transmission.

Recent advances in sequencing technology have resulted in an explosion of genomic data. The identification of TEs is an important part of every genome project. This dissertation presents an automated homology-based approach to identify TEs, implemented as TESeeker, that produces consensus TEs up to 98% identical to manually annotated sequences. It also offers a design and implementation plan to allow for the inclusion of TEs on VectorBase's community annotation pipeline.

Agent-based modeling is very adept at modeling natural phenomena. Coupling geographical information system (GIS) data with agent-based modeling further increases the utility of such simulations. This dissertation presents a GIS aware agent-based model of pathogen transmission as well as methods and recommendations for incorporating GIS data into a simulation. The model, named LiNK, was specifically developed to study the impact of landscape on pathogen transmission.

DEDICATION

To my family and friends

CONTENTS

FIGURES
TABLES
ACKNOWLEDGMENTS

CHAPTER 1: INTRODUCTION
  1.1 Overview
  1.2 Identification and Annotation of Transposable Elements
  1.3 Agent- and GIS-based Modeling of Pathogen Transmission
  1.4 Goals
  1.5 Organization
  1.6 Contributions

CHAPTER 2: TRANSPOSABLE ELEMENT AND BIOINFORMATICS BACKGROUND
  2.1 Introduction
  2.2 Molecular Biology
  2.3 Bioinformatics
    2.3.1 VectorBase
  2.4 Transposable Elements
  2.5 Transposable Element Identification
    2.5.1 De novo Discovery
    2.5.2 Structure-based Discovery
    2.5.3 Comparative Genomic Methods
    2.5.4 Homology-based Discovery
  2.6 Annotation
    2.6.1 DAS
    2.6.2 Ensembl
      2.6.2.1 Ensembl Genebuild
    2.6.3 Chado
    2.6.4 Hibernate
    2.6.5 VectorBase Community Annotation Pipeline
      2.6.5.1 Planned Updates to the VectorBase Community Annotation Pipeline
  2.7 Transposable Element Annotation
    2.7.1 VisualRepbase
  2.8 Summary

CHAPTER 3: AUTOMATED HOMOLOGY-BASED APPROACH FOR THE IDENTIFICATION OF TRANSPOSABLE ELEMENTS
  3.1 Introduction
  3.2 Approach for Identification of Transposable Elements
    3.2.1 Dependencies
      3.2.1.1 Library of Representative Sequences
      3.2.1.2 BLAST
      3.2.1.3 DNASTAR SeqMan II
      3.2.1.4 CAP3
      3.2.1.5 ClustalW2
      3.2.1.6 BioPerl
    3.2.2 General Description of Approach
      3.2.2.1 Identify Coding Region
      3.2.2.2 Encompass Complete Transposable Element
      3.2.2.3 Generate Consensus
      3.2.2.4 Identify Complete Transposable Element
    3.2.3 Implementation
    3.2.4 Advantages
    3.2.5 Limitations
  3.3 Results
    3.3.1 Pediculus humanus humanus
      3.3.1.1 Class I Elements
      3.3.1.2 Class II Elements
    3.3.2 Culex quinquefasciatus
    3.3.3 Anopheles gambiae PEST Genome
    3.3.4 Other Organisms
  3.4 Conclusion

CHAPTER 4: DESIGN AND PROOF-OF-CONCEPT PLAN FOR COMMUNITY ANNOTATION OF TRANSPOSABLE ELEMENTS ON VECTORBASE
  4.1 Introduction
  4.2 Transposable Elements and the VectorBase Community Annotation Pipeline
    4.2.1 Similarities to the VectorBase Community Annotation Pipeline
    4.2.2 Differences from the VectorBase Community Annotation Pipeline
    4.2.3 Transposable Element Representation in Chado
    4.2.4 Proof-of-Concept
  4.3 Design and Implementation Plan
  4.4 Conclusion

CHAPTER 5: SIMULATION AND MODELING BACKGROUND
  5.1 Introduction
  5.2 Simulation and Modeling
    5.2.1 Advantages and Disadvantages
    5.2.2 Building a Simulation Model
    5.2.3 Simulation Model Types
    5.2.4 Agent-based Modeling
    5.2.5 Equation-based Modeling
  5.3 Geographic Information Systems
    5.3.1 Raster Data
    5.3.2 Vector Data
  5.4 Integrating Geographic Information System Data into Agent-based Modeling
  5.5 Summary

CHAPTER 6: A GIS AWARE AGENT-BASED MODEL OF PATHOGEN TRANSMISSION
  6.1 Introduction
  6.2 LiNK Simulation Model
    6.2.1 Model Background
    6.2.2 Conceptual Model
    6.2.3 ODD Protocol Description of LiNK
      6.2.3.1 Purpose
      6.2.3.2 State Variables and Scales
      6.2.3.3 Process Overview and Scheduling
      6.2.3.4 Design Concepts
      6.2.3.5 Initialization
      6.2.3.6 Input
      6.2.3.7 Submodels
    6.2.4 Implementation
    6.2.5 Verification and Validation
  6.3 Geographic Information System Data and Agent-Based Modeling
    6.3.1 Approximating Geographic Information System Data in Simulations
    6.3.2 Raster Queries
    6.3.3 Spatial Queries
      6.3.3.1 Simplified Spatial Queries
    6.3.4 Precalculated Query Matrix
    6.3.5 GIS Aware Agents
      6.3.5.1 Movement
  6.4 Results
    6.4.1 Performance
  6.5 Analyzing Massive Amounts of Simulation Data
    6.5.1 LiNKStat
  6.6 Conclusion

CHAPTER 7: CONCLUSION
  7.1 Overview
  7.2 Automated Homology-based Approach for the Identification of Transposable Elements
    7.2.1 Future Work
  7.3 Community Annotation of Transposable Elements on VectorBase
    7.3.1 Future Work
  7.4 GIS Aware Agent-based Model of Pathogen Transmission
    7.4.1 Future Work
  7.5 Contributions

APPENDIX A: AUTOMATED APPROACH WALKTHROUGH
  A.1 Representative Amino Acid Coding Regions
  A.2 Identify Coding Region
    A.2.1 tblastn Search
    A.2.2 Extract Sequences from the Genome
    A.2.3 CAP3 Assembly
      A.2.3.1 CAP3 Contigs
      A.2.3.2 CAP3 Contigs Quality Scores
  A.3 Encompass Complete Transposable Element
  A.4 Generate Consensus
  A.5 Identify Complete Transposable Element
    A.5.1 CAP3 Assembly
    A.5.2 CAP3 Contigs Quality File
    A.5.3 Trimmed CAP3 Contigs

APPENDIX B: TESeeker WEBSITE