Literature Mining Sustains and Enhances Knowledge Discovery from Omic Studies


LITERATURE MINING SUSTAINS AND ENHANCES KNOWLEDGE DISCOVERY FROM OMIC STUDIES

by

Rick Matthew Jordan

B.S. Biology, University of Pittsburgh, 1996
M.S. Molecular Biology/Biotechnology, East Carolina University, 2001
M.S. Biomedical Informatics, University of Pittsburgh, 2005

Submitted to the Graduate Faculty of School of Medicine in partial fulfillment of the requirements for the degree of Doctor of Philosophy

University of Pittsburgh
2016

UNIVERSITY OF PITTSBURGH
SCHOOL OF MEDICINE

This dissertation was presented by Rick Matthew Jordan

It was defended on December 2, 2015 and approved by

Shyam Visweswaran, M.D., Ph.D., Associate Professor
Rebecca Jacobson, M.D., M.S., Professor
Songjian Lu, Ph.D., Assistant Professor
Dissertation Advisor: Vanathi Gopalakrishnan, Ph.D., Associate Professor

Copyright © by Rick Matthew Jordan 2016

Rick Matthew Jordan, M.S.
University of Pittsburgh, 2016

Genomic, proteomic and other experimentally generated data from studies of biological systems aiming to discover disease biomarkers are currently analyzed without sufficient supporting evidence from the literature due to complexities associated with automated processing. Extracting prior knowledge about markers associated with biological sample types and disease states from the literature is tedious, and little research has been performed to understand how to use this knowledge to inform the generation of classification models from ‘omic’ data. Using pathway analysis methods to better understand the underlying biology of complex diseases such as breast and lung cancers is state-of-the-art. However, how to combine literature-mining evidence with pathway analysis evidence remains an open problem in biomedical informatics research.
This dissertation presents a novel semi-automated framework, named Knowledge Enhanced Data Analysis (KEDA), which incorporates the following components: 1) literature mining of text; 2) classification modeling; and 3) pathway analysis. This framework aids researchers in assigning literature-mining-based prior knowledge values to genes and proteins associated with disease biology. It incorporates prior knowledge into the modeling of experimental datasets, enriching the development process with current findings from the scientific community. New knowledge is presented in the form of lists of known disease-specific biomarkers and their accompanying scores obtained through literature mining of millions of lung and breast cancer abstracts. These scores can subsequently be used as prior knowledge values in Bayesian modeling and pathway analysis. Ranked, newly discovered biomarker-disease-biofluid relationships which identify biomarker specificity across biofluids are presented. A novel method of identifying biomarker relationships is discussed that examines the attributes from the best-performing models. Pathway analysis results from the addition of prior information ultimately lead to more robust evidence for pathway involvement in diseases of interest based on statistically significant standard measures of impact factor and p-values. The outcome of implementing the KEDA framework is enhanced modeling and pathway analysis findings. Enhanced knowledge discovery analysis leads to new disease-specific entities and relationships that otherwise would not have been identified. Increased disease understanding, as well as identification of biomarkers for disease diagnosis, treatment, or therapy targets should ultimately lead to validation and clinical implementation.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
GLOSSARY
PREFACE
1.0 INTRODUCTION
    1.1 THE PROBLEM
        1.1.1 Obtaining and organizing prior information from literature mining
        1.1.2 Combining prior knowledge with experimental data
        1.1.3 Interpreting downstream results
    1.2 THE APPROACH
        1.2.1 Theses
    1.3 SIGNIFICANCE
    1.4 DISSERTATION OVERVIEW
2.0 BACKGROUND
    2.1 MOLECULAR ASPECTS OF LUNG AND BREAST CANCER
    2.2 LITERATURE MINING
    2.3 CLASSIFICATION MODELING
        2.3.1 Bayesian analysis
    2.4 PATHWAY ANALYSIS
    2.5 PRIOR KNOWLEDGE USE IN BIOINFORMATICS
3.0 IMPLEMENTATION OF KEDA FRAMEWORK
    3.1 KEDA FRAMEWORK OVERVIEW
    3.2 DESCRIPTION OF KEDA COMPONENTS
        3.2.1 Literature mining methodology
            3.2.1.1 Information retrieval
            3.2.1.2 Named entity recognition
            3.2.1.3 Entity extraction
            3.2.1.4 Dictionary
            3.2.1.5 Z-score calculation
            3.2.1.6 Verification of relationships
            3.2.1.7 Error rate determination
        3.2.2 Classification modeling methodology
            3.2.2.1 Experimental datasets
            3.2.2.2 Preprocessing steps
            3.2.2.3 Transforming counts to prior probabilities
            3.2.2.4 Bayesian Rule Learner
            3.2.2.5 Execution of the BRL algorithm
                3.2.2.5.1 BRL input
                3.2.2.5.2 BRL output
            3.2.2.6 Confirmatory research
        3.2.3 Pathway analysis methodology
            3.2.3.1 Pathway Express
            3.2.3.2 Execution of Pathway Express algorithm
                3.2.3.2.1 Pathway Express input
                3.2.3.2.2 Pathway Express output
4.0 EVALUATION OF KEDA
    4.1 LITERATURE MINING RESULTS
        4.1.1 Z-score threshold optimization
        4.1.2 Known markers per biofluid
        4.1.3 Known markers found significant vs. non-significant
        4.1.4 Newly discovered markers found significant vs. non-significant
        4.1.5 Potential marker biofluid specificity
        4.1.6 Manual verification of findings
        4.1.7 Error rate estimation of new discoveries
    4.2 CLASSIFICATION MODELING RESULTS
        4.2.1 Experimental design using literature mining results
        4.2.2 Dataset size effects on accuracy
        4.2.3 Weighting effects on accuracy
        4.2.4 Accuracy across different types of data
        4.2.5 Statistical significance
        4.2.6 Examining modeling attributes used
        4.2.7 Confirmatory research
    4.3 PATHWAY ANALYSIS RESULTS
        4.3.1 Experimental design using literature mining results
        4.3.2 Breast cancer datasets
        4.3.3 Lung cancer datasets
    4.4 SUMMARY
5.0 CONCLUSIONS, LIMITATIONS, AND FUTURE WORK
    5.1 CONCLUSIONS
    5.2 LIMITATIONS
    5.3 FUTURE WORK
        5.3.1 Informatics
            5.3.1.1 Resources
            5.3.1.2 Tools
                5.3.1.2.1 Literature mining
                5.3.1.2.2 Modeling
                5.3.1.2.3 Pathway analysis
        5.3.2 Laboratory verification
APPENDIX A - PYTHON SCRIPT FOR KEDA TEXT-MINING COMPONENT
APPENDIX B - INPUT DATASET DESCRIPTIONS FROM GEO
APPENDIX C - BIOMARKERS FOUND BY KEDA TEXT-MINING AND MODELING COMPONENTS
BIBLIOGRAPHY

LIST OF TABLES

Table 1. Pathway analysis tools
Table 2. Size of the abstract sets returned from breast and lung cancer PubMed queries
Table 3. Breast cancer-related genes from text-mining final table
Table 4. Known Breast Cancer Biomarkers
Table 5. Known Lung Cancer Biomarkers
Table 6. Breast and lung cancer dataset summary
Table 7. Input files for matching script
Table 8. Matching script output files
Table 9. Number of markers identified for each disease-biofluid combination
Table 10. Identification of the significant validated potential markers found to be in common to several biofluids or biofluid specific for breast and lung cancer
Table 11. Manually verified biomarker table
Table 12. Number of input genes in pathway from breast cancer datasets
Table 13. Impact factors from breast cancer datasets
Table 14. P-values from breast cancer datasets