Multiple Kernel Learning for Gene Prioritization, Clustering

Total Page:16

File Type:pdf, Size:1020Kb

Multiple Kernel Learning for Gene Prioritization, Clustering MULTIPLE KERNEL LEARNING FOR GENE PRIORITIZA.TION.CLUSTERING. AND FI.INCTIONAL ENRICHMENT ANAI-YSIS bv DavidH. Millis A Dissertation Submittedto the GraduateFaculty of GeorgeMason University in PartialFulfillment of The Requirementsfor the Degree of Doctorof Philosophy Bioinformaticsand Computational Biology Dr. JeffreyL. Solka,Dissertation Director Dr. JamesD. Willett, DissertationChair Dr. LakshmiK. Matukumalli. CommitteeMember Dr. JamesD. Willett, DepartmentChair Dr. DonnaFox, Associate Dean, Student Affairs & SpecialPrograms, College of Science Dr. PeggyAgouris, Interim Dean, College of Science SpringSemester 2014 GeorgeMason University Fairfax,VA Multiple Kernel Learning for Gene Prioritization, Clustering, and Functional Enrichment Analysis A Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at George Mason University by David H. Millis Doctor of Medicine Howard University College of Medicine, 1983 Director: Jeffrey L. Solka, Professor Department of Bioinformatics and Computational Biology Spring Semester 2014 George Mason University Fairfax, VA This work is licensed under a creative commons attribution-noderivs 3.0 unported license. ii DEDICATION In memory of my parents, Ena Grace Millis Clovis Bolton Millis You believed that I could do anything. iii ACKNOWLEDGMENTS I would like first to express gratitude to my dissertation committee. Dr. Jeffrey Solka served as dissertation director, helped provide order and focus to this research effort, and patiently reviewed multiple document drafts. Dr. Lakshmi Matukumalli allowed access to data generated by the USDA Bovine Functional Genomics Laboratory, and gave wise, practical advice on the dissertation completion process. Dr. James Willett served as dissertation chair, offered biological insights that helped move the research forward, and provided validation and support for the usefulness of the ideas developed over the course of this project. Many thanks to Dr. Ronald Kostoff, who arranged funding for a summer internship with The Mitre Corporation and provided helpful insights on text mining and literature-based discovery. I worked at the Thomas B. Finan Center, a psychiatric hospital in Cumberland, Maryland, while completing this dissertation. To the leadership, staff, and patients of the Finan Center: You have taught me much about suffering, survival, resilience, community, and the challenges of being human. For all you have given me, I will always be grateful. To the many music teachers and church musicians with whom I worked over the years, in New York, California, and Maryland: I learned a lot from you about persistence, about being unafraid to make mistakes, and about honing one’s craft by always going back to the fundamentals. Everything that you taught me about artistic growth and craftsmanship is reflected in these pages. To my sisters Dianne Millis and Kathy Brown, brother Michael Millis, and brothers-in- law Eric Brown and Peter Brown: Each of you has had a role in making this accomplishment a reality. Thank you for you love and support throughout this long process. Spending time with you during the holidays always reminds me of the things in life that are truly important. iv To my relatives across the country and around the world, in New York, New Jersey, Connecticut, Georgia, Florida, California, England, and Jamaica: Although we have been far apart geographically, we remain close in spirit. I am honored to make this accomplishment a part of our family history. Finally, to my nephews Marcus Brown and Jordan Brown: Find something that you like doing and that the world needs to get done, and have fun working hard at it. The world needs you. Everything you need to change the world, you already possess. You are perfect as you are. David H. Millis, M.D., Ph.D. Gainesville, Virginia April 29, 2014 v TABLE OF CONTENTS Page List of Tables ...................................................................................................................... x List of Figures .................................................................................................................... xi List of Equations ............................................................................................................... xii List of Abbreviations ....................................................................................................... xiii Abstract ............................................................................................................................ xiv Chapter One: Introduction .................................................................................................. 1 Research Problem: Gene Prioritization ........................................................................... 1 Research Hypothesis ....................................................................................................... 5 Research Contributions ................................................................................................... 5 Chapter Two: Ensemble Methods for Classification .......................................................... 7 Classification Strategies .................................................................................................. 7 Ensemble Classifiers ....................................................................................................... 8 Structure of Base Classifiers........................................................................................ 9 Design of Base Classifier Training Sets .................................................................... 10 Combining Classifier Outputs ................................................................................... 12 Chapter Three: Kernel Methods and Multiple Kernel Learning ....................................... 14 Kernel Methods ............................................................................................................. 14 Support Vector Machines .............................................................................................. 15 Combining Kernels ....................................................................................................... 20 Multiple Kernel Learning .............................................................................................. 20 Chapter Four: Text Processing Methods for Measuring Gene Similarity ....................... 25 Gene Similarity Based on Shared Abstracts ................................................................. 26 Gene Similarity Based on Co-Occurrence of Gene Names in Abstracts ...................... 26 Gene Similarity Based on Cosine Similarity of Free Text in Abstracts ........................ 27 Chapter Five: Classifier Design ........................................................................................ 29 Classifier Design: Overview ......................................................................................... 29 vi Data Sources .................................................................................................................. 30 The Gene Ontology (GO) Database .......................................................................... 30 The KEGG Pathway Database .................................................................................. 31 The REACTOME Pathway Database ........................................................................ 31 The STRING Protein-Protein Interaction (PPI) Database ......................................... 31 PubMed: Shared Identifiers ....................................................................................... 32 PubMed: Named Entity Recognition ......................................................................... 32 PubMed: Cosine Similarity of Abstracts ................................................................... 33 MicroRNA Target Prediction Algorithms ................................................................. 34 Risperidone Differential Expression Data ................................................................. 34 Data Preprocessing ........................................................................................................ 35 SVM Step ...................................................................................................................... 36 MKL Step ...................................................................................................................... 36 Classifier Diversity ........................................................................................................ 37 Measuring Classifier Performance ................................................................................ 37 Selecting Best MKL Classifier ...................................................................................... 38 Clustering Methods ....................................................................................................... 38 Functional Enrichment Analysis ................................................................................... 39 Chapter Six: Project #1: microRNA-Gene Interactions ................................................... 41 Background: Biology of microRNAs and microRNA Target Prediction ..................... 41 MicroRNAs and Gene Expression ................................................................................ 42 MicroRNA Target Prediction ........................................................................................ 44 Computational microRNA Target Prediction
Recommended publications
  • Molecular Reprogramming in Tomato Pollen During Development And
    Molecular reprogramming in tomato pollen during development and heat stress Dissertation zur Erlangung des Doktorgrades der Naturwissenschaften vorgelegt beim Fachbereich Biowissenschaften (FB15) der Goethe-Universität Frankfurt am Main von Mario Keller geboren in Frankfurt am Main (Hessen) Frankfurt am Main 2018 (D30) Vom Fachbereich Biowissenschaften (FB15) der Goethe-Universität als Dissertation angenommen. Dekan: Prof. Dr. Sven Klimpel Gutachter: Prof. Dr. Enrico Schleiff, Jun. Prof. Dr. Michaela Müller-McNicoll Datum der Disputation: 25.03.2019 Index of contents Index of contents Index of contents ....................................................................................................................................... i Index of figures ........................................................................................................................................ iii Index of tables ......................................................................................................................................... iv Index of supplemental material ................................................................................................................ v Index of supplemental figures ............................................................................................................... v Index of supplemental tables ................................................................................................................ v Abbreviations ..........................................................................................................................................
    [Show full text]
  • Full Text (PDF)
    Published OnlineFirst November 11, 2014; DOI: 10.1158/0008-5472.CAN-14-1041 Cancer Tumor and Stem Cell Biology Research Loss of Estrogen-Regulated microRNA Expression Increases HER2 Signaling and Is Prognostic of Poor Outcome in Luminal Breast Cancer Shannon T. Bailey1,2,3, Thomas Westerling1,2,3, and Myles Brown1,2,3 Abstract Among the genes regulated by estrogen receptor (ER) are Genome Atlas (TCGA). Among luminal tumors, these miRNAs miRNAs that play a role in breast cancer signaling pathways. To are expressed at higher levels in luminal A versus B tumors. determine whether miRNAs are involved in ER-positive breast Although their expression is uniformly low in luminal B tumors, cancer progression to hormone independence, we profiled the they are lost only in a subset of luminal A patients. Interestingly, expression of 800 miRNAs in the estrogen-dependent human this subset with low expression of these miRNAs had worse breast cancer cell line MCF7 and its estrogen-independent deriv- overall survival compared with luminal A patients with high ative MCF7:2A (MCF7:2A) using NanoString. We found 78 expression. We confirmed that miR125b directly targets HER2 miRNAs differentially expressed between the two cell lines, and that let-7c also regulates HER2 protein expression. In addi- including a cluster comprising let-7c, miR99a, and miR125b, tion, HER2 protein expression and activity are negatively corre- which is encoded in an intron of the long noncoding RNA lated with let-7c expression in TCGA. In summary, we identified LINC00478. These miRNAs are ER targets in MCF7 cells, and an ER-regulated miRNA cluster that regulates HER2, is lost with nearby ER binding and their expression are significantly progression to estrogen independence, and may serve as a bio- þ decreased in MCF7:2A cells.
    [Show full text]
  • Timo Lassmann, Yoshiko Maida 2, Yasuhiro Tomaru, Mami Yasukawa
    Int. J. Mol. Sci. 2015, 16, 1192-1208; doi:10.3390/ijms16011192 OPEN ACCESS International Journal of Molecular Sciences ISSN 1422-0067 www.mdpi.com/journal/ijms Article Telomerase Reverse Transcriptase Regulates microRNAs Timo Lassmann 1,†, Yoshiko Maida 2,†, Yasuhiro Tomaru 1, Mami Yasukawa 2, Yoshinari Ando 1, Miki Kojima 1, Vivi Kasim 2, Christophe Simon 1, Carsten O. Daub 1, Piero Carninci 1, Yoshihide Hayashizaki 1,* and Kenkichi Masutomi 2,3,* 1 RIKEN Omics Science Center, RIKEN Yokohama Institute, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama 230-0045, Japan; E-Mails: [email protected] (T.L.); [email protected] (Y.T.); [email protected] (Y.A.); [email protected] (M.K.); [email protected] (C.S.); [email protected] (C.O.D.); [email protected] (P.C.) 2 Cancer Stem Cell Project, National Cancer Center Research Institute, 5-1-1 Tsukiji, Chuo-ku, Tokyo 104-0045, Japan; E-Mails: [email protected] (Y.M.); [email protected] (M.Y.); [email protected] (V.K.) 3 Precursory Research for Embryonic Science and Technology (PREST), Japan Science and Technology Agency, 4-1-8 Honcho Kawaguchi, Saitama 332-0012, Japan † These authors contributed equally to this work. * Authors to whom correspondence should be addressed; E-Mails: [email protected] (Y.H.); [email protected] (K.M.); Tel.: +81-45-503-9248 (Y.H.); +81-3-3547-5173 (K.M.); Fax: +81-45-503-9216 (Y.H.); +81-3-3547-5123 (K.M.). Academic Editor: Martin Pichler Received: 22 November 2014 / Accepted: 26 December 2014 / Published: 6 January 2015 Abstract: MicroRNAs are small non-coding RNAs that inhibit the translation of target mRNAs.
    [Show full text]
  • Charakterisierung Interindividueller Strahlenempfindlichkeit in Zellen Aus Tumorpatienten
    Charakterisierung interindividueller Strahlenempfindlichkeit in Zellen aus Tumorpatienten Anne Maria Dietz Charakterisierung interindividueller Strahlenempfindlichkeit in Zellen aus Tumorpatienten Dissertation an der Fakultät für Biologie der Ludwig-Maximilians-Universität München zur Erlangung des Doktorgrades der Naturwissenschaften Angefertigt an der Klinik und Poliklinik für Strahlentherapie und Radioonkolgie in der Arbeitsgruppe Molekulare Strahlenbiologie unter der Leitung von PD. Dr. Anna Friedl Vorgelegt von Anne Maria Dietz München, den 03. Dezember 2014 Erstgutachter: PD Dr. Anna A. Friedl Zweitgutachter: Prof. Dr. Peter Becker Tag der mündlichen Prüfung: 05.05.2015 Meinem Vater Inhaltsverzeichnis Inhaltsverzeichnis 1 Zusammenfassung ........................................................................................................................... 5 2 Einleitung ......................................................................................................................................... 7 2.1 Individuelle Strahlenempfindlichkeit ...................................................................................... 7 2.2 Das Chromosomeninstabilitätssyndrom Ataxia Telangiectasia .............................................. 9 2.3 Biomarker-Entdeckung im Zeitalter der Omics-Technologien .............................................. 12 2.3.1 Charakterisierung eines validen Biomarkers ................................................................. 12 2.3.2 Biomarker für Strahlenempfindlichkeit ........................................................................
    [Show full text]
  • Untersuchung Zur Rolle Des La-Verwandten Proteins LARP4B Im Mrna- Metabolismus
    Untersuchung zur Rolle des La-verwandten Proteins LARP4B im mRNA-Metabolismus Dissertation zur Erlangung des naturwissenschaftlichen Doktorgrades der Julius-Maximilians-Universität Würzburg vorgelegt von Maritta Küspert aus Schwäbisch Hall Würzburg 2014 Eingereicht bei der Fakultät für Chemie und Pharmazie am: ....................................... Gutachter der schriftlichen Arbeit: 1. Gutachter: Prof. Dr. Utz Fischer 2. Gutachter: Prof. Dr. Alexander Buchberger Prüfer des öffentlichen Promotionskolloquiums: 1. Prüfer: Prof. Dr. Utz Fischer 2. Prüfer: Prof. Dr. Alexander Buchberger 3. Prüfer: Prof. Dr. Stefan Gaubatz Datum des öffentlichen Promotionskolloquiums: ......................................................... Doktorurkunde ausgehändigt am: ................................................................................ Zusammenfassung Eukaryotische messenger-RNAs (mRNAs) müssen diverse Prozessierungsreaktionen durchlaufen, bevor sie der Translationsmaschinerie als Template für die Proteinbiosynthese dienen können. Diese Reaktionen beginnen bereits kotranskriptionell und schließen das Capping, das Spleißen und die Polyadenylierung ein. Erst nach dem die Prozessierung abschlossen ist, kann die reife mRNA ins Zytoplasma transportiert und translatiert werden. mRNAs interagieren in jeder Phase ihres Metabolismus mit verschiedenen trans-agierenden Faktoren und bilden mRNA-Ribonukleoproteinkomplexe (mRNPs) aus. Dieser „mRNP-Code“ bestimmt das Schicksal jeder mRNA und reguliert dadurch die Genexpression auf posttranskriptioneller
    [Show full text]
  • Gynecologic and Obstetric Pathology
    ANNUAL MEETING ABSTRACTS 273A Design: From January 2011 to August 2013, 162 patients underwent robotic laparoscopic radical prostatectomy for clinically localized prostatic carcinoma at our institution. Gynecologic and Obstetric Pathology Periprostatic fat pads, yielded during defatting of the prostate, were dissected and sent to pathology for histopathologic examination in 133 cases. Clinical and pathological 1128 Recurrent Grade I, Stage Ia Endometrioid Carcinomas of the Uterus: staging was recorded according to the 2009 American Joint Committee on Cancer Analysis of Pathology and Correlation with Clinical Data (AJCC) criterion. SN Agoff. Virginia Mason Medical Center, Seattle, WA. Results: Of 133 patients whose periprostatic fat was examined, 32 (24%) patients had Background: Low-grade and low-stage endometrioid carcinoma of the uterus is thought lymph nodes in the periprostatic fat pads. Metastatic prostatic carcinoma to periprostatic to have an excellent prognosis in the vast majority of patients. However, despite the lack lymph nodes was detected in 5 individuals (3.8%). All 32 patients had bilateral pelvic of myometrial invasion or superfi cial invasion, some patients develop recurrent disease, lymphadenectomy. 3 of the 5 patients with positive periprostatic lymph nodes had no often in the vaginal apex. There is limited literature on these tumors, but recent analyses metastasis in pelvic lymph nodes, thereby upstaging 3 cases from T3N0 to T3N1. No have implicated the pattern of myoinvasion as a prognostic factor. The emphasis of relationship exists between the presences of periprostatic LNs and prostate weight, the current study is non-invasive or stage Ia, grade I endometrioid adenocarcinomas. patient age, pathological staging or Gleason score.
    [Show full text]
  • The Discovery Potential of RNA Processing Profiles
    bioRxiv preprint doi: https://doi.org/10.1101/049809; this version posted April 22, 2016. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. The discovery potential of RNA processing profiles Amadís Pagès1,2, Ivan Dotu1,3, Roderic Guigó1,2 and Eduardo Eyras1,4,* 1Universitat Pompeu Fabra (UPF), E08003 Barcelona, Spain. 2Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E08003 Barcelona, Spain. 3IMIM - Hospital del Mar Medical Research Institute. E08003 Barcelona, Spain. 4Catalan Institution for Research and Advanced Studies, E08010 Barcelona, Spain *correspondence to: [email protected] Abstract Small non-coding RNAs are highly abundant molecules that regulate essential cellular processes and are classified according to sequence and structure. Here we argue that read profiles from size-selected RNA sequencing, by capturing the post-transcriptional processing specific to each RNA family, provide functional information independently of sequence and structure. SeRPeNT is a computational method that exploits reproducibility across replicates and uses dynamic time-warping and density-based clustering algorithms to identify, characterize and compare small non-coding RNAs, by harnessing the power of read profiles. SeRPeNT is applied to: a) generate an extended human annotation with 671 new RNAs from known classes and 131 from new potential classes, b) show pervasive differential processing between cell compartments and c) predict new molecules with miRNA-like behaviour from snoRNA, tRNA and long non-coding RNA precursors, dependent on the miRNA biogenesis pathway.
    [Show full text]
  • Human Cytomegalovirus Reprograms the Expression of Host Micro-Rnas Whose Target Networks Are Required for Viral Replication: a Dissertation
    University of Massachusetts Medical School eScholarship@UMMS GSBS Dissertations and Theses Graduate School of Biomedical Sciences 2013-08-26 Human Cytomegalovirus Reprograms the Expression of Host Micro-RNAs whose Target Networks are Required for Viral Replication: A Dissertation Alexander N. Lagadinos University of Massachusetts Medical School Let us know how access to this document benefits ou.y Follow this and additional works at: https://escholarship.umassmed.edu/gsbs_diss Part of the Immunology and Infectious Disease Commons, Molecular Genetics Commons, and the Virology Commons Repository Citation Lagadinos AN. (2013). Human Cytomegalovirus Reprograms the Expression of Host Micro-RNAs whose Target Networks are Required for Viral Replication: A Dissertation. GSBS Dissertations and Theses. https://doi.org/10.13028/M2R88R. Retrieved from https://escholarship.umassmed.edu/gsbs_diss/683 This material is brought to you by eScholarship@UMMS. It has been accepted for inclusion in GSBS Dissertations and Theses by an authorized administrator of eScholarship@UMMS. For more information, please contact [email protected]. HUMAN CYTOMEGALOVIRUS REPROGRAMS THE EXPRESSION OF HOST MICRO-RNAS WHOSE TARGET NETWORKS ARE REQUIRED FOR VIRAL REPLICATION A Dissertation Presented By Alexander Nicholas Lagadinos Submitted to the Faculty of the University of Massachusetts Graduate School of Biomedical Sciences, Worcester In partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY August 26th, 2013 Program in Immunology and Virology
    [Show full text]
  • An Epigenetic Switch Regulates the Ontogeny of AXL-Positive/EGFR-Tki
    RESEARCH ARTICLE An epigenetic switch regulates the ontogeny of AXL-positive/EGFR-TKi- resistant cells by modulating miR-335 expression Polona Safaric Tepes1,2†, Debjani Pal1,3†, Trine Lindsted1, Ingrid Ibarra1, Amaia Lujambio4, Vilma Jimenez Sabinina1, Serif Senturk1, Madison Miller1, Navya Korimerla1,5, Jiahao Huang1, Lawrence Glassman6, Paul Lee6, David Zeltsman6, Kevin Hyman6, Michael Esposito6, Gregory J Hannon1,7, Raffaella Sordella1,8* 1Cold Spring Harbor Laboratory, Cold Spring Harbor, United States; 2Faculty of Pharmacy University of Ljubljana, Ljubljana, Slovenia; 3Graduate Program in Molecular and Cellular Biology, Stony Brook University, Stony Brook, New York, United States; 4Icahn School of Medicine at Mount Sinai, Hess Center for Science and Medicine, New York, United States; 5Graduate Program in Biomedical Engineering, Stony Brook University, New York, United States; 6Northwell Health Long Island, Jewish Medical Center, New York, United States; 7Cancer Research UK – Cambridge Institute, University of Cambridge, Cambridge, United Kingdom; 8Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, United States *For correspondence: [email protected] Abstract Despite current advancements in research and therapeutics, lung cancer remains the †These authors contributed leading cause of cancer-related mortality worldwide. This is mainly due to the resistance that equally to this work patients develop against chemotherapeutic agents over the course of treatment. In the context of Competing interests: The non-small cell lung cancers (NSCLC) harboring EGFR-oncogenic mutations, augmented levels of authors declare that no AXL and GAS6 have been found to drive resistance to EGFR tyrosine kinase inhibitors such as competing interests exist. Erlotinib and Osimertinib in certain tumors with mesenchymal-like features.
    [Show full text]
  • Lupus Nephritis Supp Table 5
    Supplementary Table 5 : Transcripts and DAVID pathways correlating with the expression of CD4 in lupus kidney biopsies Positive correlation Negative correlation Transcripts Pathways Transcripts Pathways Identifier Gene Symbol Correlation coefficient with CD4 Annotation Cluster 1 Enrichment Score: 26.47 Count P_Value Benjamini Identifier Gene Symbol Correlation coefficient with CD4 Annotation Cluster 1 Enrichment Score: 3.16 Count P_Value Benjamini ILMN_1727284 CD4 1 GOTERM_BP_FAT translational elongation 74 2.50E-42 1.00E-38 ILMN_1681389 C2H2 zinc finger protein-0.40001984 INTERPRO Ubiquitin-conjugating enzyme/RWD-like 17 2.00E-05 4.20E-02 ILMN_1772218 HLA-DPA1 0.934229063 SP_PIR_KEYWORDS ribosome 60 2.00E-41 4.60E-39 ILMN_1768954 RIBC1 -0.400186083 SMART UBCc 14 1.00E-04 3.50E-02 ILMN_1778977 TYROBP 0.933302249 KEGG_PATHWAY Ribosome 65 3.80E-35 6.60E-33 ILMN_1699190 SORCS1 -0.400223681 SP_PIR_KEYWORDS ubl conjugation pathway 81 1.30E-04 2.30E-02 ILMN_1689655 HLA-DRA 0.915891173 SP_PIR_KEYWORDS protein biosynthesis 91 4.10E-34 7.20E-32 ILMN_3249088 LOC93432 -0.400285215 GOTERM_MF_FAT small conjugating protein ligase activity 35 1.40E-04 4.40E-02 ILMN_3228688 HLA-DRB1 0.906190291 SP_PIR_KEYWORDS ribonucleoprotein 114 4.80E-34 6.70E-32 ILMN_1680436 CSH2 -0.400299744 SP_PIR_KEYWORDS ligase 54 1.50E-04 2.00E-02 ILMN_2157441 HLA-DRA 0.902996561 GOTERM_CC_FAT cytosolic ribosome 59 3.20E-33 2.30E-30 ILMN_1722755 KRTAP6-2 -0.400334007 GOTERM_MF_FAT acid-amino acid ligase activity 40 1.60E-04 4.00E-02 ILMN_2066066 HLA-DRB6 0.901531942 SP_PIR_KEYWORDS
    [Show full text]
  • The Discovery Potential of RNA Processing Profiles
    bioRxiv preprint doi: https://doi.org/10.1101/049809; this version posted July 31, 2017. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under aCC-BY 4.0 International license. The discovery potential of RNA processing profiles Amadís Pagès1,2, Ivan Dotu1,3, Joan Pallarès-Albanell1,2, Eulàlia Martí1,2, Roderic Guigó1,2, Eduardo Eyras1,4,* 1Pompeu Fabra University (UPF), E08003 Barcelona, Spain. 2Centre for Genomic Regulation (CRG), The Barcelona Institute of Science and Technology, E08003 Barcelona, Spain. 3IMIM - Hospital del Mar Medical Research Institute. E08003 Barcelona, Spain. 4Catalan Institution for Research and Advanced Studies (ICREA). E08010 Barcelona, Spain *correspondence to: [email protected] Abstract Small non-coding RNAs are highly abundant molecules that regulate essential cellular processes and are classified according to sequence and structure. Here we argue that read profiles from size-selected RNA sequencing capture the post-transcriptional processing specific to each RNA family, thereby providing functional information independently of sequence and structure. We developed SeRPeNT, the first unsupervised computational method that exploits reproducibility across replicates and uses dynamic time-warping and density-based clustering algorithms to identify, characterize and compare small non-coding RNAs (sncRNAs) by harnessing the power of read profiles. We applied SeRPeNT to: a) generate an extended human annotation with 671 new sncRNAs from known classes and 131 from new potential classes, b) show pervasive differential processing between cell compartments and c) predict new molecules with miRNA-like behaviour from snoRNA, tRNA and long non-coding RNA precursors, potentially dependent on the miRNA biogenesis pathway.
    [Show full text]
  • André Guollo
    AVALIAÇÃO DO PERFIL DE EXPRESSÃO GÊNICA GLOBAL NA PROGRESSÃO DE LEUCOPLASIA VERRUCOSA PROLIFERATIVA ORAL ANDRÉ GUOLLO Dissertação apresentada à Fundação Antônio Prudente para a obtenção do título de Mestre em Ciências Área de Concentração: Oncologia Orientador: Prof. Dr. Fábio de Abreu Alves Co-Orientadora: Prof.ª Dra. Silvia Regina Rogatto São Paulo 2016 FICHA CATALOGRÁFICA Preparada pela Biblioteca da Fundação Antônio Prudente Guollo, André Avaliação do perfil de expressão gênica global na progressão de leucoplasia verrucosa proliferativa oral / André Guollo – São Paulo, 2016. 103p. Dissertação (Mestrado)-Fundação Antônio Prudente. Curso de Pós-Graduação em Ciências - Área de concentração: Oncologia. Orientador: Fábio de Abreu Alves Descritores: 1. LEUCOPLASIA/genética. 2. LEUCOPLASIA/etiologia. 3. NEOPLASIAS BUCAIS/diagnóstico. 4. INFECÇÕES POR PAPILLOMAVIRUS. “With a little help from my friends” Paul and John DEDICATÓRIA Dedico este trabalho a minha mãe Leda e meu pai Nelsi. E a todos os pacientes portadores desta condição. AGRADECIMENTOS Ao meu orientador Prof. Dr. Fábio de Abreu Alves, que abriu as portas do seu serviço para eu poder realizar minha formação, mudando minha carreira e minha vida. E pelo exemplo de profissionalismo e dedicação. A minha co-orientadora Profª Dra. Silvia Regina Rogatto, pela colaboração indispensável neste projeto, pelos conselhos, e aceitar que eu participasse da rotina do seu laboratório. E pelo exemplo de liderança. A minha família, minha mãe Leda, meu pai Nelsi, minha irmã Renata e meu irmão Felipe, por entenderem minha longa ausência, e por apoiarem minhas decisões. A minha namorada Fernanda Mariá, pelo Amor, pelo carinho, pela ajuda, e por estar sempre ao meu lado.
    [Show full text]