CYCLONET—An Integrated Database on Cell Cycle Regulation And

Total Page:16

File Type:pdf, Size:1020Kb

CYCLONET—An Integrated Database on Cell Cycle Regulation And D550–D556 Nucleic Acids Research, 2007, Vol. 35, Database issue doi:10.1093/nar/gkl912 CYCLONET—an integrated database on cell cycle regulation and carcinogenesis Fedor Kolpakov1,2, Vladimir Poroikov3, Ruslan Sharipov1,2,4,*, Yury Kondrakhin1,2, Alexey Zakharov3, Alexey Lagunin3, Luciano Milanesi5 and Alexander Kel6 1Institute of Systems Biology, 15, Detskiy proezd, Novosibirsk 630090, Russia, 2Design Technological Institute of Digital Techniques, Siberian Branch of Russian Academy of Sciences, 6, Institutskaya, Novosibirsk 630090, Russia, 3Institute of Biomedical Chemistry of Russian Academy of Medical Sciences, 10, 4 Pogodinskaya Street, Moscow 119121, Russia, Institute of Cytology and Genetics, Siberian Branch of Downloaded from Russian Academy of Sciences, 10, Lavrentyev Avenue, Novosibirsk 630090, Russia, 5CNR-Institute of Biomedical Technologies, 93, Via Fratelli Cervi, Segrate (MI) 20090, Italy and 6BIOBASE GmbH, 33, Halchtersche Strasse, Wolfenbuettel 38304, Germany Received August 16, 2006; Revised October 12, 2006; Accepted October 13, 2006 http://nar.oxfordjournals.org/ ABSTRACT INTRODUCTION Computational modelling of mammalian cell cycle The main goal of the Cyclonet database is to integrate information from genomics, proteomics, chemoinformatics regulation is a challenging task, which requires at McGill University Libraries on December 28, 2012 comprehensive knowledge on many interrelated and systems biology on mammalian cell cycle regulation in processes in the cell. We have developed a web- normal and pathological states. This will help molecular biologists working in the field of anticancer drug develop- based integrated database on cell cycle regulation ment to analyze systematically all these data and generate in mammals in normal and pathological states experimentally testable hypotheses (Figure 1). (Cyclonet database). It integrates data obtained by Cyclonet incorporates data on various carcinogenesis related ‘omics’ sciences and chemoinformatics on the topics, such as: cell cycle control in mammals (Figure 2), basis of systems biology approach. Cyclonet is a cell survival programs (e.g. NF-kB pathway), regulation of specialized resource, which enables researchers covalent histone modifications and chromatin remodelling in working in the field of anticancer drug discovery cell cycle, DNA methylation and other epigenetic mechanisms to analyze the wealth of currently available informa- of cell growth and differentiation. Biological pathways, tion in a systematic way. Cyclonet contains informa- computer models of cell cycle, microarray data coming from tion on relevant genes and molecules; diagrams studies of cell cycle and analysis of cancer-related materials and models of cell cycle regulation and results of are also systematically collected in this database (1) (http:// www.impb.ru/~rcdl2004/cgi/get_paper_pdf.cgi?pid=30). their simulation; microarray data on cell cycle and Cyclonet supports discovery of novel drug targets and on various types of cancer, information on drug development of effective anticancer therapies by collecting targets and their ligands, as well as extensive biblio- all available data related to the control of cell cycle in graphy on modelling of cell cycle and cancer-related normal and pathological states and providing a system bio- gene expression data. The Cyclonet database is also logy platform for knowledge-based anticancer drug accessible through the BioUML workbench, which discovery. allows flexible querying, analyzing and editing the Novel software technologies were used for the database data by means of visual modelling. Cyclonet aims development: to predict promising anticancer targets and their (i) the BioUML workbench [http://www.biouml.org, (2,3)] agents by application of Prediction of Activity was used for formal description and visual modelling of Spectra for Substances. The Cyclonet database is biological pathways and processes related to the cell available at http://cyclonet.biouml.org. cycle regulation and cancer (Figure 2). It also allows to *To whom correspondence should be addressed. Tel/Fax: +7 383 3303070; Email: [email protected] Ó 2006 The Author(s). This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/ by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. Nucleic Acids Research, 2007, Vol. 35, Database issue D551 simulate the behaviour of the described systems using THE CYCLONET DATABASE STRUCTURE Java or MATLAB simulation engines; AND CONTENT (ii) BeanExplorer Enterprise Edition (http://www. The Cyclonet database consists of three main components beanexplorer.com) was used to develop web interface (see Table 1): for the Cyclonet database (Figure 3). (i) diagrams and models of biological pathways (metabolic pathways, signal transduction pathways and gene networks) involved in cell cycle regulation and carcinogenesis; (ii) microarray original data and results of their statistical analysis; (iii) chemoinformatics data—drug targets, ligands and phar- macological activities for cancer treatment. Downloaded from The Cyclonet database is organized as a relational database (MySQL DBMS). All sections contain a number of tables that are highly interconnected through crossreferences. Such elaborated relational schema enables complex queries com- bining various types of information. http://nar.oxfordjournals.org/ Data in Cyclonet are compiled mainly by manual literature annotation. Links to the public databases, such as, Gene- Ontology (4), RefSeq (5) and Ensembl (6) are provided from genes, proteins and other respective entries. Cyclonet Figure 1. Diagrams and models of carcinogenesis related processes as the also contains a vast body of literature references that are basis for information integration in the Cyclonet database. arranged by categories. at McGill University Libraries on December 28, 2012 Figure 2. An example of cell cycle model visualization and simulation by the BioUML workbench (the diagram DGR0068a of the Cyclonet database). D552 Nucleic Acids Research, 2007, Vol. 35, Database issue Downloaded from http://nar.oxfordjournals.org/ at McGill University Libraries on December 28, 2012 Figure 3. Web interface of the Cyclonet database generated by BeanExplorer technology. Top screen displays fragment of microarray series classification in the Cyclonet database, bottom left screen demonstrates a fragment of the list of pharmacological activities for anticancer therapy, bottom right—examples of chemical structure of two tubulin antagonists. Biological pathways Models We use BioUML for formal description of signal transduction The BioUML technology was also used for visual modelling pathways and gene networks involved in cell cycle regu- of cell cycle regulation. Known cell cycle models were lation and carcinogenesis (2,3,7). Cyclonet pathways section imported from SBML (8) and CellML (9) model repositories. allows to store the detailed description of biological pathways, We added into Cyclonet several new recent models by their components, models as well as the results of simulation. manual annotation of respective literature sources. We also We are using several diagram types of BioUML to describe created our own novel model of regulation of G1/S transition cell cycle regulation and carcinogenesis: of cell cycle. Currently, Cyclonet contains 37 models of cell cycle regu- (i) semantic networks describing relationships between the lation. All models can be classified into two groups: main concepts (for example, G1 phase, G1/S transition, (i) general models that simulate behaviour of rather small sys- mitotic checkpoints) and components of cell cycle tems including abstract objects that reflect real biological com- regulation; ponents in the cell; (ii) ‘portrait’ models that try to simulate (ii) pathways describing structure of cell cycle regulatory different sub-processes in cell cycle and include real genes, networks as compartmentalized graphs. We have classi- proteins and other cellular components. We validated each fied the annotated networks into a number of categories model by using the BioUML simulation engine and comparing that describe different parts of cell cycle regulatory the results with the published results. The results of such networks in details (for example, a network that provides simulations were then stored in the Cyclonet database. These G1/S transition, NF-kB signal transduction pathway and data can be displayed as graphs by the BioUML workbench its influence on apoptosis and others). (Figure 2) or web isnterface generated by BeanExplorer EE. Nucleic Acids Research, 2007, Vol. 35, Database issue D553 Table 1. Number of entries for the main blocks (tables, sections) of the were loaded into the Cyclonet database and analysed. We did Cyclonet database a statistical analysis as well as meta-analysis of the data (see Supplementary Data) and obtained 33 gene lists (IDs Sections of the Cyclonet database Number of Entries GL0001–GL0033 in ‘Microarray data and results’ of Cyclonet) that belong to several categories: Biological pathways Compartments: cellular and organism compartments, 120 (i) lists of genes periodically expressed during cell cycle for example, cytoplasm, liver, blood (GL0007, GL0020 and GL0021) (15,16); Concepts: biological processes, states, functions etc. 987 (ii) lists of genes whose expression is changing mono- For example, G1
Recommended publications
  • Web-Link Name Reference DRSC Flockhart I, Booker M, Kiger A, Et Al.: Flyrnai: the Drosophila Rnai Screening Center Database
    web-link Name Reference http://flyrnai.org/cgi-bin/RNAi_screens.pl DRSC Flockhart I, Booker M, Kiger A, et al.: FlyRNAi: the Drosophila RNAi screening center database. Nucleic Acids Res. 34(Database issue):D489-94 (2006) http://flybase.bio.indiana.edu/ FlyBase Grumbling G, Strelets V.: FlyBase: anatomical data, images and queries. Nucleic Acids Res. 34(Database issue):D484-8 (2006) http://rnai.org/ RNAiDB Gunsalus KC, Yueh WC, MacMenamin P, et al.: RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects. Nucleic Acids Res. 32(Database issue):D406-10 (2004) http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIMOMIM Hamosh A, Scott AF, Amberger JS, et al.: Online Mendelian Inheritance in Man (OMIM), a knowledgebase of human genes and genetic disorders. Nucleic Acids Res. 33(Database issue):D514-7 (2005) http://www.phenomicdb.de/ PhenomicDB Kahraman A, Avramov A, Nashev LG, et al.: PhenomicDB: a multi-species genotype/phenotype database for comparative phenomics. Bioinformatics 21(3):418-20 (2005) http://www.worm.mpi-cbg.de/phenobank2/cgi-bin/MenuPage.pyPhenoBank http://www.informatics.jax.org/ MGI - Mouse Genome Informatics Eppig JT, Bult CJ, Kadin JA, et al.: The Mouse Genome Database (MGD): from genes to mice--a community resource for mouse biology. Nucleic Acids Res. 33(Database issue):D471-5 (2005) http://flight.licr.org/ FLIGHT Sims D, Bursteinas B, Gao Q, et al.: FLIGHT: database and tools for the integration and cross-correlation of large-scale RNAi phenotypic datasets. Nucleic Acids Res. 34(Database issue):D479-83 (2006) http://www.wormbase.org/ WormBase Schwarz EM, Antoshechkin I, Bastiani C, et al.: WormBase: better software, richer content.
    [Show full text]
  • A Bayesian Inference Transcription Factor Activity Model for the Analysis of Single-Cell Transcriptomes
    Downloaded from genome.cshlp.org on October 7, 2021 - Published by Cold Spring Harbor Laboratory Press Method A Bayesian inference transcription factor activity model for the analysis of single-cell transcriptomes Shang Gao,1,2,3 Yang Dai,1 and Jalees Rehman1,2,3,4 1Department of Bioengineering, 2Department of Medicine, Division of Cardiology, 3Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, Illinois 60612, USA; 4University of Illinois Cancer Center, Chicago, Illinois 60612, USA Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful experimental approach to study cellular heterogeneity. One of the challenges in scRNA-seq data analysis is integrating different types of biological data to consistently recognize discrete biological functions and regulatory mechanisms of cells, such as transcription factor activities and gene regulatory networks in distinct cell populations. We have developed an approach to infer transcription factor activities from scRNA-seq data that leverages existing biological data on transcription factor binding sites. The Bayesian inference tran- scription factor activity model (BITFAM) integrates ChIP-seq transcription factor binding information into scRNA-seq data analysis. We show that the inferred transcription factor activities for key cell types identify regulatory transcription factors that are known to mechanistically control cell function and cell fate. The BITFAM approach not only identifies bio- logically meaningful transcription factor activities,
    [Show full text]
  • Data Management in Systems Biology I
    Data management in systems biology I – Overview and bibliography Gerhard Mayer, University of Stuttgart, Institute of Biochemical Engineering (IBVT), Allmandring 31, D-70569 Stuttgart Abstract Large systems biology projects can encompass several workgroups often located in different countries. An overview about existing data standards in systems biology and the management, storage, exchange and integration of the generated data in large distributed research projects is given, the pros and cons of the different approaches are illustrated from a practical point of view, the existing software – open source as well as commercial - and the relevant literature is extensively overviewed, so that the reader should be enabled to decide which data management approach is the best suited for his special needs. An emphasis is laid on the use of workflow systems and of TAB-based formats. The data in this format can be viewed and edited easily using spreadsheet programs which are familiar to the working experimental biologists. The use of workflows for the standardized access to data in either own or publicly available databanks and the standardization of operation procedures is presented. The use of ontologies and semantic web technologies for data management will be discussed in a further paper. Keywords: MIBBI; data standards; data management; data integration; databases; TAB-based formats; workflows; Open Data INTRODUCTION the foundation of a new journal about biological The large amount of data produced by biological databases [24], the foundation of the ISB research projects grows at a fast rate. The 2009 (International Society for Biocuration) and special edition of the annual Nucleic Acids Research conferences like DILS (Data Integration in the Life database issue mentions 1170 databases [1]; alone Sciences) [25].
    [Show full text]
  • Standards and Tools for Model Exchange and Analysis in Systems Biology
    Standards and Tools for Model Exchange and Analysis in Systems Biology Ralph Gauges Dissertation submitted to the Combined Faculties for the Natural Sciences and for Mathematics of the Ruperto-Carola University of Heidelberg, Germany for the degree of Doctor of Natural Sciences presented by Diplom-Biochemiker Ralph Gauges born in: Sigmaringen, Germany Oral-examination: 07/11/2011 Standards and Tools for Model Exchange and Analysis in Systems Biology Referees: Prof. Dr. Ursula Kummer Dr. Rebecca Wade Contents Zusammenfassung vii Summary x Abbreviations xvii 1 Introduction 1 2 Materials & Methods 19 2.1 Operating Systems . 19 2.2 Programming Languages . 20 2.3 Unit Testing . 24 2.4 Debugging & Profiling Tools . 26 2.5 Libraries & Standards . 29 3 SBML Layout & Render Extension 46 3.1 SBML & Diagrams . 46 3.2 Alternative Diagram Formats . 47 3.3 Design & History . 49 3.4 The SBML Layout Extension Specification . 51 3.5 Implementation Of The Layout Extension . 57 3.6 The SBML Render Extension . 61 3.7 Third Party Implementations . 86 3.8 The SBML Layout And Render Extension In NF-κB Modeling 87 4 Standards In COPASI 93 4.1 SBML Support In COPASI . 93 4.2 Layout And Render Information In COPASI . 112 4.3 Graphical Display Of Time Course Simulation Data . 115 4.4 Graphical Display Of Elementary Modes . 116 4.5 COPASI Language Bindings . 117 4.6 NF-κB Modeling with COPASI . 120 i ii CONTENTS 4.7 Work Contributions . 128 5 Expression Normalization 130 5.1 Normal Form Classes . 131 5.2 Expression Tree Classes . 137 5.3 Normalization Algorithm .
    [Show full text]
  • A Modular Mathematical Model of Exercise-Induced Changes In
    bioRxiv preprint doi: https://doi.org/10.1101/2021.05.31.446385; this version posted May 31, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission. 1 A modular mathematical model of 2 exercise-induced changes in metabolism, 3 signaling, and gene expression in human 4 skeletal muscle 5 6 Akberdin I.R.1,2,3,4, Kiselev I.N.1,4,5, Pintus S.S.1,4,5, Sharipov, R.N.1,3,4,5, Vertyshev A.Yu.6, 7 Vinogradova O.L.7, Popov D.V.7, Kolpakov F.A.1,4,5 8 9 1 BIOSOFT.RU, LLC, Novosibirsk, Russian Federation, [email protected] 10 2 Federal Research Center Institute of Cytology and Genetics SB RAS, Novosibirsk, Russia 11 3 Novosibirsk State University, Novosibirsk, Russia 12 4 Sirius University of Science and Technology, Sochi, Russia 13 5 Federal Research Center for Information and Computational Technologies, Novosibirsk, Russia5 14 6 CJSC "Sites-Tsentr", Moscow, Russia 15 7 Institute of Biomedical Problems of the Russian Academy of Sciences, Moscow, Russia 16 17 Abstract 18 Skeletal muscle is the principal contributor to exercise-induced changes in human 19 metabolism. Strikingly, although it has been demonstrated that a lot of 20 metabolites accumulating in blood and human skeletal muscle during an exercise 21 activate different signaling pathways and induce expression of many genes in 22 working muscle fibres, the system understanding of signaling-metabolic 23 pathways interrelations with downstream genetic regulation in the skeletal 24 muscle is still elusive.
    [Show full text]
  • View
    Boyarskikh et al. BMC Medical Genomics 2018, 11(Suppl 1):12 DOI 10.1186/s12920-018-0330-5 RESEARCH Open Access Computational master-regulator search reveals mTOR and PI3K pathways responsible for low sensitivity of NCI-H292 and A427 lung cancer cell lines to cytotoxic action of p53 activator Nutlin-3 Ulyana Boyarskikh1, Sergey Pintus2, Nikita Mandrik2, Daria Stelmashenko2, Ilya Kiselev2, Ivan Evshin2, Ruslan Sharipov2, Philip Stegmaier3, Fedor Kolpakov2, Maxim Filipenko1 and Alexander Kel1,2,3* From Belyaev Conference Novosibirsk, Russia. 07-10 August 2017 Abstract Background: Small molecule Nutlin-3 reactivates p53 in cancer cells by interacting with the complex between p53 and its repressor Mdm-2 and causing an increase in cancer cell apoptosis. Therefore, Nutlin-3 has potent anticancer properties. Clinical and experimental studies of Nutlin-3 showed that some cancer cells may lose sensitivity to this compound. Here we analyze possible mechanisms for insensitivity of cancer cells to Nutlin-3. Methods: We applied upstream analysis approach implemented in geneXplain platform (genexplain.com) using TRANSFAC® database of transcription factors and their binding sites in genome and using TRANSPATH® database of signal transduction network with associated software such as Match™ and Composite Module Analyst (CMA). Results: Using genome-wide gene expression profiling we compared several lung cancer cell lines and showed that expression programs executed in Nutlin-3 insensitive cell lines significantly differ from that of Nutlin-3 sensitive cell lines. Using artificial intelligence approach embed in CMA software, we identified a set of transcription factors cooperatively binding to the promoters of genes up-regulated in the Nutlin-3 insensitive cell lines.
    [Show full text]
  • Comprehensive Modeling Platform for Photosynthetic Organisms
    MASARYK UNIVERSITY FACULTY OF INFORMATICS COMPREHENSIVE MODELING PLATFORM FOR PHOTOSYNTHETIC ORGANISMS THESIS MATEJ KLEMENT 2012 Contents 1 Introduction4 1.1 Objectives..................................5 2 State of the art6 2.1 Data Exchange formats..........................8 2.1.1 SBML................................8 2.1.2 CellML................................9 2.1.3 BioPAX................................9 2.1.4 PSI-MI................................ 10 2.1.5 SBGN................................ 10 2.1.6 Format of Matlab.......................... 11 2.1.7 Format Octave........................... 11 2.2 Data Exchange and modeling tools................... 11 2.2.1 Biomodels.net........................... 12 2.2.2 CellML.org.............................. 13 2.2.3 Copasi................................ 13 2.2.4 Vcell................................. 13 2.2.5 E-cell................................. 14 2.2.6 ProMot................................ 14 2.2.7 PaxTools............................... 15 2.2.8 Matlab................................ 15 2.2.9 Octave................................ 16 2.2.10Scilab................................ 16 2.2.11BioUML............................... 16 2.3 Annotation ontology............................ 17 2.3.1 Gene Ontology........................... 17 2.3.2 KEGG................................ 17 2.3.3 SBO................................. 17 2.4 Photosynthesis modeling......................... 17 3 Aims 18 3.1 Theoretical aims.............................. 18 3.2 Practical aims..............................
    [Show full text]
  • Synthetic Animal: Trends in Animal Breeding and Genetics
    Open Access Insights in Biology and Medicine Review Article Synthetic Animal: Trends in Animal Breeding and Genetics ISSN 1 2 2639-6769 Abolfazl Bahrami * and Ali Najafi 1Department of Animal Science, University College of Agriculture and Natural Resources, University of Tehran, Karaj, Iran 2Molecular Biology Research Center, Baqiyatallah University of Medical Sciences, Tehran, Iran *Address for Correspondence: A Bahrami, Abstract Department of Animal Science, Tehran University, Karaj, I.R. Iran, Tel/Fax: +98 9199300065; Email: Synthetic biology is an interdisciplinary branch of biology and engineering. The subject [email protected] combines various disciplines from within these domains, such as biotechnology, evolutionary Submitted: 31 December 2018 biology, molecular biology, systems biology, biophysics, computer engineering, and genetic Approved: 10 January 2019 engineering. Synthetic biology aims to understand whole biological systems working as a unit, Published: 11 January 2019 rather than investigating their individual components and design new genome. Signifi cant advances have been made using systems biology and synthetic biology approaches, especially in the fi eld Copyright: © 2018 Bahrami A, et al. This is of bacterial and eukaryotic cells. Similarly, progress is being made with ‘synthetic approaches’ in an open access article distributed under the genetics and animal sciences, providing exciting opportunities to modulate, genome design and Creative Commons Attribution License, which permits unrestricted use, distribution, and fi nally synthesis animal for favorite traits. reproduction in any medium, provided the original work is properly cited Keywords: Synthetic biology; Systems biology; Introduction Synthetic approaches; Genetic engineering Animal breeding In 1859, Charles Darwin published his book ‘On the origin of species’, based on the indings that he collected during his voyage on ‘the Beagle’ [1].
    [Show full text]
  • Systems Biotechnology: an Emerging Trend in Metabolic Engineering Of
    Research Article OPEN ACCESS Freely available online doi:10.4172/jcsb.1000054 JCSB/Vol.3 Issue 2 Systems Biotechnology: an Emerging Trend in Metabolic Engineering of Industrial Microorganisms Chellapandi P1*, Sivaramakrishnan S2 and Viswanathan M.B3 1Department of Bioinformatics, School of Life Sciences, Bharathidasan University, Tiruchirappalli-620024, Tamilnadu, India 2Department of Biotechnology, School of Life Sciences, Bharathidasan University, Tiruchirappalli-620024, Tamilnadu, India 3Department of Plant Sciences, School of Life Sciences, Bharathidasan University, Tiruchirappalli-620024, Tamilnadu, India Abstract The improvement of production processes to achieve commercially viable production levels is a prerequisite to any bioprocess. Currently, production strains are enhanced using a combination of random and targeted approach. By combining metabolomics technology and genome data analysis, it is possible to replace empirical target-selection strategies with a more scientifi c approach. All steps of biotechnological development, from up-stream and mid-stream to down-stream processes will benefi t signifi cantly by taking systems biotechnological approaches. The prevalence of genome sequence information, in concert with modern molecular biology advances, should have facilitated the easy manipulation of specifi c genes and pathways for the production of microbial metabolites. A notable success has been made on designing optimized production systems that maximize productivity and minimize raw materials costs for valuable metabolites. A remarkable advantage of this approach up-to-data and its relevant web resources is critically reviewed in this article. Indeed, this alternative approach will not only hopefully be useful for improving the productivity of many meaningful metabolites including antibiotics, enzymes, organic acids, etc. from industrially signifi cant microorganisms but also will ensure correlation of many experimental reliabilities.
    [Show full text]
  • A Bayesian Inference Transcription Factor Activity Model for the Analysis of Single-Cell Transcriptomes
    Downloaded from genome.cshlp.org on October 5, 2021 - Published by Cold Spring Harbor Laboratory Press Method A Bayesian inference transcription factor activity model for the analysis of single-cell transcriptomes Shang Gao,1,2,3 Yang Dai,1 and Jalees Rehman1,2,3,4 1Department of Bioengineering, 2Department of Medicine, Division of Cardiology, 3Department of Pharmacology and Regenerative Medicine, University of Illinois at Chicago, Chicago, Illinois 60612, USA; 4University of Illinois Cancer Center, Chicago, Illinois 60612, USA Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful experimental approach to study cellular heterogeneity. One of the challenges in scRNA-seq data analysis is integrating different types of biological data to consistently recognize discrete biological functions and regulatory mechanisms of cells, such as transcription factor activities and gene regulatory networks in distinct cell populations. We have developed an approach to infer transcription factor activities from scRNA-seq data that leverages existing biological data on transcription factor binding sites. The Bayesian inference tran- scription factor activity model (BITFAM) integrates ChIP-seq transcription factor binding information into scRNA-seq data analysis. We show that the inferred transcription factor activities for key cell types identify regulatory transcription factors that are known to mechanistically control cell function and cell fate. The BITFAM approach not only identifies bio- logically meaningful transcription factor activities,
    [Show full text]