A Biological Network-Based Regularized Artificial Neural Network Model for Robust Phenotype Prediction from Gene Expression Data
Total Page:16
File Type:pdf, Size:1020Kb
Load more
Recommended publications
-
Simplified Computational Model for Generating Bio- Logical Networks
Simplified Computational Model for Generating Bio- logical Networks† Matthew H J Bailey,∗ David Ormrod Morley,∗ and Mark Wilson∗ A method to generate and simulate biological networks is discussed. An expanded Wooten- Winer-Weaire bond switching methods is proposed which allows for a distribution of node degrees in the network while conserving the mean average node degree. The networks are characterised in terms of their polygon structure and assortativities (a measure of local ordering). A wide range of experimental images are analysed and the underlying networks quantified in an analogous manner. Limitations in obtaining the network structure are discussed. A “network landscape” of the experimentally observed and simulated networks is constructed from the underlying metrics. The enhanced bond switching algorithm is able to generate networks spanning the full range of experimental observations. Two dimensional random networks are observed in a range of 17 contexts across considerably different length scales in nature: to be limited to avoid damaging the delicate networks . How- from nanometres, in the form of amorphous graphene; to me- ever, even when a high-quality image is obtained, there is further tres, in the form of the Giant’s causeway; to tens of kilometres, in difficulty in analysis. For example, each edge in a network is a the form of geopolitical borders 1–3. A framework for describing complex molecule made up of tens of thousands of atoms which these continuous random networks for chemical systems was first may not lie strictly in a single plane. In addition, the most inter- introduced by Zachariasen to describe silica-like glasses, and has esting dynamic behaviour often occurs over very long timescales proved to be extremely versatile in the years since 4, being used to – potentially decades. -
Gene Regulatory Networks
Gene Regulatory Networks 02-710 Computaonal Genomics Seyoung Kim Transcrip6on Factor Binding Transcrip6on Control • Gene transcrip.on is influenced by – Transcrip.on factor binding affinity for the regulatory regions of target genes – Transcrip.on factor concentraon – Nucleosome posi.oning and chroman states – Enhancer ac.vity Gene Transcrip6onal Regulatory Network • The expression of a gene is controlled by cis and trans regulatory elements – Cis regulatory elements: DNA sequences in the regulatory region of the gene (e.g., TF binding sites) – Trans regulatory elements: RNAs and proteins that interact with the cis regulatory elements Gene Transcrip6onal Regulatory Network • Consider the following regulatory relaonships: Target gene1 Target TF gene2 Target gene3 Cis/Trans Regulatory Elements Binding site: cis Target TF regulatory element TF binding affinity gene1 can influence the TF target gene Target gene2 expression Target TF gene3 TF: trans regulatory element TF concentra6on TF can influence the target gene TF expression Gene Transcrip6onal Regulatory Network • Cis and trans regulatory elements form a complex transcrip.onal regulatory network – Each trans regulatory element (proteins/RNAs) can regulate mul.ple target genes – Cis regulatory modules (CRMs) • Mul.ple different regulators need to be recruited to ini.ate the transcrip.on of a gene • The DNA binding sites of those regulators are clustered in the regulatory region of a gene and form a CRM How Can We Learn Transcriponal Networks? • Leverage allele specific expressions – In diploid organisms, the transcript levels from the two copies of the genes may be different – RNA-seq can capture allele- specific transcript levels How Can We Learn Transcriponal Networks? • Leverage allele specific gene expressions – Teasing out cis/trans regulatory divergence between two species (WiZkopp et al. -
Biological Network Approaches and Applications in Rare Disease Studies
G C A T T A C G G C A T genes Review Biological Network Approaches and Applications in Rare Disease Studies Peng Zhang 1,* and Yuval Itan 2,3 1 St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA 2 The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; [email protected] 3 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA * Correspondence: [email protected]; Tel.: +1-646-830-6622 Received: 3 September 2019; Accepted: 10 October 2019; Published: 12 October 2019 Abstract: Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases. Keywords: biological network; bioinformatics; database; software; application; rare diseases 1. Introduction Network biology provides insights into complex biological systems and can reveal informative patterns within these systems through the integration of biological omics data (e.g., genome, transcriptome, proteome, and metabolome) and biological interactome data (e.g., protein-protein interactions and gene-gene associations). -
Spectrum of Mutations and Genotype ± Phenotype Analysis in Currarino Syndrome
European Journal of Human Genetics (2001) 9, 599 ± 605 ã 2001 Nature Publishing Group All rights reserved 1018-4813/01 $15.00 www.nature.com/ejhg ARTICLE Spectrum of mutations and genotype ± phenotype analysis in Currarino syndrome Joachim KoÈchling1, Mohsen Karbasiyan2 and Andre Reis*,2,3 1Department of Pediatric Oncology/Hematology, ChariteÂ, Humboldt University, Berlin, Germany; 2Institute of Human Genetics, ChariteÂ, Humboldt University, Berlin, Germany; 3Institute of Human Genetics, Friedrich- Alexander University Erlangen-NuÈrnberg, Erlangen, Germany The triad of a presacral tumour, sacral agenesis and anorectal malformation constitutes the Currarino syndrome which is caused by dorsal-ventral patterning defects during embryonic development. The syndrome occurs in the majority of patients as an autosomal dominant trait associated with mutations in the homeobox gene HLXB9 which encodes the nuclear protein HB9. However, genotype ± phenotype analyses have been performed only in a few families and there are no reports about the specific impact of HLXB9 mutations on HB9 function. We performed a mutational analysis in 72 individuals from nine families with Currarino syndrome. We identified a total of five HLXB9 mutations, four novel and one known mutation, in four out of four families and one out of five sporadic cases. Highly variable phenotypes and a low penetrance with half of all carriers being clinically asymptomatic were found in three families, whereas affected members of one family showed almost identical phenotypes. However, an obvious genotype ± phenotype correlation was not found. While HLXB9 mutations were diagnosed in 23 patients, no mutation or microdeletion was detected in four sporadic patients with Currarino syndrome. The distribution pattern of here and previously reported HLXB9 mutations indicates mutational predilection sites within exon 1 and the homeobox. -
Solutions for Practice Problems for Molecular Biology, Session 5
Solutions to Practice Problems for Molecular Biology, Session 5: Gene Regulation and the Lac Operon Question 1 a) How does lactose (allolactose) promote transcription of LacZ? 1) Lactose binds to the polymerase and increases efficiency. 2) Lactose binds to a repressor protein, and alters its conformation to prevent it from binding to the DNA and interfering with the binding of RNA polymerase. 3) Lactose binds to an activator protein, which can then help the RNA polymerase bind to the promoter and begin transcription. 4) Lactose prevents premature termination of transcription by directly binding to and bending the DNA. Solution: 2) Lactose binds to a repressor protein, and alters its conformation to prevent it from binding to the DNA and interfering with the binding of RNA polymerase. b) What molecule is used to signal low glucose levels to the Lac operon regulatory system? 1) Cyclic AMP 2) Calcium 3) Lactose 4) Pyruvate Solution: 1) Cyclic AMP. Question 2 You design a summer class where you recreate experiments studying the lac operon in E. coli (see schematic below). In your experiments, the activity of the enzyme b-galactosidase (β -gal) is measured by including X-gal and IPTG in the growth media. X-gal is a lactose analog that turns blue when metabolisize by b-gal, but it does not induce the lac operon. IPTG is an inducer of the lac operon but is not metabolized by b-gal. I O lacZ Plac Binding site for CAP Pi Gene encoding β-gal Promoter for activator protein Repressor (I) a) Which of the following would you expect to bind to β-galactosidase? Circle all that apply. -
Evolutionary Developmental Biology 573
EVOC20 29/08/2003 11:15 AM Page 572 Evolutionary 20 Developmental Biology volutionary developmental biology, now often known Eas “evo-devo,” is the study of the relation between evolution and development. The relation between evolution and development has been the subject of research for many years, and the chapter begins by looking at some classic ideas. However, the subject has been transformed in recent years as the genes that control development have begun to be identified. This chapter looks at how changes in these developmental genes, such as changes in their spatial or temporal expression in the embryo, are associated with changes in adult morphology. The origin of a set of genes controlling development may have opened up new and more flexible ways in which evolution could occur: life may have become more “evolvable.” EVOC20 29/08/2003 11:15 AM Page 573 CHAPTER 20 / Evolutionary Developmental Biology 573 20.1 Changes in development, and the genes controlling development, underlie morphological evolution Morphological structures, such as heads, legs, and tails, are produced in each individual organism by development. The organism begins life as a single cell. The organism grows by cell division, and the various cell types (bone cells, skin cells, and so on) are produced by differentiation within dividing cell lines. When one species evolves into Morphological evolution is driven another, with a changed morphological form, the developmental process must have by developmental evolution changed too. If the descendant species has longer legs, it is because the developmental process that produces legs has been accelerated, or extended over time. -
Myths and Legends of the Baldwin Effect
Myths and Legends of the Baldwin Effect Peter Turney Institute for Information Technology National Research Council Canada Ottawa, Ontario, Canada, K1A 0R6 [email protected] Abstract ary computation when there is an evolving population of learning individuals (Ackley and Littman, 1991; Belew, This position paper argues that the Baldwin effect 1989; Belew et al., 1991; French and Messinger, 1994; is widely misunderstood by the evolutionary Hart, 1994; Hart and Belew, 1996; Hightower et al., 1996; computation community. The misunderstandings Whitley and Gruau, 1993; Whitley et al., 1994). This syn- appear to fall into two general categories. Firstly, ergetic effect is usually called the Baldwin effect. This has it is commonly believed that the Baldwin effect is produced the misleading impression that there is nothing concerned with the synergy that results when more to the Baldwin effect than synergy. A myth or legend there is an evolving population of learning indi- has arisen that the Baldwin effect is simply a special viduals. This is only half of the story. The full instance of synergy. One of the goals of this paper is to story is more complicated and more interesting. dispel this myth. The Baldwin effect is concerned with the costs Roughly speaking (we will be more precise later), the and benefits of lifetime learning by individuals in Baldwin effect has two aspects. First, lifetime learning in an evolving population. Several researchers have individuals can, in some situations, accelerate evolution. focussed exclusively on the benefits, but there is Second, learning is expensive. Therefore, in relatively sta- much to be gained from attention to the costs. -
Principles of Differentiation and Morphogenesis
Swarthmore College Works Biology Faculty Works Biology 2016 Principles Of Differentiation And Morphogenesis Scott F. Gilbert Swarthmore College, [email protected] R. Rice Follow this and additional works at: https://works.swarthmore.edu/fac-biology Part of the Biology Commons Let us know how access to these works benefits ouy Recommended Citation Scott F. Gilbert and R. Rice. (2016). 3rd. "Principles Of Differentiation And Morphogenesis". Epstein's Inborn Errors Of Development: The Molecular Basis Of Clinical Disorders On Morphogenesis. 9-21. https://works.swarthmore.edu/fac-biology/436 This work is brought to you for free by Swarthmore College Libraries' Works. It has been accepted for inclusion in Biology Faculty Works by an authorized administrator of Works. For more information, please contact [email protected]. 2 Principles of Differentiation and Morphogenesis SCOTT F. GILBERT AND RITVA RICE evelopmental biology is the science connecting genetics with transcription factors, such as TFHA and TFIIH, help stabilize the poly anatomy, making sense out of both. The body builds itself from merase once it is there (Kostrewa et al. 2009). Dthe instructions of the inherited DNA and the cytoplasmic system that Where and when a gene is expressed depends on another regula interprets the DNA into genes and creates intracellular and cellular tory unit of the gene, the enhancer. An enhancer is a DNA sequence networks to generate the observable phenotype. Even ecological fac that can activate or repress the utilization of a promoter, controlling tors such as diet and stress may modify the DNA such that different the efficiency and rate of transcription from that particular promoter. -
Lamarck's Ladder
Lamarck’s Ladder Overview Darwinism is a theory of biological evolution developed by the English naturalist Charles Darwin (1809–1882) and others. It states that all species of organisms arise and develop through the natural selection of small, inherited variations that increase the individual's ability to compete, survive, and reproduce. The environment plays a critical role in the natural selection process of Darwinian evolution through the phenotypic variation produced over time by small genetic changes to the DNA sequence, such as beneficial random mutations. However, the vast majority of environmental factors cannot directly alter DNA sequence. Jean-Baptiste Pierre Antoine de Monet, chevalier de Lamarck (1744 – 1829), often known simply as Lamarck, was a French naturalist. Before Darwin, Lamarck was one of the first scientists to support the Theory of Evolution. His theory was that species evolved based on the behavior of their parents. His hypothesis stated that an organism can pass on phenotypes acquired because of environmental pressures within their lifetime and pass them to its offspring. One of his most famous examples is his explanation of a giraffe’s long neck. He proposed that the continuous stretching of the neck by the giraffe to reach higher and higher leaves caused the muscles in the neck to strengthen and gradually lengthen over a period of time, which gave rise to longer necks in giraffes. Lamarck’s hypothesis was rejected and later became associated with flawed science concepts. However, scientists today debate whether advances in the field of transgenerational epigenetics indicate that Lamarck was correct, at least to some extent. -
From Genotype to Phenotype: a Shortcut Through the Library
RESEARCH HIGHLIGHTS URLs BIOINFORMATICS From genotype to phenotype: a shortcut through the library A list of genes tells you little about 11,026 orthologous gene sets were PLoS Biol. 3, e134 (2005) WEB SITES their biological roles: understanding identified from the STRING (search Peer Bork’s laboratory: http://www-db.embl.de/ this generally requires time-con- tool for the retrieval of interacting jss/EmblGroupsHD/g_27.html suming functional or comparative genes/proteins) database, defining The STRING database: http://string.embl.de studies. A new method provides a shared sets of genes. shortcut on this path from genotype The final stage was to look for to phenotype — it uses a combina- correlations between the groupings tion of comparative genomics and of MEDLINE nouns and the group- literature mining to predict the ings of shared genes. Bork and col- functions of large sets of sequenced leagues identified 2,700 significant genes. associations between orthologous Peer Bork and colleagues rea- groups and trait words, which soned that if a group of species has a allowed them to relate 28,888 genes shared phenotypic trait, orthologous to at least one trait. Among these genes shared among the species are associations, many of the gene–phe- likely to be involved in the underly- notype associations were already ing biological process. Such geno- known, confirming the validity of type–phenotype correlations have the method. However, many new been made in the past, but required discoveries were also made. For initial manual collection of pheno- example, previously unknown asso- typic information for each species, ciations were made between a group which is labour-intensive and might of metabolic genes and trait words lead to biases in the phenotypes linked to food poisoning. -
Denoising Large-Scale Biological Data Using Network Filters
Kavran and Clauset BMC Bioinformatics (2021) 22:157 https://doi.org/10.1186/s12859‑021‑04075‑x RESEARCH ARTICLE Open Access Denoising large‑scale biological data using network flters Andrew J. Kavran1,2 and Aaron Clauset2,3,4* *Correspondence: [email protected] Abstract 2 BioFrontiers Institute, Background: Large-scale biological data sets are often contaminated by noise, which University of Colorado, Boulder, CO, USA can impede accurate inferences about underlying processes. Such measurement Full list of author information noise can arise from endogenous biological factors like cell cycle and life history vari- is available at the end of the ation, and from exogenous technical factors like sample preparation and instrument article variation. Results: We describe a general method for automatically reducing noise in large- scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “fltered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network flter may be applied to an entire system, or the system may be frst decomposed into distinct modules and a diferent flter applied to each. Applied to synthetic data with known network structure and signal, network flters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancer- ous tissues, network fltering prior to training increases accuracy up to 43% compared to using unfltered data. Conclusions: Network flters are a general way to denoise biological data and can account for both correlation and anti-correlation between diferent measurements. -
Inference of Phenotype-Relevant Transcriptional Regulatory Networks Elucidates Cancer Type-Specific Regulatory Mechanisms In
www.nature.com/npjsba ARTICLE OPEN Inference of phenotype-relevant transcriptional regulatory networks elucidates cancer type-specific regulatory mechanisms in a pan-cancer study ✉ ✉ Amin Emad 1 and Saurabh Sinha2,3,4 Reconstruction of transcriptional regulatory networks (TRNs) is a powerful approach to unravel the gene expression programs involved in healthy and disease states of a cell. However, these networks are usually reconstructed independent of the phenotypic (or clinical) properties of the samples. Therefore, they may confound regulatory mechanisms that are specifically related to a phenotypic property with more general mechanisms underlying the full complement of the analyzed samples. In this study, we develop a method called InPheRNo to identify “phenotype-relevant” TRNs. This method is based on a probabilistic graphical model that models the simultaneous effects of multiple transcription factors (TFs) on their target genes and the statistical relationship between the target genes’ expression and the phenotype. Extensive comparison of InPheRNo with related approaches using primary tumor samples of 18 cancer types from The Cancer Genome Atlas reveals that InPheRNo can accurately reconstruct cancer type-relevant TRNs and identify cancer driver TFs. In addition, survival analysis reveals that the activity level of TFs with many target genes could distinguish patients with poor prognosis from those with better prognosis. npj Systems Biology and Applications (2021) 7:9 ; https://doi.org/10.1038/s41540-021-00169-7 1234567890():,; INTRODUCTION among samples. In this example, the expression levels of TF and Gene expression programs are responsible for many biological gene are only weakly correlated in each phenotypic class (case or processes in a cell and extensive efforts have been devoted to control) separately, possibly owing to noisy data and small sample elucidating these programs in healthy and disease states.