Introduction to System Biology
Dr. Carlo Cosentino School of Computer and Biomedical Engineering Department of Experimental and Clinical Medicine Università degli Studi Magna Graecia Catanzaro, Italy [email protected] http://bioingegneria.unicz.it/~cosentino
Dr. Carlo Cosentino1 Carnegie Mellon University, Pittsburgh, 2008 Outline of the Lectures
Ó Introduction to Systems Biology Ô Basic Principles Ô Biology in a nutshell Ô Getting experimental data Ó Modeling biochemical reactions Ô Deterministic Models Ô Stochastic Models Ó Biological Networks Ô Types of biological networks Ô Dynamical models of biological networks Ô Inference of biological networks Ó Analysis and Simulation of Biological Systems Ô A case-study: The cell cycle Ô Modeling the cell cycle Ô Analysis of biological models Ô Software tools
Dr. Carlo Cosentino2 Carnegie Mellon University, Pittsburgh, 2008 Introduction to System Biology
Ó Basic Principles
Ó Biology in a nutshell
Ó Experimental Techniques
Dr. Carlo Cosentino3 Carnegie Mellon University, Pittsburgh, 2008 What is Systems Biology?
Ó Systems biology is concerned with the study of biological functions and mechanisms, underpinning inter- and intra-cellular dynamical networks, by means of signal- and system-oriented approaches
Ó “Life is an emergent, rather than an immanent and inherent, property of matter. Although it arises from the material world, it cannot be reduced to it” (E. Schrödinger)
Ó “Science is built up of facts, as a house is with stones. But a collection of facts is no more a science than a heap of stones is a house” (H. Poincaré)
Dr. Carlo Cosentino4 Carnegie Mellon University, Pittsburgh, 2008 Systems Biology Approach
Ó A systems biology approach means Ô Investigating the components of cellular networks and their interactions Ô Applying experimental high-throughput and whole-genome techniques Ô Integrating computational and theoretical methods with experimental efforts
Dr. Carlo Cosentino5 Carnegie Mellon University, Pittsburgh, 2008 An Iterative Approach
Experiments
Data Handling
Modeling
Dr. Carlo Cosentino6 Carnegie Mellon University, Pittsburgh, 2008 Main Related Disciplines
Ó Biology!
Ó Mathematics and Statistics
Ó Physics and Chemistry
Ó Information Science
Ó Engineering (Biomedical, Chemical, Computer, Systems & Control, Electronic)
Ó …
Dr. Carlo Cosentino7 Carnegie Mellon University, Pittsburgh, 2008 Foundations
Ó Improved biological knowledge with the prospect of utilization in biotechnology and health care
Ó New experimental techniques in genomics and proteomics
Ó Classical mathematical modeling of biological processes
Ó Computer power for simulation of complex systems
Ó Storage and retrieval capability in large databases and data mining techniques
Ó Internet as the medium for the widespread availability from multiple sources of knowledge
Dr. Carlo Cosentino8 Carnegie Mellon University, Pittsburgh, 2008 Goals
Ó Models to unveil mechanisms causing alterated phenotypes and devise novel therapies and drugs
Ó Predictive tools to design cells with desired properties cheaply and reliably
Ó Individualized and predictive medicine
Dr. Carlo Cosentino9 Carnegie Mellon University, Pittsburgh, 2008 Models are not Real, though Reality can be Modeled
Ó In almost any case, models are only rough representations of their biological counterparts
Ó Nevertheless, models enable to Ô Elucidate network properties Ô Check the reliability of basic assumptions Ô Uncover lack of knowledge and requirements for clarification Ô Create large repository of current knowledge, formalized in a non- ambiguous way and including quantitative data
Dr. Carlo Cosentino10 Carnegie Mellon University, Pittsburgh, 2008 What is a Model?
Ó It depends on whom you ask… Ô Genetist: the mouse family Ts65DN serves as a model for human trisomy 21 Ô Chemist: a reaction network, described by dots (for metabolites) and arrows (for reactions) Ô Mathematician/Engineer: the same reaction network can be modeled by a system of nonlinear ODEs
Ó Abstract representation of objects or processes that explains features of these objects or processes
Dr. Carlo Cosentino11 Carnegie Mellon University, Pittsburgh, 2008 Mathematical Models
Ó Biological processes can be described in mathematical terms, however Ô A biological object can be investigated by means of different experimental methods Ô Each biological process can be described through different (mathematical) models Ô The choice of a mathematical model or an algorithm depends on the problem, the purpose, and the intention of the investigator Ô Modeling has to reflect essential properties of the system: Different models may highlight different aspects of the same instance
Dr. Carlo Cosentino12 Carnegie Mellon University, Pittsburgh, 2008 Model Development
Ó Formulation of the problem: Ô Identify the specific questions that shall be answered, along with background, problem and hypotheses
Ó Available Knowledge: Ô Check and collect quantitative and structural knowledge ÕComponents of the system ÕInteraction map and kind of interactions ÕExperimental results with respect to phenotypic responses against different stimuli (gene knockout, RNAi, environmental conditions)
Dr. Carlo Cosentino13 Carnegie Mellon University, Pittsburgh, 2008 Model Development (cont’d)
Ó Selection of model structure: Ô Level of description (atomistic, molecular, cellular, physiological) Ô Deterministic or stochastic model Ô Discrete or continuous variables Ô Static, dynamical, spatio-temporal dynamical
Ó Robustness/Sensitivity Analysis: Ô Test the dependence of the system behavior on changes of the parameters ÕNumerical simulations ÕBifurcation analysis
Dr. Carlo Cosentino14 Carnegie Mellon University, Pittsburgh, 2008 Model Development (cont’d)
Ó Experimental Tests Ô Hypotheses driven Ô Choice of parameters to be measured, different types of experiments, number of samples and repetitions, …
Ó Assessment of the agreement and divergences between experimental results and model behavior
Ó Iterative refinement of the hypotheses (and of the model)
Dr. Carlo Cosentino15 Carnegie Mellon University, Pittsburgh, 2008 Data Integration
Ó Observation of biological phenomena is restricted to the granularity and precision of the available experimental techniques
Ó A strong impulse to the development of a systematic approach in the last years has been given by the new high-throughput biotechnologies Ô Sequencing of human and other genomes (genomics) Ô Monitoring genome expression (transcriptomics) Ô Discovering protein-protein and -DNA interactions (proteomics)
Ó Different types of information need to be integrated
Dr. Carlo Cosentino16 Carnegie Mellon University, Pittsburgh, 2008 Issues in Data Integration
Ó Data representation and storage: Ô (too) Many databases (GO, KEGG, PDB, Reactome…) Ô XML-like annotation languages (SBML, CellML)
Ó Information retrieval Ô Tools for retrieving information from multiple remote DBs
Ó Data correlation Ô Find the correlation between phenotypes and genomic/proteomic profiles Ô Statistics, data mining, pattern analysis, clustering, PCA, …
Dr. Carlo Cosentino17 Carnegie Mellon University, Pittsburgh, 2008 Interdisciplinary Information Exchange
Ó The interdisciplinary nature of systems biology requires the exchange of information among scientists from different fields
Ó Mathematical formulas have to be made understandable for biologists
Ó People acquainted with the rigid rules of mathematics and computers have to understand the diversity of biological objects and the uncertainty in the outcome of experiments
Dr. Carlo Cosentino18 Carnegie Mellon University, Pittsburgh, 2008 Outline
Ó Basic principles
Ó Biology in a nutshell
Ó Experimental techniques
Dr. Carlo Cosentino19 Carnegie Mellon University, Pittsburgh, 2008 Branches of Biology
Ó Biology is the science that deals with living organisms and their interrelationships between each other and their environment in light of the evolutionary origin Ô Biochemistry Ô Molecular biology Ô Genetics Ô Physiology Ô Morphology Ô Cytology Ô Ecology Ô …
Dr. Carlo Cosentino20 Carnegie Mellon University, Pittsburgh, 2008 Prokaryotes and Eukaryotes
Dr. Carlo Cosentino21 Carnegie Mellon University, Pittsburgh, 2008 Model Organisms
Dr. Carlo Cosentino22 Carnegie Mellon University, Pittsburgh, 2008 Structure of the (Animal) Cell
Nelson, Cox, Lehninger Principles of Biochemistry
Dr. Carlo Cosentino23 Carnegie Mellon University, Pittsburgh, 2008 Composition and Sizes of Biological Objects
Ó Biological compounds are prevalently composed of four basic components: C, N, O, H
Ó Many compounds are made out of a set of few repeated basic blocks (nucleic acids, polypeptides, polymers)
Dr. Carlo Cosentino24 Carnegie Mellon University, Pittsburgh, 2008 Biomolecules Classifications
Ó Biomolecules can be divided into four families Ô Carbohydrates (energy storage) Ô Lipids (structure, signaling) Ô Proteins (catalysis, regulation, signaling, structure and support) Ô Nucleic acids (Information coding, regulation)
Ó Each family has multiple functions (e.g. RNAs are coding polymers, but can also have regulatory functions, whereas some lipids act as hormones, i.e. signals)
Ó Most molecules are made up of sub-complexes belonging to different families
Dr. Carlo Cosentino25 Carnegie Mellon University, Pittsburgh, 2008 Nucleic Acids
Ó Nucleic acids are polymers built up of covalently bound mononucleotides, they comprise DNA and RNA
Ó DNA contains coding sequences (genes) for proteins and RNAs and non-coding sequences
Ó Nucleotides are also important in energy transfer in a multitude of reactions (e.g. ATP – ADP)
Ó Some nucleotides also serve as signaling packets (e.g. cAMP)
Dr. Carlo Cosentino26 Carnegie Mellon University, Pittsburgh, 2008 Central Dogma of Molecular Biology
Lodish et al, Molecular Cell Biology
Dr. Carlo Cosentino27 Carnegie Mellon University, Pittsburgh, 2008 Transcription and Translation
Ó Through alternative splicing, the same coding region can produce more than one protein
Dr. Carlo Cosentino28 Carnegie Mellon University, Pittsburgh, 2008 Structure of Proteins
Ó Proteins consist of one or more polypeptides
Ó Each peptide is composed of covalently linked amino acids, chosen from a set of 20 different ones
Nelson, Cox, Lehninger Principles of Biochemistry
Dr. Carlo Cosentino29 Carnegie Mellon University, Pittsburgh, 2008 Structure of Proteins (cont’d)
Ó The functioning of a protein is strictly dependent on its three-dimensional structure
Ó Chemical modifications (e.g. phosphorylation) can induce relevant conformational changes, exposing or hiding the active sites
Ó On the other hand, environmental changes (e.g. pH, temperature) or endogenous alterations (e.g. genetic mutations) usually modify the protein structure, hindering its correct functioning
Dr. Carlo Cosentino30 Carnegie Mellon University, Pittsburgh, 2008 The Cytoskeleton
Ó Translocation of compounds is an important mechanism, though it cannot be described by standard diffusion processes
Nelson, Cox, Lehninger Principles of Biochemistry
Dr. Carlo Cosentino31 Carnegie Mellon University, Pittsburgh, 2008 Combinatorial Regulation
Ó Genes expression is regulated via adjacent transcription-control regions in a combinatorial way
Ó Only a subset of the whole genome is expressed at a particular time or in a specific cell type
Lodish et al, Molecular Cell Biology
Dr. Carlo Cosentino32 Carnegie Mellon University, Pittsburgh, 2008 Biosignaling: Membrane Receptors
Nelson, Cox, Lehninger Principles of Biochemistry
Dr. Carlo Cosentino33 Carnegie Mellon University, Pittsburgh, 2008 Outline
Ó Basic Principles
Ó Biology in a nutshell
Ó Experimental Techniques
Dr. Carlo Cosentino34 Carnegie Mellon University, Pittsburgh, 2008 Getting Experimental Data
Ó Systems scientists are typically not concerned about the difficult and often indirect processes by which biological relationships and interactions are identified
Ó It is worth having a glimpse at the basic techniques used in experimental biology
Ó Two main issues arise when using experimental techniques: Ô How to get quantitative data out of experiments designed to give qualitative answers Ô How to get time-sampled (synchronized) measurements of many biological objects (possibly cheaply and in a suitable time)
Dr. Carlo Cosentino35 Carnegie Mellon University, Pittsburgh, 2008 Gel Electrophoresis
Ó Both nucleic acids and proteins are electrically charge, thus they can be separated by applying electrical fields
Ó While the negative charge of DNA is proportional to the length of the fragment under consideration, for proteins the charge distribution is much more involved and case-specific
Ó By using sodium dodecyl sulfate (SDS) it is possible to confer to a protein a negative charge that is proportional to the number of its amino acids
Dr. Carlo Cosentino36 Carnegie Mellon University, Pittsburgh, 2008 Gel Electrophoresis
Ó SDS-PAGE does not provide absolute quantitative data, rather it can be used for relative measurements
Ó The samples have to be purified before measurement (at least in one- dimensional GE)
Dr. Carlo Cosentino37 Carnegie Mellon University, Pittsburgh, 2008 Hybridization and Blotting
Ó Hybridization techniques are based on the specific recognition of a probe and target molecule
Ó Used to selectively visualize only the molecules of interest in a complex mixture
Ó It can be applied both to DNA (Southern Blotting) and to mRNA samples (Northern Blotting)
Ó In the case of protein it is not possible to exploit hybridization, hence it is necessary to use a specific antibody for the protein of interest
Dr. Carlo Cosentino38 Carnegie Mellon University, Pittsburgh, 2008 Western Blotting (Immunoblotting)
Ó Measurements of intracellular protein concentration
Dr. Carlo Cosentino39 Carnegie Mellon University, Pittsburgh, 2008 Yeast Two-Hybrid (Y2H)
Ó Two-hybrid screening is used to determine protein – protein and protein – DNA interactions
Ó The protein to be tested are hybridized with two fragments of a transcription factor
Ó The TF interacts with the upstream activation site (UAS), activating the reported gene
Dr. Carlo Cosentino40 Carnegie Mellon University, Pittsburgh, 2008 DNA Microarrays
Ó DNA chips, also called DNA microarrays, are a recently developed method for the high-throughput analysis of gene expression
Ó Instead of looking at the expression of a single gene, microarrays allow one to monitor the expression of several thousand genes in a single experiment, resulting in a global picture of cellular activity
Ó Hence they represent a key tool for implementing a systems biology approach
Dr. Carlo Cosentino41 Carnegie Mellon University, Pittsburgh, 2008 Single Feature
Image courtesy of Affymetrix, Inc. Dr. Carlo Cosentino42 Carnegie Mellon University, Pittsburgh, 2008 Probes Hybridization
Image courtesy of Affymetrix, Inc. Dr. Carlo Cosentino43 Carnegie Mellon University, Pittsburgh, 2008 Hybridized DNA Visualization
Image courtesy of Affymetrix, Inc. Dr. Carlo Cosentino44 Carnegie Mellon University, Pittsburgh, 2008 Time-Course Experimental Data
Ó Microarrays enable to derive time-course experimental data, with a desired sampling time
Rustici et al, Nature Genetics 36(8), 2004 Dr. Carlo Cosentino45 Carnegie Mellon University, Pittsburgh, 2008 Protein Microarrays
Ó The function of a gene is realized by the coded protein, not by its mRNA
Ó This fact addresses a main flaw in the DNA microarray approach
Ó Solution: protein microarrays! Unfortunately… Ô Proteins are not as uniform as DNA Ô It is not (yet) possible to generate the amount of recombinant protein needed for high-throughput experiments Ô Optimal interaction conditions (in terms of temperature, ionic strenght, pH) are largely varying among proteins
Ó Nonetheless, research on protein chips is rapidly progressing
Dr. Carlo Cosentino46 Carnegie Mellon University, Pittsburgh, 2008 ChIP on Chip
Ó Chromatin Immuno–Precipitation is used to discover transcription factors (protein-DNA interactions)
Dr. Carlo Cosentino47 Carnegie Mellon University, Pittsburgh, 2008 Mass Spectrometry
Ó Mass spectrometry is used to identify unknown compounds and quantify known compounds in a solution Ô A small sample of compounds is ionized, usually to cations by loss of an electron Ô The ions are sorted and separated according to their mass and charge Ô The separated ions are then detected and tallied, and the results are displayed on a chart
Dr. Carlo Cosentino48 Carnegie Mellon University, Pittsburgh, 2008 RNA Interference (RNAi)
Ó Used for transiently down – regulating arbitrary genes
Dr. Carlo Cosentino49 Carnegie Mellon University, Pittsburgh, 2008 References
Ó Klipp et al, Systems Biology in Practice, Wiley-VCH, 2005
Ó DL Nelson, MM Cox, Lehninger Principles of Biochemistry, WH Freeman, 2004
Ó Lodish et al, Molecular Cell Biology, WH Freeman, 2007
Dr. Carlo Cosentino50 Carnegie Mellon University, Pittsburgh, 2008