Biological Networks

NETWORK THEORY
Dr. Alioune Ngom, School of Computer Science, University of Windsor
Winter 2013

What is a Network?
A network is a mathematical structure composed of points connected by lines.
Network theory <-> graph theory:
Network <-> Graph
Nodes <-> Vertices (points)
Links <-> Edges (lines)
F. Harary, Graph Theory, Addison-Wesley, Reading, MA, 1969
Gross & Yellen, Handbook of Graph Theory, CRC Press, Boca Raton, FL, 2004
A network can be built for any functional system.
System vs. parts = network vs. nodes.

Networks As Graphs
Networks can be undirected or directed, depending on whether the interaction between two neighboring nodes proceeds in both directions or in only one of them, respectively.
[Figure: an undirected and a directed network on nodes 1 to 6.]
The specificity of network nodes and links can be quantitatively characterized by weights.
[Figure: a vertex-weighted network and an edge-weighted network.]

Networks As Graphs - 2
A network can be connected (consisting of a single component) or disconnected (consisting of several disjoint components).
[Figure: a connected and a disconnected network.]
Connected networks having no cycles are termed trees. The more cycles a network has, the more complex it is.
[Figure: trees and cyclic graphs.]

Networks As Graphs - 3
Some basic types of graphs: paths, stars, cycles, complete graphs, bipartite graphs.

[Figure: Air Transportation Network.]
[Figure: The World Wide Web.]
[Figure: Fragment of a Social Network (Melburn, 2004).]

Biological Networks
A. Intra-cellular networks: protein interaction networks, metabolic networks, signaling networks, gene regulatory networks, composite networks, networks of modules and functional networks, disease networks
B. Inter-cellular networks: neural networks
C. Organ and tissue networks
D. Ecological networks
E. Evolution network

[Figure (L.-A. Barabasi): levels of cellular organization. GENOME: miRNA regulation?, protein-gene interactions. PROTEOME: protein-protein interactions. METABOLISM: bio-chemical reactions, Citrate Cycle.]
[Figure: The Protein Network of Drosophila. CuraGen Corporation, Science, 2003.]
[Figure: Metabolic Networks. Source: ExPASy.]

Apoptosis Pathway - 1
Apoptosis is a mechanism of controlled cell death that is critically important in many biological processes.
[Figure: apoptosis pathway. Labels include FAS-L, FAS-R, FADD, Death-Inducing Signaling Complex (DISC), initiator caspases, executor caspases, CASP3, CASP6, CASP7, CASP8, CASP10, DFF heterodimer (DFF40, DFF45), cleavage of caspase substrates, start of DNA fragmentation.]
D. Bonchev, L.B. Kier, C. Cheng, Lecture Series on Computer and Computational Sciences 6, 581-591 (2006).

Gene Regulation Networks
[Figure: The Longevity Gene-Protein Network (LGPN) of C. elegans. T. Witten, D. Bonchev.]
[Figure: Network of Interacting Pathways (NIP), 381 organisms. A. Mazurie, D. Bonchev, G.A. Buck, 2007.]

Functional Networks
Yeast: 1400 proteins, 232 complexes, nine functional groups of complexes.
[Figure: network of the nine functional groups, annotated with the number of protein complexes, the number of proteins, and the number of shared proteins. Groups: Cell Cycle; Cell Polarity & Structure; Transcription/DNA Maintenance/Chromatin Structure; Intermediate and Energy Metabolism; Signaling; Membrane Biogenesis & Turnover; RNA Metabolism; Protein Synthesis and Turnover; Protein/RNA Transport.]
Data: A.-M. Gavin et al. (2002) Nature 415, 141-147.
D. Bonchev, Chemistry & Biodiversity 1 (2004) 312-326.

Summary
All complex networks in nature and technology have common features.
They differ considerably from random networks of the same size.
By studying network structure and dynamics, and by using comparative network analysis, one can get answers to important biological questions.

Some Fundamental Biological Questions to Answer
(i) Which interactions and groups of interactions are likely to have equivalent functions across species?
(ii) Based on these similarities, can we predict new functional information about proteins and interactions that are poorly characterized?
(iii) What do these relationships tell us about the evolution of proteins, networks, and whole species?
(iv) How can we reduce the noise in biological data: which interactions represent true binding events? A false-positive interaction is unlikely to be reproduced across the interaction maps of multiple species.

Why Study Networks?
It is increasingly recognized that complex systems cannot be described in a reductionist view.
Understanding the behavior of such systems starts with understanding the topology of the corresponding network.
Topological information is fundamental in constructing realistic models for the function of the network.

Properties of Biological Networks
Large network comparison is computationally hard due to the NP-completeness of the underlying subgraph isomorphism problem:
• Given two graphs G and H as input, determine whether G contains a subgraph that is isomorphic to H.
Thus, network comparisons rely on easily computable heuristics (approximate solutions), called “network properties”.
Network properties can roughly and historically be divided into two categories:
1. Global network properties: give an overall view of the network, but might not be detailed enough to capture complex topological characteristics of large networks.
2. Local network properties: more detailed network descriptors, which usually encompass a larger number of constraints, thus reducing the degrees of freedom in which the networks being compared can vary.
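To make the cost of the subgraph isomorphism problem stated above concrete, here is a minimal brute-force sketch in Python. It is an illustration only, not a method from the slides: the function name `contains_subgraph` and the edge-list representation are assumptions, isolated nodes are ignored, and "subgraph" is taken in the non-induced sense (every edge of H must map onto an edge of G).

```python
from itertools import permutations


def contains_subgraph(g_edges, h_edges):
    """Return True if undirected graph G has a subgraph isomorphic to H.

    Graphs are given as lists of 2-tuples (hypothetical representation).
    Every injective assignment of H's nodes to G's nodes is tried, so the
    running time grows factorially with the graph sizes.
    """
    g_nodes = sorted({v for e in g_edges for v in e})
    h_nodes = sorted({v for e in h_edges for v in e})
    g_set = {frozenset(e) for e in g_edges}  # undirected edge lookup
    for image in permutations(g_nodes, len(h_nodes)):
        mapping = dict(zip(h_nodes, image))
        # H fits if each of its edges lands on an existing edge of G.
        if all(frozenset((mapping[u], mapping[v])) in g_set
               for u, v in h_edges):
            return True
    return False


if __name__ == "__main__":
    square_with_diagonal = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
    triangle = [("a", "b"), ("b", "c"), ("c", "a")]
    print(contains_subgraph(square_with_diagonal, triangle))
```

For a G with n nodes and an H with h nodes the loop may visit n!/(n-h)! assignments, which is why comparisons of large networks fall back on cheaper summary properties.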
Biological Networks Properties
Scale-free: power-law degree distribution ("rich get richer").
Small world: a small average path length (the mean shortest node-to-node path).
Robustness: resilient to random failures, but vulnerable to targeted attacks.
Hierarchical modularity: a large clustering coefficient (how many of a node’s neighbors are connected to each other).

Global Network Properties
Readings: Chapter 3 of “Analysis of Biological Networks” by Junker and Schreiber.
Global network measures:
1) Degree distribution P(k)
2) Average clustering coefficient
3) Clustering spectrum
4) Network diameter
5) Average diameter
6) Mean path length
7) Spectrum of shortest path lengths
8) Centralities
9) … etc.

Global Network Properties - Degree Distribution
[Figure: a node x with deg(x) = 5.]
Definitions:
The degree of a node is the number of edges incident to the node.
The average degree of a network is the average of the degrees over all nodes in the network.
However, the average degree might not be representative, since the distribution of degrees might be skewed.

Global Network Properties - Degree Distribution
Let P(k) be the fraction of nodes of degree k in the network. The degree distribution is the distribution of P(k) over all k.
P(k) can be understood as the probability that a randomly selected node has degree k.

Degree Distribution
[Figure: number of nodes having degree k, and the corresponding P(k), for degrees 1 to 4 in a network where all 10 nodes have the same degree: a single peak with P(k) = 1.]
Any randomness in the network will broaden the shape of this peak.
[Figure: number of nodes having degree k (axis ticks 2 and 4), and the corresponding P(k) (axis ticks 0.25 and 0.5), for degrees 1 to 4.]

Degree Distribution
$P(k) = \frac{\langle k \rangle^{k} \, e^{-\langle k \rangle}}{k!}$, where $\langle k \rangle$ denotes the average degree.
Poisson distribution; e = 2.71828... is the base of natural logarithms.
The degree distribution of random graphs follows a Poisson distribution.

Degree Distribution
$P(k) \sim k^{-\gamma}$ (power-law distribution)
[Figure: P(k) versus connectivity k.]
The degree distribution of many biological networks follows a power-law distribution.
On a log-log plot, a power-law distribution is a straight line.

Degree distributions
$f_k$ = fraction of nodes with degree k = probability that a randomly selected node has degree k.
[Figure: frequency $f_k$ versus degree k.]
Why measure the degree distribution? The degree distribution is a “fingerprint” of the network: it allows us to characterize its general structure.

Degree distribution from a random network
What if we constructed a network by adding edges between proteins at random?
[Figure: log-log plot of frequency versus node degree.]
Properties:
highly concentrated around the mean
the probability of very high degree nodes is exponentially small
Barabasi, Oltvai. Network Biology: Understanding the cell’s functional organization. Nature Reviews Genetics 5, 101-113 (2004).

What about the degree distribution of real networks?
[Figure: random network compared with the yeast two-hybrid interaction network.]
H. Jeong et al. Lethality and centrality in protein networks. Nature 411, 41-42 (2001).

What about other types of real networks?
[Figure: degree distributions of other real networks compared with a random network.]
Conclusion: many real networks have the same fingerprint! [Newman, 2003]

Global Network Properties - Scale-Freeness
1) Degree distribution
[Figure: example degree distribution on a log-log plot.]
Here $P(k) \sim k^{-\gamma}$, where often $2 \le \gamma < 3$.
This is a power-law, heavy-tailed distribution.
Networks with power-law degree distributions are called scale-free networks.
In them, most of the nodes are of low degree, but there is a small number of highly linked nodes (nodes of high degree) called “hubs.”

WHAT DOES SCALE FREE REALLY MEAN, ANYWAY?
P(k) is the probability of each degree k.
For scale-free networks: $P(k) \sim k^{-\gamma}$
What happens for small vs. large $\gamma$?
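One way to explore the question of small versus large exponents is to evaluate a normalized power law numerically. The sketch below is an illustration under stated assumptions, not material from the slides: the function names, the degree cutoff `k_max`, and the chosen exponents are hypothetical. It shows two things: the ratio P(2k)/P(k) equals 2^(-gamma) regardless of k (there is no characteristic scale), and a smaller exponent puts more probability on high-degree nodes (hubs).

```python
def power_law_pmf(gamma, k_max):
    """Normalized P(k) proportional to k**(-gamma) for k = 1..k_max.

    The finite cutoff k_max is an assumption made so the sum is finite.
    Index 0 of the returned list holds P(1).
    """
    weights = [k ** -gamma for k in range(1, k_max + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def tail_mass(pmf, k_min):
    """Probability that a node has degree >= k_min."""
    return sum(pmf[k_min - 1:])


if __name__ == "__main__":
    for gamma in (2.0, 2.5, 3.0):
        pmf = power_law_pmf(gamma, k_max=1000)
        ratio_low = pmf[1] / pmf[0]        # P(2) / P(1)
        ratio_high = pmf[199] / pmf[99]    # P(200) / P(100)
        print(f"gamma={gamma}: P(2)/P(1)={ratio_low:.4f}, "
              f"P(200)/P(100)={ratio_high:.4f}, "
              f"P(k>=100)={tail_mass(pmf, 100):.5f}")
```

The two ratios agree for each exponent, while the tail mass shrinks as the exponent grows, so hubs become rarer for larger exponents.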
Random vs Preferential Attachment
Erdos-Renyi: start with N nodes and connect each pair with equal probability p.
Scale-free: add nodes incrementally. New nodes connect to each existing node I with probability proportional to its degree: $\Pi(I) = \frac{k_I}{\sum_J k_J}$
Scale-free networks have small average path lengths, on the order of log(log N); this is called the ‘small world’ effect.

Global Network Properties - Small-World Network
Most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops.
Small average path length
Any node can be reached
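The two construction rules described above (Erdos-Renyi and preferential attachment) can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not the lecturer's code: the function names, the seed graph (a complete graph on m + 1 nodes), and the rule of giving each new node m distinct links are choices made here. Drawing distinct targets by rejection only approximates strict degree-proportional sampling.

```python
import random
from collections import Counter
from itertools import combinations


def erdos_renyi(n, p, rng):
    """Connect each of the n*(n-1)/2 node pairs independently with probability p."""
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def preferential_attachment(n, m, rng):
    """Grow a network: each new node links to m distinct existing nodes,
    picked with probability proportional to their current degree."""
    # Assumed seed: a complete graph on m + 1 nodes, so all degrees are nonzero.
    edges = list(combinations(range(m + 1), 2))
    # Each node appears in `stubs` once per incident edge, so a uniform draw
    # from `stubs` selects node I with probability k_I / sum_J k_J.
    stubs = [v for e in edges for v in e]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))
        for t in targets:
            edges.append((new, t))
            stubs.extend((new, t))
    return edges


def degree_distribution(edges, n):
    """Return {k: fraction of the n nodes that have degree k}."""
    degree = Counter(v for e in edges for v in e)
    counts = Counter(degree.get(v, 0) for v in range(n))
    return {k: c / n for k, c in sorted(counts.items())}


if __name__ == "__main__":
    rng = random.Random(42)
    n, m = 1000, 2
    ba = preferential_attachment(n, m, rng)
    # Choose p so both networks have roughly the same average degree (2m).
    er = erdos_renyi(n, 2 * m / (n - 1), rng)
    print("largest degree, preferential attachment:",
          max(degree_distribution(ba, n)))
    print("largest degree, Erdos-Renyi:",
          max(degree_distribution(er, n)))
```

Comparing the largest degree in the two outputs gives a quick feel for hubs: the random graph's degrees stay close to the mean, whereas preferential attachment tends to produce a few nodes with much higher degree.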
