Biological Network Representations and Databases Basic Topological

Total Page:16

File Type:pdf, Size:1020Kb

Biological Network Representations and Databases Basic Topological Biological Network Representations and Databases Transcriptional Regulatory Networks. In transcription regulatory networks the nodes represent transcrip- tion factors and genes while the links describe regulation-based interactions. Each link is directed, point- ing from the transcription factor to the gene that it regulates. To compare the topological properties of transcription-regulatory networks to those of metabolic and protein-protein interaction networks, we have stud- ied their undirected version where each directed link is replaced by an undirected one. The transcription net- work data of Escherichia coli has been reported in [2] and it is available from www.weizmann.ac.il/mcb/UriAlon. The transcription network data of Saccharomyces cerevisiae has been reported in [8], and it is available from http://www.weizmann.ac.il/mcb/UriAlon/Papers/networkMotifs/sma.html. Metabolic Networks. Metabolic networks represent the sum of all biochemical reactions taking place in an or- ganism. Different network representations are possible, depending on the purpose [4]. In our work we associate nodes with metabolites. Undirected links connect each substrate to the products of the reaction in which the sub- strate participates. The metabolic networks of E. coli and S. cerevisiae where obtained from the WIT database (http://igweb.integratedgenomics.com/IGwit), and they are also available www.nd.edu/∼networks . Protein-Protein Interaction Networks. In the protein-protein interaction network nodes represent proteins and undirected links describe the pairwise interactions between them. We focus on the protein-protein interaction network of S. cerevisiae. The data source is the Database of Interacting Proteins (DIP) available at http://dip.doe- mbi.ucla.edu. For a comparison between the different protein interaction databases see [1]. In summary, we constructed an undirected network representation of the transcription, the metabolic and the protein-protein interaction networks of E. coli and S. cerevisiae. In order to have a coherent basis of comparison we have reduced the original networks by removing self-loops, double links (i.e., if we have simultaneous links from node A to node B and from B to A, we draw only one undirected link joining the two A and B) and isolated nodes. The basic quantities characterizing the resulting networks are summarized in Tab. 3. Basic Topological Properties We denote by N the number of nodes in the network, usually referred to as the network size, and by E the number of links (or edges). Each node i is characterized by the degree ki and the clustering coefficient Ci. The degree ki counts the number of interactions that node i has with the other nodes in the network. By averaging ki over all nodes in the network we obtain the average degree N 1 2E hki ≡ k = . (1) N i N i=1 X The last equality follows from the fact that each link connects two nodes, and hence contributing to the degree of two nodes. The values of N, E and hki for the investigated biological networks are given in Table 3. Note that for each network hki ¿ N, meaning that biological networks are sparse, i.e. on average each node is connected to only a few other nodes. The clustering coefficient Ci is a measure of the fraction of connected neighbors of node i. A node with ki links ki can have at most 2 = ki(ki − 1)/2 pairs of its neighbors connected to each other. If we denote by ti the number of links among the neighbors of node i, then the clustering coefficient is defined as ¡ ¢ 2ti Ci ≡ . (2) ki(ki − 1) The average of Ci over all nodes in the network is N 1 hCi ≡ C . (3) N i i=1 X The values of hCi for the analyzed biological networks are given in Table 3. For comparison we also list the clustering coefficient of a random network with the same parameters N and E. Note that hCi À hki /N; therefore, biological networks are characterized by a high degree of local clustering [5]. The average values hki and hCi do not characterize exhaustively the topology since biological networks exhibit marked degree fluctuations [3, 4]. Indeed, the probability P (k) for a node to have degree k is given by the power law − P (k) = Ak γ , (4) where A is a constant and γ is the degree exponent. The highly inhomogeneous nature of P (k) prompted the study of the clustering coefficient as a function of the degree. Instead of taking the average over all nodes in the network, as in Eq. (3), the average is taken over nodes with the same degree, providing N i=1 Ciδkik C(k) = N , (5) P i=1 δkik where δkik = 1 if ki = k, and zero otherwise. For biologicalP networks C(k) follows a power law −α C(k) = C0k , (6) where C0 is a constant and α is the hierarchical exponent. The hierarchical exponent characterizes the network’s overall modularity, indicating the presence of many small, highly interconnected modules forming larger, less cohesive topological modules. The degree and hierarchical exponents, γ and α respectively, rather than the average degree hki and the average clustering coefficient hCi, are the two important topological quantities characterizing biological networks. Their values for the studied biological networks are summarized in Table I. Statistical Properties of Subgraphs Number of Triangles Passing by a Node. To develop an intuition for the statistical properties of general subgraphs we first study the occurrence of triangles. Since each connected pair of neighbors forms a triangle with the central node i, then ti defined in (2) also represents the number of triangles passing by node i. The average of ti over nodes of same degree k, denoted by T (k), is k T (k) = C(k) . (7) 2 µ ¶ k 2 Using 2 ∼ k /2, we find ¡ ¢ T (k) ∼ kβ , (8) with β = 2 − α . (9) Combining Eqs. (4) and (8), we predict that the probability that a randomly selected node participates in T triangles follows a power law − P (T ) ∼ T δ , (10) where γ − 1 δ = 1 + . (11) β P (T ) and T (k) for the studied biological networks are plotted in Fig. 4 together with the power law fits. The β and δ exponents provided by the fit are listed in Table I, indicating that the predicted values are in good agreement with the measured ones. Number of Small Subgraphs Passing by a Node. The above analysis based on triangles can be extended to other small subgraphs as well. We consider subgraphs with n nodes and m links that can be decomposed into a central node and n − 1 neighbors. More precisely, we have n − 1 interactions from the central node to its neighbors and t = m − (n − 1) interactions among the n − 1 neighbors. In the paper we have used m together with n to classify each motif; here, to simplify the calculations we use t instead of m. The total number of n-node subgraphs that can pass by a node k with degree k is n−1 , which tells us in how many different ways we can choose a group of n − 1 nodes out of the total k. Each of these n-node subgraphs can have at most n = (n − 1)(n − 2)/2 links between the n − 1 neighbors of ¡ ¢ p the central node. The probability that there is a link between two neighbors of a node with degree k is by definition C(k), and the probability that two neighbors are not connected is 1 − C(k). Therefore, the probability to obtain t connected pairs and np − t disconnected pairs is given by the binomial distribution n − b (k) = p C(k)t[1 − C(k)]np t . (12) nt t µ ¶ The average number of subgraphs formed by n − 1 neighbors and t interactions among them and centered at a node with degree k is given by k N (k) = b (k) . (13) nt n − 1 nt µ ¶ We readily obtain that if k is large, then Nnt(k) scales as βnt Nnt(k) ∼ k , (14) where βnt = n − 1 − αt = n − 1 − α(m − n + 1) . (15) Using Eqs. (4) and (14) we find that the probability for a randomly selected node to participate in Tnt subgraphs (n, t) scales as −δnt P (Tnt) ∼ Tnt , (16) where γ − 1 δnt = 1 + . (17) βnt In particular, triangles correspond to n = 3 and t = 1 (i.e., m = 3), leading to β31 = 2 − α and δ31 =1+(γ − 1)/δ, and we thus recover the expressions (9) and (11) derived earlier for triangles. Note that this approach can be generalized to higher order cycles, such as squares, pentagons and others. Over- (Type I) and Under-represented (Type II) Subgraphs. Here we use the scaling relations derived above to address the abundance of (n, t) subgraphs in scale-free and hierarchical networks. The expected number of (n, t) subgraphs in a network is given by Eq. (13) kmax Nnt = gntN P (k)Nnt(k) , (18) k X=1 where kmax is the maximum degree and the factor gnt takes into account that the same subgraph can have more than one node as a center (see Fig. 5). Eqs. (4), (6) and (12-18) lead to kmax n − − k − − N = NAg p Ct k γ αt [1 − C k α]np t . (19) nt nt t 0 n − 1 0 k µ ¶ X=1 µ ¶ We recognize the existence of two regimes. There is a regime where the sum in the above expression is dominated by k n−1 −α the large k region; this means that we can approximate n−1 by k /(n − 1)! and 1 − C0k by 1, being left with ¡ ¢ np t kmax Agnt C − − − N ≈ N t 0 kn 1 γ αt .
Recommended publications
  • Simplified Computational Model for Generating Bio- Logical Networks
    Simplified Computational Model for Generating Bio- logical Networks† Matthew H J Bailey,∗ David Ormrod Morley,∗ and Mark Wilson∗ A method to generate and simulate biological networks is discussed. An expanded Wooten- Winer-Weaire bond switching methods is proposed which allows for a distribution of node degrees in the network while conserving the mean average node degree. The networks are characterised in terms of their polygon structure and assortativities (a measure of local ordering). A wide range of experimental images are analysed and the underlying networks quantified in an analogous manner. Limitations in obtaining the network structure are discussed. A “network landscape” of the experimentally observed and simulated networks is constructed from the underlying metrics. The enhanced bond switching algorithm is able to generate networks spanning the full range of experimental observations. Two dimensional random networks are observed in a range of 17 contexts across considerably different length scales in nature: to be limited to avoid damaging the delicate networks . How- from nanometres, in the form of amorphous graphene; to me- ever, even when a high-quality image is obtained, there is further tres, in the form of the Giant’s causeway; to tens of kilometres, in difficulty in analysis. For example, each edge in a network is a the form of geopolitical borders 1–3. A framework for describing complex molecule made up of tens of thousands of atoms which these continuous random networks for chemical systems was first may not lie strictly in a single plane. In addition, the most inter- introduced by Zachariasen to describe silica-like glasses, and has esting dynamic behaviour often occurs over very long timescales proved to be extremely versatile in the years since 4, being used to – potentially decades.
    [Show full text]
  • Biological Network Approaches and Applications in Rare Disease Studies
    G C A T T A C G G C A T genes Review Biological Network Approaches and Applications in Rare Disease Studies Peng Zhang 1,* and Yuval Itan 2,3 1 St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065, USA 2 The Charles Bronfman Institute for Personalized Medicine, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA; [email protected] 3 Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY 10029, USA * Correspondence: [email protected]; Tel.: +1-646-830-6622 Received: 3 September 2019; Accepted: 10 October 2019; Published: 12 October 2019 Abstract: Network biology has the capability to integrate, represent, interpret, and model complex biological systems by collectively accommodating biological omics data, biological interactions and associations, graph theory, statistical measures, and visualizations. Biological networks have recently been shown to be very useful for studies that decipher biological mechanisms and disease etiologies and for studies that predict therapeutic responses, at both the molecular and system levels. In this review, we briefly summarize the general framework of biological network studies, including data resources, network construction methods, statistical measures, network topological properties, and visualization tools. We also introduce several recent biological network applications and methods for the studies of rare diseases. Keywords: biological network; bioinformatics; database; software; application; rare diseases 1. Introduction Network biology provides insights into complex biological systems and can reveal informative patterns within these systems through the integration of biological omics data (e.g., genome, transcriptome, proteome, and metabolome) and biological interactome data (e.g., protein-protein interactions and gene-gene associations).
    [Show full text]
  • Comparison of Different Generalizations of Clustering Coefficient and Local Efficiency for Weighted Undirected Graphs
    LETTER Communicated by Mika Rubinov Comparison of Different Generalizations of Clustering Coefficient and Local Efficiency for Weighted Undirected Graphs Yu Wang Eshwar Ghumare [email protected] Laboratory for Cognitive Neurology, Department of Neurosciences, KU Leuven, Leuven 3000, Belgium Rik Vandenberghe [email protected] Laboratory for Cognitive Neurology, Department of Neurosciences, KU Leuven, Leuven 3000, Belgium, and Alzheimer Research Centre, KU Leuven, Leuven Institute for Neuroscience and Disease, KU Leuven, Leuven 3000, Belgium Patrick Dupont [email protected] Laboratory for Cognitive Neurology, Department of Neurosciences, KU Leuven, Leuven 3000, Belgium; Alzheimer Research Centre, KU Leuven, Leuven Institute for Neuroscience and Disease, KU Leuven, Leuven 3000, Belgium; and Medical Imaging Research Center, KU Leuven and University Hospitals Leuven, Leuven 3000, Belgium Binary undirected graphs are well established, but when these graphs are constructed, often a threshold is applied to a parameter describing the connection between two nodes. Therefore, the use of weighted graphs is more appropriate. In this work, we focus on weighted undirected graphs. This implies that we have to incorporate edge weights in the graph mea- sures, which require generalizations of common graph metrics. After re- viewing existing generalizations of the clustering coefficient and the local efficiency, we proposed new generalizations for these graph measures. To be able to compare different generalizations, a number of essential and useful properties were defined that ideally should be satisfied. We ap- plied the generalizations to two real-world networks of different sizes. As a result, we found that not all existing generalizations satisfy all es- sential properties. Furthermore, we determined the best generalization for the clustering coefficient and local efficiency based on their properties and the performance when applied to two networks.
    [Show full text]
  • Scale- Free Networks in Cell Biology
    Scale- free networks in cell biology Réka Albert, Department of Physics and Huck Institutes of the Life Sciences, Pennsylvania State University Summary A cell’s behavior is a consequence of the complex interactions between its numerous constituents, such as DNA, RNA, proteins and small molecules. Cells use signaling pathways and regulatory mechanisms to coordinate multiple processes, allowing them to respond to and adapt to an ever-changing environment. The large number of components, the degree of interconnectivity and the complex control of cellular networks are becoming evident in the integrated genomic and proteomic analyses that are emerging. It is increasingly recognized that the understanding of properties that arise from whole-cell function require integrated, theoretical descriptions of the relationships between different cellular components. Recent theoretical advances allow us to describe cellular network structure with graph concepts, and have revealed organizational features shared with numerous non-biological networks. How do we quantitatively describe a network of hundreds or thousands of interacting components? Does the observed topology of cellular networks give us clues about their evolution? How does cellular networks’ organization influence their function and dynamical responses? This article will review the recent advances in addressing these questions. Introduction Genes and gene products interact on several level. At the genomic level, transcription factors can activate or inhibit the transcription of genes to give mRNAs. Since these transcription factors are themselves products of genes, the ultimate effect is that genes regulate each other's expression as part of gene regulatory networks. Similarly, proteins can participate in diverse post-translational interactions that lead to modified protein functions or to formation of protein complexes that have new roles; the totality of these processes is called a protein-protein interaction network.
    [Show full text]
  • NODEXL for Beginners Nasri Messarra, 2013‐2014
    NODEXL for Beginners Nasri Messarra, 2013‐2014 http://nasri.messarra.com Why do we study social networks? Definition from: http://en.wikipedia.org/wiki/Social_network: A social network is a social structure made up of a set of social actors (such as individuals or organizations) and a set of the dyadic ties between these actors. Social networks and the analysis of them is an inherently interdisciplinary academic field which emerged from social psychology, sociology, statistics, and graph theory. From http://en.wikipedia.org/wiki/Sociometry: "Sociometric explorations reveal the hidden structures that give a group its form: the alliances, the subgroups, the hidden beliefs, the forbidden agendas, the ideological agreements, the ‘stars’ of the show". In social networks (like Facebook and Twitter), sociometry can help us understand the diffusion of information and how word‐of‐mouth works (virality). Installing NODEXL (Microsoft Excel required) NodeXL Template 2014 ‐ Visit http://nodexl.codeplex.com ‐ Download the latest version of NodeXL ‐ Double‐click, follow the instructions The SocialNetImporter extends the capabilities of NodeXL mainly with extracting data from the Facebook network. To install: ‐ Download the latest version of the social importer plugins from http://socialnetimporter.codeplex.com ‐ Open the Zip file and save the files into a directory you choose, e.g. c:\social ‐ Open the NodeXL template (you can click on the Windows Start button and type its name to search for it) ‐ Open the NodeXL tab, Import, Import Options (see screenshot below) 1 | Page ‐ In the import dialog, type or browse for the directory where you saved your social importer files (screenshot below): ‐ Close and open NodeXL again For older Versions: ‐ Visit http://nodexl.codeplex.com ‐ Download the latest version of NodeXL ‐ Unzip the files to a temporary folder ‐ Close Excel if it’s open ‐ Run setup.exe ‐ Visit http://socialnetimporter.codeplex.com ‐ Download the latest version of the socialnetimporter plug in 2 | Page ‐ Extract the files and copy them to the NodeXL plugin direction.
    [Show full text]
  • The Role of Homophily Yohsuke Murase 1, Hang-Hyun Jo 2,3,4, János Török5,6,7, János Kertész6,5,4 & Kimmo Kaski4,8
    www.nature.com/scientificreports OPEN Structural transition in social networks: The role of homophily Yohsuke Murase 1, Hang-Hyun Jo 2,3,4, János Török5,6,7, János Kertész6,5,4 & Kimmo Kaski4,8 We introduce a model for the formation of social networks, which takes into account the homophily or Received: 17 August 2018 the tendency of individuals to associate and bond with similar others, and the mechanisms of global and Accepted: 27 February 2019 local attachment as well as tie reinforcement due to social interactions between people. We generalize Published: xx xx xxxx the weighted social network model such that the nodes or individuals have F features and each feature can have q diferent values. Here the tendency for the tie formation between two individuals due to the overlap in their features represents homophily. We fnd a phase transition as a function of F or q, resulting in a phase diagram. For fxed q and as a function of F the system shows two phases separated at Fc. For F < Fc large, homogeneous, and well separated communities can be identifed within which the features match almost perfectly (segregated phase). When F becomes larger than Fc, the nodes start to belong to several communities and within a community the features match only partially (overlapping phase). Several quantities refect this transition, including the average degree, clustering coefcient, feature overlap, and the number of communities per node. We also make an attempt to interpret these results in terms of observations on social behavior of humans. In human societies homophily, the tendency of similar individuals getting associated and bonded with each other, is known to be a prime tie formation factor between a pair of individuals1.
    [Show full text]
  • Comm 645 Handout – Nodexl Basics
    COMM 645 HANDOUT – NODEXL BASICS NodeXL: Network Overview, Discovery and Exploration for Excel. Download from nodexl.codeplex.com Plugin for social media/Facebook import: socialnetimporter.codeplex.com Plugin for Microsoft Exchange import: exchangespigot.codeplex.com Plugin for Voson hyperlink network import: voson.anu.edu.au/node/13#VOSON-NodeXL Note that NodeXL requires MS Office 2007 or 2010. If your system does not support those (or you do not have them installed), try using one of the computers in the PhD office. Major sections within NodeXL: • Edges Tab: Edge list (Vertex 1 = source, Vertex 2 = destination) and attributes (Fig.1→1a) • Vertices Tab: Nodes and attribute (nodes can be imported from the edge list) (Fig.1→1b) • Groups Tab: Groups of nodes defined by attribute, clusters, or components (Fig.1→1c) • Groups Vertices Tab: Nodes belonging to each group (Fig.1→1d) • Overall Metrics Tab: Network and node measures & graphs (Fig.1→1e) Figure 1: The NodeXL Interface 3 6 8 2 7 9 13 14 5 12 4 10 11 1 1a 1b 1c 1d 1e Download more network handouts at www.kateto.net / www.ognyanova.net 1 After you install the NodeXL template, a new NodeXL tab will appear in your Excel interface. The following features will be available in it: Fig.1 → 1: Switch between different data tabs. The most important two tabs are "Edges" and "Vertices". Fig.1 → 2: Import data into NodeXL. The formats you can use include GraphML, UCINET DL files, and Pajek .net files, among others. You can also import data from social media: Flickr, YouTube, Twitter, Facebook (requires a plugin), or a hyperlink networks (requires a plugin).
    [Show full text]
  • Graph and Network Analysis
    Graph and Network Analysis Dr. Derek Greene Clique Research Cluster, University College Dublin Web Science Doctoral Summer School 2011 Tutorial Overview • Practical Network Analysis • Basic concepts • Network types and structural properties • Identifying central nodes in a network • Communities in Networks • Clustering and graph partitioning • Finding communities in static networks • Finding communities in dynamic networks • Applications of Network Analysis Web Science Summer School 2011 2 Tutorial Resources • NetworkX: Python software for network analysis (v1.5) http://networkx.lanl.gov • Python 2.6.x / 2.7.x http://www.python.org • Gephi: Java interactive visualisation platform and toolkit. http://gephi.org • Slides, full resource list, sample networks, sample code snippets online here: http://mlg.ucd.ie/summer Web Science Summer School 2011 3 Introduction • Social network analysis - an old field, rediscovered... [Moreno,1934] Web Science Summer School 2011 4 Introduction • We now have the computational resources to perform network analysis on large-scale data... http://www.facebook.com/note.php?note_id=469716398919 Web Science Summer School 2011 5 Basic Concepts • Graph: a way of representing the relationships among a collection of objects. • Consists of a set of objects, called nodes, with certain pairs of these objects connected by links called edges. A B A B C D C D Undirected Graph Directed Graph • Two nodes are neighbours if they are connected by an edge. • Degree of a node is the number of edges ending at that node. • For a directed graph, the in-degree and out-degree of a node refer to numbers of edges incoming to or outgoing from the node.
    [Show full text]
  • Clustering Coefficient
    School of Information Sciences University of Pittsburgh TELCOM2125: Network Science and Analysis Konstantinos Pelechrinis Spring 2015 Figures are taken from: M.E.J. Newman, “Networks: An Introduction” Part 8: Small-World Network Model 2 Small World-Phenomenon l Milgram’s experiment § Given a target individual - stockbroker in Boston - pass the message to a person you know on a first name basis who you think is closest to the target l Outcome § 20% of the the chains were completed § Average chain length of completed trials ~ 6.5 l Dodds, Muhamad and Watts have repeated this experiment using e-mail communications § Completion rate is lower, average chain length is lower too 3 Small World-Phenomenon l Are these numbers accurate? § What bias do the uncompleted chains introduce? inter-country intra-country Source: An Experimental Study of Search in Global Social Networks: Peter Sheridan Dodds, Roby Muhamad, and 4 Duncan J. Watts (8 August 2003); Science 301 (5634), 827. Clustering in real networks l What would you expect if networks were completely cliquish? § Friends of my friends are also my friends § What happens to small paths? l Real-world networks (e.g., social networks) exhibit high levels of transitivity/clustering § But they also exhibit short paths too 5 Clustering coefficient l The network models that we have seen until now (random graph, configuration model and preferential attachment) do not show any significant clustering coefficient § For instance the random graph model has a clustering coefficient of c/n-1, which vanishes in
    [Show full text]
  • Denoising Large-Scale Biological Data Using Network Filters
    Kavran and Clauset BMC Bioinformatics (2021) 22:157 https://doi.org/10.1186/s12859‑021‑04075‑x RESEARCH ARTICLE Open Access Denoising large‑scale biological data using network flters Andrew J. Kavran1,2 and Aaron Clauset2,3,4* *Correspondence: [email protected] Abstract 2 BioFrontiers Institute, Background: Large-scale biological data sets are often contaminated by noise, which University of Colorado, Boulder, CO, USA can impede accurate inferences about underlying processes. Such measurement Full list of author information noise can arise from endogenous biological factors like cell cycle and life history vari- is available at the end of the ation, and from exogenous technical factors like sample preparation and instrument article variation. Results: We describe a general method for automatically reducing noise in large- scale biological data sets. This method uses an interaction network to identify groups of correlated or anti-correlated measurements that can be combined or “fltered” to better recover an underlying biological signal. Similar to the process of denoising an image, a single network flter may be applied to an entire system, or the system may be frst decomposed into distinct modules and a diferent flter applied to each. Applied to synthetic data with known network structure and signal, network flters accurately reduce noise across a wide range of noise levels and structures. Applied to a machine learning task of predicting changes in human protein expression in healthy and cancer- ous tissues, network fltering prior to training increases accuracy up to 43% compared to using unfltered data. Conclusions: Network flters are a general way to denoise biological data and can account for both correlation and anti-correlation between diferent measurements.
    [Show full text]
  • Graph Structure of Neural Networks
    Graph Structure of Neural Networks Jiaxuan You 1 Jure Leskovec 1 Kaiming He 2 Saining Xie 2 Abstract all the way to the output neurons (McClelland et al., 1986). Neural networks are often represented as graphs While it has been widely observed that performance of neu- of connections between neurons. However, de- ral networks depends on their architecture (LeCun et al., spite their wide use, there is currently little un- 1998; Krizhevsky et al., 2012; Simonyan & Zisserman, derstanding of the relationship between the graph 2015; Szegedy et al., 2015; He et al., 2016), there is currently structure of the neural network and its predictive little systematic understanding on the relation between a performance. Here we systematically investigate neural network’s accuracy and its underlying graph struc- how does the graph structure of neural networks ture. This is especially important for the neural architecture affect their predictive performance. To this end, search, which today exhaustively searches over all possible we develop a novel graph-based representation of connectivity patterns (Ying et al., 2019). From this per- neural networks called relational graph, where spective, several open questions arise: Is there a systematic layers of neural network computation correspond link between the network structure and its predictive perfor- to rounds of message exchange along the graph mance? What are structural signatures of well-performing structure. Using this representation we show that: neural networks? How do such structural signatures gener-
    [Show full text]
  • Core-Satellite Graphs: Clustering, Assortativity and Spectral Properties
    Core-satellite Graphs: Clustering, Assortativity and Spectral Properties Ernesto Estrada† Michele Benzi‡ October 1, 2015 Abstract Core-satellite graphs (sometimes referred to as generalized friendship graphs) are an interesting class of graphs that generalize many well known types of graphs. In this paper we show that two popular clustering measures, the average Watts-Strogatz clustering coefficient and the transitivity index, diverge when the graph size increases. We also show that these graphs are disassortative. In addition, we completely describe the spectrum of the adjacency and Laplacian matrices associated with core-satellite graphs. Finally, we introduce the class of generalized core-satellite graphs and analyze their clustering, assortativity, and spectral properties. 1 Introduction The availability of data about large real-world networked systems—commonly known as complex networks—has demanded the development of several graph-theoretic and algebraic methods to study the structure and dynamical properties of such usually giant graphs [1, 2, 3]. In a seminal paper, Watts and Strogatz [4] introduced one of such graph-theoretic indices to characterize the transitivity of relations in complex networks. The so-called clustering coefficient represents the ratio of the number of triangles in which the corresponding node takes place to the the number of potential triangles involving that node (see further for a formal definition). The clustering coefficient of a node is boundedbetween zero and one, with values close to zero indicating that the relative number of transitive relations involving that node is low. On the other hand, a clustering coefficient close to one indicates that this node is involved in as many transitive relations as possible.
    [Show full text]