Ruth and Sebastian Ahnert: People and Networks
Total Page:16
File Type:pdf, Size:1020Kb
PEOPLE AND NETWORKS Research Questions COST workshop, Oxford, 22-25 March 2015 Ruth Ahnert (QMUL) Sebastian Ahnert (Cambridge) NETWORKS SURROUND US NETWORKS SURROUND US An abundance of network data now surrounds us: • Mobile phone networks • Transport networks • Power grids • Online social networks (e.g. Facebook, Twitter) • Gene regulatory networks • Protein interaction networks • Neural networks Bethlem myopathy Heinz Trichothio- Zellweger body dystrophy NETWORKSsyndrome anemia SURROUND US Ana Humanabundance Disease Network of network data now surrounds us: Cataract Node size Myopathy Deafness Epidermolysis Retinitis 41 Muscular pigmentosa bullosa dystrophy Cardiomyopathy Leigh syndrome 34 Stroke Charcot-Marie-Tooth disease 30 Myocardial Diabetes infarction Epilepsy mellitus Alzheimer Ataxia- disease 25 Mental telangiectasia retardation Gastric Obesity cancer Hypertension Pseudohypo- 21 Prostate Atheroscierosis aldosteronism cancer Breast Lymphoma Asthma 15 Fanconi cancer Colon anemia cancer Hirschprung 10 disease Parkinson Leukemia 5 Thyroid disease carcinoma 1 Blood group Spherocytosis Spinocereballar ataxia Hemolytic anemia SCIENCES APPLIED PHYSICAL Complement_component Diseasesdeficiency Disease Gene Network LRP5 b Goh, K.-I. et al. TheSCN4A human disease network. FBN1 P Natl Acad Sci Usa 104, 8685–8690 (2007). Disorder Class PAX6 COL2A1 Bone Cancer GJB2 Cardiovascular Connective tissue Dermatological GNAS Developmental Ear, Nose, Throat Endocrine Gastrointestinal Hematological ARX Immunological ACE Metabolic Muscular FGFR2 Neurological ERBB2 MSH2 Nutritional APC Ophthamological FGFR3 PTEN Psychiatric TP53 KRAS NF1 Renal KIT MEN1 Respiratory Skeletal multiple Unclassified Fig. 2. The HDN and the DGN. (a) In the HDN, each node corresponds to a distinct disorder, colored based on the disorder class to which it belongs, the name of the 22 disorder classes being shown on the right. A link between disorders in the same disorder class is colored with the corresponding dimmer color and links connecting different disorder classes are gray. The size of each node is proportional to the number of genes participating in the corresponding disorder (see key), and the link thickness is proportional to the number of genes shared by the disorders it connects. We indicate the name of disorders with Ͼ10 associated genes, as well as those mentioned in the text. For a complete set of names, see SI Fig. 13.(b) In the DGN, each node is a gene, with two genes being connected if they are implicated in the same disorder. The size of each node is proportional to the number of disorders in which the gene is implicated (see key). Nodes are light gray if the corresponding genes are associated with more than one disorder class. Genes associated with more than five disorders, and those mentioned in the text, are indicated with the gene symbol. Only nodes with at least one link are shown. Goh et al. PNAS ͉ May 22, 2007 ͉ vol. 104 ͉ no. 21 ͉ 8687 RESEARCH ARTICLES A node color (Leamer Classification) 123 4 5 6 7 8 9 10 Petroleum Materials Raw Products Forest Cereals Intensive Labor Intensive Capital Machinery Chemicals 0.9 Agriculture Tropical Agriculture Animal NETWORKS0.8 SURROUND US Proximity 0.7 0.6 0.5 link color (proximity) node size (world trade [thousands of US$]) 0.4 021 3 4 0.3 3.0x10 1.5x10 7.5x10 3.7x10 1.9x10 0.2 0.1 An abundance of network data5 6 6 7 now8 surrounds us: fruits B fishing vegetables on May 14, 2009 coffee and cocoa products www.sciencemag.org y Downloaded from garments metallurgy Products Hidalgo, C. A., Klinger, B., Barabasi, A.-L. & Hausmann, R. The product space conditions the development of nations. Science 317, 482–487 (2007). Fig. 1. The product space. (A)Hierarchicallyclusteredproximity(f)matrix with their proximity value. The sizes of the nodes are proportional to world representing the 775 SITC-4 product classes exported in the 1998–2000 trade, and their colors are chosen according to the classification introduced by period. (B)Networkrepresentationoftheproductspace.Linksarecolorcoded Leamer. www.sciencemag.org SCIENCE VOL 317 27 JULY 2007 483 EARLY MODERN NETWORKS COST brings together people interested in particular kinds of networks: NETWORKS NETWORKS Network analysis is a highly interdisciplinary research field in its own right. NETWORKS Network analysis is a highly interdisciplinary research field in its own right. Networks consist of nodes NETWORKS Network analysis is a highly interdisciplinary research field in its own right. Networks consist of nodes and edges. NETWORKS Network analysis is a highly interdisciplinary research field in its own right. Networks consist of nodes and edges. ki =6 i The number of connections a node has is its degree k . NETWORKS Network analysis is a highly interdisciplinary research field in its own right. Networks consist of nodes and edges. j ki =6 i kj =4 The number of connections a node has is its degree k . NETWORKS Network analysis is a highly interdisciplinary research field in its own right. Networks consist of nodes and edges. j ki =6 i kj =4 The number of connections a node has is its degree k . This abstract framework allows us to examine a wide range of networks with the same tools. NETWORK ANALYSIS NETWORK ANALYSIS Many real-world networks have similar properties, such as a scale-free degree distribution. R EPORTS ing systems form a huge genetic network these two ingredients, we show that they are cited in a paper. Recently Redner (11) has whose vertices are proteins and genes, the responsible for the power-law scaling ob- shown that the probability that a paper is chemical interactions between them repre- served in real networks. Finally, we argue cited k times (representing the connectivity of senting edges (2). At a different organization- that these ingredients play an easily identifi- a paper within the network) follows a power al level, a large network is formed by the able and important role in the formation of law with exponent ␥cite ϭ 3. nervous system, whose vertices are the nerve many complex systems, which implies that The above examples (12) demonstrate that cells, connected by axons (3). But equally our results are relevant to a large class of many large random networks share the com- complex networks occur in social science, networks observed in nature. mon feature that the distribution of their local where vertices are individuals or organiza- Although there are many systems that connectivity is free of scale, following a power tions and the edges are the social interactions form complex networks, detailed topological law for large k with an exponent ␥ between between them (4), or in the World Wide Web data is available for only a few. The collab- 2.1 and 4, which is unexpected within the (WWW), whose vertices are HTML docu- oration graph of movie actors represents a framework of the existing network models. ments connected by links pointing from one well-documented example of a social net- The random graph model of ER (7) assumes page to another (5, 6). Because of their large work. Each actor is represented by a vertex, that we start with N vertices and connect each size and the complexity of their interactions, two actors being connected if they were cast pair of vertices with probability p. In the the topology of these networks is largely together in the same movie. The probability model, the probability that a vertex has k unknown. that an actor has k links (characterizing his or edges follows a Poisson distribution P(k) ϭ Traditionally, networks of complex topol- her popularity) has a power-law tail for large eϪk/k!, where ogy have been described with the random k, following P(k) ϳ kϪ␥actor, where ␥ ϭ actor N Ϫ 1 graph theory of Erdo˝s and Re´nyi (ER) (7), 2.3 Ϯ 0.1 (Fig. 1A). A more complex net- k NϪ1Ϫk but in the absence of data on large networks, work with over 800 million vertices (8)isthe ϭ N p ͑1 Ϫ p͒ ͩ k ͪ the predictions of the ER theory were rarely WWW, where a vertex is a document and the tested in the real world. However, driven by edges are the links pointing from one docu- In the small-world model recently intro- the computerization of data acquisition, such ment to another. The topology of this graph duced by Watts and Strogatz (WS) (10), N topological information is increasingly avail- determines the Web’s connectivity and, con- vertices form a one-dimensional lattice, able, raising the possibility of understanding sequently, our effectiveness in locating infor- each vertex being connected to its two the dynamical and topological stability of mation on the WWW (5). Information about nearest and next-nearest neighbors. With on May 14, 2009 large networks. P(k) can be obtained using robots (6), indi- probability p,eachedgeisreconnectedtoa Here we report on the existence of a high cating that the probability that k documents vertex chosen at random. The long-range degree of self-organization characterizing the point to a certain Web page follows a power connections generated by this process de- large-scale properties of complex networks. law, with ␥www ϭ 2.1 Ϯ 0.1 (Fig. 1B) (9). A crease the distance between the vertices, Exploring several large databases describing network whose topology reflects the histori- leading to a small-world phenomenon (13), the topology of large networks that span cal patterns of urban and industrial develop- often referred to as six degrees of separa- fields as diverse as the WWW or citation ment is the electrical power grid of the west- tion (14). For p ϭ 0, the probability distri- patterns in science, we show that, indepen- ern United States, the vertices being genera- bution of the connectivities is P(k) ϭ␦(k Ϫ dent of the system and the identity of its tors, transformers, and substations and the z), where z is the coordination number in www.sciencemag.org constituents, the probability P(k) that a ver- edges being to the high-voltage transmission the lattice; whereas for finite p, P(k)still tex in the network interacts with k other lines between them (10).