Lecture 8: Evolution (1/1)
How does time modify large networks? (i.e., how will FB look in 10 years?)
COMS 4995-1: Introduction to Social Networks
Thursday, September 22nd

Outline
* Why is that important?
* What could we expect from models so far?
* What is empirically observed?
  o Densification, shrinking diameter
* How can we explain and understand it?

Motivation
* So far, we have only analyzed static graphs
  o But many social properties are dynamic
  o One may think that social graphs are the result of ongoing dynamics (see part II of the lecture)
* Understanding evolution has direct implications
  o Are graph generators/models accurate?
  o Extrapolation (what if the network grows?)
  o Anomaly detection (see part III of the lecture)

Previous models
* Arrival of new nodes:
  o Users joining Facebook, G+, etc. by invitation
  o Citation/Collaboration: new paper/movie/webpage
  o Communication networks: new routers, new subscribers
* These nodes connect to the current graph:
  o Creating new connections, citations, links, etc.
* Let the process run n steps, then look at the final result!

Previous assumptions
* Degree = constant or slowly varying with size
  o Typically fixed using the original graph to model
  o Assume that even if the population grows large, each node keeps a finite local neighborhood
* Distance = slowly growing with size
  o With constant degree, should be ~ log(N)
  o Explains the “small-world” phenomenon

Example 1: Uniform Random Graph
* N goes to infinity, p = c/N (resp. p = c*log(N)/N)
  o Avg degree: deg(N) ~ c (resp. deg(N) ~ c*log(N))
  o Ensures a giant connected component (resp. connectivity) for c > 1
  o Diameter of the connected component grows as log(N)

Example 2: the Copying model
* Nodes arrive and use preferential attachment
  o Nodes join sequentially, out-degree is fixed
  o Some very large degrees, but the average remains constant
  o Diameter of the connected component ~ log(N)
Bollobas, B., & Riordan, O. (2004). The diameter of a scale-free random graph. Combinatorica.

Example 3: Augmented lattice
* Nodes have a constant # of neighbors
  o Diameter ~ log(N) (in fact, greedy routing ~ log(N)^2)
  o Same result for any augmentation

Paper Distillation
Graph Evolution: Densification and Shrinking Diameters
Jure Leskovec (Carnegie Mellon University), Jon Kleinberg (Cornell University), Christos Faloutsos (Carnegie Mellon University)

DATA SET
* Citation (arXiv, US-patents)
* Affiliation (arXiv, IMDB)
* Technology (Inter-AS links BGP)
* Communication (Email, Recommendation)

EMPIRICAL RESULT
1. Densification (degree grows polynomially)
2. Shrinking diameters
  o effective diameter
  o effect of missing past
  o effect of disconnected component
3. Relation densification/degree

Abstract of the paper:
How do real graphs evolve over time? What are normal growth patterns in social, technological, and information networks? Many studies have discovered patterns in static graphs, identifying properties in a single snapshot of a large network or in a very small number of snapshots; these include heavy tails for in- and out-degree distributions, communities, small-world phenomena, and others. However, given the lack of information about network evolution over long periods, it has been hard to convert these findings into statements about trends over time. Here we study a wide range of real graphs, and we observe some surprising phenomena. First, most of these graphs densify over time with the number of edges growing superlinearly in the number of nodes.
Second, the average distance between nodes often shrinks over time, in contrast to the conventional wisdom that such distance parameters should increase slowly as a function of the number of nodes (like O(log n) or O(log(log n))). Existing graph generation models do not exhibit these types of behavior even at a qualitative level. We provide a new graph generator, based on a forest fire spreading process, that has a simple, intuitive justification, requires very few parameters (like the flammability of nodes), and produces graphs exhibiting the full range of properties observed both in prior work and in the present study.

MODEL / ANALYSIS
* Community Guided Attachment (CGA): “Topical hierarchy + distance matters”
  o Proof that CGA produces densification and heavy-tailed degree
* Forest Fire (FF): “attachment with exploration”
  o FF numerically produces densification, heavy-tailed degree, and shrinking diameter
Leskovec, J., Kleinberg, J., & Faloutsos, C. (2007). Graph evolution: Densification and shrinking diameters. ACM Transactions on Knowledge Discovery from Data, Vol. 1, No. 1, Article 2, March 2007. DOI 10.1145/1217299.1217301

Temporal Data Sets
* Citation (arXiv, US-patents)
* Affiliation (arXiv, IMDB)
* Technology (Inter-AS links BGP), Communication (Email, Recommendation)

Analysis of degree evolution
  o How does degree grow as time passes?
  o How does degree grow as size grows?
* Obeys a power law: degree ~ N^(a-1), #{edges} ~ N^a
  o a=1 (sparse, constant degree), a=2 (dense, constant fraction of possible edges)
(Leskovec et al., 2007)

Example of Results
(Leskovec et al., 2007)

Example of Results
How to define diameter D?
  o D = max{ d(u,v) | u,v in V }
  o D = ∞ if the graph is not connected, so use only connected pairs
  o D can be large because of a single pair, so use the 90th percentile of distances
Diameter shrinks with time!
(Leskovec et al., 2007)

Validation (1): The missing past
  o Partial data set: what about citations before 93?
  o Post-t0, Post-t0-no-past
  o Little effect, so this should not explain the shrinkage
(Leskovec et al., 2007)

Validation (2): Connectedness
  o Random graph: giant component arrives and then distance shrinks
(Leskovec et al., 2007)

Densification: Flickr and Y! 360
Kumar, R., Novak, J., & Tomkins, A. (2010). Structure and evolution of online social networks.

Diameter: Flickr and Y! 360
(Kumar et al., 2010)
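
The three refinements on the "How to define diameter D?" slide (maximum distance, restricted to connected pairs, then the 90th percentile) can be sketched in a few lines. This is a minimal illustration, not the procedure used by Leskovec et al.: the names `bfs_distances` and `effective_diameter` are made up here, the graph is assumed to be undirected and stored as an adjacency dict, and the percentile is a plain nearest-rank quantile with no interpolation.

```python
from collections import deque
import math


def bfs_distances(adj, source):
    """Hop distances from source to every node reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def effective_diameter(adj, q=0.9):
    """Smallest d such that at least a fraction q of connected pairs are within d hops."""
    distances = []
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                # Unreachable pairs never show up here, so D stays finite.
                # Each unordered pair is counted twice, which leaves quantiles unchanged.
                distances.append(d)
    if not distances:
        return 0
    distances.sort()
    rank = math.ceil(q * len(distances))  # nearest-rank percentile
    return distances[rank - 1]


# Toy graph (an assumption for illustration): path 0-1-2-3-4 plus an isolated node 5.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3], 5: []}
print(effective_diameter(path))         # 3
print(effective_diameter(path, q=1.0))  # 4, the plain diameter of the connected part
```

On the toy path, the single pair at distance 4 sets the maximum, while the 90th percentile ignores it. All-pairs BFS costs O(N * E), so large graphs would need sampling of source nodes; that refinement is left out of this sketch.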
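
The densification law from the "Analysis of degree evolution" slide, #{edges} ~ N^a, amounts to a straight line of slope a on a log-log plot of edges against nodes. The sketch below estimates a by ordinary least squares on the logarithms. It is only an illustration under stated assumptions: the function name `densification_exponent` is hypothetical, and the snapshot sizes are synthetic numbers generated from an exact power law, not data from the paper.

```python
import math


def densification_exponent(nodes, edges):
    """Least-squares slope of log(edges) against log(nodes)."""
    xs = [math.log(n) for n in nodes]
    ys = [math.log(e) for e in edges]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic snapshots (made up for illustration): edges = 2 * N^1.5.
nodes = [100, 200, 400, 800, 1600]
edges = [2 * n ** 1.5 for n in nodes]

a = densification_exponent(nodes, edges)
print(round(a, 3))      # fitted exponent a
print(round(a - 1, 3))  # exponent of the average degree, since degree ~ N^(a-1)
```

With these inputs the fit should recover a value very close to 1.5, between the sparse case a=1 and the dense case a=2. On real snapshots the points scatter around the line, so the residuals should be inspected before reading much into the slope.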