Communities in Networks Mason A. Porter, Jukka-Pekka Onnela, and Peter J. Mucha
Introduction: Networks and Communities for an example). Each node has a degree given by “But although, as a matter of history, statistical the number of edges connected to it and a strength mechanics owes its origin to investigations in given by the total weight of those edges. Graphs thermodynamics, it seems eminently worthy of can represent either man-made or natural con- an independent development, both on account of structs, such as the World Wide Web or neuronal the elegance and simplicity of its principles, and synaptic networks in the brain. Agents in such because it yields new results and places old truths networked systems are like particles in traditional in a new light in departments quite outside of statistical mechanics that we all know and (pre- thermodynamics.” sumably) love, and the structure of interactions — Josiah Willard Gibbs, between agents reflects the microscopic rules that Elementary Principles in Statistical Mechanics, govern their behavior. The simplest types of links 1902 [47] are binary pairwise connections, in which one only cares about the presence or absence of a tie. How- rom an abstract perspective, the term ever, in many situations, links can also be assigned network is used as a synonym for a math- a direction and a (positive or negative) weight to ematical graph. However, to scientists designate different interaction strengths. across a variety of fields, this label means Traditional statistical physics is concerned with so much more [13,20,44,83,88,120,124]. the dynamics of ensembles of interacting and FIn sociology, each node (or vertex) of a network noninteracting particles. Rather than tracking the represents an agent, and a pair of nodes can be motion of all of the particles simultaneously, which connected by a link (or edge) that signifies some is an impossible task due to their tremendous num- social interaction or tie between them (see Figure 1 ber, one averages (in some appropriate manner) the microscopic rules that govern the dynamics Mason A. Porter, Oxford Centre for Industrial and Ap- of individual particles to make precise statements plied Mathematics, Mathematical Institute, University of of macroscopic observables such as temperature Oxford, and CABDyN Complexity Centre, University of Oxford. His email address is [email protected]. and density [112]. It is also sometimes possible to make comments about intermediate (mesoscop- Jukka-Pekka Onnela, Harvard Kennedy School, Harvard ic) structures, which lie between the microscopic University; Department of Physics, University of Oxford; and macroscopic worlds; they are large enough CABDyN Complexity Centre, University of Oxford; and that it is reasonable to discuss their collective Department of Biomedical Engineering and Computa- tional Science, Helsinki University of Technology. His email properties but small enough so that those proper- address is [email protected]. ties are obtained through averaging over smaller numbers of constituent items. One can similarly Peter J. Mucha, Carolina Center for Interdisciplinary take a collection of interacting agents, such as Applied Mathematics, Department of Mathematics, University of North Carolina, and Institute for Advanced the nodes of a network, with some set of micro- Materials, Nanoscience and Technology, University of scopic interaction rules and attempt to derive the North Carolina. His email address is [email protected]. resulting mesoscopic and macroscopic structures.
1082 Notices of the AMS Volume 56, Number 9 One mesoscopic structure, called a community, consists of a group of nodes that are relatively densely connected to each other but sparsely con- nected to other dense groups in the network [39]. We illustrate this idea in Figure 2 using a well- known benchmark network from the sociology literature [131]. The existence of social communities is intu- itively clear, and the grouping patterns of humans have been studied for a long time in both sociol- ogy [25, 44, 79] and social anthropology [66, 113]. For example, Stuart Rice clustered data by hand to investigate political blocs in the 1920s [106], and George Homans illustrated the usefulness of rearranging the rows and columns of data matrices to reveal their underlying structure in 1950 [60]. Robert Weiss and Eugene Jacobson per- formed (using organizational data) what might have been the first analysis of network commu- nity structure in 1955 [126], and Herbert Simon espoused surprisingly modern views on communi- ty structure and complex systems in general in the 1960s [117]. Social communities are ubiquitous, arising in the flocking of animals and in so- Figure 1. The largest connected component of cial organizations in every type of human society: a network of network scientists. This network groups of hunter-gatherers, feudal structures, roy- was constructed based on the coauthorship of al families, political and business organizations, papers listed in two well-known review families, villages, cities, states,nations, continents, articles [13,83] and a small number of and even virtual communities such as Facebook additional papers that were added groups [39, 88]. Indeed, the concept of commu- manually [86]. Each node is colored according nity is one of everyday familiarity. We are all to community membership, which was connected to relatives, friends, colleagues, and determined using a leading-eigenvector acquaintances who are in turn connected to each spectral method followed by Kernighan-Lin other in groups of different sizes and cohesions. node-swapping steps [64,86,107]. To The goals of studying social communities have determine community placement, we used the aligned unknowingly with the statistical physics Fruchterman-Reingold graph paradigm. As sociologist Mark Granovetter wrote visualization [45], a force-directed layout in his seminal 1973 paper[51] on weak ties, “Large- method that is related to maximizing a quality scale statistical, as well as qualitative, studies offer function known as “modularity” [92]. To apply a good deal of insight into such macro phenomena this method, we treated the communities as if as social mobility, community organization, and they were themselves the nodes of a political structure... But how interaction in small (significantly smaller) network with groups aggregates to form large-scale patterns connections rescaled by inter-community eludes us in most cases.” links. We then used the Kamada-Kawaii Sociologists recognized early that they need- spring-embedding graph visualization ed powerful mathematical tools and large-scale algorithm [62] to place the nodes of each data manipulation to address this challenging individual community (ignoring problem. An important step was taken in 2002, inter-community links) and then to rotate and when Michelle Girvan and Mark Newman brought flip the communities for optimal placement graph-partitioning problems to the broader atten- (including inter-community links). See the tion of the statistical physics and mathematics main text for further details on some of the communities [48]. Suddenly, community detec- ideas in this caption. (We gratefully tion in networks became hip among physicists acknowledge Amanda Traud for preparing this and applied mathematicians all over the world, figure.) and numerous new methods were developed to try to attack this problem. The amount of research in this area has become massive over the past seven years (with new discussions or algorithms posted on the arXiv preprint server almost every day), and the study of what has become known
October 2009 Notices of the AMS 1083 1084 uermiseuie n a lrf oeissues some of clarify notions the can through one elusive, remains ture n oua opnns[9 8 1] nthis In 117]. 48, [39, components modular and hierarchical of set complicated potentially a encom- passes structure community network’s a general, scmuiysrcuei o n ftemost the of one now p is structure community as oigfrhrotadcliae ntefinal the in culminates outward farther Moving rctra-enodmto 4] oe are Nodes [45]. method Fruchterman-Reingold oietaeso ewr cec 2,3,110]. 39, [26, science network of areas rominent loih otentok oe r grouped are Nodes network. the to algorithm edn-ievco pcrlmxmzto of maximization spectral leading-eigenvector ahdlnssprt ieetcommunities, different separate lines dashed eut fapyn hscommunity-detection this applying of results enga-i oesapn tp sethe (see steps node-swapping Kernighan-Lin lhuhargru oino omnt struc- community of notion rigorous a Although oigotadfo h etro h ring. the of center the from outward moving icsini h antx) Bto)Polar (Bottom) text). main the in discussion rmtdteognzto’ rau) The breakup). organization’s the prompted ooe lc rwiedpnigo their on depending white or black colored ae lbalain(fe disagreement a (after affiliation club later notecmuiisidctdi h top the in indicated communities the into ewr notogop ietclt the to (identical groups two into network oriaednrga ersnigthe representing dendrogram coordinate ae.Oecnseteiiilslto the of split initial the see can One panel.
culmmesi ftenwcus by clubs) new the of membership actual
iue2 Tp h ahr aaeClub Karate Zachary The (Top) 2. Figure 31 33
30 34
27 24 23 25
n 21 26 tok[3] iulzduigthe using visualized [131], etwork
atto ftentokit four into network the of partition
ouaiy[6 ihsubsequent with [86] modularity 2 19 8
hc eedtrie sn a using determined were which
16 29 15 32 modules
10 5