Scale-Free Networks: The Impact of Fat-Tailed Degree Distribution on Diffusion and Communication Processes

WI – State-of-the-Art

The Authors

Oliver Hein, Michael Schwind, Wolfgang König

Dipl.-Inform. Oliver Hein
Dipl.-Wirtsch.-Ing. Michael Schwind
Prof. Dr. Wolfgang König
Johann Wolfgang Goethe-Universität
Institut für Wirtschaftsinformatik
Mertonstr. 17
60325 Frankfurt am Main
{ohein, schwind, koenig}@is-frankfurt.de

Executive Summary

The study of network topologies provides interesting insights into how the principles by which connected systems are constructed influence diffusion dynamics and communication processes in many socio-technical systems.

- Empirical research has shown that there are principles for the construction of social networks and their technical derivatives, like e-mail networks, the Internet, publication co-authoring, or business collaboration.
- Such real world networks attach new members over time, and the mode of attachment prefers existing members that are already well connected. This principle is called "preferential attachment" and leads to the emergence of "scale-free" networks.
- Scale-free networks seem to be a better fit for the description of real world networks than the random networks used so far. Their behavior in terms of diffusion and communication processes is fundamentally different from that of random networks.
- To illustrate the value of scale-free networks for applications in information systems research, examples will be given of their usefulness for real world network modeling. A communication network of security traders will show what impact network topology has on the dynamics of complex socio-technical systems.

Keywords: Random Networks, Scale-Free Networks, Communication Networks, Socio-Technical Networks, Diffusion

WIRTSCHAFTSINFORMATIK 48 (2006) 4, pp. 267–275

1 Introduction

The widespread presence of networked systems in technical applications as well as in the business world makes network research useful for future applications in information systems (IS) research. Research into networks means not just analyzing the topology of networks, but also examining the dynamics of the processes that take place in them. Research into diffusion processes in networks has been shown to provide especially fruitful insights into several IS domains, such as information technology (IT) robustness in system failure situations, denial-of-service attacks, and computer virus spreading.

For decades network research was dominated by random networks, also called Erdős and Rényi networks or just ER-networks [ErRe59]. ER-networks are random in the sense that the links between the nodes of a network are created with the help of a random variable. Recently, following the introduction of what were called "scale-free" networks by Barabási and Albert [BaAl99], the field has received growing attention from scientific research. Scale-free networks seem to match real world applications much better than ER-network models [Bara03]. The term scale-free refers to the distribution principle of how many links there are per node.

This article is intended to provide an introduction to the metrics of modern network analysis. The basic network measures, like the degree distribution, the clustering coefficient, and the average path length, will be presented to enable a better understanding of the two network models "ER" and "scale-free". Diffusion processes in networks depend very much on network topology. Both network models will be examined in terms of their different diffusion behavior. Finally, the potential impact for real-life applications will be shown by the introduction of a new application of scale-free networks: the simulation of the implications of network topology for the price formation process in a security trader network that relies on a communication network.

The article is intended to emphasize the importance of choosing the right network model for the right purpose. The fundamental differences between ER- and scale-free networks will become apparent in the sample application presented.

2 Network Models

This section presents ER- and scale-free networks as well as their main characteristics and the processes by which they are created. ER-networks, because of their diminishing importance for modeling real world networks, are only briefly introduced to show their important differences from scale-free networks. The basics of network analysis are discussed first to enable a better understanding.

2.1 Basics of Network Analyses

In an undirected network graph, the degree of a node is defined as the number of edges it possesses. Figure 1a shows an example of a node with degree 4. If all the nodes of a network that share the same degree are counted and the results are sorted by increasing degree, a function like that in Figure 1c is most likely to occur. There are no nodes with degree zero, because these nodes would not be connected to the network. The network in Figure 1c exhibits a majority of nodes with degree one to three and rather fewer nodes with a degree greater than seven. Chart 1b is idealized; the degree distribution of networks does not have to be continuous.

Figure 1 The Basic Concept of a Network Degree Distribution

If one node of the network in Figure 1c were randomly chosen, the probability of obtaining a node with only one to three edges is much higher than that of obtaining a node with a degree higher than seven. Therefore it is possible to define a probability distribution function $P(j, v, N)$ that returns the probability that node $v$ has $j$ edges within network $N$ [DoMe03]. The concept of the degree distribution of a network has important consequences for the properties of a network and will be frequently used in the sections that follow. It will be seen that deviation from the normal distribution will lead to new results in terms of diffusion within networks.

Clustering within networks is another important factor when analyzing networks. It is interesting to know how well nodes are interconnected within a specified area of the network. A similar task is the question of how many of a person's friends also know each other. Figure 2 presents an example of a person (bold node with four bold edges) with four friends. Out of six possible friendship relationships (dashed lines) between the four friends (grey circles), only two friendships exist (bold edges). Using the ratio between existing and possible relations, a clustering coefficient may be computed. If a node has $z$ nearest neighbors, a maximum of $z(z-1)/2$ edges is possible between them. Watts and Strogatz defined the clustering coefficient for node $v$ as the ratio of the number $i$ of existing edges to the possible number of edges between the direct neighbors of node $v$ [WaSt98]:

$$C_v = \frac{2i}{z(z-1)} \qquad (1)$$

For the person in Figure 2, $i = 2$ and $z = 4$, so $C_v = 4/12 = 1/3$.

Figure 2 The Bold Node is Connected with Bold Edges to its Four Direct Neighbors

As with the degree distribution, the clustering coefficient plays an important role when analyzing networks in terms of important properties like diffusion, which will be discussed in subsequent sections.

The path length between two nodes of a network is defined as the number of edges between them. The minimal path length is the length of the shortest path between two nodes. The average path length is the average of all the minimal path lengths between all pairs of nodes in a network.

2.2 ER-Networks

Since the seminal paper of Erdős and Rényi in 1959, random network theory dominated scientific thinking [Bara03]. Real world networks had been thought to be too complex to understand and were therefore held to be random. In the absence of other well-understood network models, random networks were widely used when modeling networks.

The process of creating an ER-network depends on a probability $p$. For a network with $n$ nodes, each possible pair of distinct nodes is connected by an edge with probability $p$.

An ER-network has the property that the majority of nodes have a degree that is close to the average degree of the overall network, with little deviation below or above that average. It has been shown that the distribution of links follows a Poisson distribution (Figure 3). Knowing the degree distribution, the average path length, and the clustering coefficient of ER-networks, it is feasible to analyze how their behavior differs from that of scale-free networks.

ER-networks are still used in some types of models. The following sections will show that ER-networks may not be applicable for every purpose, because they lack some of the properties of other classes of networks.

2.3 Scale-Free Networks

When Albert et al. [AlJB99] started to map the internet in 1999, they did not know that they were about to influence network research in a sustained way. Because of the diverse interests of every internet user and the gigantic number of web pages, the linkages between web pages were thought to [...]

Since the important observation of Albert et al. [AlJB99], scientists have empirically analyzed many real world networks with the scale-free property. The physical structure of the internet (router level, domain level, web links), social networks like e-mail networks, the structure of software modules, and many more examples show surprising scale-free structures (Table 1). $\gamma$ is the degree exponent and ranges between 1.8 and 2.5. The exponent is derived by counting the number of edges per node and plotting the results by increasing degree in a chart with log-log scale.
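The counting-and-plotting procedure for the degree exponent can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: it assumes that the number of nodes with degree k falls off as a power law, n(k) ∝ k^(−γ), so that the points lie on a straight line with slope −γ on log-log axes, and it fits that line by ordinary least squares. The function name `estimate_degree_exponent` and the synthetic sample are hypothetical. A plain least-squares fit on a log-log histogram is a crude estimator for noisy empirical data.

```python
import math
from collections import Counter


def estimate_degree_exponent(degrees):
    """Estimate gamma from the slope of the log-log degree histogram.

    Hypothetical helper: assumes node counts follow n(k) ~ k**(-gamma).
    """
    # Count how many nodes share each degree; degree 0 has no logarithm.
    counts = Counter(d for d in degrees if d > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive degrees")
    xs = [math.log(k) for k in counts]
    ys = [math.log(counts[k]) for k in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Ordinary least-squares slope of log(count) against log(degree).
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic degree list whose histogram is exactly n(k) = 10000 / k**2.
sample_degrees = [k for k in (1, 2, 4, 5, 10) for _ in range(10000 // k**2)]
gamma = estimate_degree_exponent(sample_degrees)
print(round(gamma, 3))
```

On the synthetic sample the fit should recover an exponent very close to 2, because the histogram was built to follow an exact inverse-square law. Real degree data are noisy in the tail, where few nodes share each high degree, so the estimate there is less reliable.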

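The measures of Section 2.1 and the construction process of Section 2.2 can also be made concrete with a short Python sketch that uses only the standard library. All function names are hypothetical. The clustering coefficient follows equation (1); returning 0 for nodes with fewer than two neighbors and skipping unreachable pairs in the average path length are conventions chosen here, not statements from the article. The small `friends` graph mirrors the friendship example of Figure 2, with an assumed choice of which two pairs of friends know each other.

```python
import random
from collections import Counter, deque
from itertools import combinations


def er_network(n, p, seed=None):
    """Connect each pair of distinct nodes with probability p (Section 2.2)."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def degree_distribution(adj):
    """Map each degree j to the fraction of nodes that have exactly j edges."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    return {j: c / len(adj) for j, c in sorted(counts.items())}


def clustering_coefficient(adj, v):
    """C_v = 2i / (z(z-1)): existing over possible edges among v's z neighbors."""
    z = len(adj[v])
    if z < 2:
        return 0.0  # convention chosen here; the formula is undefined for z < 2
    i = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    return 2 * i / (z * (z - 1))


def average_path_length(adj):
    """Mean shortest-path length over all reachable node pairs (BFS per node)."""
    total = pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1  # unreachable nodes are skipped (assumption)
    return total / pairs if pairs else 0.0


# Person 0 with four friends; two of the six possible friendships exist.
# Which two pairs are linked is an assumption made for illustration.
friends = {0: {1, 2, 3, 4}, 1: {0, 2}, 2: {0, 1}, 3: {0, 4}, 4: {0, 3}}
print(clustering_coefficient(friends, 0))  # 2*2 / (4*3), i.e. one third

g = er_network(200, 0.05, seed=1)
print(degree_distribution(g))
print(average_path_length(g))
```

For the ER-network, the printed degree distribution should concentrate around the average degree of roughly p(n − 1), in line with the observation that most nodes of an ER-network have a degree close to the average.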