Generalization of Effective Conductance for Egonetworks∗

Heman Shakeri¹, Behnaz Moradi-Jamei², Pietro Poggi-Corradini³, Nathan Albin³, and Caterina Scoglio¹
¹Electrical and Computer Engineering Department, Kansas State University, Manhattan, Kansas, USA
²Department of Statistics, Kansas State University, Manhattan, Kansas, USA
³Department of Mathematics, Kansas State University, Manhattan, Kansas, USA
(Dated: October 9, 2018)

We study the popular centrality measure known as effective conductance or, in some circles, as information centrality. This is an important notion of centrality for undirected networks, with many applications, e.g., for random walks, electrical resistor networks, and epidemic spreading. In this paper, we first reinterpret this measure in terms of the modulus (energy) of families of walks on the network. This modulus centrality measure coincides with the effective conductance measure on simple undirected networks, and extends it to much more general situations, e.g., directed networks. Secondly, we study a variation of this modulus approach in the egocentric network paradigm. Egonetworks are networks formed around a focal node (ego) with a specific order of neighborhoods. We propose efficient analytical and approximate methods for computing these measures on both undirected and directed networks. Finally, we describe a simple method inspired by the modulus point of view, called shell degree, which proved to be a useful tool for network immunization.

∗ Corresponding author: [email protected]
arXiv:1705.02703v2 [physics.data-an] 26 Jul 2018

The concept of information centrality was first introduced in [1] and was later reinterpreted in terms of electrical conductance in [2]. Given a network $G = (V, E)$ and a node $a \in V$, the effective conductance centrality of $a$ is defined as

\[ \mathcal{C}_{\mathrm{eff}}(a) := \sum_{b \in V \setminus a} \frac{1}{\mathcal{R}_{\mathrm{eff}}(a, b)}, \tag{1} \]

where $\mathcal{R}_{\mathrm{eff}}(a, b)$ is the effective resistance between $a$ and $b$. Note that this measure considers every possible path that electrical current flow might take from $a$ to an arbitrary sink $b$.

The situation can be clarified by introducing the notion of modulus of families of walks. This is a way of measuring the richness of certain families of walks on a network (and beyond, see [3–5]). Given two nodes $a$ and $b$ we may consider the connecting family $\Gamma(a, b)$ of all walks $\gamma$ from $a$ to $b$. Then, given an edge density $\rho : E \to \mathbb{R}$ and $p \in [1, \infty]$, we define $\ell_\rho(\Gamma) := \min_{\gamma \in \Gamma} \ell_\rho(\gamma)$, where $\ell_\rho(\gamma)$ is the $\rho$-length of a walk $\gamma$:

\[ \ell_\rho(\gamma) := \sum_{e \in \gamma} \rho(e). \tag{2} \]

The $p$-modulus of $\Gamma$ is defined as

\[ \mathrm{Mod}_p(\Gamma) := \min_{\ell_\rho(\Gamma) \ge 1} \mathrm{Energy}_p(\rho). \tag{3} \]

Namely, we minimize the energy of candidate edge densities $\rho$ subject to the $\rho$-length of every walk in $\Gamma$ being greater than or equal to one, i.e., $\ell_\rho(\Gamma) \ge 1$. These densities can be interpreted as costs of using the given edge. The energy we consider is

\[ \mathrm{Energy}_p(\rho) = \sum_{e \in E} \sigma(e) \rho(e)^p, \tag{4} \]

where $\sigma(e) > 0$ is the conductance of the edge $e$. Thus modulus is a constrained convex optimization problem that has a unique extremal density $\rho^*$ when $1 < p < \infty$. This point of view allows for much more flexibility, because it can be applied to a variety of different families of objects (walks, cycles, trees, etc.) and also works when the underlying network is directed or weighted. Moreover, modulus has the very useful properties of $\Gamma$-monotonicity and countable subadditivity.

For undirected networks the effective conductance between $a$ and $b$ is connected to $\mathrm{Mod}_2(\Gamma(a, b))$ as follows [6, 7]:

\[ \frac{1}{\mathcal{R}_{\mathrm{eff}}(a, b)} = \mathrm{Mod}_2(\Gamma(a, b)). \tag{5} \]

In the following, we reproduce a proof of this connection and show how to calculate $\mathrm{Mod}_2(\Gamma(a, b))$ in symmetric networks using the pseudoinverse of the Laplacian.

Let $F$ be the set of all unit flows $f : E \to \mathbb{R}$ that satisfy Kirchhoff's node law and pass through a network $G$ from $a$ to $b$. Namely, for $v \in V$,

\[ (\nabla . f)(v) = \begin{cases} 1 & v = a \\ -1 & v = b \\ 0 & \text{otherwise} \end{cases} \]

corresponds to the injected currents at each node. The energy of $f$ is

\[ \mathrm{Energy}(f) := \sum_{e \in E} \mathcal{R}(e) f(e)^2, \]

where $\mathcal{R}(e) = \frac{1}{w(e)}$ is the resistance of edge $e$. A unit current flow $i \in F$ is a unit flow that also satisfies Ohm's law, i.e., there is a function $\mathcal{V} : V \to \mathbb{R}$ (called a potential) such that for every edge $(a, b)$:

\[ \mathcal{R}(a, b)\, i(a, b) = \mathcal{V}(b) - \mathcal{V}(a). \]

Let $U : V \to \mathbb{R}$ be a potential function. We can define a density $\rho_U$ as the gradient of $U$, i.e., for the edge $e = \{u, w\}$,

\[ \rho_U(e) := |U_u - U_w|. \]

Then $\rho_U$ is admissible for walks from $a$ to $b$ whenever $U(a) = 0$, $U(b) = 1$.

Conversely, if $\rho$ is an admissible density, then we can define a potential $U(x)$ as the infimum of $\ell_\rho(\gamma)$ over all walks from $a$ to $x$. With this definition, $\rho_U = \rho$, see [7]. In particular, assuming each edge has unit resistance,

\[ \mathrm{Energy}(\rho_U) = \sum_{e \in E} \rho_U(e)^2 = \sum_{e = \{u, w\} \in E} |U(u) - U(w)|^2. \]

Hence, if we substitute $U$ with $\frac{\mathcal{V} + C}{\mathcal{R}_{\mathrm{eff}}(a, b)}$, where $\mathcal{V}$ is the electric potential when a unit current flow $i \in F$ is passing through the network with source $a$ and sink $b$, and $\mathcal{R}_{\mathrm{eff}}$ is the effective resistance between $a$ and $b$, then

\[ \mathrm{Mod}_2(a, b) = \min_{U_a = 0,\; U_b = 1} \rho_U^T \rho_U = \frac{1}{\mathcal{R}_{\mathrm{eff}}(a, b)}. \tag{6} \]

By Kirchhoff's law of current conservation:

\[ \sum_j a_{i,j} (\mathcal{V}_i - \mathcal{V}_j) = (\nabla . i)(i), \]

where $A = [a_{ij}] \in \mathbb{R}^{N \times N}$ is the adjacency matrix of $G$, with $a_{ij} = 1$ if and only if $\{i, j\} \in E$. In matrix form:

\[ L \mathcal{V} = \mathcal{I}, \tag{7} \]

where $L$ is the Laplacian matrix of $G$ and $\mathcal{I} = \nabla . i$. Because $\mathcal{V}$ is defined up to an additive constant and the nullspace of $L$ is along the constant vector, we ground an arbitrary node $k$ and thus reduce $L$ by removing the $k$th row and column; the result is denoted by ${}^kL$ [8]. Now we can solve (7):

\[ {}^k\mathcal{V} = ({}^kL)^{-1}\, {}^k\mathcal{I}. \]

We denote $({}^kL)^{-1}$ by $\mathcal{G}$ (the reduced conductance matrix) and obtain the effective resistance between nodes $a$ and $b$:

\[ \mathcal{R}_{\mathrm{eff}}(a, b) = {}^k\mathcal{V}_a - {}^k\mathcal{V}_b = \mathcal{G}_{a,a} + \mathcal{G}_{b,b} - 2\mathcal{G}_{a,b}, \tag{8} \]

and from (6):

\[ \mathrm{Mod}_2(a, b) = (\mathcal{G}_{a,a} + \mathcal{G}_{b,b} - 2\mathcal{G}_{a,b})^{-1}. \tag{9} \]

Therefore, using (5), we can rewrite the effective conductance centrality in (1) in the language of modulus:

\[ \mathcal{C}_{\mathrm{eff}}(a) = \sum_{b \in V \setminus a} \mathrm{Mod}_2(\Gamma(a, b)). \tag{10} \]

For the rest of this paper, we consider $p = 2$ due to its physical interpretation as effective conductance as well as its computational advantages; for instance, in this case (3) is a quadratic program. Moreover, the right-hand side of (10) also makes sense on directed networks.

I. EGOCENTRIC EFFECTIVE CONDUCTANCE CENTRALITY

As mentioned above, $\mathcal{C}_{\mathrm{eff}}(a)$ is sociocentric in the sense that it considers all walks from $a$ to an arbitrary node in $G$. However, in practice, it can be prohibitive to scale sociocentric methods to very large networks. Moreover, in real-world situations it is often not feasible to have access to the entire network. Rather, one can at best know local information up to a few neighborhood levels. For instance, when data is anonymized to protect the privacy of network entities, identifying the sociocentric picture is impossible, e.g., data on sexual networks may be limited to the number of contacts of individuals.

An alternative approach is to consider measures that are adapted to egonetworks (also known as neighborhood networks). An ego network $G^a(r)$ around a node $a$ is constructed by collecting data (nodes and edges) starting from the ego $a$ and searching $G$ out to a predefined order of neighborhood $r \in \{1, \cdots, \epsilon(a)\}$, where $\epsilon(a)$ is the eccentricity of node $a$, i.e., the maximum distance from $a$ to nodes in $G$.

Egonetworks are often preferred because they support more flexible data collection methods [9] and often involve less expensive computation. Egocentric measures are more stable [10] against network sampling and more reliable (less sensitive) under measurement errors [11]. We concentrate on unweighted (binary) networks to simplify the algebra, although all of our methods and discussions can be easily generalized to weighted networks. Thus, we let $d(a, b)$ denote the shortest-path distance between two nodes (smallest number of hops). The neighborhood structure around an ego $a$ is described by the shells of order $k$:

\[ S(a, k) := \{ y \in V : d(a, y) = k \}, \]

and the corresponding families of walks $\Gamma(a, S(a, k))$, consisting of simple walks that begin at the ego $a \in V$ and reach $S(a, k)$ for the first time. Modulus allows a quantification of the richness of the family of walks, i.e., a family with many short walks has a larger modulus than a family with fewer and longer walks. Here we consider the shell modulus $\mathrm{Mod}_2(a, S(a, k))$, which quantifies the capacity of walks emanating from the ego up to the shell $S(a, k)$ [5] without having to account for the data outside $G^a(k)$.

Theorem 1. For undirected networks, we can calculate the 2-modulus of $\Gamma(a, S(a, r))$ analytically without going through the optimization problem in (3):

\[ \mathrm{Mod}_2(a, S(a, r)) = \frac{1 + x_s \sum_{j=S_1}^{S_{s-1}} \frac{1}{x_j}}{x_s}, \tag{11} \]

where $x_i = \sum_{j=S_1}^{S_{s-1}} \mathcal{G}_{ij}$.

Proof. Similar to (5), to find $\mathrm{Mod}(a, S(a, r))$ in $G^a(r)$, we solve Kirchhoff's law of currents

\[ L^a_{(r)} \mathcal{V} = \mathcal{I}, \tag{12} \]

where $L^a_{(r)}$ is the Laplacian matrix of $G^a(r)$ and $\mathcal{I}$ is the applied external current vector, with value 1 at the ego, values on the nodes of $S(a, r)$ that satisfy

\[ \mathbf{1}^T \mathcal{I}_S = -1, \tag{13} \]

and zero for other nodes (see Figure 1). Nodes in $S(a, r)$ share the same electric potential $c$.

[Figure 1: electrical network between the ego $a$ and the shell $S(a, r)$, whose nodes are held at the common potential $V_S = c$.]

FIG. 1. Interpreting $\mathrm{Mod}_2(a, S(a, r))$ as finding the effective conductance between the grounded node $a$ and nodes with the same potential $c$ in $S(a, r)$ in an electrical network. The solution follows from the corresponding Laplacian system.

The above problem has a unique harmonic solution for $\mathcal{V}$ up to a constant. We ground the potential at the ego, i.e., $\mathcal{V}_a = 0$, and find the potentials of the other nodes by

\[ \mathcal{V} = \mathcal{G} \mathcal{I}, \]

where $\mathcal{G} = \left({}^aL^a_{(r)}\right)^{-1}$ is the reduced conductance matrix. Combining (12) and (13):

\[ \begin{bmatrix} \mathcal{V}_2 \\ \vdots \\ c \\ c \\ \vdots \\ c \end{bmatrix} = \mathcal{G} \begin{bmatrix} 0 \\ \vdots \\ \mathcal{I}_{S_1} \\ \mathcal{I}_{S_2} \\ \vdots \\ -1 - \sum_{j=S_1}^{S_{s-1}} \mathcal{I}_j \end{bmatrix} \;\rightarrow\; c\,\mathbf{1} = \mathcal{G}_S \mathcal{I}_S, \tag{14} \]

where $x_i = \sum_{j=S_1}^{S_{s-1}} \mathcal{G}_{ij}$. If $|S| = s$, then for $i \in \{S_1, \cdots, S_{s-1}\}$

\[ \mathcal{I}_i = \frac{c}{x_i}. \]

From (14):

\[ \frac{c}{x_{S_s}} = -1 - \sum_{j=S_1}^{S_{s-1}} \frac{c}{x_j}, \qquad c = \frac{-x_s}{1 + x_s \sum_{j=S_1}^{S_{s-1}} \frac{1}{x_j}}, \]

and the effective resistance between $a$ and $S(a, r)$ is

\[ \mathcal{R}_{a, S(a, r)} = \mathcal{V}_a - c = \frac{x_s}{1 + x_s \sum_{j=S_1}^{S_{s-1}} \frac{1}{x_j}}, \]

and since $\mathcal{V}_a = 0$ (grounded):

\[ \mathrm{Mod}_2(a, S(a, r)) = \frac{1 + x_s \sum_{j=S_1}^{S_{s-1}} \frac{1}{x_j}}{x_s}. \]

The convex optimization problem in (3) involves a quadratic minimization. In the undirected case, computing the reduced conductance matrix in (11) involves solving a Laplacian system. In both cases, algorithms and techniques are still improving and advancing. However, graphs with more than a million edges may become intractable.

We propose the following egocentric version of $\mathcal{C}_{\mathrm{eff}}(a)$ using shell modulus:

\[ \mathcal{C}_{\mathrm{shell}}(a, r) := \sum_{k=1}^{r} \mathrm{Mod}_2(a, S(a, k)). \tag{15} \]

This shell modulus centrality follows the same logic as (10) but only requires the egocentric network data. For undirected networks, we can analytically compute (15) using Theorem 1.

[Figure 2: three network drawings, panels (a), (b), and (c).]

FIG. 2. (a) Davis southern women [12]. (b) Social network of bottlenose dolphins [13]. (c) Jazz musicians network [14]. Node sizes are scaled with the egocentric version of effective conductance centrality computed by (15). The ranking is unchanged when using the sociocentric version (1).

In Figure 2, the centralities of nodes in three small networks are computed by considering $\mathcal{C}_{\mathrm{shell}}(v, r)$ with $r = \mathrm{diam}(G)$. In Figure 2(a-c), node sizes are scaled with their $\mathcal{C}_{\mathrm{shell}}(v, r)$ values, and the computed centralities give, as expected, the same ranking as effective conductance.

In general, (10) requires $|V|$ modulus computations in all of $G$, while (15) only needs $r$ modulus computations in $G^a(r)$.

Shell modulus centrality can handle fairly large networks, e.g., 100,000 edges. The algorithm used here computes (3) using an active-set dual method for quadratic programming [15]. It is theoretically enough to consider at most $|E|$ active constraints [16]. Violated (active) constraints are found using Dijkstra's algorithm and the constraint matrix is updated using the Cholesky decomposition.
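As an illustration of (10) and (15), the following minimal sketch computes the effective conductance between a grounded source and a set of sink nodes held at potential 1, by solving the corresponding Laplacian (Dirichlet) system and summing the squared potential differences as in (6). It does not use the closed form (11), and it is not the active-set solver mentioned above. The function names and the six-node toy graph are assumptions made for this example only; node labels are assumed to be sortable and the graph connected.

```python
from collections import deque
from fractions import Fraction


def bfs_distances(adj, ego):
    """Hop distance d(ego, v) for every node reachable from the ego."""
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def solve(A, b):
    """Gauss-Jordan elimination in exact arithmetic (A assumed nonsingular)."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col] / M[col][col]
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def conductance(adj, source, sinks, nodes):
    """Energy of the harmonic potential with U = 0 at source and U = 1 on sinks,
    computed on the subgraph induced by `nodes` (unit edge resistances)."""
    free = [v for v in nodes if v != source and v not in sinks]
    idx = {v: i for i, v in enumerate(free)}
    A = [[Fraction(0)] * len(free) for _ in free]
    b = [Fraction(0)] * len(free)
    for v in free:
        for w in adj[v]:
            if w not in nodes:
                continue
            A[idx[v]][idx[v]] += 1          # degree inside the subgraph
            if w in idx:
                A[idx[v]][idx[w]] -= 1      # coupling to another free node
            elif w in sinks:
                b[idx[v]] += 1              # boundary value U = 1
    x = solve(A, b)
    U = {source: Fraction(0)}
    U.update({s: Fraction(1) for s in sinks})
    U.update({v: x[idx[v]] for v in free})
    return sum((U[u] - U[w]) ** 2
               for u in nodes for w in adj[u] if w in nodes and u < w)


def shell_modulus(adj, ego, k):
    """Mod_2(a, S(a, k)), using only the egonetwork G^a(k)."""
    dist = bfs_distances(adj, ego)
    nodes = {v for v, d in dist.items() if d <= k}
    shell = {v for v, d in dist.items() if d == k}
    return conductance(adj, ego, shell, nodes)


def shell_centrality(adj, ego, r):
    """C_shell(a, r) as in (15)."""
    return sum(shell_modulus(adj, ego, k) for k in range(1, r + 1))


def effective_conductance_centrality(adj, a):
    """C_eff(a) as in (10), using the whole graph."""
    return sum(conductance(adj, a, {b}, set(adj)) for b in adj if b != a)


# Hypothetical toy egonetwork (an assumption, not data from the paper).
toy = {"a": ["b", "c", "d"], "b": ["a", "e", "f"], "c": ["a", "d", "f"],
       "d": ["a", "c"], "e": ["b"], "f": ["b", "c"]}
print(shell_modulus(toy, "a", 1), shell_modulus(toy, "a", 2))  # 3 19/15
```

On this toy graph the first shell modulus equals the degree of the ego, and the second is 19/15, roughly 1.27. Exact rational arithmetic is used only to keep the example easy to check by hand; a floating-point sparse solver would be the natural choice at scale.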

In the following, we focus on approximating (15) efficiently, while incorporating most of the benefits of shell modulus in a scalable framework.

A. Bounding from above

First, we provide an upper bound that is known in the complex analysis literature as the Ahlfors estimate [17, Chapter 4, Equations 4-6], and in the context of electrical networks goes under the name of the Nash-Williams inequality [18]. Given an egonetwork $G^a(r)$, we consider the set of edges that connect a shell $S(a, k-1)$ to the next shell $S(a, k)$, for $k \in \{1, \cdots, r\}$:

\[ E(a, k) := \{ e = \{x, y\} \in E \mid x \in S(a, k-1),\; y \in S(a, k) \}. \]

We call the sets $E(a, k)$ shell connecting sets. Since $\mathrm{Mod}_2(a, S(a, r))$ is a minimization problem (3), we get an upper bound simply by choosing an appropriate admissible density $\bar{\rho}$. Here, we pick the best admissible density that is constant on all edges in each shell connecting set. After computing the minimized energy of this density, we obtain the following upper bound:

Theorem 2 (Ahlfors upper bound). Shell modulus is bounded by the following inequality:

\[ \mathrm{Mod}_2(a, S(a, r)) \le \frac{1}{\sum_{k=1}^{r} \frac{1}{|E(a, k)|}}. \tag{16} \]

Proof. Since (3) is a minimization problem, an upper bound for the shell modulus $\mathrm{Mod}_2(a, S(a, r))$ can be found by picking an appropriate density $\bar{\rho}$. Here we restrict ourselves to densities that are constant on the shell connecting sets $E(a, k)$. Let

\[ \bar{\rho}(e) := x_k \quad \text{if } e \in E(a, k). \]

Then we solve the following minimization problem:

\[ \begin{aligned} \underset{x}{\text{minimize}} \quad & \sum_{k=1}^{r} \theta_k x_k^2 \\ \text{subject to} \quad & \sum_{k=1}^{r} x_k = 1, \end{aligned} \tag{17} \]

where $\theta_k := |E(a, k)|$. By the Cauchy-Schwarz inequality,

\[ 1 \le \left( \sum_{k=1}^{r} x_k \right)^2 = \left( \sum_{k=1}^{r} \frac{1}{\sqrt{\theta_k}} \sqrt{\theta_k}\, x_k \right)^2 \le \sum_{k=1}^{r} \frac{1}{\theta_k} \sum_{k=1}^{r} \theta_k x_k^2, \]

and thus the minimum in (17) is at least $\left( \sum_{k=1}^{r} \frac{1}{\theta_k} \right)^{-1}$. However, when $x$ takes the form

\[ x_k = \frac{C}{\theta_k}, \]

the minimum is achieved for

\[ C = \frac{1}{\sum_{k=1}^{r} \frac{1}{\theta_k}}. \]

B. Bounding from below

To provide a lower bound for shell modulus, we focus on geodesic paths (shortest walks). These are usually the most important pathways of influence between the ego and other nodes. Classical measures of centrality, such as closeness centrality and betweenness centrality, are based uniquely on shortest paths [19].

When collecting the egocentric data around an ego $a$, one can take care to avoid forming cycles, so that the resulting egonetwork becomes a tree. So, assuming $T^a(r)$ is a tree contained in $G^a(r)$, we can use $\Gamma$-monotonicity to get a lower bound, i.e., if $\Gamma' \subset \Gamma$, then $\mathrm{Mod}_2(\Gamma') \le \mathrm{Mod}_2(\Gamma)$ [5].

Moreover, if we write $\mathrm{Mod}_2(T^a(r))$ for the shell modulus of all walks in $T^a(r)$ starting at the root $a$ and reaching depth-level $r$, this can be analytically calculated.

Theorem 3. $\mathrm{Mod}_2(T^a(k))$ can be calculated using the following recursive formula:

\[ \mathrm{Mod}_2(T^a(k)) = \sum_{c \in C(a)} \frac{\mathrm{Mod}_2(T_{c,k-1})}{1 + \mathrm{Mod}_2(T_{c,k-1})}, \tag{18} \]

where $C(a) := \{c_1, c_2, \ldots, c_m\} \subseteq V$ are the children of $a$ and $T_{c,k-1}$ represents the subtree formed from $T_a$ by keeping only $c$ and its descendants.

To prove Equation (18), let $T_a$ be a rooted shortest-path tree at $a$ with vertex set $V$ and edge set $E$. Every density $\rho : E \to [0, \infty)$ gives a weighted distance on the tree defined by

\[ d_\rho(x, y) = \sum_{e \in \gamma(x, y)} \rho(e). \]

We define the set of admissible densities $\mathrm{Adm}(T^a_k)$ for walks starting from the root $a$ (ego) to leaves at depth $k$, denoted by $l_k$:

\[ \mathrm{Adm}(T^a_k) := \{ \rho : E \to [0, \infty) : \ell_\rho(a, l_k) \ge 1 \}, \]

with modulus

\[ \mathrm{Mod}_2(T^a_k) := \inf_{\rho \in \mathrm{Adm}(T^a_k)} \sum_{e \in E} \rho(e)^2. \]

Assuming $a$ has at least one child, let $C(a) := \{c_1, c_2, \ldots\} \subseteq V$ be the children. Each child $c$ induces two rooted subtrees (Figure 3). Let $T^a_c$ represent the subtree (still rooted at $a$) formed from $T_a$ by pruning all of $a$'s children other than $c$ along with their descendants, and let $T_c$ represent the subtree (now rooted at $c$) formed by removing $a$ from $T^a_c$.

Lemma 4 is an immediate consequence of the parallel rule of modulus: given two families $\Gamma_1$ and $\Gamma_2$, suppose that for every $e \in E$, $\gamma_1 \in \Gamma_1$ and $\gamma_2 \in \Gamma_2$ we have $\mathcal{N}(\gamma_1, e)\, \mathcal{N}(\gamma_2, e) = 0$; then $\mathrm{Mod}_2(\Gamma_1 \cup \Gamma_2) = \mathrm{Mod}_2(\Gamma_1) + \mathrm{Mod}_2(\Gamma_2)$.
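The recursion (18) in Theorem 3 can be sketched in a few lines. In this minimal sketch the tree is assumed to be given as a dictionary mapping each node to a list of its children, which is a representation chosen for this example and not one prescribed by the paper. A leaf at the target depth is treated as having infinite modulus, so that its contribution $\mathrm{Mod}/(1 + \mathrm{Mod})$ equals 1; a branch that never reaches the target depth contributes 0.

```python
from fractions import Fraction


def branch_term(children, c, depth):
    """Mod/(1 + Mod) for child c, where Mod = Mod_2(T_{c, depth-1})."""
    if depth == 1:
        # c itself lies at the target depth: Mod is infinite, so the term tends to 1.
        return Fraction(1)
    m = tree_modulus(children, c, depth - 1)
    return m / (1 + m)


def tree_modulus(children, root, depth):
    """Mod_2 of walks from `root` down to nodes exactly `depth` levels below it."""
    return sum((branch_term(children, c, depth) for c in children.get(root, ())),
               Fraction(0))


# Two hypothetical ego-trees on the same node set (assumptions for illustration):
# in the first, both depth-2 nodes hang under b; in the second, they are split.
tree_one = {"a": ["b", "c", "d"], "b": ["e", "f"]}
tree_two = {"a": ["b", "c", "d"], "b": ["e"], "c": ["f"]}
print(tree_modulus(tree_one, "a", 2), tree_modulus(tree_two, "a", 2))  # 2/3 1
```

The two trees give different values, which shows that the lower bound depends on how the tree is extracted from the egonetwork.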

[Figure 3: the tree rooted at $a$ with children $c_1, c_2, c_3, \cdots, c_m$ and their subtrees $T_{c_1,r-1}, \ldots, T_{c_m,r-1}$; the subtree $T^a_{c_3,r}$ is outlined.]

FIG. 3. The tree $T^a_r$ and its subtrees. Each child $c_i$ of $a$ can induce two subtrees, if it has descendants down to depth $r - 1$. $T^a_{c_i,r}$ (outlined with a dashed line for $i = 3$ in the figure) is the subtree rooted at $a$ formed by removing all other children and their descendants from $T^a_r$. $T_{c_i,r-1}$ is the subtree rooted at $c_i$ formed by removing $a$ from $T^a_{c_i,r}$.

Lemma 4. The modulus of $T^a_k$ is related to the moduli of the $T^a_{c_i,k}$ as follows:

\[ \mathrm{Mod}_2(T^a_k) = \sum_{i=1}^{m} \mathrm{Mod}_2(T^a_{c_i,k}). \]

By Lemma 4, we may restrict ourselves to the case that $a$ has a single child $c$. In this case, the serial rule for modulus allows us to reduce the problem to finding the modulus of $T_{c,k-1}$. This is explained in the following lemma.

Lemma 5. The modulus of $T^a_{c,k}$ is related to the modulus of $T_{c,k-1}$ as follows:

\[ \mathrm{Mod}_2(T^a_{c,k}) = \frac{\mathrm{Mod}_2(T_{c,k-1})}{1 + \mathrm{Mod}_2(T_{c,k-1})}. \tag{19} \]

Proof. If $c$ is a leaf of $T^a_k$, then $\rho(a, c) = 1$ is the minimizer for the modulus. Otherwise, by considering the density $\rho(a, c)$ on the edge from $a$ to $c$, the optimization effectively decouples. In order for $\rho$ to be admissible, it is necessary that $d_\rho(c, l) \ge 1 - \rho(a, c)$ for every leaf $l_{k-1}$ of $T_{c,k-1}$ at depth $k - 1$. For $0 \le \ell \le 1$, define the parameterized set of admissible densities, for every leaf $l_{k-1}$,

\[ \mathrm{Adm}(T_{c,k-1}; \ell) := \{ \rho : E \to [0, \infty) : d_\rho(c, l_{k-1}) \ge \ell \}, \]

and the parameterized modulus problem

\[ \mathrm{Mod}'_2(T_{c,k-1}; \ell) = \inf_{\rho \in \mathrm{Adm}(T_{c,k-1}; \ell)} \sum_{e \in E(T_{c,k-1})} \rho(e)^2, \]

where $E(T_{c,k-1})$ represents the set of edges in the subtree $T_{c,k-1}$. It is straightforward to verify that

\[ \mathrm{Mod}'_2(T_{c,k-1}; \ell) = \ell^2\, \mathrm{Mod}_2(T_{c,k-1}), \]

and, thus,

\[ \begin{aligned} \mathrm{Mod}_2(T^a_{c,k}) &= \inf_{0 \le \rho(a,c) \le 1} \left\{ \rho(a, c)^2 + \mathrm{Mod}'_2(T_{c,k-1}; 1 - \rho(a, c)) \right\} \\ &= \inf_{0 \le \rho(a,c) \le 1} \left\{ \rho(a, c)^2 + (1 - \rho(a, c))^2\, \mathrm{Mod}_2(T_{c,k-1}) \right\}. \end{aligned} \tag{20} \]

The infimum, given by (19), is attained when

\[ \rho(a, c) = \frac{\mathrm{Mod}_2(T_{c,k-1})}{1 + \mathrm{Mod}_2(T_{c,k-1})}. \]

Lemmas 4 and 5 combined prove Equation (18).

Equation (18) computes $\mathrm{Mod}_2(T^a(k))$ recursively. For each leaf node $l_k$, set $\mathrm{Mod}_2(T_{l_k,0}) = \infty$. Then (18) will propagate the modulus to the ego. For example, to compute $\mathrm{Mod}_2(T_{a,2})$ in the graph in Figure 4(b), we start by assigning $\infty$ for the modulus of the leaves $e$ and $f$. Then, by (18), each contributes 1 to node $b$, and $\mathrm{Mod}_2(T_{b,1}) = 2$. Thus $\mathrm{Mod}_2(T^a(2)) = \frac{\mathrm{Mod}_2(T_{b,1})}{1 + \mathrm{Mod}_2(T_{b,1})} = \frac{2}{3}$.

II. SHELL DEGREE

In conclusion, the Ahlfors upper bound (16) considers all edges in the shell connecting sets even if they are not on the shortest paths, such as the edge between $a$ and $d$ in Figure 4(a). On the other hand, when using the ego-tree approximation, we inevitably lose valuable information hidden in the edges that were discarded. For example, in Figure 4(b-c), to form a tree we need to solve the child custody problem between parents $b$ and $c$ and child $f$. In particular, the lower bound calculation will discard at least one edge. Moreover, this leads to multiple possible lower bounds, e.g., $\mathrm{Mod}_2(T_{a,r}) = \frac{2}{3}$ in Figure 4(b) and $\mathrm{Mod}_2(T_{a,r}) = 1$ for Figure 4(d).

As a compromise between the Ahlfors upper bound and the tree modulus lower bound, we propose a measure we call shell degree. Fix a depth $i = 1, 2, 3, \ldots, r$ and consider a tree rooted at the ego $a$, whose leaves are all contained in the shell $S(a, i)$, and such that the geodesics from the root to $S(a, i)$ take exactly $i$ hops. Let $H(a, i) = (V_i, E_i)$ be the union of all such trees found by breadth-first search. For instance, in Figure 4(d) we show $H(a, 2)$ in that case. Note that we discarded nodes that are not on the geodesic paths from $a$ to $S(a, 2)$.

Since, in general, we cannot use the recursion (18) on $H(a, r)$, we instead compute the upper bound (16). Namely, we consider the shell connecting sets $E_i(a, k)$ for $H(a, i)$ and define the generalized shell degree to be the following expression:

\[ \mathrm{gDeg}(a) := \sum_{i=1}^{r} \frac{1}{\sum_{k=1}^{i} \frac{1}{|E_i(a, k)|}}. \tag{21} \]
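The bound (16) and the shell degree (21) can be sketched together, since (21) applies (16) to each subgraph $H(a, i)$. The minimal sketch below assumes an undirected graph stored as an adjacency dictionary; the helper names and the six-node toy graph are assumptions for this example. The node set of $H(a, i)$ is obtained by walking back from the shell $S(a, i)$ and keeping only nodes that have a kept neighbor one level further out.

```python
from collections import deque
from fractions import Fraction


def bfs_distances(adj, ego):
    """Hop distance d(ego, v) for every node reachable from the ego."""
    dist = {ego: 0}
    queue = deque([ego])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def connecting_counts(adj, dist, depth, keep=None):
    """[|E(a,1)|, ..., |E(a,depth)|], optionally restricted to the nodes in `keep`."""
    counts = [0] * depth
    for u in adj:
        for w in adj[u]:
            ku, kw = dist.get(u), dist.get(w)
            if ku is None or kw is None or kw != ku + 1 or kw > depth:
                continue  # only edges from shell k-1 to shell k, each counted once
            if keep is None or (u in keep and w in keep):
                counts[kw - 1] += 1
    return counts


def ahlfors_bound(counts):
    """Right-hand side of (16); zero if some connecting set is empty."""
    if 0 in counts:
        return Fraction(0)
    return 1 / sum(Fraction(1, c) for c in counts)


def geodesic_nodes(adj, dist, depth):
    """Nodes on some shortest path from the ego to the shell S(a, depth)."""
    keep = {v for v, d in dist.items() if d == depth}
    for level in range(depth - 1, -1, -1):
        keep |= {v for v, d in dist.items() if d == level
                 and any(w in keep and dist[w] == level + 1 for w in adj[v])}
    return keep


def shell_degree(adj, ego, r):
    """gDeg(a) as in (21): the bound (16) evaluated on H(a, i) for i = 1..r."""
    dist = bfs_distances(adj, ego)
    total = Fraction(0)
    for i in range(1, r + 1):
        keep = geodesic_nodes(adj, dist, i)
        total += ahlfors_bound(connecting_counts(adj, dist, i, keep))
    return total


# Hypothetical toy egonetwork (an assumption, not data from the paper).
toy = {"a": ["b", "c", "d"], "b": ["a", "e", "f"], "c": ["a", "d", "f"],
       "d": ["a", "c"], "e": ["b"], "f": ["b", "c"]}
dist = bfs_distances(toy, "a")
print(ahlfors_bound(connecting_counts(toy, dist, 2)))  # 3/2
print(shell_degree(toy, "a", 2))                       # 21/5
```

On this toy graph node d has no neighbor in the second shell, so it is dropped from $H(a, 2)$ and the second summand uses connecting sets of sizes 2 and 3.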

[Figure 4: four drawings, panels (a) to (d), of an egonetwork on the nodes $a, b, c, d, e, f$.]

FIG. 4. (a) To compute the upper bound in (16), for ego $a$ and depth $k = 2$, edge $\{a, c\}$ has the same role as edge $\{a, d\}$. (b) and (c) give different ways to obtain $T^a_2$. (d) shows the edges considered in shell degree.

Algorithm 1 Algorithm for computing summands in (21).
1: D ← set of all descendants for each ancestor
2: r ← neighborhood order
3: k ← 1
4: for nodes in {S^r(a, k), k ≤ r} do
5:   Update D with nodes as new descendants
6:   Remove ancestors that do not have any descendants in nodes
7:   k ← k + 1
8: end for
9: return harmonic means of the number of ancestral relations in each k

Observe that the first summand of (21) is the ordinary degree of the ego, and thus our formula acts as a generalization of degree which takes into account information about the shells around the ego. For example, we have $E_1(a, 1) = 3$, $E_2(a, 1) = 2$, $E_2(a, 2) = 3$ in Figure 4(d). For $r = 2$, $\mathrm{gDeg}(a) = 3 + \frac{1}{\frac{1}{2} + \frac{1}{3}} = 3 + 6/5 = 4.2$.

We illustrate the differences between (21) and (15), (16), and (18) in Table I for the egonetwork in Figure 4.

We can compute the summands in (21) with Algorithm 1. Normalization is unnecessary for shell degree, as in the case of degree, which is critical when comparing the centrality of different egos when there is no information about connections between their ego-networks.

In short, we keep track of ancestral relations from the ego to nodes in each shell, and discard nodes that do not have any descendants in shell $r$; this yields the required information about $H(a, r)$, and thus we can find the summands in (21). The overall time complexity of calculating (21) depends on the graph search in step 4 of Algorithm 1 and on keeping the information of ancestral relationships, i.e., for an ego network $G^a(r)$ of size $n_a$, the algorithm runs in $O(r n_a)$.

We illustrate the performance of shell degree compared to the Ahlfors upper bound and the tree modulus lower bound for conventional random network models such as Erdős-Rényi networks, scale-free networks (Barabási-Albert model [20]), spatial networks (geometric model in the unit square [21]), and small-world networks (Watts-Strogatz model [22]). Figure 5 shows that shell degree gives a better approximation for $\mathcal{C}_{\mathrm{shell}}(a, r)$ than the Ahlfors and tree modulus estimates. We see that for egocentric network data with medium sizes and order of neighborhood, shell degree performs extremely well. However, it is possible to produce pathological network examples for which all of the estimates for shell modulus get worse as $n, r \to \infty$; see Appendix B for more details.

III. APPLICATIONS OF SHELL DEGREE FOR TARGETED IMMUNIZATION STRATEGIES

Targeted immunizations in computer networks and human populations can greatly impact the overall outcome of spreading processes [23–25]. Mitigating an epidemic with random immunization of nodes requires vaccinating over 80% of the population, and thus identifying a good set of target nodes has attracted much attention [26, 27].

Most of the methods for finding good sets of nodes to immunize require global knowledge of the network, making them impossible to use in some practical situations. Therefore, scientists prefer algorithms that are agnostic relative to the global structure of the network. For example, acquaintance immunization chooses random neighbors of randomly picked nodes [28]. In what follows, we illustrate the immunization performance of the approximation of the egocentric version of effective conductance that we call shell degree. We assume $r = 3$, i.e., knowledge of neighbors together with neighbors of neighbors is available. The efficacy of immunization is compared to the popular egocentric measure of acquaintance centrality, and to sociocentric indices such as effective conductance, betweenness centrality, and eigenvector centrality.

We consider the epidemic model "susceptible, infected, recovered" (SIR), which represents infectious processes that are not reversible. Susceptible nodes (S) in the network become infected (I) proportionally to the infectious rate $\beta$ and the number of infected neighbors, and eventually they rest in state (R) after a recovery period of $\frac{1}{\delta}$ days on average (see Figure 6). We assume a constant $\delta = 0.1$, i.e., nodes stay in state (I) an average of 10 days. To model widespread diseases such as the flu that are caused by close contacts, the infectious rate $\beta$ is chosen to have reproduction number $R_0 \sim \frac{\beta}{\delta} \langle k \rangle = 3$, where $\langle k \rangle$ is the average degree of the network [27]. After updating the contact networks with the immunized nodes, we assess the performance of each strategy. In our experiments, all nodes are initially susceptible and the infectious process

starts from a randomly chosen patient zero. The performance of immunization strategies is monitored by measuring the epidemic final size, i.e., the number of nodes in state (R) after there are no more (I) nodes.

TABLE I. Examples for shell modulus, bounds and shell degree. (The egonetwork drawings shown for each entry in the original table are omitted.)

Quantity        k = 1   k = 2   k = 3   total
Mod(a, S_k)     3       1.26    0.44    4.71
Lower bound     3       0.66    0.4     4.06
Upper bound     3       1.5     0.85    5.35
Shell degree    3       1.2     0.4     4.6

We simulate the process 2000 times for each immunization strategy and each immunization coverage. The simulations are done with GEMFsim, which employs event-based exact stochastic simulation [29], for the US power grid and PGP networks, and the friendship network for Princeton University extracted from Facebook [30]. Salathé et al. [27] suggest considering interactions of individuals in the same dormitory or same year and major, for the Facebook friendship networks, to capture potential physical networks; this makes the networks extremely modular.

In Figure 7, each bar shows the difference in the number of cases in the outbreak immunized with the two strategies shown on the y-axis for a network. A positive difference (shown in red) means the alternate strategy performs better than shell degree, and a negative difference (shown in blue) means that shell degree gives better immunization and prevents more cases. We test the significance of comparisons of the obtained results using the nonparametric Mann-Whitney test with $\alpha = 0.05$ [31], and statistically non-significant conclusions are shown by shaded colors.

Effective conductance and betweenness centralities perform better than shell degree for small immunization coverages. However, using sociocentric centrality measures to design targeted immunization strategies can overlook an important issue, namely, that after removing a fraction of the nodes in the network, the initial ranking by these measures is no longer valid. On the other hand, this is not as dramatic for egocentric measures such as degree, acquaintance, and the egocentric version of effective conductance, and the resulting ranking is more robust after changes in the network [10]. Sociocentric measures generally struggle with this fact, and thus searching for a good egocentric measure is critical. Therefore, with increasing immunization coverage, shell degree performs better than (or similarly to) other methods. For strongly modular networks, e.g., the Princeton friendship network, closeness centrality measures are generally less efficient compared to betweenness centrality measures. However, in this case, shell degree performs better than both eigenvector centrality and acquaintance immunization.
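The simulation setup described above can be sketched with a small event-based SIR routine. This is a minimal sketch under stated assumptions and not the GEMFsim implementation: the function names are invented for this example, immunized nodes are simply treated as removed from the contact network, and event times are not tracked because only the final size is needed. Each susceptible-infected edge transmits at rate $\beta$ and each infected node recovers at rate $\delta$.

```python
import random


def infection_rate(adj, r0, delta):
    """Choose beta so that R0 = (beta / delta) * <k>, with <k> the mean degree."""
    mean_degree = sum(len(nbrs) for nbrs in adj.values()) / len(adj)
    return r0 * delta / mean_degree


def sir_final_size(adj, beta, delta, patient_zero, immunized=(), rng=None):
    """Number of recovered nodes once no infected node is left."""
    rng = rng or random.Random()
    removed = set(immunized)
    if patient_zero in removed:
        return 0
    infected, recovered = {patient_zero}, set()
    while infected:
        # Edges from an infected node to a node that is still susceptible.
        si_edges = [(u, w) for u in infected for w in adj[u]
                    if w not in infected and w not in recovered and w not in removed]
        rate_infect = beta * len(si_edges)
        rate_recover = delta * len(infected)
        # Pick the next event type with probability proportional to its total rate.
        if rng.random() * (rate_infect + rate_recover) < rate_infect:
            _, w = rng.choice(si_edges)
            infected.add(w)
        else:
            v = rng.choice(sorted(infected))
            infected.remove(v)
            recovered.add(v)
    return len(recovered)


# Hypothetical contact network: a ring of 10 nodes (an assumption for illustration).
ring = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}
beta = infection_rate(ring, r0=3, delta=0.1)
sizes = [sir_final_size(ring, beta, 0.1, 0, rng=random.Random(seed)) for seed in range(5)]
print(sizes)
```

Comparing two immunization strategies then amounts to calling `sir_final_size` many times with each strategy's `immunized` set and comparing the resulting final-size samples.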

[Figure 5: four line plots, panels (a) Erdős-Rényi network, (b) scale-free network, (c) random geometric network, (d) small-world network. x-axis: network size (×100, from 1 to 10); y-axis: computed centrality of a randomly chosen node. Curves: Ahlfors' upper bound, shell degree, shell modulus, tree modulus.]

FIG. 5. Comparing the value of the Ahlfors upper bound, tree modulus lower bound, shell degree, and shell modulus in simulated random network models: (a) Erdős-Rényi networks with $p = 2 \log n / n$, (b) scale-free network by the Barabási-Albert model [20] with 6 edges, (c) spatial network (random geometric network [21]) with distance threshold value $r = \sqrt{2 \log n / n}$, and (d) small-world network by the Watts-Strogatz model with initial degree of $2 \log n$ and rewiring probability 0.3. Shell degree provides a fair estimate of shell modulus in these networks.

[Figure 6: transition diagram S → I → R, with rate $\beta Y_i$ on the transition from S to I and rate $\delta$ on the transition from I to R.]

FIG. 6. Schematic of the transition graph of node $i$ in the SIR model. The infection and recovery rates are denoted by $\beta$ and $\delta$, and $Y_i$ is the number of neighbors in the infected state I.

[Figure 7: bar charts of Δcases for (gDeg - $\mathcal{C}_{\mathrm{eff}}$), (gDeg - EV), (gDeg - acquaintance), and (gDeg - BC) against vaccination coverage (0.01, 0.06, 0.12, 0.18, 0.24, 0.3), for the Grid, PGP, and PR networks.]

FIG. 7. Comparing different immunization strategies (effective conductance, acquaintance, eigenvector centrality, and betweenness centrality) with the approximation of the egocentric version of effective conductance that we call shell degree, with $r = 3$. The immunization coverage varies from 1% to 30% of the highest central nodes. Bars show the difference in the final size of the epidemic outbreak. Negative differences show that shell degree prevents more cases compared to the other policy. By increasing the coverage, shell degree outperforms other methods, since it is more robust to changes in the network structure. Results are inferred using 2000 simulations of the SIR epidemic model, and statistically nonsignificant results are shown by shaded bars. The empirical networks we consider are the US power grid (Grid) [22], the PGP network (PGP) [32], and a Facebook friendship network from Princeton University (PR) [30].

IV. CONCLUSIONS

In summary, we studied effective conductance centrality in the language of modulus of families of walks and in the context of egocentric networks. We compared our method to its well-known sociocentric counterpart and illustrated the advantages of our approach. For undirected networks, shell modulus can be computed by solving a Laplacian system similar to [33]. Moreover, for directed multi-edge networks, we propose approximations that carry the same benefits as the original definition while being easier to compute and scalable. Finally, we introduced a generalization of degree called shell degree. Applications of these tools illustrate the advantages of the proposed measures, for instance to guide epidemic mitigation strategies under limited knowledge of the overall network.

ACKNOWLEDGMENTS

The authors are thankful to NSF grants DMS-1515810 and CIF-1423411.

[1] K. Stephenson and M. Zelen, Social Networks 11, 1 (1989).
[2] D. J. Klein and M. Randić, Journal of Mathematical Chemistry 12, 81 (1993).
[3] H. Shakeri, analysis using modulus of families of walks, Ph.D. thesis, Kansas State University (2017).
[4] H. Shakeri, P. Poggi-Corradini, N. Albin, and C. Scoglio, Phys. Rev. E 95, 012316 (2017).
[5] H. Shakeri, P. Poggi-Corradini, C. Scoglio, and N. Albin, Journal of Computational and Applied Mathematics 307, 307 (2016).
[6] R. J. Duffin, J. Math. Anal. Appl. 5, 200 (1962).
[7] N. Albin, M. Brunner, R. Perez, P. Poggi-Corradini, and N. Wiens, Conformal Geometry and Dynamics of the American Mathematical Society 19, 298 (2015).
[8] P. Van Mieghem, K. Devriendt, and H. Cetinay, Physical Review E 96, 032311 (2017).
[9] J. A. Carrasco, B. Hogan, B. Wellman, and E. J. Miller, Environment and Planning B: Planning and Design 35, 961 (2008).
[10] E. Costenbader and T. W. Valente, Social Networks 25, 283 (2003).
[11] B. Zemljič and V. Hlebec, Social Networks 27, 73 (2005).
[12] A. Davis, B. B. Gardner, and M. R. Gardner, Deep South: A social anthropological study of caste and class (Univ of South Carolina Press, 2009).
[13] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, Behavioral Ecology and Sociobiology 54, 396 (2003).
[14] P. M. Gleiser and L. Danon, Advances in Complex Systems 6, 565 (2003).
[15] D. Goldfarb and A. Idnani, Mathematical Programming 27, 1 (1983).
[16] N. Albin and P. Poggi-Corradini, The Journal of Analysis 24, 183 (2016).
[17] L. V. Ahlfors, Conformal invariants: topics in geometric function theory (McGraw-Hill Book Co., New York-Düsseldorf-Johannesburg, 1973) pp. ix+157, McGraw-Hill Series in Higher Mathematics.
[18] R. Lyons and Y. Peres, Probability on trees and networks, Vol. 42 (Cambridge University Press, 2016).
[19] L. C. Freeman, Social Networks 1, 215 (1978).
[20] A.-L. Barabási and R. Albert, Science 286, 509 (1999).
[21] M. Penrose, Random geometric graphs, 5 (Oxford University Press, 2003).
[22] D. J. Watts and S. H. Strogatz, Nature 393, 440 (1998).
[23] R. Pastor-Satorras and A. Vespignani, Physical Review E 65, 036104 (2002).
[24] A. E. Motter and Y.-C. Lai, Physical Review E 66, 065102 (2002).
[25] M. Zhao, T. Zhou, B.-H. Wang, and W.-X. Wang, Physical Review E 72, 057102 (2005).
[26] Y. Chen, G. Paul, S. Havlin, F. Liljeros, and H. E. Stanley, Physical Review Letters 101, 058701 (2008).
[27] M. Salathé and J. H. Jones, PLoS Comput Biol 6, e1000736 (2010).
[28] R. Cohen, S. Havlin, and D. Ben-Avraham, Physical Review Letters 91, 247901 (2003).
[29] F. D. Sahneh, A. Vajdi, H. Shakeri, F. Fan, and C. Scoglio, arXiv preprint arXiv:1604.02175 (2016).
[30] A. L. Traud, P. J. Mucha, and M. A. Porter, Physica A: Statistical Mechanics and its Applications 391, 4165 (2012).
[31] H. B. Mann and D. R. Whitney, The Annals of Mathematical Statistics , 50 (1947).
[32] M. Boguñá, R. Pastor-Satorras, A. Díaz-Guilera, and A. Arenas, Physical Review E 70, 056122 (2004).
[33] W. Ellens, F. Spieksma, P. Van Mieghem, A. Jamakovic, and R. Kooij, Linear Algebra and its Applications 435, 2491 (2011).
[34] D. Spielman, Lecture Notes, Yale University , 740 (2009).

Appendix A: Ahlfors upper bound for Erdős-Rényi networks

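We want to estimate the expected Ahlfors upper bound in Erdős-Rényi networks in the connected regime:

\[ p(N-1) = 2\log N. \]

We can use the concavity of the Ahlfors bound (Jensen's inequality) to get

\[ \mathbb{E}\left( \sum_{i=1}^{r} \frac{1}{\sum_{k=1}^{i} \frac{1}{\theta_k}} \right) \le \sum_{i=1}^{r} \frac{1}{\sum_{k=1}^{i} \frac{1}{\mathbb{E}\theta_k}}, \]

so we would like to estimate $\mathbb{E}(\theta_k)$.

• First, note that $\theta_1$ is Binomial$(N-1, p)$. So:

\[ \mathbb{E}\theta_1 = p(N-1), \]

from the binomial distribution.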

• Now, given $\theta_1$, we must toss $\theta_1$ variables distributed as Binomial$(N-1-\theta_1, p)$, because the ego and the first shell are now out of consideration. So

\[ \mathbb{E}(\theta_2 \mid \theta_1) = \theta_1\, p\,(N-1-\theta_1). \]

Therefore, computing the second moment of $\theta_1$, we get:

\[ \mathbb{E}\theta_2 = \mathbb{E}(\mathbb{E}(\theta_2 \mid \theta_1)) = \mathbb{E}(\theta_1)\,p(N-1) - p\,\mathbb{E}(\theta_1^2) = p^2(1-p)(N-1)(N-2). \]

• Given $\theta_1$ and $\theta_2$, we must toss a certain number $s$ of Binomial$(N-1-\theta_1, p)$ random variables, where $s$ is the number of nodes in the second shell. However, this number $s$ is not easy to calculate because it depends on the interaction at the previous step. For instance, if all the binomial variables in the previous step are equal to zero, then $s = 0$. But for higher values of $s$ it becomes quite complicated. In particular, up to constant factors, we will have

\[ \mathbb{E}\theta_1 \simeq \log N \quad\text{and}\quad \mathbb{E}\theta_2 \simeq (\log N)^2. \]
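As a sanity check on the second-moment computation, the sketch below evaluates $\mathbb{E}[\theta_1 p (N-1-\theta_1)]$ exactly by summing over the Binomial$(N-1,p)$ distribution and compares it with the closed form $p^2(1-p)(N-1)(N-2)$. The function names and the sample values $N = 12$, $p = 1/3$ are illustrative choices, not taken from the paper.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p, k):
    """Exact P(X = k) for X ~ Binomial(n, p), with p given as a Fraction."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_theta2(N, p):
    """E[theta_1 * p * (N - 1 - theta_1)] with theta_1 ~ Binomial(N - 1, p)."""
    return sum(
        binomial_pmf(N - 1, p, t) * t * p * (N - 1 - t)
        for t in range(N)  # theta_1 takes values 0, ..., N - 1
    )


def closed_form_theta2(N, p):
    """The closed form p^2 (1 - p) (N - 1) (N - 2) stated in the text."""
    return p**2 * (1 - p) * (N - 1) * (N - 2)


# Illustrative values only (hypothetical, chosen to keep the numbers small).
N, p = 12, Fraction(1, 3)
print(expected_theta2(N, p), closed_form_theta2(N, p))
```

Because the arithmetic is done with `Fraction`, the two values can be compared for exact equality rather than up to rounding.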

1. Lower bound for E(θk)

First, we will estimate $\mathbb{E}\theta_k$ from below. Given an ego $a$, Spielman [34] sets

\[ r(a) := \max\left\{ r : |B(a,r)| \le \frac{N}{12\log N} \right\} \]

and then shows that for $k \le r(a)$,

\[ \mathbb{P}\left( |S(a,k+1)| \le \frac{1}{5}\log N\,|S(a,k)| \right) \le N^{-1.2\,|S(a,k)|}. \]

He first finds that

\[ \mathbb{E}\left[\, |S(a,k+1)| \;\big|\; G^a(k) \right] \ge \frac{5}{3}\,|S(a,k)|\log N, \tag{A1} \]

and then applies the theory of Chernoff bounds. Note that by simply taking the expectation in (A1) we get

\[ \mathbb{E}|S(a,k+1)| \ge \frac{5}{3}(\log N)\,\mathbb{E}|S(a,k)|. \]

This gives geometric growth for $k \le r(a)$:

\[ \mathbb{E}|S(a,k)| \ge (\log N)^k. \tag{A2} \]

In our case, since every $c \notin B(a,k)$ must toss $|S(a,k)|$ biased coins, we get

\[ \mathbb{E}\left[\theta_{k+1} \mid G^a(k)\right] = |S(a,k)|\,p\,(N - |B(a,k)|) \ge \frac{11}{12}\,|S(a,k)|\,pN \ge \frac{11}{6}(\log N)\,|S(a,k)|, \]

where the last step uses $pN \ge p(N-1) = 2\log N$. Again, we can take expectations and get

\[ \mathbb{E}\theta_{k+1} \ge \frac{11}{6}(\log N)\,\mathbb{E}|S(a,k)|. \]

Using (A2), we get

\[ \mathbb{E}\theta_k \ge (\log N)^k. \]
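The passage from (A1) to (A2), and from (A2) to the bound on $\mathbb{E}\theta_k$, is a short induction. The following worked steps spell it out under the assumption (not stated explicitly in the text) that the zeroth shell is the ego itself, $S(a,0) = \{a\}$, and that $k \le r(a)$ throughout.

```latex
\begin{align*}
\mathbb{E}|S(a,0)| &= 1, \\
\mathbb{E}|S(a,k+1)| &\ge \tfrac{5}{3}(\log N)\,\mathbb{E}|S(a,k)|
   \ge (\log N)\,\mathbb{E}|S(a,k)|
   \quad\Longrightarrow\quad
   \mathbb{E}|S(a,k)| \ge (\log N)^{k}, \\
\mathbb{E}\theta_{k+1} &\ge \tfrac{11}{6}(\log N)\,\mathbb{E}|S(a,k)|
   \ge \tfrac{11}{6}(\log N)^{k+1}
   \ge (\log N)^{k+1}.
\end{align*}
```

The constants $5/3$ and $11/6$ are simply discarded, which is why the final bounds carry no leading constant.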

2. Upper bound for Eθk

To get an upper bound we can compare the growth in the Erdős-Rényi graph with the growth for a Galton-Watson branching process with offspring distribution $X = \text{Binomial}(N-1, p)$. This will be larger because there are no collisions and we always toss the maximum number of coins. If $Z_k$ is the population at time $k$, then

\[ \mathbb{E}Z_k = \mu^k, \]

where $\mu = \mathbb{E}X = p(N-1) = 2\log N$, and we get that

\[ \mathbb{E}\theta_k \le (2\log N)^k. \]
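The comparison with the branching process can be explored numerically. The sketch below samples Erdős-Rényi graphs with $p(N-1) = 2\log N$, counts the edges joining consecutive shells around an ego, and prints the sample means next to $(2\log N)^k$. Reading $\theta_k$ as the number of edges between shell $k-1$ and shell $k$ is an interpretation based on the conditional expectations above; the function names, the graph size, and the number of trials are hypothetical choices.

```python
import math
import random


def sample_er_graph(n, p, rng):
    """Adjacency sets of an Erdos-Renyi graph G(n, p) on nodes 0..n-1."""
    adj = [set() for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj


def shell_edge_counts(adj, ego, r):
    """Return [theta_1, ..., theta_r]: edges joining shell k-1 to shell k."""
    dist = {ego: 0}
    shell = [ego]
    thetas = []
    for k in range(1, r + 1):
        next_shell = []
        theta = 0
        for u in shell:
            for v in adj[u]:
                if v not in dist:          # first time v is reached: it is in shell k
                    dist[v] = k
                    next_shell.append(v)
                if dist[v] == k:           # edge from shell k-1 into shell k
                    theta += 1
        thetas.append(theta)
        shell = next_shell
    return thetas


def mean_shell_edge_counts(n, r, trials, seed=0):
    """Average theta_1..theta_r over independent graphs with p(n-1) = 2 log n."""
    rng = random.Random(seed)
    p = 2.0 * math.log(n) / (n - 1)
    sums = [0.0] * r
    for _ in range(trials):
        thetas = shell_edge_counts(sample_er_graph(n, p, rng), ego=0, r=r)
        sums = [s + t for s, t in zip(sums, thetas)]
    return [s / trials for s in sums]


if __name__ == "__main__":
    n, r = 200, 2
    for k, mean in enumerate(mean_shell_edge_counts(n, r, trials=50), start=1):
        print(k, mean, (2.0 * math.log(n)) ** k)
```

Sample means fluctuate from run to run, so the printed values only illustrate the bound, which is a statement about expectations.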

a. Upper bound for the Ahlfors estimate

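We can apply this to our estimate of the average Ahlfors upper bound and get that:

\begin{align*}
\mathbb{E}\left[ \sum_{k=1}^{r} \frac{1}{\sum_{j=1}^{k} \frac{1}{\theta_j}} \right]
&\le \sum_{k=1}^{r} \frac{1}{\sum_{j=1}^{k} \frac{1}{\mathbb{E}\theta_j}} \\
&\le \sum_{k=1}^{r} \frac{1}{\sum_{j=1}^{k} \frac{1}{(2\log N)^j}} \\
&= (2\log N - 1) \sum_{k=1}^{r} \left( 1 + \frac{1}{(2\log N)^k - 1} \right) \\
&\simeq (2\log N - 1) \left[ r + \sum_{k=1}^{r} \frac{1}{(2\log N)^k} \right] \\
&= (2\log N - 1) \left[ r + \frac{1}{2\log N} \cdot \frac{1 - \left(\frac{1}{2\log N}\right)^{r}}{1 - \frac{1}{2\log N}} \right] \\
&= (2\log N - 1) \left[ r + \frac{1}{2\log N - 1} - \frac{1}{(2\log N)^r\,(2\log N - 1)} \right] \\
&\lesssim (2\log N - 1)(r + 1).
\end{align*}

The last step drops the negative term and uses $1/(2\log N - 1) \le 1$, which holds for $N \ge 3$.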
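To see how tight the final estimate is, the sketch below evaluates the sum with $\theta_j$ replaced by $(2\log N)^j$ and compares it with $(2\log N - 1)(r+1)$. The helper names and the sample values of $N$ and $r$ are illustrative assumptions.

```python
import math


def ahlfors_sum(thetas):
    """sum_{k=1}^r 1 / (sum_{j=1}^k 1/theta_j) for thetas = [theta_1, ..., theta_r]."""
    total, inverse_sum = 0.0, 0.0
    for theta in thetas:
        inverse_sum += 1.0 / theta   # running sum of 1/theta_j up to level k
        total += 1.0 / inverse_sum
    return total


def geometric_bound(N, r):
    """The sum above with theta_j replaced by its upper bound (2 log N)^j."""
    x = 2.0 * math.log(N)
    return ahlfors_sum([x**j for j in range(1, r + 1)])


def closed_form(N, r):
    """The simplified estimate (2 log N - 1)(r + 1)."""
    return (2.0 * math.log(N) - 1.0) * (r + 1)


for N, r in [(100, 3), (1000, 5), (10**6, 8)]:
    print(N, r, geometric_bound(N, r), closed_form(N, r))
```

For these values the exact sum sits between $(2\log N - 1)\,r$ and $(2\log N - 1)(r+1)$, so the simplified form overestimates by less than one extra term.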

Appendix B: Behavior of shell modulus estimates when n, r → ∞

1. Modulus on the

Verifying that a metric $\rho$ is extremal for $p$-modulus can be done using Beurling's criterion (proof in [16]).

Theorem 6 (Beurling's Criterion for Extremality). Let $G$ be a simple graph, $\Gamma$ a family of walks on $G$, and $1 < p < \infty$. Then, a density $\rho \in \mathrm{Adm}(\Gamma)$ is extremal for $\mathrm{Mod}_p(\Gamma)$, if there is a subfamily $\tilde\Gamma \subset \Gamma$ with $\ell_\rho(\gamma) = 1$ for all $\gamma \in \tilde\Gamma$, such that for all $h \in \mathbb{R}^E$:

\[ \sum_{e \in E} \mathcal{N}(\gamma, e)\,h(e) \ge 0 \ \text{ for all } \gamma \in \tilde\Gamma \quad\Longrightarrow\quad \sum_{e \in E} h(e)\,\rho^{p-1}(e) \ge 0. \tag{B1} \]

The complete graph $K_N$ is a simple graph on $N$ nodes in which every node is connected to every other node; see Figure 8.

FIG. 8. $K_6$: complete graph on 6 nodes.

Figure 9 depicts the extremal density $\rho^*$ for $\Gamma(a,b)$ in $K_N$. In formulas, $\rho^*(a,x) = 1/2 = \rho^*(b,x)$ for every $x \ne a, b$, and $\rho^*(a,b) = 1$; otherwise $\rho^*$ is zero. To verify Beurling's criterion, consider the subfamily $\tilde\Gamma$ of simple paths consisting of $a\,b$ and $a\,x\,b$ for any $x \ne a, b$. We get that

\[ \mathrm{Mod}_p(\Gamma(a,b)) = 1 + 2(N-2)\,\frac{1}{2^p} \quad\text{and}\quad \mathrm{Mod}_2(\Gamma(a,b)) = \frac{N}{2}. \]

FIG. 9. $\rho^*$ for $\Gamma(a,b)$ on $K_N$.
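By (5), on an undirected network $\mathrm{Mod}_2(\Gamma(a,b))$ equals the effective conductance between $a$ and $b$, so the value $N/2$ for $K_N$ can be cross-checked with a direct resistor-network computation. The sketch below grounds $b$, injects a unit current at $a$, and solves the reduced Laplacian system exactly with rational arithmetic. It assumes unit conductance on every edge; the function name and the Gauss-Jordan solver are illustrative choices rather than the method used in the paper.

```python
from fractions import Fraction
from itertools import combinations


def effective_conductance(n, edges, a, b):
    """Effective conductance between a and b, unit conductance on each edge.

    Node b is grounded (potential 0) and a unit current enters at a. The
    potential at a is then the effective resistance.
    """
    keep = [v for v in range(n) if v != b]
    index = {v: i for i, v in enumerate(keep)}
    m = len(keep)
    # Augmented matrix [reduced Laplacian | right-hand side].
    M = [[Fraction(0)] * (m + 1) for _ in range(m)]
    for u, v in edges:
        for s, t in ((u, v), (v, u)):
            if s != b:
                M[index[s]][index[s]] += 1
                if t != b:
                    M[index[s]][index[t]] -= 1
    M[index[a]][m] = Fraction(1)
    # Gauss-Jordan elimination over the rationals.
    for col in range(m):
        pivot = next(row for row in range(col, m) if M[row][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]
        for row in range(m):
            if row != col and M[row][col] != 0:
                factor = M[row][col]
                M[row] = [x - factor * y for x, y in zip(M[row], M[col])]
    resistance = M[index[a]][m]
    return 1 / resistance


N = 6
complete_graph_edges = list(combinations(range(N), 2))
print(effective_conductance(N, complete_graph_edges, 0, 1), Fraction(N, 2))
```

The same routine accepts any connected simple graph given as an edge list, which makes it a convenient reference when checking a candidate extremal density on small examples.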

2. Modulus on a chain of complete graphs

Take $n$ complete graphs $K_1, \dots, K_n$.

a. Constant sizes. For $j = 1, \dots, n$, assume that $|V(K_j)| = N$, and pick a pair of distinct nodes $x_{j-1}, y_j \in V(K_j)$. Then, for $j = 1, \dots, n-1$, glue $y_j \in V(K_j)$ to $x_j \in V(K_{j+1})$. We denote the resulting graph by $G(N, n)$. For convenience, for $j = 1, \dots, n$, we write $A_j := V(K_j) \setminus \{x_{j-1}, y_j\}$, so that the shell at level $j$ is $S_j = V(K_j) \setminus \{x_{j-1}\} = A_j \cup \{y_j\}$. Then, fix $m = 1, \dots, n$, and for $j = 1, \dots, m-1$ and $e \in E(K_j)$, define the following density:

\[ \rho^*(e) := \begin{cases} \dfrac{1}{m} & \text{if } e = \{x_{j-1}, y_j\}, \\[1ex] \dfrac{1}{2m} & \text{if } e = \{x_{j-1}, a\} \text{ or } e = \{y_j, a\} \text{ for some } a \in A_j, \\[1ex] 0 & \text{otherwise.} \end{cases} \]

For $j = m$ and $e \in E(K_m)$, set

\[ \rho^*(e) := \begin{cases} \dfrac{1}{m} & \text{if } e = \{x_{m-1}, a\} \text{ for some } a \in A_m \cup \{y_m\}, \\[1ex] 0 & \text{otherwise.} \end{cases} \]

Observe that the support of $\rho^*$ can be decomposed as the disjoint union of $N-1$ paths. To see this, enumerate each $A_j = \{a_{j,k}\}_{k=1}^{N-2}$. Then, for $k = 1, \dots, N-2$, let

\[ \gamma_{m,k} := x_0\, a_{1,k}\, x_1\, a_{2,k} \cdots x_{m-1}\, a_{m,k}. \]

Finally set

\[ \gamma_{m,0} := x_0\, y_1 \cdots x_{m-1}\, y_m. \]

One can check that $\tilde\Gamma = \{\gamma_{m,k}\}_{k=0}^{N-2}$ is a Beurling subfamily for the shell modulus $\mathrm{Mod}_2(x_0, S_m)$. So

\[ \mathrm{Mod}_2(x_0, S_m) = \frac{1}{m} + (N-2)\left( \frac{2m-2}{4m^2} + \frac{1}{m^2} \right) = \frac{N}{2m}\left( 1 + \frac{1}{m} \right) - \frac{1}{m^2}, \]

which is roughly $N/(2m)$. Also note that for $m = 1$ we recover the degree of $x_0$. If we sum we get

\[ \sum_{m=1}^{n} \mathrm{Mod}_2(x_0, S_m) \simeq \frac{N}{2} \sum_{m=1}^{n} \frac{1}{m} \simeq \frac{N}{2}\log n. \]

The Ahlfors upper bound gives

\[ \sum_{m=1}^{n} \frac{1}{\sum_{j=1}^{m} \frac{1}{N-1}} = (N-1) \sum_{m=1}^{n} \frac{1}{m} \simeq (N-1)\log n. \]

The generalized shell degree gives

\[ \sum_{m=1}^{n} \frac{1}{m - 1 + \frac{1}{N-1}} \simeq N + \log n. \]
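The algebra in this subsection can be checked with exact arithmetic. The sketch below sums $\rho^*(e)^2$ edge class by edge class, compares the result with the closed form $\frac{N}{2m}(1 + \frac{1}{m}) - \frac{1}{m^2}$, and then evaluates the three sums next to their stated asymptotic forms. It checks only the energy of the density defined above, not its extremality. The function names and the sample values of $N$ and $n$ are hypothetical.

```python
from fractions import Fraction
import math


def rho_star_energy(N, m):
    """Sum of rho*(e)^2 over the chain for the level-m density (constant sizes)."""
    direct = Fraction(1, m) ** 2                        # edge {x_{j-1}, y_j}
    through_A = 2 * (N - 2) * Fraction(1, 2 * m) ** 2   # two edges per node of A_j
    last_level = (N - 1) * Fraction(1, m) ** 2          # edges from x_{m-1} into S_m
    return (m - 1) * (direct + through_A) + last_level


def closed_form(N, m):
    """(N / 2m)(1 + 1/m) - 1/m^2, as stated in the text."""
    return Fraction(N, 2 * m) * (1 + Fraction(1, m)) - Fraction(1, m * m)


def harmonic(n):
    """Exact harmonic number H_n."""
    return sum(Fraction(1, m) for m in range(1, n + 1))


# Illustrative sizes only.
N, n = 20, 200
modulus_sum = sum(closed_form(N, m) for m in range(1, n + 1))
ahlfors_bound = (N - 1) * harmonic(n)
shell_degree = sum(1 / (m - 1 + Fraction(1, N - 1)) for m in range(1, n + 1))

print("modulus sum ", float(modulus_sum), " vs (N/2) log n  ", N / 2 * math.log(n))
print("Ahlfors     ", float(ahlfors_bound), " vs (N-1) log n  ", (N - 1) * math.log(n))
print("shell degree", float(shell_degree), " vs N + log n    ", N + math.log(n))
```

The asymptotic forms drop lower-order terms such as Euler's constant, so the printed pairs agree in order of magnitude rather than digit for digit.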

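b. Increasing sizes. Now we repeat the construction above, but this time, setting $k_j := |V(K_j)|$, we have $k_1 = \alpha_1 + 2$ and, for $j = 2, \dots, n$, we assume that $k_j = \alpha_j(k_{j-1} - 2) + 2$, for an increasing sequence of positive integers $\{\alpha_j\}_{j=2}^{n}$. Then, fix $m = 1, \dots, n$, and for $j = 1, \dots, m-1$ and $e \in E(K_j)$, define the following density:

\[ \rho^*(e) := \begin{cases} \dfrac{\prod_{k=j+1}^{m} \alpha_k}{1 + \sum_{i=1}^{m-1} \prod_{k=i+1}^{m} \alpha_k} & \text{if } e = \{x_{j-1}, y_j\}, \\[2ex] \dfrac{1}{2} \cdot \dfrac{\prod_{k=j+1}^{m} \alpha_k}{1 + \sum_{i=1}^{m-1} \prod_{k=i+1}^{m} \alpha_k} & \text{if } e = \{x_{j-1}, a\} \text{ or } e = \{y_j, a\} \text{ for some } a \in A_j, \\[2ex] 0 & \text{otherwise.} \end{cases} \]

For $j = m$ and $e \in E(K_m)$, set

\[ \rho^*(e) := \begin{cases} \dfrac{1}{1 + \sum_{i=1}^{m-1} \prod_{k=i+1}^{m} \alpha_k} & \text{if } e = \{x_{m-1}, a\} \text{ for some } a \in A_m \cup \{y_m\}, \\[2ex] 0 & \text{otherwise.} \end{cases} \]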

Now form $k_m - 1$ paths. Set

\[ \gamma_{m,0} := x_0\, y_1 \cdots x_{m-1}\, y_m. \]

As before, enumerate each $A_j = \{a_{j,i}\}_{i=1}^{k_j - 2}$. Now, $k_m - 2 = \alpha_m(k_{m-1} - 2)$, so we can group the $k_m - 2$ edges $\{x_{m-1}, a\}$ for $a \in A_m$ into $k_{m-1} - 2$ groups of $\alpha_m$ edges. Each such group will then flow through a different node in $A_{m-1}$, and then we repeat. The claim is that this gives rise to a Beurling family of paths $\tilde\Gamma$. By construction, they all have $\rho^*$-length equal to 1. We only need to check Beurling's criterion. So suppose $h \in \mathbb{R}^E$ satisfies

\[ \ell_h(\gamma) \ge 0 \quad \text{for all } \gamma \in \tilde\Gamma. \]

Then $\sum_{e \in E} \rho^*(e)h(e)$ is equal to:

\[ \sum_{j=1}^{m} (\rho^* h)(x_{j-1}, y_j) + \sum_{j=1}^{m-1} \sum_{i=1}^{k_j - 2} \left[ (\rho^* h)(x_{j-1}, a_{j,i}) + (\rho^* h)(a_{j,i}, y_j) \right] + \sum_{i=1}^{k_m - 2} (\rho^* h)(x_{m-1}, a_{m,i}). \]

And if we write $\alpha := 1 + \sum_{i=1}^{m-1} \prod_{k=i+1}^{m} \alpha_k$, and collect terms, this equals

\[ \alpha^{-1} \left[ \sum_{j=1}^{m} \Big( \prod_{k=j+1}^{m} \alpha_k \Big) h(x_{j-1}, y_j) + \frac{1}{2} \sum_{j=1}^{m-1} \Big( \prod_{k=j+1}^{m} \alpha_k \Big) \sum_{i=1}^{k_j - 2} \left[ h(x_{j-1}, a_{j,i}) + h(a_{j,i}, y_j) \right] + \sum_{i=1}^{k_m - 2} h(x_{m-1}, a_{m,i}) \right], \]

which is $\ge 0$, because for every $j = 1, \dots, m-1$

\[ (k_j - 2) \prod_{k=j+1}^{m} \alpha_k = k_m - 2. \]

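So, summing $\rho^*(e)^2$ over the edges and using the identity above for the edges through each $A_j$, we get

\[ \mathrm{Mod}_2(x_0, S_m) = \alpha^{-2} \left[ 1 + \sum_{j=1}^{m-1} \Big( \prod_{k=j+1}^{m} \alpha_k \Big)^{2} + \frac{k_m - 2}{2} \sum_{j=1}^{m-1} \prod_{k=j+1}^{m} \alpha_k + (k_m - 2) \right]. \]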

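Now choose $\alpha_j \equiv 2$ for $j \ge 2$. Then

\[ \alpha = 1 + 2 + 4 + \cdots + 2^{m-1} = 2^m - 1. \]

Also

\[ k_m - 2 = 2^{m-1} \alpha_1. \]

So, up to constant factors,

\[ \mathrm{Mod}_2(x_0, S_m) \simeq \alpha_1. \]

And

\[ \sum_{m=1}^{n} \mathrm{Mod}_2(x_0, S_m) \simeq \alpha_1 n. \]

On the other hand the shell degree is

\[ \sum_{m=1}^{n} \frac{1}{m - 1 + \frac{1}{k_m - 1}} \simeq \log n. \]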
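The doubling case can be explored numerically. The sketch below builds the sizes $k_j$ and the level products $\prod_{k=j+1}^{m}\alpha_k$ for $\alpha_j = 2$ ($j \ge 2$), takes the normalizer $\alpha$ to be the value that gives every path of the family $\rho^*$-length 1 (which equals $2^m - 1$ here, matching the text), and sums $\rho^*(e)^2$ over the edges. All function names and the sample value $\alpha_1 = 10$ are hypothetical, and the code computes the energy of the density only; it does not test extremality.

```python
from fractions import Fraction


def make_alphas(alpha_1, m):
    """alpha_1 is free; alpha_j = 2 for j = 2..m (the doubling case)."""
    alphas = {1: alpha_1}
    for j in range(2, m + 1):
        alphas[j] = 2
    return alphas


def level_products(alphas, m):
    """P_j = prod_{k=j+1}^m alpha_k for j = 1..m, with P_m = 1 (empty product)."""
    P = {m: 1}
    for j in range(m - 1, 0, -1):
        P[j] = P[j + 1] * alphas[j + 1]
    return P


def sizes(alphas, m):
    """k_1 = alpha_1 + 2 and k_j = alpha_j (k_{j-1} - 2) + 2."""
    k = {1: alphas[1] + 2}
    for j in range(2, m + 1):
        k[j] = alphas[j] * (k[j - 1] - 2) + 2
    return k


def rho_star_energy(alphas, m):
    """Sum of rho*(e)^2 over all edges for the level-m density."""
    P, k = level_products(alphas, m), sizes(alphas, m)
    alpha = sum(P.values())  # 1 + sum_{j<m} P_j, so each path has length 1
    energy = Fraction(0)
    for j in range(1, m):
        energy += Fraction(P[j], alpha) ** 2                       # edge {x_{j-1}, y_j}
        energy += 2 * (k[j] - 2) * Fraction(P[j], 2 * alpha) ** 2  # edges through A_j
    energy += (k[m] - 1) * Fraction(1, alpha) ** 2                 # edges into S_m
    return energy


alpha_1 = 10  # illustrative value
for m in (1, 2, 4, 8, 16):
    print(m, float(rho_star_energy(make_alphas(alpha_1, m), m)))
```

For $m = 1$ the energy reduces to $k_1 - 1$, the degree of $x_0$, and for larger $m$ the printed values stay of order $\alpha_1$ instead of decaying like $1/m$ as in the constant-size chain.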