Exact Computation of Influence Spread by Binary Decision Diagrams

Takanori Maehara (1, 2), Hirofumi Suzuki (3), Masakazu Ishihata (3)
(1) Shizuoka University, (2) RIKEN Center for Advanced Intelligence Project, (3) Hokkaido University

ABSTRACT

Evaluating influence spread in social networks is a fundamental procedure for estimating the word-of-mouth effect in viral marketing. There is an enormous number of studies on this topic; however, under the standard stochastic cascade models, the exact computation of influence spread is known to be #P-hard. Thus, the existing studies have used Monte-Carlo simulation-based approximations to avoid exact computation.

We propose the first algorithm to compute influence spread exactly under the independent cascade model. The algorithm first constructs binary decision diagrams (BDDs) for all possible realizations of influence spread, then computes influence spread by dynamic programming on the constructed BDDs. To construct the BDDs efficiently, we designed a new frontier-based search-type procedure. The constructed BDDs can also be used to solve other influence-spread related problems, such as random sampling without rejection, conditional influence spread evaluation, dynamic probability update, and gradient computation for probability optimization problems.

We conducted computational experiments to evaluate the proposed algorithm. The algorithm successfully computed influence spread on real-world networks with a hundred edges in a reasonable time, which is quite impossible by the naive algorithm. We also conducted an experiment to evaluate the accuracy of the Monte-Carlo simulation-based approximation by comparing it with the exact influence spread obtained by the proposed algorithm.

Keywords

viral marketing; influence spread; algorithm;

© 2017 International World Wide Web Conference Committee (IW3C2), published under Creative Commons CC BY 4.0 License. WWW 2017, April 3-7, 2017, Perth, Australia. ACM 978-1-4503-4913-0/17/04. http://dx.doi.org/10.1145/3038912.3052567

1. INTRODUCTION

1.1 Background and Motivation

Viral marketing is a strategy to promote products by giving free (or discounted) items to a selected group of highly influential individuals (seeds), in the hope that through word-of-mouth effects, a significant product adoption will occur [8, 27]. To maximize the number of adoptions, Kempe, Kleinberg, and Tardos [19] mathematically formalized the dynamics of information propagation, and proposed the optimization problem referred to as the influence maximization problem. Several cascade models have been proposed, and the most commonly used one is the independent cascade model, proposed by Goldenberg, Libai, and Muller [10, 11]. In this model, the individuals are affected by information that is stochastically and independently propagated along edges in the network from the seed (Section 2.1). To date, significant efforts have been devoted to the development of efficient algorithms for the influence maximization problem [1, 4–7, 25, 26, 31].

Here we consider the computational complexity of the influence maximization problem. Under the independent cascade model, the expected size of influence spread is a non-negative monotone submodular function [19]; thus, a (1 − 1/e)-approximate solution can be obtained by using a greedy algorithm [24]. However, the evaluation of influence spread is #P-hard [4] because it contains the problem of counting s-t connected subgraphs [32]. Thus all existing studies avoided the exact computation and employed the Monte-Carlo simulation-based approximation, which simulates the dynamics of information propagation sufficiently many times (e.g., $\Omega(1/\varepsilon^2)$) to obtain an accurate (e.g., $1 \pm \varepsilon$) approximation of influence spread [25] (Section 6).

In this study, we tackle, for the first time, the problem of computing influence spread exactly under the independent cascade model. As the problem is #P-hard, we are interested in an algorithm that runs on small real-world networks (i.e., having a few hundred edges) in a reasonable time. The motivations for this study are as follows.

• Influence spread over small networks is practically important. Because real social networks often consist of many small communities, it is reasonable to consider each community separately or consider only the inter-community network.
• When we wish to rank vertices according to their influence spread, we need to compute the values accurately. Monte-Carlo simulation cannot be used for this purpose because it requires $\Omega(1/\varepsilon^2)$ samples for a $1 \pm \varepsilon$ approximation; thus $\varepsilon < 10^{-5}$, which needs on the order of $10^{10}$ samples, is impractical. On the other hand, an exact method can be used because its complexity does not depend on the desired accuracy.
• Exact influence spread helps to analyze the quality of Monte-Carlo simulation. Although many experiments using Monte-Carlo simulation have been conducted, none have been compared with the exact value because there is no algorithm that can compute this value.
• Establishing a practical algorithm for the fundamental #P-hard problem is an interesting and important task in computer science.

1.2 Contributions

In this study, we provide the following contributions.

• We propose an algorithm to compute influence spread exactly under the independent cascade model. Note that this is the first attempt to compute this value exactly (Section 3).
• The proposed algorithm enumerates all spread patterns using binary decision diagrams (BDDs). Then, it computes influence spread by dynamic programming on the BDDs. Here, we have designed a new frontier-based search method, which constructs the BDD for s-t connected subgraphs efficiently (Section 3.2). This is the main technical contribution of this study.
• We conducted computational experiments to evaluate the proposed algorithm (Section 5). We obtained the exact influence on real-world and synthetic networks with a hundred edges in reasonable times. We also compared the obtained exact influence with the one obtained using the Monte-Carlo simulation.

In addition, using the constructed BDDs, we can also solve the following influence-spread related problems (Section 4).

• Random sampling from the set of realizations that successfully propagate information helps to understand the route of influence spread. We can perform this without rejection by using the BDD.
• The conditional expectation of the influence spread under the influenced (and non-influenced) conditions on some vertices can be used to measure the effect of a conducted viral promotion from a small number of observations. This value is efficiently computed by the BDDs.
• When the activation probability changes, we can efficiently update the influence spread.
• The derivatives of the influence spread with respect to the activation probabilities can be computed. This is used to implement a gradient method for the influence spread optimization problem.

2. PRELIMINARIES

2.1 Independent Cascade Model for Influence Spread

The independent cascade model [10, 11] is the most commonly used stochastic cascade model for social network analysis. The dynamics of this model are given as follows.

Let G = (V, E) be a directed graph with vertices V and edges E. Each edge e ∈ E has activation probability p(e). Each vertex is either active or inactive. Note that inactive vertices may become active, but not vice versa. Here, an active vertex is considered “influenced.”

Suppose that information is propagated from S ⊆ V, whose elements are called seeds. Initially, all vertices are inactive. Then, propagation over the network is performed as follows. First, each seed u ∈ S is activated. When u first becomes active, it is given a single chance to activate each currently inactive neighbor v with probability p((u, v)). This process is repeated until no further activations are possible. The expected number of activated vertices after the end of the process is called influence spread, which is denoted as σ(S).

There is a useful interpretation of influence spread with this model. We select each edge e ∈ E independently with probability p(e) and obtain an edge set F. We then consider the induced subgraph G[F] = (V, F), which is a network consisting of only the selected edges. Here, let σ(S; F) be the number of vertices reachable from some u ∈ S on G[F]. Then, we obtain the following:

$$\sigma(S) = \mathbb{E}[\sigma(S; F)] = \sum_{F \subseteq E} \sigma(S; F)\, p(F) \tag{1}$$

where

$$p(F) = \prod_{e \in F} p(e) \prod_{e' \in E \setminus F} (1 - p(e')). \tag{2}$$

We use this formula to compute the influence spread.

2.2 Binary Decision Diagram

As discussed in Section 3, the exact evaluation of (1) involves enumerating S-t connecting subgraphs, which are the subgraphs having a path from S to t. To maintain exponentially many such subgraphs, we use the binary decision diagram (BDD), which is a data structure to represent a Boolean function compactly based on Shannon decomposition. Note that a BDD can be used to represent a set family via its indicator function.

A BDD is a directed acyclic graph D = (N, A) with node set N and arc set A.¹ It has two terminals 0 and 1. Each non-terminal node α ∈ N is associated with a variable e ∈ E, and has two arcs called the 0-arc and the 1-arc. The nodes pointed to by the 0-arc and the 1-arc are referred to as the 0-child and the 1-child (denoted by $\alpha_0$ and $\alpha_1$), respectively. A BDD represents a Boolean function as follows: A path from the root node to the 1-terminal represents a (possibly partial) variable assignment for which the represented Boolean function is True. As the path descends along a 0-arc (1-arc) from a node, the node’s variable is assigned False (True).

¹ To avoid confusion, we use the terms “vertex” and “edge” to refer to a vertex and edge in the original graph G, and “node” and “arc” to refer to a vertex and edge in the BDD D. Vertices are denoted using Roman letters (u, v, ...) and nodes are denoted using Greek letters (α, β, ...).

A special type of BDD, i.e., the reduced ordered binary decision diagram (ROBDD) [3], is frequently used in practice. A BDD is ordered if different variables appear in the same order on all paths from the root. A BDD is reduced if the following two rules are applied as long as possible:

  1. Share any isomorphic subgraphs.
  2. Eliminate all nodes whose two arcs point to the same node.        (3)

These rules eliminate redundant nodes in the BDD. Moreover, when the ordering is specified, the ROBDD is determined uniquely [3]. In terms of Boolean functions, the function represented by the subgraph rooted at α corresponds to a Shannon co-factor. The above two rules correspond to sharing nodes with the same Shannon co-factor. In this paper, we use the term BDD to refer to ROBDD.

Figure 1 shows an example of a BDD, which represents the set family {{c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. The indicator function is $\phi(a, b, c) = a(b + \bar{b}c) + \bar{a}c$, which corresponds to the diagram.

[Figure 1: BDD for {{c}, {a, b}, {a, c}, {b, c}, {a, b, c}}. The 0-arcs are denoted by dotted lines and the 1-arcs are denoted by solid lines. The arcs to the 0-terminal are omitted.]

One important feature of BDDs is that they allow efficient manipulation of set families. In particular, when two set families are represented by BDDs $D_1$ and $D_2$ with the same variable ordering, the union and intersection of these BDDs are performed in $O(|D_1||D_2|)$ time. The complement of a set family represented by BDD D is performed in O(|D|) time [3, 29]. This property is utilized in this study.

For details about BDDs, see the latest volume of “The Art of Computer Programming” [21] by Knuth.

3. ALGORITHM

In this section, we propose an algorithm to compute influence spread exactly. Let S ⊆ V be a seed set and t ∈ V be a vertex. We consider the set of S-t connecting subgraphs

$$R(S, t) = \{F \subseteq E : t \text{ is reachable from } S \text{ on } G[F]\}, \tag{4}$$

which represents all realizations in which t is activated from seed set S. Using this set, influence spread is expressed as

$$\sigma(S) = \sum_{t \in V} \sigma(S, t), \tag{5}$$

where σ(S, t) is the influence probability from S to t, i.e.,

$$\sigma(S, t) = p(R(S, t)) = \sum_{F \in R(S, t)} p(F). \tag{6}$$

Our algorithm computes influence spread based on the above formulas. The algorithm first constructs the BDD for R(S, t). Then it computes σ(S, t) by dynamic programming on the BDD. Finally, by summing over t ∈ V, we obtain the influence spread σ(S).

3.1 Influence Spread Computation

Once the BDD D(S, t) for R(S, t) is obtained, σ(S, t) is efficiently obtained by bottom-up dynamic programming as follows. Each node α ∈ N stores a value B(α), called the backward probability, which is the sum of the probabilities of all subsets represented by the descendants of α. The backward probabilities of the 0-terminal and the 1-terminal are initialized to B(0) = 0 and B(1) = 1. We process the nodes in reverse topological order (i.e., from the terminals to the root). For each non-terminal node α ∈ N \ {0, 1} associated with edge e(α) ∈ E, B(α) is computed as follows:

$$B(\alpha) = (1 - p(e(\alpha)))\, B(\alpha_0) + p(e(\alpha))\, B(\alpha_1). \tag{7}$$

This gives a dynamic programming algorithm (Algorithm 1). The backward probability of the root node is σ(S, t).

Algorithm 1 Influence spread computation
  Create BDD D = (N, A) for R(S, t)
  Set B(0) = 0, B(1) = 1
  for α ∈ N \ {0, 1} in the reverse topological order do
    B(α) = (1 − p(e(α))) B(α_0) + p(e(α)) B(α_1)
  end for
  return B(root)

Here, we provide an example to illustrate the procedure. Consider the graph shown in Figure 2, which has three edges (a, b, and c). All activation probabilities are p. Then, the {s}-t connecting subgraphs are as follows:

$$R(\{s\}, t) = \{\{c\}, \{a, b\}, \{a, c\}, \{b, c\}, \{a, b, c\}\}. \tag{8}$$

[Figure 2: A graph for example. The BDD for R({s}, t) is shown in Figure 1.]

The BDD for this set family is presented in Figure 1. We perform dynamic programming on this BDD as follows:

$$B(1) = 1, \quad B(c) = p, \quad B(b) = p + (1 - p)p,$$
$$B(a) = p^2 + (1 - p)p^2 + (1 - p)p = p + p^2 - p^3.$$

Therefore the influence probability from {s} to t is $p + p^2 - p^3$.

3.2 BDD Construction

Here, we present an algorithm to construct the BDD D(S, t) for R(S, t). This is the main technical contribution of this study.

We first consider the single seed case (i.e., S = {s}) in Section 3.2.1. Then, we consider the general case in Section 3.2.2. For simplicity, we write R(s, t) and D(s, t) for R({s}, t) and D({s}, t), respectively.

3.2.1 BDD for a single seed

Our algorithm is a type of frontier-based search, which is a general procedure for enumerating all constrained subgraphs [18].² In the following, we first describe the general framework of the frontier-based search. Then, to adapt it to our problem, we describe four main components: the configuration, the isZeroTerminal function, the isOneTerminal function, and the createNode function. Finally we describe two techniques to improve performance: edge ordering and preprocessing.

² Frontier-based search is often applied to construct a zero-suppressed BDD (ZDD), which is a special kind of BDD. However, in our problem, the set has many “don’t care” edges; therefore a BDD is more suitable than a ZDD.

Frontier-based search.

Let us enumerate all constrained subgraphs $R \subseteq 2^E$. We fix an ordering of edges $(e_1, \ldots, e_m)$ and process the edges one by one, as in the exhaustive search. The processed edges and the unprocessed edges at the end of the i-th step are denoted by $E^{\le i} := \{e_1, \ldots, e_i\}$ and $E^{>i} := \{e_{i+1}, \ldots, e_m\}$, respectively. The set of vertices that have both processed and unprocessed edges is called the frontier (at the i-th step) and denoted by $W_i$.

The set of nodes $N_i$ represents all subsets of $E^{\le i}$ that can possibly belong to R. Each $\alpha \in N_i$ represents possibly many subsets $R(\alpha) \subseteq 2^{E^{\le i}}$ by paths from the root to α, where a path from the root to α represents a subset in which e is present if the path descends the 1-arc of the node β associated with e. We say that two edge sets F and F′ are equivalent if for any subset $H \subseteq E^{>i}$, either both F ∪ H and F′ ∪ H belong to R or neither belongs to R. The algorithm maintains the invariant that all sets in R(α) are equivalent.

At the i-th iteration, the algorithm constructs $N_i$ from $N_{i-1}$. For each node $\alpha \in N_{i-1}$, the algorithm generates two children for which $e_i$ is excluded from or included in the sets in R(α). Here, the important feature is node merging. Let β and β′ be nodes generated at the i-th step. If all F ∈ R(β) and F′ ∈ R(β′) are equivalent, we can merge them to reduce the number of nodes. To verify this equivalence efficiently, each node β maintains data φ(β), referred to as the configuration, which satisfies the following condition: if φ(β) = φ(β′), then all the corresponding sets are equivalent. Note that the converse is not required, which may cause redundant node expansions.

After the process, the constructed BDD is not necessarily reduced. Thus, we repeatedly apply the reduction rules (3). This reduction is performed in time proportional to the size of the BDD [3].

The general framework of the frontier-based search is shown in Algorithm 2, which contains three auxiliary functions. isZeroTerminal(α, $e_i$, x) (isOneTerminal(α, $e_i$, x)) determines whether the x-child of α, i.e., the node for the sets obtained from R(α) by excluding (x = 0) or including (x = 1) $e_i$, is the 0-terminal (1-terminal). More precisely, these are defined as follows:

$$\mathrm{isZeroTerminal}(\alpha, e_i, x) = \begin{cases} \text{True} & \text{all } x\text{-descendants are excluded from } R, \\ \text{False} & \text{otherwise,} \end{cases} \tag{9}$$

$$\mathrm{isOneTerminal}(\alpha, e_i, x) = \begin{cases} \text{True} & \text{all } x\text{-descendants are included in } R, \\ \text{False} & \text{otherwise.} \end{cases} \tag{10}$$

createNode(α, $e_i$, x) creates an x-child of α. To adapt the general framework to our s-t connecting subgraph enumeration problem, we only have to design the configuration and these functions.

Algorithm 2 Frontier-based search
  N_0 ← {root}, N_i ← ∅ for i = 1, 2, ..., |E|
  for i = 1, 2, ..., |E| do
    for α ∈ N_{i−1} do
      for x ∈ {0, 1} do
        if isZeroTerminal(α, e_i, x) then
          α_x ← 0
        else if isOneTerminal(α, e_i, x) then
          α_x ← 1
        else
          β ← createNode(α, e_i, x)
          if φ(β) = φ(β′) for some β′ ∈ N_i then
            β ← β′
          else
            N_i ← N_i ∪ {β}
          end if
          α_x ← β
        end if
      end for
    end for
  end for
  Reduce the constructed BDD by the reduction rules (3)

Configuration.

For two nodes $\beta, \beta' \in N_i$, we want to merge these nodes if they are equivalent, i.e., the s-t reachabilities on G[F ∪ H] and G[F′ ∪ H] are the same for all F ∈ R(β), F′ ∈ R(β′), and $H \subseteq E^{>i}$. Thus the configuration must satisfy that φ(β) = φ(β′) implies the above condition.

Here, we propose to use the reachability information on the frontier vertices as the configuration as follows. Let $W_i^{s+}, W_i^{+t} \subseteq W_i$ be the sets of frontier vertices that are reachable from s and that can reach t, respectively, on G[F] where F ∈ R(β). Note that these are well-defined, i.e., they are independent of the choice of F, as mentioned below. Let $W_i^{s-} = W_i \setminus W_i^{s+}$ and $W_i^{-t} = W_i \setminus W_i^{+t}$. We define the configuration φ(β) as a matrix indexed by $(W_i^{s-} \cup \{s\}) \times (W_i^{-t} \cup \{t\})$ whose entries denote reachability on G[F]:

$$\phi(\beta)_{uv} = \begin{cases} 1 & v \text{ is reachable from } u \text{ on } G[F], \\ 0 & \text{otherwise.} \end{cases} \tag{11}$$

If F ∪ H admits (does not admit) an s-t path, then F′ ∪ H for any F′ ∈ R(β′) with φ(β) = φ(β′) also admits (does not admit) an s-t path, because we can transform the s-t path on G[F ∪ H] to one on G[F′ ∪ H] by reconnecting the path on the frontier. This shows that φ satisfies the configuration requirement described above. This also proves, by induction, that the definition is well-defined, i.e., φ(β) is independent of the choice of F.

“isZeroTerminal” and “isOneTerminal” functions.

If x = 1, i.e., we include edge $e_i = (u, v)$ in the sets in R(α), we have a chance to obtain isOneTerminal(α, $e_i$, x) = True, which is the case that the included edges contain a path from s to t. Using our configuration, this is easily implemented as follows:

$$\mathrm{isOneTerminal}(\alpha, e_i, 1) = \begin{cases} \text{True} & \phi(\alpha)_{su} = 1 \text{ and } \phi(\alpha)_{vt} = 1, \\ \text{False} & \text{otherwise.} \end{cases} \tag{12}$$

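As a minimal sketch of the one-terminal test in (12), the configuration can be modelled as a plain set of ordered pairs (u, v) meaning "v is reachable from u". This representation, the helper names `reaches` and `is_one_terminal`, the convention that a vertex trivially reaches itself, and the labelling of the three-edge example graph with a middle vertex "v" are all assumptions made for illustration; they are not taken from the authors' implementation.

```python
from typing import Hashable, Set, Tuple

Vertex = Hashable
Edge = Tuple[Vertex, Vertex]


def reaches(phi: Set[Edge], a: Vertex, b: Vertex) -> bool:
    """Configuration entry phi_{ab}; assumes every vertex reaches itself."""
    return a == b or (a, b) in phi


def is_one_terminal(phi: Set[Edge], s: Vertex, t: Vertex, edge: Edge) -> bool:
    """Test (12): including edge (u, v) completes an s-t path
    exactly when s already reaches u and v already reaches t."""
    u, v = edge
    return reaches(phi, s, u) and reaches(phi, v, t)


# Hypothetical labelling of the three-edge example: a = (s, v), b = (v, t), c = (s, t).
a, b, c = ("s", "v"), ("v", "t"), ("s", "t")

phi_a_included = {a}    # edge a was included: v is reachable from s
phi_a_excluded = set()  # edge a was excluded: nothing is recorded yet

print(is_one_terminal(phi_a_included, "s", "t", b))  # a followed by b gives s -> v -> t
print(is_one_terminal(phi_a_excluded, "s", "t", b))  # b alone does not connect s to t
print(is_one_terminal(phi_a_excluded, "s", "t", c))  # c alone connects s to t
```

Under these assumptions the three calls agree with the membership of {a, b}, {b}, and {c} in R({s}, t) for the example graph: the first and third edge sets are connecting, the second is not.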
Similarly, if x = 0, i.e., we exclude edge $e_i$ from the sets in R(α), we have a chance to obtain isZeroTerminal(α, $e_i$, x) = True, which is the case that the excluded edges form a cutset from s to t. This is implemented as follows:

$$\mathrm{isZeroTerminal}(\alpha, e_i, 0) = \begin{cases} \text{True} & t \text{ is unreachable from } s \text{ on } G[F \cup E^{>i}], \\ \text{False} & \text{otherwise,} \end{cases} \tag{13}$$

where F ∈ R(α). Note that this is well-defined for the same reason described above. To check the reachability on $G[F \cup E^{>i}]$ efficiently, we precompute the transitive closures of $G[E^{>j}]$ for all j = 0, 1, ..., |E|.³ Then the reachability from s to t is checked in $O(|W_i|^2)$ time by DFS/BFS with the configuration and the precomputed reachability.

³ Because we compute the BDDs for all pairs of s, t ∈ V, storing all transitive closures accelerates computation. The total size of all transitive closures is typically much smaller than the size of the BDDs.

“createNode” function.

The most important role of createNode(α, $e_i$, x) is computing the configuration of the new node. The function first creates a new node β and copies configuration φ(α) to φ(β). If a vertex enters the frontier (i.e., its first incident edge is processed) or leaves the frontier (i.e., all of its incident edges have been processed), we insert or remove the corresponding row and column in the configuration φ(β). If x = 0, we require no further updates. Otherwise, adding a new edge changes reachability; thus we update φ(β) to be the transitive closure on the frontier. This is performed in $O(|W_i|^2)$ time by DFS/BFS on the frontier.

Edge ordering.

The complexity of the frontier-based search depends on the frontier size. $N_i$ has at most $O(2^{|W_i|^2})$ nodes because it contains no two nodes with the same configuration. It is known that the frontier size is closely related to the pathwidth of the graph [20].

Note that optimizing the edge ordering is important to reduce the frontier size (i.e., the pathwidth). For our problem, there is an additional requirement: the same edge ordering must be used for all BDDs D(s, t) for s, t ∈ V because we perform several set manipulations between the BDDs.

In this study, we use the path-decomposition based ordering proposed by Inoue and Minato [15]. The algorithm first computes a path decomposition with a small pathwidth using beam search-based heuristics. Then it computes an edge ordering using the path decomposition information.

Preprocessing.

If e ∈ E is not contained in any s-t simple path, e does not appear in the BDD because the existence of e does not affect s-t reachability. Therefore, removing all such edges as a preprocessing step improves the performance of the algorithm.

Determining whether there is an s-t simple path containing e is NP-hard because the NP-hard two-commodity flow problem [9] reduces to it. However, because we are interested in small networks, we can enumerate all s-t simple paths using Knuth’s Simpath algorithm [21], which is a frontier-based search algorithm that runs faster than the proposed algorithm because it uses a smaller configuration. Thus, we can use the Simpath algorithm in preprocessing.

3.2.2 BDD for multiple seeds

The frontier-based search described in the previous subsection can be easily adapted to the multiple seeds case. However, there is a more efficient way to construct the BDD for multiple seeds.

The method is based on the following formula, which is immediately obtained from the definition of R(S, t):

$$R(S, t) = \bigcup_{s \in S} R(s, t). \tag{14}$$

Because the BDD of the union of two set families represented by BDDs $D_1$ and $D_2$ is obtained in $O(|D_1||D_2|)$ time, and, practically, the size of the BDDs is small (Section 5), this approach is more efficient than running the frontier-based search directly for multiple seeds.

3.2.3 Node sharing among BDDs

To compute influence spread, we construct BDDs for all pairs of s, t ∈ V. Here, intuitively, if two source-target pairs (s, t) and (s′, t′) are close, the BDDs D(s, t) and D(s′, t′) may share many subgraphs. Thus, by sharing the nodes corresponding to these subgraphs, we can reduce the total size of the BDDs [23]. This also reduces the total complexity of computing influence spreads for all source-target pairs (s, t), which is proportional to the total size of the shared BDDs.

4. OTHER APPLICATIONS

In the previous section, we established an algorithm to construct the BDD for all S-t connecting subgraphs R(S, t). This data structure allows us to solve influence spread-related problems efficiently.

4.1 Random Sampling without Rejection

Sometimes we want to know how the influence is propagated from S to t. Random sampling from R(S, t) will help us to understand this; however, the naive method that performs Monte-Carlo simulation and rejects the sample if S does not connect to t usually requires impractically many simulations due to the small influence probability. Here we show that this random sampling can be performed without rejection using the BDD D(S, t) = (N, A) [17].

As a preprocess, we perform the dynamic programming described in Section 3.1 to compute the backward probability B(α) for each node α ∈ N. Then, we perform the following random walk, which starts from the root node and ends at the 1-terminal: When we are on a non-terminal node α ∈ N \ {0, 1} associated with e ∈ E, we randomly move to $\alpha_0$ or $\alpha_1$ with probability proportional to $(1 - p(e))B(\alpha_0)$ and $p(e)B(\alpha_1)$, respectively. Here, if we moved to $\alpha_0$, we exclude e from F; otherwise we include e in F. We repeat this procedure until we reach the 1-terminal. Finally, for all undetermined edges, we randomly and independently exclude or include the edge with its probability. This yields a random sample from R(S, t). The complexity is proportional to the height of the BDD.

4.2 Conditional Influence Spread

After conducting a viral promotion, we must measure the effect of the promotion. For this purpose, we observe the status of influence (i.e., influenced or not) on a small number of vertices and estimate the total size of influence spread. This value, referred to as the conditional influence spread, can be obtained using the constructed BDDs.

For example, suppose that we have observed that “vertices u, v are influenced and w is not influenced.” Then, the realizations that satisfy this condition are given by

$$R = R(S, u) \cap R(S, v) \cap R(S, w)^c, \tag{15}$$

where $R(S, w)^c = 2^E \setminus R(S, w)$. Then the conditional influence probability from S to t under R is given by

$$\sigma(S, t \mid R) = \frac{p(R(S, t) \cap R)}{p(R)}, \tag{16}$$

and the summation over t gives the conditional influence spread.

The BDDs for R(S, t) ∩ R and R in (16) can be efficiently obtained because Boolean operations on set families are performed efficiently on BDD representations. Moreover, these probabilities can be computed by the dynamic programming described in Section 3.1. This is the method for computing the exact conditional influence spread.

Note that, by combining this with the random sampling technique described in Section 4.1, we can sample conditional realizations without rejection.

4.3 Activation Probability Modification

Activation probabilities are frequently changed in real-world networks [26]. In such a case, we can recompute the influence spread easily by reusing the constructed BDDs. The complexity is proportional to the size of the BDDs.

4.4 Activation Probability Optimization

Sometimes we want to solve an optimization problem with respect to the activation probabilities of edges. One example is a time-dependent influence problem, i.e., when the activation probabilities are functions of time, we want to seek the time that maximizes influence spread. Another example is a network design problem where we want to maximize the influence spread by modifying activation probabilities under some (e.g., budget) constraint. Because these problems are non-convex optimization problems (even if the activation probabilities are simple functions), it is difficult to compute the optimal solution. However, a locally optimal solution can be obtained by a gradient-based method.

To implement a gradient-based method, we require derivatives of the influence spread with respect to the activation probabilities. Here we show that if we have the BDD for R(S, t), we can obtain ∂σ(S, t)/∂p(e) for all e ∈ E in time proportional to the size of the BDD.

First, we compute the backward probability B(α) for all nodes α ∈ N by the dynamic programming described in Section 3.1. Then, we perform top-down dynamic programming as follows. Each node α ∈ N has a value $\mathcal{F}(\alpha)$, called the forward probability. The forward probability of the root node is initialized as $\mathcal{F}(\mathrm{root}) = 1$. We process the nodes in topological order (i.e., from the root to the terminals). When we are on a non-root node α ∈ N \ {root}, its forward probability is determined as follows:

$$\mathcal{F}(\alpha) = \sum_{\beta : \beta_0 = \alpha} (1 - p(e(\beta)))\, \mathcal{F}(\beta) + \sum_{\gamma : \gamma_1 = \alpha} p(e(\gamma))\, \mathcal{F}(\gamma). \tag{17}$$

Then, the derivative is obtained by differentiating (7) at every node associated with e, where the derivative of B(α) with respect to p(e(α)) is $B(\alpha_1) - B(\alpha_0)$, and weighting it by the forward probability:

$$\frac{\partial \sigma(S, t)}{\partial p(e)} = \sum_{\alpha : e(\alpha) = e} \mathcal{F}(\alpha)\, \bigl(B(\alpha_1) - B(\alpha_0)\bigr). \tag{18}$$

Because Monte-Carlo simulation cannot be used to compute the derivative, this is an advantage of our method. Note that this technique is used in probabilistic logic learning [14, 16].

5. EXPERIMENTS

We conducted computational experiments to evaluate the proposed algorithm. All code was implemented in C++ (g++ 5.4.0 with the -O3 option) using the TdZdd library⁴, which is a highly optimized implementation for BDDs. All experiments were conducted on 64-bit Ubuntu 16.04 LTS with an Intel Core i7-3930K 3.2 GHz CPU and 64 GB RAM.

The real-world networks were taken from the Koblenz Network Collection.⁵ All self-loops and multiple edges were removed, and undirected edges were replaced with two directed edges in both directions. The numbers of vertices and edges are described in Table 1.

⁴ https://github.com/kunisura/TdZdd
⁵ http://konect.uni-koblenz.de/

5.1 Scalability on Real-World Networks

First, to evaluate the performance of the proposed algorithm on real-world networks, we conducted experiments on the collected networks. For each network, we constructed the BDDs for all distinct s, t ∈ V and observed the computational time, the size of each BDD, the total shared size of the BDDs, and the number of realizations that are represented by the BDDs (i.e., the cardinality of the set).

The results are shown in Table 1. The algorithm successfully computed the BDDs for networks with up to a few hundred edges, but failed on three networks with 178 to 198 edges. When it succeeded, it was very efficient in both time and space: the average construction time ranged from 0.1 ms to about 118 ms, and the average BDD size was at most a few tens of thousands of nodes. The shared size was about half of the sum of all sizes of the BDDs, which means that the BDDs shared many nodes. Comparing the Contiguous-USA network with the three failed networks shows that the computational cost depends on the network structure and not only on the number of edges.

Table 1: Computational results on real-world networks. Time denotes the average time to construct the BDDs, BDD Size denotes the average number of nodes in the BDDs, Shared Size denotes the total number of distinct nodes in the shared BDDs, and Cardinality denotes the average number of subgraphs represented by the BDDs. Here, the average is taken over all distinct s, t ∈ V. For the last three networks, the algorithm failed to compute due to the memory limit.

Network                   Vertices  Edges  Time [ms]  BDD Size  Shared Size  Cardinality
South-African-Companies         11     26        0.1      12.1          472      2.2e+07
Southern-women-2                20     28        0.3      54.7        2,266      1.3e+08
Taro-exchange                   22     78        4.1   1,119.2      277,756      1.6e+23
Zachary-karate-club             34    156       24.9   7,321.8    4,988,148      6.4e+46
Contiguous-USA                  49    214      117.9  30,599.8   41,261,047      1.6e+64
American-Revolution            141    320        2.2     120.0    1,530,677      5.7e+95
Southern-women-1                50    178          -         -            -            -
Club-membership                 65    190          -         -            -            -
Corporate-Leadership            64    198          -         -            -            -

It should be emphasized that the naive exhaustive search is quite impractical for these networks because, as shown in the Cardinality column, there is an enormous number of connecting realizations. In particular, in the extreme case, a BDD D(s, t) for the American-Revolution network with some source-target pair (s, t) consisted of only 85 nodes, but represented

  2,058,334,714,926,419,025,286,040,286,320,
  632,494,993,236,943,086,975,345,403,704,463,        (19)
  133,047,043,046,026,363,318,022,843,662,336

realizations (a 97-digit number, approximately $2 \times 10^{96}$), which exceeds the number of atoms in the universe (approximately $10^{80}$). This shows the effectiveness of the BDD representation of the connecting realizations.

5.2 Scalability on Synthetic Networks

Next, to observe the performance of the algorithm precisely, we conducted experiments on two classes of synthetic networks. The first class was the 5 × w grid graph, which has n = 5w vertices, 9w − 5 undirected edges, and a pathwidth of 5. The second class was the random graph that has the same number of vertices and edges as the grid graph, which has a pathwidth of Θ(n). We computed the influence probability σ(s, t) from the north-west corner s to the south-east corner t on the grid graph and between the corresponding vertices on the random graph.

The results are shown in Figure 3. For the grid graphs, BDD size and construction time increased slowly; thus the computation for n = 100 was tractable. On the other hand, for the random graphs, BDD size and construction time increased rapidly; thus we could not compute a BDD for n ≥ 45. These results are consistent with the pathwidths of these networks.

For both networks, the influence probabilities decayed exponentially. They decayed faster in the grid network because the influence probability basically depends on the network distance.

5.3 Influence Maximization Problem

Here, we consider the influence maximization problem, which seeks k seeds to maximize the influence spread [19]. The greedy algorithm is commonly used to solve this problem; it begins from the empty set S = ∅ and repeatedly adds to S the vertex u that has the maximum marginal influence σ(S ∪ {u}) − σ(S) until k vertices are added.

We implemented the greedy algorithm with the exact influence spread to observe the performance of the proposed algorithm inside the greedy algorithm. We used the Contiguous-USA and Zachary-karate-club networks.

The results are shown in Figure 4. Figure 4(b) shows that the shared size of the BDDs did not increase much as the algorithm proceeded: the shared size at the 10th step of the greedy algorithm was only about two times larger than at the 1st step for both networks. The computational times were proportional to the number of steps, as shown in Figure 4(c).

5.4 Comparison with Monte-Carlo Simulation

Finally, as an application of exact influence spread computation, we compared the exact influence spread with the Monte-Carlo simulation. We used the Contiguous-USA network, which was also used in the above experiment. In addition, we used a seed set of size 10 computed by the greedy algorithm with the exact influence spread, and we compared the quality of the approximated spread.

The results are shown in Figure 5. Even for such a small network (m = 107 undirected edges, i.e., the 214 directed edges in Table 1) and a large number of Monte-Carlo samples ($N = 10^7$), the influence spread estimated by Monte-Carlo simulation has an error on the order of $10^{-3}$, which is consistent with the theory [25].

6. RELATED WORK

Influence spread computation.

After the seminal work by Kempe, Kleinberg, and Tardos [19], influence spread over networks has become an important topic in social network analysis. However, to the best of our knowledge, no efforts have been devoted to the exact computation of influence spread since it was proved to be #P-hard by Chen et al. [4].

To compute influence spread, all existing studies used the Monte-Carlo simulation-based approximation, which repeats simulation until a reliable estimate is obtained. This approach was originally proposed in [19]. To enhance the scalability, many techniques, such as pruning [22] and sample average approximation [4, 6, 25], have been investigated.

The recent approximation methods are based on the reverse influence sampling (RIS) technique of Borgs et al. [1], which randomly selects a vertex and then performs reverse BFS to compute the set of vertices that can reach the selected vertex on a random graph. Importantly, this procedure can be implemented in time proportional to the size of the sample. Therefore it successfully bounds the complexity of influence spread approximation. Tang, Shi, and Xiao [30, 31] proposed methods to reduce the number of samples.

Note that our formulation (5) is related to the RIS technique: RIS randomly selects vertices whereas we select all vertices, and RIS samples single reverse influence patterns whereas ours enumerates all reverse influence patterns.

Subgraph enumeration.

In this study, we virtually solved the enumeration problem of s-t connecting subgraphs for the influence spread computation. This problem is known to be #P-hard [32].

If the underlying network is undirected, this problem coincides with the two-terminal network reliability problem [2, 32], and several algorithms have been proposed to construct a BDD for the problem [13, 33]. However, none have been naturally generalized to our directed problem because they essentially exploit the undirected nature of the graph.

BDD is used to enumerate several kinds of subgraphs (substructures), such as paths [21], spanning trees [28], and the

[Figure 3 panels: (a) Influence Probability, (b) BDD Size, (c) Time [s]; each plotted against the number of vertices n for the Grid and Random graphs.]

Figure 3: Computational results on 5 × w grid graphs and random graphs. The algorithm failed to compute the influence spread on the random network with n ≥ 45 vertices due to the memory limit.
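The grid instances behind Figure 3 can be reproduced in a few lines. The following is a minimal sketch, not the authors' code: it builds the 5 × w grid, which lets one check the counts n = 5w and 9w − 5 stated in the text, and it estimates the corner-to-corner influence probability σ(s, t) by plain Monte-Carlo sampling rather than by BDD construction. The function names, the uniform edge probability `p`, and the treatment of each undirected edge as two independent directed edges are assumptions made for illustration.

```python
import random
from collections import deque


def grid_graph(w, h=5):
    """Return (n, edges) for an h x w grid; vertex (r, c) has index r * w + c."""
    edges = []
    for r in range(h):
        for c in range(w):
            v = r * w + c
            if c + 1 < w:
                edges.append((v, v + 1))   # horizontal neighbour
            if r + 1 < h:
                edges.append((v, v + w))   # vertical neighbour
    return h * w, edges


def mc_influence_probability(n, edges, s, t, p, samples, seed=0):
    """Monte-Carlo estimate of the probability that s activates t.

    Assumption: every undirected edge stands for two directed edges,
    each live independently with the same probability p.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Sample one live-edge graph.
        adj = [[] for _ in range(n)]
        for u, v in edges:
            if rng.random() < p:
                adj[u].append(v)
            if rng.random() < p:
                adj[v].append(u)
        # Breadth-first search from s over the live edges.
        seen = [False] * n
        seen[s] = True
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if not seen[v]:
                    seen[v] = True
                    queue.append(v)
        hits += seen[t]
    return hits / samples
```

With this indexing the north-west corner is vertex 0 and the south-east corner is vertex n − 1, so `mc_influence_probability(n, edges, 0, n - 1, p, samples)` estimates the quantity plotted in panel (a), up to sampling error.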

[Figure 4 panels: (a) Influence spread, (b) Shared Size, (c) Time [s]; each plotted against the number of steps for the ContUSA and Zachary networks.]

Figure 4: Computational results on the influence maximization problem with exact influence spread.
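The greedy procedure evaluated in Figure 4 can be sketched as follows. This is only an illustration under stated assumptions: the exact influence spread is obtained here by brute-force enumeration of all 2^m live-edge patterns, which stands in for the BDD-based computation of the paper and is feasible only for a handful of edges. The names `reachable`, `exact_spread`, and `greedy`, and the `(u, v, p)` edge format, are hypothetical.

```python
from itertools import product


def reachable(n, live_edges, seeds):
    """Return the set of vertices reachable from seeds via live directed edges."""
    adj = [[] for _ in range(n)]
    for u, v in live_edges:
        adj[u].append(v)
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen


def exact_spread(n, edges, seeds):
    """Exact expected number of activated vertices under independent cascade.

    edges is a list of directed (u, v, p) triples. All 2^m live-edge
    patterns are enumerated, so this is a brute-force stand-in only.
    """
    if not seeds:
        return 0.0
    total = 0.0
    for pattern in product((False, True), repeat=len(edges)):
        prob = 1.0
        live = []
        for (u, v, p), on in zip(edges, pattern):
            prob *= p if on else 1.0 - p
            if on:
                live.append((u, v))
        if prob > 0.0:
            total += prob * len(reachable(n, live, seeds))
    return total


def greedy(n, edges, k):
    """Add, k times, the vertex with the largest marginal influence."""
    seeds, current = [], 0.0
    for _ in range(min(k, n)):
        best, best_value = None, -1.0
        for u in range(n):
            if u in seeds:
                continue
            value = exact_spread(n, edges, seeds + [u])
            if value > best_value:   # ties keep the smallest vertex index
                best, best_value = u, value
        seeds.append(best)
        current = best_value
    return seeds, current
```

Because σ(S) is fixed within one step, maximizing σ(S ∪ {u}) is the same as maximizing the marginal influence σ(S ∪ {u}) − σ(S), which is why the sketch compares the spreads directly.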

solutions of logic puzzles [34]. Compared with these methods, the proposed method involves relatively expensive operations (reachability computation) in the auxiliary functions used in the frontier-based search. Such operations usually make the algorithm non-scalable; thus they are not used in the literature. However, in our case, they are necessary to scale up the algorithm by pruning many nodes in each step.

[Figure 5 plot: error of the estimated influence ($\times 10^{-3}$) against the number of samples ($\times 10^{7}$).]

Figure 5: Accuracy of Monte-Carlo simulation.

7. CONCLUSION

In this study, we have proposed an algorithm to compute influence spread exactly. The proposed algorithm first constructs the BDDs to represent all s-t connecting subgraphs. Then it computes influence spread by dynamic programming on the constructed BDDs. The BDDs can also be used to solve some other influence-spread related problems efficiently. The results of our computational experiments show that the proposed algorithm scales up to networks with a hundred edges, even though they have an enormous number (i.e., $\sim 2 \times 10^{97}$) of possible realizations.

A similar approach can be adopted for the linear threshold model [19], which is another widely used stochastic cascade model: Goyal, Lu, and Lakshmanan [12] showed that the influence spread in this model is computed by enumerating all s-t paths, and they proposed an algorithm, named "Simpath," based on an exhaustive search with pruning. By constructing the BDDs for all s-t paths, rather than for all s-t connected subgraphs as in this study, similar results will be obtained. Note that there is an efficient algorithm to construct the BDD for all s-t paths [21], which is also named "Simpath." This algorithm is used in this study to prune the redundant edges in preprocessing.

The most important future work is computing exact (or highly accurate) influence spread in networks with a few hundred to a thousand edges. This may require new techniques such as parallel construction of BDDs, approximation of BDDs, or exploitation of network structures.
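The dynamic-programming step summarized in the conclusion can be illustrated on a tiny hand-built BDD. The sketch below is an assumption-laden illustration, not the authors' implementation: it skips the frontier-based construction entirely and only shows the standard bottom-up probability computation on a given BDD, where variable i is true (edge i is live) independently with probability p[i]. The node encoding `{id: (var, lo, hi)}` and all function names are hypothetical.

```python
from itertools import product

FALSE, TRUE = 0, 1   # ids of the two terminal nodes


def bdd_probability(nodes, root, p):
    """Probability that the BDD evaluates to TRUE.

    nodes maps a node id to (var, lo, hi); p[var] is the probability that
    the variable is true. Each node is visited once thanks to memoization.
    """
    memo = {FALSE: 0.0, TRUE: 1.0}

    def visit(node):
        if node not in memo:
            var, lo, hi = nodes[node]
            memo[node] = (1.0 - p[var]) * visit(lo) + p[var] * visit(hi)
        return memo[node]

    return visit(root)


def bdd_evaluate(nodes, root, assignment):
    """Follow one truth assignment from the root down to a terminal."""
    node = root
    while node not in (FALSE, TRUE):
        var, lo, hi = nodes[node]
        node = hi if assignment[var] else lo
    return node == TRUE


def brute_force_probability(nodes, root, p):
    """Reference value: sum the weights of all satisfying assignments."""
    total = 0.0
    for assignment in product((False, True), repeat=len(p)):
        if bdd_evaluate(nodes, root, assignment):
            weight = 1.0
            for on, q in zip(assignment, p):
                weight *= q if on else 1.0 - q
            total += weight
    return total


# Hypothetical example: directed edges e0 = (s, t), e1 = (s, a), e2 = (a, t).
# t is reachable from s iff e0 or (e1 and e2) is live.
EXAMPLE_NODES = {
    2: (0, 3, TRUE),      # test e0: live -> TRUE, otherwise test e1
    3: (1, FALSE, 4),     # test e1: not live -> FALSE, otherwise test e2
    4: (2, FALSE, TRUE),  # test e2
}
EXAMPLE_ROOT = 2
```

For the example, `bdd_probability(EXAMPLE_NODES, EXAMPLE_ROOT, [0.5, 0.5, 0.5])` evaluates p0 + (1 − p0)·p1·p2 = 0.625, and the brute-force routine gives the same value. The dynamic program touches each BDD node once, whereas the brute-force check enumerates every assignment.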

Acknowledgment
This work was supported by JSPS KAKENHI Grant Numbers 15H05711 and 16K16011, and by JST, ERATO, Kawarabayashi Large Graph Project.

8. REFERENCES

[1] C. Borgs, M. Brautbar, J. Chayes, and B. Lucier. Maximizing social influence in nearly optimal time. In SODA, pages 946–957, 2014.
[2] T. B. Brecht and C. J. Colbourn. Lower bounds on two-terminal network reliability. Discrete Applied Mathematics, 21(3):185–198, 1988.
[3] R. E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Transactions on Computers, C-35(8):677–691, 1986.
[4] W. Chen, C. Wang, and Y. Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In KDD, pages 1029–1038, 2010.
[5] W. Chen, Y. Wang, and S. Yang. Efficient influence maximization in social networks. In KDD, pages 199–208, 2009.
[6] S. Cheng, H. Shen, J. Huang, G. Zhang, and X. Cheng. StaticGreedy: Solving the scalability-accuracy dilemma in influence maximization. In CIKM, pages 509–518, 2013.
[7] E. Cohen, D. Delling, T. Pajor, and R. F. Werneck. Sketch-based influence maximization and computation: Scaling up with guarantees. In CIKM, pages 629–638, 2014.
[8] P. Domingos and M. Richardson. Mining the network value of customers. In KDD, pages 57–66, 2001.
[9] S. Even, A. Itai, and A. Shamir. On the complexity of time table and multi-commodity flow problems. In FOCS, pages 184–193, 1975.
[10] J. Goldenberg, B. Libai, and E. Muller. Talk of the network: A complex systems look at the underlying process of word-of-mouth. Marketing Letters, 12(3):211–223, 2001.
[11] J. Goldenberg, B. Libai, and E. Muller. Using complex systems analysis to advance marketing theory development: Modeling heterogeneity effects on new product growth through stochastic cellular automata. Academy of Marketing Science Review, 2001:1, 2001.
[12] A. Goyal, W. Lu, and L. V. Lakshmanan. Simpath: An efficient algorithm for influence maximization under the linear threshold model. In ICDM, pages 211–220, 2011.
[13] G. Hardy, C. Lucet, and N. Limnios. Computing all-terminal reliability of stochastic networks with binary decision diagrams. In ASMDA, pages 17–20. Citeseer, 2005.
[14] K. Inoue, T. Sato, M. Ishihata, Y. Kameya, and H. Nabeshima. Evaluating abductive hypotheses using an EM algorithm on BDDs. In IJCAI, pages 810–815, 2009.
[15] Y. Inoue and S. Minato. Acceleration of ZDD construction for subgraph enumeration via path-width optimization. TCS-TR-A-16-80. Hokkaido University, 2016.
[16] M. Ishihata, Y. Kameya, T. Sato, and S. Minato. Propositionalizing the EM algorithm by BDDs. In Late Breaking Papers of the 18th International Conference on Inductive Logic Programming, pages 44–49, 2008.
[17] M. Ishihata and T. Sato. Bayesian inference for statistical abduction using Markov chain Monte Carlo. In ACML, pages 81–96, 2011.
[18] J. Kawahara, T. Inoue, H. Iwashita, and S. Minato. Frontier-based search for enumerating all constrained subgraphs with compressed representation. TCS-TR-A-13-64. Hokkaido University, 2014.
[19] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In KDD, pages 137–146, 2003.
[20] N. G. Kinnersley. The vertex separation number of a graph equals its path-width. Information Processing Letters, 42(6):345–350, 1992.
[21] D. Knuth. The art of computer programming: Bitwise tricks & techniques; binary decision diagrams, volume 4, fascicle 1, 2009.
[22] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks. In KDD, pages 420–429, 2007.
[23] S. Minato, N. Ishiura, and S. Yajima. Shared binary decision diagram with attributed edges for efficient Boolean function manipulation. In DAC, pages 52–57, 1990.
[24] G. L. Nemhauser, L. A. Wolsey, and M. L. Fisher. An analysis of approximations for maximizing submodular set functions-I. Mathematical Programming, 14(1):265–294, 1978.
[25] N. Ohsaka, T. Akiba, Y. Yoshida, and K. Kawarabayashi. Fast and accurate influence maximization on large networks with pruned Monte-Carlo simulations. In AAAI, pages 138–144, 2014.
[26] N. Ohsaka, T. Akiba, Y. Yoshida, and K. Kawarabayashi. Dynamic influence analysis in evolving networks. VLDB, 9(12):1077–1088, 2016.
[27] M. Richardson and P. Domingos. Mining knowledge-sharing sites for viral marketing. In KDD, pages 61–70, 2002.
[28] K. Sekine, H. Imai, and S. Tani. Computing the Tutte polynomial of a graph of moderate size. In International Symposium on Algorithms and Computation, pages 224–233. Springer, 1995.
[29] D. Sieling and I. Wegener. Reduction of OBDDs in linear time. Information Processing Letters, 48(3):139–144, 1993.
[30] Y. Tang, Y. Shi, and X. Xiao. Influence maximization in near-linear time: A martingale approach. In SIGMOD, pages 1539–1554, 2015.
[31] Y. Tang, X. Xiao, and Y. Shi. Influence maximization: Near-optimal time complexity meets practical efficiency. In SIGMOD, pages 75–86, 2014.
[32] L. G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410–421, 1979.
[33] Z. Yan, C. Nie, R. Dong, X. Gao, and J. Liu. A novel OBDD-based reliability evaluation algorithm for wireless sensor networks on the multicast model. Mathematical Problems in Engineering, 2015, 2015.
[34] R. Yoshinaka, T. Saitoh, J. Kawahara, K. Tsuruma, H. Iwashita, and S.-i. Minato. Finding all solutions and instances of Numberlink and Slitherlink by ZDDs. Algorithms, 5(2):176–213, 2012.