
arXiv:1804.01785v1 [cs.IT] 5 Apr 2018

Fairness in Multiterminal Data Compression: Decomposition of Shapley Value

Ni Ding∗, David Smith∗, Parastoo Sadeghi†, Thierry Rakotoarivelo∗
∗Data61, The Commonwealth Scientific and Industrial Research Organisation, {ni.ding, david.smith, thierry.rakotoarivelo}@data61.csiro.au
†The Research School of Engineering, College of Engineering and Computer Science, The Australian National University, {parastoo.sadeghi}@anu.edu.au

Abstract—We consider the problem of how to determine a fair source coding rate allocation method for the lossless data compression problem in multiterminal networks, e.g., wireless sensor networks, where there are a large number of sources to be encoded. We model this problem by a game-theoretic approach and present a decomposition method for obtaining the Shapley value, a fair source coding rate vector in the Slepian-Wolf achievable region. We formulate a coalitional game model where the entropy function is the characteristic cost function that quantifies the cost incurred due to the source coding rates in each coalition. We show that, in the case where the game is decomposable, the Shapley value can be obtained separately for each subgame. In this decomposition method, the computational complexity is determined by the maximum size of the subgames, which is strictly smaller than the total number of sources and contributes to a considerable reduction in computational complexity. Experiments demonstrate large reduction when the number of sources becomes large.

Index Terms—Coalitional game, data compression, decomposable game, polymatroid, Shapley value, submodularity.

I. INTRODUCTION

In wireless communications, we usually discuss how to transmit the messages from a number of sources in the most efficient way. The studies deal with the problem of collecting and forwarding the information from many random sources, which are in general correlated, to the sink or destination with the minimum cost, e.g., [1], [2]. For multiple random sources, the problem of how to describe the sources using the minimal code length with the least information loss is referred to as the data compression, or source coding, problem [3]. While the authors in [4], [5] derived the Slepian-Wolf (SW) constraints (lower bounds) that describe the achievable source coding rate region, they also stated that the lossless data compression can be attained without the interactive communications between the sources, i.e., the sources can be encoded individually in a lossless manner without the knowledge of others. This allows the information theoretic results to be applied to the source coding problems in multiterminal networks,¹ e.g., the peer-to-peer (P2P) networks. The typical examples of the multiterminal source coding problem are the coded cooperative data exchange (CCDE) problem [6], where the sources are the mobile clients, and the wireless sensor network (WSN) problem [1], [7]. Take a WSN for example: Usually a group of sensor nodes are scattered in a cluster, and one of them is served as the cluster header to collect the measurements of the others and forward them to a sink node or base station. See Fig. 1. Such a source coding problem poses a two-layer problem, sensors-to-cluster header and cluster headers-to-base station, which is usually solved in a distributed manner, e.g., [8].

¹Multiterminal data compression usually refers to the case of more than two data sources.

Fig. 1. Wireless sensor network: Sensors form clusters in a sensor region. Each cluster is a group of sensor nodes with one of them serving as the cluster header. A cluster header may or may not be connected to a sink node; the cluster headers forward the measurements of the sensor nodes to a sink node or base station.

In some cases, such as a WSN, there are a large number of sources to be encoded. Importantly, it was pointed out in [9], [10] that all the source coding rate vectors for the lossless multiterminal data compression constitute the core of a coalitional game, and the core is usually very large. For multiterminal networks, e.g., WSN and CCDE, the users are equally privileged, so that, for the problem of how to select a solution in the core, the fairness can be considered. A fair source coding scheme also evens out the battery power consumption of the sensors, which prolongs the lifetime of a WSN: The lifetime of a WSN is usually defined as the time at which the first sensor node runs out of battery [11]. While fair source coding schemes were proposed in [12], [13] for the case of two terminals, the question of how to allocate the source coding rates fairly remains when the number of terminals becomes large. The authors in [10] proposed a fairness solution by the Shapley value. But, the exponentially growing complexity of obtaining the Shapley value imposes a huge computation burden in large scale systems.

This paper addresses the problem of fairness in the multiterminal data compression problem from a coalitional game-theoretic point of view. We propose a decomposition method for obtaining the Shapley value [14], a fair rate vector in the source coding rate region. We first convert the SW constraints (lower bounds)
hl the While susually is e onsumption eypower tery r some ure core e ders- nner, lly he te y, r f r to upper bounds, based on which a coalitional game model is formulated with the entropy function quantifying the cost T incurred due to the source coding rates in each coalition. The submodularity of the entropy function ensures the convexity of the game and also ensures that the core, or the achievable 1 2 rate vector set for the lossless data compression, is nonempty. Z W W W W W Z W W W We then explain the incentive for the users to cooperate: The 1 = ( a, b, c, d, e) 2 = ( a, b, f ) users pay the least cost only if they cooperatively encode 3 the multiple sources. We show that, in the common case that Z3 = (Wc, Wd, Wf ) the game can be decomposed into a group of subgames, the Shapley value can be obtained separately for each subgame. Fig. 2. The multiterminal data compression problem with V = {1, 2, 3} in While the finest decomposer of a decomposable game can Example 1: User i observes Zi in private. The users are required to encode Z Z be determined by an existing algorithm in [15] in quadratic their sources is so that the sink node T is able to recover V . time, the complexity of obtaining the Shapley value by the decomposition method is determined by the maximal subgame H(ZX Y ) − H(ZY ) be the conditional entropy of ZX given size, which is strictly less than the total number of sources in ∪ ZY . In the rest of this paper, without loss of generality, we the entire system and contributes to a considerable reduction simplify the notation ZX by X. For a rate vector rV , let r be in computational complexity. Experiments demonstrate large the sum-rate function associated with rV such that reduction when the number of sources becomes large. r(X)= r , ∀X ⊆ V A. Organization X i i∈X The rest of the paper is organized as follows. In Section II, we describe the system model for the multiterminal source with the convention r(∅)=0. 
A coalitional game model characterized by the entropy function of the random sources is proposed in Section III, where we show the convexity of the game and explain the users' incentive to cooperate in the game. In Section IV, we propose the decomposition method to obtain the Shapley value and discuss the method's complexity. In Section V, we provide some concluding remarks.

II. SYSTEM MODEL

Let V with |V| > 1 be a finite set that contains the indices of all users in the system. We call V the ground set. Let Z_V = (Z_i : i ∈ V) be a vector of discrete random variables indexed by V. User i privately observes an n-sequence Z_i^n of the random source Z_i that is i.i.d. generated according to the joint distribution P_{Z_V}. The users are required to send their observations to a sink node T in a way so that the source sequence Z_V^n can be reconstructed by T. This problem is called multiterminal data compression or source coding with side information [3].

Let r_V = (r_i : i ∈ V) be a rate vector that is indexed by V. r(X) is the sum of the source coding rates over all the users in X and r(V) is the overall source coding rate in the system.

It is shown in [4], [5] that the sources can be encoded separately at rates r_V so that the sink node T is able to decode them with arbitrarily small error probability if and only if³

    r(X) ≥ H(X|V \ X), ∀X ⊊ V,
    r(V) = H(V).                                    (1)

The interpretation of (1) is: For the data compression to be lossless, (a) the users in each X ⊊ V must reveal H(X|V \ X), the information that is uniquely obtained by X, to the sink; (b) the users must reveal the total information H(V) to the sink. The inequalities in (1) are called the SW constraints, which describe the achievable rate region

    R(V) = {r_V ∈ R^{|V|} : r(X) ≥ H(X|V \ X), ∀X ⊊ V, r(V) = H(V)}.

In Section III-A, we show that R(V) ≠ ∅ due to the submodularity of the entropy function [16]. In Section III-C, this nonemptiness of R(V) will be interpreted from a coalitional game-theoretic point of view.
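As a small numeric illustration of the constraints in (1), consider a hypothetical two-user toy system (our own example, not from the paper): Z_1 = (W_a, W_b) and Z_2 = (W_b, W_c) with three independent uniform bits, so that H({1}) = H({2}) = 2 and H({1, 2}) = 3. A minimal sketch of an SW-region membership test, using H(X|V \ X) = H(V) − H(V \ X):

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical toy system: users 1 and 2 share the uniform bit W_b.
BITS = {1: {"a", "b"}, 2: {"b", "c"}}

def H(X):
    """Entropy oracle: the number of distinct uniform bits observed by X."""
    X = set(X)
    return F(len(set.union(*(BITS[i] for i in X)))) if X else F(0)

def in_sw_region(rv, V):
    """Check the SW constraints (1): r(X) >= H(X|V \\ X) for every proper
    nonempty X, plus the total-rate equality r(V) = H(V)."""
    V = set(V)
    for k in range(1, len(V)):
        for X in combinations(sorted(V), k):
            if sum(rv[i] for i in X) < H(V) - H(V - set(X)):
                return False
    return sum(rv.values()) == H(V)
```

For example, (3/2, 3/2) is achievable: each user must reveal at least its private bit (H({i}|{j}) = 1), and the shared bit W_b can be split between them; (1/2, 5/2) has the right sum-rate but violates r({1}) ≥ 1.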
Each dimension r_i in r_V denotes the source coding rate of Z_i, i.e., the expected coding length at which user i encodes his/her observation Z_i^n of source Z_i. We call r_V an achievable rate vector if the sink T is able to recover the source sequence Z_V^n by letting the users transmit at the rates that are designated by r_V. The achievable rate region for the lossless data compression is characterized by the Slepian-Wolf (SW) constraints as follows.

A. Achievable Rate Region: Slepian-Wolf Constraints

For X, Y ⊆ V, let H(Z_X) be the amount of randomness in Z_X measured by Shannon entropy² [3].

Example 1. Assume there are three users V = {1, 2, 3} in the system. They observe respectively

    Z_1 = (W_a, W_b, W_c, W_d, W_e),
    Z_2 = (W_a, W_b, W_f),
    Z_3 = (W_c, W_d, W_f),

where W_b is an independent random bit with H(W_b) = 3/10, W_f is an independent random bit with H(W_f) = 1/2 and all other W_j are independent uniformly distributed random bits, i.e., H(W_j) = 1 for all j ∈ {a, c, d, e}. The users are required to encode the Z_i's so that Z_V can be reconstructed at T. See Fig. 2.

²In this paper, we take all logarithms to base 2 so that all entropies are measured in bits. However, the results in this paper apply to all bases.

³In this paper, we consider the perfect case of data compression r(V) = H(V) when there is no information redundancy, which allows us to derive the decomposition property in Section IV-B. However, one can allow r(V) ≥ H(V), or r(C) ≥ H(C) for each subgame if the game is decomposable (see Section IV-B), in the implementation.
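Since the W_j in Example 1 are independent, the entropy of any coalition is just the sum of the entropies of the distinct bits it observes. A small Python oracle for this example (the names BITS, H_BIT, H and H_cond are ours, not the paper's):

```python
from fractions import Fraction as F

# Which independent bits each user of Example 1 observes.
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
# Entropy of each independent bit: H(W_b) = 3/10, H(W_f) = 1/2, others 1.
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1),
         "e": F(1), "f": F(1, 2)}

def H(X):
    """Entropy oracle: H(Z_X) for a coalition X, with H(empty) = 0."""
    X = set(X)
    if not X:
        return F(0)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X)))

def H_cond(X, Y):
    """Conditional entropy H(Z_X | Z_Y) = H(Z_{X u Y}) - H(Z_Y)."""
    return H(set(X) | set(Y)) - H(Y)
```

For instance, H({1, 2, 3}) = 24/5 and H({2, 3}|{1}) = 1/2, which are the values appearing in the SW constraints of this example.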
In this system, the achievable rate region is characterized by the SW constraints:

    R(V) = {r_V ∈ R^{|V|} : r(∅) = 0,
        r({1}) ≥ H({1}|{2, 3}) = 1,
        r({2}) ≥ H({2}|{1, 3}) = 0,
        r({3}) ≥ H({3}|{1, 2}) = 0,
        r({1, 2}) ≥ H({1, 2}|{3}) = 23/10,
        r({1, 3}) ≥ H({1, 3}|{2}) = 3,
        r({2, 3}) ≥ H({2, 3}|{1}) = 1/2,
        r({1, 2, 3}) = 24/5}.

We have R(V) ≠ ∅. For example, r_V = (1, 9/5, 2) is an achievable rate vector in R(V).

The polyhedron and base polyhedron of H are, respectively, [17, Section 2.3] [18, Definition 9.7.1]

    P(H, ≤) = {r_V ∈ R^{|V|} : r(X) ≤ H(X), ∀X ⊆ V},
    B(H, ≤) = {r_V ∈ P(H, ≤) : r(V) = H(V)}.

We have that the achievable rate region coincides with the base polyhedron of the entropy function H: R(V) = B(H, ≤).

B. Coalitional Game Model

Let the users in V be the players in a game. Instead of being totally selfish, the players may cooperate with each other to form coalitions, e.g., instead of encoding the source Z_i by him/herself, user i may form the group X with the other users in X \ {i} so as to encode Z_X cooperatively. In this game, a subset X ⊆ V is called a coalition and V is the grand coalition. Let H be the characteristic cost function such that, for each coalition X ⊆ V, H(X) quantifies the cost of coalition X: If all the players i ∈ X cooperate in X, i.e., if coalition X forms, the total cost incurred due to the source coding rates among the users in X is H(X). The coalitional game model for the multiterminal data compression problem is characterized by the ground set V that indexes all the players and the entropy function H. We denote the coalitional game model by Ω(V,H).

III. COALITIONAL GAME

The relationship between the data compression problem and the coalitional game formulation was previously revealed in [9], [10] based on the SW constraints (1). It is shown that the achievable rate region R(V) coincides with the core of a convex game, which is necessarily nonempty. In this section, we express the achievable rate region R(V) by converting the lower bounds in (1) to upper bounds in terms of the entropy function, based on which, we propose a coalitional game model.
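The SW region of Example 1 can be verified mechanically. The sketch below (our own helper names) re-derives each conditional-entropy bound from the Example 1 oracle and confirms that (1, 9/5, 2) satisfies all of them:

```python
from fractions import Fraction as F
from itertools import combinations

# Entropy oracle encoding the independent bits of Example 1 (names ours).
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1), "e": F(1), "f": F(1, 2)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

V = {1, 2, 3}

def H_cond(X, Y):
    return H(set(X) | set(Y)) - H(Y)

def in_sw_region(rv):
    """SW constraints of Example 1: lower bounds on every proper coalition
    plus the total-rate equality r(V) = H(V)."""
    ok = all(sum(rv[i] for i in X) >= H_cond(X, V - set(X))
             for k in range(1, len(V)) for X in combinations(sorted(V), k))
    return ok and sum(rv.values()) == H(V)

# The bounds listed above: H({1}|{2,3}) = 1, H({1,2}|{3}) = 23/10, etc.
assert H_cond({1}, {2, 3}) == F(1) and H_cond({1, 2}, {3}) == F(23, 10)
assert in_sw_region({1: F(1), 2: F(9, 5), 3: F(2)})           # achievable
assert not in_sw_region({1: F(1, 2), 2: F(23, 10), 3: F(2)})  # violates r({1}) >= 1
```

The same loop also rejects vectors that meet the sum-rate equality r(V) = 24/5 but violate one of the lower bounds.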
We show the equivalence of the core and the base polyhedron of the submodular entropy function, which allows us to derive the decomposition property of the core in Section IV and address a vital aspect missing from [9], [10]: the reason for the users to be cooperative in the multiterminal data compression problem in V.

A. Base Polyhedron

For X ⊆ V, consider the lower bound r(X) ≥ H(X|V \ X) in the SW constraints (1). Since we also restrict the total rate to r(V) = H(V), we necessarily impose an upper bound constraint on the sum-rate in V \ X:

    r(V \ X) = r(V) − r(X) ≤ H(V) − H(X|V \ X) = H(V \ X).

By converting the lower bounds in (1) for all X ⊆ V, we have the SW constrained region fully characterized by the entropy function H : 2^V → R:

    r(X) ≤ H(X), ∀X ⊊ V,
    r(V) = H(V).                                    (2)

In Ω(V,H), r_V is a cost allocation method that describes how the total cost r(V) is distributed to the individual users. Here, the source coding rate r_i is interpreted as the cost that user i pays in participating in the data compression problem in V. The solution of the game Ω(V,H) is the core [19], [20]

    C(V) = {r_V ∈ R^{|V|} : r(X) ≤ H(X), ∀X ⊆ V, r(V) = H(V)},    (3)

which contains all cost allocation methods r_V that distribute exactly the cost H(V) among the users in V. The inequality r(X) ≤ H(X) states that the total cost in any coalition X ⊊ V is no greater than H(X), which ensures that no users have incentive to break from the grand coalition and form a smaller one [19].⁴ In this sense, the core contains all cost/rate allocation methods such that all users would like to take part in the data compression problem in V. Based on the definition of the core (3), it is easy to see the equivalence of the achievable rate region, the base polyhedron of H and the core:

    R(V) = B(H, ≤) = C(V).                          (4)
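The equivalence (4) is easy to sanity-check numerically: on the hyperplane r(V) = H(V), the lower-bound form (1) and the upper-bound form (3) accept exactly the same vectors, since r(X) ≥ H(V) − H(V \ X) iff r(V \ X) ≤ H(V \ X). A sketch over a grid of rate vectors for Example 1 (helper names ours):

```python
from fractions import Fraction as F
from itertools import combinations, product

# Entropy oracle for Example 1 (names ours).
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1), "e": F(1), "f": F(1, 2)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

V = {1, 2, 3}

def proper_subsets():
    for k in range(1, len(V)):
        yield from combinations(sorted(V), k)

def in_sw_region(rv):
    """Lower-bound form (1): r(X) >= H(X|V\\X) and r(V) = H(V)."""
    return (all(sum(rv[i] for i in X) >= H(V) - H(V - set(X))
                for X in proper_subsets())
            and sum(rv.values()) == H(V))

def in_core(rv):
    """Upper-bound form (3): r(X) <= H(X) and r(V) = H(V)."""
    return (all(sum(rv[i] for i in X) <= H(set(X))
                for X in proper_subsets())
            and sum(rv.values()) == H(V))

# On a grid of vectors with r(V) = H(V) = 24/5, the two forms agree.
steps = [F(k, 5) for k in range(0, 25)]
for r1, r2 in product(steps, steps):
    rv = {1: r1, 2: r2, 3: H(V) - r1 - r2}
    assert in_sw_region(rv) == in_core(rv)
```

Every grid point with total rate 24/5 is classified identically by the two membership tests.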
This equivalence also gives rise to the main result in this paper: The decomposition of B(H, ≤) is equivalent to the decomposition of the game Ω(V,H), which results in the decomposition of the Shapley value (see Section IV-B).

It is shown in [16] that the entropy function H is a polymatroid rank function: (a) normalized: H(∅) = 0; (b) monotonic: H(X) ≥ H(Y) for all X, Y ⊆ V such that Y ⊆ X; (c) submodular:

    H(X) + H(Y) ≥ H(X ∩ Y) + H(X ∪ Y), ∀X, Y ⊆ V.

⁴It is also shown in [21, Theorem 8] that the core is a stable set of cost allocation methods such that no coalitions X ⊊ V have incentive to break from the grand coalition.

According to [21, Theorem 4], the core of a convex game is nonempty. Therefore, for the multiterminal data compression problem, we have Ω(V,H) being convex with a nonempty core C(V), i.e., all the users would like to cooperate with each other and, therefore, the grand coalition


forms necessarily.

Fig. 3. The polyhedron P(H, ≤) and base polyhedron B(H, ≤) of the entropy function H in the system in Example 1. Here, B(H, ≤) coincides with the achievable rate region R(V) for the lossless multiterminal data compression problem and the core C(V) of the coalitional game Ω(V,H).

Example 2. Consider the system in Example 1. Fig. 3 shows P(H, ≤) and B(H, ≤), the polyhedron and base polyhedron of the entropy function H, respectively. The base polyhedron B(H, ≤) coincides with the achievable rate region R(V) and the core C(V) of the game Ω(V,H), which is a region on the plane {r_V ∈ R^{|V|} : r(V) = 24/5} that is bounded by a set of linear constraints.

Example 3. For the system in Example 1, assume that the users are working individually at the beginning so that user i encodes the source Z_i at the rate that is exactly equal to H({i}), i.e., r_i = H({i}) for all i ∈ V. However, it will not take long for user i to realize that he/she can reduce the source coding rate by cooperating with another user. Take user 3 for example: If user 3 works alone, the source coding rate is r_3 = H({3}) = 5/2; if user 3 cooperates with user 2, we have r_2 + r_3 = r({2, 3}) = H({2, 3}) = 19/5, which means user 3 can encode Z_3 with a rate strictly lower than 5/2, and so can user 2, e.g., the rate (r_2, r_3) = (3/2, 23/10) is sufficient to encode Z_{2,3}. Note, the cooperative behavior of users 2 and 3 in coalition {2, 3} is essentially due to the positive mutual information I({2} ∧ {3}) = H({2}) + H({3}) − H({2, 3}) = 1/2 > 0. After coalition {2, 3} forms, users 2 and 3 will soon realize that, if they cooperate with user 1, all the users can further reduce the source coding rates and therefore the grand coalition V forms.
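The saving that drives Example 3 is exactly the mutual information between the two users' sources. A two-line check with the Example 1 oracle (names ours):

```python
from fractions import Fraction as F

# Entropy oracle for Example 1 (names ours).
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1), "e": F(1), "f": F(1, 2)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

separate = H({2}) + H({3})      # sum-rate when users 2 and 3 work alone
together = H({2, 3})            # sum-rate needed when they cooperate
mi_23 = separate - together     # I({2} ^ {3}) = 1/2 > 0: the incentive
```

Cooperation saves exactly mi_23 = 1/2 bit of sum-rate, the entropy of the bit W_f that users 2 and 3 observe in common.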
In fact, by assuming that the users form any coalition X ⊊ V, it can be shown that they would finally merge with others to form the grand coalition V.

In general, the core C(V) of a coalitional game could be empty, i.e., there exists at least one user that is unwilling to cooperate in the grand coalition; if C(V) ≠ ∅, it is not necessarily a singleton set. Then, the two fundamental questions in the coalitional game are:
(a) Is the core nonempty, or do all the users in V have incentive to cooperate in the grand coalition V?
(b) If the core is nonempty, can we find a fair cost/rate allocation method in the core C(V)?
In the next subsection, we show that question (a) has a straightforward answer due to the convexity of the game Ω(V,H). The Shapley value in Section IV answers question (b).

Example 3 reveals the reason for the nonemptiness of the core in the data compression problem: the nonnegativity of the mutual information. The mutual information can be considered as the information redundancy in the source coding, or data compression, problem, which can be minimized if the users cooperatively encode the multiple sources in Z_V. The worst case is when the mutual information is zero, i.e., when all the sources in Z_V are mutually independent, where working separately and cooperating in V are indifferent for the users. See the example below.

Example 4. Assume the three users in V = {1, 2, 3} observe respectively

    Z_1 = (W_a),
    Z_2 = (W_b),

    Z_3 = (W_c),

where W_j for all j ∈ {a, b, c} is an independent uniformly distributed random bit, i.e., the components in Z_V are mutually independent. Fig. 4 shows that the core C(V) reduces to a point r_V = (1, 1, 1). It can be shown that forming any coalition X ⊆ V, including the singleton and grand coalitions, necessarily results in r_i = 1 for all i ∈ V,⁷ i.e., it makes no difference whether the users work separately or cooperate. Alternatively, encoding the Z_i's separately is equivalent to encoding Z_V cooperatively.

C. Convexity and Nonemptiness of Core

According to the definition in [21, Section 2], a coalitional game is convex if and only if the characteristic cost function is submodular.⁵ Due to the fact that the entropy function H is a polymatroid rank function, a subclass of the submodular functions, it is straightforward that the game Ω(V,H) is convex. The convexity of the game Ω(V,H) is consistent with the results⁶ in [9], [10].

⁵The definition in [21, Section 2] is based on the supermodularity of the characteristic payoff function, which corresponds to the submodularity of the characteristic cost function.

⁶The authors in [9], [10] formulated a coalitional game model Ω(V,H#) with the dual set function H# being the characteristic payoff function, where the nonemptiness of the core C(V) is due to the supermodularity of H#. See Appendix A for the details.

⁷On the other hand, the base polyhedron of a submodular function is nonempty [17, Theorem 2.3]. So, due to the equivalence (4), C(V) ≠ ∅. Or, the nonemptiness of C(V) is essentially due to the submodularity of H.


Fig. 4. The polyhedron P(H, ≤) and base polyhedron B(H, ≤) of the entropy function H of mutually independent multiple sources in Example 4. In this case, C(V) = {(1, 1, 1)} and it makes no difference for the users to work separately or cooperate with each other.

Fig. 5. The update path of the source coding rate vector: (0, 0, 0) → (0, 9/5, 0) → (0, 9/5, 2) → (1, 9/5, 2), resulting from the Edmonds greedy algorithm [22] for the permutation Φ = (2, 3, 1) in the system in Example 1. The corresponding explanation is in Example 5.

IV. SHAPLEY VALUE

Since the core is not a singleton in general, the question that follows is how to select a cost allocation method, or source coding rate vector, r_V in the core C(V) of the game Ω(V,H). In a coalitional game where the users are considered as peers, e.g., the sensor nodes in a WSN, a natural selection criterion is the fairness, i.e., we want to find a r_V ∈ C(V) that distributes the total cost H(V) as fairly as possible among the users in V. There is a well-known answer to this problem: the Shapley value ˆr_V [14], a unique value that lies in the core of the convex game Ω(V,H), with each dimension being

    ˆr_i = Σ_{C ⊆ V\{i}} [|C|!(|V| − |C| − 1)!/|V|!] (H(C ⊔ {i}) − H(C)),    (5)

where ⊔ is the disjoint union. The Shapley value has been adopted in [7] as a fair source coding scheme. ˆr_V is fair in that it assigns each player his/her expected marginal cost, which can be explained by its relationship with the extreme points in the core as follows.

Example 5. We apply the Edmonds greedy algorithm to the system in Example 1 for the permutation Φ = (2, 3, 1). For φ_1 = 2, we assign the source coding rate r_2 = H({2}) = 9/5 so that the sink node T obtains the source sequence Z_2^n; for φ_2 = 3, we assign the source coding rate to user 3 conditioned on the information obtained by T, i.e., r_3 = H({3}|{T}) = H({3}|{2}) = 2, so that H({T}) = H({2, 3}) and Z_{2,3}^n is recovered at T; for φ_3 = 1, user 1 just needs to send the remaining information, i.e., r_1 = H({1}|{T}) = H({1}|{2, 3}) = 1, so that Z_{1,2,3}^n is recovered at T. The resulting rate vector r_V = (1, 9/5, 2) is an extreme point in the core C(V). See Fig. 5. We have the extreme point set

    EX(V) = {(43/10, 1/2, 0), (43/10, 0, 1/2), (3, 9/5, 0),
             (1, 9/5, 2), (23/10, 0, 5/2), (1, 13/10, 5/2)},    (6)

where all r_V ∈ EX(V) can be determined by applying the Edmonds greedy algorithm for all |V|! = 6 permutations of V.
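Formula (5) can be evaluated directly with one oracle call per subset. A sketch for Example 1 (function names ours); it returns ˆr_V = (53/20, 9/10, 5/4), the value shown later in Fig. 6:

```python
from fractions import Fraction as F
from itertools import combinations
from math import factorial

# Entropy oracle for Example 1 (names ours).
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1), "e": F(1), "f": F(1, 2)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

def shapley(V):
    """Shapley value (5): expected marginal cost H(C + {i}) - H(C),
    weighted by |C|!(|V|-|C|-1)!/|V|! over all C not containing i."""
    V = sorted(V)
    n = len(V)
    shap = {}
    for i in V:
        others = [u for u in V if u != i]
        total = F(0)
        for k in range(n):
            for C in combinations(others, k):
                weight = F(factorial(k) * factorial(n - k - 1), factorial(n))
                total += weight * (H(set(C) | {i}) - H(C))
        shap[i] = total
    return shap
```

The loop over all C ⊆ V \ {i} is what makes the direct evaluation cost O(2^{|V|} · δ), motivating the decomposition method in Section IV-B.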

A. Extreme Points

Let Φ_V = (φ_1, ..., φ_{|V|}) be a permutation of the user indices, e.g., Φ = (2, 3, 1, 4) is a permutation of V = {1, ..., 4}. Consider the Edmonds greedy algorithm [22]: For i = 1 to |V|, do

    r_{φ_i} = H({φ_i}|{φ_1, ..., φ_{i−1}})

with r_{φ_1} = H({φ_1}). The resulting r_V is an extreme point in the core C(V) [22]. In fact, the Edmonds greedy algorithm states that an achievable rate vector in the core C(V) can be reached if the users in V encode the sources with the side information at the sink in order. This approach has also been adopted in [8] for the distributed source coding problem. See also Example 5. Let EX(V) denote the set of all the extreme points in the core C(V). EX(V) can be constructed by applying the Edmonds greedy algorithm for all permutations of V.

The Shapley value assigns each user i the expected source coding rate ˆr_i resulting from the Edmonds greedy algorithm by assuming that each order, or permutation, Φ to form the grand coalition is equiprobable. For a subset C ⊊ V such that i ∉ C, assume that user i joins to form the coalition C ⊔ {i} after the coalition C is formed and, then, the rest of the users in V \ (C ⊔ {i}) join and form the grand coalition V. In the Edmonds greedy algorithm, user i will be assigned the rate H(C ⊔ {i}) − H(C) for |C|!(|V| − |C| − 1)! out of the |V|! times, which explains the weight in (5). In other words, the Shapley value ˆr_V is the average value over all the extreme points in the core C(V):

    ˆr_V = Σ_{r_V ∈ EX(V)} r_V / |EX(V)|.

This is the reason that ˆr_V is called the gravity center of the extreme points of the core in a convex game in [14].

It is easy to see that, for any decomposer P of a decomposable game Ω(V,H), the components in (Z_C : C ∈ P) are mutually independent. For any X, Y ⊂ V such that X ∩ Y = ∅, let r_X ⊕ r_Y = r_{X⊔Y} be the direct sum of r_X and r_Y. For example, for r_{1,3} = (r_1, r_3) = (3, 7) and r_{2,5,6} = (r_2, r_5, r_6) = (5, 2, 4), r_{1,3} ⊕ r_{2,5,6} = r_{1,2,3,5,6} = (3, 5, 7, 2, 4).
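The greedy step and the averaging argument above can be checked directly: enumerating all |V|! = 6 permutations for Example 1 recovers the six extreme points in (6), and their average is the Shapley value. A sketch (names ours):

```python
from fractions import Fraction as F
from itertools import permutations

# Entropy oracle for Example 1 (names ours).
BITS = {1: {"a", "b", "c", "d", "e"}, 2: {"a", "b", "f"}, 3: {"c", "d", "f"}}
H_BIT = {"a": F(1), "b": F(3, 10), "c": F(1), "d": F(1), "e": F(1), "f": F(1, 2)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

def greedy(perm):
    """Edmonds greedy algorithm [22]: r_{phi_i} = H(phi_i | phi_1..phi_{i-1})."""
    rv, seen = {}, set()
    for phi in perm:
        rv[phi] = H(seen | {phi}) - H(seen)
        seen.add(phi)
    return rv

V = [1, 2, 3]
# The extreme point set EX(V): one greedy run per permutation.
extreme = {tuple(greedy(p)[i] for i in V) for p in permutations(V)}
# Averaging over the |V|! permutations gives the Shapley value.
shapley = {i: sum(greedy(p)[i] for p in permutations(V)) / 6 for i in V}
```

Here greedy((2, 3, 1)) reproduces the update path of Fig. 5, and the six distinct extreme points match (6).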

r r r r and {2,5,6} = (r2, r5, r6) = (5, 2, 4), {1,3} ⊕ {2,5,6} = 1 r {1,2,3,5,6} = (3, 5, 7, 2, 4). We have a decomposable core C (V ) for a decomposable game Ω(V,H). 0 2 Lemma 1 ( [17, Theorems 3.15, 3.32 and 3.38, Corollary 4 1 3.40], [21, Theorem 6]). For a decomposer P of a decompos- 2 r2 able convex game Ω(V,H), the core C (V ) is decomposable: 0 0 r1 C (V )= ⊕C∈P C (C) Fig. 6. The Shapley value ˆr = ( 53 , 9 , 5 ) for the system in Example 1. V 20 10 4 r r C ˆrV is a fair rate allocation method in the core C (V ). In fact, ˆrV is the gravity = {⊕C∈P C : C ∈ (C), C ∈ P}, center of all the extreme points in EX(V ). where C (C) is the core of the subgame Ω(C,H); C (V ) has dimension |V | − |P∗| where P∗ is the finest decomposer.10 Example 6. For the system in Example 1, we have the Shapley We also have the decomposable Shapley value ˆrV due to C r 53 9 5 value in core (V ) being ˆV = ( 20 , 10 , 4 ) in Fig. 6, which the decomposition of the core C (V ) as follows. The proof of Pr rV V EX(V ) Theorem 1 is in Appendix B. is exactly the value of |EX(V )| , where the extreme point set EX is enumerated in (6). (V ) Theorem 1. If Ω(V,H) is decomposable, we have ˆr = ⊕ ˆr B. Decomposition V C∈P C r Although the Shapley value is the desired fair cost allocation for all decomposers P, where ˆC is the Shapley value in the C method in the core C (V ), calculating it might be cumbersome core (C) of subgame Ω(C,H). in large scale systems. Assume that the value of H(X) for Remark 1. Theorem 1 is a recast of [14, Corollary 2], any X ⊆ V can be obtained by an oracle call and δ refers which is derived in terms of the characteristic payoff function to the upper bound on the computation time of this oracle based on the concept of carriers and is proved by the the call. Obtaining the Shapley value in (5) requires the value of additivity of the Shapley value in [14, Axiom 3].11 But, |V | 8 H(X) for all X ⊆ V so that the complexity is O(2 · δ). 
Then, there is a problem of how to alleviate this exponentially growing complexity in large |V|. On the other hand, when the system size |V| is large, it is very possible that the coalitional game is decomposable: The users' cooperating in smaller coalitions is equivalent to cooperating in the grand coalition. In this section, we show that the calculation of the Shapley value in a decomposable game is separable, which results in a reduction in computational complexity.

Definition 1 (Decomposable Convex Game [17, Theorems 3.32 and 3.38, Lemma 3.37]⁹). The convex game Ω(V,H) is decomposable if we have

    H(V) = Σ_{C∈P} H(C)

for some proper partition P of the user set V. Here, P is called a decomposer and each C ∈ P forms a subgame Ω(C,H) that is convex. A game Ω(V,H) is indecomposable if it has only one decomposer {V}.

According to Theorem 1, the Shapley value ˆr_V in a decomposable game Ω(V,H) can be obtained separately for each subgame Ω(C,H) for all coalitions C in a decomposer P. In Section IV-D, we show the advantage of this separation: the reduction in computational complexity.

Example 7. For the system in Example 1, the game Ω(V,H) is indecomposable since {V} is the only decomposer. Therefore, we have |V| − |{V}| = 2 and the core C(V) in Fig. 3 is a 2-dimensional plane. For the system in Example 4, the game Ω(V,H) is decomposable with all partitions P of V being decomposers and the finest one is P∗ = {{1}, {2}, {3}}. Therefore, the dimension of the core C(V) in Fig. 4 is |V| − |P∗| = 0: C(V) reduces to the point (1, 1, 1), which is necessarily the Shapley value.

¹⁰The decomposition of C(V) was originally derived in [17, Theorems 3.15 and 3.38, Corollary 3.40] as the separation of the base polyhedron in a disconnected submodular system. We express it in terms of C(V) in Lemma 1 due to the equivalence (4). Also, for a decomposable convex game, the finest decomposer uniquely exists.
See Appendix A. For an indecomposable game Ω(V,H), the core C(V) has the full dimension |V| − 1 [21, Theorem 6(a)] since P∗ = {V} is the only decomposer.

⁸We assume that the time complexity of the multiplication and summation operations in (5) is much less than that of the oracle call of H.

⁹This definition is based on the concept of the separator of a disconnected submodular system in [15], [17], where the minimal separators are also called the elementary separators in [15, Section 3] and the principal partition of a polymatroid in [23].

¹¹Based on the definition in [14, Section 2], a carrier corresponds to a coalition C in a decomposer P. The additivity in [14, Axiom 3] is not a property specifically for the decomposable game. Therefore, the application of [14, Corollary 2] to decomposable games is not explicit in [14].

Algorithm 1: Finest Decomposer [15, Algorithm 2.6]¹²
  input : a convex coalitional game Ω(V,H)
  output: P∗, the finest decomposer if Ω(V,H) is decomposable, or P∗ = {V} if Ω(V,H) is indecomposable, and an extreme point r_V ∈ C(V)
  1  choose any permutation Φ;
  2  r_{φ_1} ← H({φ_1}) and X̂_{φ_1} ← {φ_1};
  3  for i = 2 to |V| do
  4      X̂_{φ_i} ← {φ_1, ..., φ_i};
  5      r_{φ_i} ← H(X̂_{φ_i}) − H(X̂_{φ_i} \ {φ_i});
  6      for j = 1 to i − 1 do
  7          if r(X̂_{φ_i} \ {φ_{i−j}}) = H(X̂_{φ_i} \ {φ_{i−j}}) then
  8              X̂_{φ_i} ← X̂_{φ_i} \ {φ_{i−j}};
  9          endif
  10     endfor
  11 endfor
  12 P∗ ← {X̂_{φ_i} : i ∈ V} and keep merging any two intersecting elements in P∗ until there are none left;
  13 return P∗ and r_V;

Fig. 7. The core C(V) of the decomposable game Ω(V,H) in the system in Example 8. The finest decomposer is P∗ = {{1, 3}, {2}} so that the dimension of C(V) is |V| − |P∗| = 1 and the Shapley value is ˆr_V = ˆr_{1,3} ⊕ ˆr_2 = (3/2, 1, 11/10).

Example 8.
Assume that the users in V = {1, 2, 3} observe respectively

    Z_1 = (W_a, W_b),
    Z_2 = (W_d),
    Z_3 = (W_b, W_c),

where W_c is an independent random bit with H(W_c) = 3/5 and all other W_j are independent uniformly distributed random bits. In this system, the game is decomposable with the finest decomposer P∗ = {{1, 3}, {2}}. It is easy to see why P∗ = {{1, 3}, {2}}: Z_{1,3} and Z_2 are mutually independent. The core C(V) in Fig. 7 has the dimension |V| − |P∗| = 1 and reduces to a line segment. We calculate the Shapley value separately for each subgame Ω(C,H): ˆr_{1,3} = (ˆr_1, ˆr_3) = (3/2, 11/10) for {1, 3} and ˆr_2 = 1 for {2}. Then, ˆr_{1,3} ⊕ ˆr_2 = (ˆr_1, ˆr_2, ˆr_3) = (3/2, 1, 11/10) is the Shapley value of the game Ω(V,H).

D. Complexity

For the finest decomposer P∗ of a decomposable convex game Ω(V,H), let Ĉ = argmax{|C| : C ∈ P∗} be the subgame coalition of maximum size. The separable calculation of the Shapley value ⊕_{C∈P∗} ˆr_C according to Theorem 1 reduces the complexity from O(2^{|V|} · δ) to O((|V|/|Ĉ|) · 2^{|Ĉ|} · δ). We remark that O((|V|/|Ĉ|) · 2^{|Ĉ|} · δ) is an upper bound in that (a) obtaining the Shapley value ˆr_Ĉ for the subgame Ω(Ĉ,H) completes in O(2^{|Ĉ|} · δ) time and (b) there are at most |V|/|Ĉ| such subgames. For example, for the system in Example 8, by obtaining the Shapley value separately for each element in the finest decomposer P∗ = {{1, 3}, {2}}, we can reduce the 2³ = 8 oracle calls of H to 2² + 2 = 6.

C. Determining Finest Decomposer

To utilize the decomposition property of a decomposer, the first question is how to determine a decomposer.
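Algorithm 1 is short to implement. The sketch below (our own function and variable names) runs it on the Example 8 sources for Φ = (3, 2, 1) and reproduces Example 9: the extreme point r_V = (1, 1, 8/5) and the finest decomposer P∗ = {{1, 3}, {2}}:

```python
from fractions import Fraction as F

# Example 8 sources: Z1 = (Wa, Wb), Z2 = (Wd), Z3 = (Wb, Wc), H(Wc) = 3/5.
BITS = {1: {"a", "b"}, 2: {"d"}, 3: {"b", "c"}}
H_BIT = {"a": F(1), "b": F(1), "c": F(3, 5), "d": F(1)}

def H(X):
    X = set(X)
    return sum(H_BIT[w] for w in set.union(*(BITS[i] for i in X))) if X else F(0)

def finest_decomposer(perm):
    """Algorithm 1: returns an extreme point r_V and the finest decomposer."""
    rv = {perm[0]: H({perm[0]})}
    Xhat = {perm[0]: {perm[0]}}
    for i in range(1, len(perm)):
        phi = perm[i]
        X = set(perm[:i + 1])
        rv[phi] = H(X) - H(X - {phi})
        # Try dropping phi_{i-1}, ..., phi_1 whenever r and H agree (step 7).
        for j in range(i - 1, -1, -1):
            prev = perm[j]
            if prev in X and sum(rv[u] for u in X - {prev}) == H(X - {prev}):
                X = X - {prev}
        Xhat[phi] = X
    # Step 12: merge intersecting sets until they are pairwise disjoint.
    parts = [set(s) for s in Xhat.values()]
    merged = True
    while merged:
        merged = False
        for a in range(len(parts)):
            for b in range(a + 1, len(parts)):
                if parts[a] & parts[b]:
                    parts[a] |= parts.pop(b)
                    merged = True
                    break
            if merged:
                break
    return rv, {frozenset(p) for p in parts}

rv, P = finest_decomposer((3, 2, 1))
```

Only O(|V|²) oracle calls are made, so the decomposer can be found without any prior knowledge of Z_V.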
In real P∗ = {{1, 3}, {2}}, we can reduce the 23 =8 oracle calls of applications, the decomposer would not be as obvious as in H to 22 +2=6. Example 8. In fact, we would not have any prior knowledge Experiment 1. Consider the WSN in Fig. 1. We set the number of the multiple random sources. In this case, whether or not of clusters to 20. For each cluster, we fix H(V )=50 and vary the game Ω(V,H) is decomposable and the finest decomposer ∗ the number of sensors |V | from 5 to 15. For each value of |V |, P , if Ω(V,H) is decomposable, can be determined by Al- we do the followings for each cluster: gorithm 1 that completes in O(|V |2 · δ) time, which does not (a) randomly generate random sources ZV so that the game require the knowledge of ZV . Note, Algorithm 1 also returns Ω(V,H) is decomposable; an extreme point in the core rV ∈ C (V ) since steps 2, 4 and 5 are in fact implementing the Edmond greedy algorithm. 12Algorithm 1 is adapted from [15, Algorithm 2.6]. It has also been independently proposed in [23] [17, Secton 3.3]. The finest decomposer has Example 9. For the system in Example 8, by applying different names: It is called the minimal separators of a polymatroid in [15, Algorithm 1 for the permutation Φ = (3, 2, 1), we have Section 3], the principal partition of a polymatroid in [23, Section 3(b)] and r 8 C the finest partition of a disconnected submodular system in [17, Secton 3.3]. V = (1, 1, 5 ), which is an extreme point in the core (V ) ˆ ˆ The validity of Algorithm 1 is due to the lattice structure of the minimizer as shown in Fig. 7. We also have X1 = {1, 3}, X2 = {2} set of the problem min{H(X) − r(X): i ∈ X ⊆ V } for all i ∈ V and and Xˆ3 = {3} at the end of each iteration. Since Xˆ1 and rV ∈ B(H, ≤). Please see [23] for further details. ·104 1.2 r H r completion time of calculating ˆV 3 complexity of obtaining ˆV 1 completion time of calculating ⊕C∈P∗ˆrC complexity of obtaining ⊕C∈P∗ˆrC 0.8
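As a concrete sketch (not the authors' implementation), Algorithm 1 can be coded directly against an entropy oracle. The oracle below hard-codes the independent-bit sources of Example 8 (Wa, Wb, Wd of one bit each, Wc with H(Wc) = 3/5); exact `Fraction` arithmetic keeps the tightness test r(X) = H(X) in step 7 reliable.

```python
from fractions import Fraction

# Entropy oracle for the sources of Example 8: each user observes a
# subset of the independent bits Wa, Wb, Wd (1 bit each) and Wc (3/5 bit).
BITS = {'Wa': Fraction(1), 'Wb': Fraction(1),
        'Wc': Fraction(3, 5), 'Wd': Fraction(1)}
OBS = {1: {'Wa', 'Wb'}, 2: {'Wd'}, 3: {'Wb', 'Wc'}}

def H(X):
    """Joint entropy of the sources observed by coalition X (one oracle call)."""
    seen = set().union(*(OBS[i] for i in X)) if X else set()
    return sum(BITS[w] for w in seen)

def finest_decomposer(V, H, perm):
    """Sketch of Algorithm 1: steps 2, 4 and 5 are Edmonds' greedy
    algorithm (yielding an extreme point r of the core); each set X-hat
    is shrunk to a tight set, and intersecting tight sets are merged."""
    r = {perm[0]: H({perm[0]})}
    Xhat = {perm[0]: {perm[0]}}
    for i in range(1, len(V)):
        phi = perm[i]
        Xhat[phi] = set(perm[:i + 1])
        r[phi] = H(Xhat[phi]) - H(Xhat[phi] - {phi})
        for j in range(1, i + 1):             # step 7: tightness test
            Y = Xhat[phi] - {perm[i - j]}
            if sum(r[k] for k in Y) == H(Y):  # r(Y) = H(Y)?
                Xhat[phi] = Y
    P = [frozenset(X) for X in Xhat.values()]  # step 12: merge
    merged = True
    while merged:
        merged = False
        for a in P:
            for b in P:
                if a != b and a & b:
                    P.remove(a)
                    P.remove(b)
                    P.append(a | b)
                    merged = True
                    break
            if merged:
                break
    return set(P), r

P_star, r = finest_decomposer([1, 2, 3], H, [3, 2, 1])
print(P_star, r)  # P* = {{1, 3}, {2}} and the extreme point r_V = (1, 1, 8/5)
```

Running it with Φ = (3, 2, 1) reproduces Example 9: the tight sets end up as {3}, {2} and {1, 3}, and merging the intersecting pair {3}, {1, 3} yields the finest decomposer P* = {{1, 3}, {2}}.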

[Figures 8 and 9: the y-axes are the number of oracle calls of H (×10^4) and the completion time in seconds, respectively; the x-axis of both is |V|, the number of sensors in each cluster, from 5 to 15.]

Fig. 8. The complexity comparison between the two methods for obtaining the Shapley value in decomposable games in Experiment 1: obtaining r̂_V in (5) and ⊕_{C∈P*} r̂_C for the finest decomposer P* according to Theorem 1. Note, the complexity of obtaining ⊕_{C∈P*} r̂_C includes the number of oracle calls of H for determining P* in Algorithm 1.

Fig. 9. The completion time comparison between the two methods for obtaining the Shapley value in decomposable games in Experiment 2: calculating r̂_V in (5) and ⊕_{C∈P*} r̂_C for the finest decomposer P* according to Theorem 1. Here, r̂_C is calculated by each subgame Ω(C,H) in parallel. We assume that all values of H have been obtained in Experiment 2, i.e., the completion time does not include the oracle calls of H.

(b) obtain the Shapley value r̂_V in (5) and record the number of calls of H;
(c) run Algorithm 1 to determine the finest decomposer P* and obtain the Shapley value r̂_C for each C ∈ P* according to Theorem 1. Record the number of calls of H in both Algorithm 1 and the computation of ⊕_{C∈P*} r̂_C.

We plot the mean number of oracle calls of H of the two methods over clusters in Fig. 8. It can be seen that the separate calculation of the Shapley value in a decomposable convex game Ω(V,H) contributes to a considerable reduction in computational complexity as |V| grows.

It should be noted that the overall O((|V|/|Ĉ|) · 2^|Ĉ| · δ) oracle calls can be distributed among the subgames Ω(C,H), and the computations of r̂_C can be done in a decentralized manner. For example, for the system in Example 8, the users in {1, 3} and user 2 can calculate r̂_{{1,3}} and r̂2, respectively, in parallel.

Experiment 2. In Experiment 1, we allow the subgames Ω(C,H) for all C ∈ P* in each cluster to obtain r̂_C in parallel, so that the completion time of calculating ⊕_{C∈P*} r̂_C is determined by |Ĉ|, the maximum size of the subgames. After obtaining the values of H by the oracle calls in Experiment 1, the average run time for calculating r̂_V and ⊕_{C∈P*} r̂_C over clusters is shown in Fig. 9 (Footnote 13). It can be seen that it is much faster to get ⊕_{C∈P*} r̂_C by allowing parallel computation.

Footnote 13: We ran Experiment 2 in MATLAB 2017a on a desktop computer with an Intel Core i7-6600U processor, 8 GB RAM and a 64-bit Windows 10 Enterprise operating system.

In addition to the decomposition property, Shapley value approximation methods, e.g., [24]–[26], can be applied to each subgame Ω(C,H) to further reduce the complexity. For example, based on the random permutation method in [24], random samples of the permutations of C, with the sample size growing quadratically in |C|, are generated so that an extreme point subset EX′(C) ⊊ EX(C) is obtained and (Σ_{r_C ∈ EX′(C)} r_C) / |EX′(C)| is an approximation of r̂_C. Then ⊕_{C∈P*} (Σ_{r_C ∈ EX′(C)} r_C) / |EX′(C)|, the approximation of r̂_V, can be obtained in time quadratic in the maximal subgame size |Ĉ|.

V. CONCLUSION

The coalitional game model for the multiterminal data compression problem was formulated in terms of the characteristic cost function, being the entropy function. The convexity of the game and the nonemptiness of the core followed from the submodularity of the entropy function. For a decomposable game, the direct sum of the Shapley values of the subgames constituted the Shapley value of the entire system. In this case, the complexity is essentially determined by the subgame of maximum size, and the finest decomposer determines the least computational complexity. The study showed that the finest decomposer can be determined in quadratic time and that the reduction in complexity achieved by the decomposition method for obtaining the Shapley value is large when the number of users increases. The incentive for the users to cooperate was also provided in this paper: the least source coding rates in the multiterminal data compression problem are attained only if the users cooperatively encode the multiple random sources. One extension of the work in this paper is the design of a source coding scheme for the Shapley value: determining the codeword for source Z_i at rate r̂_i for all i ∈ V.

APPENDIX A

For the entropy function H : 2^V → R_+, the dual set function H# is defined as [17, Section 2.3]

    H#(X) = H(V) − H(V \ X) = H(X | V \ X)

and is supermodular (Footnote 14). The polyhedron P(H#, ≥) contains all SW constrained rate vectors with sum-rate r(V) ≥ H(V), while the base polyhedron B(H#, ≥) contains all SW constrained rate vectors with exact sum-rate r(V) = H(V), i.e., R(V) = B(H#, ≥) = B(H, ≤).

Footnote 14: f is supermodular if −f is submodular.
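These definitions can be sanity-checked numerically. The sketch below (an illustration, not part of the paper's experiments) tabulates the two-terminal entropy function of Example 10 in Appendix A, verifies that H# is supermodular, and confirms that the corner points of B(H, ≤) and B(H#, ≥) coincide, i.e., both equal the SW region R(V).

```python
from itertools import chain, combinations

# Entropy table for the two-terminal source of Example 10.
H = {frozenset(): 0, frozenset({1}): 4,
     frozenset({2}): 6, frozenset({1, 2}): 7}
V = frozenset({1, 2})

def H_dual(X):
    """Dual set function H#(X) = H(V) - H(V \\ X) = H(X | V \\ X)."""
    return H[V] - H[V - X]

subsets = [frozenset(s) for s in chain.from_iterable(
    combinations(sorted(V), k) for k in range(len(V) + 1))]

# H# is supermodular: -H# is submodular.
for X in subsets:
    for Y in subsets:
        assert H_dual(X | Y) + H_dual(X & Y) >= H_dual(X) + H_dual(Y)

# For two users, each base polyhedron is the segment r1 + r2 = H(V)
# between two corner points.
corners_primal = {(H[frozenset({1})], H[V] - H[frozenset({1})]),
                  (H[V] - H[frozenset({2})], H[frozenset({2})])}
d1, d2 = H_dual(frozenset({1})), H_dual(frozenset({2}))
corners_dual = {(d1, H[V] - d1), (H[V] - d2, d2)}
assert corners_primal == corners_dual  # B(H, <=) = B(H#, >=) = R(V)
print(sorted(corners_primal))  # the two SW corner points (1, 6) and (4, 3)
```

The design choice here is to treat H as an explicit table; in the game-theoretic setting the same checks would go through any entropy oracle.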

APPENDIX B

For a decomposer P ∈ Π′(V), we have EX(V) = ⊕_{C∈P} EX(C) due to the decomposition of the core C(V) = ⊕_{C∈P} C(C) in Lemma 1. Then

    r̂_V = (Σ_{r_V ∈ EX(V)} r_V) / |EX(V)|
        = (Σ_{r_V ∈ ⊕_{C∈P} EX(C)} r_V) / |⊕_{C∈P} EX(C)|
        = ⊕_{C∈P} [ (Π_{C′∈P : C′≠C} |EX(C′)|) · (Σ_{r_C ∈ EX(C)} r_C) ] / Π_{C∈P} |EX(C)|
        = ⊕_{C∈P} (Σ_{r_C ∈ EX(C)} r_C) / |EX(C)|
        = ⊕_{C∈P} r̂_C.

Theorem 1 holds.

Fig. 10. The polyhedra P(H, ≤) and P(H#, ≥) of the entropy function H and its dual set function H#, respectively. The base polyhedra B(H, ≤) and B(H#, ≥) coincide. The Shapley value r̂_V is also called the symmetric rate vector in [12], [13].
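Theorem 1 can also be checked numerically on the Example 8 game. The sketch below brute-forces the Shapley value via the standard permutation average of marginal entropy costs (assumed here to be equivalent to the extreme-point average in (5), with exact fractions to avoid rounding) and confirms that it equals the direct sum of the subgame Shapley values.

```python
from fractions import Fraction
from itertools import permutations

# Entropy oracle for the sources of Example 8 (independent bits Wa, Wb,
# Wd of one bit each and Wc with H(Wc) = 3/5).
BITS = {'Wa': Fraction(1), 'Wb': Fraction(1),
        'Wc': Fraction(3, 5), 'Wd': Fraction(1)}
OBS = {1: {'Wa', 'Wb'}, 2: {'Wd'}, 3: {'Wb', 'Wc'}}

def H(X):
    seen = set().union(*(OBS[i] for i in X)) if X else set()
    return sum(BITS[w] for w in seen)

def shapley(players):
    """Shapley value as the average marginal cost over all orderings."""
    val = {i: Fraction(0) for i in players}
    orders = list(permutations(players))
    for order in orders:
        before = set()
        for i in order:
            val[i] += H(before | {i}) - H(before)
            before.add(i)
    return {i: v / len(orders) for i, v in val.items()}

whole = shapley([1, 2, 3])                         # direct evaluation
direct_sum = {**shapley([1, 3]), **shapley([2])}   # r_{13} (+) r_2
assert whole == direct_sum                         # Theorem 1
print(whole)  # r_V = (3/2, 1, 11/10), as in Example 8 and Fig. 7
```

The subgame evaluations touch only 2^2 + 2 subsets of V, matching the oracle-call saving discussed in Section D.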

The SW constraints in [4], [5] are given in terms of the dual set function H#, and so is the coalitional game model proposed in [9], [10] (Footnote 15). Based on the equivalence B(H, ≤) = B(H#, ≥) [17, Lemma 2.4], the data compression problem is studied in terms of the entropy function H in this paper.

Example 10. Consider a two-terminal source with the entropy function H(∅) = 0, H({1}) = 4, H({2}) = 6 and H({1, 2}) = 7. We plot the polyhedra P(H, ≤) and P(H#, ≥) in Fig. 10, where B(H, ≤) = B(H#, ≥) is the SW constrained achievable rate region R(V).

Let Π(V) be the set of all partitions of V. We have H(V) = min_{P∈Π(V)} Σ_{C∈P} H(C) [17, Theorem 2.6]. Each minimizer is a decomposer of the game Ω(V,H) [17, Theorems 3.36, 3.38 and 3.39] (Footnote 16), and all minimizers form a partition lattice [27, Definition 3.9] (Footnote 17), where the coarsest and finest partitions uniquely exist. This explains the uniqueness of the finest decomposer of a decomposable convex game in [21]. Certainly, {V} always belongs to this partition lattice. Based on Definition 1, Ω(V,H) is decomposable if the partition lattice contains partitions other than {V}; otherwise, Ω(V,H) is indecomposable.

Footnote 15: The authors in [9], [10] formulated a game model Ω(V,H#) directly based on the SW constraints, where the dual set function H#(X) quantifies the payoff incurred when the users cooperate in coalition X. The convexity of the game was shown based on the supermodularity of H#.
Footnote 16: A decomposable convex game corresponds to the disconnected submodular system that is defined in [17, Theorems 3.36, 3.38 and 3.39].
Footnote 17: It is called the Dilworth truncation lattice in [27, Definition 3.9] since min_{P∈Π(V)} Σ_{C∈P} H(C) is called the Dilworth truncation of H.

REFERENCES

[1] Z. Xiong, A. D. Liveris, and S. Cheng, "Distributed source coding for sensor networks," IEEE Signal Process. Mag., vol. 21, no. 5, pp. 80–94, Sep. 2004.
[2] S. El Rouayheb, A. Sprintson, and P. Sadeghi, "On coding for cooperative data exchange," in Proc. IEEE Inf. Theory Workshop, Cairo, Egypt, 2010, pp. 1–5.
[3] T. M. Cover and J. A. Thomas, Elements of Information Theory. Hoboken, NJ: John Wiley & Sons, 2012.
[4] D. Slepian and J. Wolf, "Noiseless coding of correlated information sources," IEEE Trans. Inf. Theory, vol. 19, no. 4, pp. 471–480, Jul. 1973.
[5] T. Cover, "A proof of the data compression theorem of Slepian and Wolf for ergodic sources," IEEE Trans. Inf. Theory, vol. 21, no. 2, pp. 226–228, Mar. 1975.
[6] A. Sprintson, P. Sadeghi, G. Booker, and S. El Rouayheb, "A randomized algorithm and performance bounds for coded cooperative data exchange," in Proc. IEEE Int. Symp. Inf. Theory, Austin, TX, 2010, pp. 1888–1892.
[7] T. Srisooksai, K. Keamarungsi, P. Lamsrichan, and K. Araki, "Practical data compression in wireless sensor networks: A survey," J. Network and Comput. Applicat., vol. 35, no. 1, pp. 37–59, Jan. 2012.
[8] D. Schonberg, K. Ramchandran, and S. S. Pradhan, "Distributed code constructions for the entire Slepian-Wolf rate region for arbitrarily correlated sources," in Proc. Data Compression Conf., Snowbird, UT, 2004, pp. 292–301.
[9] M. Madiman, "Playing games: A fresh look at rate and capacity regions," in Proc. IEEE Int. Symp. Inf. Theory, Toronto, Canada, 2008, pp. 2528–2532.
[10] ——, "Cores of cooperative games in information theory," EURASIP J. Wireless Commun. Networking, vol. 2008, no. 1, p. 318704, 2008.
[11] I. Dietrich and F. Dressler, "On the lifetime of wireless sensor networks," ACM Trans. Sensor Networks, vol. 5, no. 1, pp. 5:1–5:39, Feb. 2009.
[12] S. S. Pradhan and K. Ramchandran, "Distributed source coding: Symmetric rates and applications to sensor networks," in Proc. Data Compression Conf., Snowbird, UT, 2000, pp. 363–372.
[13] M. Sartipi and F. Fekri, "Distributed source coding in wireless sensor networks using LDPC coding: The entire Slepian-Wolf rate region," in IEEE Wireless Commun. Networking Conf., vol. 4, New Orleans, LA, 2005, pp. 1939–1944.
[14] L. S. Shapley, "A value for n-person games," Contributions to the Theory of Games, vol. 2, no. 28, pp. 307–317, 1953.
[15] R. E. Bixby, W. H. Cunningham, and D. M. Topkis, "The partial order of a polymatroid extreme point," Math. Oper. Res., vol. 10, no. 3, pp. 367–378, 1985.
[16] S. Fujishige, "Polymatroidal dependence structure of a set of random variables," Inf. Control, vol. 39, no. 1, pp. 55–72, Oct. 1978.
[17] ——, Submodular Functions and Optimization, 2nd ed. Amsterdam, The Netherlands: Elsevier, 2005.
[18] H. Narayanan, Submodular Functions and Electrical Networks. Amsterdam, The Netherlands: Elsevier, 1997.
[19] Y. Shoham and K. Leyton-Brown, Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations. New York: Cambridge University Press, 2008.
[20] L. S. Shapley and M. Shubik, "On market games," J. Econ. Theory, vol. 1, no. 1, pp. 9–25, Jun. 1969.
[21] L. S. Shapley, "Cores of convex games," Int. J. Game Theory, vol. 1, no. 1, pp. 11–26, Dec. 1971.
[22] J. Edmonds, "Submodular functions, matroids, and certain polyhedra," in Combinatorial Optimization—Eureka, You Shrink! Berlin, Germany: Springer, 2003, pp. 11–26.
[23] S. Fujishige, "Principal structures of submodular systems," Discrete Appl. Math., vol. 2, no. 1, pp. 77–79, 1980.
[24] D. Liben-Nowell, A. Sharp, T. Wexler, and K. Woods, "Computing Shapley value in supermodular coalitional games," in Proc. 18th Annu. Int. Conf. Comput. Combinatorics, Sydney, Australia, 2012, pp. 568–579.
[25] V. Conitzer and T. Sandholm, "Computing Shapley values, manipulating value division schemes, and checking core membership in multi-issue domains," in Proc. 19th Nat. Conf. Artif. Intell., San Jose, CA, 2004, pp. 219–225.
[26] S. S. Fatima, M. Wooldridge, and N. R. Jennings, "A linear approximation method for the Shapley value," Artif. Intell., vol. 172, no. 14, pp. 1673–1699, Sep. 2008.
[27] H. Narayanan, "The principal lattice of partitions of a submodular function," Linear Algebra its Appl., vol. 144, pp. 179–216, Jan. 1991.